-
Opinion Dynamics for Utility Maximizing Agents: Exploring the Impact of Resource Penalty
Authors:
Prashil Wankhede,
Nirabhra Mandal,
Sonia Martínez,
Pavankumar Tallapragada
Abstract:
We propose a continuous-time nonlinear model of opinion dynamics with utility-maximizing agents connected via a social influence network. A distinguishing feature of the proposed model is the inclusion of an opinion-dependent resource-penalty term in the utilities, which limits the agents from holding opinions of large magnitude. The proposed utility functions also account for how the relative resources within the social group affect both an agent's stubbornness and social influence. Each agent myopically seeks to maximize its utility by revising its opinion in the gradient ascent direction of its utility function, thus leading to the proposed opinion dynamics. We show that, for any arbitrary social influence network, opinions are ultimately bounded. For networks with weak antagonistic relations, we show that there exists a globally exponentially stable equilibrium using contraction theory. We establish conditions for the existence of consensus equilibrium and analyze the relative dominance of the agents at consensus. We also conduct a game-theoretic analysis of the underlying opinion formation game, including Nash equilibria and prices of anarchy in terms of satisfaction ratios. We also investigate the oscillatory behavior of opinions in a two-agent scenario. Finally, simulations illustrate our findings.
Submitted 7 April, 2024;
originally announced April 2024.
-
Multi-Contact Inertial Estimation and Localization in Legged Robots
Authors:
Sergi Martinez,
Robert Griffin,
Carlos Mastalli
Abstract:
Optimal estimation is a promising tool for multi-contact inertial estimation and localization. To harness its advantages in robotics, it is crucial to solve these large and challenging optimization problems efficiently. To tackle this, we (i) develop a multiple-shooting solver that exploits both temporal and parametric structures through a parametrized Riccati recursion. Additionally, we (ii) propose an inertial local manifold that ensures its full physical consistency. It also enhances convergence compared to the singularity-free log-Cholesky approach. To handle its singularities, we (iii) introduce a nullspace approach in our optimal estimation solver. We (iv) finally develop the analytical derivatives of contact dynamics for both inertial parametrizations. Our framework can successfully solve estimation problems for complex maneuvers such as brachiation in humanoids. We demonstrate its numerical capabilities across various robotics tasks and its benefits in experimental trials with the Go1 robot.
Submitted 25 March, 2024;
originally announced March 2024.
-
Robotic Exploration using Generalized Behavioral Entropy
Authors:
Aamodh Suresh,
Carlos Nieto-Granda,
Sonia Martinez
Abstract:
This work presents and evaluates a novel strategy for robotic exploration that leverages human models of uncertainty perception. To do this, we introduce a measure of uncertainty that we term "Behavioral entropy", which builds on Prelec's probability weighting from Behavioral Economics. We show that the new operator is an admissible generalized entropy, analyze its theoretical properties and compare it with other common formulations such as Shannon's and Renyi's. In particular, we discuss how the new formulation is more expressive in the sense of measures of sensitivity and perceptiveness to uncertainty introduced here. Then we use Behavioral entropy to define a new type of utility function that can guide a frontier-based environment exploration process. The approach's benefits are illustrated and compared in a Proof-of-Concept and ROS-unity simulation environment with a Clearpath Warthog robot. We show that the robot equipped with Behavioral entropy explores faster than with Shannon and Renyi entropies.
Submitted 15 February, 2024;
originally announced February 2024.
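The abstract above builds on Prelec's probability weighting. As a rough illustration only, the sketch below implements the standard two-parameter Prelec function w(p) = exp(-beta * (-ln p)^alpha) and plugs the weighted probabilities into an entropy-style sum. The function names and the exact form of `weighted_entropy` are assumptions made for this example; the paper's own definition of Behavioral entropy is not given in the abstract and may differ.

```python
import math


def prelec_weight(p, alpha=1.0, beta=1.0):
    """Prelec probability weighting: w(p) = exp(-beta * (-ln p)**alpha)."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return math.exp(-beta * (-math.log(p)) ** alpha)


def weighted_entropy(probs, alpha=1.0, beta=1.0):
    """Entropy-style sum over Prelec-weighted probabilities.

    Assumed form for illustration: -sum_i w(p_i) * ln w(p_i).
    """
    total = 0.0
    for p in probs:
        w = prelec_weight(p, alpha, beta)
        if w > 0.0:
            total -= w * math.log(w)
    return total


def shannon_entropy(probs):
    """Shannon entropy in nats, skipping zero-probability outcomes."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


if __name__ == "__main__":
    dist = [0.5, 0.25, 0.25]
    print("Shannon:", shannon_entropy(dist))
    print("Weighted (alpha=0.5):", weighted_entropy(dist, alpha=0.5))
```

With alpha = 1 and beta = 1 the weighting reduces to w(p) = p, so this assumed form coincides with Shannon entropy; other parameter values distort how small and large probabilities are perceived.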
-
Conciliating Privacy and Utility in Data Releases via Individual Differential Privacy and Microaggregation
Authors:
Jordi Soria-Comas,
David Sánchez,
Josep Domingo-Ferrer,
Sergio Martínez,
Luis Del Vasto-Terrientes
Abstract:
$ε$-Differential privacy (DP) is a well-known privacy model that offers strong privacy guarantees. However, when applied to data releases, DP significantly deteriorates the analytical utility of the protected outcomes. To keep data utility at reasonable levels, practical applications of DP to data releases have used weak privacy parameters (large $ε$), which dilute the privacy guarantees of DP. In this work, we tackle this issue by using an alternative formulation of the DP privacy guarantees, named $ε$-individual differential privacy (iDP), which causes less data distortion while providing the same protection as DP to subjects. We enforce iDP in data releases by relying on attribute masking plus a pre-processing step based on data microaggregation. The goal of this step is to reduce the sensitivity to record changes, which determines the amount of noise required to enforce iDP (and DP). Specifically, we propose data microaggregation strategies designed for iDP whose sensitivities are significantly lower than those used in DP. As a result, we obtain iDP-protected data with significantly better utility than with DP. We report on experiments that show how our approach can provide strong privacy (small $ε$) while yielding protected data that do not significantly degrade the accuracy of secondary data analysis.
Submitted 21 December, 2023;
originally announced December 2023.
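The abstract above notes that sensitivity determines the amount of noise needed. A minimal sketch of that relationship follows, assuming the textbook Laplace mechanism, in which noise is drawn with scale sensitivity / ε. The abstract does not say which noise distribution the authors use, so the mechanism and all function names here are assumptions for illustration.

```python
import random


def laplace_scale(sensitivity, epsilon):
    """Scale of Laplace noise for a query with the given sensitivity."""
    if sensitivity < 0 or epsilon <= 0:
        raise ValueError("sensitivity must be >= 0 and epsilon must be > 0")
    return sensitivity / epsilon


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as a scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_release(true_value, sensitivity, epsilon, rng):
    """Release a numeric value perturbed with Laplace noise."""
    return true_value + laplace_noise(laplace_scale(sensitivity, epsilon), rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Lower sensitivity (for example after microaggregation) means less noise
    # for the same epsilon.
    print(noisy_release(10.0, sensitivity=1.0, epsilon=0.5, rng=rng))
    print(noisy_release(10.0, sensitivity=0.1, epsilon=0.5, rng=rng))
```

Reducing the sensitivity by a factor of ten shrinks the noise scale by the same factor at a fixed ε, which is the lever the abstract describes.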
-
Distributed Bayesian Estimation in Sensor Networks: Consensus on Marginal Densities
Authors:
Parth Paritosh,
Nikolay Atanasov,
Sonia Martinez
Abstract:
In this paper, we aim to design and analyze distributed Bayesian estimation algorithms for sensor networks. The challenges we address are to (i) derive a distributed provably-correct algorithm in the functional space of probability distributions over continuous variables, and (ii) leverage these results to obtain new distributed estimators restricted to subsets of variables observed by individual agents. This relates to applications such as cooperative localization and federated learning, where the data collected at any agent depends on a subset of all variables of interest. We present Bayesian density estimation algorithms using data from non-linear likelihoods at agents in centralized, distributed, and marginal distributed settings. After setting up a distributed estimation objective, we prove almost-sure convergence to the optimal set of pdfs at each agent. Then, we prove the same for a storage-aware algorithm estimating densities only over relevant variables at each agent. Finally, we present a Gaussian version of these algorithms and implement it in a mapping problem using variational inference to handle non-linear likelihood models associated with LiDAR sensing.
Submitted 7 December, 2023; v1 submitted 2 December, 2023;
originally announced December 2023.
-
Multiple Protein Profiler 1.0 (MPP): A webserver for predicting and visualizing physiochemical properties of proteins at the proteome level
Authors:
Gustavo Sganzerla Martinez,
Mansi Dutt,
Anuj Kumar,
David J Kelvin
Abstract:
Determining the physicochemical properties of a protein can reveal important insights into its structure, biological functions, stability, and interactions with other molecules. Although tools for computing properties of proteins already exist, we could not find a comprehensive tool that enables the calculation of multiple properties for multiple input proteins at the proteome level at once. Facing this limitation, we have developed Multiple Protein Profiler (MPP) 1.0 as an integrated tool that allows the profiling of 12 individual properties of multiple proteins in a significant manner. MPP provides a tabular and graphic visualization of properties of multiple proteins. The tool is freely accessible at https://mproteinprofiler.microbiologyandimmunology.dal.ca/
Submitted 17 November, 2023;
originally announced December 2023.
-
RecolorCloud: A Point Cloud Tool for Recoloring, Segmentation, and Conversion
Authors:
Esteban Segarra Martinez,
Ryan P. McMahan
Abstract:
Point clouds are a 3D space representation of an environment that was recorded with a high precision laser scanner. These scanners can suffer from environmental interference such as surface shading, texturing, and reflections. Because of this, point clouds may be contaminated with fake or incorrect colors. Current open source or proprietary tools offer limited or no access to correcting these visual errors automatically.
RecolorCloud is a tool developed to resolve these color conflicts through automated recoloring. We offer the ability to delete or recolor outlier points automatically, with users only needing to specify bounding box regions to affect colors. Results show a vast improvement in the photo-realistic quality of large point clouds. Additionally, users can quickly recolor a point cloud with set semantic segmentation colors.
Submitted 19 October, 2023;
originally announced October 2023.
-
Distributed Variational Inference for Online Supervised Learning
Authors:
Parth Paritosh,
Nikolay Atanasov,
Sonia Martinez
Abstract:
Developing efficient solutions for inference problems in intelligent sensor networks is crucial for the next generation of location, tracking, and mapping services. This paper develops a scalable distributed probabilistic inference algorithm that applies to continuous variables, intractable posteriors and large-scale real-time data in sensor networks. In a centralized setting, variational inference is a fundamental technique for performing approximate Bayesian estimation, in which an intractable posterior density is approximated with a parametric density. Our key contribution lies in the derivation of a separable lower bound on the centralized estimation objective, which enables distributed variational inference with one-hop communication in a sensor network. Our distributed evidence lower bound (DELBO) consists of a weighted sum of observation likelihood and divergence to prior densities, and its gap to the measurement evidence is due to consensus and modeling errors. To solve binary classification and regression problems while handling streaming data, we design an online distributed algorithm that maximizes DELBO, and specialize it to Gaussian variational densities with non-linear likelihoods. The resulting distributed Gaussian variational inference (DGVI) efficiently inverts a $1$-rank correction to the covariance matrix. Finally, we derive a diagonalized version for online distributed inference in high-dimensional models, and apply it to multi-robot probabilistic mapping using indoor LiDAR data.
Submitted 22 October, 2023; v1 submitted 5 September, 2023;
originally announced September 2023.
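The abstract above mentions efficiently inverting a rank-one correction to a matrix. A common way to do this is the Sherman-Morrison identity, (A + u v^T)^{-1} = A^{-1} - (A^{-1} u v^T A^{-1}) / (1 + v^T A^{-1} u), which updates a known inverse in O(n^2) operations instead of recomputing it. The sketch below shows the identity in plain Python; whether DGVI uses exactly this update is an assumption, and the function name is hypothetical.

```python
def sherman_morrison(a_inv, u, v):
    """Return the inverse of (A + u v^T), given a_inv = A^{-1}.

    a_inv is a list of rows; u and v are lists of length n.
    """
    n = len(u)
    # A^{-1} u (column vector) and v^T A^{-1} (row vector).
    a_inv_u = [sum(a_inv[i][k] * u[k] for k in range(n)) for i in range(n)]
    v_a_inv = [sum(v[k] * a_inv[k][j] for k in range(n)) for j in range(n)]
    denom = 1.0 + sum(v[i] * a_inv_u[i] for i in range(n))
    if abs(denom) < 1e-12:
        raise ValueError("rank-one update makes the matrix singular")
    return [
        [a_inv[i][j] - a_inv_u[i] * v_a_inv[j] / denom for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # A = diag(2, 4), so A^{-1} = diag(0.5, 0.25); update with u = v = [1, 1].
    a_inv = [[0.5, 0.0], [0.0, 0.25]]
    updated = sherman_morrison(a_inv, [1.0, 1.0], [1.0, 1.0])
    for row in updated:
        print(row)
```

In the usage example, A + u v^T is [[3, 1], [1, 5]] with determinant 14, so the result can be checked by hand against [[5/14, -1/14], [-1/14, 3/14]].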
-
Incorporation of Eye-Tracking and Gaze Feedback to Characterize and Improve Radiologist Search Patterns of Chest X-rays: A Randomized Controlled Clinical Trial
Authors:
Carolina Ramirez-Tamayo,
Syed Hasib Akhter Faruqui,
Stanford Martinez,
Angel Brisco,
Nicholas Czarnek,
Adel Alaeddini,
Jeffrey R. Mock,
Edward J. Golob,
Kal L. Clark
Abstract:
Diagnostic errors in radiology often occur due to incomplete visual assessments by radiologists, despite their knowledge of predicting disease classes. This insufficiency is possibly linked to the absence of required training in search patterns. Additionally, radiologists lack consistent feedback on their visual search patterns, relying on ad-hoc strategies and peer input to minimize errors and enhance efficiency, leading to suboptimal patterns and potential false negatives. This study aimed to use eye-tracking technology to analyze radiologist search patterns, quantify performance using established metrics, and assess the impact of an automated feedback-driven educational framework on detection accuracy. Ten residents participated in a controlled trial focused on detecting suspicious pulmonary nodules. They were divided into an intervention group (received automated feedback) and a control group. Results showed that the intervention group exhibited a 38.89% absolute improvement in detecting suspicious-for-cancer nodules, surpassing the control group's improvement (5.56%, p-value=0.006). Improvement was more rapid over the four training sessions (p-value=0.0001). However, other metrics such as speed, search pattern heterogeneity, distractions, and coverage did not show significant changes. In conclusion, implementing an automated feedback-driven educational framework improved radiologist accuracy in detecting suspicious nodules. The study underscores the potential of such systems in enhancing diagnostic performance and reducing errors. Further research and broader implementation are needed to consolidate these promising results and develop effective training strategies for radiologists, ultimately benefiting patient outcomes.
Submitted 4 August, 2023;
originally announced August 2023.
-
Discrimination of Radiologists Utilizing Eye-Tracking Technology and Machine Learning: A Case Study
Authors:
Stanford Martinez,
Carolina Ramirez-Tamayo,
Syed Hasib Akhter Faruqui,
Kal L. Clark,
Adel Alaeddini,
Nicholas Czarnek,
Aarushi Aggarwal,
Sahra Emamzadeh,
Jeffrey R. Mock,
Edward J. Golob
Abstract:
Perception-related errors comprise most diagnostic mistakes in radiology. To mitigate this problem, radiologists employ personalized and high-dimensional visual search strategies, otherwise known as search patterns. Qualitative descriptions of these search patterns, which involve the physician verbalizing or annotating the order he/she analyzes the image, can be unreliable due to discrepancies in what is reported versus the actual visual patterns. This discrepancy can interfere with quality improvement interventions and negatively impact patient care. This study presents a novel discretized feature encoding based on spatiotemporal binning of fixation data for efficient geometric alignment and temporal ordering of eye movement when reading chest X-rays. The encoded features of the eye-fixation data are employed by machine learning classifiers to discriminate between faculty and trainee radiologists. We include a clinical trial case study utilizing the Area Under the Curve (AUC), Accuracy, F1, Sensitivity, and Specificity metrics for class separability to evaluate the discriminability between the two subjects in regard to their level of experience. We then compare the classification performance to state-of-the-art methodologies. A repeatability experiment using a separate dataset, experimental protocol, and eye tracker was also performed using eight subjects to evaluate the robustness of the proposed approach. The numerical results from both experiments demonstrate that classifiers employing the proposed feature encoding methods outperform the current state-of-the-art in differentiating between radiologists in terms of experience level. This signifies the potential impact of the proposed method for identifying radiologists' level of expertise and those who would benefit from additional training.
Submitted 4 August, 2023;
originally announced August 2023.
-
On the Complexity of the Bipartite Polarization Problem: from Neutral to Highly Polarized Discussions
Authors:
Teresa Alsinet,
Josep Argelich,
Ramón Béjar,
Santi Martínez
Abstract:
The Bipartite Polarization Problem is an optimization problem where the goal is to find the most highly polarized bipartition on a weighted and labelled graph that represents a debate developed through some social network, where nodes represent users' opinions and edges represent agreement or disagreement between users. This problem can be seen as a generalization of the maxcut problem, and in previous work approximate solutions and exact solutions have been obtained for real instances obtained from Reddit discussions, showing that such real instances seem to be very easy to solve. In this paper, we investigate further the complexity of this problem, by introducing an instance generation model where a single parameter controls the polarization of the instances in such a way that this correlates with the average complexity to solve those instances. The average complexity results we obtain are consistent with our hypothesis: the higher the polarization of the instance, the easier it is to find the corresponding polarized bipartition.
Submitted 21 July, 2023;
originally announced July 2023.
-
Quantum Search Approaches to Sampling-Based Motion Planning
Authors:
Paul Lathrop,
Beth Boardman,
Sonia Martínez
Abstract:
In this paper, we present a novel formulation of traditional sampling-based motion planners as database-oracle structures that can be solved via quantum search algorithms. We consider two complementary scenarios: for simpler sparse environments, we formulate the Quantum Full Path Search Algorithm (q-FPS), which creates a superposition of full random path solutions, manipulates probability amplitudes with Quantum Amplitude Amplification (QAA), and quantum measures a single obstacle free full path solution. For dense unstructured environments, we formulate the Quantum Rapidly Exploring Random Tree algorithm, q-RRT, that creates quantum superpositions of possible parent-child connections, manipulates probability amplitudes with QAA, and quantum measures a single reachable state, which is added to a tree. As performance depends on the number of oracle calls and the probability of measuring good quantum states, we quantify how these errors factor into the probabilistic completeness properties of the algorithm. We then numerically estimate the expected number of database solutions to provide an approximation of the optimal number of oracle calls in the algorithm. We compare the q-RRT algorithm with a classical implementation and verify quadratic run-time speedup in the largest connected component of a 2D dense random lattice. We conclude by evaluating a proposed approach to limit the expected number of database solutions and thus limit the optimal number of oracle calls to a given number.
Submitted 23 October, 2023; v1 submitted 10 April, 2023;
originally announced April 2023.
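The abstract above ties the optimal number of oracle calls to the number of database solutions. For textbook amplitude amplification over N items with M marked solutions, the rotation angle satisfies sin(theta) = sqrt(M / N), the success probability after k iterations is sin^2((2k + 1) * theta), and a near-optimal iteration count is floor(pi / (4 * theta)). The sketch below computes these standard quantities; it is not the paper's q-RRT or q-FPS procedure, and the function names are assumptions for this example.

```python
import math


def rotation_angle(n_items, n_solutions):
    """Angle theta with sin(theta) = sqrt(M / N)."""
    if not 0 < n_solutions <= n_items:
        raise ValueError("need 0 < n_solutions <= n_items")
    return math.asin(math.sqrt(n_solutions / n_items))


def optimal_iterations(n_items, n_solutions):
    """Near-optimal number of amplification iterations, floor(pi / (4 theta))."""
    return math.floor(math.pi / (4.0 * rotation_angle(n_items, n_solutions)))


def success_probability(n_items, n_solutions, iterations):
    """Probability of measuring a marked item after the given iterations."""
    theta = rotation_angle(n_items, n_solutions)
    return math.sin((2 * iterations + 1) * theta) ** 2


if __name__ == "__main__":
    n, m = 1024, 1
    k = optimal_iterations(n, m)
    print(k, success_probability(n, m, k))
```

Because the iteration count scales as sqrt(N / M), an estimate of the number of solutions M is needed to pick k, which matches the abstract's motivation for estimating the expected number of database solutions.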
-
Robot Navigation in Risky, Crowded Environments: Understanding Human Preferences
Authors:
Aamodh Suresh,
Angelique Taylor,
Laurel D. Riek,
Sonia Martinez
Abstract:
Risky and crowded environments (RCE) contain abstract sources of risk and uncertainty, which are perceived differently by humans, leading to a variety of behaviors. Thus, robots deployed in RCEs need to exhibit diverse perception and planning capabilities in order to interpret other human agents' behavior and act accordingly in such environments. To understand this problem domain, we conducted a study to explore human path choices in RCEs, enabling better robotic navigational explainable AI (XAI) designs. We created a novel COVID-19 pandemic grocery shopping scenario which had time-risk tradeoffs, and acquired users' path preferences. We found that participants showcase a variety of path preferences: from risky and urgent to safe and relaxed. To model users' decision making, we evaluated three popular risk models: Cumulative Prospect Theory (CPT), Conditional Value at Risk (CVaR), and Expected Risk (ER). We found that CPT captured people's decision making more accurately than CVaR and ER, corroborating theoretical results that CPT is more expressive and inclusive than CVaR and ER. We also found that people's self assessments of risk and time-urgency do not correlate with their path preferences in RCEs. Finally, we conducted thematic analysis of open-ended questions, providing crucial design insights for robots in RCEs. Thus, through this study, we provide novel and critical insights about human behavior and perception to help design better navigational explainable AI (XAI) in RCEs.
Submitted 14 March, 2023;
originally announced March 2023.
-
Fine-Tuned Convex Approximations of Probabilistic Reachable Sets under Data-driven Uncertainties
Authors:
Pengcheng Wu,
Sonia Martinez,
Jun Chen
Abstract:
This paper proposes a mechanism to fine-tune convex approximations of probabilistic reachable sets (PRS) of uncertain dynamic systems. We consider the case of unbounded uncertainties, for which it may be impossible to find a bounded reachable set of the system. Instead, we seek a PRS that bounds system states with high confidence. Our data-driven approach builds on a kernel density estimator (KDE) accelerated by a fast Fourier transform (FFT), which is customized to model the uncertainties and obtain the PRS efficiently. However, the non-convex shape of the PRS can make it impractical for subsequent optimal designs. Motivated by this, we formulate a mixed integer nonlinear programming (MINLP) problem whose solution is an optimal $n$-sided convex polygon that approximates the PRS. Leveraging this formulation, we propose a heuristic algorithm to find this convex set efficiently while ensuring accuracy. The algorithm is tested on comprehensive case studies that demonstrate its near-optimality, accuracy, efficiency, and robustness. The benefits of this work pave the way for promising applications to safety-critical, real-time motion planning of uncertain dynamic systems.
Submitted 3 February, 2024; v1 submitted 2 March, 2023;
originally announced March 2023.
-
Do Multi-Document Summarization Models Synthesize?
Authors:
Jay DeYoung,
Stephanie C. Martinez,
Iain J. Marshall,
Byron C. Wallace
Abstract:
Multi-document summarization entails producing concise synopses of collections of inputs. For some applications, the synopsis should accurately \emph{synthesize} inputs with respect to a key property or aspect. For example, a synopsis of film reviews all written about a particular movie should reflect the average critic consensus. As a more consequential example, consider narrative summaries that accompany biomedical \emph{systematic reviews} of clinical trial results. These narratives should fairly summarize the potentially conflicting results from individual trials.
In this paper we ask: To what extent do modern multi-document summarization models implicitly perform this type of synthesis? To assess this we perform a suite of experiments that probe the degree to which conditional generation models trained for summarization using standard methods yield outputs that appropriately synthesize inputs. We find that existing models do partially perform synthesis, but do so imperfectly. In particular, they are over-sensitive to changes in input ordering and under-sensitive to changes in input compositions (e.g., the ratio of positive to negative movie reviews). We propose a simple, general method for improving model synthesis capabilities by generating an explicitly diverse set of candidate outputs, and then selecting from these the string best aligned with the expected aggregate measure for the inputs, or \emph{abstaining} when the model produces no good candidate. This approach improves model synthesis performance. We hope that highlighting the need for synthesis (in some summarization settings) motivates further research into multi-document summarization methods and learning objectives that explicitly account for the need to synthesize.
Submitted 31 January, 2023;
originally announced January 2023.
-
A ResNet is All You Need? Modeling A Strong Baseline for Detecting Referable Diabetic Retinopathy in Fundus Images
Authors:
Tomás Castilla,
Marcela S. Martínez,
Mercedes Leguía,
Ignacio Larrabide,
José Ignacio Orlando
Abstract:
Deep learning is currently the state-of-the-art for automated detection of referable diabetic retinopathy (DR) from color fundus photographs (CFP). While the general interest is put on improving results through methodological innovations, it is not clear how well these approaches perform compared to standard deep classification models trained with the appropriate settings. In this paper we propose to model a strong baseline for this task based on a simple and standard ResNet-18 architecture. To this end, we built on top of prior art by training the model with a standard preprocessing strategy but using images from several public sources and an empirically calibrated data augmentation setting. To evaluate its performance, we covered multiple clinically relevant perspectives, including image and patient level DR screening, discriminating responses by input quality and DR grade, assessing model uncertainties and analyzing its results in a qualitative manner. With no other methodological innovation than a carefully designed training, our ResNet model achieved an AUC = 0.955 (0.953 - 0.956) on a combined test set of 61007 test images from different public datasets, which is in line with or even better than what other more complex deep learning models reported in the literature. Similar AUC values were obtained in 480 images from two separate in-house databases specially prepared for this study, which emphasizes its generalization ability. This confirms that standard networks can still be strong baselines for this task if properly trained.
Submitted 6 October, 2022;
originally announced October 2022.
-
Diversifying the Genomic Data Science Research Community
Authors:
The Genomic Data Science Community Network,
Rosa Alcazar,
Maria Alvarez,
Rachel Arnold,
Mentewab Ayalew,
Lyle G. Best,
Michael C. Campbell,
Kamal Chowdhury,
Katherine E. L. Cox,
Christina Daulton,
Youping Deng,
Carla Easter,
Karla Fuller,
Shazia Tabassum Hakim,
Ava M. Hoffman,
Natalie Kucher,
Andrew Lee,
Joslynn Lee,
Jeffrey T. Leek,
Robert Meller,
Loyda B. Méndez,
Miguel P. Méndez-González,
Stephen Mosher,
Michele Nishiguchi,
Siddharth Pratap
, et al. (13 additional authors not shown)
Abstract:
Over the last 20 years, there has been an explosion of genomic data collected for disease association, functional analyses, and other large-scale discoveries. At the same time, there have been revolutions in cloud computing that enable computational and data science research, while making data accessible to anyone with a web browser and an internet connection. However, students at institutions with limited resources have received relatively little exposure to curricula or professional development opportunities that lead to careers in genomic data science. To broaden participation in genomics research, the scientific community needs to support students, faculty, and administrators at Underserved Institutions (UIs) including Community Colleges, Historically Black Colleges and Universities, Hispanic-Serving Institutions, and Tribal Colleges and Universities in taking advantage of these tools in local educational and research programs. We have formed the Genomic Data Science Community Network (http://www.gdscn.org/) to identify opportunities and support broadening access to cloud-enabled genomic data science. Here, we provide a summary of the priorities for faculty members at UIs, as well as administrators, funders, and R1 researchers to consider as we create a more diverse genomic data science community.
Submitted 9 June, 2022; v1 submitted 20 January, 2022;
originally announced January 2022.
-
Risk-perception-aware control design under dynamic spatial risks
Authors:
Aamodh Suresh,
Sonia Martinez
Abstract:
This work proposes a novel risk-perception-aware (RPA) control design using non-rational perception of risks associated with uncertain dynamic spatial costs. We use Cumulative Prospect Theory (CPT) to model the risk perception of a decision maker (DM) and use it to construct perceived risk functions that transform the uncertain dynamic spatial cost into deterministic perceived risks of a DM. These risks are then used to build safety sets which can represent risk-averse to risk-insensitive perception. We define notions of "inclusiveness" and "versatility" based on safety sets and use them to compare CPT with other models such as Conditional Value at Risk (CVaR) and Expected Risk (ER). We theoretically prove that CPT is the most "inclusive" and "versatile" model of the lot in the context of risk-perception-aware controls. We further use the perceived risk function along with ideas from control barrier functions (CBF) to construct a class of perceived risk CBFs. For a class of truncated-Gaussian costs, we find sufficient geometric conditions for the validity of this class of CBFs, thus guaranteeing safety. Then, we generate perceived-safety-critical controls using a quadratic program (QP) to guide an agent safely according to a given perceived risk model. We present simulations in a 2D environment to illustrate the performance of the proposed controller.
Submitted 9 September, 2021;
originally announced September 2021.
-
Requirements-Aided Automatic Test Case Generation for Industrial Cyber-physical Systems
Authors:
Roopak Sinha,
Cheng Pang,
Gerardo Santillán Martínez,
Juha Kuronen,
Valeriy Vyatkin
Abstract:
Industrial cyber-physical systems require complex distributed software to orchestrate many heterogeneous mechatronic components and control multiple physical processes. Industrial automation software is typically developed in a model-driven fashion where abstractions of physical processes called plant models are co-developed and iteratively refined along with the control code. Testing such multi-dimensional systems is extremely difficult because often models might not be accurate, do not correspond accurately with subsequent refinements, and the software must eventually be tested on the real plant, especially in safety-critical systems like nuclear plants. This paper proposes a framework wherein high-level functional requirements are used to automatically generate test cases for designs at all abstraction levels in the model-driven engineering process. Requirements are initially specified in natural language and then analyzed and specified using a formalized ontology. The requirements ontology is then refined along with controller and plant models during design and development stages such that test cases can be generated automatically at any stage. A representative industrial water process system case study illustrates the strengths of the proposed formalism. The requirements meta-model proposed by the CESAR European project is used for requirements engineering while IEC 61131-3 and model-driven concepts are used in the design and development phases. A tool resulting from the proposed framework called REBATE (Requirements Based Automatic Testing Engine) is used to generate and execute test cases for increasingly concrete controller and plant models.
Submitted 16 August, 2021;
originally announced August 2021.
-
Online Optimization and Learning in Uncertain Dynamical Environments with Performance Guarantees
Authors:
Dan Li,
Dariush Fooladivanda,
Sonia Martinez
Abstract:
We propose a new framework to solve online optimization and learning problems in unknown and uncertain dynamical environments. This framework enables us to simultaneously learn the uncertain dynamical environment while making online decisions in a quantifiably robust manner. The main technical approach relies on the theory of distributionally robust optimization that leverages adaptive probabilistic ambiguity sets. However, as defined, the ambiguity set usually leads to online intractable problems, and the first part of our work is directed at finding reformulations in the form of online convex problems for two sub-classes of objective functions. To solve the resulting problems in the proposed framework, we further introduce an online version of the Nesterov accelerated-gradient algorithm. We determine how the proposed solution system achieves a probabilistic regret bound under certain conditions. Two applications illustrate the applicability of the proposed framework.
Submitted 17 February, 2021;
originally announced February 2021.
-
Achieving Security and Privacy in Federated Learning Systems: Survey, Research Challenges and Future Directions
Authors:
Alberto Blanco-Justicia,
Josep Domingo-Ferrer,
Sergio Martínez,
David Sánchez,
Adrian Flanagan,
Kuan Eeik Tan
Abstract:
Federated learning (FL) allows a server to learn a machine learning (ML) model across multiple decentralized clients that privately store their own training data. In contrast with centralized ML approaches, FL saves computation to the server and does not require the clients to outsource their private data to the server. However, FL is not free of issues. On the one hand, the model updates sent by the clients at each training epoch might leak information on the clients' private data. On the other hand, the model learnt by the server may be subjected to attacks by malicious clients; these security attacks might poison the model or prevent it from converging. In this paper, we first examine security and privacy attacks to FL and critically survey solutions proposed in the literature to mitigate each attack. Afterwards, we discuss the difficulty of simultaneously achieving security and privacy protection. Finally, we sketch ways to tackle this open problem and attain both security and privacy.
Submitted 12 December, 2020;
originally announced December 2020.
-
Leveraging Technology for Healthcare and Retaining Access to Personal Health Data to Enhance Personal Health and Well-being
Authors:
Ayan Chatterjee,
Ali Shahaab,
Martin W. Gerdes,
Santiago Martinez,
Pankaj Khatiwada
Abstract:
Health data is a sensitive category of personal data. It might result in a high risk to individual and health information handling rights and opportunities unless there is a palatable defense. Reasonable security standards are needed to protect electronic health records (EHR). All personal data handling needs adequate explanation. Maintaining access to medical data even in the developing world would favor health and well-being across the world. Unfortunately, there are still countries that hinder the portability of medical records. Numerous occurrences have shown that it still takes weeks for medical data to be ported from one general physician (GP) to another. Cross-border portability is nearly impossible due to the lack of technical infrastructure and standardization. We demonstrate the difficulty of the portability of medical records with some example case studies as a collaborative engagement exercise through a data mapping process to describe how different people and datapoints interact and to evaluate EHR portability techniques. We then propose a blockchain-based EHR system that allows secure, cross-border sharing of medical data. The ethical and technical challenges around having such a system are also discussed in this study.
Submitted 20 October, 2020;
originally announced October 2020.
-
Frequency Regulation with Heterogeneous Energy Resources: A Realization using Distributed Control
Authors:
Tor Anderson,
Manasa Muralidharan,
Priyank Srivastava,
Hamed Valizadeh Haghi,
Jorge Cortes,
Jan Kleissl,
Sonia Martinez,
Byron Washom
Abstract:
This paper presents one of the first real-life demonstrations of coordinated and distributed resource control for secondary frequency response in a power distribution grid. We conduct a series of tests with up to 69 heterogeneous active devices consisting of air handling units, unidirectional and bidirectional electric vehicle charging stations, a battery energy storage system, and 107 passive devices consisting of building loads and photovoltaic generators. Actuation commands for the test devices are obtained by solving an economic dispatch problem at every regulation instant using distributed ratio-consensus, primal-dual, and Newton-like algorithms. The distributed control setup consists of a set of Raspberry Pi end-points exchanging messages via an ethernet switch. The problem formulation minimizes the sum of device costs while tracking the setpoints provided by the system operator. We demonstrate accurate and fast real-time distributed computation of the optimization solution and effective tracking of the regulation signal by measuring physical device outputs over 40-minute time horizons. We also perform an economic benefit analysis which confirms eligibility to participate in an ancillary services market and demonstrates up to $53K of potential annual revenue for the selected population of devices.
Submitted 4 February, 2021; v1 submitted 15 July, 2020;
originally announced July 2020.
-
Measuring privacy in smart metering anonymized data
Authors:
Santi Martínez,
Francesc Sebé,
Christoph Sorge
Abstract:
In recent years, many proposals have arisen from research on privacy in smart metering. In one of the considered approaches, referred to as anonymization, smart meters transmit fine-grained electricity consumption values in such a way that the energy supplier cannot exactly determine their provenance. This paper measures the real privacy provided by such an approach by taking into account that at the end of a billing period the energy supplier collects the overall electricity consumption of each meter for billing purposes. An entropy-based measure is proposed for quantifying privacy and determining the extent to which knowledge of the overall consumption of meters allows re-identifying anonymous fine-grained consumption values.
Submitted 12 February, 2020;
originally announced February 2020.
-
Data-driven Predictive Control for a Class of Uncertain Control-Affine Systems
Authors:
Dan Li,
Dariush Fooladivanda,
Sonia Martinez
Abstract:
This paper studies data-driven predictive control for a class of control-affine systems subject to uncertainty. With access to finite sample measurements of the uncertain variables, we aim to find controls which are feasible and provide superior performance guarantees with high probability. This results in the formulation of a stochastic optimization problem (P), which is intractable due to the unknown distribution of the uncertainty variables. By developing a distributionally robust optimization framework, we present an equivalent and yet tractable reformulation of (P). Further, we propose an efficient algorithm that provides online suboptimal data-driven solutions and guarantees performance with high probability. To illustrate the effectiveness of the proposed approach, we consider a highway speed-limit control problem. We then develop a set of data-driven speed controls that allow us to prevent traffic congestion with high probability. Finally, we employ the resulting control method on a traffic simulator to illustrate the effectiveness of this approach numerically.
Submitted 29 April, 2021; v1 submitted 22 November, 2019;
originally announced November 2019.
-
Universal One-Dimensional Cellular Automata Derived for Turing Machines and its Dynamical Behaviour
Authors:
Sergio J. Martinez,
Ivan M. Mendoza,
Genaro J. Martinez,
Shigeru Ninagawa
Abstract:
Universality in cellular automata theory is a central problem studied and developed from their origins by John von Neumann. In this paper, we present an algorithm where any Turing machine can be converted to a one-dimensional cellular automaton with a 2-linear time and display its spatial dynamics. Three particular Turing machines are converted into three universal one-dimensional cellular automata: binary sum, rule 110 and a universal reversible Turing machine.
Submitted 6 July, 2019;
originally announced July 2019.
-
Planning under non-rational perception of uncertain spatial costs
Authors:
Aamodh Suresh,
Sonia Martinez
Abstract:
This work investigates the design of risk-perception-aware motion-planning strategies that incorporate non-rational perception of risks associated with uncertain spatial costs. Our proposed method employs the Cumulative Prospect Theory (CPT) to generate a perceived risk map over a given environment. CPT-like perceived risks and path-length metrics are then combined to define a cost function that is compliant with the requirements of asymptotic optimality of sampling-based motion planners (RRT*). The modeling power of CPT is illustrated in theory and in simulation, along with a comparison to other risk perception models like Conditional Value at Risk (CVaR). Theoretically, we define a notion of expressiveness for a risk perception model and show that CPT's is higher than that of CVaR and expected risk. We then show that this expressiveness translates to our path planning setting, where we observe that a planner equipped with CPT together with a simultaneous perturbation stochastic approximation (SPSA) method can better approximate arbitrary paths in an environment. Additionally, we show in simulation that our planner captures a rich set of meaningful paths, representative of different risk perceptions in a custom environment. We then compare the performance of our planner with T-RRT* (a planner for continuous cost spaces) and Risk-RRT* (a risk-aware planner for dynamic human obstacles) through simulations in cluttered and dynamic environments respectively, showing the advantage of our proposed planner.
Submitted 20 October, 2020; v1 submitted 4 April, 2019;
originally announced April 2019.
-
How to Avoid Reidentification with Proper Anonymization
Authors:
David Sánchez,
Sergio Martínez,
Josep Domingo-Ferrer
Abstract:
De Montjoye et al. claimed that most individuals can be reidentified from a deidentified transaction database and that anonymization mechanisms are not effective against reidentification. We demonstrate that anonymization can be performed by techniques well established in the literature.
Submitted 3 August, 2018;
originally announced August 2018.
-
Superresolution method for data deconvolution by superposition of point sources
Authors:
Sandra Martínez,
Oscar E. Martínez
Abstract:
In this work we present a new algorithm for data deconvolution that allows the retrieval of the target function with super-resolution using a simple approach: after a precise measurement of the instrument response function (IRF), the measured data are fit by a superposition of point sources (SUPPOSe) of equal intensity. In this manner only the positions of the sources need to be determined by an algorithm that minimizes the norm of the difference between the measured data and the convolution of the superposed point sources with the IRF. An upper bound for the uncertainty in the position of the sources was derived and two very different experimental situations were used for the test (an optical spectrum and fluorescent microscopy images), showing excellent reconstructions and agreement with the predicted uncertainties, achieving λ/10 resolution for the microscope and a fivefold improvement in the spectral resolution for the spectrometer. The method also provides a way to determine the optimum number of sources to be used for the fit.
Submitted 5 December, 2018; v1 submitted 8 May, 2018;
originally announced May 2018.
-
Gesture based Human-Swarm Interactions for Formation Control using interpreters
Authors:
Aamodh Suresh,
Sonia Martinez
Abstract:
We propose a novel Human-Swarm Interaction (HSI) framework which enables the user to control a swarm shape and formation. The user commands the swarm utilizing just arm gestures and motions which are recorded by an off-the-shelf wearable armband. We propose a novel interpreter system, which acts as an intermediary between the user and the swarm to simplify the user's role in the interaction. The interpreter takes in a high level input drawn using gestures by the user, and translates it into low level swarm control commands. This interpreter employs machine learning, Kalman filtering and optimal control techniques to translate the user input into swarm control parameters. A notion of Human Interpretable dynamics is introduced, which is used by the interpreter for planning as well as to provide feedback to the user. The dynamics of the swarm are controlled using a novel decentralized formation controller based on distributed linear iterations and dynamic average consensus. The framework is demonstrated theoretically as well as experimentally in a 2D environment, with a human controlling a swarm of simulated robots in real time.
Submitted 23 April, 2018;
originally announced April 2018.
-
Cooperative Robot Localization Using Event-triggered Estimation
Authors:
Michael Ouimet,
David Iglesias,
Nisar Ahmed,
Sonia Martinez
Abstract:
This paper describes a novel communication-spare cooperative localization algorithm for a team of mobile unmanned robotic vehicles. Exploiting an event-based estimation paradigm, robots only send measurements to neighbors when the expected innovation for state estimation is high. Since agents know the event-triggering condition for measurements to be sent, the lack of a measurement is thus also informative and fused into state estimates. The robots use a Covariance Intersection (CI) mechanism to occasionally synchronize their local estimates of the full network state. In addition, heuristic balancing dynamics on the robots' CI-triggering thresholds ensure that, in large diameter networks, the local error covariances remain below desired bounds across the network. Simulations on both linear and nonlinear dynamics/measurement models show that the event-triggering approach achieves nearly optimal state estimation performance in a wide range of operating conditions, even when using only a fraction of the communication cost required by conventional full data sharing. The robustness of the proposed approach to lossy communications, as well as the relationship between network topology and CI-based synchronization requirements, are also examined.
Submitted 20 February, 2018;
originally announced February 2018.
-
Cooperation risk and Nash equilibrium: quantitative description for realistic players
Authors:
G. M. Nakamura,
G. S. Contesini,
A. S. Martinez
Abstract:
The emergence of cooperation figures among the main goals of game theory in competitive-cooperative environments. Potential games have long been hinted at as viable alternatives to study realistic player behavior. Here, we expand the potential games approach by taking into account the inherent risks of cooperation. We show that the Public Goods game reduces to a Hamiltonian with one-body operators, with the correct Nash equilibrium as the ground state. The inclusion of punishments in the Public Goods game reduces the cooperation risk, creating two-body interactions with a rich phase diagram, where phase transitions segregate the cooperative from the competitive regimes.
Submitted 19 January, 2018;
originally announced January 2018.
-
Weight Design of Distributed Approximate Newton Algorithms for Constrained Optimization
Authors:
Tor Anderson,
Chin-Yao Chang,
Sonia Martinez
Abstract:
Motivated by economic dispatch and linearly-constrained resource allocation problems, this paper proposes a novel Distributed Approx-Newton algorithm that approximates the standard Newton optimization method. A main property of this distributed algorithm is that it only requires agents to exchange constant-size communication messages. The convergence of this algorithm is discussed and rigorously analyzed. In addition, we aim to address the problem of designing communication topologies and weightings that are optimal for second-order methods. To this end, we propose an effective approximation which is loosely based on completing the square to address the NP-hard bilinear optimization involved in the design. Simulations demonstrate that our proposed weight design applied to the Distributed Approx-Newton algorithm has a superior convergence property compared to existing weighted and distributed first-order gradient descent methods.
Submitted 22 March, 2017;
originally announced March 2017.
-
Server assisted distributed cooperative localization over unreliable communication links
Authors:
Solmaz S. Kia,
Jonathan Hechtbauer,
David Gogokhiya,
Sonia Martinez
Abstract:
This paper considers the problem of cooperative localization (CL) using inter-robot measurements for a group of networked robots with limited on-board resources. We propose a novel recursive algorithm in which each robot localizes itself in a global coordinate frame by local dead reckoning, and opportunistically corrects its pose estimate whenever it receives a relative measurement update message from a server. The computation and storage cost per robot in terms of the size of the team is of order O(1), and the robots are only required to transmit information when they are involved in a relative measurement. The server also only needs to compute and transmit update messages when it receives an inter-robot measurement. We show that under perfect communication, our algorithm is an alternative but exact implementation of a joint CL for the entire team via Extended Kalman Filter (EKF). The perfect communication however is not a hard requirement. In fact, we show that our algorithm is intrinsically robust with respect to communication failures, with formal guarantees that the updated estimates of the robots receiving the update message are of minimum variance in a first-order approximate sense at that given timestep. We demonstrate the performance of the algorithm in simulation and experiments.
Submitted 24 December, 2017; v1 submitted 1 August, 2016;
originally announced August 2016.
-
t-Closeness through Microaggregation: Strict Privacy with Enhanced Utility Preservation
Authors:
Jordi Soria-Comas,
Josep Domingo-Ferrer,
David Sánchez,
Sergio Martínez
Abstract:
Microaggregation is a technique for disclosure limitation aimed at protecting the privacy of data subjects in microdata releases. It has been used as an alternative to generalization and suppression to generate $k$-anonymous data sets, where the identity of each subject is hidden within a group of $k$ subjects. Unlike generalization, microaggregation perturbs the data and this additional masking freedom allows improving data utility in several ways, such as increasing data granularity, reducing the impact of outliers and avoiding discretization of numerical data. $k$-Anonymity, on the other side, does not protect against attribute disclosure, which occurs if the variability of the confidential values in a group of $k$ subjects is too small. To address this issue, several refinements of $k$-anonymity have been proposed, among which $t$-closeness stands out as providing one of the strictest privacy guarantees. Existing algorithms to generate $t$-close data sets are based on generalization and suppression (they are extensions of $k$-anonymization algorithms based on the same principles). This paper proposes and shows how to use microaggregation to generate $k$-anonymous $t$-close data sets. The advantages of microaggregation are analyzed, and then several microaggregation algorithms for $k$-anonymous $t$-closeness are presented and empirically evaluated.
Submitted 9 December, 2015;
originally announced December 2015.
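To make the microaggregation idea in the abstract above concrete, here is a minimal sketch of univariate fixed-size microaggregation: values are sorted, partitioned into groups of at least $k$ records, and each value is replaced by its group mean. The function name `microaggregate` and the grouping rule (the last group absorbs any leftover records) are assumptions made for illustration; this is not the paper's algorithm and it does nothing to enforce $t$-closeness.

```python
def microaggregate(values, k):
    """Replace each value by the mean of its group of >= k nearest-ranked values.

    Illustrative sketch only: groups are formed by sorting and cutting into
    consecutive blocks of size k; the final block absorbs the remainder so
    that no group has fewer than k members.
    """
    n = len(values)
    if k < 1 or n < k:
        raise ValueError("need k >= 1 and at least k values")

    # Indices of the records in ascending order of value.
    order = sorted(range(n), key=lambda i: values[i])
    result = [0.0] * n

    start = 0
    while start < n:
        end = start + k
        # If fewer than k records would remain, merge them into this group.
        if n - end < k:
            end = n
        group = order[start:end]
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            result[i] = mean
        start = end
    return result


masked = microaggregate([1, 2, 3, 10, 11, 12, 13], k=3)
```

Every masked value is shared by at least $k$ records, which is the property that ties microaggregation to $k$-anonymity when it is applied to the quasi-identifier attributes.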
-
Utility-Preserving Differentially Private Data Releases Via Individual Ranking Microaggregation
Authors:
David Sánchez,
Josep Domingo-Ferrer,
Sergio Martínez,
Jordi Soria-Comas
Abstract:
Being able to release and exploit open data gathered in information systems is crucial for researchers, enterprises and society as a whole. Yet, these data must be anonymized before release to protect the privacy of the subjects to whom the records relate. Differential privacy is a privacy model for anonymization that offers more robust privacy guarantees than previous models, such as $k$-anonymity and its extensions. However, it is often disregarded that the utility of differentially private outputs is quite limited, either because of the amount of noise that needs to be added to obtain them or because utility is only preserved for a restricted type and/or a limited number of queries. In contrast, $k$-anonymity-like data releases make no assumptions on the uses of the protected data and, thus, do not restrict the number and type of feasible analyses. Recently, some authors have proposed mechanisms to offer general-purpose differentially private data releases. This paper extends such works with a specific focus on the preservation of the utility of the protected data. Our proposal builds on microaggregation-based anonymization, which is more flexible and utility-preserving than alternative anonymization methods used in the literature, in order to reduce the amount of noise needed to satisfy differential privacy. In this way, we improve the utility of differentially private data releases. Moreover, the noise reduction we achieve does not depend on the size of the data set, but just on the number of attributes to be protected, which is a more desirable behavior for large data sets. The utility benefits brought by our proposal are empirically evaluated and compared with related works for several data sets and metrics.
Submitted 16 December, 2015; v1 submitted 9 December, 2015;
originally announced December 2015.
-
Supplementary Materials for "How to Avoid Reidentification with Proper Anonymization" - Comment on "Unique in the shopping mall: on the reidentifiability of credit card metadata"
Authors:
David Sánchez,
Sergio Martínez,
Josep Domingo-Ferrer
Abstract:
The study by De Montjoye et al. ("Science", 30 January 2015, p. 536) claimed that most individuals can be reidentified from a deidentified credit card transaction database and that anonymization mechanisms are not effective against reidentification. Such claims deserve detailed quantitative scrutiny, as they might seriously undermine the willingness of data owners and subjects to share data for research. In a recent Technical Comment published in "Science" (18 March 2016, p. 1274), we demonstrate that the reidentification risk reported by De Montjoye et al. was significantly overestimated (due to a misunderstanding of the reidentification attack) and that the alleged ineffectiveness of anonymization is due to the choice of poor and undocumented methods and to a general disregard of 40 years of anonymization literature. The technical comment also shows how to properly anonymize data, in order to reduce unequivocal reidentifications to zero while retaining even more analytical utility than with the poor anonymization mechanisms employed by De Montjoye et al. In conclusion, data owners, subjects and users can be reassured that sound privacy models and anonymization methods exist to produce safe and useful anonymized data.
Supplementary materials detailing the data sets, algorithms and extended results of our study are available here. Moreover, unlike De Montjoye et al.'s data set, which was never made available, our data, anonymized results, and anonymization algorithms can be freely downloaded from http://crises-deim.urv.cat/opendata/SPD_Science.zip
Submitted 18 March, 2016; v1 submitted 18 November, 2015;
originally announced November 2015.
-
Cooperative localization for mobile agents: a recursive decentralized algorithm based on Kalman filter decoupling
Authors:
Solmaz S. Kia,
Stephen Rounds,
Sonia Martinez
Abstract:
We consider a cooperative localization technique for mobile agents with communication and computation capabilities. We start by providing an overview of different decentralization strategies in the literature, with special focus on how these algorithms maintain an account of intrinsic correlations between the state estimates of team members. Then, we present a novel decentralized cooperative localization algorithm that is a decentralized implementation of a centralized Extended Kalman Filter for cooperative localization. In this algorithm, instead of propagating cross-covariance terms, each agent propagates new intermediate local variables that can be used in an update stage to create the required propagated cross-covariance terms. Whenever there is a relative measurement in the network, the algorithm declares the agent making this measurement as the interim master. By acquiring information from the interim landmark (the agent the relative measurement is taken from), the interim master can calculate and broadcast a set of intermediate variables, which each robot can then use to update its estimates to match those of a centralized Extended Kalman Filter for cooperative localization. Once an update is done, no further communication is needed until the next relative measurement.
Submitted 5 October, 2015; v1 submitted 21 May, 2015;
originally announced May 2015.
-
Navigating MazeMap: indoor human mobility, spatio-logical ties and future potential
Authors:
Gergely Biczok,
Santiago Diez Martinez,
Thomas Jelle,
John Krogstie
Abstract:
Global navigation systems and location-based services have found their way into our daily lives. Recently, indoor positioning techniques have also been proposed, and there are several live or trial systems already operating. In this paper, we present insights from MazeMap, the first live indoor/outdoor positioning and navigation system deployed at a large university campus in Norway. Our main contribution is a measurement case study; we show the spatial and temporal distribution of MazeMap geo-location and wayfinding requests, construct the aggregated human mobility map of the campus and find strong logical ties between different locations. On one hand, our findings are specific to the venue; on the other hand, the nature of available data and insights coupled with our discussion on potential usage scenarios for indoor positioning and location-based services predict a successful future for these systems and applications.
Submitted 21 January, 2014;
originally announced January 2014.
-
Satellite image classification and segmentation using non-additive entropy
Authors:
Lucas Assirati,
Alexandre Souto Martinez,
Odemir Martinez Bruno
Abstract:
Here we compare the Boltzmann-Gibbs-Shannon (standard) entropy with the Tsallis entropy on the pattern recognition and segmentation of coloured images obtained by satellites, via "Google Earth". By segmentation we mean splitting an image to locate regions of interest. Here, we discriminate and define image partition classes according to a training basis. This training basis consists of three pattern classes: aquatic, urban and vegetation regions. Our numerical experiments demonstrate that the Tsallis entropy, used as a feature vector composed of distinct entropic indexes $q$, outperforms the standard entropy. There are several applications of our proposed methodology, since satellite images can be used to monitor migration from rural to urban regions, agricultural activities, oil spreading on the ocean, etc.
Submitted 10 January, 2014;
originally announced January 2014.
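As a rough illustration of the two entropies compared in the abstract above, the sketch below computes the Boltzmann-Gibbs-Shannon entropy $S = -\sum_i p_i \ln p_i$ and the Tsallis entropy $S_q = (1 - \sum_i p_i^q)/(q - 1)$ of a discrete probability distribution, and stacks several values of $q$ into a feature vector. Treating the distribution as a normalized image histogram, the helper name `entropy_features`, and the particular $q$ values are assumptions for illustration; the abstract does not specify them.

```python
import math


def shannon_entropy(p):
    """Boltzmann-Gibbs-Shannon entropy (natural log) of a distribution p."""
    return -sum(x * math.log(x) for x in p if x > 0)


def tsallis_entropy(p, q):
    """Tsallis entropy S_q = (1 - sum_i p_i^q) / (q - 1).

    The limit q -> 1 recovers the Shannon entropy, so that case is
    delegated to shannon_entropy to avoid dividing by zero.
    """
    if abs(q - 1.0) < 1e-12:
        return shannon_entropy(p)
    return (1.0 - sum(x ** q for x in p if x > 0)) / (q - 1.0)


def entropy_features(p, qs):
    """Hypothetical helper: one Tsallis entropy per entropic index q."""
    return [tsallis_entropy(p, q) for q in qs]


# Assumed example: a uniform 4-bin histogram and three arbitrary q values.
uniform = [0.25, 0.25, 0.25, 0.25]
features = entropy_features(uniform, [0.5, 1.0, 2.0])
```

A classifier would then be trained on such feature vectors; which classifier and which range of $q$ the authors used is not stated in the abstract.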
-
IACTalks: an on-line archive of astronomy-related seminars
Authors:
Johan H. Knapen,
Jorge A. Pérez Prieto,
Tariq Shahbaz,
Anna Ferré-Mateu,
Nicola Caon,
Cristina Ramos Almeida,
Brandon Tingley,
Valentina Luridiana,
Inés Flores-Cacho,
Orlagh Creevey,
Arturo Manchado Torres,
Ignacio Trujillo,
Maria Rosa Zapatero Osorio,
Francisco Sánchez Martínez,
Francisco López Molina,
Gabriel Pérez Díaz,
Miguel Briganti,
Inés Bonet
Abstract:
We present IACTalks, a free and open access seminars archive (http://iactalks.iac.es) aimed at promoting astronomy and the exchange of ideas by providing high-quality scientific seminars to the astronomical community. The archive of seminars and talks given at the Instituto de Astrofísica de Canarias goes back to 2008. Over 360 talks and seminars are now freely available by streaming over the internet. We describe the user interface, which includes two video streams, one showing the speaker, the other the presentation. A search function is available, and seminars are indexed by keywords and in some cases by series, such as special training courses or the 2011 Winter School of Astrophysics, on secular evolution of galaxies. The archive is made available as an open resource, to be used by scientists and the public.
Submitted 27 June, 2012;
originally announced June 2012.
-
Fast, parallel and secure cryptography algorithm using Lorenz's attractor
Authors:
Anderson Gonçalves Marco,
Alexandre Souto Martinez,
Odemir Martinez Bruno
Abstract:
A novel cryptography method based on the Lorenz attractor chaotic system is presented. The proposed algorithm is secure and fast, making it practical for general use. We introduce the chaotic operation mode, which provides an interaction among the password, message and a chaotic system. It ensures that the algorithm yields a secure codification, even if the nature of the chaotic system is known. The algorithm has been implemented in two versions: one sequential and slow, and the other parallel and fast. Our algorithm assures the integrity of the ciphertext (we know if it has been altered, which is not assured by traditional algorithms) and consequently its authenticity. Numerical experiments are presented and discussed, and they show the behavior of the method in terms of security and performance. The fast version of the algorithm has a performance comparable to AES, a popular cryptography program used commercially nowadays, but it is more secure, which makes it immediately suitable for general purpose cryptography applications. An internet page has been set up, which enables the readers to test the algorithm and also to try to break into the cipher in.
Submitted 15 January, 2012;
originally announced January 2012.
-
Complex network classification using partially self-avoiding deterministic walks
Authors:
Wesley Nunes Gonçalves,
Alexandre Souto Martinez,
Odemir Martinez Bruno
Abstract:
Complex networks have attracted increasing interest from various fields of science. It has been demonstrated that each complex network model presents specific topological structures which characterize its connectivity and dynamics. Complex network classification relies on the use of representative measurements that model topological structures. Although there are a large number of measurements, most of them are correlated. To overcome this limitation, this paper presents a new measurement for complex network classification based on partially self-avoiding walks. We validate the measurement on a data set composed of 40,000 complex networks of four well-known models. Our results indicate that the proposed measurement improves correct classification of networks compared to the traditional ones.
Submitted 16 February, 2012; v1 submitted 23 December, 2011;
originally announced December 2011.
-
Universality in Bibliometrics
Authors:
Roberto da Silva,
Fahad Kalil,
Alexandre Souto Martinez,
Jose Palazzo Moreira de Oliveira
Abstract:
Many discussions have enlarged the literature in Bibliometrics since Hirsch's proposal of the so-called $h$-index. Ranking papers according to their citations, this index quantifies a researcher solely by the largest number $h$ of papers that are each cited at least $h$ times. A closed formula for the $h$-index distribution that can be applied to distinct databases is not yet known. In fact, to obtain such a distribution, knowledge of the citation distribution of the authors and its specificities is required. Instead of dealing with randomly chosen researchers, here we address different groups based on distinct databases. The first group is composed of physicists and biologists, with data extracted from the Institute for Scientific Information (ISI). The second group is composed of computer scientists, whose data were extracted from Google Scholar. In this paper, we obtain a general formula for the $h$-index probability density function (pdf) for groups of authors by using generalized exponentials in the context of escort probability. Our analysis includes the use of several statistical methods to estimate the necessary parameters. In addition, an exhaustive comparison among the possible candidate distributions is used to describe the way the citations are distributed among authors. The $h$-index pdf should be used to classify groups of researchers from a quantitative point of view, which is useful for eliminating obscure qualitative methods.
Submitted 11 November, 2011;
originally announced November 2011.
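For readers unfamiliar with the $h$-index mentioned in the abstract above, here is a minimal sketch of the standard definition: the largest $h$ such that at least $h$ papers have at least $h$ citations each. The function name `h_index` and the sample citation counts are illustrative assumptions; the sketch does not reproduce the paper's probability density formula.

```python
def h_index(citations):
    """Return the largest h such that h papers have >= h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break  # counts only decrease from here, so h cannot grow
    return h


# Made-up citation counts for a single hypothetical author.
example = h_index([10, 8, 5, 4, 3])
```

Applying `h_index` to every author in a group gives the empirical sample whose distribution a pdf like the one discussed in the abstract would aim to describe.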
-
Coverage control for mobile sensing networks
Authors:
J. Cortes,
S. Martinez,
T. Karatas,
F. Bullo
Abstract:
This paper presents control and coordination algorithms for groups of vehicles. The focus is on autonomous vehicle networks performing distributed sensing tasks where each vehicle plays the role of a mobile tunable sensor. The paper proposes gradient descent algorithms for a class of utility functions which encode optimal coverage and sensing policies. The resulting closed-loop behavior is adaptive, distributed, asynchronous, and verifiably correct.
Submitted 16 December, 2002;
originally announced December 2002.
-
Computational problems for vector-valued quadratic forms
Authors:
Francesco Bullo,
Jorge Cortes,
Andrew D. Lewis,
Sonia Martinez
Abstract:
Given two real vector spaces $U$ and $V$, and a symmetric bilinear map $B: U\times U\to V$, let $Q_B$ be its associated quadratic map. The problems we consider are as follows: (i) are there necessary and sufficient conditions, checkable in polynomial time, for determining when $Q_B$ is surjective?; (ii) if $Q_B$ is surjective, given $v\in V$ is there a polynomial-time algorithm for finding a point $u\in Q_B^{-1}(v)$?; (iii) are there necessary and sufficient conditions, checkable in polynomial time, for determining when $B$ is indefinite? We present an alternative formulation of the problem of determining the image of a vector-valued quadratic form in terms of the unprojectivised Veronese surface. The relation of these questions with several interesting problems in Control Theory is illustrated.
Submitted 5 April, 2002;
originally announced April 2002.