-
STL: Still Tricky Logic (for System Validation, Even When Showing Your Work)
Authors:
Isabelle Hurley,
Rohan Paleja,
Ashley Suh,
Jaime D. Peña,
Ho Chit Siu
Abstract:
As learned control policies become increasingly common in autonomous systems, there is an increasing need to ensure that they are interpretable and can be checked by human stakeholders. Formal specifications have been proposed as ways to produce human-interpretable policies for autonomous systems that can still be learned from examples. Previous work showed that despite claims of interpretability, humans are unable to use formal specifications presented in a variety of ways to validate even simple robot behaviors. This work uses active learning, a standard pedagogical method, to attempt to improve humans' ability to validate policies in signal temporal logic (STL). Results show that overall validation accuracy is not high, at $65\% \pm 15\%$ (mean $\pm$ standard deviation), and that the three conditions of no active learning, active learning, and active learning with feedback do not significantly differ from each other. Our results suggest that the utility of formal specifications for human interpretability is still unsupported but point to other avenues of development which may enable improvements in system validation.
Submitted 2 July, 2024;
originally announced July 2024.
-
Why Would You Suggest That? Human Trust in Language Model Responses
Authors:
Manasi Sharma,
Ho Chit Siu,
Rohan Paleja,
Jaime D. Peña
Abstract:
The emergence of Large Language Models (LLMs) has revealed a growing need for human-AI collaboration, especially in creative decision-making scenarios where trust and reliance are paramount. Through human studies and model evaluations on the open-ended News Headline Generation task from the LaMP benchmark, we analyze how the framing and presence of explanations affect user trust and model performance. Overall, we provide evidence that adding an explanation in the model response to justify its reasoning significantly increases self-reported user trust in the model when the user has the opportunity to compare various responses. Position and faithfulness of these explanations are also important factors. However, these gains disappear when users are shown responses independently, suggesting that humans trust all model responses, including deceptive ones, equitably when they are shown in isolation. Our findings urge future research to delve deeper into the nuanced evaluation of trust in human-machine teaming systems.
Submitted 4 June, 2024;
originally announced June 2024.
-
Classical integrability in the presence of a cosmological constant: analytic and machine learning results
Authors:
Gabriel Lopes Cardoso,
Damián Mayorga Peña,
Suresh Nampuri
Abstract:
We study the integrability of two-dimensional theories that are obtained by a dimensional reduction of certain four-dimensional gravitational theories describing the coupling of Maxwell fields and neutral scalar fields to gravity in the presence of a potential for the neutral scalar fields. By focusing on a certain solution subspace, we show that a subset of the equations of motion in two dimensions are the compatibility conditions for a modified version of the Breitenlohner-Maison linear system. Subsequently, we study the Liouville integrability of the 2D models encoding the chosen 4D solution subspace from a one-dimensional point of view by constructing Lax pair matrices. In this endeavour, we successfully employ a linear neural network to search for Lax pair matrices for these models, thereby illustrating how machine learning approaches can be effectively implemented to augment the identification of integrable structures in classical systems.
Submitted 28 April, 2024;
originally announced April 2024.
-
Risk-Aware Wasserstein Distributionally Robust Control of Vessels in Natural Waterways
Authors:
Juan Moreno Nadales,
Astghik Hakobyan,
David Muñoz de la Peña,
Daniel Limon,
Insoon Yang
Abstract:
In the realm of maritime transportation, autonomous vessel navigation in natural inland waterways faces persistent challenges due to unpredictable natural factors. Existing scheduling algorithms fall short in handling these uncertainties, compromising both safety and efficiency. Moreover, these algorithms are primarily designed for non-autonomous vessels, leading to labor-intensive operations vulnerable to human error. To address these issues, this study proposes a risk-aware motion control approach for vessels that accounts for the dynamic and uncertain nature of tide islands in a distributionally robust manner. Specifically, a model predictive control method is employed to follow the reference trajectory in the time-space map while incorporating a risk constraint to prevent grounding accidents. To address uncertainties in tide islands, a novel modeling technique represents them as stochastic polytopes. Additionally, potential inaccuracies in waterway depth are addressed through a risk constraint that considers the worst-case uncertainty distribution within a Wasserstein ambiguity set around the empirical distribution. Using sensor data collected in the Guadalquivir River, we empirically demonstrate the performance of the proposed method through simulations on a vessel. As a result, the vessel successfully navigates the waterway while avoiding grounding accidents, even with a limited dataset of observations. This stands in contrast to existing non-robust controllers, highlighting the robustness and practical applicability of the proposed approach.
Submitted 21 October, 2023;
originally announced October 2023.
-
Machine Learned Calabi-Yau Metrics and Curvature
Authors:
Per Berglund,
Giorgi Butbaia,
Tristan Hübsch,
Vishnu Jejjala,
Damián Mayorga Peña,
Challenger Mishra,
Justin Tan
Abstract:
Finding Ricci-flat (Calabi-Yau) metrics is a long-standing problem in geometry with deep implications for string theory and phenomenology. A new attack on this problem uses neural networks to engineer approximations to the Calabi-Yau metric within a given Kähler class. In this paper we investigate numerical Ricci-flat metrics over smooth and singular K3 surfaces and Calabi-Yau threefolds. Using these Ricci-flat metric approximations for the Cefalú family of quartic twofolds and the Dwork family of quintic threefolds, we study characteristic forms on these geometries. We observe that the numerical stability of the numerically computed topological characteristic is heavily influenced by the choice of the neural network model; in particular, we briefly discuss a different neural network model, namely Spectral networks, which correctly approximate the topological characteristic of a Calabi-Yau. Using persistent homology, we show that high curvature regions of the manifolds form clusters near the singular points. For our neural network approximations, we observe a Bogomolov--Yau type inequality $3c_2 \geq c_1^2$ and observe an identity when our geometries have isolated $A_1$ type singularities. We sketch a proof that $\chi(X~\smallsetminus~\mathrm{Sing}\,{X}) + 2~|\mathrm{Sing}\,{X}| = 24$ also holds for our numerical approximations.
Submitted 6 June, 2023; v1 submitted 17 November, 2022;
originally announced November 2022.
-
Evaluation of Human-AI Teams for Learned and Rule-Based Agents in Hanabi
Authors:
Ho Chit Siu,
Jaime D. Pena,
Edenna Chen,
Yutai Zhou,
Victor J. Lopez,
Kyle Palko,
Kimberlee C. Chang,
Ross E. Allen
Abstract:
Deep reinforcement learning has generated superhuman AI in competitive games such as Go and StarCraft. Can similar learning techniques create a superior AI teammate for human-machine collaborative games? Will humans prefer AI teammates that improve objective team performance or those that improve subjective metrics of trust? In this study, we perform a single-blind evaluation of teams of humans and AI agents in the cooperative card game Hanabi, with both rule-based and learning-based agents. In addition to the game score, used as an objective metric of the human-AI team performance, we also quantify subjective measures of the human's perceived performance, teamwork, interpretability, trust, and overall preference of AI teammate. We find that humans have a clear preference toward a rule-based AI teammate (SmartBot) over a state-of-the-art learning-based AI teammate (Other-Play) across nearly all subjective metrics, and generally view the learning-based agent negatively, despite no statistical difference in the game score. This result has implications for future AI design and reinforcement learning benchmarking, highlighting the need to incorporate subjective metrics of human-AI teaming rather than a singular focus on objective task performance.
Submitted 21 October, 2021; v1 submitted 15 July, 2021;
originally announced July 2021.
-
Neural Network Approximations for Calabi-Yau Metrics
Authors:
Vishnu Jejjala,
Damian Kaloni Mayorga Pena,
Challenger Mishra
Abstract:
Ricci flat metrics for Calabi-Yau threefolds are not known analytically. In this work, we employ techniques from machine learning to deduce numerical flat metrics for the Fermat quintic, for the Dwork quintic, and for the Tian-Yau manifold. This investigation employs a single neural network architecture that is capable of approximating Ricci flat Kaehler metrics for several Calabi-Yau manifolds of dimensions two and three. We show that measures that assess the Ricci flatness of the geometry decrease after training by three orders of magnitude. This is corroborated on the validation set, where the improvement is more modest. Finally, we demonstrate that discrete symmetries of manifolds can be learned in the process of learning the metric.
Submitted 27 January, 2021; v1 submitted 31 December, 2020;
originally announced December 2020.
-
Los perfiles de investigación y su implantación en la Universidad Publica de Navarra
Authors:
Manuel Ruiz de Luzuriaga Peña,
Isabel Muñoz Mouriño,
Mercedes Bogino Larrambebere
Abstract:
This work aims to monitor and control the presence of UPNA research staff on the main research profile platforms, not only the most obvious ones such as Google Scholar Citations, ResearcherID, Scopus ID and ORCID, but also other services that, in practice, function as research profiles, such as Mendeley, LinkedIn, ResearchGate, Academia.edu and Academica-e. We also analyze that presence and how it varies with variables such as department, gender, job category and research group. In this study we have excluded some platforms for different reasons. Dialnet profiles are entered by the UPNA library (BUPNA), which means that everyone who meets the requirements for inclusion should be there, so analyzing them does not make much sense, since their presence depends on factors outside the researcher's control. The same is the case with the UPNA Scientific Production Portal (PPC): the data is entered by the Vicerrectorado de Investigación and should include all members of the UPNA PDI. Using the census of university research staff provided by the Vicerrectorado de Investigación as a base, we verified, for each author, whether a profile exists in each of the services studied. The results were tabulated in an Excel file for later analysis. The data were collected in March 2018 for ORCID, ResearcherID, Scopus ID, Google Scholar Citations and Mendeley. In November 2018, data from Academica-e, Academia.edu, ResearchGate and LinkedIn were collected. For each of the profiles, a search by institutional affiliation was used, when possible, to obtain a first list of UPNA research personnel with that profile. Subsequently, a person-by-person search was carried out for the rest of the research staff who did not appear in that first list.
Submitted 29 May, 2020;
originally announced May 2020.
-
Baryons from Mesons: A Machine Learning Perspective
Authors:
Yarin Gal,
Vishnu Jejjala,
Damian Kaloni Mayorga Pena,
Challenger Mishra
Abstract:
Quantum chromodynamics (QCD) is the theory of the strong interaction. The fundamental particles of QCD, quarks and gluons, carry colour charge and form colourless bound states at low energies. The hadronic bound states of primary interest to us are the mesons and the baryons. From knowledge of the meson spectrum, we use neural networks and Gaussian processes to predict the masses of baryons with 90.3% and 96.6% accuracy, respectively. These results compare favourably to the constituent quark model. We also predict the masses of pentaquarks and other exotic hadrons.
Submitted 23 March, 2020;
originally announced March 2020.
-
Understanding complex predictive models with Ghost Variables
Authors:
Pedro Delicado,
Daniel Peña
Abstract:
We propose a procedure for assigning a relevance measure to each explanatory variable in a complex predictive model. We assume that we have a training set to fit the model and a test set to check the out-of-sample performance. First, the individual relevance of each variable is computed by comparing the predictions in the test set given by the model that includes all the variables with those of another model in which the variable of interest is substituted by its ghost variable, defined as the prediction of this variable from the rest of the explanatory variables. Second, we check the joint effects among the variables by using the eigenvalues of a relevance matrix that is the covariance matrix of the vectors of individual effects. It is shown that in simple models, such as linear or additive models, the proposed measures are related to standard measures of significance of the variables, and that in neural network models (and in other algorithmic prediction models) the procedure provides information about the joint and individual effects of the variables that is not usually available from other methods. The procedure is illustrated with simulated examples and the analysis of a large real data set.
Submitted 14 February, 2020; v1 submitted 13 December, 2019;
originally announced December 2019.
-
From Brain Imaging to Graph Analysis: a study on ADNI's patient cohort
Authors:
Rui Zhang,
Luca Giancardo,
Danilo A. Pena,
Yejin Kim,
Hanghang Tong,
Xiaoqian Jiang
Abstract:
In this paper, we studied the association between changes in structural brain volumes and the potential development of Alzheimer's disease (AD). Using a simple abstraction technique, we converted regional cortical and subcortical volume differences over two time points for each study subject into a graph. We then obtained substructures of interest using a graph decomposition algorithm in order to extract pivotal nodes via multi-view feature selection. Intensive experiments using robust classification frameworks were conducted to evaluate the performance of using the brain substructures obtained under different thresholds. The results indicated that compact substructures acquired by examining the differences between patient groups were sufficient to discriminate between AD and healthy controls with an area under the receiver operating characteristic curve of 0.72.
Submitted 14 May, 2019;
originally announced May 2019.
-
It All Matters: Reporting Accuracy, Inference Time and Power Consumption for Face Emotion Recognition on Embedded Systems
Authors:
Jelena Milosevic,
Dexmont Pena,
Andrew Forembsky,
David Moloney,
Miroslaw Malek
Abstract:
While several approaches to the face emotion recognition task have been proposed in the literature, none of them reports the power consumption or inference time required to run the system in an embedded environment. Without adequate knowledge of these factors, it is not clear whether we are actually able to provide accurate face emotion recognition in an embedded environment and, if not, how far we are from making it feasible and what the biggest bottlenecks are.
The main goal of this paper is to answer these questions and to convey the message that power consumption and inference time should be reported alongside detection accuracy, since the real usability of the proposed systems and their adoption in human-computer interaction strongly depend on them. In this paper, we identify the state-of-the-art face emotion recognition methods that are potentially suitable for embedded environments and the most frequently used datasets for this task. Our study shows that most of the performed experiments use datasets with posed expressions or a particular experimental setup with special conditions for image collection. Since our goal is to evaluate the performance of the identified promising methods in a realistic scenario, we collect a new dataset with non-exaggerated emotions and use it, in addition to the publicly available datasets, to evaluate detection accuracy, power consumption, and inference time on three frequently used embedded devices with different computational capabilities. Our results show that gray images are still more suitable for embedded environments than color ones and that, for most of the analyzed systems, inference time, energy consumption, or both are limiting factors for their adoption in real-life embedded applications.
Submitted 29 June, 2018;
originally announced July 2018.
-
Distance weighted discrimination of face images for gender classification
Authors:
Mónica Benito,
Eduardo García-Portugués,
J. S. Marron,
Daniel Peña
Abstract:
We illustrate the advantages of distance weighted discrimination for classification and feature extraction in a High Dimension Low Sample Size (HDLSS) situation. The HDLSS context is a gender classification problem of face images in which the dimension of the data is several orders of magnitude larger than the sample size. We compare distance weighted discrimination with Fisher's linear discriminant, support vector machines, and principal component analysis by exploring their classification interpretation through insightful visuanimations and by examining the classifiers' discriminant errors. This analysis enables us to make new contributions to the understanding of the drivers of human discrimination between males and females.
Submitted 21 September, 2020; v1 submitted 15 June, 2017;
originally announced June 2017.
-
Parallel adaptive integration in high-performance functional Renormalization Group computations
Authors:
Julian Lichtenstein,
Jan Winkelmann,
David Sánchez de la Peña,
Toni Vidović,
Edoardo Di Napoli
Abstract:
The conceptual framework provided by the functional Renormalization Group (fRG) has become a formidable tool for studying correlated electron systems on lattices, which, in turn, has provided great insight into complex many-body phenomena such as high-temperature superconductivity or topological states of matter. In this work we present one of the latest realizations of fRG, which makes use of an adaptive numerical quadrature scheme specifically tailored to the described fRG scheme. The final result is an increase in performance thanks to improved parallelism and scalability.
Submitted 31 October, 2016;
originally announced October 2016.