-
MEMO-QCD: Quantum Density Estimation through Memetic Optimisation for Quantum Circuit Design
Authors:
Juan E. Ardila-García,
Vladimir Vargas-Calderón,
Fabio A. González,
Diego H. Useche,
Herbert Vinck-Posada
Abstract:
This paper presents a strategy for efficient quantum circuit design for density estimation. The strategy is based on a quantum-inspired algorithm for density estimation and a circuit optimisation routine based on memetic algorithms. The model maps a training dataset to a quantum state represented by a density matrix through a quantum feature map. This training state encodes the probability distribution of the dataset in a quantum state, such that the density of a new sample can be estimated by projecting its corresponding quantum state onto the training state. We propose the application of a memetic algorithm to find the architecture and parameters of a variational quantum circuit that implements the quantum feature map, along with a variational learning strategy to prepare the training state. Demonstrations of the proposed strategy show an accurate approximation of the Gaussian kernel density estimation method through shallow quantum circuits illustrating the feasibility of the algorithm for near-term quantum hardware.
Submitted 14 June, 2024; v1 submitted 12 June, 2024;
originally announced June 2024.
-
Interpreting Themes from Educational Stories
Authors:
Yigeng Zhang,
Fabio A. González,
Thamar Solorio
Abstract:
Reading comprehension continues to be a crucial research focus in the NLP community. Recent advances in Machine Reading Comprehension (MRC) have mostly centered on literal comprehension, referring to the surface-level understanding of content. In this work, we focus on the next level - interpretive comprehension, with a particular emphasis on inferring the themes of a narrative text. We introduce the first dataset specifically designed for interpretive comprehension of educational narratives, providing corresponding well-edited theme texts. The dataset spans a variety of genres and cultural origins and includes human-annotated theme keywords with varying levels of granularity. We further formulate NLP tasks under different abstractions of interpretive comprehension toward the main idea of a story. After conducting extensive experiments with state-of-the-art methods, we found the task to be both challenging and significant for NLP research. The dataset and source code have been made publicly available to the research community at https://github.com/RiTUAL-UH/EduStory.
Submitted 8 April, 2024;
originally announced April 2024.
-
FUTURE-AI: International consensus guideline for trustworthy and deployable artificial intelligence in healthcare
Authors:
Karim Lekadir,
Aasa Feragen,
Abdul Joseph Fofanah,
Alejandro F Frangi,
Alena Buyx,
Anais Emelie,
Andrea Lara,
Antonio R Porras,
An-Wen Chan,
Arcadi Navarro,
Ben Glocker,
Benard O Botwe,
Bishesh Khanal,
Brigit Beger,
Carol C Wu,
Celia Cintas,
Curtis P Langlotz,
Daniel Rueckert,
Deogratias Mzurikwao,
Dimitrios I Fotiadis,
Doszhan Zhussupov,
Enzo Ferrante,
Erik Meijering,
Eva Weicken,
Fabio A González,
et al. (95 additional authors not shown)
Abstract:
Despite major advances in artificial intelligence (AI) for medicine and healthcare, the deployment and adoption of AI technologies remain limited in real-world clinical practice. In recent years, concerns have been raised about the technical, clinical, ethical and legal risks associated with medical AI. To increase real world adoption, it is essential that medical AI tools are trusted and accepted by patients, clinicians, health organisations and authorities. This work describes the FUTURE-AI guideline as the first international consensus framework for guiding the development and deployment of trustworthy AI tools in healthcare. The FUTURE-AI consortium was founded in 2021 and currently comprises 118 inter-disciplinary experts from 51 countries representing all continents, including AI scientists, clinicians, ethicists, and social scientists. Over a two-year period, the consortium defined guiding principles and best practices for trustworthy AI through an iterative process comprising an in-depth literature review, a modified Delphi survey, and online consensus meetings. The FUTURE-AI framework was established based on 6 guiding principles for trustworthy AI in healthcare, i.e. Fairness, Universality, Traceability, Usability, Robustness and Explainability. Through consensus, a set of 28 best practices were defined, addressing technical, clinical, legal and socio-ethical dimensions. The recommendations cover the entire lifecycle of medical AI, from design, development and validation to regulation, deployment, and monitoring. FUTURE-AI is a risk-informed, assumption-free guideline which provides a structured approach for constructing medical AI tools that will be trusted, deployed and adopted in real-world practice. Researchers are encouraged to take the recommendations into account in proof-of-concept stages to facilitate future translation towards clinical practice of medical AI.
Submitted 8 July, 2024; v1 submitted 11 August, 2023;
originally announced September 2023.
-
Positive and Risky Message Assessment for Music Products
Authors:
Yigeng Zhang,
Mahsa Shafaei,
Fabio A. González,
Thamar Solorio
Abstract:
In this work, we introduce a pioneering research challenge: evaluating positive and potentially harmful messages within music products. We begin by establishing a multi-faceted, multi-task benchmark for music content assessment. We then introduce an efficient multi-task predictive model fortified with ordinality enforcement to address this challenge. Our findings reveal that the proposed method not only significantly outperforms robust task-specific alternatives but can also assess multiple aspects simultaneously. Furthermore, through detailed case studies in which we employed Large Language Models (LLMs) as surrogates for content assessment, we provide insights to inform and guide future research on this topic. The code for dataset creation and model implementation is publicly available at https://github.com/RiTUAL-UH/music-message-assessment.
Submitted 8 April, 2024; v1 submitted 18 September, 2023;
originally announced September 2023.
-
On-orbit model training for satellite imagery with label proportions
Authors:
Raúl Ramos-Pollán,
Fabio A. González
Abstract:
This work addresses the challenge of training supervised machine or deep learning models on orbiting platforms, where we are generally constrained by limited on-board hardware capabilities and restricted uplink bandwidth. We aim at enabling orbiting spacecraft to (1) continuously train a lightweight model as they acquire imagery; and (2) receive new labels while on orbit to refine or even change the predictive task being trained. For this, we consider chip level regression tasks (i.e. predicting the vegetation percentage of a 20 km$^2$ patch) when we only have coarser label proportions, such as municipality level vegetation statistics (a municipality containing several patches). Such label proportions have the additional advantage that they usually come as tabular data and are widely available in many regions of the world and application areas. This can be framed as a Learning from Label Proportions (LLP) problem setup. LLP applied to Earth Observation (EO) data is still an emerging field, and performing comparative studies in applied scenarios remains a challenge due to the lack of standardized datasets. In this work, first, we show how very simple deep learning and probabilistic methods (with $\sim$5K parameters) generally perform better than standard, more complex ones, providing a surprising level of finer grained spatial detail when trained with much coarser label proportions. Second, we publish a set of benchmarking datasets enabling comparative LLP applied to EO, providing both fine grained labels and aggregated data according to existing administrative divisions. Finally, we show how this approach fits an on-orbit training scenario by vastly reducing both the amount of computing and the size of the label sets. Source code is available at https://github.com/rramosp/llpeo.
Submitted 10 December, 2023; v1 submitted 21 June, 2023;
originally announced June 2023.
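As a rough illustration of the Learning from Label Proportions setup described in this abstract, the sketch below compares the mean of the chip-level predictions inside each bag (for example, a municipality) with the known proportion for that bag. The function name `proportion_loss`, its arguments, and the choice of a mean squared error are assumptions made for this example; the abstract does not state which loss the authors use.

```python
from collections import defaultdict


def proportion_loss(chip_predictions, chip_bags, bag_proportions):
    """Mean squared error between each bag's mean prediction and its known proportion.

    chip_predictions: per-chip predicted fractions in [0, 1] (hypothetical model output)
    chip_bags:        bag identifier of each chip, aligned with chip_predictions
    bag_proportions:  mapping from bag identifier to the coarse label proportion
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pred, bag in zip(chip_predictions, chip_bags):
        sums[bag] += pred
        counts[bag] += 1
    # Only bag-level aggregates are compared; no per-chip label is needed.
    errors = [(sums[bag] / counts[bag] - bag_proportions[bag]) ** 2 for bag in counts]
    return sum(errors) / len(errors)
```

Because the loss depends only on bag-level averages, the labels that would need to be uplinked are one number per bag rather than one per chip.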
-
Kernel Density Matrices for Probabilistic Deep Learning
Authors:
Fabio A. González,
Raúl Ramos-Pollán,
Joseph A. Gallego-Mejia
Abstract:
This paper introduces a novel approach to probabilistic deep learning, kernel density matrices, which provide a simpler yet effective mechanism for representing joint probability distributions of both continuous and discrete random variables. In quantum mechanics, a density matrix is the most general way to describe the state of a quantum system. This work extends the concept of density matrices by allowing them to be defined in a reproducing kernel Hilbert space. This abstraction allows the construction of differentiable models for density estimation, inference, and sampling, and enables their integration into end-to-end deep neural models. In doing so, we provide a versatile representation of marginal and joint probability distributions that allows us to develop a differentiable, compositional, and reversible inference procedure that covers a wide range of machine learning tasks, including density estimation, discriminative learning, and generative modeling. The broad applicability of the framework is illustrated by two examples: an image classification model that can be naturally transformed into a conditional generative model, and a model for learning with label proportions that demonstrates the framework's ability to deal with uncertainty in the training samples. The framework is implemented as a library and is available at: https://github.com/fagonzalezo/kdm.
Submitted 30 April, 2024; v1 submitted 26 May, 2023;
originally announced May 2023.
-
What are the Machine Learning best practices reported by practitioners on Stack Exchange?
Authors:
Anamaria Mojica-Hanke,
Andrea Bayona,
Mario Linares-Vásquez,
Steffen Herbold,
Fabio A. González
Abstract:
Machine Learning (ML) is being used in multiple disciplines due to its powerful capability to infer relationships within data. In particular, Software Engineering (SE) is one of those disciplines in which ML has been used for multiple tasks, like software categorization, bug prediction, and testing. In addition to the multiple ML applications, some studies have been conducted to detect and understand possible pitfalls and issues when using ML. However, to the best of our knowledge, only a few studies have focused on presenting ML best practices or guidelines for the application of ML in different domains. In addition, the practices presented in previous literature (i) are domain-specific (e.g., concrete practices in biomechanics), (ii) are few in number, or (iii) lack rigorous validation and are presented in gray literature. In this paper, we present a study listing 127 ML best practices, obtained by systematically mining 242 posts from 14 different Stack Exchange (STE) websites and validated by four independent ML experts. The list of practices is presented in a set of categories related to different stages of the implementation process of an ML-enabled system; for each practice, we include explanations and examples. In all the practices, the provided examples focus on SE tasks. We expect this list to help practitioners better understand the practices and use ML in a more informed way, in particular newcomers to this new area that sits at the intersection of software engineering and machine learning.
Submitted 25 January, 2023;
originally announced January 2023.
-
LEAN-DMKDE: Quantum Latent Density Estimation for Anomaly Detection
Authors:
Joseph Gallego-Mejia,
Oscar Bustos-Brinez,
Fabio A. González
Abstract:
This paper presents an anomaly detection model that combines the strong statistical foundation of density-estimation-based anomaly detection methods with the representation-learning ability of deep-learning models. The method combines an autoencoder, for learning a low-dimensional representation of the data, with a density-estimation model based on random Fourier features and density matrices in an end-to-end architecture that can be trained using gradient-based optimization techniques. The method predicts a degree of normality for new samples based on the estimated density. A systematic experimental evaluation was performed on different benchmark datasets. The experimental results show that the method performs on par with or outperforms other state-of-the-art methods.
Submitted 15 November, 2022;
originally announced November 2022.
-
AD-DMKDE: Anomaly Detection through Density Matrices and Fourier Features
Authors:
Oscar Bustos-Brinez,
Joseph Gallego-Mejia,
Fabio A. González
Abstract:
This paper presents a novel density estimation method for anomaly detection using density matrices (a powerful mathematical formalism from quantum mechanics) and Fourier features. The method can be seen as an efficient approximation of Kernel Density Estimation (KDE). A systematic comparison of the proposed method with eleven state-of-the-art anomaly detection methods on various data sets is presented, showing competitive performance on different benchmark data sets. The method is trained efficiently and it uses optimization to find the parameters of data embedding. The prediction phase complexity of the proposed algorithm is constant relative to the training data size, and it performs well in data sets with different anomaly rates. Its architecture allows vectorization and can be implemented on GPU/TPU hardware.
Submitted 26 October, 2022;
originally announced October 2022.
-
Deep Semi-Supervised and Self-Supervised Learning for Diabetic Retinopathy Detection
Authors:
Jose Miguel Arrieta Ramos,
Oscar Perdómo,
Fabio A. González
Abstract:
Diabetic retinopathy (DR) is one of the leading causes of blindness in the working-age population of developed countries, caused by a side effect of diabetes that reduces the blood supply to the retina. Deep neural networks have been widely used in automated systems for DR classification on eye fundus images. However, these models need a large number of annotated images. In the medical domain, annotations from experts are costly, tedious, and time-consuming; as a result, a limited number of annotated images are available. This paper presents a semi-supervised method that leverages both unlabeled and labeled images to train a model that detects diabetic retinopathy. The proposed method uses unsupervised pretraining via self-supervised learning, followed by supervised fine-tuning with a small set of labeled images and knowledge distillation to improve classification performance. This method was evaluated on the EyePACS test set and the Messidor-2 dataset, achieving 0.94 and 0.89 AUC, respectively, using only 2% of the labeled EyePACS training images.
Submitted 3 August, 2022;
originally announced August 2022.
-
Fast Kernel Density Estimation with Density Matrices and Random Fourier Features
Authors:
Joseph A. Gallego,
Juan F. Osorio,
Fabio A. González
Abstract:
Kernel density estimation (KDE) is one of the most widely used nonparametric density estimation methods. The fact that it is a memory-based method, i.e., it uses the entire training data set for prediction, makes it unsuitable for most current big data applications. Several strategies, such as tree-based or hashing-based estimators, have been proposed to improve the efficiency of the kernel density estimation method. The novel density matrix kernel density estimation method (DMKDE) uses density matrices, a quantum mechanical formalism, and random Fourier features, an explicit kernel approximation, to produce density estimates. This method has its roots in KDE and can be regarded as an approximation of it without the memory-based restriction. In this paper, we systematically evaluate the novel DMKDE algorithm and compare it with other state-of-the-art fast procedures for approximating the kernel density estimation method on different synthetic data sets. Our experimental results show that DMKDE is on par with its competitors for computing density estimates and shows advantages on high-dimensional data. We have made all the code available as an open source software repository.
Submitted 4 August, 2022; v1 submitted 1 August, 2022;
originally announced August 2022.
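The idea summarised in this abstract can be sketched in a few lines of standard-library Python. The sketch assumes a Gaussian kernel $k(x, y) = \exp(-\gamma \lVert x - y \rVert^2)$, unit-norm random Fourier features $\phi(x)$, and a density matrix $\rho$ formed by averaging $\phi(x_i)\phi(x_i)^T$ over the training set; the estimate is a normalising constant times $\phi(x)^T \rho\, \phi(x)$. All function names and parameter values here are illustrative assumptions, not the authors' released implementation.

```python
import math
import random


def make_rff(dim, n_features, gamma, rng):
    """Sample random Fourier feature parameters for k(x, y) = exp(-gamma * ||x - y||^2)."""
    std = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    return weights, offsets


def feature_map(x, weights, offsets):
    """Map x to a unit-norm random Fourier feature vector."""
    z = [math.cos(sum(w_k * x_k for w_k, x_k in zip(w, x)) + b)
         for w, b in zip(weights, offsets)]
    norm = math.sqrt(sum(v * v for v in z))
    return [v / norm for v in z]


def fit_density_matrix(samples, weights, offsets):
    """Average the outer products phi(x) phi(x)^T over the training samples."""
    d = len(weights)
    rho = [[0.0] * d for _ in range(d)]
    for x in samples:
        phi = feature_map(x, weights, offsets)
        for i in range(d):
            for j in range(d):
                rho[i][j] += phi[i] * phi[j] / len(samples)
    return rho


def estimate_density(x, rho, weights, offsets, gamma):
    """Density estimate: normalising constant times phi(x)^T rho phi(x)."""
    phi = feature_map(x, weights, offsets)
    d = len(phi)
    quad = sum(phi[i] * rho[i][j] * phi[j] for i in range(d) for j in range(d))
    return (2.0 * gamma / math.pi) ** (len(x) / 2.0) * quad


def exact_kde(x, samples, gamma):
    """Memory-based reference using the squared kernel exp(-2 * gamma * ||x - y||^2)."""
    total = sum(math.exp(-2.0 * gamma * sum((a - b) ** 2 for a, b in zip(x, s)))
                for s in samples)
    return (2.0 * gamma / math.pi) ** (len(x) / 2.0) * total / len(samples)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [[rng.gauss(0.0, 1.0)] for _ in range(100)]
    weights, offsets = make_rff(dim=1, n_features=128, gamma=1.0, rng=rng)
    rho = fit_density_matrix(data, weights, offsets)
    for point in ([0.0], [3.0]):
        # Print the approximate estimate next to the memory-based reference.
        print(point, estimate_density(point, rho, weights, offsets, 1.0),
              exact_kde(point, data, 1.0))
```

Once `rho` is fitted, a prediction costs a fixed number of operations determined by the number of features, whereas `exact_kde` has to revisit every training sample.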
-
Quantum Adaptive Fourier Features for Neural Density Estimation
Authors:
Joseph A. Gallego,
Fabio A. González
Abstract:
Density estimation is a fundamental task in statistics and machine learning applications. Kernel density estimation is a powerful tool for non-parametric density estimation in low dimensions; however, its performance is poor in higher dimensions. Moreover, its prediction complexity scales linearly with the number of training data points. This paper presents a method for neural density estimation that can be seen as a type of kernel density estimation, but without the high prediction computational complexity. The method is based on density matrices, a formalism used in quantum mechanics, and adaptive Fourier features. The method can be trained without optimization, but it can also be integrated with deep learning architectures and trained using gradient descent. Thus, it can be seen as a form of neural density estimation. The method was evaluated on different synthetic and real datasets, and its performance was compared against state-of-the-art neural density estimation methods, obtaining competitive results.
Submitted 4 August, 2022; v1 submitted 31 July, 2022;
originally announced August 2022.
-
First Measurement of the EMC Effect in $^{10}$B and $^{11}$B
Authors:
A. Karki,
D. Biswas,
F. A. Gonzalez,
W. Henry,
C. Morean,
A. Nadeeshani,
A. Sun,
D. Abrams,
Z. Ahmed,
B. Aljawrneh,
S. Alsalmi,
R. Ambrose,
D. Androic,
W. Armstrong,
J. Arrington,
A. Asaturyan,
K. Assumin-Gyimah,
C. Ayerbe Gayoso,
A. Bandari,
J. Bane,
J. Barrow,
S. Basnet,
V. Berdnikov,
H. Bhatt,
D. Bhetuwal,
et al. (72 additional authors not shown)
Abstract:
The nuclear dependence of the inclusive inelastic electron scattering cross section (the EMC effect) has been measured for the first time in $^{10}$B and $^{11}$B. Previous measurements of the EMC effect in $A \leq 12$ nuclei showed an unexpected nuclear dependence; $^{10}$B and $^{11}$B were measured to explore the EMC effect in this region in more detail. Results are presented for $^9$Be, $^{10}$B, $^{11}$B, and $^{12}$C at an incident beam energy of 10.6~GeV. The EMC effect in the boron isotopes was found to be similar to that for $^9$Be and $^{12}$C, yielding almost no nuclear dependence in the EMC effect in the range $A=4-12$. This represents important, new data supporting the hypothesis that the EMC effect depends primarily on the local nuclear environment due to the cluster structure of these nuclei.
Submitted 31 July, 2023; v1 submitted 8 July, 2022;
originally announced July 2022.
-
An Empirical Study of Quantum Dynamics as a Ground State Problem with Neural Quantum States
Authors:
Vladimir Vargas-Calderón,
Herbert Vinck-Posada,
Fabio A. González
Abstract:
We consider the Feynman-Kitaev formalism applied to a spin chain described by the transverse field Ising model. This formalism consists of building a Hamiltonian whose ground state encodes the time evolution of the spin chain at discrete time steps. To find this ground state, variational wave functions parameterised by artificial neural networks -- also known as neural quantum states (NQSs) -- are used. Our work focuses on assessing, in the context of the Feynman-Kitaev formalism, two properties of NQSs: expressivity (the possibility that variational parameters can be set to values such that the NQS is faithful to the true ground state of the system) and trainability (the process of reaching said values). We find that the considered NQSs are capable of accurately approximating the true ground state of the system, i.e., they are expressive enough ansätze. However, extensive hyperparameter tuning experiments show that, empirically, reaching the set of values for the variational parameters that correctly describe the ground state becomes ever more difficult as the number of time steps increases, because the true ground state becomes more entangled and the probability distribution starts to spread across the Hilbert space canonical basis.
Submitted 30 January, 2023; v1 submitted 18 June, 2022;
originally announced June 2022.
-
Constraints on the onset of color transparency from quasi-elastic $^{12}$C$(e,e'p)$ up to $Q^2=\,14.2\,$(GeV$/c)^2$
Authors:
D. Bhetuwal,
J. Matter,
H. Szumila-Vance,
C. Ayerbe Gayoso,
M. L. Kabir,
D. Dutta,
R. Ent,
D. Abrams,
Z. Ahmed,
B. Aljawrneh,
S. Alsalmi,
R. Ambrose,
D. Androic,
W. Armstrong,
A. Asaturyan,
K. Assumin-Gyimah,
A. Bandari,
S. Basnet,
V. Berdnikov,
H. Bhatt,
D. Biswas,
W. U. Boeglin,
P. Bosted,
E. Brash,
M. H. S. Bukhari,
et al. (65 additional authors not shown)
Abstract:
Quasi-elastic scattering on $^{12}$C$(e,e'p)$ was measured in Hall C at Jefferson Lab for space-like 4-momentum transfer squared $Q^2$ in the range of 8--14.2\,(GeV/$c$)$^2$ with proton momenta up to 8.3\,GeV/$c$. The experiment was carried out in the upgraded Hall C at Jefferson Lab. It used the existing high momentum spectrometer and the new super high momentum spectrometer to detect the scattered electrons and protons in coincidence. The nuclear transparency was extracted as the ratio of the measured yield to the yield calculated in the plane wave impulse approximation. Additionally, the transparency of the $1s_{1/2}$ and $1p_{3/2}$ shell protons in $^{12}$C was extracted, and the asymmetry of the missing momentum distribution was examined for hints of the quantum chromodynamics prediction of Color Transparency. All of these results were found to be consistent with traditional nuclear physics and inconsistent with the onset of Color Transparency.
Submitted 14 August, 2023; v1 submitted 26 May, 2022;
originally announced May 2022.
-
Optimisation-free Classification and Density Estimation with Quantum Circuits
Authors:
Vladimir Vargas-Calderón,
Fabio A. González,
Herbert Vinck-Posada
Abstract:
We demonstrate the implementation of a novel machine learning framework for probability density estimation and classification using quantum circuits. The framework maps a training data set or a single data sample to the quantum state of a physical system through quantum feature maps. The quantum state of the arbitrarily large training data set summarises its probability distribution in a finite-dimensional quantum wave function. By projecting the quantum state of a new data sample onto the quantum state of the training data set, one can derive statistics to classify or estimate the density of the new data sample. Remarkably, the implementation of our framework on a real quantum device does not require any optimisation of quantum circuit parameters. Nonetheless, we discuss a variational quantum circuit approach that could leverage quantum advantage for our framework.
Submitted 22 May, 2022; v1 submitted 27 March, 2022;
originally announced March 2022.
-
Quantum density estimation with density matrices: Application to quantum anomaly detection
Authors:
Diego H. Useche,
Oscar A. Bustos-Brinez,
Joseph A. Gallego-Mejia,
Fabio A. González
Abstract:
Density estimation is a central task in statistics and machine learning. This problem aims to determine the underlying probability density function that best aligns with an observed data set. Some of its applications include statistical inference, unsupervised learning, and anomaly detection. Despite its relevance, few works have explored the application of quantum computing to density estimation. In this article, we present a novel quantum-classical density matrix density estimation model, called Q-DEMDE, based on the expected values of density matrices and a novel quantum embedding called quantum Fourier features. The method uses quantum hardware to build probability distributions of training data via mixed quantum states. As a core subroutine, we propose a new algorithm to estimate the expected value of a mixed density matrix from its spectral decomposition on a quantum computer. In addition, we present an application of the method for quantum-classical anomaly detection. We evaluated the density estimation model with quantum random and quantum adaptive Fourier features on different data sets on a quantum simulator and a real quantum computer. An important result of this work is to show that it is possible to perform density estimation and anomaly detection with high performance on present-day quantum computers.
Submitted 18 March, 2024; v1 submitted 24 January, 2022;
originally announced January 2022.
-
A deep learning model for classification of diabetic retinopathy in eye fundus images based on retinal lesion detection
Authors:
Melissa delaPava,
Hernán Ríos,
Francisco J. Rodríguez,
Oscar J. Perdomo,
Fabio A. González
Abstract:
Diabetic retinopathy (DR) is the result of a complication of diabetes affecting the retina. It can cause blindness if left undiagnosed and untreated. An ophthalmologist performs the diagnosis by screening each patient and analyzing the retinal lesions via ocular imaging. In practice, such analysis is time-consuming and cumbersome to perform. This paper presents a model for automatic DR classification on eye fundus images. The approach identifies the main ocular lesions related to DR and subsequently diagnoses the illness. The proposed method follows the same workflow as the clinicians, providing information that can be interpreted clinically to support the prediction. A subset of the Kaggle EyePACS and the Messidor-2 datasets, labeled with ocular lesions, is made publicly available. The Kaggle EyePACS subset is used as a training set and the Messidor-2 as a test set for lesions and DR classification models. For DR diagnosis, our model has an area-under-the-curve, sensitivity, and specificity of 0.948, 0.886, and 0.875, respectively, which competes with state-of-the-art approaches.
Submitted 14 October, 2021;
originally announced October 2021.
-
Quantum Measurement Classification with Qudits
Authors:
Diego H. Useche,
Andres Giraldo-Carvajal,
Hernan M. Zuluaga-Bucheli,
Jose A. Jaramillo-Villegas,
Fabio A. González
Abstract:
This paper presents a hybrid classical-quantum program for density estimation and supervised classification. The program is implemented as a quantum circuit in a high-dimensional quantum computer simulator. We show that the proposed quantum protocols make it possible to estimate probability density functions and to make predictions in a supervised learning manner. This model can be generalized to find expected values of density matrices in high-dimensional quantum computers. Experiments on various data sets are presented. Results show that the proposed method is a viable strategy to implement supervised classification and density estimation in a high-dimensional quantum computer.
Submitted 20 July, 2021;
originally announced July 2021.
-
Prostate Tissue Grading with Deep Quantum Measurement Ordinal Regression
Authors:
Santiago Toledo-Cortés,
Diego H. Useche,
Fabio A. González
Abstract:
Prostate cancer (PCa) is one of the most common and aggressive cancers worldwide. The Gleason score (GS) system is the standard way of classifying prostate cancer and the most reliable method to determine the severity and treatment to follow. The pathologist looks at the arrangement of cancer cells in the prostate and assigns a score on a scale that ranges from 6 to 10. Automatic analysis of prostate whole-slide images (WSIs) is usually addressed as a binary classification problem, which misses the finer distinction between stages given by the GS. This paper presents a probabilistic deep learning ordinal classification method that can estimate the GS from a prostate WSI. Approaching the problem as an ordinal regression task using a differentiable probabilistic model not only improves the interpretability of the results, but also improves the accuracy of the model when compared to conventional deep classification and regression architectures.
Submitted 4 March, 2021;
originally announced March 2021.
-
Deep Bag-of-Sub-Emotions for Depression Detection in Social Media
Authors:
Juan S. Lara,
Mario Ezra Aragon,
Fabio A. Gonzalez,
Manuel Montes-y-Gomez
Abstract:
This paper presents the Deep Bag-of-Sub-Emotions (DeepBoSE), a novel deep learning model for depression detection in social media. The model is formulated such that it internally computes a differentiable Bag-of-Features (BoF) representation that incorporates emotional information. This is achieved by a reinterpretation of classical weighting schemes like term frequency-inverse document frequency into probabilistic deep learning operations. An important advantage of the proposed method is that it can be trained under the transfer learning paradigm, which is useful to enhance conventional BoF models that cannot be directly integrated into deep learning architectures. Experiments were performed on the eRisk17 and eRisk18 datasets for the depression detection task; results show that DeepBoSE outperforms conventional BoF representations and is competitive with the state of the art, achieving an F1-score over the positive class of 0.64 in eRisk17 and 0.65 in eRisk18.
Submitted 1 March, 2021;
originally announced March 2021.
-
Many-Qudit representation for the Travelling Salesman Problem Optimisation
Authors:
Vladimir Vargas-Calderón,
Nicolas Parra-A.,
Herbert Vinck-Posada,
Fabio A. González
Abstract:
We present a map from the travelling salesman problem (TSP), a prototypical NP-complete combinatorial optimisation task, to the ground state associated with a system of many-qudits. Conventionally, the TSP is cast into a quadratic unconstrained binary optimisation (QUBO) problem, that can be solved on an Ising machine. The size of the corresponding physical system's Hilbert space is $2^{N^2}$, where $N$ is the number of cities considered in the TSP. Our proposal provides a many-qudit system with a Hilbert space of dimension $2^{N\log_2N}$, which is considerably smaller than the dimension of the Hilbert space of the system resulting from the usual QUBO map. This reduction can yield a significant speedup in quantum and classical computers. We simulate and validate our proposal using variational Monte Carlo with a neural quantum state, solving the TSP in a linear layout for up to almost 100 cities.
Submitted 1 October, 2021; v1 submitted 25 February, 2021;
originally announced February 2021.
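The dimension reduction quoted in the abstract above is easiest to compare on a log scale, since $2^{N\log_2 N} = N^N$. The following is a minimal sketch that only evaluates the two exponents; the function names are hypothetical and are not taken from the paper.

```python
import math


def qubo_log2_dim(n_cities: int) -> int:
    """log2 of the Hilbert-space dimension 2^(N^2) of the usual QUBO map."""
    return n_cities ** 2


def qudit_log2_dim(n_cities: int) -> float:
    """log2 of the many-qudit dimension 2^(N log2 N), which equals N^N."""
    return n_cities * math.log2(n_cities)


if __name__ == "__main__":
    # Print both exponents for a few problem sizes.
    for n in (4, 8, 16, 100):
        print(f"N={n:>3}: QUBO 2^{qubo_log2_dim(n)}, "
              f"many-qudit 2^{qudit_log2_dim(n):.1f}")
```

The exponent grows as $N^2$ in the QUBO case and as $N\log_2 N$ in the many-qudit case, so the gap widens quickly with the number of cities.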
-
Learning with Density Matrices and Random Features
Authors:
Fabio A. González,
Alejandro Gallego,
Santiago Toledo-Cortés,
Vladimir Vargas-Calderón
Abstract:
A density matrix describes the statistical state of a quantum system. It is a powerful formalism to represent both the quantum and classical uncertainty of quantum systems and to express different statistical operations such as measurement, system combination and expectations as linear algebra operations. This paper explores how density matrices can be used as a building block for machine learning models exploiting their ability to straightforwardly combine linear algebra and probability. One of the main results of the paper is to show that density matrices coupled with random Fourier features could approximate arbitrary probability distributions over $\mathbb{R}^n$. Based on this finding the paper builds different models for density estimation, classification and regression. These models are differentiable, so it is possible to integrate them with other differentiable components, such as deep learning architectures and to learn their parameters using gradient-based optimization. In addition, the paper presents optimization-less training strategies based on estimation and model averaging. The models are evaluated in benchmark tasks and the results are reported and discussed.
Submitted 30 April, 2024; v1 submitted 8 February, 2021;
originally announced February 2021.
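The idea of combining density matrices with random Fourier features, as described in the abstract above, can be illustrated with a small standard-library sketch. It is not the authors' implementation: the function names, the choice of a Gaussian kernel $\exp(-\gamma\|x-y\|^2)$, and the unit-norm normalisation of the feature vectors are assumptions made for illustration, and the score is only proportional to a density (no normalisation constant is applied).

```python
import math
import random


def make_rff(dim_in, n_features, gamma, rng):
    """Return a feature map approximating the kernel exp(-gamma * ||x - y||^2)."""
    # Frequencies ~ N(0, 2*gamma), phases ~ U[0, 2*pi): standard random Fourier features.
    sigma = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, sigma) for _ in range(dim_in)]
               for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]

    def phi(x):
        z = [math.cos(sum(w_k * x_k for w_k, x_k in zip(w, x)) + b)
             for w, b in zip(weights, phases)]
        norm = math.sqrt(sum(v * v for v in z))
        return [v / norm for v in z]  # unit norm, like a pure quantum state

    return phi


def training_density_matrix(samples, phi):
    """Average the outer products phi(x_i) phi(x_i)^T over the training set."""
    d = len(phi(samples[0]))
    rho = [[0.0] * d for _ in range(d)]
    for x in samples:
        z = phi(x)
        for i in range(d):
            row, zi = rho[i], z[i]
            for j in range(d):
                row[j] += zi * z[j]
    n = len(samples)
    return [[v / n for v in row] for row in rho]


def density_score(x, rho, phi):
    """Unnormalised density estimate phi(x)^T rho phi(x)."""
    z = phi(x)
    return sum(zi * sum(r * zj for r, zj in zip(row, z))
               for zi, row in zip(z, rho))


if __name__ == "__main__":
    rng = random.Random(0)
    phi = make_rff(dim_in=1, n_features=64, gamma=0.5, rng=rng)
    data = [[rng.gauss(0.0, 1.0)] for _ in range(100)]
    rho = training_density_matrix(data, phi)
    for query in (0.0, 1.0, 3.0):
        print(query, density_score([query], rho, phi))
```

Because the quadratic form equals the average of the squared inner products between the query and the training feature vectors, the training step here involves no optimisation, only averaging.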
-
Ruling out color transparency in quasi-elastic $^{12}$C(e,e'p) up to $Q^2$ of 14.2 (GeV/c)$^2$
Authors:
D. Bhetuwal,
J. Matter,
H. Szumila-Vance,
M. L. Kabir,
D. Dutta,
R. Ent,
D. Abrams,
Z. Ahmed,
B. Aljawrneh,
S. Alsalmi,
R. Ambrose,
D. Androic,
W. Armstrong,
A. Asaturyan,
K. Assumin-Gyimah,
C. Ayerbe Gayoso,
A. Bandari,
S. Basnet,
V. Berdnikov,
H. Bhatt,
D. Biswas,
W. U. Boeglin,
P. Bosted,
E. Brash,
M. H. S. Bukhari
, et al. (65 additional authors not shown)
Abstract:
Quasielastic $^{12}$C$(e,e'p)$ scattering was measured at space-like 4-momentum transfer squared $Q^2$~=~8, 9.4, 11.4, and 14.2 (GeV/c)$^2$, the highest ever achieved to date. Nuclear transparency for this reaction was extracted by comparing the measured yield to that expected from a plane-wave impulse approximation calculation without any final state interactions. The measured transparency was consistent with no $Q^2$ dependence, up to proton momenta of 8.5~GeV/c, ruling out the quantum chromodynamics effect of color transparency at the measured $Q^2$ scales in exclusive $(e,e'p)$ reactions. These results impose strict constraints on models of color transparency for protons.
Submitted 1 March, 2021; v1 submitted 1 November, 2020;
originally announced November 2020.
-
Hybrid Deep Learning Gaussian Process for Diabetic Retinopathy Diagnosis and Uncertainty Quantification
Authors:
Santiago Toledo-Cortés,
Melissa De La Pava,
Oscar Perdómo,
Fabio A. González
Abstract:
Diabetic Retinopathy (DR) is one of the microvascular complications of Diabetes Mellitus, which remains one of the leading causes of blindness worldwide. Computational models based on Convolutional Neural Networks represent the state of the art for the automatic detection of DR using eye fundus images. Most of the current work addresses this problem as a binary classification task. However, including grade estimation and quantification of prediction uncertainty can potentially increase the robustness of the model. In this paper, a hybrid Deep Learning-Gaussian process method for DR diagnosis and uncertainty quantification is presented. This method combines the representational power of deep learning with the ability of Gaussian process models to generalize from small datasets. The results show that uncertainty quantification in the predictions improves the interpretability of the method as a diagnostic support tool. The source code to replicate the experiments is publicly available at https://github.com/stoledoc/DLGP-DR-Diagnosis.
Submitted 29 July, 2020;
originally announced July 2020.
-
Dissimilarity Mixture Autoencoder for Deep Clustering
Authors:
Juan S. Lara,
Fabio A. González
Abstract:
The dissimilarity mixture autoencoder (DMAE) is a neural network model for feature-based clustering that incorporates a flexible dissimilarity function and can be integrated into any kind of deep learning architecture. It internally represents a dissimilarity mixture model (DMM) that extends classical methods like K-Means, Gaussian mixture models, or Bregman clustering to any convex and differentiable dissimilarity function through the reinterpretation of probabilities as neural network representations. DMAE can be integrated with deep learning architectures into end-to-end models, allowing the simultaneous estimation of the clustering and neural network's parameters. Experimental evaluation was performed on image and text clustering benchmark datasets showing that DMAE is competitive in terms of unsupervised classification accuracy and normalized mutual information. The source code with the implementation of DMAE is publicly available at: https://github.com/juselara1/dmae
Submitted 15 July, 2021; v1 submitted 15 June, 2020;
originally announced June 2020.
-
Phase diagram reconstruction of the Bose-Hubbard model with a Restricted Boltzmann Machine wavefunction
Authors:
Vladimir Vargas-Calderón,
Herbert Vinck-Posada,
Fabio A. González
Abstract:
Recently, the use of neural quantum states for describing the ground state of many- and few-body problems has been gaining popularity because of their high expressivity and ability to handle intractably large Hilbert spaces. In particular, methods based on variational Monte Carlo have proven to be successful in describing the physics of bosonic systems such as the Bose-Hubbard (BH) model. However, this technique has not been systematically tested on the parameter space of the BH model, particularly at the boundary between the Mott insulator and superfluid phases. In this work, we evaluate the capabilities of variational Monte Carlo with a trial wavefunction given by a Restricted Boltzmann Machine to reproduce the quantum ground state of the BH model on several points of its parameter space. To benchmark the technique, we compare its results to the ground state found through exact diagonalization for small one-dimensional chains. In general, we find that the learned ground state correctly estimates many observables, reproducing to a high degree the phase diagram for the first Mott lobe and part of the second one. However, we find that the technique is challenged whenever the system transitions between excitation manifolds, as the ground state is not learned correctly at these boundaries. We improve the quality of the results produced by the technique by proposing a method to discard noisy probabilities learned in the ground state.
Submitted 6 November, 2020; v1 submitted 26 April, 2020;
originally announced April 2020.
-
Supervised Learning with Quantum Measurements
Authors:
Fabio A. González,
Vladimir Vargas-Calderón,
Herbert Vinck-Posada
Abstract:
This paper reports a novel method for supervised machine learning based on the mathematical formalism that supports quantum mechanics. The method uses projective quantum measurement as a way of building a prediction function. Specifically, the relationship between input and output variables is represented as the state of a bipartite quantum system. The state is estimated from training samples through an averaging process that produces a density matrix. Prediction of the label for a new sample is made by performing a projective measurement on the bipartite system with an operator, prepared from the new input sample, and applying a partial trace to obtain the state of the subsystem representing the output. The method can be seen as a generalization of Bayesian inference classification and as a type of kernel-based learning method. One remarkable characteristic of the method is that it does not require learning any parameters through optimization. We illustrate the method with different 2-D classification benchmark problems and different quantum information encodings.
Submitted 12 February, 2021; v1 submitted 2 April, 2020;
originally announced April 2020.
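The prediction procedure described in the abstract above (average the training states into a density matrix, project with an operator built from the new input, then take a partial trace) can be sketched in a few lines. This is an illustrative sketch only: the single-qubit feature map in `encode`, the one-hot label encoding, and all function names are assumptions, not the encodings or code used in the paper.

```python
import math


def encode(x):
    """Hypothetical feature map: each coordinate in [0, 1] becomes one qubit state."""
    state = [1.0]
    for v in x:
        qubit = (math.cos(math.pi * v / 2), math.sin(math.pi * v / 2))
        state = [a * q for a in state for q in qubit]  # tensor product
    return state


def train_state(xs, ys, n_classes):
    """Average the projectors |psi_i><psi_i|, with psi_i = phi(x_i) (x) onehot(y_i)."""
    dim = len(encode(xs[0])) * n_classes
    rho = [[0.0] * dim for _ in range(dim)]
    for x, y in zip(xs, ys):
        psi = [0.0] * dim
        for k, amp in enumerate(encode(x)):
            psi[k * n_classes + y] = amp  # index (k, y) of the bipartite state
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += psi[i] * psi[j] / len(xs)
    return rho


def predict(x, rho, n_classes):
    """Project with |phi(x)><phi(x)| (x) I, trace out the input, normalise the diagonal."""
    phi = encode(x)
    probs = []
    for c in range(n_classes):
        # Diagonal entry (c, c) of the reduced state of the output subsystem.
        p = sum(phi[k] * rho[k * n_classes + c][l * n_classes + c] * phi[l]
                for k in range(len(phi)) for l in range(len(phi)))
        probs.append(p)
    total = sum(probs)
    return [p / total for p in probs]


if __name__ == "__main__":
    xs = [[0.1, 0.1], [0.2, 0.1], [0.9, 0.9], [0.8, 0.9]]
    ys = [0, 0, 1, 1]
    rho = train_state(xs, ys, n_classes=2)
    print(predict([0.15, 0.1], rho, n_classes=2))
```

No parameters are fitted in this sketch: training is the averaging step in `train_state`, which mirrors the abstract's remark that the method does not require optimisation.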
-
Second cohomology group of the finite-dimensional simple Jordan superalgebra $\mathcal{D}_{t}$, $t\neq 0$
Authors:
F. A. Gomez Gonzalez,
J. A. Ramirez Bermudez
Abstract:
The second cohomology group (SCG) of the Jordan superalgebra $\mathcal{D}_{t}$, $t\neq 0$, is calculated by using the coefficients which appear in the regular superbimodule $\mathrm{Reg}\mathcal{D}_t$. Contrary to the case of algebras, this group is nontrivial thanks to the non-splitting caused by the Wedderburn Decomposition Theorem \cite{Faber1}. First, to calculate the SCG of a Jordan superalgebra we use the split-null extension of the Jordan superalgebra and the Jordan superalgebra representation. We prove conditions satisfied by the bilinear forms $h$ that determine the SCG in Jordan superalgebras. We use these to calculate the SCG for the Jordan superalgebra $\mathcal{D}_{t}$, $t\neq 0$. Finally, we prove that $\mathcal{H}^2(\mathcal{D}_{t}, \textrm{Reg}\mathcal{D}_{t})=0\oplus\mathbb{F}^2$, $t\neq 0$.
Submitted 28 January, 2020;
originally announced January 2020.
-
Jordan super algebras of type $JP_n$, $n\geq 3$ and the Wedderburn principal theorem
Authors:
F. A. Gomez Gonzalez,
J. A. Ramirez Bermudez
Abstract:
We investigate an analogue to the Wedderburn Principal Theorem (WPT) for a finite-dimensional Jordan superalgebra $J$ with solvable radical $N$ such that $N^2=0$ and $J/N\cong JP_n$, $n\geq 3$.
We consider $N$ as an irreducible $JP_n$-bimodule and we prove that the WPT holds for $J$.
Submitted 21 January, 2020;
originally announced January 2020.
-
Modeling Noisiness to Recognize Named Entities using Multitask Neural Networks on Social Media
Authors:
Gustavo Aguilar,
A. Pastor López-Monroy,
Fabio A. González,
Thamar Solorio
Abstract:
Recognizing named entities in a document is a key task in many NLP applications. Although current state-of-the-art approaches to this task reach a high performance on clean text (e.g. newswire genres), those algorithms dramatically degrade when they are moved to noisy environments such as social media domains. We present two systems that address the challenges of processing social media data using character-level phonetics and phonology, word embeddings, and Part-of-Speech tags as features. The first model is a multitask end-to-end Bidirectional Long Short-Term Memory (BLSTM)-Conditional Random Field (CRF) network whose output layer contains two CRF classifiers. The second model uses a multitask BLSTM network as feature extractor that transfers the learning to a CRF classifier for the final prediction. Our systems outperform the current F1 scores of the state of the art on the Workshop on Noisy User-generated Text 2017 dataset by 2.45% and 3.69%, establishing a more suitable approach for social media environments.
Submitted 10 June, 2019;
originally announced June 2019.
-
Quantum Latent Semantic Analysis
Authors:
Fabio A. González,
Juan C. Caicedo
Abstract:
The main goal of this paper is to explore latent topic analysis (LTA) in the context of quantum information retrieval. LTA is a valuable technique for document analysis and representation, which has been extensively used in information retrieval and machine learning. Different LTA techniques have been proposed, some based on geometrical modeling (such as latent semantic analysis, LSA) and others based on a strong statistical foundation. However, these two different approaches are not usually mixed. Quantum information retrieval has the remarkable virtue of combining both geometry and probability in a common principled framework. We built on this quantum framework to propose a new LTA method, which has a clear geometrical motivation but also supports a well-founded probabilistic interpretation. An initial exploratory experimentation was performed on three standard data sets. The results show that the proposed method outperforms LSA on two of the three datasets. These results suggest that the quantum-motivated representation is an alternative for geometrical latent topic modeling worthy of further exploration.
Submitted 7 March, 2019;
originally announced March 2019.
-
Letting Emotions Flow: Success Prediction by Modeling the Flow of Emotions in Books
Authors:
Suraj Maharjan,
Sudipta Kar,
Manuel Montes-y-Gomez,
Fabio A. Gonzalez,
Thamar Solorio
Abstract:
Books have the power to make us feel happiness, sadness, pain, surprise, or sorrow. An author's dexterity in the use of these emotions captivates readers and makes it difficult for them to put the book down. In this paper, we model the flow of emotions over a book using recurrent neural networks and quantify its usefulness in predicting success in books. We obtained the best weighted F1-score of 69% for predicting books' success in a multitask setting (simultaneously predicting success and genre of books).
Submitted 24 May, 2018; v1 submitted 24 May, 2018;
originally announced May 2018.
-
Orthosymplectic Jordan superalgebras and the Wedderburn principal theorem (WPT)
Authors:
F. A. Gómez González,
R. Velásquez
Abstract:
An analogue of the Wedderburn principal theorem (WPT) is considered for finite dimensional Jordan superalgebras A with solvable radical N, such that N^2=0 and A/N is isomorphic to Josp_n|2m(F), where F is an algebraically closed field of characteristic zero. We prove that the WPT is valid under some restrictions over the irreducible Josp_n|2m(F)-bimodules contained in N, and it is shown with counter-examples that these restrictions cannot be weakened.
Submitted 28 September, 2017;
originally announced September 2017.
-
Gated Multimodal Units for Information Fusion
Authors:
John Arevalo,
Thamar Solorio,
Manuel Montes-y-Gómez,
Fabio A. González
Abstract:
This paper presents a novel model for multimodal learning based on gated neural networks. The Gated Multimodal Unit (GMU) model is intended to be used as an internal unit in a neural network architecture whose purpose is to find an intermediate representation based on a combination of data from different modalities. The GMU learns to decide how modalities influence the activation of the unit using multiplicative gates. It was evaluated on a multilabel scenario for genre classification of movies using the plot and the poster. The GMU improved the macro f-score performance of single-modality approaches and outperformed other fusion strategies, including mixture of experts models. Along with this work, the MM-IMDb dataset is released which, to the best of our knowledge, is the largest publicly available multimodal dataset for genre prediction on movies.
Submitted 7 February, 2017;
originally announced February 2017.
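The multiplicative gating mentioned in the abstract above can be sketched for two modalities as follows. This is a minimal forward pass under stated assumptions: the bimodal form (a tanh projection per modality and a sigmoid gate computed from the concatenated inputs), the omission of bias terms, and the names `gmu_forward`, `W_v`, `W_t`, and `W_z` are illustrative choices, not code from the paper.

```python
import math


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def gmu_forward(x_v, x_t, W_v, W_t, W_z):
    """Fuse a visual input x_v and a textual input x_t with a multiplicative gate."""
    h_v = [math.tanh(a) for a in matvec(W_v, x_v)]    # visual hidden features
    h_t = [math.tanh(a) for a in matvec(W_t, x_t)]    # textual hidden features
    z = [sigmoid(a) for a in matvec(W_z, x_v + x_t)]  # gate from concatenated inputs
    # Each output unit is a convex combination of the two modality features.
    h = [zi * hv + (1.0 - zi) * ht for zi, hv, ht in zip(z, h_v, h_t)]
    return h, z


if __name__ == "__main__":
    h, z = gmu_forward(
        x_v=[0.5, -1.0], x_t=[1.0],
        W_v=[[0.3, 0.1], [0.0, -0.2]],
        W_t=[[0.7], [-0.4]],
        W_z=[[0.2, 0.0, 1.0], [-1.0, 0.5, 0.0]],
    )
    print("fused:", h)
    print("gate:", z)
```

Because each gate value lies strictly between 0 and 1, inspecting `z` shows how much each output unit relies on one modality versus the other for a given input.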
-
Finding Relationships between Socio-Technical Aspects and Personality Traits by Mining Developer E-mails
Authors:
Oscar Hernán Paruma-Pabón,
Fabio A. González,
Jairo Aponte,
Jorge E. Camargo,
Felipe Restrepo-Calle
Abstract:
Personality traits influence most, if not all, human activities, from those as natural as the way people walk, talk, dress and write to those as complex as the way they interact with others. Most importantly, personality influences the way people make decisions including, in the case of developers, the criteria they consider when selecting a software project they want to participate in. Most of the works that study the influence of social, technical and human factors in software development projects have focused on the impact of communications on software quality, for instance, on identifying predictors to detect files that may contain bugs before releasing an enhanced version of a software product. Only a few of these works focus on the analysis of personality traits of developers with commit permissions (committers) in Free/Libre and Open-Source Software (FLOSS) projects and their relationship with the software artifacts they interact with. This paper presents an approach, based on the automatic recognition of personality traits from e-mails sent by committers in FLOSS projects, to uncover relationships between the social and technical aspects that occur during the software development process. Our experimental results suggest the existence of some relationships among personality traits projected by the committers through their e-mails and the social (communication) and technical activities they undertake. This work is a preliminary study aimed at supporting the setting up of efficient work teams in software development projects based on an appropriate mix of stakeholders, taking into account their personality traits.
Submitted 3 March, 2016; v1 submitted 2 March, 2016;
originally announced March 2016.