Search | arXiv e-print repository

Posterior Matching for Arbitrary Conditioning

Authors: Ryan R. Strauss, Junier B. Oliva

Abstract: Arbitrary conditioning is an important problem in unsupervised learning, where we seek to model the conditional densities $p(\mathbf{x}_u \mid \mathbf{x}_o)$ that underly some data, for all possible non-intersecting subsets $o, u \subset \{1, \dots , d\}$. However, the vast majority of density estimation only focuses on modeling the joint distribution $p(\mathbf{x})$, in which important conditiona… ▽ More Arbitrary conditioning is an important problem in unsupervised learning, where we seek to model the conditional densities $p(\mathbf{x}_u \mid \mathbf{x}_o)$ that underly some data, for all possible non-intersecting subsets $o, u \subset \{1, \dots , d\}$. However, the vast majority of density estimation only focuses on modeling the joint distribution $p(\mathbf{x})$, in which important conditional dependencies between features are opaque. We propose a simple and general framework, coined Posterior Matching, that enables Variational Autoencoders (VAEs) to perform arbitrary conditioning, without modification to the VAE itself. Posterior Matching applies to the numerous existing VAE-based approaches to joint density estimation, thereby circumventing the specialized models required by previous approaches to arbitrary conditioning. We find that Posterior Matching is comparable or superior to current state-of-the-art methods for a variety of tasks with an assortment of VAEs (e.g.~discrete, hierarchical, VaDE). △ Less

Submitted 11 November, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

Comments: Accepted at NeurIPS 2022

arXiv:2102.04426 [pdf, other]

Arbitrary Conditional Distributions with Energy

Authors: Ryan R. Strauss, Junier B. Oliva

Abstract: Modeling distributions of covariates, or density estimation, is a core challenge in unsupervised learning. However, the majority of work only considers the joint distribution, which has limited utility in practical situations. A more general and useful problem is arbitrary conditional density estimation, which aims to model any possible conditional distribution over a set of covariates, reflecting… ▽ More Modeling distributions of covariates, or density estimation, is a core challenge in unsupervised learning. However, the majority of work only considers the joint distribution, which has limited utility in practical situations. A more general and useful problem is arbitrary conditional density estimation, which aims to model any possible conditional distribution over a set of covariates, reflecting the more realistic setting of inference based on prior knowledge. We propose a novel method, Arbitrary Conditioning with Energy (ACE), that can simultaneously estimate the distribution $p(\mathbf{x}_u \mid \mathbf{x}_o)$ for all possible subsets of unobserved features $\mathbf{x}_u$ and observed features $\mathbf{x}_o$. ACE is designed to avoid unnecessary bias and complexity -- we specify densities with a highly expressive energy function and reduce the problem to only learning one-dimensional conditionals (from which more complex distributions can be recovered during inference). This results in an approach that is both simpler and higher-performing than prior methods. We show that ACE achieves state-of-the-art for arbitrary conditional likelihood estimation and data imputation on standard benchmarks. △ Less

Submitted 26 October, 2021; v1 submitted 8 February, 2021; originally announced February 2021.

Comments: Accepted at NeurIPS 2021

arXiv:2008.08226 [pdf, other]

Augmenting Neural Differential Equations to Model Unknown Dynamical Systems with Incomplete State Information

Authors: Robert Strauss

Abstract: Neural Ordinary Differential Equations replace the right-hand side of a conventional ODE with a neural net, which by virtue of the universal approximation theorem, can be trained to the representation of any function. When we do not know the function itself, but have state trajectories (time evolution) of the ODE system we can still train the neural net to learn the representation of the underlyin… ▽ More Neural Ordinary Differential Equations replace the right-hand side of a conventional ODE with a neural net, which by virtue of the universal approximation theorem, can be trained to the representation of any function. When we do not know the function itself, but have state trajectories (time evolution) of the ODE system we can still train the neural net to learn the representation of the underlying but unknown ODE. However if the state of the system is incompletely known then the right-hand side of the ODE cannot be calculated. The derivatives to propagate the system are unavailable. We show that a specially augmented Neural ODE can learn the system when given incomplete state information. As a worked example we apply neural ODEs to the Lotka-Voltera problem of 3 species, rabbits, wolves, and bears. We show that even when the data for the bear time series is removed the remaining time series of the rabbits and wolves is sufficient to learn the dynamical system despite the missing the incomplete state information. This is surprising since a conventional ODE system cannot output the correct derivatives without the full state as the input. We implement augmented neural ODEs and differential equation solvers in the julia programming language. △ Less

Submitted 21 August, 2020; v1 submitted 18 August, 2020; originally announced August 2020.

arXiv:2008.02757 [pdf, other]

doi 10.1016/j.nima.2021.165461

Unsupervised Learning for Identifying Events in Active Target Experiments

Authors: Robert Solli, Daniel Bazin, Michelle P. Kuchera, Ryan R. Strauss, Morten Hjorth-Jensen

Abstract: This article presents novel applications of unsupervised machine learning methods to the problem of event separation in an active target detector, the Active-Target Time Projection Chamber (AT-TPC). The overarching goal is to group similar events in the early stages of the data analysis, thereby improving efficiency by limiting the computationally expensive processing of unnecessary events. The ap… ▽ More This article presents novel applications of unsupervised machine learning methods to the problem of event separation in an active target detector, the Active-Target Time Projection Chamber (AT-TPC). The overarching goal is to group similar events in the early stages of the data analysis, thereby improving efficiency by limiting the computationally expensive processing of unnecessary events. The application of unsupervised clustering algorithms to the analysis of two-dimensional projections of particle tracks from a resonant proton scattering experiment on $^{46}$Ar is introduced. We explore the performance of autoencoder neural networks and a pre-trained VGG16 convolutional neural network. We study clustering performance on both data from a simulated $^{46}$Ar experiment, and real events from the AT-TPC detector. We find that a $k$-means algorithm applied to simulated data in the VGG16 latent space forms almost perfect clusters. Additionally, the VGG16+$k$-means approach finds high purity clusters of proton events for real experimental data. We also explore the application of clustering the latent space of autoencoder neural networks for event separation. While these networks show strong performance, they suffer from high variability in their results. △ Less

Submitted 13 March, 2021; v1 submitted 6 August, 2020; originally announced August 2020.

arXiv:2001.11103 [pdf, other]

doi 10.24963/ijcai.2021/293

Simulation of electron-proton scattering events by a Feature-Augmented and Transformed Generative Adversarial Network (FAT-GAN)

Authors: Yasir Alanazi, N. Sato, Tianbo Liu, W. Melnitchouk, Pawel Ambrozewicz, Florian Hauenstein, Michelle P. Kuchera, Evan Pritchard, Michael Robertson, Ryan Strauss, Luisa Velasco, Yaohang Li

Abstract: We apply generative adversarial network (GAN) technology to build an event generator that simulates particle production in electron-proton scattering that is free of theoretical assumptions about underlying particle dynamics. The difficulty of efficiently training a GAN event simulator lies in learning the complicated patterns of the distributions of the particles physical properties. We develop a… ▽ More We apply generative adversarial network (GAN) technology to build an event generator that simulates particle production in electron-proton scattering that is free of theoretical assumptions about underlying particle dynamics. The difficulty of efficiently training a GAN event simulator lies in learning the complicated patterns of the distributions of the particles physical properties. We develop a GAN that selects a set of transformed features from particle momenta that can be generated easily by the generator, and uses these to produce a set of augmented features that improve the sensitivity of the discriminator. The new Feature-Augmented and Transformed GAN (FAT-GAN) is able to faithfully reproduce the distribution of final state electron momenta in inclusive electron scattering, without the need for input derived from domain-based theoretical assumptions. The developed technology can play a significant role in boosting the science of existing and future accelerator facilities, such as the Electron-Ion Collider. △ Less

Submitted 27 May, 2021; v1 submitted 29 January, 2020; originally announced January 2020.

Comments: 7 pages, 5 figures, expanded author list, paper accepted in IJCAI21

Report number: JLAB-THY-20-3136

Journal ref: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21) Main Track, p. 2126 (2021)

arXiv:1810.10350 [pdf, other]

doi 10.1016/j.nima.2019.05.097

Machine Learning Methods for Track Classification in the AT-TPC

Authors: Michelle P. Kuchera, Raghuram Ramanujan, Jack Z. Taylor, Ryan R. Strauss, Daniel Bazin, Joshua Bradt, Ruiming Chen

Abstract: We evaluate machine learning methods for event classification in the Active-Target Time Projection Chamber detector at the National Superconducting Cyclotron Laboratory (NSCL) at Michigan State University. An automated method to single out the desired reaction product would result in more accurate physics results as well as a faster analysis process. Binary and multi-class classification methods w… ▽ More We evaluate machine learning methods for event classification in the Active-Target Time Projection Chamber detector at the National Superconducting Cyclotron Laboratory (NSCL) at Michigan State University. An automated method to single out the desired reaction product would result in more accurate physics results as well as a faster analysis process. Binary and multi-class classification methods were tested on data produced by the $^{46}$Ar(p,p) experiment run at the NSCL in September 2015. We found a Convolutional Neural Network to be the most successful classifier of proton scattering events for transfer learning. Results from this investigation and recommendations for event classification in future experiments are presented. △ Less

Submitted 29 April, 2019; v1 submitted 21 October, 2018; originally announced October 2018.

Report number: NIMA-D-18-01137

Journal ref: NIMA 940 (2019) 56-167, ISSN 0168-9002

Showing 1–6 of 6 results for author: Strauss, R