Search | arXiv e-print repository

Manifold Contrastive Learning with Variational Lie Group Operators

Authors: Kion Fallah, Alec Helbling, Kyle A. Johnsen, Christopher J. Rozell

Abstract: Self-supervised learning of deep neural networks has become a prevalent paradigm for learning representations that transfer to a variety of downstream tasks. Similar to proposed models of the ventral stream of biological vision, it is observed that these networks lead to a separation of category manifolds in the representations of the penultimate layer. Although this observation matches the manifo… ▽ More Self-supervised learning of deep neural networks has become a prevalent paradigm for learning representations that transfer to a variety of downstream tasks. Similar to proposed models of the ventral stream of biological vision, it is observed that these networks lead to a separation of category manifolds in the representations of the penultimate layer. Although this observation matches the manifold hypothesis of representation learning, current self-supervised approaches are limited in their ability to explicitly model this manifold. Indeed, current approaches often only apply augmentations from a pre-specified set of "positive pairs" during learning. In this work, we propose a contrastive learning approach that directly models the latent manifold using Lie group operators parameterized by coefficients with a sparsity-promoting prior. A variational distribution over these coefficients provides a generative model of the manifold, with samples which provide feature augmentations applicable both during contrastive training and downstream tasks. Additionally, learned coefficient distributions provide a quantification of which transformations are most likely at each point on the manifold while preserving identity. We demonstrate benefits in self-supervised benchmarks for image datasets, as well as a downstream semi-supervised task. In the former case, we demonstrate that the proposed methods can effectively apply manifold feature augmentations and improve learning both with and without a projection head. In the latter case, we demonstrate that feature augmentations sampled from learned Lie group operators can improve classification performance when using few labels. △ Less

Submitted 23 June, 2023; originally announced June 2023.

arXiv:2304.00185 [pdf, other]

PrefGen: Preference Guided Image Generation with Relative Attributes

Authors: Alec Helbling, Christopher J. Rozell, Matthew O'Shaughnessy, Kion Fallah

Abstract: Deep generative models have the capacity to render high fidelity images of content like human faces. Recently, there has been substantial progress in conditionally generating images with specific quantitative attributes, like the emotion conveyed by one's face. These methods typically require a user to explicitly quantify the desired intensity of a visual attribute. A limitation of this method is… ▽ More Deep generative models have the capacity to render high fidelity images of content like human faces. Recently, there has been substantial progress in conditionally generating images with specific quantitative attributes, like the emotion conveyed by one's face. These methods typically require a user to explicitly quantify the desired intensity of a visual attribute. A limitation of this method is that many attributes, like how "angry" a human face looks, are difficult for a user to precisely quantify. However, a user would be able to reliably say which of two faces seems "angrier". Following this premise, we develop the $\textit{PrefGen}$ system, which allows users to control the relative attributes of generated images by presenting them with simple paired comparison queries of the form "do you prefer image $a$ or image $b$?" Using information from a sequence of query responses, we can estimate user preferences over a set of image attributes and perform preference-guided image editing and generation. Furthermore, to make preference localization feasible and efficient, we apply an active query selection strategy. We demonstrate the success of this approach using a StyleGAN2 generator on the task of human face editing. Additionally, we demonstrate how our approach can be combined with CLIP, allowing a user to edit the relative intensity of attributes specified by text prompts. Code at https://github.com/helblazer811/PrefGen. △ Less

Submitted 31 March, 2023; originally announced April 2023.

arXiv:2207.12710 [pdf, other]

Active Learning of Ordinal Embeddings: A User Study on Football Data

Authors: Christoffer Loeffler, Kion Fallah, Stefano Fenu, Dario Zanca, Bjoern Eskofier, Christopher John Rozell, Christopher Mutschler

Abstract: Humans innately measure distance between instances in an unlabeled dataset using an unknown similarity function. Distance metrics can only serve as proxy for similarity in information retrieval of similar instances. Learning a good similarity function from human annotations improves the quality of retrievals. This work uses deep metric learning to learn these user-defined similarity functions from… ▽ More Humans innately measure distance between instances in an unlabeled dataset using an unknown similarity function. Distance metrics can only serve as proxy for similarity in information retrieval of similar instances. Learning a good similarity function from human annotations improves the quality of retrievals. This work uses deep metric learning to learn these user-defined similarity functions from few annotations for a large football trajectory dataset. We adapt an entropy-based active learning method with recent work from triplet mining to collect easy-to-answer but still informative annotations from human participants and use them to train a deep convolutional network that generalizes to unseen samples. Our user study shows that our approach improves the quality of the information retrieval compared to a previous deep metric learning approach that relies on a Siamese network. Specifically, we shed light on the strengths and weaknesses of passive sampling heuristics and active learners alike by analyzing the participants' response efficacy. To this end, we collect accuracy, algorithmic time complexity, the participants' fatigue and time-to-response, qualitative self-assessment and statements, as well as the effects of mixed-expertise annotators and their consistency on model performance and transfer-learning. △ Less

Submitted 10 November, 2022; v1 submitted 26 July, 2022; originally announced July 2022.

Comments: 23 pages, 17 figures

Journal ref: Transactions on Machine Learning Research 04/2023 https://openreview.net/forum?id=oq3tx5kinu

arXiv:2206.02972 [pdf, other]

Decomposed Linear Dynamical Systems (dLDS) for learning the latent components of neural dynamics

Authors: Noga Mudrik, Yenho Chen, Eva Yezerets, Christopher J. Rozell, Adam S. Charles

Abstract: Learning interpretable representations of neural dynamics at a population level is a crucial first step to understanding how observed neural activity relates to perception and behavior. Models of neural dynamics often focus on either low-dimensional projections of neural activity, or on learning dynamical systems that explicitly relate to the neural state over time. We discuss how these two approa… ▽ More Learning interpretable representations of neural dynamics at a population level is a crucial first step to understanding how observed neural activity relates to perception and behavior. Models of neural dynamics often focus on either low-dimensional projections of neural activity, or on learning dynamical systems that explicitly relate to the neural state over time. We discuss how these two approaches are interrelated by considering dynamical systems as representative of flows on a low-dimensional manifold. Building on this concept, we propose a new decomposed dynamical system model that represents complex non-stationary and nonlinear dynamics of time series data as a sparse combination of simpler, more interpretable components. Our model is trained through a dictionary learning procedure, where we leverage recent results in tracking sparse vectors over time. The decomposed nature of the dynamics is more expressive than previous switched approaches for a given number of parameters and enables modeling of overlap** and non-stationary dynamics. In both continuous-time and discrete-time instructional examples we demonstrate that our model can well approximate the original system, learn efficient representations, and capture smooth transitions between dynamical modes, focusing on intuitive low-dimensional non-stationary linear and nonlinear systems. Furthermore, we highlight our model's ability to efficiently capture and demix population dynamics generated from multiple independent subnetworks, a task that is computationally impractical for switched models. Finally, we apply our model to neural "full brain" recordings of C. elegans data, illustrating a diversity of dynamics that is obscured when classified into discrete states. △ Less

Submitted 16 June, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

Comments: 35 pages, 12 figures

arXiv:2205.03665 [pdf, other]

Variational Sparse Coding with Learned Thresholding

Authors: Kion Fallah, Christopher J. Rozell

Abstract: Sparse coding strategies have been lauded for their parsimonious representations of data that leverage low dimensional structure. However, inference of these codes typically relies on an optimization procedure with poor computational scaling in high-dimensional problems. For example, sparse inference in the representations learned in the high-dimensional intermediary layers of deep neural networks… ▽ More Sparse coding strategies have been lauded for their parsimonious representations of data that leverage low dimensional structure. However, inference of these codes typically relies on an optimization procedure with poor computational scaling in high-dimensional problems. For example, sparse inference in the representations learned in the high-dimensional intermediary layers of deep neural networks (DNNs) requires an iterative minimization to be performed at each training step. As such, recent, quick methods in variational inference have been proposed to infer sparse codes by learning a distribution over the codes with a DNN. In this work, we propose a new approach to variational sparse coding that allows us to learn sparse distributions by thresholding samples, avoiding the use of problematic relaxations. We first evaluate and analyze our method by training a linear generator, showing that it has superior performance, statistical efficiency, and gradient estimation compared to other sparse distributions. We then compare to a standard variational autoencoder using a DNN generator on the Fashion MNIST and CelebA datasets △ Less

Submitted 1 September, 2022; v1 submitted 7 May, 2022; originally announced May 2022.

Comments: ICML 2022

arXiv:2204.14189 [pdf, other]

Oracle Guided Image Synthesis with Relative Queries

Authors: Alec Helbling, Christopher John Rozell, Matthew O'Shaughnessy, Kion Fallah

Abstract: Isolating and controlling specific features in the outputs of generative models in a user-friendly way is a difficult and open-ended problem. We develop techniques that allow an oracle user to generate an image they are envisioning in their head by answering a sequence of relative queries of the form \textit{"do you prefer image $a$ or image $b$?"} Our framework consists of a Conditional VAE that… ▽ More Isolating and controlling specific features in the outputs of generative models in a user-friendly way is a difficult and open-ended problem. We develop techniques that allow an oracle user to generate an image they are envisioning in their head by answering a sequence of relative queries of the form \textit{"do you prefer image $a$ or image $b$?"} Our framework consists of a Conditional VAE that uses the collected relative queries to partition the latent space into preference-relevant features and non-preference-relevant features. We then use the user's responses to relative queries to determine the preference-relevant features that correspond to their envisioned output image. Additionally, we develop techniques for modeling the uncertainty in images' predicted preference-relevant features, allowing our framework to generalize to scenarios in which the relative query training set contains noise. △ Less

Submitted 28 April, 2022; originally announced April 2022.

Comments: Published at the International Conference on Learning Representations 2022, Workshop on Deep Generative Models for Highly Structured Data

arXiv:2104.01511 [pdf, other]

Late fusion of machine learning models using passively captured interpersonal social interactions and motion from smartphones predicts decompensation in heart failure

Authors: Ayse S. Cakmak, Samuel Densen, Gabriel Najarro, Pratik Rout, Christopher J. Rozell, Omer T. Inan, Amit J. Shah, Gari D. Clifford

Abstract: Objective: Worldwide, heart failure (HF) is a major cause of morbidity and mortality and one of the leading causes of hospitalization. Early detection of HF symptoms and pro-active management may reduce adverse events. Approach: Twenty-eight participants were monitored using a smartphone app after discharge from hospitals, and each clinical event during the enrollment (N=110 clinical events) was r… ▽ More Objective: Worldwide, heart failure (HF) is a major cause of morbidity and mortality and one of the leading causes of hospitalization. Early detection of HF symptoms and pro-active management may reduce adverse events. Approach: Twenty-eight participants were monitored using a smartphone app after discharge from hospitals, and each clinical event during the enrollment (N=110 clinical events) was recorded. Motion, social, location, and clinical survey data collected via the smartphone-based monitoring system were used to develop and validate an algorithm for predicting or classifying HF decompensation events (hospitalizations or clinic visit) versus clinic monitoring visits in which they were determined to be compensated or stable. Models based on single modality as well as early and late fusion approaches combining patient-reported outcomes and passive smartphone data were evaluated. Results: The highest AUCPr for classifying decompensation with a late fusion approach was 0.80 using leave one subject out cross-validation. Significance: Passively collected data from smartphones, especially when combined with weekly patient-reported outcomes, may reflect behavioral and physiological changes due to HF and thus could enable prediction of HF decompensation. △ Less

Submitted 3 April, 2021; originally announced April 2021.

arXiv:2006.10597 [pdf, other]

Variational Autoencoder with Learned Latent Structure

Authors: Marissa C. Connor, Gregory H. Canal, Christopher J. Rozell

Abstract: The manifold hypothesis states that high-dimensional data can be modeled as lying on or near a low-dimensional, nonlinear manifold. Variational Autoencoders (VAEs) approximate this manifold by learning map**s from low-dimensional latent vectors to high-dimensional data while encouraging a global structure in the latent space through the use of a specified prior distribution. When this prior does… ▽ More The manifold hypothesis states that high-dimensional data can be modeled as lying on or near a low-dimensional, nonlinear manifold. Variational Autoencoders (VAEs) approximate this manifold by learning map**s from low-dimensional latent vectors to high-dimensional data while encouraging a global structure in the latent space through the use of a specified prior distribution. When this prior does not match the structure of the true data manifold, it can lead to a less accurate model of the data. To resolve this mismatch, we introduce the Variational Autoencoder with Learned Latent Structure (VAELLS) which incorporates a learnable manifold model into the latent space of a VAE. This enables us to learn the nonlinear manifold structure from the data and use that structure to define a prior in the latent space. The integration of a latent manifold model not only ensures that our prior is well-matched to the data, but also allows us to define generative transformation paths in the latent space and describe class manifolds with transformations stemming from examples of each class. We validate our model on examples with known latent structure and also demonstrate its capabilities on a real-world dataset. △ Less

Submitted 2 March, 2021; v1 submitted 18 June, 2020; originally announced June 2020.

Comments: Accepted at The 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021)

arXiv:1906.11768 [pdf, other]

Hierarchical Optimal Transport for Multimodal Distribution Alignment

Authors: John Lee, Max Dabagia, Eva L. Dyer, Christopher J. Rozell

Abstract: In many machine learning applications, it is necessary to meaningfully aggregate, through alignment, different but related datasets. Optimal transport (OT)-based approaches pose alignment as a divergence minimization problem: the aim is to transform a source dataset to match a target dataset using the Wasserstein distance as a divergence measure. We introduce a hierarchical formulation of OT which… ▽ More In many machine learning applications, it is necessary to meaningfully aggregate, through alignment, different but related datasets. Optimal transport (OT)-based approaches pose alignment as a divergence minimization problem: the aim is to transform a source dataset to match a target dataset using the Wasserstein distance as a divergence measure. We introduce a hierarchical formulation of OT which leverages clustered structure in data to improve alignment in noisy, ambiguous, or multimodal settings. To solve this numerically, we propose a distributed ADMM algorithm that also exploits the Sinkhorn distance, thus it has an efficient computational complexity that scales quadratically with the size of the largest cluster. When the transformation between two datasets is unitary, we provide performance guarantees that describe when and how well aligned cluster correspondences can be recovered with our formulation, as well as provide worst-case dataset geometry for such a strategy. We apply this method to synthetic datasets that model data as mixtures of low-rank Gaussians and study the impact that different geometric properties of the data have on alignment. Next, we applied our approach to a neural decoding application where the goal is to predict movement directions and instantaneous velocities from populations of neurons in the macaque primary motor cortex. Our results demonstrate that when clustered structure exists in datasets, and is consistent across trials or time points, a hierarchical alignment strategy that leverages such structure can provide significant improvements in cross-domain alignment. △ Less

Submitted 3 November, 2019; v1 submitted 27 June, 2019; originally announced June 2019.

arXiv:1905.04363 [pdf, other]

Active embedding search via noisy paired comparisons

Authors: Gregory H. Canal, Andrew K. Massimino, Mark A. Davenport, Christopher J. Rozell

Abstract: Suppose that we wish to estimate a user's preference vector $w$ from paired comparisons of the form "does user $w$ prefer item $p$ or item $q$?," where both the user and items are embedded in a low-dimensional Euclidean space with distances that reflect user and item similarities. Such observations arise in numerous settings, including psychometrics and psychology experiments, search tasks, advert… ▽ More Suppose that we wish to estimate a user's preference vector $w$ from paired comparisons of the form "does user $w$ prefer item $p$ or item $q$?," where both the user and items are embedded in a low-dimensional Euclidean space with distances that reflect user and item similarities. Such observations arise in numerous settings, including psychometrics and psychology experiments, search tasks, advertising, and recommender systems. In such tasks, queries can be extremely costly and subject to varying levels of response noise; thus, we aim to actively choose pairs that are most informative given the results of previous comparisons. We provide new theoretical insights into the benefits and challenges of greedy information maximization in this setting, and develop two novel strategies that maximize lower bounds on information gain and are simpler to analyze and compute respectively. We use simulated responses from a real-world dataset to validate our strategies through their similar performance to greedy information maximization, and their superior preference estimation over state-of-the-art selection methods as well as random queries. △ Less

Submitted 24 May, 2019; v1 submitted 10 May, 2019; originally announced May 2019.

Comments: ICML 2019

arXiv:1307.7970 [pdf, other]

Short Term Memory Capacity in Networks via the Restricted Isometry Property

Authors: Adam S. Charles, Han Lun Yap, Christopher J. Rozell

Abstract: Cortical networks are hypothesized to rely on transient network activity to support short term memory (STM). In this paper we study the capacity of randomly connected recurrent linear networks for performing STM when the input signals are approximately sparse in some basis. We leverage results from compressed sensing to provide rigorous non asymptotic recovery guarantees, quantifying the impact of… ▽ More Cortical networks are hypothesized to rely on transient network activity to support short term memory (STM). In this paper we study the capacity of randomly connected recurrent linear networks for performing STM when the input signals are approximately sparse in some basis. We leverage results from compressed sensing to provide rigorous non asymptotic recovery guarantees, quantifying the impact of the input sparsity level, the input sparsity basis, and the network characteristics on the system capacity. Our analysis demonstrates that network memory capacities can scale superlinearly with the number of nodes, and in some situations can achieve STM capacities that are much larger than the network size. We provide perfect recovery guarantees for finite sequences and recovery bounds for infinite sequences. The latter analysis predicts that network STM systems may have an optimal recovery length that balances errors due to omission and recall mistakes. Furthermore, we show that the conditions yielding optimal STM capacity can be embodied in several network topologies, including networks with sparse or dense connectivities. △ Less

Submitted 3 March, 2015; v1 submitted 30 June, 2013; originally announced July 2013.

Comments: 50 pages, 5 figures

Journal ref: A.S. Charles, H.L. Yap, and C.J. Rozell. Short term network memory capacity via the restricted isometry property. Neural Computation, 26(6), June 2014

arXiv:1210.3395 [pdf, other]

The Restricted Isometry Property for Random Block Diagonal Matrices

Authors: Armin Eftekhari, Han Lun Yap, Christopher J. Rozell, Michael B. Wakin

Abstract: In Compressive Sensing, the Restricted Isometry Property (RIP) ensures that robust recovery of sparse vectors is possible from noisy, undersampled measurements via computationally tractable algorithms. It is by now well-known that Gaussian (or, more generally, sub-Gaussian) random matrices satisfy the RIP under certain conditions on the number of measurements. Their use can be limited in practice,… ▽ More In Compressive Sensing, the Restricted Isometry Property (RIP) ensures that robust recovery of sparse vectors is possible from noisy, undersampled measurements via computationally tractable algorithms. It is by now well-known that Gaussian (or, more generally, sub-Gaussian) random matrices satisfy the RIP under certain conditions on the number of measurements. Their use can be limited in practice, however, due to storage limitations, computational considerations, or the mismatch of such matrices with certain measurement architectures. These issues have recently motivated considerable effort towards studying the RIP for structured random matrices. In this paper, we study the RIP for block diagonal measurement matrices where each block on the main diagonal is itself a sub-Gaussian random matrix. Our main result states that such matrices can indeed satisfy the RIP but that the requisite number of measurements depends on certain properties of the basis in which the signals are sparse. In the best case, these matrices perform nearly as well as dense Gaussian random matrices, despite having many fewer nonzero entries. △ Less

Submitted 13 February, 2014; v1 submitted 11 October, 2012; originally announced October 2012.

MSC Class: 94A20; 60B20; 46B09;

arXiv:1209.3312 [pdf, ps, other]

Stable Manifold Embeddings with Structured Random Matrices

Authors: Han Lun Yap, Michael B. Wakin, Christopher J. Rozell

Abstract: The fields of compressed sensing (CS) and matrix completion have shown that high-dimensional signals with sparse or low-rank structure can be effectively projected into a low-dimensional space (for efficient acquisition or processing) when the projection operator achieves a stable embedding of the data by satisfying the Restricted Isometry Property (RIP). It has also been shown that such stable em… ▽ More The fields of compressed sensing (CS) and matrix completion have shown that high-dimensional signals with sparse or low-rank structure can be effectively projected into a low-dimensional space (for efficient acquisition or processing) when the projection operator achieves a stable embedding of the data by satisfying the Restricted Isometry Property (RIP). It has also been shown that such stable embeddings can be achieved for general Riemannian submanifolds when random orthoprojectors are used for dimensionality reduction. Due to computational costs and system constraints, the CS community has recently explored the RIP for structured random matrices (e.g., random convolutions, localized measurements, deterministic constructions). The main contribution of this paper is to show that any matrix satisfying the RIP (i.e., providing a stable embedding for sparse signals) can be used to construct a stable embedding for manifold-modeled signals by randomizing the column signs and paying reasonable additional factors in the number of measurements. We demonstrate this result with several new constructions for stable manifold embeddings using structured matrices. This result allows advances in efficient projection schemes for sparse signals to be immediately applied to manifold signal models. △ Less

Submitted 15 May, 2013; v1 submitted 14 September, 2012; originally announced September 2012.

arXiv:1010.5938 [pdf, ps, other]

Stable Takens' Embeddings for Linear Dynamical Systems

Authors: Han Lun Yap, Christopher J. Rozell

Abstract: Takens' Embedding Theorem remarkably established that concatenating M previous outputs of a dynamical system into a vector (called a delay coordinate map) can be a one-to-one map** of a low-dimensional attractor from the system state space. However, Takens' theorem is fragile in the sense that even small imperfections can induce arbitrarily large errors in this attractor representation. We exten… ▽ More Takens' Embedding Theorem remarkably established that concatenating M previous outputs of a dynamical system into a vector (called a delay coordinate map) can be a one-to-one map** of a low-dimensional attractor from the system state space. However, Takens' theorem is fragile in the sense that even small imperfections can induce arbitrarily large errors in this attractor representation. We extend Takens' result to establish deterministic, explicit and non-asymptotic sufficient conditions for a delay coordinate map to form a stable embedding in the restricted case of linear dynamical systems and observation functions. Our work is inspired by the field of Compressive Sensing (CS), where results guarantee that low-dimensional signal families can be robustly reconstructed if they are stably embedded by a measurement operator. However, in contrast to typical CS results, i) our sufficient conditions are independent of the size of the ambient state space, and ii) some system and measurement pairs have fundamental limits on the conditioning of the embedding (i.e., how close it is to an isometry), meaning that further measurements beyond some point add no further significant value. We use several simple simulations to explore the conditions of the main results, including the tightness of the bounds and the convergence speed of the stable embedding. We also present an example task of estimating the attractor dimension from time-series data to highlight the value of stable embeddings over traditional Takens' embeddings. △ Less

Submitted 18 June, 2011; v1 submitted 28 October, 2010; originally announced October 2010.

Showing 1–14 of 14 results for author: Rozell, C J