
LINOCS: Lookahead Inference of Networked Operators for Continuous Stability

Authors: Noga Mudrik, Eva Yezerets, Yenho Chen, Christopher Rozell, Adam Charles

Abstract: Identifying latent interactions within complex systems is key to unlocking deeper insights into their operational dynamics, including how their elements affect each other and contribute to the overall system behavior. For instance, in neuroscience, discovering neuron-to-neuron interactions is essential for understanding brain function; in ecology, recognizing the interactions among populations is key for understanding complex ecosystems. Such systems, often modeled as dynamical systems, typically exhibit noisy high-dimensional and non-stationary temporal behavior that renders their identification challenging. Existing dynamical system identification methods often yield operators that accurately capture short-term behavior but fail to predict long-term trends, suggesting an incomplete capture of the underlying process. Methods that consider extended forecasts (e.g., recurrent neural networks) lack explicit representations of element-wise interactions and require substantial training data, thereby failing to capture interpretable network operators. Here we introduce Lookahead-driven Inference of Networked Operators for Continuous Stability (LINOCS), a robust learning procedure for identifying hidden dynamical interactions in noisy time-series data. LINOCS integrates several multi-step predictions with adaptive weights during training to recover dynamical operators that can yield accurate long-term predictions. We demonstrate LINOCS' ability to recover the ground truth dynamical operators underlying synthetic time-series data for multiple dynamical systems models (including linear, piece-wise linear, time-changing linear systems' decomposition, and regularized linear time-varying systems) as well as its capability to produce meaningful operators with robust reconstructions through various real-world examples.

Submitted 28 April, 2024; originally announced April 2024.

Comments: under review

arXiv:2306.13544 [pdf, other]

Manifold Contrastive Learning with Variational Lie Group Operators

Authors: Kion Fallah, Alec Helbling, Kyle A. Johnsen, Christopher J. Rozell

Abstract: Self-supervised learning of deep neural networks has become a prevalent paradigm for learning representations that transfer to a variety of downstream tasks. Similar to proposed models of the ventral stream of biological vision, it is observed that these networks lead to a separation of category manifolds in the representations of the penultimate layer. Although this observation matches the manifold hypothesis of representation learning, current self-supervised approaches are limited in their ability to explicitly model this manifold. Indeed, current approaches often only apply augmentations from a pre-specified set of "positive pairs" during learning. In this work, we propose a contrastive learning approach that directly models the latent manifold using Lie group operators parameterized by coefficients with a sparsity-promoting prior. A variational distribution over these coefficients provides a generative model of the manifold, with samples which provide feature augmentations applicable both during contrastive training and downstream tasks. Additionally, learned coefficient distributions provide a quantification of which transformations are most likely at each point on the manifold while preserving identity. We demonstrate benefits in self-supervised benchmarks for image datasets, as well as a downstream semi-supervised task. In the former case, we demonstrate that the proposed methods can effectively apply manifold feature augmentations and improve learning both with and without a projection head. In the latter case, we demonstrate that feature augmentations sampled from learned Lie group operators can improve classification performance when using few labels.

Submitted 23 June, 2023; originally announced June 2023.

arXiv:2304.00185 [pdf, other]

PrefGen: Preference Guided Image Generation with Relative Attributes

Authors: Alec Helbling, Christopher J. Rozell, Matthew O'Shaughnessy, Kion Fallah

Abstract: Deep generative models have the capacity to render high fidelity images of content like human faces. Recently, there has been substantial progress in conditionally generating images with specific quantitative attributes, like the emotion conveyed by one's face. These methods typically require a user to explicitly quantify the desired intensity of a visual attribute. A limitation of this method is that many attributes, like how "angry" a human face looks, are difficult for a user to precisely quantify. However, a user would be able to reliably say which of two faces seems "angrier". Following this premise, we develop the $\textit{PrefGen}$ system, which allows users to control the relative attributes of generated images by presenting them with simple paired comparison queries of the form "do you prefer image $a$ or image $b$?" Using information from a sequence of query responses, we can estimate user preferences over a set of image attributes and perform preference-guided image editing and generation. Furthermore, to make preference localization feasible and efficient, we apply an active query selection strategy. We demonstrate the success of this approach using a StyleGAN2 generator on the task of human face editing. Additionally, we demonstrate how our approach can be combined with CLIP, allowing a user to edit the relative intensity of attributes specified by text prompts. Code at https://github.com/helblazer811/PrefGen.

Submitted 31 March, 2023; originally announced April 2023.

arXiv:2303.17776 [pdf, other]

Learning Internal Representations of 3D Transformations from 2D Projected Inputs

Authors: Marissa Connor, Bruno Olshausen, Christopher Rozell

Abstract: When interacting in a three dimensional world, humans must estimate 3D structure from visual inputs projected down to two dimensional retinal images. It has been shown that humans use the persistence of object shape over motion-induced transformations as a cue to resolve depth ambiguity when solving this underconstrained problem. With the aim of understanding how biological vision systems may internally represent 3D transformations, we propose a computational model, based on a generative manifold model, which can be used to infer 3D structure from the motion of 2D points. Our model can also learn representations of the transformations with minimal supervision, providing a proof of concept for how humans may develop internal representations on a developmental or evolutionary time scale. Focused on rotational motion, we show how our model infers depth from moving 2D projected points, learns 3D rotational transformations from 2D training stimuli, and compares to human performance on psychophysical structure-from-motion experiments.

Submitted 30 March, 2023; originally announced March 2023.

arXiv:2207.12710 [pdf, other]

Active Learning of Ordinal Embeddings: A User Study on Football Data

Authors: Christoffer Loeffler, Kion Fallah, Stefano Fenu, Dario Zanca, Bjoern Eskofier, Christopher John Rozell, Christopher Mutschler

Abstract: Humans innately measure distance between instances in an unlabeled dataset using an unknown similarity function. Distance metrics can only serve as proxy for similarity in information retrieval of similar instances. Learning a good similarity function from human annotations improves the quality of retrievals. This work uses deep metric learning to learn these user-defined similarity functions from few annotations for a large football trajectory dataset. We adapt an entropy-based active learning method with recent work from triplet mining to collect easy-to-answer but still informative annotations from human participants and use them to train a deep convolutional network that generalizes to unseen samples. Our user study shows that our approach improves the quality of the information retrieval compared to a previous deep metric learning approach that relies on a Siamese network. Specifically, we shed light on the strengths and weaknesses of passive sampling heuristics and active learners alike by analyzing the participants' response efficacy. To this end, we collect accuracy, algorithmic time complexity, the participants' fatigue and time-to-response, qualitative self-assessment and statements, as well as the effects of mixed-expertise annotators and their consistency on model performance and transfer-learning.

Submitted 10 November, 2022; v1 submitted 26 July, 2022; originally announced July 2022.

Comments: 23 pages, 17 figures

Journal ref: Transactions on Machine Learning Research 04/2023 https://openreview.net/forum?id=oq3tx5kinu

arXiv:2206.02972 [pdf, other]

Decomposed Linear Dynamical Systems (dLDS) for learning the latent components of neural dynamics

Authors: Noga Mudrik, Yenho Chen, Eva Yezerets, Christopher J. Rozell, Adam S. Charles

Abstract: Learning interpretable representations of neural dynamics at a population level is a crucial first step to understanding how observed neural activity relates to perception and behavior. Models of neural dynamics often focus on either low-dimensional projections of neural activity, or on learning dynamical systems that explicitly relate to the neural state over time. We discuss how these two approaches are interrelated by considering dynamical systems as representative of flows on a low-dimensional manifold. Building on this concept, we propose a new decomposed dynamical system model that represents complex non-stationary and nonlinear dynamics of time series data as a sparse combination of simpler, more interpretable components. Our model is trained through a dictionary learning procedure, where we leverage recent results in tracking sparse vectors over time. The decomposed nature of the dynamics is more expressive than previous switched approaches for a given number of parameters and enables modeling of overlapping and non-stationary dynamics. In both continuous-time and discrete-time instructional examples we demonstrate that our model can well approximate the original system, learn efficient representations, and capture smooth transitions between dynamical modes, focusing on intuitive low-dimensional non-stationary linear and nonlinear systems. Furthermore, we highlight our model's ability to efficiently capture and demix population dynamics generated from multiple independent subnetworks, a task that is computationally impractical for switched models. Finally, we apply our model to neural "full brain" recordings of C. elegans data, illustrating a diversity of dynamics that is obscured when classified into discrete states.

Submitted 16 June, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

Comments: 35 pages, 12 figures

arXiv:2205.03665 [pdf, other]

Variational Sparse Coding with Learned Thresholding

Authors: Kion Fallah, Christopher J. Rozell

Abstract: Sparse coding strategies have been lauded for their parsimonious representations of data that leverage low dimensional structure. However, inference of these codes typically relies on an optimization procedure with poor computational scaling in high-dimensional problems. For example, sparse inference in the representations learned in the high-dimensional intermediary layers of deep neural networks (DNNs) requires an iterative minimization to be performed at each training step. As such, recent, quick methods in variational inference have been proposed to infer sparse codes by learning a distribution over the codes with a DNN. In this work, we propose a new approach to variational sparse coding that allows us to learn sparse distributions by thresholding samples, avoiding the use of problematic relaxations. We first evaluate and analyze our method by training a linear generator, showing that it has superior performance, statistical efficiency, and gradient estimation compared to other sparse distributions. We then compare to a standard variational autoencoder using a DNN generator on the Fashion MNIST and CelebA datasets.

Submitted 1 September, 2022; v1 submitted 7 May, 2022; originally announced May 2022.

Comments: ICML 2022

arXiv:2204.14189 [pdf, other]

Oracle Guided Image Synthesis with Relative Queries

Authors: Alec Helbling, Christopher John Rozell, Matthew O'Shaughnessy, Kion Fallah

Abstract: Isolating and controlling specific features in the outputs of generative models in a user-friendly way is a difficult and open-ended problem. We develop techniques that allow an oracle user to generate an image they are envisioning in their head by answering a sequence of relative queries of the form "do you prefer image $a$ or image $b$?" Our framework consists of a Conditional VAE that uses the collected relative queries to partition the latent space into preference-relevant features and non-preference-relevant features. We then use the user's responses to relative queries to determine the preference-relevant features that correspond to their envisioned output image. Additionally, we develop techniques for modeling the uncertainty in images' predicted preference-relevant features, allowing our framework to generalize to scenarios in which the relative query training set contains noise.

Submitted 28 April, 2022; originally announced April 2022.

Comments: Published at the International Conference on Learning Representations 2022, Workshop on Deep Generative Models for Highly Structured Data

arXiv:2106.12096 [pdf, other]

Learning Identity-Preserving Transformations on Data Manifolds

Authors: Marissa Connor, Kion Fallah, Christopher Rozell

Abstract: Many machine learning techniques incorporate identity-preserving transformations into their models to generalize their performance to previously unseen data. These transformations are typically selected from a set of functions that are known to maintain the identity of an input when applied (e.g., rotation, translation, flipping, and scaling). However, there are many natural variations that cannot be labeled for supervision or defined through examination of the data. As suggested by the manifold hypothesis, many of these natural variations live on or near a low-dimensional, nonlinear manifold. Several techniques represent manifold variations through a set of learned Lie group operators that define directions of motion on the manifold. However, these approaches are limited because they require transformation labels when training their models and they lack a method for determining which regions of the manifold are appropriate for applying each specific operator. We address these limitations by introducing a learning strategy that does not require transformation labels and developing a method that learns the local regions where each operator is likely to be used while preserving the identity of inputs. Experiments on MNIST and Fashion MNIST highlight our model's ability to learn identity-preserving transformations on multi-class datasets. Additionally, we train on CelebA to showcase our model's ability to learn semantically meaningful transformations on complex datasets in an unsupervised manner.

Submitted 28 March, 2023; v1 submitted 22 June, 2021; originally announced June 2021.

arXiv:2104.01511 [pdf, other]

Late fusion of machine learning models using passively captured interpersonal social interactions and motion from smartphones predicts decompensation in heart failure

Authors: Ayse S. Cakmak, Samuel Densen, Gabriel Najarro, Pratik Rout, Christopher J. Rozell, Omer T. Inan, Amit J. Shah, Gari D. Clifford

Abstract: Objective: Worldwide, heart failure (HF) is a major cause of morbidity and mortality and one of the leading causes of hospitalization. Early detection of HF symptoms and pro-active management may reduce adverse events. Approach: Twenty-eight participants were monitored using a smartphone app after discharge from hospitals, and each clinical event during the enrollment (N=110 clinical events) was recorded. Motion, social, location, and clinical survey data collected via the smartphone-based monitoring system were used to develop and validate an algorithm for predicting or classifying HF decompensation events (hospitalizations or clinic visit) versus clinic monitoring visits in which they were determined to be compensated or stable. Models based on single modality as well as early and late fusion approaches combining patient-reported outcomes and passive smartphone data were evaluated. Results: The highest AUCPr for classifying decompensation with a late fusion approach was 0.80 using leave one subject out cross-validation. Significance: Passively collected data from smartphones, especially when combined with weekly patient-reported outcomes, may reflect behavioral and physiological changes due to HF and thus could enable prediction of HF decompensation.

Submitted 3 April, 2021; originally announced April 2021.

arXiv:2103.00654 [pdf, other]

Feedback Coding for Active Learning

Authors: Gregory Canal, Matthieu Bloch, Christopher Rozell

Abstract: The iterative selection of examples for labeling in active machine learning is conceptually similar to feedback channel coding in information theory: in both tasks, the objective is to seek a minimal sequence of actions to encode information in the presence of noise. While this high-level overlap has been previously noted, there remain open questions on how to best formulate active learning as a communications system to leverage existing analysis and algorithms in feedback coding. In this work, we formally identify and leverage the structural commonalities between the two problems, including the characterization of encoder and noisy channel components, to design a new algorithm. Specifically, we develop an optimal transport-based feedback coding scheme called Approximate Posterior Matching (APM) for the task of active example selection and explore its application to Bayesian logistic regression, a popular model in active learning. We evaluate APM on a variety of datasets and demonstrate learning performance comparable to existing active learning methods, at a reduced computational cost. These results demonstrate the potential of directly deploying concepts from feedback channel coding to design efficient active learning strategies.

Submitted 28 February, 2021; originally announced March 2021.

Comments: AISTATS 2021

arXiv:2006.13913 [pdf, other]

Generative causal explanations of black-box classifiers

Authors: Matthew O'Shaughnessy, Gregory Canal, Marissa Connor, Mark Davenport, Christopher Rozell

Abstract: We develop a method for generating causal post-hoc explanations of black-box classifiers based on a learned low-dimensional representation of the data. The explanation is causal in the sense that changing learned latent factors produces a change in the classifier output statistics. To construct these explanations, we design a learning framework that leverages a generative model and information-theoretic measures of causal influence. Our objective function encourages both the generative model to faithfully represent the data distribution and the latent factors to have a large causal influence on the classifier output. Our method learns both global and local explanations, is compatible with any classifier that admits class probabilities and a gradient, and does not require labeled attributes or knowledge of causal structure. Using carefully controlled test cases, we provide intuition that illuminates the function of our objective. We then demonstrate the practical utility of our method on image recognition tasks.

Submitted 22 October, 2020; v1 submitted 24 June, 2020; originally announced June 2020.

Comments: Camera-ready version to appear at NeurIPS 2020

arXiv:2006.10597 [pdf, other]

Variational Autoencoder with Learned Latent Structure

Authors: Marissa C. Connor, Gregory H. Canal, Christopher J. Rozell

Abstract: The manifold hypothesis states that high-dimensional data can be modeled as lying on or near a low-dimensional, nonlinear manifold. Variational Autoencoders (VAEs) approximate this manifold by learning mappings from low-dimensional latent vectors to high-dimensional data while encouraging a global structure in the latent space through the use of a specified prior distribution. When this prior does not match the structure of the true data manifold, it can lead to a less accurate model of the data. To resolve this mismatch, we introduce the Variational Autoencoder with Learned Latent Structure (VAELLS) which incorporates a learnable manifold model into the latent space of a VAE. This enables us to learn the nonlinear manifold structure from the data and use that structure to define a prior in the latent space. The integration of a latent manifold model not only ensures that our prior is well-matched to the data, but also allows us to define generative transformation paths in the latent space and describe class manifolds with transformations stemming from examples of each class. We validate our model on examples with known latent structure and also demonstrate its capabilities on a real-world dataset.

Submitted 2 March, 2021; v1 submitted 18 June, 2020; originally announced June 2020.

Comments: Accepted at The 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021)

arXiv:1912.02644 [pdf, other]

Representing Closed Transformation Paths in Encoded Network Latent Space

Authors: Marissa Connor, Christopher Rozell

Abstract: Deep generative networks have been widely used for learning mappings from a low-dimensional latent space to a high-dimensional data space. In many cases, data transformations are defined by linear paths in this latent space. However, the Euclidean structure of the latent space may be a poor match for the underlying latent structure in the data. In this work, we incorporate a generative manifold model into the latent space of an autoencoder in order to learn the low-dimensional manifold structure from the data and adapt the latent space to accommodate this structure. In particular, we focus on applications in which the data has closed transformation paths which extend from a starting point and return to nearly the same point. Through experiments on data with natural closed transformation paths, we show that this model introduces the ability to learn the latent dynamics of complex systems, generate transformation paths, and classify samples that belong on the same transformation path.

Submitted 5 December, 2019; originally announced December 2019.

Comments: Accepted at AAAI 2020

arXiv:1910.04115 [pdf, other]

Active Ordinal Querying for Tuplewise Similarity Learning

Authors: Gregory Canal, Stefano Fenu, Christopher Rozell

Abstract: Many machine learning tasks such as clustering, classification, and dataset search benefit from embedding data points in a space where distances reflect notions of relative similarity as perceived by humans. A common way to construct such an embedding is to request triplet similarity queries to an oracle, comparing two objects with respect to a reference. This work generalizes triplet queries to tuple queries of arbitrary size that ask an oracle to rank multiple objects against a reference, and introduces an efficient and robust adaptive selection method called InfoTuple that uses a novel approach to mutual information maximization. We show that the performance of InfoTuple at various tuple sizes exceeds that of the state-of-the-art adaptive triplet selection method on synthetic tests and new human response datasets, and empirically demonstrate the significant gains in efficiency and query consistency achieved by querying larger tuples instead of triplets.

Submitted 21 November, 2019; v1 submitted 9 October, 2019; originally announced October 2019.

Comments: Canal and Fenu contributed equally; correction in metadata title - metadata title now matches manuscript title; updated to camera-ready version to appear in AAAI 2020

arXiv:1906.11768 [pdf, other]

Hierarchical Optimal Transport for Multimodal Distribution Alignment

Authors: John Lee, Max Dabagia, Eva L. Dyer, Christopher J. Rozell

Abstract: In many machine learning applications, it is necessary to meaningfully aggregate, through alignment, different but related datasets. Optimal transport (OT)-based approaches pose alignment as a divergence minimization problem: the aim is to transform a source dataset to match a target dataset using the Wasserstein distance as a divergence measure. We introduce a hierarchical formulation of OT which leverages clustered structure in data to improve alignment in noisy, ambiguous, or multimodal settings. To solve this numerically, we propose a distributed ADMM algorithm that also exploits the Sinkhorn distance, thus it has an efficient computational complexity that scales quadratically with the size of the largest cluster. When the transformation between two datasets is unitary, we provide performance guarantees that describe when and how well aligned cluster correspondences can be recovered with our formulation, as well as provide worst-case dataset geometry for such a strategy. We apply this method to synthetic datasets that model data as mixtures of low-rank Gaussians and study the impact that different geometric properties of the data have on alignment. Next, we applied our approach to a neural decoding application where the goal is to predict movement directions and instantaneous velocities from populations of neurons in the macaque primary motor cortex. Our results demonstrate that when clustered structure exists in datasets, and is consistent across trials or time points, a hierarchical alignment strategy that leverages such structure can provide significant improvements in cross-domain alignment.

Submitted 3 November, 2019; v1 submitted 27 June, 2019; originally announced June 2019.

arXiv:1905.04363 [pdf, other]

Active embedding search via noisy paired comparisons

Authors: Gregory H. Canal, Andrew K. Massimino, Mark A. Davenport, Christopher J. Rozell

Abstract: Suppose that we wish to estimate a user's preference vector $w$ from paired comparisons of the form "does user $w$ prefer item $p$ or item $q$?," where both the user and items are embedded in a low-dimensional Euclidean space with distances that reflect user and item similarities. Such observations arise in numerous settings, including psychometrics and psychology experiments, search tasks, advertising, and recommender systems. In such tasks, queries can be extremely costly and subject to varying levels of response noise; thus, we aim to actively choose pairs that are most informative given the results of previous comparisons. We provide new theoretical insights into the benefits and challenges of greedy information maximization in this setting, and develop two novel strategies that maximize lower bounds on information gain and are simpler to analyze and compute respectively. We use simulated responses from a real-world dataset to validate our strategies through their similar performance to greedy information maximization, and their superior preference estimation over state-of-the-art selection methods as well as random queries.

Submitted 24 May, 2019; v1 submitted 10 May, 2019; originally announced May 2019.

Comments: ICML 2019

arXiv:1605.08346 [pdf, other]

Distributed Sequence Memory of Multidimensional Inputs in Recurrent Networks

Authors: Adam Charles, Dong Yin, Christopher Rozell

Abstract: Recurrent neural networks (RNNs) have drawn interest from machine learning researchers because of their effectiveness at preserving past inputs for time-varying data processing tasks. To understand the success and limitations of RNNs, it is critical that we advance our analysis of their fundamental memory properties. We focus on echo state networks (ESNs), which are RNNs with simple memoryless nodes and random connectivity. In most existing analyses, the short-term memory (STM) capacity results conclude that the ESN network size must scale linearly with the input size for unstructured inputs. The main contribution of this paper is to provide general results characterizing the STM capacity for linear ESNs with multidimensional input streams when the inputs have common low-dimensional structure: sparsity in a basis or significant statistical dependence between inputs. In both cases, we show that the number of nodes in the network must scale linearly with the information rate and poly-logarithmically with the ambient input dimension. The analysis relies on advanced applications of random matrix theory and results in explicit non-asymptotic bounds on the recovery error. Taken together, this analysis provides a significant step forward in our understanding of the STM properties in RNNs.

Submitted 27 January, 2017; v1 submitted 26 May, 2016; originally announced May 2016.

Comments: 37 pages, 3 figures

Journal ref: Journal of Machine Learning Research, 18:1-37 Jan. 2017

arXiv:1307.7970 [pdf, other]

Short Term Memory Capacity in Networks via the Restricted Isometry Property

Authors: Adam S. Charles, Han Lun Yap, Christopher J. Rozell

Abstract: Cortical networks are hypothesized to rely on transient network activity to support short term memory (STM). In this paper we study the capacity of randomly connected recurrent linear networks for performing STM when the input signals are approximately sparse in some basis. We leverage results from compressed sensing to provide rigorous non-asymptotic recovery guarantees, quantifying the impact of the input sparsity level, the input sparsity basis, and the network characteristics on the system capacity. Our analysis demonstrates that network memory capacities can scale superlinearly with the number of nodes, and in some situations can achieve STM capacities that are much larger than the network size. We provide perfect recovery guarantees for finite sequences and recovery bounds for infinite sequences. The latter analysis predicts that network STM systems may have an optimal recovery length that balances errors due to omission and recall mistakes. Furthermore, we show that the conditions yielding optimal STM capacity can be embodied in several network topologies, including networks with sparse or dense connectivities.

Submitted 3 March, 2015; v1 submitted 30 June, 2013; originally announced July 2013.

Comments: 50 pages, 5 figures

Journal ref: A.S. Charles, H.L. Yap, and C.J. Rozell. Short term network memory capacity via the restricted isometry property. Neural Computation, 26(6), June 2014

arXiv:1210.3395 [pdf, other]

The Restricted Isometry Property for Random Block Diagonal Matrices

Authors: Armin Eftekhari, Han Lun Yap, Christopher J. Rozell, Michael B. Wakin

Abstract: In Compressive Sensing, the Restricted Isometry Property (RIP) ensures that robust recovery of sparse vectors is possible from noisy, undersampled measurements via computationally tractable algorithms. It is by now well-known that Gaussian (or, more generally, sub-Gaussian) random matrices satisfy the RIP under certain conditions on the number of measurements. Their use can be limited in practice, however, due to storage limitations, computational considerations, or the mismatch of such matrices with certain measurement architectures. These issues have recently motivated considerable effort towards studying the RIP for structured random matrices. In this paper, we study the RIP for block diagonal measurement matrices where each block on the main diagonal is itself a sub-Gaussian random matrix. Our main result states that such matrices can indeed satisfy the RIP but that the requisite number of measurements depends on certain properties of the basis in which the signals are sparse. In the best case, these matrices perform nearly as well as dense Gaussian random matrices, despite having many fewer nonzero entries.

Submitted 13 February, 2014; v1 submitted 11 October, 2012; originally announced October 2012.

MSC Class: 94A20; 60B20; 46B09;

arXiv:1209.3312 [pdf, ps, other]

Stable Manifold Embeddings with Structured Random Matrices

Authors: Han Lun Yap, Michael B. Wakin, Christopher J. Rozell

Abstract: The fields of compressed sensing (CS) and matrix completion have shown that high-dimensional signals with sparse or low-rank structure can be effectively projected into a low-dimensional space (for efficient acquisition or processing) when the projection operator achieves a stable embedding of the data by satisfying the Restricted Isometry Property (RIP). It has also been shown that such stable embeddings can be achieved for general Riemannian submanifolds when random orthoprojectors are used for dimensionality reduction. Due to computational costs and system constraints, the CS community has recently explored the RIP for structured random matrices (e.g., random convolutions, localized measurements, deterministic constructions). The main contribution of this paper is to show that any matrix satisfying the RIP (i.e., providing a stable embedding for sparse signals) can be used to construct a stable embedding for manifold-modeled signals by randomizing the column signs and paying reasonable additional factors in the number of measurements. We demonstrate this result with several new constructions for stable manifold embeddings using structured matrices. This result allows advances in efficient projection schemes for sparse signals to be immediately applied to manifold signal models.

Submitted 15 May, 2013; v1 submitted 14 September, 2012; originally announced September 2012.

arXiv:1010.5938 [pdf, ps, other]

Stable Takens' Embeddings for Linear Dynamical Systems

Authors: Han Lun Yap, Christopher J. Rozell

Abstract: Takens' Embedding Theorem remarkably established that concatenating M previous outputs of a dynamical system into a vector (called a delay coordinate map) can be a one-to-one mapping of a low-dimensional attractor from the system state space. However, Takens' theorem is fragile in the sense that even small imperfections can induce arbitrarily large errors in this attractor representation. We extend Takens' result to establish deterministic, explicit and non-asymptotic sufficient conditions for a delay coordinate map to form a stable embedding in the restricted case of linear dynamical systems and observation functions. Our work is inspired by the field of Compressive Sensing (CS), where results guarantee that low-dimensional signal families can be robustly reconstructed if they are stably embedded by a measurement operator. However, in contrast to typical CS results, i) our sufficient conditions are independent of the size of the ambient state space, and ii) some system and measurement pairs have fundamental limits on the conditioning of the embedding (i.e., how close it is to an isometry), meaning that further measurements beyond some point add no further significant value. We use several simple simulations to explore the conditions of the main results, including the tightness of the bounds and the convergence speed of the stable embedding. We also present an example task of estimating the attractor dimension from time-series data to highlight the value of stable embeddings over traditional Takens' embeddings.

Submitted 18 June, 2011; v1 submitted 28 October, 2010; originally announced October 2010.

Showing 1–22 of 22 results for author: Rozell, C