A Geometric View of Data Complexity: Efficient Local Intrinsic Dimension Estimation with Diffusion Models

Authors: Hamidreza Kamkari, Brendan Leigh Ross, Rasa Hosseinzadeh, Jesse C. Cresswell, Gabriel Loaiza-Ganem

Abstract: High-dimensional data commonly lies on low-dimensional submanifolds, and estimating the local intrinsic dimension (LID) of a datum -- i.e. the dimension of the submanifold it belongs to -- is a longstanding problem. LID can be understood as the number of local factors of variation: the more factors of variation a datum has, the more complex it tends to be. Estimating this quantity has proven useful in contexts ranging from generalization in neural networks to detection of out-of-distribution data, adversarial examples, and AI-generated text. The recent successes of deep generative models present an opportunity to leverage them for LID estimation, but current methods based on generative models produce inaccurate estimates, require more than a single pre-trained model, are computationally intensive, or do not exploit the best available deep generative models, i.e. diffusion models (DMs). In this work, we show that the Fokker-Planck equation associated with a DM can provide a LID estimator which addresses all the aforementioned deficiencies. Our estimator, called FLIPD, is compatible with all popular DMs, and outperforms existing baselines on LID estimation benchmarks. We also apply FLIPD on natural images where the true LID is unknown. Compared to competing estimators, FLIPD exhibits a higher correlation with non-LID measures of complexity, better matches a qualitative assessment of complexity, and is the only estimator to remain tractable with high-resolution images at the scale of Stable Diffusion.

Submitted 5 June, 2024; originally announced June 2024.

Comments: 10 pages
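
FLIPD itself is not reproduced here. As a rough illustration of what a LID estimate is, the sketch below uses a classical nearest-neighbour maximum-likelihood estimator (in the style of Levina and Bickel), which is a swapped-in technique and not the diffusion-based estimator from the abstract. The function name `lid_mle`, the synthetic dataset, and the choice `k=20` are assumptions made only for this example.

```python
import numpy as np

def lid_mle(data, k=20):
    """Nearest-neighbour MLE of local intrinsic dimension for each row of `data`."""
    # Pairwise Euclidean distances via the Gram matrix (fine for a few thousand points).
    sq = np.sum(data ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * data @ data.T
    dist = np.sqrt(np.maximum(d2, 0.0))
    # Exclude each point from its own neighbour list.
    np.fill_diagonal(dist, np.inf)
    # Distances to the k nearest neighbours, sorted ascending.
    knn = np.sort(dist, axis=1)[:, :k]
    # Inverse of the mean log-ratio between the k-th distance and the closer ones.
    log_ratio = np.log(knn[:, -1:] / knn[:, :-1])
    return (k - 1) / np.sum(log_ratio, axis=1)

# Synthetic data (assumption): a 2-dimensional plane linearly embedded in 10 dimensions.
rng = np.random.default_rng(0)
latent = rng.uniform(-1.0, 1.0, size=(1500, 2))
embed = rng.normal(size=(2, 10))
points = latent @ embed

estimates = lid_mle(points, k=20)
print("mean LID estimate:", float(estimates.mean()))
```

On this toy data the per-point estimates should cluster near the true dimension of 2, even though the ambient dimension is 10.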

arXiv:2404.02954 [pdf, other]

Deep Generative Models through the Lens of the Manifold Hypothesis: A Survey and New Connections

Authors: Gabriel Loaiza-Ganem, Brendan Leigh Ross, Rasa Hosseinzadeh, Anthony L. Caterini, Jesse C. Cresswell

Abstract: In recent years there has been increased interest in understanding the interplay between deep generative models (DGMs) and the manifold hypothesis. Research in this area focuses on understanding the reasons why commonly-used DGMs succeed or fail at learning distributions supported on unknown low-dimensional manifolds, as well as developing new models explicitly designed to account for manifold-supported data. This manifold lens provides both clarity as to why some DGMs (e.g. diffusion models and some generative adversarial networks) empirically surpass others (e.g. likelihood-based models such as variational autoencoders, normalizing flows, or energy-based models) at sample generation, and guidance for devising more performant DGMs. We carry out the first survey of DGMs viewed through this lens, making two novel contributions along the way. First, we formally establish that numerical instability of high-dimensional likelihoods is unavoidable when modelling low-dimensional data. We then show that DGMs on learned representations of autoencoders can be interpreted as approximately minimizing Wasserstein distance: this result, which applies to latent diffusion models, helps justify their outstanding empirical results. The manifold lens provides a rich perspective from which to understand DGMs, which we aim to make more accessible and widespread.

Submitted 3 April, 2024; originally announced April 2024.

arXiv:2403.18910 [pdf, other]

A Geometric Explanation of the Likelihood OOD Detection Paradox

Authors: Hamidreza Kamkari, Brendan Leigh Ross, Jesse C. Cresswell, Anthony L. Caterini, Rahul G. Krishnan, Gabriel Loaiza-Ganem

Abstract: Likelihood-based deep generative models (DGMs) commonly exhibit a puzzling behaviour: when trained on a relatively complex dataset, they assign higher likelihood values to out-of-distribution (OOD) data from simpler sources. Adding to the mystery, OOD samples are never generated by these DGMs despite having higher likelihoods. This two-pronged paradox has yet to be conclusively explained, making likelihood-based OOD detection unreliable. Our primary observation is that high-likelihood regions will not be generated if they contain minimal probability mass. We demonstrate how this seeming contradiction of large densities yet low probability mass can occur around data confined to low-dimensional manifolds. We also show that this scenario can be identified through local intrinsic dimension (LID) estimation, and propose a method for OOD detection which pairs the likelihoods and LID estimates obtained from a pre-trained DGM. Our method can be applied to normalizing flows and score-based diffusion models, and obtains results which match or surpass state-of-the-art OOD detection benchmarks using the same DGM backbones. Our code is available at https://github.com/layer6ai-labs/dgm_ood_detection.

Submitted 11 June, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

Comments: ICML 2024

arXiv:2401.13744 [pdf, other]

Conformal Prediction Sets Improve Human Decision Making

Authors: Jesse C. Cresswell, Yi Sui, Bhargava Kumar, Noël Vouitsis

Abstract: In response to everyday queries, humans explicitly signal uncertainty and offer alternative answers when they are unsure. Machine learning models that output calibrated prediction sets through conformal prediction mimic this human behaviour; larger sets signal greater uncertainty while providing alternatives. In this work, we study the usefulness of conformal prediction sets as an aid for human decision making by conducting a pre-registered randomized controlled trial with conformal prediction sets provided to human subjects. With statistical significance, we find that when humans are given conformal prediction sets their accuracy on tasks improves compared to fixed-size prediction sets with the same coverage guarantee. The results show that quantifying model uncertainty with conformal prediction is helpful for human-in-the-loop decision making and human-AI teams.

Submitted 9 June, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

Comments: Published at ICML 2024. Code available at https://github.com/layer6ai-labs/hitl-conformal-prediction
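
For readers unfamiliar with conformal prediction sets, the following is a minimal sketch of standard split conformal prediction for classification, not the code from the linked repository. The helper names (`conformal_threshold`, `prediction_sets`, `sample`), the synthetic calibrated classifier, and `alpha=0.1` are assumptions chosen for illustration.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Calibrate a score threshold giving roughly (1 - alpha) marginal coverage."""
    n = len(cal_labels)
    # Nonconformity score: one minus the probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample corrected quantile level.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(scores, level, method="higher")

def prediction_sets(probs, threshold):
    """Boolean mask of shape (n, classes); True marks a class in the set."""
    return (1.0 - probs) <= threshold

# Synthetic setup (assumption): labels are drawn from the model's own
# softmax probabilities, so the toy classifier is perfectly calibrated.
rng = np.random.default_rng(0)

def sample(n, n_classes=10):
    logits = rng.normal(scale=2.0, size=(n, n_classes))
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    labels = np.array([rng.choice(n_classes, p=p) for p in probs])
    return probs, labels

cal_probs, cal_labels = sample(2000)
test_probs, test_labels = sample(5000)

threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.1)
sets = prediction_sets(test_probs, threshold)
coverage = sets[np.arange(len(test_labels)), test_labels].mean()
set_sizes = sets.sum(axis=1)
print("empirical coverage:", float(coverage))
print("average set size:", float(set_sizes.mean()))
```

Set size varies per input: confident predictions produce small sets and uncertain ones produce larger sets, which is the signal the abstract describes passing on to human decision makers.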

arXiv:2312.10144 [pdf, other]

Data-Efficient Multimodal Fusion on a Single GPU

Authors: Noël Vouitsis, Zhaoyan Liu, Satya Krishna Gorti, Valentin Villecroze, Jesse C. Cresswell, Guangwei Yu, Gabriel Loaiza-Ganem, Maksims Volkovs

Abstract: The goal of multimodal alignment is to learn a single latent space that is shared between multimodal inputs. The most powerful models in this space have been trained using massive datasets of paired inputs and large-scale computational resources, making them prohibitively expensive to train in many practical scenarios. We surmise that existing unimodal encoders pre-trained on large amounts of unimodal data should provide an effective bootstrap to create multimodal models from unimodal ones at much lower costs. We therefore propose FuseMix, a multimodal augmentation scheme that operates on the latent spaces of arbitrary pre-trained unimodal encoders. Using FuseMix for multimodal alignment, we achieve competitive performance -- and in certain cases outperform state-of-the-art methods -- in both image-text and audio-text retrieval, with orders of magnitude less compute and data: for example, we outperform CLIP on the Flickr30K text-to-image retrieval task with $\sim \! 600\times$ fewer GPU days and $\sim \! 80\times$ fewer image-text pairs. Additionally, we show how our method can be applied to convert pre-trained text-to-image generative models into audio-to-image ones. Code is available at: https://github.com/layer6ai-labs/fusemix.

Submitted 10 April, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

Comments: CVPR 2024 (Highlight)

arXiv:2310.07756 [pdf, other]

Self-supervised Representation Learning From Random Data Projectors

Authors: Yi Sui, Tongzi Wu, Jesse C. Cresswell, Ga Wu, George Stein, Xiao Shi Huang, Xiaochen Zhang, Maksims Volkovs

Abstract: Self-supervised representation learning (SSRL) has advanced considerably by exploiting the transformation invariance assumption under artificially designed data augmentations. While augmentation-based SSRL algorithms push the boundaries of performance in computer vision and natural language processing, they are often not directly applicable to other data modalities, and can conflict with application-specific data augmentation constraints. This paper presents an SSRL approach that can be applied to any data modality and network architecture because it does not rely on augmentations or masking. Specifically, we show that high-quality data representations can be learned by reconstructing random data projections. We evaluate the proposed approach on a wide range of representation learning tasks that span diverse modalities and real-world applications. We show that it outperforms multiple state-of-the-art SSRL baselines. Due to its wide applicability and strong empirical results, we argue that learning from randomness is a fruitful research direction worthy of attention and further study.

Submitted 20 March, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

Comments: Published as a conference paper of ICLR 2024. https://openreview.net/pdf?id=EpYnZpDpsQ

arXiv:2306.08656 [pdf, other]

Augment then Smooth: Reconciling Differential Privacy with Certified Robustness

Authors: Jiapeng Wu, Atiyeh Ashari Ghomi, David Glukhov, Jesse C. Cresswell, Franziska Boenisch, Nicolas Papernot

Abstract: Machine learning models are susceptible to a variety of attacks that can erode trust in their deployment. These threats include attacks against the privacy of training data and adversarial examples that jeopardize model accuracy. Differential privacy and randomized smoothing are effective defenses that provide certifiable guarantees for each of these threats, however, it is not well understood how implementing either defense impacts the other. In this work, we argue that it is possible to achieve both privacy guarantees and certified robustness simultaneously. We provide a framework called DP-CERT for integrating certified robustness through randomized smoothing into differentially private model training. For instance, compared to differentially private stochastic gradient descent on CIFAR10, DP-CERT leads to a 12-fold increase in certified accuracy and a 10-fold increase in the average certified radius at the expense of a drop in accuracy of 1.2%. Through in-depth per-sample metric analysis, we show that the certified radius correlates with the local Lipschitz constant and smoothness of the loss surface. This provides a new way to diagnose when private models will fail to be robust.

Submitted 14 June, 2023; originally announced June 2023.

Comments: 25 pages, 19 figures

arXiv:2306.04675 [pdf, other]

Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models

Authors: George Stein, Jesse C. Cresswell, Rasa Hosseinzadeh, Yi Sui, Brendan Leigh Ross, Valentin Villecroze, Zhaoyan Liu, Anthony L. Caterini, J. Eric T. Taylor, Gabriel Loaiza-Ganem

Abstract: We systematically study a wide variety of generative models spanning semantically-diverse image datasets to understand and improve the feature extractors and metrics used to evaluate them. Using best practices in psychophysics, we measure human perception of image realism for generated samples by conducting the largest experiment evaluating generative models to date, and find that no existing metric strongly correlates with human evaluations. Comparing to 17 modern metrics for evaluating the overall performance, fidelity, diversity, rarity, and memorization of generative models, we find that the state-of-the-art perceptual realism of diffusion models as judged by humans is not reflected in commonly reported metrics such as FID. This discrepancy is not explained by diversity in generated samples, though one cause is over-reliance on Inception-V3. We address these flaws through a study of alternative self-supervised feature extractors, find that the semantic information encoded by individual networks strongly depends on their training procedure, and show that DINOv2-ViT-L/14 allows for much richer evaluation of generative models. Next, we investigate data memorization, and find that generative models do memorize training examples on simple, smaller datasets like CIFAR10, but not necessarily on more complex datasets like ImageNet. However, our experiments show that current metrics do not properly detect memorization: none in the literature is able to separate memorization from other phenomena such as underfitting or mode shrinkage.
To facilitate further development of generative models and their evaluation we release all generated image datasets, human evaluation data, and a modular library to compute 17 common metrics for 9 different encoders at https://github.com/layer6ai-labs/dgm-eval.

Submitted 30 October, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

Comments: NeurIPS 2023. 53 pages, 29 figures, 12 tables. Code at https://github.com/layer6ai-labs/dgm-eval, reviews at https://openreview.net/forum?id=08zf7kTOoh

Journal ref: Thirty-seventh Conference on Neural Information Processing Systems (2023)

arXiv:2212.01265 [pdf, other]

Denoising Deep Generative Models

Authors: Gabriel Loaiza-Ganem, Brendan Leigh Ross, Luhuan Wu, John P. Cunningham, Jesse C. Cresswell, Anthony L. Caterini

Abstract: Likelihood-based deep generative models have recently been shown to exhibit pathological behaviour under the manifold hypothesis as a consequence of using high-dimensional densities to model data with low-dimensional structure. In this paper we propose two methodologies aimed at addressing this problem. Both are based on adding Gaussian noise to the data to remove the dimensionality mismatch during training, and both provide a denoising mechanism whose goal is to sample from the model as though no noise had been added to the data. Our first approach is based on Tweedie's formula, and the second on models which take the variance of added noise as a conditional input. We show that surprisingly, while well motivated, these approaches only sporadically improve performance over not adding noise, and that other methods of addressing the dimensionality mismatch are more empirically adequate.

Submitted 4 January, 2023; v1 submitted 30 November, 2022; originally announced December 2022.

Comments: NeurIPS 2022 ICBINB workshop (spotlight)
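
Tweedie's formula, mentioned in the abstract above, states that for y = x + sigma * eps with standard Gaussian eps, the posterior mean is E[x | y] = y + sigma^2 * grad log p(y), where p is the density of the noisy variable. The sketch below checks this on a one-dimensional Gaussian prior, where the score of the noisy density is known in closed form. It is an illustrative toy, not the paper's procedure; `tweedie_denoise`, `noisy_score`, and the chosen standard deviations are assumptions.

```python
import numpy as np

def tweedie_denoise(y, score, sigma):
    """Posterior-mean denoiser: E[x | y] = y + sigma^2 * score(y)."""
    return y + sigma ** 2 * score(y)

# Toy setting (assumption): x ~ N(0, prior_std^2) and y = x + sigma * eps,
# so y ~ N(0, prior_std^2 + sigma^2) and its score is available analytically.
prior_std, sigma = 2.0, 0.5

def noisy_score(y):
    # d/dy log N(y; 0, prior_std^2 + sigma^2)
    return -y / (prior_std ** 2 + sigma ** 2)

rng = np.random.default_rng(0)
x = rng.normal(scale=prior_std, size=100_000)
y = x + sigma * rng.normal(size=x.shape)

x_hat = tweedie_denoise(y, noisy_score, sigma)
mse_noisy = float(np.mean((y - x) ** 2))
mse_denoised = float(np.mean((x_hat - x) ** 2))
print("MSE of noisy samples:   ", mse_noisy)
print("MSE of denoised samples:", mse_denoised)
```

In this Gaussian case the denoiser reduces to the familiar shrinkage y * prior_std^2 / (prior_std^2 + sigma^2). With a learned model, the analytic score would be replaced by the model's estimate of the score of the noisy density.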

arXiv:2211.15380 [pdf, other]

CaloMan: Fast generation of calorimeter showers with density estimation on learned manifolds

Authors: Jesse C. Cresswell, Brendan Leigh Ross, Gabriel Loaiza-Ganem, Humberto Reyes-Gonzalez, Marco Letizia, Anthony L. Caterini

Abstract: Precision measurements and new physics searches at the Large Hadron Collider require efficient simulations of particle propagation and interactions within the detectors. The most computationally expensive simulations involve calorimeter showers. Advances in deep generative modelling - particularly in the realm of high-dimensional data - have opened the possibility of generating realistic calorimeter showers orders of magnitude more quickly than physics-based simulation. However, the high-dimensional representation of showers belies the relative simplicity and structure of the underlying physical laws. This phenomenon is yet another example of the manifold hypothesis from machine learning, which states that high-dimensional data is supported on low-dimensional manifolds. We thus propose modelling calorimeter showers first by learning their manifold structure, and then estimating the density of data across this manifold. Learning manifold structure reduces the dimensionality of the data, which enables fast training and generation when compared with competing methods.

Submitted 23 November, 2022; originally announced November 2022.

Comments: Accepted to the Machine Learning and the Physical Sciences Workshop at NeurIPS 2022

arXiv:2210.06597 [pdf, other]

Find Your Friends: Personalized Federated Learning with the Right Collaborators

Authors: Yi Sui, Junfeng Wen, Yenson Lau, Brendan Leigh Ross, Jesse C. Cresswell

Abstract: In the traditional federated learning setting, a central server coordinates a network of clients to train one global model. However, the global model may serve many clients poorly due to data heterogeneity. Moreover, there may not exist a trusted central party that can coordinate the clients to ensure that each of them can benefit from others. To address these concerns, we present a novel decentralized framework, FedeRiCo, where each client can learn as much or as little from other clients as is optimal for its local data distribution. Based on expectation-maximization, FedeRiCo estimates the utilities of other participants' models on each client's data so that everyone can select the right collaborators for learning. As a result, our algorithm outperforms other federated, personalized, and/or decentralized approaches on several benchmark datasets, being the only approach that consistently performs better than training with local data only.

Submitted 14 October, 2022; v1 submitted 12 October, 2022; originally announced October 2022.

arXiv:2207.02862 [pdf, other]

Verifying the Union of Manifolds Hypothesis for Image Data

Authors: Bradley C. A. Brown, Anthony L. Caterini, Brendan Leigh Ross, Jesse C. Cresswell, Gabriel Loaiza-Ganem

Abstract: Deep learning has had tremendous success at learning low-dimensional representations of high-dimensional data. This success would be impossible if there was no hidden low-dimensional structure in data of interest; this existence is posited by the manifold hypothesis, which states that the data lies on an unknown manifold of low intrinsic dimension. In this paper, we argue that this hypothesis does not properly capture the low-dimensional structure typically present in image data. Assuming that data lies on a single manifold implies intrinsic dimension is identical across the entire data space, and does not allow for subregions of this space to have a different number of factors of variation. To address this deficiency, we consider the union of manifolds hypothesis, which states that data lies on a disjoint union of manifolds of varying intrinsic dimensions. We empirically verify this hypothesis on commonly-used image datasets, finding that indeed, observed data lies on a disconnected set and that intrinsic dimension is not constant. We also provide insights into the implications of the union of manifolds hypothesis in deep learning, both supervised and unsupervised, showing that designing models with an inductive bias for this structure improves performance across classification and generative modelling tasks. Our code is available at https://github.com/layer6ai-labs/UoMH.

Submitted 2 March, 2023; v1 submitted 6 July, 2022; originally announced July 2022.

Comments: ICLR 2023

arXiv:2206.11267 [pdf, other]

Neural Implicit Manifold Learning for Topology-Aware Density Estimation

Authors: Brendan Leigh Ross, Gabriel Loaiza-Ganem, Anthony L. Caterini, Jesse C. Cresswell

Abstract: Natural data observed in $\mathbb{R}^n$ is often constrained to an $m$-dimensional manifold $\mathcal{M}$, where $m < n$. This work focuses on the task of building theoretically principled generative models for such data. Current generative models learn $\mathcal{M}$ by mapping an $m$-dimensional latent variable through a neural network $f_θ: \mathbb{R}^m \to \mathbb{R}^n$. These procedures, which we call pushforward models, incur a straightforward limitation: manifolds cannot in general be represented with a single parameterization, meaning that attempts to do so will incur either computational instability or the inability to learn probability densities within the manifold. To remedy this problem, we propose to model $\mathcal{M}$ as a neural implicit manifold: the set of zeros of a neural network. We then learn the probability density within $\mathcal{M}$ with a constrained energy-based model, which employs a constrained variant of Langevin dynamics to train and sample from the learned manifold. In experiments on synthetic and natural data, we show that our model can learn manifold-supported distributions with complex topologies more accurately than pushforward models.

Submitted 21 December, 2023; v1 submitted 22 June, 2022; originally announced June 2022.

Comments: Accepted to TMLR in 2023. Code: https://github.com/layer6ai-labs/implicit-manifolds

arXiv:2206.07737 [pdf, other]

Disparate Impact in Differential Privacy from Gradient Misalignment

Authors: Maria S. Esipova, Atiyeh Ashari Ghomi, Yaqiao Luo, Jesse C. Cresswell

Abstract: As machine learning becomes more widespread throughout society, aspects including data privacy and fairness must be carefully considered, and are crucial for deployment in highly regulated industries. Unfortunately, the application of privacy enhancing technologies can worsen unfair tendencies in models. In particular, one of the most widely used techniques for private model training, differentially private stochastic gradient descent (DPSGD), frequently intensifies disparate impact on groups within data. In this work we study the fine-grained causes of unfairness in DPSGD and identify gradient misalignment due to inequitable gradient clipping as the most significant source. This observation leads us to a new method for reducing unfairness by preventing gradient misalignment in DPSGD.

Submitted 23 February, 2023; v1 submitted 15 June, 2022; originally announced June 2022.

Comments: ICLR 2023 notable top 25%, https://openreview.net/forum?id=qLOaeRvteqbx. Our code is available at https://github.com/layer6ai-labs/fair-dp
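
To make the notion of gradient misalignment from per-sample clipping concrete, here is a minimal sketch of the standard DPSGD aggregation step (clip each per-sample gradient, sum, add Gaussian noise, average) on hand-made gradients. It illustrates the problem only and does not implement the paper's proposed fix; the function names, the two-group toy gradients, and the hyperparameters are assumptions.

```python
import numpy as np

def clip_per_sample(grads, clip_norm):
    """Rescale each row so that its L2 norm is at most clip_norm."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return grads * scale

def dpsgd_gradient(grads, clip_norm, noise_multiplier, rng):
    """Standard DPSGD aggregate: clip, sum, add Gaussian noise, average."""
    clipped = clip_per_sample(grads, clip_norm)
    noise = rng.normal(scale=noise_multiplier * clip_norm, size=grads.shape[1])
    return (clipped.sum(axis=0) + noise) / len(grads)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy per-sample gradients (assumption): a majority group with small gradients
# along the first axis and a minority group with large gradients along the second.
majority = np.tile([0.5, 0.0], (90, 1))
minority = np.tile([0.0, 20.0], (10, 1))
grads = np.vstack([majority, minority])

rng = np.random.default_rng(0)
true_grad = grads.mean(axis=0)
# Noise is switched off here to isolate the effect of clipping alone.
clipped_grad = dpsgd_gradient(grads, clip_norm=1.0, noise_multiplier=0.0, rng=rng)

alignment = cosine(true_grad, clipped_grad)
print("unclipped mean gradient:", true_grad)
print("clipped mean gradient:  ", clipped_grad)
print("cosine similarity:      ", alignment)
```

Only the minority group's gradients exceed the clipping norm, so only they are shrunk. The aggregate direction rotates toward the majority group, which is the kind of inequitable clipping the abstract identifies as a source of disparate impact.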

arXiv:2204.07172 [pdf, other]

Diagnosing and Fixing Manifold Overfitting in Deep Generative Models

Authors: Gabriel Loaiza-Ganem, Brendan Leigh Ross, Jesse C. Cresswell, Anthony L. Caterini

Abstract: Likelihood-based, or explicit, deep generative models use neural networks to construct flexible high-dimensional densities. This formulation directly contradicts the manifold hypothesis, which states that observed data lies on a low-dimensional manifold embedded in high-dimensional ambient space. In this paper we investigate the pathologies of maximum-likelihood training in the presence of this dimensionality mismatch. We formally prove that degenerate optima are achieved wherein the manifold itself is learned but not the distribution on it, a phenomenon we call manifold overfitting. We propose a class of two-step procedures consisting of a dimensionality reduction step followed by maximum-likelihood density estimation, and prove that they recover the data-generating distribution in the nonparametric regime, thus avoiding manifold overfitting. We also show that these procedures enable density estimation on the manifolds learned by implicit models, such as generative adversarial networks, hence addressing a major shortcoming of these models. Several recently proposed methods are instances of our two-step procedures; we thus unify, extend, and theoretically justify a large class of models.

Submitted 28 November, 2022; v1 submitted 14 April, 2022; originally announced April 2022.

Comments: Accepted for publication in TMLR

arXiv:2111.11343 [pdf, other]

doi 10.1038/s41467-023-38569-4

Decentralized Federated Learning through Proxy Model Sharing

Authors: Shivam Kalra, Junfeng Wen, Jesse C. Cresswell, Maksims Volkovs, Hamid R. Tizhoosh

Abstract: Institutions in highly regulated domains such as finance and healthcare often have restrictive rules around data sharing. Federated learning is a distributed learning framework that enables multi-institutional collaborations on decentralized data with improved protection for each collaborator's data privacy. In this paper, we propose a communication-efficient scheme for decentralized federated learning called ProxyFL, or proxy-based federated learning. Each participant in ProxyFL maintains two models, a private model, and a publicly shared proxy model designed to protect the participant's privacy. Proxy models allow efficient information exchange among participants without the need of a centralized server. The proposed method eliminates a significant limitation of canonical federated learning by allowing model heterogeneity; each participant can have a private model with any architecture. Furthermore, our protocol for communication by proxy leads to stronger privacy guarantees using differential privacy analysis. Experiments on popular image datasets, and a cancer diagnostic problem using high-quality gigapixel histology whole slide images, show that ProxyFL can outperform existing alternatives with much less communication overhead and stronger privacy.

Submitted 22 May, 2023; v1 submitted 22 November, 2021; originally announced November 2021.

Journal ref: Nature Communications 14, 2899 (2023)

arXiv:2106.05275 [pdf, other]

Tractable Density Estimation on Learned Manifolds with Conformal Embedding Flows

Authors: Brendan Leigh Ross, Jesse C. Cresswell

Abstract: Normalizing flows are generative models that provide tractable density estimation via an invertible transformation from a simple base distribution to a complex target distribution. However, this technique cannot directly model data supported on an unknown low-dimensional manifold, a common occurrence in real-world domains such as image data. Recent attempts to remedy this limitation have introduced geometric complications that defeat a central benefit of normalizing flows: exact density estimation. We recover this benefit with Conformal Embedding Flows, a framework for designing flows that learn manifolds with tractable densities. We argue that composing a standard flow with a trainable conformal embedding is the most natural way to model manifold-supported data. To this end, we present a series of conformal building blocks and apply them in experiments with synthetic and real-world data to demonstrate that flows can model manifold-supported distributions without sacrificing tractable likelihoods.

Submitted 11 November, 2021; v1 submitted 9 June, 2021; originally announced June 2021.

Comments: NeurIPS 2021 Camera-Ready. Code: https://github.com/layer6ai-labs/CEF

arXiv:2011.12363 [pdf, other]

C-Learning: Horizon-Aware Cumulative Accessibility Estimation

Authors: Panteha Naderian, Gabriel Loaiza-Ganem, Harry J. Braviner, Anthony L. Caterini, Jesse C. Cresswell, Tong Li, Animesh Garg

Abstract: Multi-goal reaching is an important problem in reinforcement learning needed to achieve algorithmic generalization. Despite recent advances in this field, current algorithms suffer from three major challenges: high sample complexity, learning only a single way of reaching the goals, and difficulties in solving complex motion planning tasks. In order to address these limitations, we introduce the concept of cumulative accessibility functions, which measure the reachability of a goal from a given state within a specified horizon. We show that these functions obey a recurrence relation, which enables learning from offline interactions. We also prove that optimal cumulative accessibility functions are monotonic in the planning horizon. Additionally, our method can trade off speed and reliability in goal-reaching by suggesting multiple paths to a single goal depending on the provided horizon. We evaluate our approach on a set of multi-goal discrete and continuous control tasks. We show that our method outperforms state-of-the-art goal-reaching algorithms in success rate, sample complexity, and path optimality. Our code is available at https://github.com/layer6ai-labs/CAE, and additional visualizations can be found at https://sites.google.com/view/learning-cae/.

Submitted 25 January, 2021; v1 submitted 24 November, 2020; originally announced November 2020.

Comments: Accepted at ICLR 2021

arXiv:1909.13347 [pdf, other]

Quantum Information Approaches to Quantum Gravity

Authors: Jesse C. Cresswell

Abstract: In this thesis we apply techniques from quantum information theory to study quantum gravity within the framework of the anti-de Sitter / conformal field theory correspondence (AdS/CFT). Through AdS/CFT, progress has been made in understanding the structure of entanglement in quantum field theories, and in how gravitational physics can emerge from these structures. However, this understanding is far from complete and will require the development of new tools to quantify correlations in CFT. This thesis presents refinements of a duality between operator product expansion (OPE) blocks in the CFT, giving the contribution of a conformal family to the OPE, and geodesic integrated fields in AdS which are diffeomorphism invariant quantities. This duality was originally discovered in the maximally symmetric setting of pure AdS dual to the CFT ground state. In less symmetric states the duality must be modified. Working with excited states within AdS$_3$/CFT$_2$, this thesis shows how the OPE block decomposes into more fine-grained CFT observables that are dual to AdS fields integrated over non-minimal geodesics. Additionally, this thesis contains results on the dynamics of entanglement measures for general quantum systems. Results are presented for the family of quantum Rényi entropies and entanglement negativity. Rényi entropies are studied for general dynamics by imposing special initial conditions. Around pure, separable initial states, all Rényi entropies grow with the same timescale at leading, and next-to-leading order. Mathematical tools are developed for the differentiation of non-analytic matrix functions with respect to constrained arguments and are used to construct analytic expressions for derivatives of negativity. We establish bounds on the rate of change of state distinguishability and the rate of entanglement growth for closed systems. Note: Abstract shortened.

Submitted 29 September, 2019; originally announced September 2019.

Comments: PhD Thesis, contains previously published papers and additional work, 135 pages

arXiv:1906.07731 [pdf, ps, other]

doi 10.1088/1751-8121/ab6fc9

Operational symmetries of entangled states

Authors: Ilan Tzitrin, Aaron Z. Goldberg, Jesse C. Cresswell

Abstract: Quantum entanglement obscures the notion of local operations; there exist quantum states for which all local actions on one subsystem can be equivalently realized by actions on another. We characterize the states for which this fundamental property of entanglement does and does not hold, including multipartite and mixed states. Our results lead to a method for quantifying entanglement based on operational symmetries and has connections to quantum steering, envariance, the Reeh-Schlieder theorem, and classical entanglement.

Submitted 12 February, 2020; v1 submitted 18 June, 2019; originally announced June 2019.

Comments: 8 pages including 2 appendices and 2 figures; comments welcome

Journal ref: Journal of Physics A: Mathematical and Theoretical (2020), 53(9) 095304

arXiv:1809.09107 [pdf, other]

doi 10.1007/JHEP03(2019)058

Holographic relations for OPE blocks in excited states

Authors: Jesse C. Cresswell, Ian T. Jardine, Amanda W. Peet

Abstract: We study the holographic duality between boundary OPE blocks and geodesic integrated bulk fields in quotients of AdS$_3$ dual to excited CFT states. The quotient geometries exhibit non-minimal geodesics between pairs of spacelike separated boundary points which modify the OPE block duality. We decompose OPE blocks into quotient invariant operators and propose a duality with bulk fields integrated over individual geodesics, minimal or non-minimal. We provide evidence for this relationship by studying the monodromy of asymptotic maps that implement the quotients.

Submitted 13 March, 2019; v1 submitted 24 September, 2018; originally announced September 2018.

Comments: 22 pages. As published in JHEP

Journal ref: JHEP03(2019)058

arXiv:1809.07772 [pdf, other]

doi 10.1103/PhysRevA.99.012322

Perturbative expansion of entanglement negativity using patterned matrix calculus

Authors: Jesse C. Cresswell, Ilan Tzitrin, Aaron Z. Goldberg

Abstract: Negativity is an entanglement monotone frequently used to quantify entanglement in bipartite states. Because negativity is a non-analytic function of a density matrix, existing methods used in the physics literature are insufficient to compute its derivatives. To this end we develop techniques in the calculus of complex, patterned matrices and use them to conduct a perturbative analysis of negativity in terms of arbitrary variations of the density operator. The result is an easy-to-implement expansion that can be carried out to all orders. On the way we provide convenient representations of the partial transposition map appearing in the definition of negativity. Our methods are well-suited to study the growth and decay of entanglement in a wide range of physical systems, including the generic linear growth of entanglement in many-body systems, and have broad relevance to many functions of quantum states and observables.

Submitted 15 January, 2019; v1 submitted 20 September, 2018; originally announced September 2018.

Comments: 10 pages, 3 figures; as published in PRA

Journal ref: Phys. Rev. A 99, 012322 (2019)

arXiv:1709.10064 [pdf, other]

doi 10.1103/PhysRevA.97.022317

Universal entanglement timescale for Rényi entropies

Authors: Jesse C. Cresswell

Abstract: Recently it was shown that the growth of entanglement in an initially separable state, as measured by the purity of subsystems, can be characterized by a timescale that takes a universal form for any Hamiltonian. We show that the same timescale governs the growth of entanglement for all Rényi entropies. Since the family of Rényi entropies completely characterizes the entanglement of a pure bipartite state, our timescale is a universal feature of bipartite entanglement. The timescale depends only on the interaction Hamiltonian and the initial state.

Submitted 13 February, 2018; v1 submitted 28 September, 2017; originally announced September 2017.

Comments: 6 pages, 6 figures. As published in PRA

Journal ref: Phys. Rev. A 97, 022317 (2018)

arXiv:1708.09838 [pdf, other]

doi 10.1007/JHEP11(2017)155

Kinematic space for conical defects

Authors: Jesse C. Cresswell, Amanda W. Peet

Abstract: Kinematic space can be used as an intermediate step in the AdS/CFT dictionary and lends itself naturally to the description of diffeomorphism invariant quantities. From the bulk it has been defined as the space of boundary anchored geodesics, and from the boundary as the space of pairs of CFT points. When the bulk is not globally AdS$_3$ the appearance of non-minimal geodesics leads to ambiguities in these definitions. In this work conical defect spacetimes are considered as an example where non-minimal geodesics are common. From the bulk it is found that the conical defect kinematic space can be obtained from the AdS$_3$ kinematic space by the same quotient under which one obtains the defect from AdS$_3$. The resulting kinematic space is one of many equivalent fundamental regions. From the boundary the conical defect kinematic space can be determined by breaking up OPE blocks into contributions from individual bulk geodesics. A duality is established between partial OPE blocks and bulk fields integrated over individual geodesics, minimal or non-minimal.

Submitted 25 November, 2017; v1 submitted 31 August, 2017; originally announced August 2017.

Comments: 29 pages, 9 figures. As published in JHEP

Journal ref: JHEP11(2017)155

arXiv:1504.05914 [pdf, ps, other]

doi 10.1103/PhysRevD.91.084008

Lorenz gauge quantization in conformally flat spacetimes

Authors: Jesse C. Cresswell, Dan N. Vollick

Abstract: Recently it was shown that Dirac's method of quantizing constrained dynamical systems can be used to impose the Lorenz gauge condition in a four-dimensional cosmological spacetime. In this paper we use Dirac's method to impose the Lorenz gauge condition in a general four-dimensional conformally flat spacetime and find that there is no particle production. We show that in cosmological spacetimes with dimension $D\neq 4$ there will be particle production when the scale factor changes, and we calculate the particle production due to a sudden change.

Submitted 22 April, 2015; originally announced April 2015.

Comments: 8 pages

Journal ref: Phys. Rev. D 91, 084008 (2015)

Showing 1–25 of 25 results for author: Cresswell, J C