-
An Algorithm for Persistent Homology Computation Using Homomorphic Encryption
Authors:
Dominic Gold,
Koray Karabina,
Francis C. Motta
Abstract:
Topological Data Analysis (TDA) offers a suite of computational tools that provide quantified shape features in high-dimensional data that can be used by modern statistical and predictive machine learning (ML) models. In particular, persistent homology (PH) takes in data (e.g., point clouds, images, time series) and derives compact representations of latent topological structures, known as persistence diagrams (PDs). Because PDs enjoy inherent noise tolerance, are interpretable and provide a solid basis for data analysis, and can be made compatible with the expansive set of well-established ML model architectures, PH has been widely adopted for model development including on sensitive data, such as genomic, cancer, sensor network, and financial data. Thus, TDA should be incorporated into secure end-to-end data analysis pipelines. In this paper, we take the first step to address this challenge and develop a version of the fundamental algorithm to compute PH on encrypted data using homomorphic encryption (HE).
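For orientation, here is a minimal plaintext sketch of boundary-matrix column reduction over Z/2, on the assumption that this standard reduction is what the abstract means by "the fundamental algorithm" for PH. It involves no encryption and is not the authors' homomorphic version. The function name `reduce_boundary_matrix` and the filtered-triangle input are hypothetical, chosen only for illustration.

```python
def reduce_boundary_matrix(columns):
    """Reduce a Z/2 boundary matrix and return (birth, death) index pairs.

    columns[j] is the set of row indices holding a 1 in column j, with
    simplices indexed in filtration order (an assumption of this sketch).
    """
    reduced = [set(c) for c in columns]
    low_to_col = {}  # lowest nonzero row index -> column that owns it
    pairs = []
    for j, col in enumerate(reduced):
        # Add earlier columns (mod 2) until this column's lowest 1 is unique.
        while col and max(col) in low_to_col:
            col ^= reduced[low_to_col[max(col)]]
        if col:
            low_to_col[max(col)] = j
            pairs.append((max(col), j))
    return pairs


# Hypothetical input: a filtered triangle.
# Indices 0-2 are vertices, 3-5 are edges (01, 12, 02), 6 is the 2-simplex.
triangle = [set(), set(), set(), {0, 1}, {1, 2}, {0, 2}, {3, 4, 5}]
pairs = reduce_boundary_matrix(triangle)
```

Each returned pair `(i, j)` says that the class created by simplex `i` is destroyed by simplex `j`. Simplices that never appear in a pair and whose columns reduce to zero, such as vertex 0 here, represent classes that persist.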
Submitted 4 July, 2023;
originally announced July 2023.
-
Discretized Gradient Flow for Manifold Learning in the Space of Embeddings
Authors:
Dara Gold,
Steven Rosenberg
Abstract:
Gradient descent, or negative gradient flow, is a standard technique in optimization to find minima of functions. Many implementations of gradient descent rely on discretized versions, i.e., moving in the gradient direction for a set step size, recomputing the gradient, and continuing. In this paper, we present an approach to manifold learning where gradient descent takes place in the infinite-dimensional space $\mathcal{E} = {\rm Emb}(M,\mathbb{R}^N)$ of smooth embeddings $φ$ of a manifold $M$ into $\mathbb{R}^N$. Implementing a discretized version of gradient descent for $P:\mathcal{E}\to {\mathbb R}$, a penalty function that scores an embedding $φ\in \mathcal{E}$, requires estimating how far we can move in a fixed direction -- the direction of one gradient step -- before leaving the space of smooth embeddings. Our main result is to give an explicit lower bound for this step length in terms of the Riemannian geometry of $φ(M)$. In particular, we consider the case when the gradient of $P$ is pointwise normal to the embedded manifold $φ(M)$. We prove this case arises when $P$ is invariant under diffeomorphisms of $M$, a natural condition in manifold learning.
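The discretized scheme described in the abstract (step against the gradient by a fixed amount, recompute, repeat) can be sketched in finite dimensions as follows. This is only an illustration on the plane, not the paper's setting of embeddings. The quadratic penalty `grad_P`, the step size, and the iteration count are all assumptions made up for the example.

```python
def gradient_descent(grad, x0, step_size=0.1, num_steps=100):
    """Discretized negative gradient flow with a fixed step size."""
    x = list(x0)
    for _ in range(num_steps):
        g = grad(x)  # recompute the gradient at the current point
        x = [xi - step_size * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical penalty P(x, y) = (x - 1)^2 + (y + 2)^2, minimized at (1, -2).
def grad_P(p):
    return [2.0 * (p[0] - 1.0), 2.0 * (p[1] + 2.0)]


minimum = gradient_descent(grad_P, [0.0, 0.0])
```

In this toy problem any sufficiently small step size converges. The abstract's concern is different: in the space of embeddings, the step must also be short enough that the result is still a smooth embedding.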
Submitted 2 May, 2024; v1 submitted 25 January, 2019;
originally announced January 2019.
-
Inference for high-dimensional instrumental variables regression
Authors:
David Gold,
Johannes Lederer,
Jing Tao
Abstract:
This paper concerns statistical inference for the components of a high-dimensional regression parameter despite possible endogeneity of each regressor. Given a first-stage linear model for the endogenous regressors and a second-stage linear model for the dependent variable, we develop a novel adaptation of the parametric one-step update to a generic second-stage estimator. We provide conditions under which the scaled update is asymptotically normal. We then introduce a two-stage Lasso procedure and show that the second-stage Lasso estimator satisfies the aforementioned conditions. Using these results, we construct asymptotically valid confidence intervals for the components of the second-stage regression coefficients. We complement our asymptotic theory with simulation studies, which demonstrate the performance of our method in finite samples.
Submitted 21 November, 2019; v1 submitted 17 August, 2017;
originally announced August 2017.
-
Gradient Flows of Penalty Functions in the Space of Smooth Embeddings
Authors:
Dara Gold
Abstract:
Motivated by manifold learning techniques, we give an explicit lower bound for how far a smoothly embedded compact submanifold in ${\mathbb R}^N$ can move in a normal direction and remain an embedding. In addition, given a penalty function $P : \text{Emb}(M,\mathbb{R}^N) \rightarrow \mathbb{R} $ on the space of embeddings, we give a condition which guarantees that the gradient $\nabla P$ of the penalty function is normal to $φ(M)$ at every point.
Submitted 8 April, 2015;
originally announced April 2015.