-
Robust Estimation for Random Graphs
Authors:
Jayadev Acharya,
Ayush Jain,
Gautam Kamath,
Ananda Theertha Suresh,
Huanyu Zhang
Abstract:
We study the problem of robustly estimating the parameter $p$ of an Erdős-Rényi random graph on $n$ nodes, where a $γ$ fraction of nodes may be adversarially corrupted. After showing the deficiencies of canonical estimators, we design a computationally-efficient spectral algorithm which estimates $p$ up to accuracy $\tilde O(\sqrt{p(1-p)}/n + γ\sqrt{p(1-p)} /\sqrt{n}+ γ/n)$ for $γ< 1/60$. Furthermore, we give an inefficient algorithm with similar accuracy for all $γ<1/2$, the information-theoretic limit. Finally, we prove a nearly-matching statistical lower bound, showing that the error of our algorithms is optimal up to logarithmic factors.
Submitted 15 February, 2022; v1 submitted 9 November, 2021;
originally announced November 2021.
-
Deep Neural Network for Respiratory Sound Classification in Wearable Devices Enabled by Patient Specific Model Tuning
Authors:
Jyotibdha Acharya,
Arindam Basu
Abstract:
The primary objective of this paper is to build classification models and strategies to identify breathing sound anomalies (wheeze, crackle) for automated diagnosis of respiratory and pulmonary diseases. In this work we propose a deep CNN-RNN model that classifies respiratory sounds based on Mel-spectrograms. We also implement a patient-specific model tuning strategy that first screens respiratory patients and then builds patient-specific classification models using limited patient data for reliable anomaly detection. Moreover, we devise a local log quantization strategy for model weights to reduce the memory footprint for deployment in memory-constrained systems such as wearable devices. The proposed hybrid CNN-RNN model achieves a score of 66.31% on four-class classification of breathing cycles for the ICBHI'17 scientific challenge respiratory sound database. When the model is re-trained with patient-specific data, it produces a score of 71.81% for leave-one-out validation. The proposed weight quantization technique achieves ~4X reduction in total memory cost without loss of performance. The main contributions of the paper are as follows. First, the proposed model achieves a state-of-the-art score on the ICBHI'17 dataset. Second, deep learning models are shown to successfully learn domain-specific knowledge when pre-trained with breathing data and to produce significantly superior performance compared to generalized models. Finally, local log quantization of trained weights is shown to reduce the memory requirement significantly. This type of patient-specific re-training strategy can be very useful in developing reliable long-term automated patient monitoring systems, particularly in wearable healthcare solutions.
Submitted 16 April, 2020;
originally announced April 2020.
-
Differentially Private Assouad, Fano, and Le Cam
Authors:
Jayadev Acharya,
Ziteng Sun,
Huanyu Zhang
Abstract:
Le Cam's method, Fano's inequality, and Assouad's lemma are three widely used techniques to prove lower bounds for statistical estimation tasks. We propose their analogues under central differential privacy. Our results are simple, easy to apply and we use them to establish sample complexity bounds in several estimation tasks. We establish the optimal sample complexity of discrete distribution estimation under total variation distance and $\ell_2$ distance. We also provide lower bounds for several other distribution classes, including product distributions and Gaussian mixtures that are tight up to logarithmic factors. The technical component of our paper relates coupling between distributions to the sample complexity of estimation under differential privacy.
Submitted 1 November, 2020; v1 submitted 14 April, 2020;
originally announced April 2020.
-
Is my Neural Network Neuromorphic? Taxonomy, Recent Trends and Future Directions in Neuromorphic Engineering
Authors:
Sumon Kumar Bose,
Jyotibdha Acharya,
Arindam Basu
Abstract:
In this paper, we review recent work published over the last 3 years under the umbrella of neuromorphic engineering to analyze the common features among such systems. We see that there is no clear consensus, but each system has one or more of the following features: (1) analog computing, (2) non-von Neumann architecture and low-precision digital processing, (3) Spiking Neural Networks (SNN) with components closely related to biology. We compare recent machine learning accelerator chips to show that analog processing and reduced bit precision architectures indeed have the best throughput, energy and area efficiencies. However, pure digital architectures can also achieve quite high efficiencies by just adopting a non-von Neumann architecture. Given the design automation tools available for digital hardware design, this raises a question about the likelihood of adoption of analog processing in the near future for industrial designs. Next, we argue for the importance of defining standards and choosing proper benchmarks for the progress of neuromorphic system designs and propose some desired characteristics of such benchmarks. Finally, we show brain-machine interfaces as a potential task that fulfils all the criteria of such benchmarks.
Submitted 27 February, 2020;
originally announced February 2020.
-
Context-Aware Local Differential Privacy
Authors:
Jayadev Acharya,
Keith Bonawitz,
Peter Kairouz,
Daniel Ramage,
Ziteng Sun
Abstract:
Local differential privacy (LDP) is a strong notion of privacy for individual users that often comes at the expense of a significant drop in utility. The classical definition of LDP assumes that all elements in the data domain are equally sensitive. However, in many applications, some symbols are more sensitive than others. This work proposes a context-aware framework of local differential privacy that allows a privacy designer to incorporate the application's context into the privacy definition. For binary data domains, we provide a universally optimal privatization scheme and highlight its connections to Warner's randomized response (RR) and Mangat's improved response. Motivated by geolocation and web search applications, for $k$-ary data domains, we consider two special cases of context-aware LDP: block-structured LDP and high-low LDP. We study discrete distribution estimation and provide communication-efficient, sample-optimal schemes and information-theoretic lower bounds for both models. We show that using contextual information can require fewer samples than classical LDP to achieve the same accuracy.
Submitted 27 July, 2020; v1 submitted 31 October, 2019;
originally announced November 2019.
-
Optimal multiclass overfitting by sequence reconstruction from Hamming queries
Authors:
Jayadev Acharya,
Ananda Theertha Suresh
Abstract:
A primary concern of excessive reuse of test datasets in machine learning is that it can lead to overfitting. Multiclass classification was recently shown to be more resistant to overfitting than binary classification. In an open problem of COLT 2019, Feldman, Frostig, and Hardt ask to characterize the dependence of the amount of overfitting bias on the number of classes $m$, the number of accuracy queries $k$, and the number of examples in the dataset $n$. We resolve this problem and determine the amount of overfitting possible in multi-class classification. We provide computationally efficient algorithms that achieve overfitting bias of $\tilde{Θ}(\max\{\sqrt{{k}/{(mn)}}, k/n\})$, matching the known upper bounds.
Submitted 21 October, 2019; v1 submitted 8 August, 2019;
originally announced August 2019.
-
Distributed Learning with Sublinear Communication
Authors:
Jayadev Acharya,
Christopher De Sa,
Dylan J. Foster,
Karthik Sridharan
Abstract:
In distributed statistical learning, $N$ samples are split across $m$ machines and a learner wishes to use minimal communication to learn as well as if the examples were on a single machine. This model has received substantial interest in machine learning due to its scalability and potential for parallel speedup. However, in high-dimensional settings, where the number of examples is smaller than the number of features ("dimension"), the speedup afforded by distributed learning may be overshadowed by the cost of communicating a single example. This paper investigates the following question: When is it possible to learn a $d$-dimensional model in the distributed setting with total communication sublinear in $d$?
Starting with a negative result, we show that for learning $\ell_1$-bounded or sparse linear models, no algorithm can obtain optimal error until communication is linear in dimension. Our main result is that by slightly relaxing the standard boundedness assumptions for linear models, we can obtain distributed algorithms that enjoy optimal error with communication logarithmic in dimension. This result is based on a family of algorithms that combine mirror descent with randomized sparsification/quantization of iterates, and extends to the general stochastic convex optimization model.
Submitted 17 March, 2019; v1 submitted 28 February, 2019;
originally announced February 2019.
-
Learning and Testing Causal Models with Interventions
Authors:
Jayadev Acharya,
Arnab Bhattacharyya,
Constantinos Daskalakis,
Saravanan Kandasamy
Abstract:
We consider testing and learning problems on causal Bayesian networks as defined by Pearl (Pearl, 2009). Given a causal Bayesian network $\mathcal{M}$ on a graph with $n$ discrete variables and bounded in-degree and bounded `confounded components', we show that $O(\log n)$ interventions on an unknown causal Bayesian network $\mathcal{X}$ on the same graph, and $\tilde{O}(n/ε^2)$ samples per intervention, suffice to efficiently distinguish whether $\mathcal{X}=\mathcal{M}$ or whether there exists some intervention under which $\mathcal{X}$ and $\mathcal{M}$ are farther than $ε$ in total variation distance. We also obtain sample/time/intervention efficient algorithms for: (i) testing the identity of two unknown causal Bayesian networks on the same graph; and (ii) learning a causal Bayesian network on a given graph. Although our algorithms are non-adaptive, we show that adaptivity does not help in general: $Ω(\log n)$ interventions are necessary for testing the identity of two unknown causal Bayesian networks on the same graph, even adaptively. Our algorithms are enabled by a new subadditivity inequality for the squared Hellinger distance between two causal Bayesian networks.
Submitted 24 May, 2018;
originally announced May 2018.
-
Near-optimal-sample estimators for spherical Gaussian mixtures
Authors:
Jayadev Acharya,
Ashkan Jafarpour,
Alon Orlitsky,
Ananda Theertha Suresh
Abstract:
Statistical and machine-learning algorithms are frequently applied to high-dimensional data. In many of these applications data is scarce, and often much more costly than computation time. We provide the first sample-efficient polynomial-time estimator for high-dimensional spherical Gaussian mixtures.
For mixtures of any $k$ $d$-dimensional spherical Gaussians, we derive an intuitive spectral-estimator that uses $\mathcal{O}_k\bigl(\frac{d\log^2d}{ε^4}\bigr)$ samples and runs in time $\mathcal{O}_{k,ε}(d^3\log^5 d)$, both significantly lower than previously known. The constant factor $\mathcal{O}_k$ is polynomial for sample complexity and is exponential for the time complexity, again much smaller than what was previously known. We also show that $Ω_k\bigl(\frac{d}{ε^2}\bigr)$ samples are needed for any algorithm. Hence the sample complexity is near-optimal in the number of dimensions.
We also derive a simple estimator for one-dimensional mixtures that uses $\mathcal{O}\bigl(\frac{k \log \frac{k}ε }{ε^2} \bigr)$ samples and runs in time $\widetilde{\mathcal{O}}\left(\bigl(\frac{k}ε\bigr)^{3k+1}\right)$. Our other technical contributions include a faster algorithm for choosing a density estimate from a set of distributions, that minimizes the $\ell_1$ distance to an unknown underlying distribution.
Submitted 19 February, 2014;
originally announced February 2014.