Showing 1–30 of 30 results for author: Kluger, Y

Searching in archive cs.
  1. arXiv:2310.13854  [pdf, other]

    cs.LG math.OC

    Exponential weight averaging as damped harmonic motion

    Authors: Jonathan Patsenker, Henry Li, Yuval Kluger

    Abstract: The exponential moving average (EMA) is a commonly used statistic for providing stable estimates of stochastic quantities in deep learning optimization. Recently, EMA has seen considerable use in generative models, where it is computed with respect to the model weights, and significantly improves the stability of the inference model during and after training. While the practice of weight averaging…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: 10 pages, 7 figures. ICML Workshop on New Frontiers in Learning, Control, and Dynamical Systems. 2023

  2. arXiv:2303.09381  [pdf, other]

    cs.LG

    Multi-modal Differentiable Unsupervised Feature Selection

    Authors: Junchen Yang, Ofir Lindenbaum, Yuval Kluger, Ariel Jaffe

    Abstract: Multi-modal high throughput biological data presents a great scientific opportunity and a significant computational challenge. In multi-modal measurements, every sample is observed simultaneously by two or more sets of sensors. In such settings, many observed variables in both modalities are often nuisance and do not carry information about the phenomenon of interest. Here, we propose a multi-moda…

    Submitted 16 March, 2023; originally announced March 2023.

  3. arXiv:2210.10715  [pdf, other]

    cs.LG stat.ML

    Autoregressive Generative Modeling with Noise Conditional Maximum Likelihood Estimation

    Authors: Henry Li, Yuval Kluger

    Abstract: We introduce a simple modification to the standard maximum likelihood estimation (MLE) framework. Rather than maximizing a single unconditional likelihood of the data under the model, we maximize a family of \textit{noise conditional} likelihoods consisting of the data perturbed by a continuum of noise levels. We find that models trained this way are more robust to noise, obtain higher test likeli…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: 18 pages, 10 figures, 2 tables

  4. arXiv:2207.08574  [pdf, other]

    stat.ML cs.LG eess.SP

    ManiFeSt: Manifold-based Feature Selection for Small Data Sets

    Authors: David Cohen, Tal Shnitzer, Yuval Kluger, Ronen Talmon

    Abstract: In this paper, we present a new method for few-sample supervised feature selection (FS). Our method first learns the manifold of the feature space of each class using kernels capturing multi-feature associations. Then, based on Riemannian geometry, a composite kernel is computed, extracting the differences between the learned feature associations. Finally, a FS score based on spectral analysis is…

    Submitted 18 July, 2022; originally announced July 2022.

    Comments: 22 pages, 10 figures

  5. arXiv:2206.11172  [pdf, other]

    cs.LG

    Neural Inverse Transform Sampler

    Authors: Henry Li, Yuval Kluger

    Abstract: Any explicit functional representation $f$ of a density is hampered by two main obstacles when we wish to use it as a generative model: designing $f$ so that sampling is fast, and estimating $Z = \int f$ so that $Z^{-1}f$ integrates to 1. This becomes increasingly complicated as $f$ itself becomes complicated. In this paper, we show that when modeling one-dimensional conditional densities with a n…

    Submitted 22 June, 2022; originally announced June 2022.

    Comments: 13 pages, 3 figures

  6. arXiv:2110.05306  [pdf, other]

    stat.ML cs.AI cs.LG

    Deep Unsupervised Feature Selection by Discarding Nuisance and Correlated Features

    Authors: Uri Shaham, Ofir Lindenbaum, Jonathan Svirsky, Yuval Kluger

    Abstract: Modern datasets often contain large subsets of correlated features and nuisance features, which are not or loosely related to the main underlying structures of the data. Nuisance features can be identified using the Laplacian score criterion, which evaluates the importance of a given feature via its consistency with the Graph Laplacians' leading eigenvectors. We demonstrate that in the presence of…

    Submitted 11 October, 2021; originally announced October 2021.

  7. arXiv:2110.00494  [pdf, other]

    cs.LG stat.ML

    Probabilistic Robust Autoencoders for Outlier Detection

    Authors: Ofir Lindenbaum, Yariv Aizenbud, Yuval Kluger

    Abstract: Anomalies (or outliers) are prevalent in real-world empirical observations and potentially mask important underlying structures. Accurate identification of anomalous samples is crucial for the success of downstream data analysis tasks. To automatically identify anomalies, we propose Probabilistic Robust AutoEncoder (PRAE). PRAE aims to simultaneously remove outliers and identify a low-dimensional…

    Submitted 24 August, 2022; v1 submitted 1 October, 2021; originally announced October 2021.

  8. arXiv:2106.06468  [pdf, other]

    cs.LG stat.ML

    Locally Sparse Neural Networks for Tabular Biomedical Data

    Authors: Junchen Yang, Ofir Lindenbaum, Yuval Kluger

    Abstract: Tabular datasets with low-sample-size or many variables are prevalent in biomedicine. Practitioners in this domain prefer linear or tree-based models over neural networks since the latter are harder to interpret and tend to overfit when applied to tabular datasets. To address these neural networks' shortcomings, we propose an intrinsically interpretable network for heterogeneous biomedical data. W…

    Submitted 7 February, 2022; v1 submitted 11 June, 2021; originally announced June 2021.

  9. arXiv:2103.13840  [pdf, other]

    math.ST cs.IT

    Biwhitening Reveals the Rank of a Count Matrix

    Authors: Boris Landa, Thomas T. C. K. Zhang, Yuval Kluger

    Abstract: Estimating the rank of a corrupted data matrix is an important task in data analysis, most notably for choosing the number of components in PCA. Significant progress on this task was achieved using random matrix theory by characterizing the spectral properties of large noise matrices. However, utilizing such tools is not straightforward when the data matrix consists of count random variables, e.g.…

    Submitted 2 November, 2021; v1 submitted 25 March, 2021; originally announced March 2021.

    MSC Class: 62H12; 62H25

  10. arXiv:2102.13276  [pdf, other]

    stat.ML cs.LG q-bio.PE

    Spectral Top-Down Recovery of Latent Tree Models

    Authors: Yariv Aizenbud, Ariel Jaffe, Meng Wang, Amber Hu, Noah Amsel, Boaz Nadler, Joseph T. Chang, Yuval Kluger

    Abstract: Modeling the distribution of high dimensional data by a latent tree graphical model is a prevalent approach in multiple scientific domains. A common task is to infer the underlying tree structure, given only observations of its terminal nodes. Many algorithms for tree recovery are computationally intensive, which limits their applicability to trees of moderate size. For large trees, a common appro…

    Submitted 7 December, 2021; v1 submitted 25 February, 2021; originally announced February 2021.

  11. arXiv:2010.05620  [pdf, other]

    cs.LG stat.ML

    $\ell_0$-based Sparse Canonical Correlation Analysis

    Authors: Ofir Lindenbaum, Moshe Salhov, Amir Averbuch, Yuval Kluger

    Abstract: Canonical Correlation Analysis (CCA) models are powerful for studying the associations between two sets of variables. The canonically correlated representations, termed \textit{canonical variates} are widely used in unsupervised learning to analyze unlabeled multi-modal registered datasets. Despite their success, CCA models may break (or overfit) if the number of variables in either of the modalit…

    Submitted 8 June, 2021; v1 submitted 12 October, 2020; originally announced October 2020.

  12. arXiv:2007.04728  [pdf, other]

    cs.LG stat.ML

    Differentiable Unsupervised Feature Selection based on a Gated Laplacian

    Authors: Ofir Lindenbaum, Uri Shaham, Jonathan Svirsky, Erez Peterfreund, Yuval Kluger

    Abstract: Scientific observations may consist of a large number of variables (features). Identifying a subset of meaningful features is often ignored in unsupervised learning, despite its potential for unraveling clear patterns hidden in the ambient space. In this paper, we present a method for unsupervised feature selection, and we demonstrate its use for the task of clustering. We propose a differentiable…

    Submitted 9 November, 2020; v1 submitted 9 July, 2020; originally announced July 2020.

  13. arXiv:2006.00402  [pdf, ps, other]

    stat.ML cs.IT cs.LG

    Doubly-Stochastic Normalization of the Gaussian Kernel is Robust to Heteroskedastic Noise

    Authors: Boris Landa, Ronald R. Coifman, Yuval Kluger

    Abstract: A fundamental step in many data-analysis techniques is the construction of an affinity matrix describing similarities between data points. When the data points reside in Euclidean space, a widespread approach is to form an affinity matrix by the Gaussian kernel with pairwise distances, and to follow with a certain normalization (e.g. the row-stochastic normalization or its symmetric variant). We d…

    Submitted 25 January, 2021; v1 submitted 30 May, 2020; originally announced June 2020.

  14. arXiv:2002.12547  [pdf, ps, other]

    stat.ML cs.LG

    Spectral neighbor joining for reconstruction of latent tree models

    Authors: Ariel Jaffe, Noah Amsel, Yariv Aizenbud, Boaz Nadler, Joseph T. Chang, Yuval Kluger

    Abstract: A common assumption in multiple scientific applications is that the distribution of observed data can be modeled by a latent tree graphical model. An important example is phylogenetics, where the tree models the evolutionary lineages of a set of observed organisms. Given a set of independent realizations of the random variables at the leaves of the tree, a key challenge is to infer the underlying…

    Submitted 22 September, 2020; v1 submitted 28 February, 2020; originally announced February 2020.

  15. arXiv:2002.12317  [pdf, other]

    cs.LG stat.ML

    The Spectral Underpinning of word2vec

    Authors: Ariel Jaffe, Yuval Kluger, Ofir Lindenbaum, Jonathan Patsenker, Erez Peterfreund, Stefan Steinerberger

    Abstract: word2vec due to Mikolov \textit{et al.} (2013) is a word embedding method that is widely used in natural language processing. Despite its great success and frequent use, theoretical justification is still lacking. The main contribution of our paper is to propose a rigorous analysis of the highly nonlinear functional of word2vec. Our results suggest that word2vec may be primarily driven by an under…

    Submitted 9 November, 2020; v1 submitted 27 February, 2020; originally announced February 2020.

  16. Heavy-tailed kernels reveal a finer cluster structure in t-SNE visualisations

    Authors: Dmitry Kobak, George Linderman, Stefan Steinerberger, Yuval Kluger, Philipp Berens

    Abstract: T-distributed stochastic neighbour embedding (t-SNE) is a widely used data visualisation technique. It differs from its predecessor SNE by the low-dimensional similarity kernel: the Gaussian kernel was replaced by the heavy-tailed Cauchy kernel, solving the "crowding problem" of SNE. Here, we develop an efficient implementation of t-SNE for a $t$-distribution kernel with an arbitrary degree of fre…

    Submitted 4 April, 2019; v1 submitted 15 February, 2019; originally announced February 2019.

    Journal ref: ECML PKDD 2019

  17. arXiv:1810.04247  [pdf, other]

    cs.LG stat.ML

    Feature Selection using Stochastic Gates

    Authors: Yutaro Yamada, Ofir Lindenbaum, Sahand Negahban, Yuval Kluger

    Abstract: Feature selection problems have been extensively studied for linear estimation, for instance, Lasso, but less emphasis has been placed on feature selection for non-linear functions. In this study, we propose a method for feature selection in high-dimensional non-linear function estimation problems. The new procedure is based on minimizing the $\ell_0$ norm of the vector of indicator variables that…

    Submitted 26 July, 2020; v1 submitted 9 October, 2018; originally announced October 2018.

    Comments: Published in ICML 2020

    Journal ref: Proceedings of Machine Learning and Systems 2020, pages 8952--8963

  18. arXiv:1803.10840  [pdf, other]

    stat.ML cs.LG

    Defending against Adversarial Images using Basis Functions Transformations

    Authors: Uri Shaham, James Garritano, Yutaro Yamada, Ethan Weinberger, Alex Cloninger, Xiuyuan Cheng, Kelly Stanton, Yuval Kluger

    Abstract: We study the effectiveness of various approaches that defend against adversarial attacks on deep networks via manipulations based on basis function representations of images. Specifically, we experiment with low-pass filtering, PCA, JPEG compression, low resolution wavelet approximation, and soft-thresholding. We evaluate these defense techniques using three types of popular attacks in black, gray…

    Submitted 16 April, 2018; v1 submitted 28 March, 2018; originally announced March 2018.

    Comments: added link to GitHub repository

  19. arXiv:1801.01587  [pdf, other]

    stat.ML cs.LG

    SpectralNet: Spectral Clustering using Deep Neural Networks

    Authors: Uri Shaham, Kelly Stanton, Henry Li, Boaz Nadler, Ronen Basri, Yuval Kluger

    Abstract: Spectral clustering is a leading and popular technique in unsupervised data analysis. Two of its major limitations are scalability and generalization of the spectral embedding (i.e., out-of-sample-extension). In this paper we introduce a deep learning approach to spectral clustering that overcomes the above shortcomings. Our network, which we call SpectralNet, learns a map that embeds input data p…

    Submitted 4 April, 2018; v1 submitted 4 January, 2018; originally announced January 2018.

    Comments: Added citations. Accepted to ICLR 2018

  20. Efficient Algorithms for t-distributed Stochastic Neighborhood Embedding

    Authors: George C. Linderman, Manas Rachh, Jeremy G. Hoskins, Stefan Steinerberger, Yuval Kluger

    Abstract: t-distributed Stochastic Neighborhood Embedding (t-SNE) is a method for dimensionality reduction and visualization that has become widely popular in recent years. Efficient implementations of t-SNE are available, but they scale poorly to datasets with hundreds of thousands to millions of high dimensional data-points. We present Fast Fourier Transform-accelerated Interpolation-based t-SNE (FIt-SNE)…

    Submitted 24 December, 2017; originally announced December 2017.

  21. arXiv:1711.04712  [pdf, other]

    math.CO cs.DM cs.DS math.PR stat.ML

    Randomized Near Neighbor Graphs, Giant Components, and Applications in Data Science

    Authors: George C. Linderman, Gal Mishne, Yuval Kluger, Stefan Steinerberger

    Abstract: If we pick $n$ random points uniformly in $[0,1]^d$ and connect each point to its $k-$nearest neighbors, then it is well known that there exists a giant connected component with high probability. We prove that in $[0,1]^d$ it suffices to connect every point to $ c_{d,1} \log{\log{n}}$ points chosen randomly among its $ c_{d,2} \log{n}-$nearest neighbors to ensure a giant component of size…

    Submitted 13 November, 2017; originally announced November 2017.

  22. arXiv:1708.05768  [pdf, other]

    stat.ML cs.LG q-bio.QM

    Data-Driven Tree Transforms and Metrics

    Authors: Gal Mishne, Ronen Talmon, Israel Cohen, Ronald R. Coifman, Yuval Kluger

    Abstract: We consider the analysis of high dimensional data given in the form of a matrix with columns consisting of observations and rows consisting of features. Often the data is such that the observations do not reside on a regular grid, and the given order of the features is arbitrary and does not convey a notion of locality. Therefore, traditional transforms and metrics cannot be used for data organiza…

    Submitted 18 August, 2017; originally announced August 2017.

    Comments: 16 pages, 5 figures. Accepted to IEEE Transactions on Signal and Information Processing over Networks

  23. arXiv:1703.02965  [pdf, ps, other]

    stat.ML cs.LG

    Unsupervised Ensemble Regression

    Authors: Omer Dror, Boaz Nadler, Erhan Bilal, Yuval Kluger

    Abstract: Consider a regression problem where there is no labeled data and the only observations are the predictions $f_i(x_j)$ of $m$ experts $f_{i}$ over many samples $x_j$. With no knowledge on the accuracy of the experts, is it still possible to accurately estimate the unknown responses $y_{j}$? Can one still detect the least or most accurate experts? In this work we propose a framework to study these q…

    Submitted 8 March, 2017; originally announced March 2017.

  24. arXiv:1612.08709  [pdf, other]

    cs.DC math.NA stat.CO

    Randomized algorithms for distributed computation of principal component analysis and singular value decomposition

    Authors: Huamin Li, Yuval Kluger, Mark Tygert

    Abstract: Randomized algorithms provide solutions to two ubiquitous problems: (1) the distributed calculation of a principal component analysis or singular value decomposition of a highly rectangular matrix, and (2) the distributed calculation of a low-rank approximation (in the form of a singular value decomposition) to an arbitrary matrix. Carefully honed algorithms yield results that are uniformly superi…

    Submitted 1 January, 2018; v1 submitted 27 December, 2016; originally announced December 2016.

    Comments: 21 pages, 29 tables, 1 figure, 8 algorithms in pseudocode

    Journal ref: Advances in Computational Mathematics, 44 (5): 1651-1672, 2018

  25. DeepSurv: Personalized Treatment Recommender System Using A Cox Proportional Hazards Deep Neural Network

    Authors: Jared Katzman, Uri Shaham, Jonathan Bates, Alexander Cloninger, Tingting Jiang, Yuval Kluger

    Abstract: Medical practitioners use survival models to explore and understand the relationships between patients' covariates (e.g. clinical and genetic features) and the effectiveness of various treatment options. Standard survival models like the linear Cox proportional hazards model require extensive feature engineering or prior medical knowledge to model treatment interaction at an individual level. Whil…

    Submitted 8 August, 2017; v1 submitted 2 June, 2016; originally announced June 2016.

    Comments: Presented at the International Conference of Machine Learning Computational Biology Workshop 2016

  26. arXiv:1602.02285  [pdf, other]

    stat.ML cs.LG

    A Deep Learning Approach to Unsupervised Ensemble Learning

    Authors: Uri Shaham, Xiuyuan Cheng, Omer Dror, Ariel Jaffe, Boaz Nadler, Joseph Chang, Yuval Kluger

    Abstract: We show how deep learning methods can be applied in the context of crowdsourcing and unsupervised ensemble learning. First, we prove that the popular model of Dawid and Skene, which assumes that all classifiers are conditionally independent, is {\em equivalent} to a Restricted Boltzmann Machine (RBM) with a single hidden node. Hence, under this model, the posterior probabilities of the true labels…

    Submitted 6 February, 2016; originally announced February 2016.

    Report number: PMLR 48:30-39

  27. arXiv:1510.05830  [pdf, ps, other]

    cs.LG stat.ML

    Unsupervised Ensemble Learning with Dependent Classifiers

    Authors: Ariel Jaffe, Ethan Fetaya, Boaz Nadler, Tingting Jiang, Yuval Kluger

    Abstract: In unsupervised ensemble learning, one obtains predictions from multiple sources or classifiers, yet without knowing the reliability and expertise of each source, and with no labeled data to assess it. The task is to combine these possibly conflicting predictions into an accurate meta-learner. Most works to date assumed perfect diversity between the different sources, a property known as condition…

    Submitted 23 February, 2016; v1 submitted 20 October, 2015; originally announced October 2015.

  28. arXiv:1412.3510  [pdf, other]

    stat.CO cs.MS

    An implementation of a randomized algorithm for principal component analysis

    Authors: Arthur Szlam, Yuval Kluger, Mark Tygert

    Abstract: Recent years have witnessed intense development of randomized methods for low-rank approximation. These methods target principal component analysis (PCA) and the calculation of truncated singular value decompositions (SVD). The present paper presents an essentially black-box, fool-proof implementation for Mathworks' MATLAB, a popular software platform for numerical computation. As illustrated via…

    Submitted 10 December, 2014; originally announced December 2014.

    Comments: 13 pages, 4 figures

    Journal ref: ACM TOMS, 43(3): 28:1-28:14, 2016

  29. arXiv:1407.7644  [pdf, ps, other]

    stat.ML cs.LG

    Estimating the Accuracies of Multiple Classifiers Without Labeled Data

    Authors: Ariel Jaffe, Boaz Nadler, Yuval Kluger

    Abstract: In various situations one is given only the predictions of multiple classifiers over a large unlabeled test data. This scenario raises the following questions: Without any labeled data and without any a-priori knowledge about the reliability of these different classifiers, is it possible to consistently and computationally efficiently estimate their accuracies? Furthermore, also in a completely un…

    Submitted 30 October, 2014; v1 submitted 29 July, 2014; originally announced July 2014.

  30. Ranking and combining multiple predictors without labeled data

    Authors: Fabio Parisi, Francesco Strino, Boaz Nadler, Yuval Kluger

    Abstract: In a broad range of classification and decision making problems, one is given the advice or predictions of several classifiers, of unknown reliability, over multiple questions or queries. This scenario is different from the standard supervised setting, where each classifier accuracy can be assessed using available labeled data, and raises two questions: given only the predictions of several classi…

    Submitted 24 November, 2013; v1 submitted 13 March, 2013; originally announced March 2013.

    Comments: Supplementary Information is included at the end of the manuscript. This is a revision of our original submission of the manuscript entitled "The student's dilemma: ranking and improving prediction at test time without access to training data", which is now entitled "Ranking and combining multiple predictors without labeled data"

    Journal ref: Proc. Natl. Acad. Sci. U.S.A. 111 (2014) 1253-1258