Showing 1–42 of 42 results for author: Aragam, B

  1. arXiv:2406.18400  [pdf, other]

    cs.CL cs.LG stat.ML

    Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers

    Authors: Yibo Jiang, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam

    Abstract: Large Language Models (LLMs) have the capacity to store and recall facts. Through experimentation with open-source models, we observe that this ability to retrieve facts can be easily manipulated by changing contexts, even without altering their factual meanings. These findings highlight that LLMs might behave like an associative memory model where certain tokens in the contexts serve as clues to…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2406.17228  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Greedy equivalence search for nonparametric graphical models

    Authors: Bryon Aragam

    Abstract: One of the hallmark achievements of the theory of graphical models and Bayesian model selection is the celebrated greedy equivalence search (GES) algorithm due to Chickering and Meek. GES is known to consistently estimate the structure of directed acyclic graph (DAG) models in various special cases including Gaussian and discrete models, which are in particular curved exponential families. A gener…

    Submitted 24 June, 2024; originally announced June 2024.

  3. arXiv:2403.03867  [pdf, other]

    cs.CL cs.LG stat.ML

    On the Origins of Linear Representations in Large Language Models

    Authors: Yibo Jiang, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam, Victor Veitch

    Abstract: Recent works have argued that high-level semantic concepts are encoded "linearly" in the representation space of large language models. In this work, we study the origins of such linear representations. To that end, we introduce a simple latent variable model to abstract and formalize the concept dynamics of the next token prediction. We use this formalism to show that the next token prediction ob…

    Submitted 6 March, 2024; originally announced March 2024.

  4. arXiv:2402.09236  [pdf, other]

    cs.LG cs.AI math.ST stat.ML

    Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models

    Authors: Goutham Rajendran, Simon Buchholz, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar

    Abstract: To build intelligent machine learning systems, there are two broad approaches. One approach is to build inherently interpretable models, as endeavored by the growing field of causal representation learning. The other approach is to build highly-performant foundation models and then invest efforts into understanding how they work. In this work, we relate these two approaches and study how to learn…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: 36 pages

  5. arXiv:2402.06380  [pdf, other]

    cs.LG stat.ML

    Optimal estimation of Gaussian (poly)trees

    Authors: Yuhao Wang, Ming Gao, Wai Ming Tai, Bryon Aragam, Arnab Bhattacharyya

    Abstract: We develop optimal algorithms for learning undirected Gaussian trees and directed Gaussian polytrees from data. We consider both problems of distribution learning (i.e. in KL distance) and structure learning (i.e. exact recovery). The first approach is based on the Chow-Liu algorithm, and learns an optimal tree-structured distribution efficiently. The second approach is a modification of the PC al…

    Submitted 9 February, 2024; originally announced February 2024.

  6. arXiv:2312.17047  [pdf, other]

    math.ST cs.LG stat.ME stat.ML

    Inconsistency of cross-validation for structure learning in Gaussian graphical models

    Authors: Zhao Lyu, Wai Ming Tai, Mladen Kolar, Bryon Aragam

    Abstract: Despite numerous years of research into the merits and trade-offs of various model selection criteria, obtaining robust results that elucidate the behavior of cross-validation remains a challenging endeavor. In this paper, we highlight the inherent limitations of cross-validation when employed to discern the structure of a Gaussian graphical model. We provide finite-sample bounds on the probabilit…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: Preliminary version; 47 pages, 15 figures

  7. arXiv:2310.17611  [pdf, other]

    cs.LG cs.CL stat.ML

    Uncovering Meanings of Embeddings via Partial Orthogonality

    Authors: Yibo Jiang, Bryon Aragam, Victor Veitch

    Abstract: Machine learning tools often rely on embedding text as vectors of real numbers. In this paper, we study how the semantic structure of language is encoded in the algebraic structure of such embeddings. Specifically, we look at a notion of ``semantic independence'' capturing the idea that, e.g., ``eggplant'' and ``tomato'' are independent given ``vegetable''. Although such examples are intuitive, it…

    Submitted 26 October, 2023; originally announced October 2023.

  8. arXiv:2310.13387  [pdf, other]

    stat.ME cs.LG

    Assumption violations in causal discovery and the robustness of score matching

    Authors: Francesco Montagna, Atalanti A. Mastakouri, Elias Eulig, Nicoletta Noceti, Lorenzo Rosasco, Dominik Janzing, Bryon Aragam, Francesco Locatello

    Abstract: When domain knowledge is limited and experimentation is restricted by ethical, financial, or time constraints, practitioners turn to observational causal discovery methods to recover the causal structure, exploiting the statistical properties of their data. Because causal discovery without further assumptions is an ill-posed problem, each algorithm comes with its own set of usually untestable assu…

    Submitted 20 October, 2023; originally announced October 2023.

  9. arXiv:2306.17378  [pdf, other]

    cs.LG stat.ML

    Global Optimality in Bivariate Gradient-based DAG Learning

    Authors: Chang Deng, Kevin Bello, Bryon Aragam, Pradeep Ravikumar

    Abstract: Recently, a new class of non-convex optimization problems motivated by the statistical problem of learning an acyclic directed graphical model from data has attracted significant interest. While existing work uses standard first-order optimization schemes to solve this problem, proving the global optimality of such approaches has proven elusive. The difficulty lies in the fact that unlike other no…

    Submitted 29 June, 2023; originally announced June 2023.

    Comments: 39 pages, 13 figures

  10. arXiv:2306.17361  [pdf, other]

    cs.LG cs.AI stat.AP stat.ME stat.ML

    iSCAN: Identifying Causal Mechanism Shifts among Nonlinear Additive Noise Models

    Authors: Tianyu Chen, Kevin Bello, Bryon Aragam, Pradeep Ravikumar

    Abstract: Structural causal models (SCMs) are widely used in various disciplines to represent causal relationships among variables in complex systems. Unfortunately, the underlying causal structure is often unknown, and estimating it from data remains a challenging task. In many situations, however, the end goal is to localize the changes (shifts) in the causal mechanisms between related datasets instead of…

    Submitted 12 January, 2024; v1 submitted 29 June, 2023; originally announced June 2023.

    Comments: 36 pages, 18 figures. Published at NeurIPS 2023

  11. arXiv:2306.02899  [pdf, other]

    stat.ML cs.LG

    Learning nonparametric latent causal graphs with unknown interventions

    Authors: Yibo Jiang, Bryon Aragam

    Abstract: We establish conditions under which latent causal graphs are nonparametrically identifiable and can be reconstructed from unknown interventions in the latent space. Our primary focus is the identification of the latent structure in measurement models without parametric assumptions such as linearity or Gaussianity. Moreover, we do not assume the number of hidden variables is known, and we show that…

    Submitted 3 November, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

    Comments: To appear at NeurIPS 2023

  12. arXiv:2306.02244  [pdf, other]

    math.ST stat.ME

    Optimal neighbourhood selection in structural equation models

    Authors: Ming Gao, Wai Ming Tai, Bryon Aragam

    Abstract: We study the optimal sample complexity of neighbourhood selection in linear structural equation models, and compare this to best subset selection (BSS) for linear models under general design. We show by example that -- even when the structure is \emph{unknown} -- the existence of underlying structure can reduce the sample complexity of neighbourhood selection. This result is complicated by the pos…

    Submitted 28 November, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

  13. arXiv:2306.02235  [pdf, other]

    cs.LG cs.AI math.ST stat.ME stat.ML

    Learning Linear Causal Representations from Interventions under General Nonlinear Mixing

    Authors: Simon Buchholz, Goutham Rajendran, Elan Rosenfeld, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar

    Abstract: We study the problem of learning causal representations from unknown, latent interventions in a general setting, where the latent distribution is Gaussian but the mixing function is completely general. We prove strong identifiability results given unknown single-node interventions, i.e., without having access to the intervention targets. This generalizes prior works which have focused on weaker cl…

    Submitted 18 December, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

    Comments: Accepted as Oral paper at NeurIPS 2023

  14. arXiv:2305.19802  [pdf, other]

    stat.ML cs.LG

    Neuro-Causal Factor Analysis

    Authors: Alex Markham, Mingyu Liu, Bryon Aragam, Liam Solus

    Abstract: Factor analysis (FA) is a statistical tool for studying how observed variables with some mutual dependences can be expressed as functions of mutually independent unobserved factors, and it is widely applied throughout the psychological, biological, and physical sciences. We revisit this classic method from the comparatively new perspective given by advancements in causal discovery and deep learnin…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: 23 pages, 13 figures

  15. arXiv:2305.17277  [pdf, other]

    stat.ML cs.LG

    Optimizing NOTEARS Objectives via Topological Swaps

    Authors: Chang Deng, Kevin Bello, Bryon Aragam, Pradeep Ravikumar

    Abstract: Recently, an intriguing class of non-convex optimization problems has emerged in the context of learning directed acyclic graphs (DAGs). These problems involve minimizing a given loss or score function, subject to a non-convex continuous constraint that penalizes the presence of cycles in a graph. In this work, we delve into the optimization challenges associated with this class of non-convex prog…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: 39 pages, 12 figures, ICML 2023

  16. arXiv:2305.04127  [pdf, ps, other]

    cs.LG stat.ML

    Learning Mixtures of Gaussians with Censored Data

    Authors: Wai Ming Tai, Bryon Aragam

    Abstract: We study the problem of learning mixtures of Gaussians with censored data. Statistical learning with censored data is a classical problem, with numerous practical applications, however, finite-sample guarantees for even simple latent variable models such as Gaussian mixtures are missing. Formally, we are given censored data from a mixture of univariate Gaussians…

    Submitted 28 June, 2023; v1 submitted 6 May, 2023; originally announced May 2023.

  17. arXiv:2209.08037  [pdf, other]

    cs.LG stat.ME stat.ML

    DAGMA: Learning DAGs via M-matrices and a Log-Determinant Acyclicity Characterization

    Authors: Kevin Bello, Bryon Aragam, Pradeep Ravikumar

    Abstract: The combinatorial problem of learning directed acyclic graphs (DAGs) from data was recently framed as a purely continuous optimization problem by leveraging a differentiable acyclicity characterization of DAGs based on the trace of a matrix exponential function. Existing acyclicity characterizations are based on the idea that powers of an adjacency matrix contain information about walks and cycles…

    Submitted 15 January, 2023; v1 submitted 16 September, 2022; originally announced September 2022.

    Comments: 28 pages, 13 figures, published at NeurIPS 2022

  18. arXiv:2206.10044  [pdf, other]

    cs.LG cs.AI math.ST stat.ML

    Identifiability of deep generative models without auxiliary information

    Authors: Bohdan Kivva, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam

    Abstract: We prove identifiability of a broad class of deep latent variable models that (a) have universal approximation capabilities and (b) are the decoders of variational autoencoders that are commonly used in practice. Unlike existing work, our analysis does not require weak supervision, auxiliary information, or conditioning in the latent space. Specifically, we show that for a broad class of generativ…

    Submitted 18 October, 2022; v1 submitted 20 June, 2022; originally announced June 2022.

    Comments: 34 pages, 9 figures, to appear in NeurIPS 2022

  19. arXiv:2206.05829  [pdf, other]

    math.ST cs.DM stat.ML

    A non-graphical representation of conditional independence via the neighbourhood lattice

    Authors: Arash A. Amini, Bryon Aragam, Qing Zhou

    Abstract: We introduce and study the neighbourhood lattice decomposition of a distribution, which is a compact, non-graphical representation of conditional independence that is valid in the absence of a faithful graphical representation. The idea is to view the set of neighbourhoods of a variable as a subset lattice, and partition this lattice into convex sublattices, each of which directly encodes a collec…

    Submitted 12 June, 2022; originally announced June 2022.

    Comments: 30 pages, 3 figures

  20. arXiv:2203.15150  [pdf, other]

    cs.LG math.ST stat.ML

    Tight Bounds on the Hardness of Learning Simple Nonparametric Mixtures

    Authors: Bryon Aragam, Wai Ming Tai

    Abstract: We study the problem of learning nonparametric distributions in a finite mixture, and establish tight bounds on the sample complexity for learning the component distributions in such models. Namely, we are given i.i.d. samples from a pdf $f$ where $$ f=w_1f_1+w_2f_2, \quad w_1+w_2=1, \quad w_1,w_2>0 $$ and we are interested in learning each component $f_i$. Without any assumptions on $f_i$, this p…

    Submitted 4 July, 2023; v1 submitted 28 March, 2022; originally announced March 2022.

  21. arXiv:2201.10548  [pdf, other]

    math.ST cs.AI cs.LG stat.ML

    Optimal estimation of Gaussian DAG models

    Authors: Ming Gao, Wai Ming Tai, Bryon Aragam

    Abstract: We study the optimal sample complexity of learning a Gaussian directed acyclic graph (DAG) from observational data. Our main results establish the minimax optimal sample complexity for learning the structure of a linear Gaussian DAG model in two settings of interest: 1) Under equal variances without knowledge of the true ordering, and 2) For general linear models given knowledge of the ordering. I…

    Submitted 20 March, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

    Comments: 21 pages, 2 figures, to appear in AISTATS 2022

  22. arXiv:2111.03739  [pdf, other]

    q-bio.QM cs.LG q-bio.PE stat.ME

    Tradeoffs of Linear Mixed Models in Genome-wide Association Studies

    Authors: Haohan Wang, Bryon Aragam, Eric Xing

    Abstract: Motivated by empirical arguments that are well-known from the genome-wide association studies (GWAS) literature, we study the statistical properties of linear mixed models (LMMs) applied to GWAS. First, we study the sensitivity of LMMs to the inclusion of a candidate SNP in the kinship matrix, which is often done in practice to speed up computations. Our results shed light on the size of the error…

    Submitted 5 November, 2021; originally announced November 2021.

    Comments: in final revision of Journal of Computational Biology

  23. arXiv:2111.01104  [pdf, other]

    stat.ML cs.AI cs.LG

    NOTMAD: Estimating Bayesian Networks with Sample-Specific Structures and Parameters

    Authors: Ben Lengerich, Caleb Ellington, Bryon Aragam, Eric P. Xing, Manolis Kellis

    Abstract: Context-specific Bayesian networks (i.e. directed acyclic graphs, DAGs) identify context-dependent relationships between variables, but the non-convexity induced by the acyclicity requirement makes it difficult to share information between context-specific estimators (e.g. with graph generator functions). For this reason, existing methods for inferring context-specific Bayesian networks have favor…

    Submitted 1 November, 2021; originally announced November 2021.

  24. arXiv:2110.06082  [pdf, other]

    math.ST cs.AI cs.LG stat.ML

    Efficient Bayesian network structure learning via local Markov boundary search

    Authors: Ming Gao, Bryon Aragam

    Abstract: We analyze the complexity of learning directed acyclic graphical models from observational data in general settings without specific distributional assumptions. Our approach is information-theoretic and uses a local Markov boundary search procedure in order to recursively construct ancestral sets in the underlying graphical model. Perhaps surprisingly, we show that for certain graph ensembles, a s…

    Submitted 21 November, 2021; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: 31 pages, 3 figures, to appear in NeurIPS 2021

  25. arXiv:2110.04719  [pdf, other]

    cs.LG cs.AI stat.ML

    Structure learning in polynomial time: Greedy algorithms, Bregman information, and exponential families

    Authors: Goutham Rajendran, Bohdan Kivva, Ming Gao, Bryon Aragam

    Abstract: Greedy algorithms have long been a workhorse for learning graphical models, and more broadly for learning statistical models with sparse structure. In the context of learning directed acyclic graphs, greedy algorithms are popular despite their worst-case exponential runtime. In practice, however, they are very efficient. We provide new insight into this phenomenon by studying a general greedy scor…

    Submitted 28 October, 2021; v1 submitted 10 October, 2021; originally announced October 2021.

    Comments: Accepted to NeurIPS 2021; 27 pages, 9 figures

  26. arXiv:2108.14003  [pdf, other]

    math.ST stat.ML

    Uniform Consistency in Nonparametric Mixture Models

    Authors: Bryon Aragam, Ruiyi Yang

    Abstract: We study uniform consistency in nonparametric mixture models as well as closely related mixture of regression (also known as mixed regression) models, where the regression functions are allowed to be nonparametric and the error distributions are assumed to be convolutions of a Gaussian density. We construct uniformly consistent estimators under general conditions while simultaneously highlighting…

    Submitted 27 December, 2022; v1 submitted 31 August, 2021; originally announced August 2021.

    Comments: To appear in The Annals of Statistics

  27. arXiv:2106.15563  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning latent causal graphs via mixture oracles

    Authors: Bohdan Kivva, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam

    Abstract: We study the problem of reconstructing a causal graphical model from data in the presence of latent variables. The main problem of interest is recovering the causal structure over the latent variables while allowing for general, potentially nonlinear dependence between the variables. In many practical problems, the dependence between raw observations (e.g. pixels in an image) is much less relevant…

    Submitted 21 November, 2021; v1 submitted 29 June, 2021; originally announced June 2021.

    Comments: To appear at NeurIPS 2021. 41 pages

  28. arXiv:2012.10713  [pdf, other]

    cs.LG cs.AI stat.ML

    Fundamental Limits and Tradeoffs in Invariant Representation Learning

    Authors: Han Zhao, Chen Dan, Bryon Aragam, Tommi S. Jaakkola, Geoffrey J. Gordon, Pradeep Ravikumar

    Abstract: A wide range of machine learning applications such as privacy-preserving learning, algorithmic fairness, and domain adaptation/generalization among others, involve learning invariant representations of the data that aim to achieve two competing goals: (a) maximize information or accuracy with respect to a target response, and (b) maximize invariance or independence with respect to a set of protect…

    Submitted 23 November, 2022; v1 submitted 19 December, 2020; originally announced December 2020.

    Comments: JMLR camera-ready version

  29. arXiv:2006.11970  [pdf, other]

    stat.ML cs.LG math.ST

    A polynomial-time algorithm for learning nonparametric causal graphs

    Authors: Ming Gao, Yi Ding, Bryon Aragam

    Abstract: We establish finite-sample guarantees for a polynomial-time algorithm for learning a nonlinear, nonparametric directed acyclic graphical (DAG) model from data. The analysis is model-free and does not assume linearity, additivity, independent noise, or faithfulness. Instead, we impose a condition on the residual variances that is closely related to previous work on linear models with equal variance…

    Submitted 10 November, 2020; v1 submitted 21 June, 2020; originally announced June 2020.

    Comments: To appear at NeurIPS 2020

  30. arXiv:2002.00498  [pdf, other]

    stat.ML cs.LG

    DYNOTEARS: Structure Learning from Time-Series Data

    Authors: Roxana Pamfil, Nisara Sriwattanaworachai, Shaan Desai, Philip Pilgerstorfer, Paul Beaumont, Konstantinos Georgatzis, Bryon Aragam

    Abstract: We revisit the structure learning problem for dynamic Bayesian networks and propose a method that simultaneously estimates contemporaneous (intra-slice) and time-lagged (inter-slice) relationships between variables in a time-series. Our approach is score-based, and revolves around minimizing a penalized loss subject to an acyclicity constraint. To solve this problem, we leverage a recent algebraic…

    Submitted 27 April, 2020; v1 submitted 2 February, 2020; originally announced February 2020.

    Comments: 23 pages, 13 figures, accepted to AISTATS 2020, corrected version

  31. arXiv:1912.01108  [pdf, other]

    cs.LG stat.ML

    Automated Dependence Plots

    Authors: David I. Inouye, Liu Leqi, Joon Sik Kim, Bryon Aragam, Pradeep Ravikumar

    Abstract: In practical applications of machine learning, it is necessary to look beyond standard metrics such as test accuracy in order to validate various qualitative properties of a model. Partial dependence plots (PDP), including instance-specific PDPs (i.e., ICE plots), have been widely used as a visual tool to understand or validate a model. Yet, current PDPs suffer from two main drawbacks: (1) a user…

    Submitted 29 July, 2020; v1 submitted 2 December, 2019; originally announced December 2019.

    Comments: In Uncertainty in Artificial Intelligence (UAI 2020). Camera-ready version. Code is available at https://github.com/davidinouye/adp

  32. arXiv:1910.06939  [pdf, other]

    stat.ML cs.LG stat.ME

    Learning Sample-Specific Models with Low-Rank Personalized Regression

    Authors: Benjamin Lengerich, Bryon Aragam, Eric P. Xing

    Abstract: Modern applications of machine learning (ML) deal with increasingly heterogeneous datasets comprised of data collected from overlapping latent subpopulations. As a result, traditional models trained over large datasets may fail to recognize highly predictive localized effects in favour of weakly predictive global patterns. This is a problem because localized effects are critical to developing indi…

    Submitted 15 October, 2019; originally announced October 2019.

    Comments: Accepted at NeurIPS 2019

  33. arXiv:1909.13189  [pdf, other]

    stat.ML cs.LG stat.ME

    Learning Sparse Nonparametric DAGs

    Authors: Xun Zheng, Chen Dan, Bryon Aragam, Pradeep Ravikumar, Eric P. Xing

    Abstract: We develop a framework for learning sparse nonparametric directed acyclic graphs (DAGs) from data. Our approach is based on a recent algebraic characterization of DAGs that led to a fully continuous program for score-based learning of DAG models parametrized by a linear structural equation model (SEM). We extend this algebraic characterization to nonparametric SEM by leveraging nonparametric spars…

    Submitted 23 March, 2020; v1 submitted 28 September, 2019; originally announced September 2019.

    Comments: To appear in AISTATS 2020

  34. arXiv:1909.01978  [pdf, other]

    math.ST cs.LG stat.ML

    On perfectness in Gaussian graphical models

    Authors: Arash A. Amini, Bryon Aragam, Qing Zhou

    Abstract: Knowing when a graphical model is perfect to a distribution is essential in order to relate separation in the graph to conditional independence in the distribution, and this is particularly important when performing inference from data. When the model is perfect, there is a one-to-one correspondence between conditional independence statements in the distribution and separation statements in the gr…

    Submitted 3 September, 2019; originally announced September 2019.

    Comments: This note is based on a result that first appeared in arXiv:1711.00991v1. The original article has now been split into two parts

  35. arXiv:1810.07354  [pdf, other]

    cs.LG stat.ML

    Fault Tolerance in Iterative-Convergent Machine Learning

    Authors: Aurick Qiao, Bryon Aragam, Bingjing Zhang, Eric P. Xing

    Abstract: Machine learning (ML) training algorithms often possess an inherent self-correcting behavior due to their iterative-convergent nature. Recent systems exploit this property to achieve adaptability and efficiency in unreliable computing environments by relaxing the consistency of execution and allowing calculation errors to be self-corrected during training. However, the behavior of such systems are…

    Submitted 16 October, 2018; originally announced October 2018.

  36. arXiv:1809.03073  [pdf, other]

    cs.LG cs.AI math.ST stat.ML

    Sample Complexity of Nonparametric Semi-Supervised Learning

    Authors: Chen Dan, Liu Leqi, Bryon Aragam, Pradeep Ravikumar, Eric P. Xing

    Abstract: We study the sample complexity of semi-supervised learning (SSL) and introduce new assumptions based on the mismatch between a mixture model learned from unlabeled data and the true mixture model induced by the (unknown) class conditional distributions. Under these assumptions, we establish an $Ω(K\log K)$ labeled sample complexity bound without imposing parametric assumptions, where $K$ is the nu…

    Submitted 9 September, 2018; originally announced September 2018.

    Comments: 18 pages, 3 figures

  37. arXiv:1803.01422  [pdf, other]

    stat.ML cs.AI cs.LG stat.ME

    DAGs with NO TEARS: Continuous Optimization for Structure Learning

    Authors: Xun Zheng, Bryon Aragam, Pradeep Ravikumar, Eric P. Xing

    Abstract: Estimating the structure of directed acyclic graphs (DAGs, also known as Bayesian networks) is a challenging problem since the search space of DAGs is combinatorial and scales superexponentially with the number of nodes. Existing approaches rely on various local heuristics for enforcing the acyclicity constraint. In this paper, we introduce a fundamentally different strategy: We formulate the stru…

    Submitted 2 November, 2018; v1 submitted 4 March, 2018; originally announced March 2018.

    Comments: 22 pages, 8 figures, accepted to NIPS 2018

  38. arXiv:1802.04397  [pdf, other]

    math.ST cs.AI cs.LG stat.ML

    Identifiability of Nonparametric Mixture Models and Bayes Optimal Clustering

    Authors: Bryon Aragam, Chen Dan, Eric P. Xing, Pradeep Ravikumar

    Abstract: Motivated by problems in data clustering, we establish general conditions under which families of nonparametric mixture models are identifiable, by introducing a novel framework involving clustering overfitted \emph{parametric} (i.e. misspecified) mixture models. These identifiability conditions generalize existing conditions in the literature, and are flexible enough to include for example mixtur…

    Submitted 17 February, 2020; v1 submitted 12 February, 2018; originally announced February 2018.

    Comments: 35 pages, to appear in the Annals of Statistics

  39. arXiv:1711.00991  [pdf, other]

    math.ST cs.LG stat.ML

    The neighborhood lattice for encoding partial correlations in a Hilbert space

    Authors: Arash A. Amini, Bryon Aragam, Qing Zhou

    Abstract: Neighborhood regression has been a successful approach in graphical and structural equation modeling, with applications to learning undirected and directed graphical models. We extend these ideas by defining and studying an algebraic structure called the neighborhood lattice based on a generalized notion of neighborhood regression. We show that this algebraic structure has the potential to provide…

    Submitted 6 February, 2019; v1 submitted 2 November, 2017; originally announced November 2017.

  40. arXiv:1703.04025  [pdf, other]

    stat.ML cs.LG stat.CO stat.ME

    Learning Large-Scale Bayesian Networks with the sparsebn Package

    Authors: Bryon Aragam, Jiaying Gu, Qing Zhou

    Abstract: Learning graphical models from data is an important problem with wide applications, ranging from genomics to the social sciences. Nowadays datasets often have upwards of thousands---sometimes tens or hundreds of thousands---of variables and far fewer samples. To meet this challenge, we have developed a new R package called sparsebn for learning the structure of large, sparse graphical models with…

    Submitted 10 March, 2018; v1 submitted 11 March, 2017; originally announced March 2017.

    Comments: To appear in the Journal of Statistical Software, 39 pages, 7 figures

    Journal ref: Journal of Statistical Software, 91(11), 1-38, 2019

  41. arXiv:1511.08963  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Learning Directed Acyclic Graphs with Penalized Neighbourhood Regression

    Authors: Bryon Aragam, Arash A. Amini, Qing Zhou

    Abstract: We study a family of regularized score-based estimators for learning the structure of a directed acyclic graph (DAG) for a multivariate normal distribution from high-dimensional data with $p\gg n$. Our main results establish support recovery guarantees and deviation bounds for a family of penalized least-squares estimators under concave regularization without assuming prior knowledge of a variable…

    Submitted 1 October, 2017; v1 submitted 28 November, 2015; originally announced November 2015.

    Comments: 54 pages, 1 figure

  42. arXiv:1401.0852  [pdf, other]

    stat.ME cs.LG stat.ML

    Concave Penalized Estimation of Sparse Gaussian Bayesian Networks

    Authors: Bryon Aragam, Qing Zhou

    Abstract: We develop a penalized likelihood estimation framework to estimate the structure of Gaussian Bayesian networks from observational data. In contrast to recent methods which accelerate the learning problem by restricting the search space, our main contribution is a fast algorithm for score-based structure learning which does not restrict the search space in any way and works on high-dimensional data…

    Submitted 4 January, 2015; v1 submitted 4 January, 2014; originally announced January 2014.

    Comments: 57 pages

    Journal ref: Journal of Machine Learning Research 16(Nov):2273-2328, 2015