-
Maximum likelihood thresholds of Gaussian graphical models and graphical lasso
Authors:
Daniel Irving Bernstein,
Hayden Outlaw
Abstract:
Associated to each graph G is a Gaussian graphical model. Such models are often used in high-dimensional settings, i.e. where there are relatively few data points compared to the number of variables. The maximum likelihood threshold of a graph is the minimum number of data points required to fit the corresponding graphical model using maximum likelihood estimation. Graphical lasso is a method for selecting and fitting a graphical model. In this project, we ask: when graphical lasso is used to select and fit a graphical model on n data points, how likely is it that n is greater than or equal to the maximum likelihood threshold of the corresponding graph? Our results are a series of computational experiments.
Submitted 5 December, 2023;
originally announced December 2023.
-
Matroid lifts and representability
Authors:
Daniel Irving Bernstein,
Zach Walsh
Abstract:
A 1965 result of Crapo shows that every elementary lift of a matroid $M$ can be constructed from a linear class of circuits of $M$. In a recent paper, Walsh generalized this construction by defining a rank-$k$ lift of a matroid $M$ given a rank-$k$ matroid $N$ on the set of circuits of $M$, and conjectured that all matroid lifts can be obtained in this way. In this sequel paper we simplify Walsh's construction and show that this conjecture is true for representable matroids but is false in general. This gives a new way to certify that a particular matroid is non-representable, which we use to construct new classes of non-representable matroids.
Walsh also applied the new matroid lift construction to gain graphs over the additive group of a non-prime finite field, generalizing a construction of Zaslavsky for these special groups. He conjectured that this construction is possible on three or more vertices only for the additive group of a non-prime finite field. We show that this conjecture holds for four or more vertices, but fails for exactly three.
Submitted 21 June, 2023;
originally announced June 2023.
-
Maximum likelihood thresholds of generic linear concentration models
Authors:
Daniel Irving Bernstein,
Steven J. Gortler,
Louis Theran
Abstract:
The maximum likelihood threshold of a statistical model is the minimum number of datapoints required to fit the model via maximum likelihood estimation. In this paper we determine the maximum likelihood thresholds of generic linear concentration models. This turns out to be the number one would expect from a naive dimension count, which is surprising and nontrivial to prove given that the maximum likelihood threshold is a semi-algebraic concept. We also describe geometrically how a linear concentration model can fail to exhibit this generic behavior and briefly discuss connections to rigidity theory.
Submitted 10 May, 2023;
originally announced May 2023.
-
Computing maximum likelihood thresholds using graph rigidity
Authors:
Daniel Irving Bernstein,
Sean Dewar,
Steven J. Gortler,
Anthony Nixon,
Meera Sitharam,
Louis Theran
Abstract:
The maximum likelihood threshold (MLT) of a graph $G$ is the minimum number of samples to almost surely guarantee existence of the maximum likelihood estimate in the corresponding Gaussian graphical model. Recently a new characterization of the MLT in terms of rigidity-theoretic properties of $G$ was proved \cite{Betal}. This characterization was then used to give new combinatorial lower bounds on the MLT of any graph. We continue this line of research by exploiting combinatorial rigidity results to compute the MLT precisely for several families of graphs. These include graphs with at most $9$ vertices, graphs with at most 24 edges, every graph sufficiently close to a complete graph and graphs with bounded degrees.
Submitted 20 October, 2022;
originally announced October 2022.
-
$K_{5,5}$ is fully reconstructible in $\mathbb{C}^3$
Authors:
Daniel Irving Bernstein,
Steven J. Gortler
Abstract:
A graph $G$ is fully reconstructible in $\mathbb{C}^d$ if the graph is determined from its $d$-dimensional measurement variety. The full reconstructibility problem has been solved for $d=1$ and $d=2$. For $d=3$, some necessary and some sufficient conditions are known and $K_{5,5}$ falls squarely within the gap in the theory. In this paper, we show that $K_{5,5}$ is fully reconstructible in $\mathbb{C}^3$.
Submitted 17 November, 2022; v1 submitted 19 October, 2021;
originally announced October 2021.
-
Maximum likelihood thresholds via graph rigidity
Authors:
Daniel Irving Bernstein,
Sean Dewar,
Steven J. Gortler,
Anthony Nixon,
Meera Sitharam,
Louis Theran
Abstract:
The maximum likelihood threshold (MLT) of a graph $G$ is the minimum number of samples to almost surely guarantee existence of the maximum likelihood estimate in the corresponding Gaussian graphical model. We give a new characterization of the MLT in terms of rigidity-theoretic properties of $G$ and use this characterization to give new combinatorial lower bounds on the MLT of any graph.
We use the new lower bounds to give high-probability guarantees on the maximum likelihood thresholds of sparse Erdős-Rényi random graphs in terms of their average density. These examples show that the new lower bounds are within a polylog factor of tight, where, on the same graph families, all known lower bounds are trivial.
Based on computational experiments made possible by our methods, we conjecture that the MLT of an Erdős-Rényi random graph is equal to its generic completion rank with high probability. Using structural results on rigid graphs in low dimension, we can prove the conjecture for graphs with MLT at most $4$ and describe the threshold probability for the MLT to switch from $3$ to $4$.
We also give a geometric characterization of the MLT of a graph in terms of a new "lifting" problem for frameworks that is interesting in its own right. The lifting perspective yields a new connection between the weak MLT (where the maximum likelihood estimate exists only with positive probability) and the classical Hadwiger-Nelson problem.
Submitted 5 December, 2023; v1 submitted 4 August, 2021;
originally announced August 2021.
-
Generic symmetry-forced infinitesimal rigidity: translations and rotations
Authors:
Daniel Irving Bernstein
Abstract:
We characterize the combinatorial types of symmetric frameworks in the plane that are minimally generically symmetry-forced infinitesimally rigid when the symmetry group consists of rotations and translations. Along the way, we use tropical geometry to show how a construction of Edmonds that associates a matroid to a submodular function can be used to give a description of the algebraic matroid of a Hadamard product of two linear spaces in terms of the matroids of each linear space. This leads to new, short proofs of Laman's theorem and of a theorem of Jordán, Kaszanitzky, and Tanigawa, and Malestein and Theran characterizing the minimally generically symmetry-forced rigid graphs in the plane when the symmetry group contains only rotations.
Submitted 8 December, 2021; v1 submitted 23 March, 2020;
originally announced March 2020.
-
Ordering-Based Causal Structure Learning in the Presence of Latent Variables
Authors:
Daniel Irving Bernstein,
Basil Saeed,
Chandler Squires,
Caroline Uhler
Abstract:
We consider the task of learning a causal graph in the presence of latent confounders given i.i.d. samples from the model. While current algorithms for causal structure discovery in the presence of latent confounders are constraint-based, we here propose a score-based approach. We prove that under assumptions weaker than faithfulness, any sparsest independence map (IMAP) of the distribution belongs to the Markov equivalence class of the true model. This motivates the Sparsest Poset formulation: posets can be mapped to minimal IMAPs of the true model such that the sparsest of these IMAPs is Markov equivalent to the true model. Motivated by this result, we propose a greedy algorithm over the space of posets for causal structure discovery in the presence of latent confounders and compare its performance to the current state-of-the-art algorithms FCI and FCI+ on synthetic data.
Submitted 24 March, 2020; v1 submitted 20 October, 2019;
originally announced October 2019.
-
Typical ranks in symmetric matrix completion
Authors:
Daniel Irving Bernstein,
Grigoriy Blekherman,
Kisun Lee
Abstract:
We study the problem of low-rank matrix completion for symmetric matrices. The minimum rank of a completion of a generic partially specified symmetric matrix depends only on the location of the specified entries, and not their values, if complex entries are allowed. When the entries are required to be real, this is no longer the case and the possible minimum ranks are called typical ranks. We give a combinatorial description of the patterns of specified entries of $n\times n$ symmetric matrices that have $n$ as a typical rank. Moreover, we describe exactly when such a generic partial matrix is minimally completable to rank $n$. We also characterize the typical ranks for patterns of entries with low maximal typical rank.
Submitted 14 October, 2020; v1 submitted 14 September, 2019;
originally announced September 2019.
-
The algebraic matroid of the funtf variety
Authors:
Daniel Irving Bernstein,
Cameron Farnsworth,
Jose Israel Rodriguez
Abstract:
A finite unit norm tight frame is a collection of $r$ vectors in $\mathbb{R}^n$ that generalizes the notion of orthonormal bases. The affine finite unit norm tight frame variety is the Zariski closure of the set of finite unit norm tight frames. Determining the fiber of a projection of this variety onto a set of coordinates is called the algebraic finite unit norm tight frame completion problem. Our techniques involve the algebraic matroid of an algebraic variety, which encodes the dimensions of fibers of coordinate projections. This work characterizes the bases of the algebraic matroid underlying the variety of finite unit norm tight frames in $\mathbb{R}^3$. Partial results towards similar characterizations for finite unit norm tight frames in $\mathbb{R}^n$ with $n \ge 4$ are also given. We provide a method to bound the degree of the projections based on combinatorial data.
Submitted 14 January, 2020; v1 submitted 26 December, 2018;
originally announced December 2018.
-
The tropical Cayley-Menger variety
Authors:
Daniel Irving Bernstein,
Robert Krone
Abstract:
The Cayley-Menger variety is the Zariski closure of the set of vectors specifying the pairwise squared distances between $n$ points in $\mathbb{R}^d$. This variety is fundamental to algebraic approaches in rigidity theory. We study the tropicalization of the Cayley-Menger variety. In particular, when $d = 2$, we show that it is the Minkowski sum of the set of ultrametrics on $n$ leaves with itself, and we describe its polyhedral structure. We then give a new, tropical, proof of Laman's theorem.
Submitted 4 December, 2019; v1 submitted 21 December, 2018;
originally announced December 2018.
-
Typical and Generic Ranks in Matrix Completion
Authors:
Daniel Irving Bernstein,
Grigoriy Blekherman,
Rainer Sinn
Abstract:
We consider the problem of exact low-rank matrix completion from a geometric viewpoint: given a partially filled matrix M, we keep the positions of specified and unspecified entries fixed, and study how the minimal completion rank depends on the values of the known entries. If the entries of the matrix are complex numbers, then for a fixed pattern of locations of specified and unspecified entries there is a unique completion rank which occurs with positive probability. We call this rank the generic completion rank. Over the real numbers there can be multiple ranks that occur with positive probability; we call them typical completion ranks. We introduce these notions formally, and provide a number of inequalities and exact results on typical and generic ranks for different families of patterns of known and unknown entries.
Submitted 22 September, 2019; v1 submitted 26 February, 2018;
originally announced February 2018.
-
Unimodular hierarchical models and their Graver bases
Authors:
Daniel Irving Bernstein,
Christopher O'Neill
Abstract:
Given a simplicial complex whose vertices are labeled with positive integers, one can associate a vector configuration whose corresponding toric variety is the Zariski closure of a hierarchical model. We classify all the vertex-weighted simplicial complexes that give rise to unimodular vector configurations. We also provide a combinatorial characterization of their Graver bases.
Submitted 22 September, 2017; v1 submitted 28 April, 2017;
originally announced April 2017.
-
L-Infinity optimization to Bergman fans of matroids with an application to phylogenetics
Authors:
Daniel Irving Bernstein
Abstract:
Given a dissimilarity map $δ$ on a finite set $X$, the set of ultrametrics (equidistant tree metrics) which are $l^\infty$-nearest to $δ$ is a tropical polytope. We give an internal description of this tropical polytope which we use to derive a polynomial-time checkable test for the condition that all ultrametrics $l^\infty$-nearest to $δ$ have the same tree structure. It was shown by Ardila and Klivans \cite{ardila-klivans2006} that the set of all ultrametrics on a finite set of size $n$ is the Bergman fan associated to the matroid underlying the complete graph on $n$ vertices. Therefore, we derive our results in the more general context of Bergman fans of matroids. This added generality allows our results to be used on dissimilarity maps where only a subset of the entries are known.
Submitted 20 December, 2019; v1 submitted 16 February, 2017;
originally announced February 2017.
-
L-infinity optimization to linear spaces and phylogenetic trees
Authors:
Daniel Irving Bernstein,
Colby Long
Abstract:
Given a distance matrix consisting of pairwise distances between species, a distance-based phylogenetic reconstruction method returns a tree metric or equidistant tree metric (ultrametric) that best fits the data. We investigate distance-based phylogenetic reconstruction using the $l^\infty$-metric. In particular, we analyze the set of $l^\infty$-closest ultrametrics and tree metrics to an arbitrary dissimilarity map to determine its dimension and the tree topologies it represents. In the case of ultrametrics, we decompose the space of dissimilarity maps on 3 elements and on 4 elements relative to the tree topologies represented.
Our approach is to first address uniqueness issues arising in $l^\infty$-optimization to linear spaces. We show that the $l^\infty$-closest point in a linear space is unique if and only if the underlying matroid of the linear space is uniform. We also give a polyhedral decomposition of $\mathbb{R}^m$ based on the dimension of the set of $l^\infty$-closest points in a linear space.
Submitted 16 February, 2017;
originally announced February 2017.
-
Completion of tree metrics and rank-2 matrices
Authors:
Daniel Irving Bernstein
Abstract:
Motivated by applications to low-rank matrix completion, we give a combinatorial characterization of the independent sets in the algebraic matroid associated to the collection of $m\times n$ rank-2 matrices and $n\times n$ skew-symmetric rank-2 matrices. Our approach is to use tropical geometry to reduce this to a problem about phylogenetic trees which we then solve. In particular, we give a combinatorial description of the collections of pairwise distances between several taxa that may be arbitrarily prescribed while still allowing the resulting dissimilarity map to be completed to a tree metric.
Submitted 14 July, 2017; v1 submitted 20 December, 2016;
originally announced December 2016.
-
L-Infinity optimization in tropical geometry and phylogenetics
Authors:
Daniel Irving Bernstein,
Colby Long
Abstract:
We investigate uniqueness issues that arise in $l^\infty$-optimization to linear spaces and Bergman fans of matroids. For linear spaces, we give a polyhedral decomposition of $\mathbb{R}^n$ based on the dimension of the set of $l^\infty$-nearest neighbors. This implies that the $l^\infty$-nearest neighbor in a linear space is unique if and only if the underlying matroid is uniform. For Bergman fans of matroids, we show that the set of $l^\infty$-nearest points is a tropical polytope and give an algorithm to compute its tropical vertices. A key ingredient here is a notion of topology that generalizes tree topology. These results have practical implications for distance-based phylogenetic reconstruction using the $l^\infty$-metric. We analyze the possible dimensions of the set of $l^\infty$-nearest equidistant tree metrics to an arbitrary dissimilarity map and the number of tree topologies represented in this set. For both 3- and 4-leaf trees, we decompose the space of dissimilarity maps relative to the tree topologies represented.
Submitted 20 February, 2017; v1 submitted 12 June, 2016;
originally announced June 2016.
-
Normal Binary Hierarchical Models
Authors:
Daniel Irving Bernstein,
Seth Sullivant
Abstract:
Each simplicial complex and integer vector yields a vector configuration whose combinatorial properties are important for the analysis of contingency tables. We study the normality of these vector configurations including a description of operations on simplicial complexes that preserve normality, constructions of families of minimally nonnormal complexes, and computations classifying all of the normal complexes on up to six vertices. We repeat this analysis for compressed vector configurations, classifying all of the compressed complexes on up to six vertices.
Submitted 6 January, 2016; v1 submitted 21 August, 2015;
originally announced August 2015.
-
Unimodular Binary Hierarchical Models
Authors:
Daniel Irving Bernstein,
Seth Sullivant
Abstract:
Associated to each simplicial complex is a binary hierarchical model. We classify the simplicial complexes that yield unimodular binary hierarchical models. Our main theorem provides both a construction of all unimodular binary hierarchical models and a characterization in terms of excluded minors, where our definition of a minor allows the taking of links and induced complexes. A key tool in the proof is the lemma that the class of unimodular binary hierarchical models is closed under the Alexander duality operation on simplicial complexes.
Submitted 18 February, 2016; v1 submitted 21 February, 2015;
originally announced February 2015.
-
Bounds on the Expected Size of the Maximum Agreement Subtree
Authors:
Daniel Irving Bernstein,
Lam Si Tung Ho,
Colby Long,
Mike Steel,
Katherine St. John,
Seth Sullivant
Abstract:
We prove polynomial upper and lower bounds on the expected size of the maximum agreement subtree of two random binary phylogenetic trees under both the uniform distribution and Yule-Harding distribution. This positively answers a question posed in earlier work. Determining tight upper and lower bounds remains an open problem.
Submitted 31 August, 2015; v1 submitted 26 November, 2014;
originally announced November 2014.