-
Fuglede-Kadison determinants of matrix-valued semicircular elements and capacity estimates
Authors:
Tobias Mai,
Roland Speicher
Abstract:
We calculate the Fuglede-Kadison determinant of arbitrary matrix-valued semicircular operators in terms of the capacity of the corresponding covariance map. We also improve a lower bound by Garg, Gurvits, Oliveira, and Wigderson on this capacity, by making it dimension-independent.
Submitted 22 June, 2024;
originally announced June 2024.
-
Concentration of a sparse Bayesian model with Horseshoe prior in estimating high-dimensional precision matrix
Authors:
The Tien Mai
Abstract:
Precision matrices are crucial in many fields such as social networks, neuroscience, and economics, representing the edge structure of Gaussian graphical models (GGMs), where a zero in an off-diagonal position of the precision matrix indicates conditional independence between nodes. In high-dimensional settings where the dimension of the precision matrix $p$ exceeds the sample size $n$ and the matrix is sparse, methods like graphical Lasso, graphical SCAD, and CLIME are popular for estimating GGMs. While frequentist methods are well-studied, Bayesian approaches for (unstructured) sparse precision matrices are less explored. The graphical horseshoe estimate by Li et al. (2019), applying the global-local horseshoe prior, shows superior empirical performance, but theoretical work for sparse precision matrix estimation using shrinkage priors is limited. This paper addresses these gaps by providing concentration results for the tempered posterior with the fully specified horseshoe prior in high-dimensional settings. Moreover, we also provide novel theoretical results for model misspecification, offering a general oracle inequality for the posterior.
Submitted 20 June, 2024;
originally announced June 2024.
-
Adaptive posterior concentration rates for sparse high-dimensional linear regression with random design and unknown error variance
Authors:
The Tien Mai
Abstract:
This paper investigates sparse high-dimensional linear regression, particularly examining the properties of the posterior under conditions of random design and unknown error variance. We provide consistency results for the posterior and analyze its concentration rates, demonstrating adaptiveness to the unknown sparsity level of the regression coefficient vector. Furthermore, we extend our investigation to establish concentration outcomes for parameter estimation using specific distance measures. These findings are in line with recent discoveries in frequentist studies. Additionally, by employing techniques to address model misspecification through a fractional posterior, we broaden our analysis through oracle inequalities to encompass the critical aspect of model misspecification for the regular posterior. Our novel findings are demonstrated using two different types of sparsity priors: a shrinkage prior and a spike-and-slab prior.
Submitted 29 May, 2024;
originally announced May 2024.
-
Misclassification bounds for PAC-Bayesian sparse deep learning
Authors:
The Tien Mai
Abstract:
Recently, there has been a significant focus on exploring the theoretical aspects of deep learning, especially regarding its performance in classification tasks. Bayesian deep learning has emerged as a unified probabilistic framework, seeking to integrate deep learning with Bayesian methodologies seamlessly. However, there exists a gap in the theoretical understanding of Bayesian approaches in deep learning for classification. This study presents an attempt to bridge that gap. By leveraging PAC-Bayes bounds techniques, we present theoretical results on the prediction or misclassification error of a probabilistic approach utilizing Spike-and-Slab priors for sparse deep learning in classification. We establish non-asymptotic results for the prediction error. Additionally, we demonstrate that, by considering different architectures, our results can achieve minimax optimal rates in both low and high-dimensional settings, up to a logarithmic factor. Moreover, our additional logarithmic term yields slight improvements over previous works. Additionally, we propose and analyze an automated model selection approach aimed at optimally choosing a network architecture with guaranteed optimality.
Submitted 2 May, 2024;
originally announced May 2024.
-
On properties of fractional posterior in generalized reduced-rank regression
Authors:
The Tien Mai
Abstract:
Reduced rank regression (RRR) is a widely employed model for investigating the linear association between multiple response variables and a set of predictors. While RRR has been extensively explored in various works, the focus has predominantly been on continuous response variables, overlooking other types of outcomes. This study shifts its attention to the Bayesian perspective of generalized linear models (GLM) within the RRR framework. In this work, we relax the requirement for the link function of the generalized linear model to be canonical. We examine the properties of fractional posteriors in GLM within the RRR context, where a fractional power of the likelihood is utilized. By employing a spectral scaled Student prior distribution, we establish consistency and concentration results for the fractional posterior. Our results highlight adaptability, as they do not necessitate prior knowledge of the rank of the parameter matrix. These results are in line with those found in frequentist literature. Additionally, an examination of model mis-specification is undertaken, underscoring the effectiveness of our approach in such scenarios.
Submitted 27 April, 2024;
originally announced April 2024.
-
Generalized multiscale finite element method for a nonlinear elastic strain-limiting Cosserat model
Authors:
Dmitry Ammosov,
Tina Mai,
Juan Galvis
Abstract:
For nonlinear Cosserat elasticity, we consider multiscale methods in this paper. In particular, we explore the generalized multiscale finite element method (GMsFEM) to solve an isotropic Cosserat problem with a strain-limiting property (ensuring bounded linearized strains even under high stresses). Such a strain-limiting Cosserat model can find potential applications in solids and biological fibers. However, Cosserat media with naturally rotational degrees of freedom, nonlinear constitutive relations, high contrast, and heterogeneities may produce challenging multiscale characteristics in the solution, and upscaling by multiscale methods is necessary. Therefore, we utilize the offline and residual-based online (adaptive or uniform) GMsFEM in this context while handling the nonlinearity by Picard iteration. Through various two-dimensional experiments (for perforated, composite, and stochastically heterogeneous media with small and large strain-limiting parameters), our numerical results show the approaches' convergence, efficiency, and robustness. In addition, these results demonstrate that such approaches provide good accuracy, the online GMsFEM gives more accurate solutions than the offline one, and the online adaptive strategy has similar accuracy to the uniform one but with fewer degrees of freedom.
Submitted 21 March, 2024;
originally announced March 2024.
-
Prediction of discretization of online GMsFEM using deep learning for Richards equation
Authors:
Denis Spiridonov,
Sergei Stepanov,
Tina Mai
Abstract:
We develop a new coarse-scale approximation strategy for the nonlinear single-continuum Richards equation as an unsaturated flow over heterogeneous non-periodic media, using the online generalized multiscale finite element method (online GMsFEM) together with deep learning. A novelty of this approach is that local online multiscale basis functions are computed rapidly and frequently by utilizing deep neural networks (DNNs). More precisely, we employ the training set of stochastic permeability realizations and the corresponding computed online multiscale basis functions to train neural networks. The nonlinear map between such permeability fields and online multiscale basis functions is developed by our proposed deep learning algorithm. That is, in a new way, the predicted online multiscale basis functions incorporate the nonlinearity treatment of the Richards equation and reflect any time-dependent changes in the problem's properties. Multiple numerical experiments in two-dimensional model problems show the good performance of this technique, in terms of predictions of the online multiscale basis functions and thus finding solutions.
Submitted 21 March, 2024;
originally announced March 2024.
-
On high-dimensional classification by sparse generalized Bayesian logistic regression
Authors:
The Tien Mai
Abstract:
This work addresses the problem of high-dimensional classification by exploring the generalized Bayesian logistic regression method under a sparsity-inducing prior distribution. The method involves utilizing a fractional power of the likelihood, resulting in the fractional posterior. Our study yields concentration results for the fractional posterior, not only on the joint distribution of the predictor and response variable but also for the regression coefficients. Significantly, we derive novel findings concerning misclassification excess risk bounds using sparse generalized Bayesian logistic regression. These results parallel recent findings for penalized methods in the frequentist literature. Furthermore, we extend our results to the scenario of model misspecification, which is of critical importance.
Submitted 19 March, 2024;
originally announced March 2024.
-
Computing the noncommutative inner rank by means of operator-valued free probability theory
Authors:
Johannes Hoffmann,
Tobias Mai,
Roland Speicher
Abstract:
We address the noncommutative version of Edmonds' problem, which asks to determine the inner rank of a matrix in noncommuting variables. We provide an algorithm for the calculation of this inner rank by relating the problem to the distribution of a basic object in free probability theory, namely operator-valued semicircular elements. We have to solve a matrix-valued quadratic equation, for which we provide precise analytical and numerical control on the fixed point algorithm for solving the equation. Numerical examples show the efficiency of the algorithm.
Submitted 28 June, 2024; v1 submitted 7 August, 2023;
originally announced August 2023.
-
Constrained Assortment Optimization under the Cross-Nested Logit Model
Authors:
Cuong Le,
Tien Mai
Abstract:
We study the assortment optimization problem under general linear constraints, where the customer choice behavior is captured by the Cross-Nested Logit model. In this problem, there is a set of products organized into multiple subsets (or nests), where each product can belong to more than one nest. The aim is to find an assortment to offer to customers so that the expected revenue is maximized. We show that, under the Cross-Nested Logit model, the assortment problem is NP-hard, even without any constraints. To tackle the assortment optimization problem, we develop a new discretization mechanism to approximate the problem by a linear fractional program with a performance guarantee of $\frac{1 - ε}{1+ε}$, for any accuracy level $ε>0$. We then show that optimal solutions to the approximate problem can be obtained by solving mixed-integer linear programs. We further show that our discretization approach can also be applied to solve a joint assortment optimization and pricing problem, as well as an assortment problem under a mixture of Cross-Nested Logit models to account for multiple classes of customers. Our empirical results on a large number of randomly generated test instances demonstrate that, under a performance guarantee of 90%, the percentage gaps between the objective values obtained from our approximation methods and the optimal expected revenues are no larger than 1.2%.
Submitted 18 April, 2023;
originally announced April 2023.
-
Tackling Stackelberg Network Interdiction against a Boundedly Rational Adversary
Authors:
Tien Mai,
Avinandan Bose,
Arunesh Sinha,
Thanh H. Nguyen
Abstract:
This work studies Stackelberg network interdiction games -- an important class of games in which a defender first allocates (randomized) defense resources to a set of critical nodes on a graph while an adversary chooses its path to attack these nodes accordingly. We consider a boundedly rational adversary in which the adversary's response model is based on a dynamic form of classic logit-based discrete choice models. We show that the problem of finding an optimal interdiction strategy for the defender in the rational setting is NP-hard. The resulting optimization is in fact non-convex and additionally, involves complex terms that sum over exponentially many paths. We tackle these computational challenges by presenting new efficient approximation algorithms with bounded solution guarantees. First, we address the exponentially-many-path challenge by proposing a polynomial-time dynamic programming-based formulation. We then show that the gradient of the non-convex objective can also be computed in polynomial time, which allows us to use a gradient-based method to solve the problem efficiently. Second, we identify a restricted problem that is convex and hence gradient-based methods find the global optimal solution for this restricted problem. We further identify mild conditions under which this restricted problem provides a bounded approximation for the original problem.
Submitted 28 January, 2023;
originally announced January 2023.
-
Binary-Continuous Sum-of-ratios Optimization: Discretization, Approximations, and Convex Reformulations
Authors:
Tien Mai,
Ngan Ha Duong,
Thuy Anh Ta
Abstract:
We study a class of non-convex sum-of-ratios programs which can be used for decision-making in prominent areas such as product assortment and price optimization, facility location, and security games. Such an optimization problem involves both continuous and binary decision variables and is known to be highly non-convex and intractable to solve. We explore a discretization approach to approximate the optimization problem and show that the approximate program can be reformulated as mixed-integer linear or second-order cone programs, which can be conveniently handled by an off-the-shelf solver (e.g., CPLEX or GUROBI). We further establish (mild) conditions under which solutions to the approximate problem converge to optimal solutions as the number of discretization points increases. We also provide approximation bounds for solutions obtained from the approximated problem. We show how our approach applies to product assortment and price optimization, maximum covering facility location, and Bayesian Stackelberg security games and provide experimental results to evaluate the efficiency of our approach.
Submitted 3 November, 2022;
originally announced November 2022.
-
On a low-rank matrix single index model
Authors:
The Tien Mai
Abstract:
In this paper, we present a theoretical study of a low-rank matrix single index model. This model was recently introduced in biostatistics; however, its theoretical properties for jointly estimating the link function and the coefficient matrix have not yet been studied. Here, we use PAC-Bayesian bound techniques to provide a rigorous theoretical understanding of the joint estimation of the link function and the coefficient matrix.
Submitted 17 October, 2022;
originally announced October 2022.
-
The Dyson equation for $2$-positive maps and Hölder bounds for the Lévy distance of densities of states
Authors:
Tobias Mai
Abstract:
The so-called density of states is a Borel probability measure on the real line associated with the solution of the Dyson equation which we set up, on any fixed $C^\ast$-probability space, for a selfadjoint offset and a $2$-positive linear map. Using techniques from free noncommutative function theory, we prove explicit Hölder bounds for the Lévy distance of two such measures when any of the two parameters varies. As the main tools for the proof, which are also of independent interest, we show that solutions of the Dyson equation have strong analytic properties and evolve along any $C^1$-path of $2$-positive linear maps according to an operator-valued version of the inviscid Burgers equation.
Submitted 10 October, 2022;
originally announced October 2022.
-
Prediction of numerical homogenization using deep learning for the Richards equation
Authors:
Sergei Stepanov,
Denis Spiridonov,
Tina Mai
Abstract:
For the nonlinear Richards equation as an unsaturated flow through heterogeneous media, we build a new coarse-scale approximation algorithm utilizing numerical homogenization. This approach follows deep neural networks (DNNs) to quickly and frequently calculate macroscopic parameters. More specifically, we train neural networks with a training set consisting of stochastic permeability realizations and corresponding computed macroscopic targets (effective permeability tensor, homogenized stiffness matrix, and right-hand side vector). Our proposed deep learning scheme develops nonlinear maps between such permeability fields and macroscopic characteristics, and the treatment for Richards equation's nonlinearity is included in the predicted coarse-scale homogenized stiffness matrix, which is a novelty. This strategy's good performance is demonstrated by several numerical tests in two-dimensional model problems, for predictions of the macroscopic properties and consequently solutions.
Submitted 25 August, 2022;
originally announced August 2022.
-
Optimal quasi-Bayesian reduced rank regression with incomplete response
Authors:
The Tien Mai,
Pierre Alquier
Abstract:
The aim of reduced rank regression is to connect multiple response variables to multiple predictors. This model is very popular, especially in biostatistics where multiple measurements on individuals can be re-used to predict multiple outputs. Unfortunately, there are often missing data in such datasets, making it difficult to use standard estimation tools. In this paper, we study the problem of reduced rank regression where the response matrix is incomplete. We propose a quasi-Bayesian approach to this problem, in the sense that the likelihood is replaced by a quasi-likelihood. We provide a tight oracle inequality, proving that our method is adaptive to the rank of the coefficient matrix. We describe a Langevin Monte Carlo algorithm for the computation of the posterior mean. Numerical comparisons on synthetic and real data show that our method is competitive with the state of the art, where the rank is chosen by cross-validation, and sometimes leads to an improvement.
Submitted 17 June, 2022;
originally announced June 2022.
-
Scalable Distributional Robustness in a Class of Non Convex Optimization with Guarantees
Authors:
Avinandan Bose,
Arunesh Sinha,
Tien Mai
Abstract:
Distributionally robust optimization (DRO) has shown a lot of promise in providing robustness in learning as well as sample-based optimization problems. We endeavor to provide DRO solutions for a class of sum-of-fractionals non-convex optimization problems, which are used for decision making in prominent areas such as facility location and security games. In contrast to previous work, we find it more tractable to optimize the equivalent variance regularized form of DRO rather than the minimax form. We transform the variance regularized form to a mixed-integer second order cone program (MISOCP), which, while guaranteeing near global optimality, does not scale enough to solve problems with real world data-sets. We further propose two abstraction approaches based on clustering and stratified sampling to increase scalability, which we then use for real world data-sets. Importantly, we provide near global optimality guarantees for our approach and show experimentally that our solution quality is better than the locally optimal ones achieved by state-of-the-art gradient-based methods. We experimentally compare our different approaches and baselines, and reveal nuanced properties of a DRO solution.
Submitted 31 May, 2022;
originally announced May 2022.
-
Constraint Energy Minimizing Generalized Multiscale Finite Element Method for multi-continuum Richards equations
Authors:
Tina Mai,
Siu Wun Cheung,
Jun Sur Richard Park
Abstract:
In fluid flow simulation, the multi-continuum model is a useful strategy. When the heterogeneity and contrast of coefficients are high, the system becomes multiscale, and some kinds of reduced-order methods are demanded. Combining these techniques with nonlinearity, we will consider in this paper a dual-continuum model which is generalized as a multi-continuum model for a coupled system of nonlinear Richards equations as unsaturated flows, in complex heterogeneous fractured porous media; and we will solve it by a novel multiscale approach utilizing the constraint energy minimizing generalized multiscale finite element method (CEM-GMsFEM). In particular, such a nonlinear system will be discretized in time and then linearized by Picard iteration (whose global convergence is proved theoretically). Subsequently, we tackle the resulting linearized equations by the CEM-GMsFEM and obtain proper offline multiscale basis functions to span the multiscale space (which contains the pressure solution). More specifically, we first introduce two new sources of samples, and the GMsFEM is used over each coarse block to build local auxiliary multiscale basis functions via solving local spectral problems, that are crucial for detecting high-contrast channels. Second, per oversampled coarse region, local multiscale basis functions are created through the CEM as constrainedly minimizing an energy functional. Various numerical tests for our approach reveal that the error converges with the coarse-grid size alone and that only a few oversampling layers, as well as basis functions, are needed.
Submitted 23 May, 2022;
originally announced May 2022.
-
Joint Location and Cost Planning in Maximum Capture Facility Location under Multiplicative Random Utility Maximization
Authors:
Ngan Ha Duong,
Tien Thanh Dam,
Thuy Anh Ta,
Tien Mai
Abstract:
We study a joint facility location and cost planning problem in a competitive market under random utility maximization (RUM) models. The objective is to locate new facilities and make decisions on the costs (or budgets) to spend on the new facilities, aiming to maximize an expected captured customer demand, assuming that customers choose a facility among all available facilities according to a RUM model. We examine two RUM frameworks in the discrete choice literature, namely, the additive and multiplicative RUM. While the former has been widely used in facility location problems, we are the first to explore the latter in the context. We numerically show that the two RUM frameworks can well approximate each other in the context of the cost optimization problem. In addition, we show that, under the additive RUM framework, the resultant cost optimization problem becomes highly non-convex and may have several local optima. In contrast, the use of the multiplicative RUM brings several advantages to the competitive facility location problem. For instance, the cost optimization problem under the multiplicative RUM can be solved efficiently by a general convex optimization solver or can be reformulated as a conic quadratic program and handled by a conic solver available in some off-the-shelf solvers such as CPLEX or GUROBI. Furthermore, we consider a joint location and cost optimization problem under the multiplicative RUM and propose three approaches to solve the problem, namely, an equivalent conic reformulation, a multi-cut outer-approximation algorithm, and a local search heuristic. We provide numerical experiments based on synthetic instances of various sizes to evaluate the performances of the proposed algorithms in solving the cost optimization, and the joint location and cost optimization problems.
Submitted 11 February, 2023; v1 submitted 15 May, 2022;
originally announced May 2022.
-
Estimation of Recursive Route Choice Models with Incomplete Trip Observations
Authors:
Tien Mai,
The Viet Bui,
Quoc Phong Nguyen,
Tho V. Le
Abstract:
This work concerns the estimation of recursive route choice models in the situation that the trip observations are incomplete, i.e., there are unconnected links (or nodes) in the observations. A direct approach to handle this issue would be intractable because enumerating all paths between unconnected links (or nodes) in a real network is typically not possible. We exploit an expectation-maximization (EM) method that allows us to deal with the missing-data issue by alternately performing two steps: sampling the missing segments in the observations and solving maximum likelihood estimation problems. Moreover, observing that the EM method would be expensive, we propose a new estimation method based on the idea that the choice probabilities of unconnected link observations can be exactly computed by solving systems of linear equations. We further design a new algorithm, called decomposition-composition (DC), that helps reduce the number of systems of linear equations to be solved and speed up the estimation. We compare our proposed algorithms with some standard baselines using a dataset from a real network and show that the DC algorithm outperforms the other approaches in recovering missing information in the observations. Our methods work with most of the recursive route choice models proposed in the literature, including the recursive logit, nested recursive logit, or discounted recursive models.
Submitted 27 April, 2022;
originally announced April 2022.
-
Robust Maximum Capture Facility Location under Random Utility Maximization Models
Authors:
Anh Thuy Ta,
Tien Thanh Dam,
Tien Mai
Abstract:
We study a robust version of the maximum capture facility location problem in a competitive market, assuming that each customer chooses among all available facilities according to a random utility maximization (RUM) model. We employ the generalized extreme value (GEV) family of models and assume that the parameters of the RUM model are not given exactly but lie in convex uncertainty sets. The problem is to locate new facilities to maximize the worst-case captured user demand. We show that, interestingly, our robust model preserves the monotonicity and submodularity from its deterministic counterpart, implying that a simple greedy heuristic can guarantee a (1-1/e) approximation solution. We further show the concavity of the objective function under the classical multinomial logit (MNL) model, suggesting that an outer-approximation algorithm can be used to solve the robust model under MNL to optimality. We conduct experiments comparing our robust method to other deterministic and sampling approaches, using instances from different discrete choice models. Our results clearly demonstrate the advantages of our robust model in protecting the decision-maker from bad-case scenarios.
Submitted 11 February, 2023; v1 submitted 16 October, 2021;
originally announced October 2021.
-
Berry-Esseen bounds for the multivariate $\mathcal{B}$-free CLT and operator-valued matrices
Authors:
Marwa Banna,
Tobias Mai
Abstract:
We provide bounds of Berry-Esseen type for fundamental limit theorems in operator-valued free probability theory such as the operator-valued free Central Limit Theorem and the asymptotic behaviour of distributions of operator-valued matrices. Our estimates are on the level of operator-valued Cauchy transforms and the Lévy distance. We address the single-variable as well as the multivariate setting for which we consider linear matrix pencils and noncommutative polynomials as test functions. The estimates are in terms of operator-valued moments and yield the first quantitative bounds on the Lévy distance for the operator-valued free Central Limit Theorem. Our results also yield quantitative estimates on joint noncommutative distributions of operator-valued matrices having a general covariance profile. In the scalar-valued multivariate case, these estimates could be passed to explicit bounds on the order of convergence under the Kolmogorov distance.
Submitted 28 February, 2022; v1 submitted 5 May, 2021;
originally announced May 2021.
-
Convergence for noncommutative rational functions evaluated in random matrices
Authors:
Benoît Collins,
Tobias Mai,
Akihiro Miyagawa,
Félix Parraud,
Sheng Yin
Abstract:
One of the main applications of free probability is to show that for appropriately chosen independent copies of $d$ random matrix models, any noncommutative polynomial in these $d$ variables has a spectral distribution that converges asymptotically and can be described with the help of free probability. This paper aims to show that this can be extended to noncommutative rational functions, answering an open question by Roland Speicher. This paper also provides a noncommutative probability approach to approximating the free field. At the algebraic level, its construction relies on the approximation by generic matrices. On the other hand, it admits many embeddings in the algebra of operators affiliated with a $II_1$ factor. A consequence of our result is that, as soon as the generators admit a random matrix model, the approximation of any self-adjoint noncommutative rational function by generic matrices can be upgraded at the level of convergence in distribution.
Submitted 15 November, 2022; v1 submitted 10 March, 2021;
originally announced March 2021.
-
Submodularity and Local Search Approaches for Maximum Capture Problems under Generalized Extreme Value Models
Authors:
Tien Thanh Dam,
Thuy Anh Ta,
Tien Mai
Abstract:
We study the maximum capture problem in facility location under random utility models, i.e., the problem of seeking to locate new facilities in a competitive market such that the captured user demand is maximized, assuming that each customer chooses among all available facilities according to a random utility maximization model. We employ the generalized extreme value (GEV) family of discrete choice models and show that the objective function in this context is monotonic and submodular. This finding implies that a simple greedy heuristic can always guarantee a (1-1/e) approximation solution. We further develop a new algorithm combining a greedy heuristic, a gradient-based local search and an exchanging procedure to efficiently solve the problem. We conduct experiments using instances of different sizes and under different discrete choice models, and we show that our approach significantly outperforms prior approaches in terms of both returned objective value and CPU time. Our algorithm and theoretical findings can be applied to the maximum capture problems under various random utility models in the literature, including the popular multinomial logit, nested logit, cross nested logit, and the mixed logit models.
Submitted 10 February, 2021;
originally announced February 2021.
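The greedy heuristic mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a plain multinomial logit objective (one member of the GEV family) and a cardinality budget; the function names, the toy data, and the brute-force comparison are hypothetical and are not taken from the paper, whose full algorithm also includes a local search and an exchange step that are not shown here.

```python
import math
from itertools import combinations

def captured_demand(chosen, demand, utility, competitor):
    """Expected captured demand under an MNL model (illustrative objective).

    demand[i]      : size of customer zone i
    utility[i][j]  : deterministic utility of candidate site j for zone i
    competitor[i]  : summed exp-utilities of the competitor's facilities
    """
    total = 0.0
    for i, q in enumerate(demand):
        ours = sum(math.exp(utility[i][j]) for j in chosen)
        total += q * ours / (competitor[i] + ours)
    return total

def greedy_capture(n_sites, budget, demand, utility, competitor):
    """Repeatedly open the site with the largest marginal gain."""
    chosen = []
    for _ in range(budget):
        remaining = [j for j in range(n_sites) if j not in chosen]
        best = max(
            remaining,
            key=lambda j: captured_demand(chosen + [j], demand, utility, competitor),
        )
        chosen.append(best)
    return chosen

# Hypothetical toy instance: 3 customer zones, 4 candidate sites, open 2.
demand = [10.0, 20.0, 15.0]
utility = [
    [0.5, 1.0, -0.2, 0.1],
    [1.2, 0.3, 0.8, -0.5],
    [0.0, 0.4, 1.1, 0.9],
]
competitor = [1.0, 2.0, 1.5]

greedy_sites = greedy_capture(4, 2, demand, utility, competitor)
greedy_value = captured_demand(greedy_sites, demand, utility, competitor)

# Exhaustive search is only feasible for tiny instances; used here as a check.
best_value = max(
    captured_demand(list(s), demand, utility, competitor)
    for s in combinations(range(4), 2)
)
print(greedy_sites, greedy_value, best_value)
```

On small instances like this one, the greedy value can be compared directly with the exhaustive optimum to see the (1-1/e) guarantee in action.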
-
Online Discrepancy Minimization via Persistent Self-Balancing Walks
Authors:
David Arbour,
Drew Dimmery,
Tung Mai,
Anup Rao
Abstract:
We study the online discrepancy minimization problem for vectors in $\mathbb{R}^d$ in the oblivious setting where an adversary is allowed to fix the vectors $x_1, x_2, \ldots, x_n$ in arbitrary order ahead of time. We give an algorithm that maintains $O(\sqrt{\log(nd/δ)})$ discrepancy with probability $1-δ$, matching the lower bound given in [Bansal et al. 2020] up to an $O(\sqrt{\log \log n})$ factor in the high-probability regime. We also provide results for the weighted and multi-color versions of the problem.
Submitted 5 February, 2021; v1 submitted 4 February, 2021;
originally announced February 2021.
-
Fundamental Tradeoffs in Distributionally Adversarial Training
Authors:
Mohammad Mehrabi,
Adel Javanmard,
Ryan A. Rossi,
Anup Rao,
Tung Mai
Abstract:
Adversarial training is among the most effective techniques to improve the robustness of models against adversarial perturbations. However, the full effect of this approach on models is not well understood. For example, while adversarial training can reduce the adversarial risk (prediction error against an adversary), it sometimes increases standard risk (generalization error when there is no adversary). Even more, such behavior is impacted by various elements of the learning problem, including the size and quality of training data, specific forms of adversarial perturbations in the input, model overparameterization, and adversary's power, among others. In this paper, we focus on the distribution perturbing adversary framework wherein the adversary can change the test distribution within a neighborhood of the training data distribution. The neighborhood is defined via Wasserstein distance between distributions and the radius of the neighborhood is a measure of adversary's manipulative power. We study the tradeoff between standard risk and adversarial risk and derive the Pareto-optimal tradeoff, achievable over specific classes of models, in the infinite data limit with features dimension kept fixed. We consider three learning settings: 1) Regression with the class of linear models; 2) Binary classification under the Gaussian mixtures data model, with the class of linear classifiers; 3) Regression with the class of random features model (which can be equivalently represented as two-layer neural network with random first-layer weights). We show that a tradeoff between standard and adversarial risk is manifested in all three settings. We further characterize the Pareto-optimal tradeoff curves and discuss how a variety of factors, such as features correlation, adversary's power or the width of two-layer neural network would affect this tradeoff.
Submitted 15 January, 2021;
originally announced January 2021.
-
Multiscale simulations for multi-continuum Richards equations
Authors:
Jun Sur Richard Park,
Siu Wun Cheung,
Tina Mai
Abstract:
In this paper, we study a multiscale method for simulating a dual-continuum unsaturated flow problem within complex heterogeneous fractured porous media. Mathematically, each of the dual continua is modeled by a multiscale Richards equation (for pressure head), and these equations are coupled to one another by transfer terms. On its own, Richards equation is already a nonlinear partial differential equation, and it is exceedingly difficult to solve numerically due to the extra nonlinear dependencies involving the soil water. To deal with multiple scales, our strategy is that starting from a microscopic scale, we upscale the coupled system of dual-continuum Richards equations via homogenization by the two-scale asymptotic expansion, to obtain a homogenized system, at an intermediate scale (level). Based on a hierarchical approach, the homogenization's effective coefficients are computed through solving the arising cell problems. To tackle the nonlinearity, after time discretization, we use a Picard iteration procedure for linearization of the homogenized Richards equations. At each Picard iteration, some degree of multiscale still remains from the intermediate level, so we utilize the generalized multiscale finite element method (GMsFEM) combined with a multi-continuum approach, to upscale the homogenized system to a macroscopic (coarse-grid) level. This scheme involves building uncoupled and coupled multiscale basis functions, which are used not only to construct coarse-grid solution approximation with high accuracy but also (with the coupled multiscale basis) to capture the interactions among continua. These prospects and convergence are demonstrated by several numerical results for the proposed method.
Submitted 2 June, 2021; v1 submitted 18 October, 2020;
originally announced October 2020.
-
A Relation Analysis of Markov Decision Process Frameworks
Authors:
Tien Mai,
Patrick Jaillet
Abstract:
We study the relation between different Markov Decision Process (MDP) frameworks in the machine learning and econometrics literatures, including the standard MDP, the entropy and general regularized MDP, and stochastic MDP, where the latter is based on the assumption that the reward function is stochastic and follows a given distribution. We show that the entropy-regularized MDP is equivalent to a stochastic MDP model, and is strictly subsumed by the general regularized MDP. Moreover, we propose a distributional stochastic MDP framework by assuming that the distribution of the reward function is ambiguous. We further show that the distributional stochastic MDP is equivalent to the regularized MDP, in the sense that they always yield the same optimal policies. We also provide a connection between stochastic/regularized MDP and constrained MDP. Our work gives a unified view on several important MDP frameworks, which would lead to new ways to interpret the (entropy/general) regularized MDP frameworks through the lens of stochastic rewards and vice-versa. Given the recent popularity of regularized MDP in (deep) reinforcement learning, our work brings new understandings of how such algorithmic schemes work and suggests ideas to develop new ones.
Submitted 18 August, 2020;
originally announced August 2020.
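As a rough illustration of the entropy-regularized MDP discussed in the abstract above, the following sketch runs soft value iteration on a tiny tabular problem. It assumes unit temperature, so the Bellman backup takes the standard log-sum-exp form and the optimal policy is a softmax over Q-values. The function names and the two-state example are hypothetical, and the paper's own constructions (general regularizers, ambiguous reward distributions) are not reproduced.

```python
import math

def log_sum_exp(xs):
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def soft_q_values(values, rewards, transitions, gamma):
    """Q(s, a) = r(s, a) + gamma * E[V(s') | s, a] for every state-action pair."""
    return [
        [
            rewards[s][a]
            + gamma * sum(p * values[t] for t, p in enumerate(transitions[s][a]))
            for a in range(len(rewards[s]))
        ]
        for s in range(len(rewards))
    ]

def soft_value_iteration(rewards, transitions, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Iterate V(s) <- log sum_a exp(Q(s, a)) until the update is below tol."""
    values = [0.0] * len(rewards)
    for _ in range(max_iter):
        q = soft_q_values(values, rewards, transitions, gamma)
        new_values = [log_sum_exp(row) for row in q]
        converged = max(abs(x - y) for x, y in zip(new_values, values)) < tol
        values = new_values
        if converged:
            break
    # The entropy-regularized optimal policy is a softmax over the Q-values.
    policy = [[math.exp(qa - v) for qa in row] for row, v in zip(q, values)]
    return values, policy

# Hypothetical 2-state, 2-action MDP.
rewards = [[1.0, 0.0], [0.0, 2.0]]
transitions = [
    [[0.9, 0.1], [0.2, 0.8]],  # from state 0, under action 0 / action 1
    [[0.5, 0.5], [0.1, 0.9]],  # from state 1, under action 0 / action 1
]

soft_values, soft_policy = soft_value_iteration(rewards, transitions)
print(soft_values, soft_policy)
```

The same log-sum-exp backup is what one obtains as the expected maximum when i.i.d. Gumbel shocks are added to the rewards (up to a constant shift), which is one way to read the stated equivalence with a stochastic-reward model.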
-
Robust Product-line Pricing under Generalized Extreme Value Models
Authors:
Tien Mai,
Patrick Jaillet
Abstract:
We study robust versions of pricing problems where customers choose products according to a generalized extreme value (GEV) choice model, and the choice parameters are not known exactly but lie in an uncertainty set. We show that, when the robust problem is unconstrained and the price sensitivity parameters are homogeneous, the robust optimal prices have a constant markup over products, and we provide formulas that allow this constant markup to be computed by bisection. We further show that, in the case that the price sensitivity parameters are only homogeneous in each partition of the products, under the assumption that the choice probability generating function and the uncertainty set are partition-wise separable, a robust solution will have a constant markup in each subset, and this constant-markup vector can be found efficiently by convex optimization. We provide numerical results to illustrate the advantages of our robust approach in protecting from bad scenarios. Our results hold for convex and bounded uncertainty sets, and for any GEV model, including the multinomial logit, nested or cross-nested logit.
Submitted 17 October, 2021; v1 submitted 19 December, 2019;
originally announced December 2019.
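To make the "constant markup by bisection" idea in the abstract above concrete, here is a minimal sketch for the nominal (non-robust) multinomial logit case only. It assumes utilities of the form a_j - b * p_j with a common price sensitivity b, costs c_j, and an outside option with utility 0; under these assumptions the first-order condition for a common markup m is b * m = 1 + sum_j exp(a_j - b * (c_j + m)). The function names and numbers are hypothetical, and the paper's robust formulas over uncertainty sets are not reproduced.

```python
import math

def attraction(m, a, c, b):
    """Sum of exp-utilities when every product is priced at cost plus markup m."""
    return sum(math.exp(aj - b * (cj + m)) for aj, cj in zip(a, c))

def expected_profit(m, a, c, b):
    """Expected profit per customer: markup times total purchase probability."""
    s = attraction(m, a, c, b)
    return m * s / (1.0 + s)

def mnl_constant_markup(a, c, b, tol=1e-10):
    """Solve b*m - 1 - attraction(m) = 0 by bisection (left side is increasing in m)."""
    def g(m):
        return b * m - 1.0 - attraction(m, a, c, b)

    lo, hi = 0.0, 1.0 / b
    while g(hi) < 0.0:  # expand the bracket until the root is enclosed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical instance with three products.
a = [1.0, 0.5, 0.2]   # intrinsic utilities
c = [1.0, 0.8, 0.5]   # unit costs
b = 1.5               # common price sensitivity

markup = mnl_constant_markup(a, c, b)
prices = [cj + markup for cj in c]
print(markup, prices, expected_profit(markup, a, c, b))
```

Bisection applies here because the left-hand side minus the right-hand side of the first-order condition is strictly increasing in m, so the root is unique.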
-
A Note on the Free and Cyclic Differential Calculus
Authors:
Tobias Mai,
Roland Speicher
Abstract:
In 2000, Voiculescu proved an algebraic characterization of cyclic gradients of noncommutative polynomials. We extend this remarkable result in two different directions: first, we obtain an analogous characterization of free gradients; second, we lift both of these results to Voiculescu's fundamental framework of multivariable generalized difference quotient rings. For that purpose, we develop the concept of divergence operators, for both free and cyclic gradients, and study the associated (weak) grading and cyclic symmetrization operators, respectively. On the one hand, this puts a new complexion on the initial polynomial case, and on the other hand, it provides a uniform framework within which also other examples - such as a discrete version of the Itô stochastic integral - can be treated.
Submitted 25 June, 2020; v1 submitted 16 October, 2019;
originally announced October 2019.
-
Theory of functional connections applied to quadratic and nonlinear programming under equality constraints
Authors:
Tina Mai,
Daniele Mortari
Abstract:
This paper introduces an efficient approach to solve quadratic and nonlinear programming problems subject to linear equality constraints via the Theory of Functional Connections. This is done without using the traditional Lagrange multiplier technique. More specifically, two distinct expressions (fully satisfying the equality constraints) are provided, to first solve the constrained quadratic programming problem as an unconstrained one with a closed-form solution. Such expressions are derived using an optimization variable vector, which is called the free vector $\boldsymbol{g}$ by the Theory of Functional Connections. In the spirit of this Theory, for the equality constrained nonlinear programming problem, its solution is obtained by Newton's method combined with an elimination scheme in optimization. Convergence analysis is supported by a numerical example for the proposed approach.
Submitted 25 August, 2022; v1 submitted 10 October, 2019;
originally announced October 2019.
-
Constraint energy minimizing generalized multiscale finite element method for nonlinear poroelasticity and elasticity
Authors:
Shubin Fu,
Eric Chung,
Tina Mai
Abstract:
In this paper, we apply the constraint energy minimizing generalized multiscale finite element method (CEM-GMsFEM) first to solve a nonlinear poroelasticity problem. The arising system consists of a nonlinear pressure equation and a nonlinear stress equation in strain-limiting setting, where strains remain bounded while stresses can grow arbitrarily large. After time discretization of the system, to tackle the nonlinearity, we linearize the resulting equations by Picard iteration. To handle the linearized equations, we employ the CEM-GMsFEM and obtain appropriate offline multiscale basis functions for the pressure and the displacement. More specifically, first, auxiliary multiscale basis functions are generated by solving local spectral problems, via the GMsFEM. Then, multiscale spaces are constructed in oversampled regions, by solving a constraint energy minimizing (CEM) problem. After that, this strategy (with the CEM-GMsFEM) is also applied to a static case of the above nonlinear poroelasticity problem, that is, elasticity problem, where the residual based online multiscale basis functions are generated by an adaptive enrichment procedure, to further reduce the error. Convergence of the two cases is demonstrated by several numerical simulations, which give accurate solutions, with converging coarse-mesh sizes as well as few basis functions (degrees of freedom) and oversampling layers.
Submitted 29 September, 2019;
originally announced September 2019.
-
Multiscale simulations for upscaled multi-continuum flows
Authors:
Jun Sur Richard Park,
Siu Wun Cheung,
Tina Mai,
Viet Ha Hoang
Abstract:
We consider in this paper a challenging problem of simulating fluid flows, in complex multiscale media possessing a multi-continuum background. As an effort to handle this obstacle, model reduction is employed. In \cite{rh2}, homogenization was nicely applied, to find effective coefficients and homogenized equations (for fluid flow pressures) of a dual-continuum system, with new convection terms and negative interaction coefficients. However, some degree of multiscale still remains. This motivates us to propose the generalized multiscale finite element method (GMsFEM), which is coupled with the dual-continuum homogenized equations, toward speeding up the simulation, improving the accuracy as well as clearly representing the interactions between the dual continua. In our paper, globally, each continuum is viewed as a system and connected to the other throughout the domain. We take into consideration the flow transfers between the dual continua and within each continuum itself. Such multiscale flow dynamics are modeled by the GMsFEM, which systematically generates either uncoupled or coupled multiscale basis (to carry the local characteristics to the global ones), via establishing local snapshots and spectral decomposition in the snapshot space. As a result, we will work with a system of two equations coupled with some interaction terms, and each equation describes one of the dual continua on the fine grid. Convergence analysis of the proposed GMsFEM is accompanied by numerical results, which support the favorable outcomes.
Submitted 10 September, 2019;
originally announced September 2019.
-
The free field: realization via unbounded operators and Atiyah property
Authors:
Tobias Mai,
Roland Speicher,
Sheng Yin
Abstract:
Let $X_1,\dots,X_n$ be operators in a finite von Neumann algebra and consider their division closure in the affiliated unbounded operators. We address the question when this division closure is a skew field (aka division ring) and when it is the free skew field. We show that the first property is equivalent to the strong Atiyah property and that the second property can be characterized in terms of the non-commutative distribution of $X_1,\dots,X_n$. More precisely, $X_1,\dots,X_n$ generate the free skew field if and only if there exist no non-zero finite rank operators $T_1,\dots,T_n$ such that $\sum_i[T_i,X_i]=0$. Sufficient conditions for this are the maximality of the free entropy dimension or the existence of a dual system of $X_1,\dots,X_n$. Our general theory is not restricted to selfadjoint operators and thus does also include and recover the result of Linnell that the generators of the free group give the free skew field.
We also give consequences of our result for the question of atoms in the distribution of rational functions in free variables or in the asymptotic eigenvalue distribution of matrices over polynomials in asymptotically free random matrices. This solves in particular a conjecture of Charlesworth and Shlyakhtenko.
Submitted 16 April, 2020; v1 submitted 20 May, 2019;
originally announced May 2019.
-
Generalized multiscale finite element method for a strain-limiting nonlinear elasticity model
Authors:
Shubin Fu,
Eric Chung,
Tina Mai
Abstract:
In this paper, we consider multiscale methods for nonlinear elasticity. In particular, we investigate the Generalized Multiscale Finite Element Method (GMsFEM) for a strain-limiting elasticity problem. Being a special case of the naturally implicit constitutive theory of nonlinear elasticity, the strain-limiting relation describes an interesting class of material bodies, for which strains remain bounded (even infinitesimal) while stresses can become arbitrarily large. The nonlinearity and material heterogeneities can create multiscale features in the solution, and multiscale methods are therefore necessary. To handle the resulting nonlinear monotone quasilinear elliptic equation, we use linearization based on the Picard iteration. We consider two types of basis functions, offline and online basis functions, following the general framework of GMsFEM. The offline basis functions depend nonlinearly on the solution. Thus, we design an indicator function and we will recompute the offline basis functions when the indicator function predicts that the material property has significant change during the iterations. On the other hand, we will use the residual based online basis functions to reduce the error substantially when updating basis functions is necessary. Our numerical results show that the above combination of offline and online basis functions is able to give accurate solutions with only a few basis functions per coarse region and updating basis functions in selected iterations.
Submitted 21 December, 2018;
originally announced December 2018.
-
A note on existence of free Stein kernels
Authors:
Guillaume Cébron,
Max Fathi,
Tobias Mai
Abstract:
Stein kernels are a way of comparing probability distributions, defined via integration by parts formulas. We provide two constructions of Stein kernels in free probability. One is given by an explicit formula, and the other via free Poincaré inequalities. In particular, we show that unlike in the classical setting, free Stein kernels always exist. As corollaries, we derive new bounds on the rate of convergence in the free CLT, and a strengthening of a characterization of the semicircular law due to Biane.
Submitted 7 November, 2018;
originally announced November 2018.
-
Hölder Continuity of Cumulative Distribution Functions for Noncommutative Polynomials under Finite Free Fisher Information
Authors:
Marwa Banna,
Tobias Mai
Abstract:
This paper contributes to the current studies on regularity properties of noncommutative distributions in free probability theory. More precisely, we consider evaluations of selfadjoint noncommutative polynomials in noncommutative random variables that have finite non-microstates free Fisher information, highlighting the special case of Lipschitz conjugate variables. For the first time in this generality, it is shown that the analytic distributions of those evaluations have Hölder continuous cumulative distribution functions with an explicit Hölder exponent that depends only on the degree of the considered polynomial. For linear polynomials, we reach in the case of finite non-microstates free Fisher information the optimal Hölder exponent $\frac{2}{3}$, and get Lipschitz continuity in the case of Lipschitz conjugate variables. In particular, our results guarantee that such polynomial evaluations have finite logarithmic energy and thus finite (non-microstates) free entropy, which partially settles a conjecture of Charlesworth and Shlyakhtenko [CS16].
We further provide a very general criterion that gives for weak approximations of measures having Hölder continuous cumulative distribution functions explicit rates of convergence in terms of the Kolmogorov distance.
Finally, we combine these results to study the asymptotic eigenvalue distributions of polynomials in GUEs or matrices with more general Gibbs laws. For Gibbs laws, this extends the corresponding result obtained in [GS09] from convergence in distribution to convergence in Kolmogorov distance; in the GUE case, we even provide explicit rates, which quantify results of [HT05,HST06] in terms of the Kolmogorov distance.
Submitted 20 November, 2019; v1 submitted 28 September, 2018;
originally announced September 2018.
-
The free field: zero divisors, Atiyah property and realizations via unbounded operators
Authors:
Tobias Mai,
Roland Speicher,
Sheng Yin
Abstract:
We consider noncommutative rational functions as well as matrices in polynomials in noncommuting variables in two settings: in an algebraic context the variables are formal variables, and their rational functions generate the "free field"; in an analytic context the variables are given by operators from a finite von Neumann algebra and the question of rational functions is treated within the affiliated unbounded operators. Our main result shows that for a "good" class of operators - namely those for which the free entropy dimension is maximal - the analytic and the algebraic theory are isomorphic. This means in particular that any non-trivial rational function can be evaluated as an unbounded operator for any such good tuple and that those operators don't have zero divisors. On the matrix side, this means that matrices of polynomials which are invertible in the free field are also invertible as matrices over unbounded operators when we plug in our good operator tuples. We also address the question of how this is related to the strong Atiyah property. The above yields a quite complete picture for the question of zero divisors (or atoms in the corresponding distributions) for operator tuples with maximal free entropy dimension. We also give some partial results for the question of existence and regularity of a density of the distribution.
Submitted 16 April, 2020; v1 submitted 10 May, 2018;
originally announced May 2018.
-
Rock-Paper-Scissors, Differential Games and Biological Diversity
Authors:
Tung Mai,
Ioannis Panageas,
Will Ratcliff,
Vijay V. Vazirani,
Peter Yunker
Abstract:
We model a situation in which a collection of species derive their fitnesses via a rock-paper-scissors-type game; however, the precise payoffs are a function of the environment. The new aspect of our model lies in adding a feedback loop: the environment changes according to the relative fitnesses of the species; in particular, it gives a boost to the species having small populations. We cast our model in the setting of a differential game and we show that for a certain setting of parameters, this dynamics cycles. Our model is a natural one, since depletion of resources used by more frequent species will shift the payoff matrix towards favoring less frequent ones. Since the dynamics cycles, no species goes extinct and diversity is maintained.
Submitted 30 October, 2017;
originally announced October 2017.
-
Nash Social Welfare for Indivisible Items under Separable, Piecewise-Linear Concave Utilities
Authors:
Nima Anari,
Tung Mai,
Shayan Oveis Gharan,
Vijay V. Vazirani
Abstract:
Recently Cole and Gkatzelis gave the first constant factor approximation algorithm for the problem of allocating indivisible items to agents, under additive valuations, so as to maximize the Nash Social Welfare. We give constant factor algorithms for a substantial generalization of their problem -- to the case of separable, piecewise-linear concave utility functions. We give two such algorithms, the first using market equilibria and the second using the theory of stable polynomials.
In AGT, there is a paucity of methods for the design of mechanisms for the allocation of indivisible goods and the result of Cole and Gkatzelis seemed to be taking a major step towards filling this gap. Our result can be seen as another step in this direction.
Submitted 6 April, 2017; v1 submitted 15 December, 2016;
originally announced December 2016.
-
Entropic convergence and the linearized limit for the Boltzmann equation with external force
Authors:
Tina Mai
Abstract:
The purpose of this note, as a compendium, is to extend the results on entropic convergence and the linearized limit for the Boltzmann equation (without external force) in \cite{Levermore} by Levermore to the case of the Boltzmann equation with external force. More specifically, starting from the Boltzmann equation with an external force introduced in \cite{DL16} by Arsénio and Saint-Raymond, we find conditions on the force, to maintain the result in \cite{Levermore} (as an important application of the theory of DiPerna-Lions renormalized solutions) about the validity of the linearization approximation when the initial datum approaches a global Maxwellian.
△ Less
Submitted 15 December, 2016;
originally announced December 2016.
-
Pseudo-Bayesian Quantum Tomography with Rank-adaptation
Authors:
The Tien Mai,
Pierre Alquier
Abstract:
Quantum state tomography, an important task in quantum information processing, aims at reconstructing a state from prepared measurement data. Bayesian methods are recognized to be among the good and reliable choices for estimating quantum states~\cite{blume2010optimal}. Several numerical works showed that Bayesian estimations are comparable to, and even better than, other methods in the problem of $1$-qubit state recovery. However, the problem of choosing a prior distribution in the general case of $n$ qubits is not straightforward. More importantly, the statistical performance of Bayesian-type estimators has not been studied from a theoretical perspective yet. In this paper, we propose a novel prior for quantum states (density matrices), and we define pseudo-Bayesian estimators of the density matrix. Then, using PAC-Bayesian theorems, we derive rates of convergence for the posterior mean. The numerical performance of these estimators is tested on simulated and real datasets.
Submitted 10 October, 2016; v1 submitted 19 May, 2016;
originally announced May 2016.
-
Regularity of distributions of Wigner integrals
Authors:
Tobias Mai
Abstract:
Wigner integrals and the corresponding Wigner chaos were introduced by P. Biane and R. Speicher in 1998 as a non-commutative counterpart of classical Wiener-Itô integrals and the corresponding Wiener-Itô chaos, respectively, in free probability.
In the classical case, a famous result of I. Shigekawa states that non-trivial elements in the finite Wiener-Itô chaos have an absolutely continuous distribution. We provide here a first contribution to such regularity questions for Wigner integrals by showing that the distribution of non-trivial elements in the finite Wigner chaos cannot have atoms. This answers a question of I. Nourdin and G. Peccati.
For doing so, we establish the notion of directional gradients in the context of the free Malliavin calculus. These directional gradients bridge between free Malliavin calculus and the theory of non-commutative derivations as initiated by D. Voiculescu and Y. Dabrowski. Methods recently invented by R. Speicher, M. Weber, and the author for treating similar questions in the case of finitely many variables are extended, such that they apply to directional gradients. This approach also excludes zero-divisors for the considered elements in the finite Wigner chaos.
Submitted 7 March, 2016; v1 submitted 23 December, 2015;
originally announced December 2015.
-
Applications of Realizations (aka Linearizations) to Free Probability
Authors:
J. William Helton,
Tobias Mai,
Roland Speicher
Abstract:
We show how the combination of new "linearization" ideas in free probability theory with the powerful "realization" machinery -- developed over the last 50 years in fields including systems engineering and automata theory -- allows solving the problem of determining the eigenvalue distribution (or even the Brown measure, in the non-selfadjoint case) of noncommutative rational functions of random matrices when their size tends to infinity. Along the way we extend evaluations of noncommutative rational expressions from matrices to stably finite algebras, e.g. type II$_1$ von Neumann algebras, with a precise control of the domains of the rational expressions.
The paper provides sufficient background information, with the intention that it should be accessible both to functional analysts and to algebraists.
Submitted 29 September, 2017; v1 submitted 17 November, 2015;
originally announced November 2015.
-
Absence of algebraic relations and of zero divisors under the assumption of full non-microstates free entropy dimension
Authors:
Tobias Mai,
Roland Speicher,
Moritz Weber
Abstract:
We show that in a tracial and finitely generated $W^\ast$-probability space existence of conjugate variables excludes algebraic relations for the generators. Moreover, under the assumption of maximal non-microstates free entropy dimension, we prove that there are no zero divisors in the sense that the product of any non-commutative polynomial in the generators with any element from the von Neumann algebra is zero if and only if at least one of those factors is zero. In particular, this shows that in this case the distribution of any non-constant self-adjoint non-commutative polynomial in the generators does not have atoms.
Questions on the absence of atoms for polynomials in non-commuting random variables (or for polynomials in random matrices) have been an open problem for quite a while. We solve this general problem by showing that maximality of free entropy dimension excludes atoms.
Submitted 2 September, 2015; v1 submitted 23 February, 2015;
originally announced February 2015.
-
A Bayesian Approach for Noisy Matrix Completion: Optimal Rate under General Sampling Distribution
Authors:
The Tien Mai,
Pierre Alquier
Abstract:
Bayesian methods for low-rank matrix completion with noise have been shown to be very efficient computationally. While the behaviour of penalized minimization methods is well understood both from the theoretical and computational points of view in this problem, the theoretical optimality of Bayesian estimators has not been explored yet. In this paper, we propose a Bayesian estimator for matrix completion under general sampling distribution. We also provide an oracle inequality for this estimator. This inequality proves that, whatever the rank of the matrix to be estimated, our estimator reaches the minimax-optimal rate of convergence (up to a logarithmic factor). We end the paper with a short simulation study.
△ Less
Submitted 21 January, 2015; v1 submitted 25 August, 2014;
originally announced August 2014.
-
Absence of algebraic relations and of zero divisors under the assumption of finite non-microstates free Fisher information
Authors:
Tobias Mai,
Roland Speicher,
Moritz Weber
Abstract:
We show that in a tracial and finitely generated $W^\ast$-probability space existence of conjugate variables in an appropriate sense excludes algebraic relations for the generators. Moreover, under the assumption of finite non-microstates free Fisher information, we prove that there are no zero divisors in the sense that the product of any non-commutative polynomial in the generators with any element from the von Neumann algebra is zero if and only if at least one of those factors is zero.
Submitted 22 February, 2015; v1 submitted 21 July, 2014;
originally announced July 2014.
-
Analytic subordination theory of operator-valued free additive convolution and the solution of a general random matrix problem
Authors:
Serban Belinschi,
Tobias Mai,
Roland Speicher
Abstract:
We develop an analytic theory of operator-valued additive free convolution in terms of subordination functions. In contrast to earlier investigations, our functions are not just given by power series expansions, but are defined as Fréchet analytic functions in all of the operator upper half plane. Furthermore, we do not have to assume that our state is tracial. Combining this new analytic theory of operator-valued free convolution with Anderson's selfadjoint version of the linearization trick we are able to provide a solution to the following general random matrix problem: How can we calculate the asymptotic eigenvalue distribution of a polynomial evaluated in independent random matrices with known asymptotic eigenvalue distributions?
Submitted 31 August, 2013; v1 submitted 13 March, 2013;
originally announced March 2013.
-
Operator-valued and multivariate free Berry-Esseen theorems
Authors:
Tobias Mai,
Roland Speicher
Abstract:
We address the question of a Berry-Esseen type theorem for the speed of convergence in a multivariate free central limit theorem. For this, we estimate the difference between the operator-valued Cauchy transforms of the normalized partial sums in an operator-valued free central limit theorem and the Cauchy transform of the limiting operator-valued semicircular element. Since we have to deal with operators that are in general non-self-adjoint, we introduce the notion of matrix-valued resolvent sets and study the behavior of Cauchy transforms on them.
Submitted 13 February, 2012;
originally announced February 2012.
-
Analytically weak solutions to SPDEs with unbounded time-dependent differential operators and an application
Authors:
Benedict Baur,
Martin Grothaus,
Tan Thanh Mai
Abstract:
We analyze the concepts of analytically weak solutions of stochastic differential equations (SDEs) in Hilbert spaces with time-dependent unbounded operators and give conditions for existence and uniqueness of such solutions. Our studies are motivated by a stochastic partial differential equation (SPDE) arising in industrial mathematics.
Submitted 30 January, 2013; v1 submitted 8 December, 2011;
originally announced December 2011.