Search | arXiv e-print repository

Fuglede-Kadison determinants of matrix-valued semicircular elements and capacity estimates

Abstract: We calculate the Fuglede-Kadison determinant of arbitrary matrix-valued semicircular operators in terms of the capacity of the corresponding covariance map. We also improve a lower bound by Garg, Gurvits, Oliveira, and Wigderson on this capacity, by making it dimension-independent.

Submitted 22 June, 2024; originally announced June 2024.

arXiv:2406.14269 [pdf, ps, other]

Concentration of a sparse Bayesian model with Horseshoe prior in estimating high-dimensional precision matrix

Authors: The Tien Mai

Abstract: Precision matrices are crucial in many fields such as social networks, neuroscience, and economics, representing the edge structure of Gaussian graphical models (GGMs), where a zero in an off-diagonal position of the precision matrix indicates conditional independence between nodes. In high-dimensional settings where the dimension of the precision matrix $p$ exceeds the sample size $n$ and the matrix is sparse, methods like graphical Lasso, graphical SCAD, and CLIME are popular for estimating GGMs. While frequentist methods are well-studied, Bayesian approaches for (unstructured) sparse precision matrices are less explored. The graphical horseshoe estimate by Li et al. (2019), applying the global-local horseshoe prior, shows superior empirical performance, but theoretical work for sparse precision matrix estimation using shrinkage priors is limited. This paper addresses these gaps by providing concentration results for the tempered posterior with the fully specified horseshoe prior in high-dimensional settings. Moreover, we also provide novel theoretical results for model misspecification, offering a general oracle inequality for the posterior.

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2405.19016 [pdf, ps, other]

Adaptive posterior concentration rates for sparse high-dimensional linear regression with random design and unknown error variance

Authors: The Tien Mai

Abstract: This paper investigates sparse high-dimensional linear regression, particularly examining the properties of the posterior under conditions of random design and unknown error variance. We provide consistency results for the posterior and analyze its concentration rates, demonstrating adaptiveness to the unknown sparsity level of the regression coefficient vector. Furthermore, we extend our investigation to establish concentration outcomes for parameter estimation using specific distance measures. These findings are in line with recent discoveries in frequentist studies. Additionally, by employing techniques to address model misspecification through a fractional posterior, we broaden our analysis through oracle inequalities to encompass the critical aspect of model misspecification for the regular posterior. Our novel findings are demonstrated using two different types of sparsity priors: a shrinkage prior and a spike-and-slab prior.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.01304 [pdf, ps, other]

Misclassification bounds for PAC-Bayesian sparse deep learning

Authors: The Tien Mai

Abstract: Recently, there has been a significant focus on exploring the theoretical aspects of deep learning, especially regarding its performance in classification tasks. Bayesian deep learning has emerged as a unified probabilistic framework, seeking to integrate deep learning with Bayesian methodologies seamlessly. However, there exists a gap in the theoretical understanding of Bayesian approaches in deep learning for classification. This study presents an attempt to bridge that gap. By leveraging PAC-Bayes bound techniques, we present theoretical results on the prediction or misclassification error of a probabilistic approach utilizing Spike-and-Slab priors for sparse deep learning in classification. We establish non-asymptotic results for the prediction error. Additionally, we demonstrate that, by considering different architectures, our results can achieve minimax optimal rates in both low and high-dimensional settings, up to a logarithmic factor. Moreover, our additional logarithmic term yields slight improvements over previous works. Additionally, we propose and analyze an automated model selection approach aimed at optimally choosing a network architecture with guaranteed optimality.

Submitted 2 May, 2024; originally announced May 2024.

Comments: arXiv admin note: text overlap with arXiv:1908.04847 by other authors

arXiv:2404.17850 [pdf, ps, other]

On properties of fractional posterior in generalized reduced-rank regression

Authors: The Tien Mai

Abstract: Reduced rank regression (RRR) is a widely employed model for investigating the linear association between multiple response variables and a set of predictors. While RRR has been extensively explored in various works, the focus has predominantly been on continuous response variables, overlooking other types of outcomes. This study shifts its attention to the Bayesian perspective of generalized linear models (GLM) within the RRR framework. In this work, we relax the requirement for the link function of the generalized linear model to be canonical. We examine the properties of fractional posteriors in GLM within the RRR context, where a fractional power of the likelihood is utilized. By employing a spectral scaled Student prior distribution, we establish consistency and concentration results for the fractional posterior. Our results highlight adaptability, as they do not necessitate prior knowledge of the rank of the parameter matrix. These results are in line with those found in frequentist literature. Additionally, an examination of model misspecification is undertaken, underscoring the effectiveness of our approach in such scenarios.

Submitted 27 April, 2024; originally announced April 2024.

arXiv:2403.14178 [pdf, other]

Generalized multiscale finite element method for a nonlinear elastic strain-limiting Cosserat model

Authors: Dmitry Ammosov, Tina Mai, Juan Galvis

Abstract: For nonlinear Cosserat elasticity, we consider multiscale methods in this paper. In particular, we explore the generalized multiscale finite element method (GMsFEM) to solve an isotropic Cosserat problem with a strain-limiting property (ensuring bounded linearized strains even under high stresses). Such a strain-limiting Cosserat model can find potential applications in solids and biological fibers. However, Cosserat media with naturally rotational degrees of freedom, nonlinear constitutive relations, high contrast, and heterogeneities may produce challenging multiscale characteristics in the solution, and upscaling by multiscale methods is necessary. Therefore, we utilize the offline and residual-based online (adaptive or uniform) GMsFEM in this context while handling the nonlinearity by Picard iteration. Through various two-dimensional experiments (for perforated, composite, and stochastically heterogeneous media with small and big strain-limiting parameters), our numerical results show the approaches' convergence, efficiency, and robustness. In addition, these results demonstrate that such approaches provide good accuracy, the online GMsFEM gives more accurate solutions than the offline one, and the online adaptive strategy has similar accuracy to the uniform one but with fewer degrees of freedom.

Submitted 21 March, 2024; originally announced March 2024.

Comments: submitted to Journal of Computational Physics, fixed a notation

MSC Class: 65N30; 65N99

arXiv:2403.14177 [pdf, other]

Prediction of discretization of online GMsFEM using deep learning for Richards equation

Authors: Denis Spiridonov, Sergei Stepanov, Tina Mai

Abstract: We develop a new coarse-scale approximation strategy for the nonlinear single-continuum Richards equation as an unsaturated flow over heterogeneous non-periodic media, using the online generalized multiscale finite element method (online GMsFEM) together with deep learning. A novelty of this approach is that local online multiscale basis functions are computed rapidly and frequently by utilizing deep neural networks (DNNs). More precisely, we employ a training set of stochastic permeability realizations and the corresponding computed online multiscale basis functions to train neural networks. The nonlinear map between such permeability fields and online multiscale basis functions is developed by our proposed deep learning algorithm. That is, in a new way, the predicted online multiscale basis functions incorporate the nonlinearity treatment of the Richards equation and reflect any time-dependent changes in the problem's properties. Multiple numerical experiments in two-dimensional model problems show the good performance of this technique, in terms of predictions of the online multiscale basis functions and thus finding solutions.

Submitted 21 March, 2024; originally announced March 2024.

Comments: submitted to Journal of Computational and Applied Mathematics

MSC Class: 65M60; 65M12; 68T07

arXiv:2403.12832 [pdf, ps, other]

On high-dimensional classification by sparse generalized Bayesian logistic regression

Authors: The Tien Mai

Abstract: This work addresses the problem of high-dimensional classification by exploring the generalized Bayesian logistic regression method under a sparsity-inducing prior distribution. The method involves utilizing a fractional power of the likelihood, resulting in the fractional posterior. Our study yields concentration results for the fractional posterior, not only on the joint distribution of the predictor and response variable but also for the regression coefficients. Significantly, we derive novel findings concerning misclassification excess risk bounds using sparse generalized Bayesian logistic regression. These results parallel recent findings for penalized methods in the frequentist literature. Furthermore, we extend our results to the scenario of model misspecification, which is of critical importance.

Submitted 19 March, 2024; originally announced March 2024.

arXiv:2308.03667 [pdf, other]

Computing the noncommutative inner rank by means of operator-valued free probability theory

Authors: Johannes Hoffmann, Tobias Mai, Roland Speicher

Abstract: We address the noncommutative version of Edmonds' problem, which asks to determine the inner rank of a matrix in noncommuting variables. We provide an algorithm for the calculation of this inner rank by relating the problem with the distribution of a basic object in free probability theory, namely operator-valued semicircular elements. We have to solve a matrix-valued quadratic equation, for which we provide precise analytical and numerical control on the fixed point algorithm for solving the equation. Numerical examples show the efficiency of the algorithm.

Submitted 28 June, 2024; v1 submitted 7 August, 2023; originally announced August 2023.

Comments: In the second version we have not only improved the presentation of the results, but we supply in addition now actually also a certificate for the termination of our algorithm (this relies on recent theoretical results in the paper arxiv.longhoe.net/abs/2406.15922)

MSC Class: 46L54; 65J15; 12E15

arXiv:2304.08790 [pdf, other]

Constrained Assortment Optimization under the Cross-Nested Logit Model

Authors: Cuong Le, Tien Mai

Abstract: We study the assortment optimization problem under general linear constraints, where the customer choice behavior is captured by the Cross-Nested Logit model. In this problem, there is a set of products organized into multiple subsets (or nests), where each product can belong to more than one nest. The aim is to find an assortment to offer to customers so that the expected revenue is maximized. We show that, under the Cross-Nested Logit model, the assortment problem is NP-hard, even without any constraints. To tackle the assortment optimization problem, we develop a new discretization mechanism to approximate the problem by a linear fractional program with a performance guarantee of $\frac{1-\epsilon}{1+\epsilon}$, for any accuracy level $\epsilon>0$. We then show that optimal solutions to the approximate problem can be obtained by solving mixed-integer linear programs. We further show that our discretization approach can also be applied to solve a joint assortment optimization and pricing problem, as well as an assortment problem under a mixture of Cross-Nested Logit models to account for multiple classes of customers. Our empirical results on a large number of randomly generated test instances demonstrate that, under a performance guarantee of 90%, the percentage gaps between the objective values obtained from our approximation methods and the optimal expected revenues are no larger than 1.2%.

Submitted 18 April, 2023; originally announced April 2023.

arXiv:2301.12232 [pdf, other]

Tackling Stackelberg Network Interdiction against a Boundedly Rational Adversary

Authors: Tien Mai, Avinandan Bose, Arunesh Sinha, Thanh H. Nguyen

Abstract: This work studies Stackelberg network interdiction games -- an important class of games in which a defender first allocates (randomized) defense resources to a set of critical nodes on a graph while an adversary chooses its path to attack these nodes accordingly. We consider a boundedly rational adversary in which the adversary's response model is based on a dynamic form of classic logit-based discrete choice models. We show that the problem of finding an optimal interdiction strategy for the defender in the rational setting is NP-hard. The resulting optimization is in fact non-convex and additionally, involves complex terms that sum over exponentially many paths. We tackle these computational challenges by presenting new efficient approximation algorithms with bounded solution guarantees. First, we address the exponentially-many-path challenge by proposing a polynomial-time dynamic programming-based formulation. We then show that the gradient of the non-convex objective can also be computed in polynomial time, which allows us to use a gradient-based method to solve the problem efficiently. Second, we identify a restricted problem that is convex and hence gradient-based methods find the global optimal solution for this restricted problem. We further identify mild conditions under which this restricted problem provides a bounded approximation for the original problem.

Submitted 28 January, 2023; originally announced January 2023.

arXiv:2211.02152 [pdf, other]

Binary-Continuous Sum-of-ratios Optimization: Discretization, Approximations, and Convex Reformulations

Authors: Tien Mai, Ngan Ha Duong, Thuy Anh Ta

Abstract: We study a class of non-convex sum-of-ratios programs which can be used for decision-making in prominent areas such as product assortment and price optimization, facility location, and security games. Such an optimization problem involves both continuous and binary decision variables and is known to be highly non-convex and intractable to solve. We explore a discretization approach to approximate the optimization problem and show that the approximate program can be reformulated as mixed-integer linear or second-order cone programs, which can be conveniently handled by an off-the-shelf solver (e.g., CPLEX or GUROBI). We further establish (mild) conditions under which solutions to the approximate problem converge to optimal solutions as the number of discretization points increases. We also provide approximation bounds for solutions obtained from the approximated problem. We show how our approach applies to product assortment and price optimization, maximum covering facility location, and Bayesian Stackelberg security games and provide experimental results to evaluate the efficiency of our approach.

Submitted 3 November, 2022; originally announced November 2022.

arXiv:2210.08851 [pdf, ps, other]

doi 10.3390/math11092065

On a low-rank matrix single index model

Authors: The Tien Mai

Abstract: In this paper, we present a theoretical study of a low-rank matrix single index model. This model was recently introduced in biostatistics; however, its theoretical properties for jointly estimating the link function and the coefficient matrix have not yet been studied. Here, we use the PAC-Bayesian bounds technique to provide a rigorous theoretical understanding of the joint estimation of the link function and the coefficient matrix.

Submitted 17 October, 2022; originally announced October 2022.

arXiv:2210.04743 [pdf, ps, other]

The Dyson equation for $2$-positive maps and Hölder bounds for the Lévy distance of densities of states

Authors: Tobias Mai

Abstract: The so-called density of states is a Borel probability measure on the real line associated with the solution of the Dyson equation which we set up, on any fixed $C^\ast$-probability space, for a selfadjoint offset and a $2$-positive linear map. Using techniques from free noncommutative function theory, we prove explicit Hölder bounds for the Lévy distance of two such measures when any of the two parameters varies. As the main tools for the proof, which are also of independent interest, we show that solutions of the Dyson equation have strong analytic properties and evolve along any $C^1$-path of $2$-positive linear maps according to an operator-valued version of the inviscid Burgers equation.

Submitted 10 October, 2022; originally announced October 2022.

Comments: 27 pages

arXiv:2208.12161 [pdf, other]

doi 10.1016/j.cam.2022.114980

Prediction of numerical homogenization using deep learning for the Richards equation

Authors: Sergei Stepanov, Denis Spiridonov, Tina Mai

Abstract: For the nonlinear Richards equation as an unsaturated flow through heterogeneous media, we build a new coarse-scale approximation algorithm utilizing numerical homogenization. This approach relies on deep neural networks (DNNs) to quickly and frequently calculate macroscopic parameters. More specifically, we train neural networks with a training set consisting of stochastic permeability realizations and corresponding computed macroscopic targets (effective permeability tensor, homogenized stiffness matrix, and right-hand side vector). Our proposed deep learning scheme develops nonlinear maps between such permeability fields and macroscopic characteristics, and the treatment of the Richards equation's nonlinearity is included in the predicted coarse-scale homogenized stiffness matrix, which is a novelty. This strategy's good performance is demonstrated by several numerical tests in two-dimensional model problems, for predictions of the macroscopic properties and consequently solutions.

Submitted 25 August, 2022; originally announced August 2022.

Comments: 32 pages, submitted to Journal of Computational and Applied Mathematics

MSC Class: 65M60; 65M12; 68T07

Journal ref: Journal of Computational and Applied Mathematics, Volume 424, 1 May 2023, 114980

arXiv:2206.08619 [pdf, ps, other]

Optimal quasi-Bayesian reduced rank regression with incomplete response

Authors: The Tien Mai, Pierre Alquier

Abstract: The aim of reduced rank regression is to connect multiple response variables to multiple predictors. This model is very popular, especially in biostatistics where multiple measurements on individuals can be re-used to predict multiple outputs. Unfortunately, there are often missing data in such datasets, making it difficult to use standard estimation tools. In this paper, we study the problem of reduced rank regression where the response matrix is incomplete. We propose a quasi-Bayesian approach to this problem, in the sense that the likelihood is replaced by a quasi-likelihood. We provide a tight oracle inequality, proving that our method is adaptive to the rank of the coefficient matrix. We describe a Langevin Monte Carlo algorithm for the computation of the posterior mean. Numerical comparisons on synthetic and real data show that our method is competitive with the state of the art, where the rank is chosen by cross-validation, and sometimes leads to an improvement.

Submitted 17 June, 2022; originally announced June 2022.

arXiv:2205.15624 [pdf, other]

Scalable Distributional Robustness in a Class of Non Convex Optimization with Guarantees

Authors: Avinandan Bose, Arunesh Sinha, Tien Mai

Abstract: Distributionally robust optimization (DRO) has shown a lot of promise in providing robustness in learning as well as sample-based optimization problems. We endeavor to provide DRO solutions for a class of sum-of-fractionals, non-convex optimization problems used for decision making in prominent areas such as facility location and security games. In contrast to previous work, we find it more tractable to optimize the equivalent variance regularized form of DRO rather than the minimax form. We transform the variance regularized form to a mixed-integer second order cone program (MISOCP), which, while guaranteeing near global optimality, does not scale enough to solve problems with real-world datasets. We further propose two abstraction approaches based on clustering and stratified sampling to increase scalability, which we then use for real-world datasets. Importantly, we provide near global optimality guarantees for our approach and show experimentally that our solution quality is better than the locally optimal ones achieved by state-of-the-art gradient-based methods. We experimentally compare our different approaches and baselines, and reveal nuanced properties of a DRO solution.

Submitted 31 May, 2022; originally announced May 2022.

Comments: 24 pages, 3 figures, 5 tables

arXiv:2205.11294 [pdf, other]

doi 10.1016/j.jcp.2023.111915

Constraint Energy Minimizing Generalized Multiscale Finite Element Method for multi-continuum Richards equations

Authors: Tina Mai, Siu Wun Cheung, Jun Sur Richard Park

Abstract: In fluid flow simulation, the multi-continuum model is a useful strategy. When the heterogeneity and contrast of coefficients are high, the system becomes multiscale, and some kinds of reduced-order methods are demanded. Combining these techniques with nonlinearity, we will consider in this paper a dual-continuum model which is generalized as a multi-continuum model for a coupled system of nonlinear Richards equations as unsaturated flows, in complex heterogeneous fractured porous media; and we will solve it by a novel multiscale approach utilizing the constraint energy minimizing generalized multiscale finite element method (CEM-GMsFEM). In particular, such a nonlinear system will be discretized in time and then linearized by Picard iteration (whose global convergence is proved theoretically). Subsequently, we tackle the resulting linearized equations by the CEM-GMsFEM and obtain proper offline multiscale basis functions to span the multiscale space (which contains the pressure solution). More specifically, we first introduce two new sources of samples, and the GMsFEM is used over each coarse block to build local auxiliary multiscale basis functions via solving local spectral problems, which are crucial for detecting high-contrast channels. Second, per oversampled coarse region, local multiscale basis functions are created through the CEM by minimizing an energy functional subject to constraints. Various numerical tests for our approach reveal that the error converges with the coarse-grid size alone and that only a few oversampling layers, as well as basis functions, are needed.

Submitted 23 May, 2022; originally announced May 2022.

Comments: 22 pages, 7 figures, 4 tables, submitted to Journal of Computational Physics, fixed some typos and notation

MSC Class: 65M60; 65M12

arXiv:2205.07345 [pdf, other]

Joint Location and Cost Planning in Maximum Capture Facility Location under Multiplicative Random Utility Maximization

Authors: Ngan Ha Duong, Tien Thanh Dam, Thuy Anh Ta, Tien Mai

Abstract: We study a joint facility location and cost planning problem in a competitive market under random utility maximization (RUM) models. The objective is to locate new facilities and make decisions on the costs (or budgets) to spend on the new facilities, aiming to maximize an expected captured customer demand, assuming that customers choose a facility among all available facilities according to a RUM model. We examine two RUM frameworks in the discrete choice literature, namely, the additive and multiplicative RUM. While the former has been widely used in facility location problems, we are the first to explore the latter in this context. We numerically show that the two RUM frameworks can well approximate each other in the context of the cost optimization problem. In addition, we show that, under the additive RUM framework, the resultant cost optimization problem becomes highly non-convex and may have several local optima. In contrast, the use of the multiplicative RUM brings several advantages to the competitive facility location problem. For instance, the cost optimization problem under the multiplicative RUM can be solved efficiently by a general convex optimization solver or can be reformulated as a conic quadratic program and handled by a conic solver available in some off-the-shelf solvers such as CPLEX or GUROBI. Furthermore, we consider a joint location and cost optimization problem under the multiplicative RUM and propose three approaches to solve the problem, namely, an equivalent conic reformulation, a multi-cut outer-approximation algorithm, and a local search heuristic. We provide numerical experiments based on synthetic instances of various sizes to evaluate the performances of the proposed algorithms in solving the cost optimization, and the joint location and cost optimization problems.

Submitted 11 February, 2023; v1 submitted 15 May, 2022; originally announced May 2022.

Journal ref: Computer and Operations Research (2023)

arXiv:2204.12992 [pdf, other]

Estimation of Recursive Route Choice Models with Incomplete Trip Observations

Authors: Tien Mai, The Viet Bui, Quoc Phong Nguyen, Tho V. Le

Abstract: This work concerns the estimation of recursive route choice models in the situation that the trip observations are incomplete, i.e., there are unconnected links (or nodes) in the observations. A direct approach to handle this issue would be intractable because enumerating all paths between unconnected links (or nodes) in a real network is typically not possible. We exploit an expectation-maximization (EM) method that allows us to deal with the missing-data issue by alternately performing two steps: sampling the missing segments in the observations and solving maximum likelihood estimation problems. Moreover, observing that the EM method would be expensive, we propose a new estimation method based on the idea that the choice probabilities of unconnected link observations can be exactly computed by solving systems of linear equations. We further design a new algorithm, called decomposition-composition (DC), that helps reduce the number of systems of linear equations to be solved and speeds up the estimation. We compare our proposed algorithms with some standard baselines using a dataset from a real network and show that the DC algorithm outperforms the other approaches in recovering missing information in the observations. Our methods work with most of the recursive route choice models proposed in the literature, including the recursive logit, nested recursive logit, or discounted recursive models.

Submitted 27 April, 2022; originally announced April 2022.

Comments: 26 pages

arXiv:2110.08497 [pdf, other]

Robust Maximum Capture Facility Location under Random Utility Maximization Models

Authors: Anh Thuy Ta, Tien Thanh Dam, Tien Mai

Abstract: We study a robust version of the maximum capture facility location problem in a competitive market, assuming that each customer chooses among all available facilities according to a random utility maximization (RUM) model. We employ the generalized extreme value (GEV) family of models and assume that the parameters of the RUM model are not given exactly but lie in convex uncertainty sets. The problem is to locate new facilities to maximize the worst-case captured user demand. We show that, interestingly, our robust model preserves the monotonicity and submodularity from its deterministic counterpart, implying that a simple greedy heuristic can guarantee a (1-1/e) approximation solution. We further show the concavity of the objective function under the classical multinomial logit (MNL) model, suggesting that an outer-approximation algorithm can be used to solve the robust model under MNL to optimality. We conduct experiments comparing our robust method to other deterministic and sampling approaches, using instances from different discrete choice models. Our results clearly demonstrate the advantages of our robust model in protecting the decision-maker from bad-case scenarios.

Submitted 11 February, 2023; v1 submitted 16 October, 2021; originally announced October 2021.

Journal ref: European Journal of Operational Research (2023)

arXiv:2105.02044 [pdf, ps, other]

Berry-Esseen bounds for the multivariate $\mathcal{B}$-free CLT and operator-valued matrices

Authors: Marwa Banna, Tobias Mai

Abstract: We provide bounds of Berry-Esseen type for fundamental limit theorems in operator-valued free probability theory such as the operator-valued free Central Limit Theorem and the asymptotic behaviour of distributions of operator-valued matrices. Our estimates are on the level of operator-valued Cauchy transforms and the Lévy distance. We address the single-variable as well as the multivariate setting for which we consider linear matrix pencils and noncommutative polynomials as test functions. The estimates are in terms of operator-valued moments and yield the first quantitative bounds on the Lévy distance for the operator-valued free Central Limit Theorem. Our results also yield quantitative estimates on joint noncommutative distributions of operator-valued matrices having a general covariance profile. In the scalar-valued multivariate case, these estimates could be passed to explicit bounds on the order of convergence under the Kolmogorov distance.

Submitted 28 February, 2022; v1 submitted 5 May, 2021; originally announced May 2021.

Comments: 52 pages

arXiv:2103.05962 [pdf, ps, other]

doi 10.1007/s00208-022-02530-5

Convergence for noncommutative rational functions evaluated in random matrices

Authors: Benoît Collins, Tobias Mai, Akihiro Miyagawa, Félix Parraud, Sheng Yin

Abstract: One of the main applications of free probability is to show that for appropriately chosen independent copies of $d$ random matrix models, any noncommutative polynomial in these $d$ variables has a spectral distribution that converges asymptotically and can be described with the help of free probability. This paper aims to show that this can be extended to noncommutative rational functions, answering an open question by Roland Speicher. This paper also provides a noncommutative probability approach to approximating the free field. At the algebraic level, its construction relies on the approximation by generic matrices. On the other hand, it admits many embeddings in the algebra of operators affiliated with a $II_1$ factor. A consequence of our result is that, as soon as the generators admit a random matrix model, the approximation of any self-adjoint noncommutative rational function by generic matrices can be upgraded at the level of convergence in distribution.

Submitted 15 November, 2022; v1 submitted 10 March, 2021; originally announced March 2021.

Comments: 27 pages. This version contains minor revisions and additional references

Journal ref: Mathematische Annalen (2022), 1--32

arXiv:2102.05754 [pdf, ps, other]

doi 10.1016/j.ejor.2021.09.006

Submodularity and Local Search Approaches for Maximum Capture Problems under Generalized Extreme Value Models

Authors: Tien Thanh Dam, Thuy Anh Ta, Tien Mai

Abstract: We study the maximum capture problem in facility location under random utility models, i.e., the problem of seeking to locate new facilities in a competitive market such that the captured user demand is maximized, assuming that each customer chooses among all available facilities according to a random utility maximization model. We employ the generalized extreme value (GEV) family of discrete choice models and show that the objective function in this context is monotonic and submodular. This finding implies that a simple greedy heuristic can always guarantee a (1-1/e) approximation solution. We further develop a new algorithm combining a greedy heuristic, a gradient-based local search and an exchanging procedure to efficiently solve the problem. We conduct experiments using instances of different sizes and under different discrete choice models, and we show that our approach significantly outperforms prior approaches in terms of both returned objective value and CPU time. Our algorithm and theoretical findings can be applied to the maximum capture problems under various random utility models in the literature, including the popular multinomial logit, nested logit, cross nested logit, and the mixed logit models.

Submitted 10 February, 2021; originally announced February 2021.

Journal ref: European Journal of Operational Research - 300(2022) 953-965

arXiv:2102.02765

Online Discrepancy Minimization via Persistent Self-Balancing Walks

Authors: David Arbour, Drew Dimmery, Tung Mai, Anup Rao

Abstract: We study the online discrepancy minimization problem for vectors in $\mathbb{R}^d$ in the oblivious setting where an adversary is allowed to fix the vectors $x_1, x_2, \ldots, x_n$ in arbitrary order ahead of time. We give an algorithm that maintains $O(\sqrt{\log(nd/δ)})$ discrepancy with probability $1-δ$, matching the lower bound given in [Bansal et al. 2020] up to an $O(\sqrt{\log \log n})$ factor in the high-probability regime. We also provide results for the weighted and multi-color versions of the problem.

Submitted 5 February, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

Comments: The proof of Lemma 7 is incorrect. There is a serious issue that we don't know how to fix at the moment. We thank Yang, Nikhil and collaborators for bringing it to our attention

arXiv:2101.06309 [pdf, other]

Fundamental Tradeoffs in Distributionally Adversarial Training

Authors: Mohammad Mehrabi, Adel Javanmard, Ryan A. Rossi, Anup Rao, Tung Mai

Abstract: Adversarial training is among the most effective techniques to improve the robustness of models against adversarial perturbations. However, the full effect of this approach on models is not well understood. For example, while adversarial training can reduce the adversarial risk (prediction error against an adversary), it sometimes increases the standard risk (generalization error when there is no adversary). Moreover, such behavior is impacted by various elements of the learning problem, including the size and quality of training data, specific forms of adversarial perturbations in the input, model overparameterization, and the adversary's power, among others. In this paper, we focus on the distribution-perturbing adversary framework, wherein the adversary can change the test distribution within a neighborhood of the training data distribution. The neighborhood is defined via the Wasserstein distance between distributions, and the radius of the neighborhood is a measure of the adversary's manipulative power. We study the tradeoff between standard risk and adversarial risk and derive the Pareto-optimal tradeoff, achievable over specific classes of models, in the infinite data limit with the feature dimension kept fixed. We consider three learning settings: 1) Regression with the class of linear models; 2) Binary classification under the Gaussian mixtures data model, with the class of linear classifiers; 3) Regression with the class of random features models (which can be equivalently represented as a two-layer neural network with random first-layer weights). We show that a tradeoff between standard and adversarial risk is manifested in all three settings. We further characterize the Pareto-optimal tradeoff curves and discuss how a variety of factors, such as feature correlation, the adversary's power, or the width of the two-layer neural network, would affect this tradeoff.

Submitted 15 January, 2021; originally announced January 2021.

Comments: 23 pages, 3 figures

arXiv:2010.09181 [pdf, other]

doi 10.1016/j.cam.2021.113648

Multiscale simulations for multi-continuum Richards equations

Authors: Jun Sur Richard Park, Siu Wun Cheung, Tina Mai

Abstract: In this paper, we study a multiscale method for simulating a dual-continuum unsaturated flow problem within complex heterogeneous fractured porous media. Mathematically, each of the dual continua is modeled by a multiscale Richards equation (for pressure head), and these equations are coupled to one another by transfer terms. On its own, the Richards equation is already a nonlinear partial differential equation, and it is exceedingly difficult to solve numerically due to the extra nonlinear dependencies involving the soil water. To deal with multiple scales, our strategy is to start from a microscopic scale and upscale the coupled system of dual-continuum Richards equations via homogenization by the two-scale asymptotic expansion, to obtain a homogenized system at an intermediate scale (level). Based on a hierarchical approach, the homogenization's effective coefficients are computed by solving the arising cell problems. To tackle the nonlinearity, after time discretization, we use a Picard iteration procedure to linearize the homogenized Richards equations. At each Picard iteration, some degree of multiscale still remains from the intermediate level, so we utilize the generalized multiscale finite element method (GMsFEM) combined with a multi-continuum approach, to upscale the homogenized system to a macroscopic (coarse-grid) level. This scheme involves building uncoupled and coupled multiscale basis functions, which are used not only to construct a coarse-grid solution approximation with high accuracy but also (with the coupled multiscale basis) to capture the interactions among continua. These prospects and the convergence of the proposed method are demonstrated by several numerical results.

Submitted 2 June, 2021; v1 submitted 18 October, 2020; originally announced October 2020.

Comments: 45 pages, 4 figures, 2 tables, major revision. This is the accepted manuscript by Journal of Computational and Applied Mathematics (2021). The published journal article is available at https://doi.org/10.1016/j.cam.2021.113648 (2021)

MSC Class: 65N30; 65N99

Journal ref: Journal of Computational and Applied Mathematics, 397:113648, 2021

arXiv:2008.07820 [pdf, other]

A Relation Analysis of Markov Decision Process Frameworks

Authors: Tien Mai, Patrick Jaillet

Abstract: We study the relation between different Markov Decision Process (MDP) frameworks in the machine learning and econometrics literatures, including the standard MDP, the entropy and general regularized MDP, and stochastic MDP, where the latter is based on the assumption that the reward function is stochastic and follows a given distribution. We show that the entropy-regularized MDP is equivalent to a stochastic MDP model, and is strictly subsumed by the general regularized MDP. Moreover, we propose a distributional stochastic MDP framework by assuming that the distribution of the reward function is ambiguous. We further show that the distributional stochastic MDP is equivalent to the regularized MDP, in the sense that they always yield the same optimal policies. We also provide a connection between stochastic/regularized MDP and constrained MDP. Our work gives a unified view on several important MDP frameworks, which would lead to new ways to interpret the (entropy/general) regularized MDP frameworks through the lens of stochastic rewards and vice-versa. Given the recent popularity of regularized MDP in (deep) reinforcement learning, our work brings new understandings of how such algorithmic schemes work and suggests ideas to develop new ones.

Submitted 18 August, 2020; originally announced August 2020.

arXiv:1912.09552 [pdf, other]

Robust Product-line Pricing under Generalized Extreme Value Models

Authors: Tien Mai, Patrick Jaillet

Abstract: We study robust versions of pricing problems where customers choose products according to a generalized extreme value (GEV) choice model, and the choice parameters are not known exactly but lie in an uncertainty set. We show that, when the robust problem is unconstrained and the price sensitivity parameters are homogeneous, the robust optimal prices have a constant markup over products, and we provide formulas that allow this constant markup to be computed by bisection. We further show that, in the case that the price sensitivity parameters are only homogeneous in each partition of the products, under the assumption that the choice probability generating function and the uncertainty set are partition-wise separable, a robust solution will have a constant markup in each subset, and this constant-markup vector can be found efficiently by convex optimization. We provide numerical results to illustrate the advantages of our robust approach in protecting from bad scenarios. Our results hold for convex and bounded uncertainty sets, and for any arbitrary GEV model, including the multinomial logit, nested or cross-nested logit.

Submitted 17 October, 2021; v1 submitted 19 December, 2019; originally announced December 2019.

arXiv:1910.07570 [pdf, ps, other]

A Note on the Free and Cyclic Differential Calculus

Authors: Tobias Mai, Roland Speicher

Abstract: In 2000, Voiculescu proved an algebraic characterization of cyclic gradients of noncommutative polynomials. We extend this remarkable result in two different directions: first, we obtain an analogous characterization of free gradients; second, we lift both of these results to Voiculescu's fundamental framework of multivariable generalized difference quotient rings. For that purpose, we develop the concept of divergence operators, for both free and cyclic gradients, and study the associated (weak) grading and cyclic symmetrization operators, respectively. On the one hand, this puts a new complexion on the initial polynomial case, and on the other hand, it provides a uniform framework within which also other examples - such as a discrete version of the Itô stochastic integral - can be treated.

Submitted 25 June, 2020; v1 submitted 16 October, 2019; originally announced October 2019.

Comments: We have undertaken a minor revision; the paper has been accepted for publication in the special issue of the Journal of Operator Theory on the occasion of the 70th anniversary of Dan Voiculescu

arXiv:1910.04917 [pdf, other]

doi 10.1016/j.cam.2021.113912

Theory of functional connections applied to quadratic and nonlinear programming under equality constraints

Authors: Tina Mai, Daniele Mortari

Abstract: This paper introduces an efficient approach to solve quadratic and nonlinear programming problems subject to linear equality constraints via the Theory of Functional Connections. This is done without using the traditional Lagrange multiplier technique. More specifically, two distinct expressions (fully satisfying the equality constraints) are provided, to first solve the constrained quadratic programming problem as an unconstrained one for a closed-form solution. Such expressions are derived using an optimization variable vector, which is called the free vector $\boldsymbol{g}$ by the Theory of Functional Connections. In the spirit of this Theory, for the equality constrained nonlinear programming problem, its solution is obtained by Newton's method combined with an elimination scheme in optimization. Convergence analysis is supported by a numerical example for the proposed approach.

Submitted 25 August, 2022; v1 submitted 10 October, 2019; originally announced October 2019.

Comments: 34 pages, 1 figure, 3 tables, revision. This is the accepted manuscript by Journal of Computational and Applied Mathematics (2021). The published journal article is available at https://doi.org/10.1016/j.cam.2021.113912 (2021)

MSC Class: 65N99

Journal ref: Journal of Computational and Applied Mathematics Volume 406, 1 May 2022, 113912

arXiv:1909.13267 [pdf, other]

doi 10.1016/j.jcp.2020.109569

Constraint energy minimizing generalized multiscale finite element method for nonlinear poroelasticity and elasticity

Authors: Shubin Fu, Eric Chung, Tina Mai

Abstract: In this paper, we apply the constraint energy minimizing generalized multiscale finite element method (CEM-GMsFEM) to first solve a nonlinear poroelasticity problem. The arising system consists of a nonlinear pressure equation and a nonlinear stress equation in a strain-limiting setting, where strains remain bounded while stresses can grow arbitrarily large. After time discretization of the system, to tackle the nonlinearity, we linearize the resulting equations by Picard iteration. To handle the linearized equations, we employ the CEM-GMsFEM and obtain appropriate offline multiscale basis functions for the pressure and the displacement. More specifically, first, auxiliary multiscale basis functions are generated by solving local spectral problems, via the GMsFEM. Then, multiscale spaces are constructed in oversampled regions, by solving a constraint energy minimizing (CEM) problem. After that, this strategy (with the CEM-GMsFEM) is also applied to a static case of the above nonlinear poroelasticity problem, that is, an elasticity problem, where the residual based online multiscale basis functions are generated by an adaptive enrichment procedure, to further reduce the error. Convergence of the two cases is demonstrated by several numerical simulations, which give accurate solutions, with converging coarse-mesh sizes as well as few basis functions (degrees of freedom) and oversampling layers.

Submitted 29 September, 2019; originally announced September 2019.

Comments: 32 pages, 7 figures, 6 tables, submitted to Journal of Computational Physics

MSC Class: 65N30; 65N99

Journal ref: Journal of Computational Physics Volume 417, 15 September 2020, 109569

arXiv:1909.04722 [pdf, other]

doi 10.1016/j.cam.2020.112782

Multiscale simulations for upscaled multi-continuum flows

Authors: Jun Sur Richard Park, Siu Wun Cheung, Tina Mai, Viet Ha Hoang

Abstract: We consider in this paper a challenging problem of simulating fluid flows in complex multiscale media possessing a multi-continuum background. As an effort to handle this obstacle, model reduction is employed. In \cite{rh2}, homogenization was nicely applied, to find effective coefficients and homogenized equations (for fluid flow pressures) of a dual-continuum system, with new convection terms and negative interaction coefficients. However, some degree of multiscale still remains. This motivates us to propose the generalized multiscale finite element method (GMsFEM), which is coupled with the dual-continuum homogenized equations, toward speeding up the simulation, improving the accuracy as well as clearly representing the interactions between the dual continua. In our paper, globally, each continuum is viewed as a system and connected to the other throughout the domain. We take into consideration the flow transfers between the dual continua and within each continuum itself. Such multiscale flow dynamics are modeled by the GMsFEM, which systematically generates either uncoupled or coupled multiscale basis (to carry the local characteristics to the global ones), via establishing local snapshots and spectral decomposition in the snapshot space. As a result, we will work with a system of two equations coupled with some interaction terms, and each equation describes one of the dual continua on the fine grid. Convergence analysis of the proposed GMsFEM is accompanied by numerical results, which support the favorable outcomes.

Submitted 10 September, 2019; originally announced September 2019.

Comments: 35 pages, 6 figures, 4 tables, submitted to Journal of Computational and Applied Mathematics

MSC Class: 65N30; 65N99

Journal ref: Journal of Computational and Applied Mathematics Volume 374, 15 August 2020, 112782

arXiv:1905.08187 [pdf, ps, other]

The free field: realization via unbounded operators and Atiyah property

Authors: Tobias Mai, Roland Speicher, Sheng Yin

Abstract: Let $X_1,\dots,X_n$ be operators in a finite von Neumann algebra and consider their division closure in the affiliated unbounded operators. We address the question when this division closure is a skew field (aka division ring) and when it is the free skew field. We show that the first property is equivalent to the strong Atiyah property and that the second property can be characterized in terms of the non-commutative distribution of $X_1,\dots,X_n$. More precisely, $X_1,\dots,X_n$ generate the free skew field if and only if there exist no non-zero finite rank operators $T_1,\dots,T_n$ such that $\sum_i[T_i,X_i]=0$. Sufficient conditions for this are the maximality of the free entropy dimension or the existence of a dual system of $X_1,\dots,X_n$. Our general theory is not restricted to selfadjoint operators and thus does also include and recover the result of Linnell that the generators of the free group give the free skew field. We also give consequences of our result for the question of atoms in the distribution of rational functions in free variables or in the asymptotic eigenvalue distribution of matrices over polynomials in asymptotically free random matrices. This solves in particular a conjecture of Charlesworth and Shlyakhtenko.

Submitted 16 April, 2020; v1 submitted 20 May, 2019; originally announced May 2019.

Comments: small change in abstract and acknowledgement in version 2; this is a completely revised version of arXiv:1805.04150; parts of the previous version are removed, we concentrate now on the realization of the free field and have there, via new methods, much stronger results than before

MSC Class: 46L54; 16S85; 15B52

arXiv:1812.09347 [pdf, ps, other]

doi 10.1016/j.cam.2019.03.047

Generalized multiscale finite element method for a strain-limiting nonlinear elasticity model

Authors: Shubin Fu, Eric Chung, Tina Mai

Abstract: In this paper, we consider multiscale methods for nonlinear elasticity. In particular, we investigate the Generalized Multiscale Finite Element Method (GMsFEM) for a strain-limiting elasticity problem. Being a special case of the naturally implicit constitutive theory of nonlinear elasticity, the strain-limiting relation has presented an interesting class of material bodies, for which strains remain bounded (even infinitesimal) while stresses can become arbitrarily large. The nonlinearity and material heterogeneities can create multiscale features in the solution, and multiscale methods are therefore necessary. To handle the resulting nonlinear monotone quasilinear elliptic equation, we use linearization based on the Picard iteration. We consider two types of basis functions, offline and online basis functions, following the general framework of GMsFEM. The offline basis functions depend nonlinearly on the solution. Thus, we design an indicator function and we will recompute the offline basis functions when the indicator function predicts that the material property has changed significantly during the iterations. On the other hand, we will use the residual based online basis functions to reduce the error substantially when updating basis functions is necessary. Our numerical results show that the above combination of offline and online basis functions is able to give accurate solutions with only a few basis functions per coarse region and updating basis functions in selected iterations.

Submitted 21 December, 2018; originally announced December 2018.

Comments: 19 pages, 2 figures, submitted to Journal of Computational and Applied Mathematics

MSC Class: 65N30; 65N99

Journal ref: Journal of Computational and Applied Mathematics, Volume 359, 15 October 2019, Pages 153-165

arXiv:1811.02926 [pdf, ps, other]

A note on existence of free Stein kernels

Authors: Guillaume Cébron, Max Fathi, Tobias Mai

Abstract: Stein kernels are a way of comparing probability distributions, defined via integration by parts formulas. We provide two constructions of Stein kernels in free probability. One is given by an explicit formula, and the other via free Poincaré inequalities. In particular, we show that unlike in the classical setting, free Stein kernels always exist. As corollaries, we derive new bounds on the rate of convergence in the free CLT, and a strengthening of a characterization of the semicircular law due to Biane.

Submitted 7 November, 2018; originally announced November 2018.

Comments: 10 pages, comments are welcome

arXiv:1809.11153 [pdf, ps, other]

Hölder Continuity of Cumulative Distribution Functions for Noncommutative Polynomials under Finite Free Fisher Information

Authors: Marwa Banna, Tobias Mai

Abstract: This paper contributes to the current studies on regularity properties of noncommutative distributions in free probability theory. More precisely, we consider evaluations of selfadjoint noncommutative polynomials in noncommutative random variables that have finite non-microstates free Fisher information, highlighting the special case of Lipschitz conjugate variables. For the first time in this generality, it is shown that the analytic distributions of those evaluations have Hölder continuous cumulative distribution functions with an explicit Hölder exponent that depends only on the degree of the considered polynomial. For linear polynomials, we reach in the case of finite non-microstates free Fisher information the optimal Hölder exponent $\frac{2}{3}$, and get Lipschitz continuity in the case of Lipschitz conjugate variables. In particular, our results guarantee that such polynomial evaluations have finite logarithmic energy and thus finite (non-microstates) free entropy, which partially settles a conjecture of Charlesworth and Shlyakhtenko [CS16]. We further provide a very general criterion that gives for weak approximations of measures having Hölder continuous cumulative distribution functions explicit rates of convergence in terms of the Kolmogorov distance. Finally, we combine these results to study the asymptotic eigenvalue distributions of polynomials in GUEs or matrices with more general Gibbs laws. For Gibbs laws, this extends the corresponding result obtained in [GS09] from convergence in distribution to convergence in Kolmogorov distance; in the GUE case, we even provide explicit rates, which quantify results of [HT05,HST06] in terms of the Kolmogorov distance.

Submitted 20 November, 2019; v1 submitted 28 September, 2018; originally announced September 2018.

Comments: 33 pages; minor modifications in order to improve the exposition

arXiv:1805.04150 [pdf, ps, other]

The free field: zero divisors, Atiyah property and realizations via unbounded operators

Authors: Tobias Mai, Roland Speicher, Sheng Yin

Abstract: We consider noncommutative rational functions as well as matrices in polynomials in noncommuting variables in two settings: in an algebraic context the variables are formal variables, and their rational functions generate the "free field"; in an analytic context the variables are given by operators from a finite von Neumann algebra and the question of rational functions is treated within the affiliated unbounded operators. Our main result shows that for a "good" class of operators - namely those for which the free entropy dimension is maximal - the analytic and the algebraic theory are isomorphic. This means in particular that any non-trivial rational function can be evaluated as an unbounded operator for any such good tuple and that those operators don't have zero divisors. On the matrix side, this means that matrices of polynomials which are invertible in the free field are also invertible as matrices over unbounded operators when we plug in our good operator tuples. We also address the question how this is related to the strong Atiyah property. The above yields a quite complete picture for the question of zero divisors (or atoms in the corresponding distributions) for operator tuples with maximal free entropy dimension. We give also some partial results for the question of existence and regularity of a density of the distribution.

Submitted 16 April, 2020; v1 submitted 10 May, 2018; originally announced May 2018.

Comments: a completely revised version of this preprint appears in arXiv:1905.08187; there parts of the present version are removed, and we concentrate there on the realization of the free field and have there, via new methods, much stronger results than in this version

MSC Class: 46L54; 16S85; 15B52

arXiv:1710.11249 [pdf, other]

Rock-Paper-Scissors, Differential Games and Biological Diversity

Authors: Tung Mai, Ioannis Panageas, Will Ratcliff, Vijay V. Vazirani, Peter Yunker

Abstract: We model a situation in which a collection of species derive their fitnesses via a rock-paper-scissors-type game; however, the precise payoffs are a function of the environment. The new aspect of our model lies in adding a feedback loop: the environment changes according to the relative fitnesses of the species; in particular, it gives a boost to the species having small populations. We cast our model in the setting of a differential game and we show that for a certain setting of parameters, this dynamics cycles. Our model is a natural one, since depletion of resources used by more frequent species will shift the payoff matrix towards favoring less frequent ones. Since the dynamics cycles, no species goes extinct and diversity is maintained.

Submitted 30 October, 2017; originally announced October 2017.

arXiv:1612.05191 [pdf, ps, other]

Nash Social Welfare for Indivisible Items under Separable, Piecewise-Linear Concave Utilities

Authors: Nima Anari, Tung Mai, Shayan Oveis Gharan, Vijay V. Vazirani

Abstract: Recently Cole and Gkatzelis gave the first constant factor approximation algorithm for the problem of allocating indivisible items to agents, under additive valuations, so as to maximize the Nash Social Welfare. We give constant factor algorithms for a substantial generalization of their problem -- to the case of separable, piecewise-linear concave utility functions. We give two such algorithms, the first using market equilibria and the second using the theory of stable polynomials. In AGT, there is a paucity of methods for the design of mechanisms for the allocation of indivisible goods and the result of Cole and Gkatzelis seemed to be taking a major step towards filling this gap. Our result can be seen as another step in this direction.

Submitted 6 April, 2017; v1 submitted 15 December, 2016; originally announced December 2016.

arXiv:1612.05096 [pdf, ps, other]

Entropic convergence and the linearized limit for the Boltzmann equation with external force

Authors: Tina Mai

Abstract: The purpose of this note, as a compendium, is to extend the results on entropic convergence and the linearized limit for the Boltzmann equation (without external force) in \cite{Levermore} by Levermore to the case of the Boltzmann equation with external force. More specifically, starting from the Boltzmann equation with an external force introduced in \cite{DL16} by Arsénio and Saint-Raymond, we find conditions on the force, to maintain the result in \cite{Levermore} (as an important application of the theory of DiPerna-Lions renormalized solutions) about the validity of the linearization approximation when the initial datum approaches a global Maxwellian.

Submitted 15 December, 2016; originally announced December 2016.

Comments: 21 pages

MSC Class: 35Q20; 35Q99; 82C22; 82C40; 82D10

arXiv:1605.05933 [pdf, other]

doi 10.1016/j.jspi.2016.11.003

Pseudo-Bayesian Quantum Tomography with Rank-adaptation

Authors: The Tien Mai, Pierre Alquier

Abstract: Quantum state tomography, an important task in quantum information processing, aims at reconstructing a state from prepared measurement data. Bayesian methods are recognized to be one of the good and reliable choice in estimating quantum states~\cite{blume2010optimal}. Several numerical works showed that Bayesian estimations are comparable to, and even better than other methods in the problem of $1$-qubit state recovery. However, the problem of choosing prior distribution in the general case of $n$ qubits is not straightforward. More importantly, the statistical performance of Bayesian type estimators have not been studied from a theoretical perspective yet. In this paper, we propose a novel prior for quantum states (density matrices), and we define pseudo-Bayesian estimators of the density matrix. Then, using PAC-Bayesian theorems, we derive rates of convergence for the posterior mean. The numerical performance of these estimators are tested on simulated and real datasets.

Submitted 10 October, 2016; v1 submitted 19 May, 2016; originally announced May 2016.

arXiv:1512.07593 [pdf, ps, other]

Regularity of distributions of Wigner integrals

Authors: Tobias Mai

Abstract: Wigner integrals and the corresponding Wigner chaos were introduced by P. Biane and R. Speicher in 1998 as a non-commutative counterpart of classical Wiener-Itô integrals and the corresponding Wiener-Itô chaos, respectively, in free probability. In the classical case, a famous result of I. Shigekawa states that non-trivial elements in the finite Wiener-Itô chaos have an absolutely continuous distribution. We provide here a first contribution to such regularity questions for Wigner integrals by showing that the distribution of non-trivial elements in the finite Wigner chaos cannot have atoms. This answers a question of I. Nourdin and G. Peccati. For doing so, we establish the notion of directional gradients in the context of the free Malliavin calculus. These directional gradients bridge between free Malliavin calculus and the theory of non-commutative derivations as initiated by D. Voiculescu and Y. Dabrowski. Methods recently invented by R. Speicher, M. Weber, and the author for treating similar questions in the case of finitely many variables are extended, such that they apply to directional gradients. This approach also excludes zero-divisors for the considered elements in the finite Wigner chaos.

Submitted 7 March, 2016; v1 submitted 23 December, 2015; originally announced December 2015.

Comments: a few minor changes; typos corrected

arXiv:1511.05330 [pdf, ps, other]

Applications of Realizations (aka Linearizations) to Free Probability

Authors: J. William Helton, Tobias Mai, Roland Speicher

Abstract: We show how the combination of new "linearization" ideas in free probability theory with the powerful "realization" machinery -- developed over the last 50 years in fields including systems engineering and automata theory -- allows solving the problem of determining the eigenvalue distribution (or even the Brown measure, in the non-selfadjoint case) of noncommutative rational functions of random matrices when their size tends to infinity. Along the way we extend evaluations of noncommutative rational expressions from matrices to stably finite algebras, e.g. type II$_1$ von Neumann algebras, with a precise control of the domains of the rational expressions. The paper provides sufficient background information, with the intention that it should be accessible both to functional analysts and to algebraists.

Submitted 29 September, 2017; v1 submitted 17 November, 2015; originally announced November 2015.

Comments: We have undertaken a major revision, mainly for the sake of clarity and readability

arXiv:1502.06357 [pdf, ps, other]

Absence of algebraic relations and of zero divisors under the assumption of full non-microstates free entropy dimension

Authors: Tobias Mai, Roland Speicher, Moritz Weber

Abstract: We show that in a tracial and finitely generated $W^\ast$-probability space existence of conjugate variables excludes algebraic relations for the generators. Moreover, under the assumption of maximal non-microstates free entropy dimension, we prove that there are no zero divisors in the sense that the product of any non-commutative polynomial in the generators with any element from the von Neumann algebra is zero if and only if at least one of those factors is zero. In particular, this shows that in this case the distribution of any non-constant self-adjoint non-commutative polynomial in the generators does not have atoms. Questions on the absence of atoms for polynomials in non-commuting random variables (or for polynomials in random matrices) have been an open problem for quite a while. We solve this general problem by showing that maximality of free entropy dimension excludes atoms.

Submitted 2 September, 2015; v1 submitted 23 February, 2015; originally announced February 2015.

Comments: this is an extended version of arXiv:1407.5715

arXiv:1408.5820 [pdf, other]

doi 10.1214/15-EJS1020

A Bayesian Approach for Noisy Matrix Completion: Optimal Rate under General Sampling Distribution

Authors: The Tien Mai, Pierre Alquier

Abstract: Bayesian methods for low-rank matrix completion with noise have been shown to be very efficient computationally. While the behaviour of penalized minimization methods is well understood both from the theoretical and computational points of view in this problem, the theoretical optimality of Bayesian estimators have not been explored yet. In this paper, we propose a Bayesian estimator for matrix completion under general sampling distribution. We also provide an oracle inequality for this estimator. This inequality proves that, whatever the rank of the matrix to be estimated, our estimator reaches the minimax-optimal rate of convergence (up to a logarithmic factor). We end the paper with a short simulation study.

Submitted 21 January, 2015; v1 submitted 25 August, 2014; originally announced August 2014.

Journal ref: Electronic Journal of Statistics 9, pp. 823-841, 2015

arXiv:1407.5715 [pdf, ps, other]

Absence of algebraic relations and of zero divisors under the assumption of finite non-microstates free Fisher information

Authors: Tobias Mai, Roland Speicher, Moritz Weber

Abstract: We show that in a tracial and finitely generated $W^\ast$-probability space existence of conjugate variables in an appropriate sense exclude algebraic relations for the generators. Moreover, under the assumption of finite non-microstates free Fisher information, we prove that there are no zero divisors in the sense that the product of any non-commutative polynomial in the generators with any element from the von Neumann algebra is zero if and only if at least one of those factors is zero.

Submitted 22 February, 2015; v1 submitted 21 July, 2014; originally announced July 2014.

Comments: not intended for publication in this form; an extended version will appear under the title "Absence of algebraic relations and of zero divisors under the assumption of full non-microstates free entropy dimension"

arXiv:1303.3196 [pdf, other]

Analytic subordination theory of operator-valued free additive convolution and the solution of a general random matrix problem

Authors: Serban Belinschi, Tobias Mai, Roland Speicher

Abstract: We develop an analytic theory of operator-valued additive free convolution in terms of subordination functions. In contrast to earlier investigations our functions are not just given by power series expansions, but are defined as Frechet analytic functions in all of the operator upper half plane. Furthermore, we do not have to assume that our state is tracial. Combining this new analytic theory of operator-valued free convolution with Anderson's selfadjoint version of the linearization trick we are able to provide a solution to the following general random matrix problem: How can we calculate the asymptotic eigenvalue distribution of a polynomial evaluated in independent random matrices with known asymptotic eigenvalue distributions?

Submitted 31 August, 2013; v1 submitted 13 March, 2013; originally announced March 2013.

Comments: a few remarks added, references updated

arXiv:1202.2740 [pdf, ps, other]

Operator-valued and multivariate free Berry-Esseen theorems

Authors: Tobias Mai, Roland Speicher

Abstract: We address the question of a Berry-Esseen type theorem for the speed of convergence in a multivariate free central limit theorem. For this, we estimate the difference between the operator-valued Cauchy transforms of the normalized partial sums in an operator-valued free central limit theorem and the Cauchy transform of the limiting operator-valued semicircular element. Since we have to deal with in general non-self-adjoint operators, we introduce the notion of matrix-valued resolvent sets and study the behavior of Cauchy transforms on them.

Submitted 13 February, 2012; originally announced February 2012.

arXiv:1112.1807 [pdf, ps, other]

Analytically weak solutions to SPDEs with unbounded time-dependent differential operators and an application

Authors: Benedict Baur, Martin Grothaus, Tan Thanh Mai

Abstract: We analyze the concepts of analytically weak solutions of stochastic differential equations (SDEs) in Hilbert spaces with time-dependent unbounded operators and give conditions for existence and uniqueness of such solutions. Our studies are motivated by a stochastic partial differential equation (SPDE) arising in industrial mathematics.

Submitted 30 January, 2013; v1 submitted 8 December, 2011; originally announced December 2011.

MSC Class: 65J08; 60H15; 60H05; 58D25

Showing 1–50 of 51 results for author: Mai, T