Search | arXiv e-print repository

Kernel Expansions for High-Dimensional Mean-Field Control with Non-local Interactions

Authors: Alexander Vidal, Samy Wu Fung, Stanley Osher, Luis Tenorio, Levon Nurbekyan

Abstract: Mean-field control (MFC) problems aim to find the optimal policy to control massive populations of interacting agents. These problems are crucial in areas such as economics, physics, and biology. We consider the non-local setting, where the interactions between agents are governed by a suitable kernel. For $N$ agents, the interaction cost has $\mathcal{O}(N^2)$ complexity, which can be prohibitively slow to evaluate and differentiate when $N$ is large. To this end, we propose an efficient primal-dual algorithm that utilizes basis expansions of the kernels. The basis expansions reduce the cost of computing the interactions, while the primal-dual methodology decouples the agents at the expense of solving for a moderate number of dual variables. We also demonstrate that our approach can further be structured in a multi-resolution manner, where we estimate optimal dual variables using a moderate $N$ and solve decoupled trajectory optimization problems for large $N$. We illustrate the effectiveness of our method on an optimal control of 5000 interacting quadrotors.

Submitted 17 May, 2024; originally announced May 2024.
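
As a rough illustration of why a basis expansion helps with the $\mathcal{O}(N^2)$ interaction cost described in the abstract above, here is a minimal sketch. It is not the authors' algorithm: it assumes a toy one-dimensional kernel $K(x, y) = \cos(x - y)$, chosen because it has the exact two-term expansion $\cos x \cos y + \sin x \sin y$, and the function names are hypothetical.

```python
import numpy as np

def interaction_naive(x):
    """Sum K(x_i, x_j) over all pairs with K(x, y) = cos(x - y): O(N^2) work."""
    diff = x[:, None] - x[None, :]
    return float(np.cos(diff).sum())

def interaction_expanded(x):
    """Same sum via the basis functions (cos, sin): O(N r) work with r = 2."""
    features = np.stack([np.cos(x), np.sin(x)])  # shape (2, N)
    coeffs = features.sum(axis=1)                # one coefficient per basis function
    return float(coeffs @ coeffs)

rng = np.random.default_rng(0)
x = rng.uniform(-np.pi, np.pi, size=2000)        # toy agent positions (assumed)
naive = interaction_naive(x)
fast = interaction_expanded(x)
print(abs(naive - fast))
```

The expanded version touches each agent once per basis function, so its cost grows linearly in $N$ for a fixed number of basis functions.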

arXiv:2404.12884 [pdf, ps, other]

Expanding Čech cohomology for quantales

Authors: Ana Luiza Tenório, Peter Arndt, Hugo Luiz Mariano

Abstract: We expand Čech cohomology of a topological space $X$ with values in a presheaf on $X$ to Čech cohomology of a commutative ring with unity $R$ with values in a presheaf on $R$. The strategy is to observe that both the set of open subsets of $X$ and the set of ideals of $R$ provide examples of a (semicartesian) quantale. We study a particular pair of (adjoint) functors $(θ, τ)$ between the quantale of open subsets of $X$ and the quantale of ideals of $C(X)$, the ring of real-valued continuous functions on $X$. This leads to the main result of this paper: the $q$th Čech cohomology group of $X$ with values in the constant sheaf $F$ on $X$ is isomorphic to the $q$th Čech cohomology group of the ring $C(X)$ with values in a sheaf $F \circ τ$ on $C(X)$.

Submitted 19 April, 2024; originally announced April 2024.

Comments: 26 pages

MSC Class: 18F75 55N30 13D03

arXiv:2404.12313 [pdf, ps, other]

Grothendieck prelopologies: towards a closed monoidal sheaf category

Authors: Ana Luiza Tenório, Hugo Luiz Mariano

Abstract: In this paper, we present a generalization of Grothendieck pretopologies -- suited for semicartesian categories with equalizers $C$ -- leading to a closed monoidal category of sheaves, instead of a closed cartesian category. This is proved through a different sheafification process, which is the left adjoint functor of the suitable inclusion functor but does not preserve all finite limits. If the monoidal structure in $C$ is given by the categorical product, all constructions coincide with those for Grothendieck toposes. The motivation for such a generalization stems from a certain notion of sheaves on quantales that does not form a topos.

Submitted 18 April, 2024; originally announced April 2024.

Comments: 35 pages

MSC Class: 18F10; 06F07; 18A40

arXiv:2211.16757 [pdf, other]

Taming Hyperparameter Tuning in Continuous Normalizing Flows Using the JKO Scheme

Authors: Alexander Vidal, Samy Wu Fung, Luis Tenorio, Stanley Osher, Levon Nurbekyan

Abstract: A normalizing flow (NF) is a mapping that transforms a chosen probability distribution to a normal distribution. Such flows are a common technique used for data generation and density estimation in machine learning and data science. The density estimate obtained with a NF requires a change of variables formula that involves the computation of the Jacobian determinant of the NF transformation. In order to tractably compute this determinant, continuous normalizing flows (CNF) estimate the mapping and its Jacobian determinant using a neural ODE. Optimal transport (OT) theory has been successfully used to assist in finding CNFs by formulating them as OT problems with a soft penalty for enforcing the standard normal distribution as a target measure. A drawback of OT-based CNFs is the addition of a hyperparameter, $α$, that controls the strength of the soft penalty and requires significant tuning. We present JKO-Flow, an algorithm to solve OT-based CNF without the need of tuning $α$. This is achieved by integrating the OT CNF framework into a Wasserstein gradient flow framework, also known as the JKO scheme. Instead of tuning $α$, we repeatedly solve the optimization problem for a fixed $α$, effectively performing a JKO update with a time-step $α$. Hence we obtain a "divide and conquer" algorithm by repeatedly solving simpler problems instead of solving a potentially harder problem with large $α$.

Submitted 30 November, 2022; originally announced November 2022.

arXiv:2205.02362 [pdf, ps, other]

The Category of Hypergroups as a Hyper (quasi)Abelian Category

Authors: Kaique Matias de Andrade Roberto, Ana Luiza Tenório

Abstract: In this paper we study categorical properties of the category of (abelian) hypergroups that lead to the notion of hyper-quasipreadditive and hyper-quasiabelian categories. Our goal is to create a path towards a general theory of homological algebra for hyperalgebras. This is a first attempt to achieve this goal. We hope to improve the definitions and results, and provide more examples soon.

Submitted 4 May, 2022; originally announced May 2022.

Comments: 20 pages

MSC Class: 20N20 (Primary) 18E99 (Secondary)

arXiv:2204.08351 [pdf, ps, other]

On sheaves on semicartesian quantales and their truth values

Authors: Ana Luiza Tenório, Caio de Andrade Mendes, Hugo Luiz Mariano

Abstract: In this paper, we introduce a new definition of sheaves on semicartesian quantales, providing first examples and categorical properties. We note that our sheaves are similar to the standard definition of a sheaf on a locale; however, we prove that in general the resulting category is not an elementary topos, since the lattice of external truth values of $Sh(Q)$, $Sub(1)$, is canonically isomorphic to the quantale $Q$. This places the paper within a greater project towards a monoidal (not necessarily cartesian) closed version of elementary toposes. To start the study of the logical aspects of the category of sheaves we are introducing, we explore the nature of the "internal truth value objects" in such sheaf categories. More precisely, we analyze two candidates for subobject classifier for different subclasses of commutative and semicartesian quantales.

Submitted 14 October, 2023; v1 submitted 18 April, 2022; originally announced April 2022.

Comments: 43 pages. We have updated the paper to show that our sheaves on quantales do not form a topos and replaced a section about change of base by a section discussing a subobject classifier for $Sh(Q)$. The discussion about change of base will appear elsewhere with more details

MSC Class: 18F20 06F07

arXiv:2107.03941 [pdf, other]

Zeroth order optimization with orthogonal random directions

Authors: David Kozak, Cesare Molinari, Lorenzo Rosasco, Luis Tenorio, Silvia Villa

Abstract: We propose and analyze a randomized zeroth-order approach based on approximating the exact gradient by finite differences computed in a set of orthogonal random directions that changes with each iteration. A number of previously proposed methods are recovered as special cases including spherical smoothing, coordinate descent, as well as discretized gradient descent. Our main contribution is proving convergence guarantees as well as convergence rates under different parameter choices and assumptions. In particular, we consider convex objectives, but also possibly non-convex objectives satisfying the Polyak-Łojasiewicz (PL) condition. Theoretical results are complemented and illustrated by numerical experiments.

Submitted 15 November, 2021; v1 submitted 8 July, 2021; originally announced July 2021.
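
The idea in the abstract above, finite differences along a fresh set of orthogonal random directions at every iteration, can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's method as published: the forward-difference step `h`, the step size, the lack of any rescaling of the estimate, and all function names are illustrative choices.

```python
import numpy as np

def orthogonal_directions(dim, num_dirs, rng):
    """Draw num_dirs orthonormal directions as columns of a (dim, num_dirs) matrix."""
    gaussian = rng.standard_normal((dim, num_dirs))
    q, _ = np.linalg.qr(gaussian)  # reduced QR: q has orthonormal columns
    return q

def zeroth_order_gradient(f, x, directions, h=1e-6):
    """Forward finite differences along each direction, recombined into a vector."""
    fx = f(x)
    slopes = np.array([(f(x + h * p) - fx) / h for p in directions.T])
    return directions @ slopes

def minimize(f, x0, num_dirs, step=0.5, iters=200, seed=0):
    """Descent loop that only ever evaluates f, never its gradient."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        p = orthogonal_directions(x.size, num_dirs, rng)  # fresh directions each iteration
        x = x - step * zeroth_order_gradient(f, x, p)
    return x

def f(z):
    """Toy convex objective (assumed for illustration)."""
    return 0.5 * np.sum(z ** 2)

x_final = minimize(f, np.ones(10), num_dirs=3)
print(f(x_final))
```

With `num_dirs` equal to the dimension, the estimate reduces to a finite-difference approximation of the full gradient; with a single direction it behaves like a random-direction search.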

arXiv:2008.10677 [pdf, ps, other]

doi 10.1007/s40863-021-00246-z

On sheaf cohomology and natural expansions

Authors: Ana Luiza Tenorio, Hugo Luiz Mariano

Abstract: In this survey paper, we present Čech and sheaf cohomologies -- themes that were presented by Koszul at the University of São Paulo during his visit in the late 1950s -- and we present expansions for categories of generalized sheaves (i.e., Grothendieck toposes), with examples of applications in other cohomology theories and other areas of mathematics, besides providing motivations and historical notes. We conclude by explaining the difficulties in establishing a cohomology theory for elementary toposes, presenting alternative approaches by considering constructions over quantales that provide structures similar to sheaves, and indicating research directions related to logic: constructive (intuitionistic and linear) logic for toposes, sheaves over quantales, and homological algebra.

Submitted 9 July, 2021; v1 submitted 24 August, 2020; originally announced August 2020.

Comments: We corrected typos, improved the writing style, and provided additional comments. São Paulo J. Math. Sci. (2021)

MSC Class: 2020: 18F10 and 55N30 and 18B25

arXiv:2003.02684 [pdf, other]

doi 10.1007/s10589-021-00271-w

A stochastic subspace approach to gradient-free optimization in high dimensions

Authors: David Kozak, Stephen Becker, Alireza Doostan, Luis Tenorio

Abstract: We present a stochastic descent algorithm for unconstrained optimization that is particularly efficient when the objective function is slow to evaluate and gradients are not easily obtained, as in some PDE-constrained optimization and machine learning problems. The algorithm maps the gradient onto a low-dimensional random subspace of dimension $\ell$ at each iteration, similar to coordinate descent but without restricting directional derivatives to be along the axes. Without requiring a full gradient, this mapping can be performed by computing $\ell$ directional derivatives (e.g., via forward-mode automatic differentiation). We give proofs for convergence in expectation under various convexity assumptions as well as probabilistic convergence results under strong-convexity. Our method extends the well-known Gaussian smoothing technique to descent in subspaces of dimension greater than one, opening the doors to new analysis of Gaussian smoothing when more than one directional derivative is used at each iteration. We also provide a finite-dimensional variant of a special case of the Johnson-Lindenstrauss lemma. Experimentally, we show that our method compares favorably to coordinate descent, Gaussian smoothing, gradient descent and BFGS (when gradients are calculated via forward-mode automatic differentiation) on problems from the machine learning and shape optimization literature.

Submitted 6 May, 2021; v1 submitted 4 March, 2020; originally announced March 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:1904.01145

MSC Class: 90C06; 93B40; 65K10

Journal ref: Comput Optim Appl 79, 339-368 (2021)

arXiv:1912.07962 [pdf, other]

doi 10.1088/1361-6420/ab77da

Sampled Limited Memory Methods for Massive Linear Inverse Problems

Authors: Julianne Chung, Matthias Chung, J. Tanner Slagel, Luis Tenorio

Abstract: In many modern imaging applications the desire to reconstruct high resolution images, coupled with the abundance of data from acquisition using ultra-fast detectors, has led to new challenges in image reconstruction. A main challenge is that the resulting linear inverse problems are massive. The size of the forward model matrix exceeds the storage capabilities of computer memory, or the observational dataset is enormous and not available all at once. Row-action methods that iterate over samples of rows can be used to approximate the solution while avoiding memory and data availability constraints. However, their overall convergence can be slow. In this paper, we introduce a sampled limited memory row-action method for linear least squares problems, where an approximation of the global curvature of the underlying least squares problem is used to speed up the initial convergence and to improve the accuracy of iterates. We show that this limited memory method is a generalization of the damped block Kaczmarz method, and we prove linear convergence of the expectation of the iterates and of the error norm up to a convergence horizon. Numerical experiments demonstrate the benefits of these sampled limited memory row-action methods for massive 2D and 3D inverse problems in tomography applications.

Submitted 17 December, 2019; originally announced December 2019.

Comments: 25 pages, 11 figures
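
For readers unfamiliar with the row-action baseline named in the abstract above, the following is a minimal sketch of a plain damped block Kaczmarz iteration. It is not the sampled limited memory method the paper introduces: the uniform block sampling, the block size, the damping value, and the function name are assumptions made for illustration.

```python
import numpy as np

def damped_block_kaczmarz(A, b, block_size, damping=1.0, iters=500, seed=0):
    """Row-action iteration: x <- x + damping * pinv(A_blk) @ (b_blk - A_blk @ x)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(iters):
        rows = rng.choice(m, size=block_size, replace=False)  # sample a block of rows
        A_blk, b_blk = A[rows], b[rows]
        residual = b_blk - A_blk @ x
        # Minimum-norm least-squares correction computed from the sampled block only
        correction, *_ = np.linalg.lstsq(A_blk, residual, rcond=None)
        x = x + damping * correction
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 5))
x_true = rng.standard_normal(5)
b = A @ x_true  # consistent toy system, so the iterates can reach x_true
x_hat = damped_block_kaczmarz(A, b, block_size=2)
print(np.linalg.norm(x_hat - x_true))
```

Each iteration reads only `block_size` rows of `A`, which is what makes this family of methods attractive when the full matrix does not fit in memory.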

arXiv:1904.01145 [pdf, other]

Stochastic Subspace Descent

Authors: David Kozak, Stephen Becker, Alireza Doostan, Luis Tenorio

Abstract: We present two stochastic descent algorithms that apply to unconstrained optimization and are particularly efficient when the objective function is slow to evaluate and gradients are not easily obtained, as in some PDE-constrained optimization and machine learning problems. The basic algorithm projects the gradient onto a random subspace at each iteration, similar to coordinate descent but without restricting directional derivatives to be along the axes. This algorithm is previously known but we provide new analysis. We also extend the popular SVRG method to this framework but without requiring that the objective function be written as a finite sum. We provide proofs of convergence for our methods under various convexity assumptions and show favorable results when compared to gradient descent and BFGS on non-convex problems from the machine learning and shape optimization literature. We also note that our analysis gives a proof that the iterates of SVRG and several other popular first-order stochastic methods, in their original formulation, converge almost surely to the optimum; to our knowledge, prior to this work the iterates of SVRG had only been known to converge in expectation.

Submitted 29 April, 2019; v1 submitted 1 April, 2019; originally announced April 2019.

Comments: 34 pages, 7 figures, submitted on 4/1/19 Update: Main document: 24 Pages, Supplementary Material 9 pages

arXiv:1812.06165 [pdf, other]

Sampled Tikhonov Regularization for Large Linear Inverse Problems

Authors: J. Tanner Slagel, Julianne Chung, Matthias Chung, David Kozak, Luis Tenorio

Abstract: In this paper, we investigate iterative methods that are based on sampling of the data for computing Tikhonov-regularized solutions. We focus on very large inverse problems where access to the entire data set is not possible all at once (e.g., for problems with streaming or massive datasets). Row-access methods provide an ideal framework for solving such problems, since they only require access to "blocks" of the data at any given time. However, when using these iterative sampling methods to solve inverse problems, the main challenges include a proper choice of the regularization parameter, appropriate sampling strategies, and a convergence analysis. To address these challenges, we first describe a family of sampled iterative methods that can incorporate data as they become available (e.g., randomly sampled). We consider two sampled iterative methods, where the iterates can be characterized as solutions to a sequence of approximate Tikhonov problems. The first method requires the regularization parameter to be fixed a priori and converges asymptotically to an unregularized solution for randomly sampled data. This is undesirable for inverse problems. Thus, we focus on the second method where the main benefits are that the regularization parameter can be updated during the iterative process and the iterates converge asymptotically to a Tikhonov-regularized solution. We describe adaptive approaches to update the regularization parameter that are based on sampled residuals, and we describe a limited-memory variant for larger problems. Numerical examples, including a large-scale super-resolution imaging example, demonstrate the potential for these methods.

Submitted 14 December, 2018; originally announced December 2018.

MSC Class: 65F22; 65F10; 15A29

arXiv:1702.07367 [pdf, other]

Stochastic Newton and Quasi-Newton Methods for Large Linear Least-squares Problems

Authors: Julianne Chung, Matthias Chung, J. Tanner Slagel, Luis Tenorio

Abstract: We describe stochastic Newton and stochastic quasi-Newton approaches to efficiently solve large linear least-squares problems where the very large data sets present a significant computational burden (e.g., the size may exceed computer memory or data are collected in real-time). In our proposed framework, stochasticity is introduced in two different frameworks as a means to overcome these computational limitations, and probability distributions that can exploit structure and/or sparsity are considered. Theoretical results on consistency of the approximations for both the stochastic Newton and the stochastic quasi-Newton methods are provided. The results show, in particular, that stochastic Newton iterates, in contrast to stochastic quasi-Newton iterates, may not converge to the desired least-squares solution. Numerical examples, including an example from extreme learning machines, demonstrate the potential applications of these methods.

Submitted 23 February, 2017; originally announced February 2017.

arXiv:1607.01881 [pdf, ps, other]

Goal-oriented optimal approximations of Bayesian linear inverse problems

Authors: Alessio Spantini, Tiangang Cui, Karen Willcox, Luis Tenorio, Youssef Marzouk

Abstract: We propose optimal dimensionality reduction techniques for the solution of goal-oriented linear-Gaussian inverse problems, where the quantity of interest (QoI) is a function of the inversion parameters. These approximations are suitable for large-scale applications. In particular, we study the approximation of the posterior covariance of the QoI as a low-rank negative update of its prior covariance, and prove optimality of this update with respect to the natural geodesic distance on the manifold of symmetric positive definite matrices. Assuming exact knowledge of the posterior mean of the QoI, the optimality results extend to optimality in distribution with respect to the Kullback-Leibler divergence and the Hellinger distance between the associated distributions. We also propose approximation of the posterior mean of the QoI as a low-rank linear function of the data, and prove optimality of this approximation with respect to a weighted Bayes risk. Both of these optimal approximations avoid the explicit computation of the full posterior distribution of the parameters and instead focus on directions that are well informed by the data and relevant to the QoI. These directions stem from a balance among all the components of the goal-oriented inverse problem: prior information, forward model, measurement noise, and ultimate goals. We illustrate the theory using a high-dimensional inverse problem in heat transfer.

Submitted 14 March, 2017; v1 submitted 7 July, 2016; originally announced July 2016.

arXiv:1407.3463 [pdf, ps, other]

Optimal low-rank approximations of Bayesian linear inverse problems

Authors: Alessio Spantini, Antti Solonen, Tiangang Cui, James Martin, Luis Tenorio, Youssef Marzouk

Abstract: In the Bayesian approach to inverse problems, data are often informative, relative to the prior, only on a low-dimensional subspace of the parameter space. Significant computational savings can be achieved by using this subspace to characterize and approximate the posterior distribution of the parameters. We first investigate approximation of the posterior covariance matrix as a low-rank update of the prior covariance matrix. We prove optimality of a particular update, based on the leading eigendirections of the matrix pencil defined by the Hessian of the negative log-likelihood and the prior precision, for a broad class of loss functions. This class includes the Förstner metric for symmetric positive definite matrices, as well as the Kullback-Leibler divergence and the Hellinger distance between the associated distributions. We also propose two fast approximations of the posterior mean and prove their optimality with respect to a weighted Bayes risk under squared-error loss. These approximations are deployed in an offline-online manner, where a more costly but data-independent offline calculation is followed by fast online evaluations. As a result, these approximations are particularly useful when repeated posterior mean evaluations are required for multiple data sets. We demonstrate our theoretical results with several numerical examples, including high-dimensional X-ray tomography and an inverse heat conduction problem. In both of these examples, the intrinsic low-dimensional structure of the inference problem can be exploited while producing results that are essentially indistinguishable from solutions computed in the full space.

Submitted 6 July, 2015; v1 submitted 13 July, 2014; originally announced July 2014.
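
The low-rank update described in the abstract above can be written out in the linear-Gaussian case. The notation here is assumed for illustration and does not come from the listing: $H$ is the Hessian of the negative log-likelihood, $\Gamma_{\mathrm{pr}}$ the prior covariance, and $(\delta_i^2, w_i)$ the eigenpairs of the pencil $(H, \Gamma_{\mathrm{pr}}^{-1})$, ordered by decreasing $\delta_i^2$ and normalized so that $w_i^\top \Gamma_{\mathrm{pr}}^{-1} w_i = 1$. The first line is an exact identity; the second is the rank-$r$ truncation, which is how we read the abstract's "particular update".

```latex
\Gamma_{\mathrm{pos}}
  = \bigl(H + \Gamma_{\mathrm{pr}}^{-1}\bigr)^{-1}
  = \Gamma_{\mathrm{pr}} - \sum_{i=1}^{n} \frac{\delta_i^2}{1 + \delta_i^2}\, w_i w_i^\top,
\qquad H w_i = \delta_i^2\, \Gamma_{\mathrm{pr}}^{-1} w_i,

\widehat{\Gamma}_{\mathrm{pos}}
  = \Gamma_{\mathrm{pr}} - \sum_{i=1}^{r} \frac{\delta_i^2}{1 + \delta_i^2}\, w_i w_i^\top,
\qquad r \ll n.
```

Terms with $\delta_i^2 \ll 1$ contribute little to the sum, which matches the abstract's point that the data are informative relative to the prior only on a low-dimensional subspace.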

Showing 1–15 of 15 results for author: Tenorio, L