-
Kernel Expansions for High-Dimensional Mean-Field Control with Non-local Interactions
Authors:
Alexander Vidal,
Samy Wu Fung,
Stanley Osher,
Luis Tenorio,
Levon Nurbekyan
Abstract:
Mean-field control (MFC) problems aim to find the optimal policy to control massive populations of interacting agents. These problems are crucial in areas such as economics, physics, and biology. We consider the non-local setting, where the interactions between agents are governed by a suitable kernel. For $N$ agents, the interaction cost has $\mathcal{O}(N^2)$ complexity, which can be prohibitively slow to evaluate and differentiate when $N$ is large. To this end, we propose an efficient primal-dual algorithm that utilizes basis expansions of the kernels. The basis expansions reduce the cost of computing the interactions, while the primal-dual methodology decouples the agents at the expense of solving for a moderate number of dual variables. We also demonstrate that our approach can further be structured in a multi-resolution manner, where we estimate optimal dual variables using a moderate $N$ and solve decoupled trajectory optimization problems for large $N$. We illustrate the effectiveness of our method on an optimal control problem with 5000 interacting quadrotors.
Submitted 17 May, 2024;
originally announced May 2024.
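The cost reduction described in this abstract can be illustrated with a toy case. The sketch below is not the authors' algorithm: it assumes a one-dimensional polynomial kernel $K(x, y) = (1 + xy)^2$, chosen only because its basis expansion is exact and short, and the names `features`, `interaction_direct`, and `interaction_expanded` are hypothetical. It shows that the averaged pairwise interaction can be computed from $r$ feature averages in $\mathcal{O}(Nr)$ operations instead of $\mathcal{O}(N^2)$.

```python
import math
import random

def features(x):
    # Exact basis expansion of the assumed kernel K(x, y) = (1 + x*y)**2:
    # K(x, y) = phi(x) . phi(y) with phi(x) = (1, sqrt(2)*x, x**2).
    return (1.0, math.sqrt(2.0) * x, x * x)

def interaction_direct(xs):
    # O(N^2): average the kernel over all ordered pairs of agents.
    n = len(xs)
    return sum((1.0 + a * b) ** 2 for a in xs for b in xs) / (n * n)

def interaction_expanded(xs):
    # O(N r): average each basis function once, then take the squared norm.
    n = len(xs)
    means = [sum(col) / n for col in zip(*(features(x) for x in xs))]
    return sum(m * m for m in means)

random.seed(0)
agents = [random.uniform(-1.0, 1.0) for _ in range(200)]
direct = interaction_direct(agents)
expanded = interaction_expanded(agents)
```

Both functions return the same value up to rounding. For kernels without an exact finite expansion, a truncated basis would give an approximation instead.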
-
Expanding Čech cohomology for quantales
Authors:
Ana Luiza Tenório,
Peter Arndt,
Hugo Luiz Mariano
Abstract:
We expand Čech cohomology of a topological space $X$ with values in a presheaf on $X$ to Čech cohomology of a commutative ring with unity $R$ with values in a presheaf on $R$. The strategy is to observe that both the set of open subsets of $X$ and the set of ideals of $R$ provide examples of a (semicartesian) quantale. We study a particular pair of (adjoint) functors $(θ, τ)$ between the quantale of open subsets of $X$ and the quantale of ideals of $C(X)$, the ring of real-valued continuous functions on $X$. This leads to the main result of this paper: the $q$th Čech cohomology group of $X$ with values in the constant sheaf $F$ on $X$ is isomorphic to the $q$th Čech cohomology group of the ring $C(X)$ with values in a sheaf $F \circ τ$ on $C(X)$.
Submitted 19 April, 2024;
originally announced April 2024.
-
Grothendieck prelopologies: towards a closed monoidal sheaf category
Authors:
Ana Luiza Tenório,
Hugo Luiz Mariano
Abstract:
In this paper, we present a generalization of Grothendieck pretopologies -- suited for semicartesian categories with equalizers $C$ -- leading to a closed monoidal category of sheaves, instead of a closed cartesian category. This is proved through a different sheafification process, which is the left adjoint functor of the suitable inclusion functor but does not preserve all finite limits. If the monoidal structure in $C$ is given by the categorical product, all constructions coincide with those for Grothendieck toposes. The motivation for such a generalization stems from a certain notion of sheaves on quantales that does not form a topos.
Submitted 18 April, 2024;
originally announced April 2024.
-
Taming Hyperparameter Tuning in Continuous Normalizing Flows Using the JKO Scheme
Authors:
Alexander Vidal,
Samy Wu Fung,
Luis Tenorio,
Stanley Osher,
Levon Nurbekyan
Abstract:
A normalizing flow (NF) is a mapping that transforms a chosen probability distribution to a normal distribution. Such flows are a common technique used for data generation and density estimation in machine learning and data science. The density estimate obtained with a NF requires a change of variables formula that involves the computation of the Jacobian determinant of the NF transformation. In order to tractably compute this determinant, continuous normalizing flows (CNF) estimate the mapping and its Jacobian determinant using a neural ODE. Optimal transport (OT) theory has been successfully used to assist in finding CNFs by formulating them as OT problems with a soft penalty for enforcing the standard normal distribution as a target measure. A drawback of OT-based CNFs is the addition of a hyperparameter, $α$, that controls the strength of the soft penalty and requires significant tuning. We present JKO-Flow, an algorithm to solve OT-based CNF without the need to tune $α$. This is achieved by integrating the OT CNF framework into a Wasserstein gradient flow framework, also known as the JKO scheme. Instead of tuning $α$, we repeatedly solve the optimization problem for a fixed $α$, effectively performing a JKO update with a time-step $α$. Hence we obtain a "divide and conquer" algorithm by repeatedly solving simpler problems instead of solving a potentially harder problem with large $α$.
Submitted 30 November, 2022;
originally announced November 2022.
-
The Category of Hypergroups as a Hyper (quasi)Abelian Category
Authors:
Kaique Matias de Andrade Roberto,
Ana Luiza Tenório
Abstract:
In this paper we study categorical properties of the category of (abelian) hypergroups that lead to the notion of hyper-quasipreadditive and hyper-quasiabelian categories. Our goal is to create a path towards a general theory of homological algebra for hyperalgebras. This is a first attempt to achieve this goal. We hope to improve the definitions and results, and provide more examples soon.
Submitted 4 May, 2022;
originally announced May 2022.
-
On sheaves on semicartesian quantales and their truth values
Authors:
Ana Luiza Tenório,
Caio de Andrade Mendes,
Hugo Luiz Mariano
Abstract:
In this paper, we introduce a new definition of sheaves on semicartesian quantales, providing first examples and categorical properties. We note that our sheaves are similar to the standard definition of a sheaf on a locale. However, we prove that in general the category of such sheaves is not an elementary topos - since the lattice of external truth values of $Sh(Q)$, $Sub(1)$, is canonically isomorphic to the quantale $Q$ - placing this paper as part of a greater project towards a monoidal (not necessarily cartesian) closed version of elementary topos. To start the study of the logical aspects of the category of sheaves we are introducing, we explore the nature of the "internal truth value objects" in such sheaf categories. More precisely, we analyze two candidates for subobject classifier for different subclasses of commutative and semicartesian quantales.
Submitted 14 October, 2023; v1 submitted 18 April, 2022;
originally announced April 2022.
-
Zeroth order optimization with orthogonal random directions
Authors:
David Kozak,
Cesare Molinari,
Lorenzo Rosasco,
Luis Tenorio,
Silvia Villa
Abstract:
We propose and analyze a randomized zeroth-order approach based on approximating the exact gradient by finite differences computed in a set of orthogonal random directions that changes with each iteration. A number of previously proposed methods are recovered as special cases including spherical smoothing, coordinate descent, as well as discretized gradient descent. Our main contribution is proving convergence guarantees as well as convergence rates under different parameter choices and assumptions. In particular, we consider convex objectives, but also possibly non-convex objectives satisfying the Polyak-Łojasiewicz (PL) condition. Theoretical results are complemented and illustrated by numerical experiments.
Submitted 15 November, 2021; v1 submitted 8 July, 2021;
originally announced July 2021.
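The idea of replacing the gradient by finite differences along orthogonal random directions can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: the orthonormal directions come from Gram-Schmidt applied to Gaussian vectors, forward differences are used, any dimension-dependent scaling is absorbed into the step size, and the names `random_orthonormal`, `zeroth_order_step`, and `quadratic` are hypothetical.

```python
import math
import random

def random_orthonormal(d, ell, rng):
    # Gram-Schmidt on Gaussian vectors: ell orthonormal directions in R^d.
    basis = []
    while len(basis) < ell:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for b in basis:
            dot = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - dot * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-12:
            basis.append([vi / norm for vi in v])
    return basis

def zeroth_order_step(f, x, ell, step, h, rng):
    # Surrogate gradient: sum over directions p of (forward difference) * p.
    fx = f(x)
    g = [0.0] * len(x)
    for p in random_orthonormal(len(x), ell, rng):
        shifted = [xi + h * pi for xi, pi in zip(x, p)]
        slope = (f(shifted) - fx) / h
        g = [gi + slope * pi for gi, pi in zip(g, p)]
    return [xi - step * gi for xi, gi in zip(x, g)]

def quadratic(x):
    # Assumed test objective: f(x) = 0.5 * ||x||^2, minimized at the origin.
    return 0.5 * sum(xi * xi for xi in x)

rng = random.Random(0)
x = [1.0] * 10
for _ in range(200):
    x = zeroth_order_step(quadratic, x, ell=3, step=0.5, h=1e-6, rng=rng)
final_value = quadratic(x)
```

Each step uses only `ell + 1` function evaluations and no gradient. With `ell` equal to the dimension, the surrogate becomes a finite-difference approximation of the full gradient in a rotated basis.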
-
On sheaf cohomology and natural expansions
Authors:
Ana Luiza Tenorio,
Hugo Luiz Mariano
Abstract:
In this survey paper, we present Čech and sheaf cohomologies, themes that were presented by Koszul at the University of São Paulo during his visit in the late 1950s, and we present expansions for categories of generalized sheaves (i.e., Grothendieck toposes), with examples of applications in other cohomology theories and other areas of mathematics, besides providing motivations and historical notes. We conclude by explaining the difficulties in establishing a cohomology theory for elementary toposes, presenting alternative approaches by considering constructions over quantales that provide structures similar to sheaves, and indicating research related to logic: constructive (intuitionistic and linear) logic for toposes, sheaves over quantales, and homological algebra.
Submitted 9 July, 2021; v1 submitted 24 August, 2020;
originally announced August 2020.
-
A stochastic subspace approach to gradient-free optimization in high dimensions
Authors:
David Kozak,
Stephen Becker,
Alireza Doostan,
Luis Tenorio
Abstract:
We present a stochastic descent algorithm for unconstrained optimization that is particularly efficient when the objective function is slow to evaluate and gradients are not easily obtained, as in some PDE-constrained optimization and machine learning problems. The algorithm maps the gradient onto a low-dimensional random subspace of dimension $\ell$ at each iteration, similar to coordinate descent but without restricting directional derivatives to be along the axes. Without requiring a full gradient, this mapping can be performed by computing $\ell$ directional derivatives (e.g., via forward-mode automatic differentiation). We give proofs for convergence in expectation under various convexity assumptions as well as probabilistic convergence results under strong convexity. Our method extends the well-known Gaussian smoothing technique to descent in subspaces of dimension greater than one, opening the doors to new analysis of Gaussian smoothing when more than one directional derivative is used at each iteration. We also provide a finite-dimensional variant of a special case of the Johnson-Lindenstrauss lemma. Experimentally, we show that our method compares favorably to coordinate descent, Gaussian smoothing, gradient descent and BFGS (when gradients are calculated via forward-mode automatic differentiation) on problems from the machine learning and shape optimization literature.
Submitted 6 May, 2021; v1 submitted 4 March, 2020;
originally announced March 2020.
-
Sampled Limited Memory Methods for Massive Linear Inverse Problems
Authors:
Julianne Chung,
Matthias Chung,
J. Tanner Slagel,
Luis Tenorio
Abstract:
In many modern imaging applications the desire to reconstruct high resolution images, coupled with the abundance of data from acquisition using ultra-fast detectors, has led to new challenges in image reconstruction. A main challenge is that the resulting linear inverse problems are massive. The size of the forward model matrix exceeds the storage capabilities of computer memory, or the observational dataset is enormous and not available all at once. Row-action methods that iterate over samples of rows can be used to approximate the solution while avoiding memory and data availability constraints. However, their overall convergence can be slow. In this paper, we introduce a sampled limited memory row-action method for linear least squares problems, where an approximation of the global curvature of the underlying least squares problem is used to speed up the initial convergence and to improve the accuracy of iterates. We show that this limited memory method is a generalization of the damped block Kaczmarz method, and we prove linear convergence of the expectation of the iterates and of the error norm up to a convergence horizon. Numerical experiments demonstrate the benefits of these sampled limited memory row-action methods for massive 2D and 3D inverse problems in tomography applications.
Submitted 17 December, 2019;
originally announced December 2019.
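For readers unfamiliar with row-action methods, the following is a minimal sketch of a damped randomized Kaczmarz iteration with single-row blocks, the simplest member of the family that the abstract says the proposed method generalizes. It is not the sampled limited memory method itself. The function name `damped_kaczmarz`, the damping value, the uniform row sampling, and the small consistent test system are all assumptions made for illustration.

```python
import random

def damped_kaczmarz(rows, rhs, omega=0.8, iters=3000, seed=0):
    # Touch one row per iteration, so the full matrix never has to be
    # processed at once. omega in (0, 2) is the damping parameter.
    rng = random.Random(seed)
    x = [0.0] * len(rows[0])
    for _ in range(iters):
        i = rng.randrange(len(rows))  # uniform row sampling (assumed)
        a = rows[i]
        residual = rhs[i] - sum(aj * xj for aj, xj in zip(a, x))
        scale = omega * residual / sum(aj * aj for aj in a)
        # Move x part of the way towards the hyperplane a . x = rhs[i].
        x = [xj + scale * aj for xj, aj in zip(x, a)]
    return x

# Small consistent system A x = b with a known solution.
rng = random.Random(1)
A = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(20)]
x_true = [1.0, -2.0, 0.5]
b = [sum(aj * xj for aj, xj in zip(row, x_true)) for row in A]
x_hat = damped_kaczmarz(A, b)
```

On a consistent system like this one the iterates approach `x_true`. With noisy data, a fixed damping parameter typically leaves the iterates fluctuating around the least squares solution, which is the kind of "convergence horizon" the abstract refers to.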
-
Stochastic Subspace Descent
Authors:
David Kozak,
Stephen Becker,
Alireza Doostan,
Luis Tenorio
Abstract:
We present two stochastic descent algorithms that apply to unconstrained optimization and are particularly efficient when the objective function is slow to evaluate and gradients are not easily obtained, as in some PDE-constrained optimization and machine learning problems. The basic algorithm projects the gradient onto a random subspace at each iteration, similar to coordinate descent but without restricting directional derivatives to be along the axes. This algorithm was previously known, but we provide new analysis. We also extend the popular SVRG method to this framework but without requiring that the objective function be written as a finite sum. We provide proofs of convergence for our methods under various convexity assumptions and show favorable results when compared to gradient descent and BFGS on non-convex problems from the machine learning and shape optimization literature. We also note that our analysis gives a proof that the iterates of SVRG and several other popular first-order stochastic methods, in their original formulation, converge almost surely to the optimum; to our knowledge, prior to this work the iterates of SVRG had only been known to converge in expectation.
Submitted 29 April, 2019; v1 submitted 1 April, 2019;
originally announced April 2019.
-
Sampled Tikhonov Regularization for Large Linear Inverse Problems
Authors:
J. Tanner Slagel,
Julianne Chung,
Matthias Chung,
David Kozak,
Luis Tenorio
Abstract:
In this paper, we investigate iterative methods that are based on sampling of the data for computing Tikhonov-regularized solutions. We focus on very large inverse problems where access to the entire data set is not possible all at once (e.g., for problems with streaming or massive datasets). Row-access methods provide an ideal framework for solving such problems, since they only require access to "blocks" of the data at any given time. However, when using these iterative sampling methods to solve inverse problems, the main challenges include a proper choice of the regularization parameter, appropriate sampling strategies, and a convergence analysis. To address these challenges, we first describe a family of sampled iterative methods that can incorporate data as they become available (e.g., randomly sampled). We consider two sampled iterative methods, where the iterates can be characterized as solutions to a sequence of approximate Tikhonov problems. The first method requires the regularization parameter to be fixed a priori and converges asymptotically to an unregularized solution for randomly sampled data. This is undesirable for inverse problems. Thus, we focus on the second method where the main benefits are that the regularization parameter can be updated during the iterative process and the iterates converge asymptotically to a Tikhonov-regularized solution. We describe adaptive approaches to update the regularization parameter that are based on sampled residuals, and we describe a limited-memory variant for larger problems. Numerical examples, including a large-scale super-resolution imaging example, demonstrate the potential for these methods.
Submitted 14 December, 2018;
originally announced December 2018.
-
Stochastic Newton and Quasi-Newton Methods for Large Linear Least-squares Problems
Authors:
Julianne Chung,
Matthias Chung,
J. Tanner Slagel,
Luis Tenorio
Abstract:
We describe stochastic Newton and stochastic quasi-Newton approaches to efficiently solve large linear least-squares problems where the very large data sets present a significant computational burden (e.g., the size may exceed computer memory or data are collected in real-time). In our proposed framework, stochasticity is introduced in two different frameworks as a means to overcome these computational limitations, and probability distributions that can exploit structure and/or sparsity are considered. Theoretical results on consistency of the approximations for both the stochastic Newton and the stochastic quasi-Newton methods are provided. The results show, in particular, that stochastic Newton iterates, in contrast to stochastic quasi-Newton iterates, may not converge to the desired least-squares solution. Numerical examples, including an example from extreme learning machines, demonstrate the potential applications of these methods.
Submitted 23 February, 2017;
originally announced February 2017.
-
Goal-oriented optimal approximations of Bayesian linear inverse problems
Authors:
Alessio Spantini,
Tiangang Cui,
Karen Willcox,
Luis Tenorio,
Youssef Marzouk
Abstract:
We propose optimal dimensionality reduction techniques for the solution of goal-oriented linear-Gaussian inverse problems, where the quantity of interest (QoI) is a function of the inversion parameters. These approximations are suitable for large-scale applications. In particular, we study the approximation of the posterior covariance of the QoI as a low-rank negative update of its prior covariance, and prove optimality of this update with respect to the natural geodesic distance on the manifold of symmetric positive definite matrices. Assuming exact knowledge of the posterior mean of the QoI, the optimality results extend to optimality in distribution with respect to the Kullback-Leibler divergence and the Hellinger distance between the associated distributions. We also propose approximation of the posterior mean of the QoI as a low-rank linear function of the data, and prove optimality of this approximation with respect to a weighted Bayes risk. Both of these optimal approximations avoid the explicit computation of the full posterior distribution of the parameters and instead focus on directions that are well informed by the data and relevant to the QoI. These directions stem from a balance among all the components of the goal-oriented inverse problem: prior information, forward model, measurement noise, and ultimate goals. We illustrate the theory using a high-dimensional inverse problem in heat transfer.
Submitted 14 March, 2017; v1 submitted 7 July, 2016;
originally announced July 2016.
-
Optimal low-rank approximations of Bayesian linear inverse problems
Authors:
Alessio Spantini,
Antti Solonen,
Tiangang Cui,
James Martin,
Luis Tenorio,
Youssef Marzouk
Abstract:
In the Bayesian approach to inverse problems, data are often informative, relative to the prior, only on a low-dimensional subspace of the parameter space. Significant computational savings can be achieved by using this subspace to characterize and approximate the posterior distribution of the parameters. We first investigate approximation of the posterior covariance matrix as a low-rank update of the prior covariance matrix. We prove optimality of a particular update, based on the leading eigendirections of the matrix pencil defined by the Hessian of the negative log-likelihood and the prior precision, for a broad class of loss functions. This class includes the Förstner metric for symmetric positive definite matrices, as well as the Kullback-Leibler divergence and the Hellinger distance between the associated distributions. We also propose two fast approximations of the posterior mean and prove their optimality with respect to a weighted Bayes risk under squared-error loss. These approximations are deployed in an offline-online manner, where a more costly but data-independent offline calculation is followed by fast online evaluations. As a result, these approximations are particularly useful when repeated posterior mean evaluations are required for multiple data sets. We demonstrate our theoretical results with several numerical examples, including high-dimensional X-ray tomography and an inverse heat conduction problem. In both of these examples, the intrinsic low-dimensional structure of the inference problem can be exploited while producing results that are essentially indistinguishable from solutions computed in the full space.
Submitted 6 July, 2015; v1 submitted 13 July, 2014;
originally announced July 2014.