-
Compact Model Parameter Extraction via Derivative-Free Optimization
Authors:
Rafael Perez Martinez,
Masaya Iwamoto,
Kelly Woo,
Zhengliang Bian,
Roberto Tinti,
Stephen Boyd,
Srabanti Chowdhury
Abstract:
In this paper, we address the problem of compact model parameter extraction to simultaneously extract tens of parameters via derivative-free optimization. Traditionally, parameter extraction is performed manually by dividing the complete set of parameters into smaller subsets, each targeting different operational regions of the device, a process that can take several days or even weeks. Our approach streamlines this process by employing derivative-free optimization to identify a good parameter set that best fits the compact model without performing an exhaustive number of simulations. We further enhance the optimization process to address critical issues in device modeling by carefully choosing a loss function that evaluates model performance consistently across varying magnitudes by focusing on relative errors (as opposed to absolute errors), prioritizing accuracy in key operational regions of the device above a certain threshold, and reducing sensitivity to outliers. Furthermore, we utilize the concept of train-test split to assess the model fit and avoid overfitting. This is done by fitting 80% of the data and testing the model efficacy with the remaining 20%. We demonstrate the effectiveness of our methodology by successfully modeling two semiconductor devices: a diamond Schottky diode and a GaN-on-SiC HEMT, with the latter involving the ASM-HEMT DC model, which requires simultaneously extracting 35 model parameters to fit the model to the measured data. These examples demonstrate the effectiveness of our approach and showcase the practical benefits of derivative-free optimization in device modeling.
Submitted 24 June, 2024;
originally announced June 2024.
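The abstract above names three properties of the loss (relative rather than absolute errors, a magnitude threshold, reduced outlier sensitivity) and an 80/20 train-test split, but does not give formulas. The following is a minimal sketch of one way to combine those ideas, not the authors' loss: the Huber penalty, the function names, and the `threshold` and `delta` parameters are all assumptions made for illustration.

```python
import numpy as np

def relative_huber_loss(measured, simulated, threshold=1e-9, delta=0.5):
    """Hypothetical loss: Huber penalty applied to relative errors.

    Points whose measured magnitude is below `threshold` are ignored.
    """
    measured = np.asarray(measured, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    mask = np.abs(measured) >= threshold          # keep only points above the threshold
    rel = (simulated[mask] - measured[mask]) / np.abs(measured[mask])
    a = np.abs(rel)
    # Quadratic for small relative errors, linear for large ones (less outlier-sensitive).
    huber = np.where(a <= delta, 0.5 * a**2, delta * (a - 0.5 * delta))
    return float(huber.mean()) if huber.size else 0.0

def train_test_split(n, train_frac=0.8, seed=0):
    """Return disjoint index arrays for fitting (train) and checking the fit (test)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_train = int(round(train_frac * n))
    return perm[:n_train], perm[n_train:]
```

Because the penalty acts on relative errors, a 10% miss on a 1 A measurement and a 10% miss on a 1 mA measurement contribute equally. A derivative-free optimizer would minimize this loss over the training indices only, and the test indices would be used afterwards to check for overfitting.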
-
Approximate Sequential Optimization for Informative Path Planning
Authors:
Joshua Ott,
Mykel J. Kochenderfer,
Stephen Boyd
Abstract:
We consider the problem of finding an informative path through a graph, given initial and terminal nodes and a given maximum path length. We assume that a linear noise corrupted measurement is taken at each node of an underlying unknown vector that we wish to estimate. The informativeness is measured by the reduction in uncertainty in our estimate, evaluated using several metrics. We present a convex relaxation for this informative path planning problem, which we can readily solve to obtain a bound on the possible performance. We develop an approximate sequential method where the path is constructed segment by segment through dynamic programming. This involves solving an orienteering problem, with the node reward acting as a surrogate for informativeness, taking the first step, and then repeating the process. The method scales to very large problem instances and achieves performance not too far from the bound produced by the convex relaxation. We also demonstrate our method's ability to handle adaptive objectives, multimodal sensing, and multi-agent variations of the informative path planning problem.
Submitted 13 February, 2024;
originally announced February 2024.
-
Finding Moving-Band Statistical Arbitrages via Convex-Concave Optimization
Authors:
Kasper Johansson,
Thomas Schmelzer,
Stephen Boyd
Abstract:
We propose a new method for finding statistical arbitrages that can contain more assets than just the traditional pair. We formulate the problem as seeking a portfolio with the highest volatility, subject to its price remaining in a band and a leverage limit. This optimization problem is not convex, but can be approximately solved using the convex-concave procedure, a specific sequential convex programming method. We show how the method generalizes to finding moving-band statistical arbitrages, where the price band midpoint varies over time.
Submitted 12 February, 2024;
originally announced February 2024.
-
Factor Fitting, Rank Allocation, and Partitioning in Multilevel Low Rank Matrices
Authors:
Tetiana Parshakova,
Trevor Hastie,
Eric Darve,
Stephen Boyd
Abstract:
We consider multilevel low rank (MLR) matrices, defined as a row and column permutation of a sum of matrices, each one a block diagonal refinement of the previous one, with all blocks low rank given in factored form. MLR matrices extend low rank matrices but share many of their properties, such as the total storage required and complexity of matrix-vector multiplication. We address three problems that arise in fitting a given matrix by an MLR matrix in the Frobenius norm. The first problem is factor fitting, where we adjust the factors of the MLR matrix. The second is rank allocation, where we choose the ranks of the blocks in each level, subject to the total rank having a given value, which preserves the total storage needed for the MLR matrix. The final problem is to choose the hierarchical partition of rows and columns, along with the ranks and factors. This paper is accompanied by an open source package that implements the proposed methods.
Submitted 29 October, 2023;
originally announced October 2023.
-
PV Fleet Modeling via Smooth Periodic Gaussian Copula
Authors:
Mehmet G. Ogut,
Bennet Meyers,
Stephen P. Boyd
Abstract:
We present a method for jointly modeling power generation from a fleet of photovoltaic (PV) systems. We propose a white-box method that finds a function that invertibly maps vector time-series data to independent and identically distributed standard normal variables. The proposed method, based on a novel approach for fitting a smooth, periodic copula transform to data, captures many aspects of the data such as diurnal variation in the distribution of power output, dependencies among different PV systems, and dependencies across time. It consists of interpretable steps and is scalable to many systems. The resulting joint probability model of PV fleet output across systems and time can be used to generate synthetic data, impute missing data, perform anomaly detection, and make forecasts. In this paper, we explain the method and demonstrate these applications.
Submitted 5 June, 2023;
originally announced July 2023.
-
Specifying and Solving Robust Empirical Risk Minimization Problems Using CVXPY
Authors:
Eric Luxenberg,
Dhruv Malik,
Yuanzhi Li,
Aarti Singh,
Stephen Boyd
Abstract:
We consider robust empirical risk minimization (ERM), where model parameters are chosen to minimize the worst-case empirical loss when each data point varies over a given convex uncertainty set. In some simple cases, such problems can be expressed in an analytical form. In general the problem can be made tractable via dualization, which turns a min-max problem into a min-min problem. Dualization requires expertise and is tedious and error-prone. We demonstrate how CVXPY can be used to automate this dualization procedure in a user-friendly manner. Our framework allows practitioners to specify and solve robust ERM problems with a general class of convex losses, capturing many standard regression and classification problems. Users can easily specify any complex uncertainty set that is representable via disciplined convex programming (DCP) constraints.
Submitted 13 June, 2023; v1 submitted 8 June, 2023;
originally announced June 2023.
-
Joint Graph Learning and Model Fitting in Laplacian Regularized Stratified Models
Authors:
Ziheng Cheng,
Junzi Zhang,
Akshay Agrawal,
Stephen Boyd
Abstract:
Laplacian regularized stratified models (LRSM) are models that utilize the explicit or implicit network structure of the sub-problems as defined by the categorical features called strata (e.g., age, region, time, forecast horizon, etc.), and draw upon data from neighboring strata to enhance the parameter learning of each sub-problem. They have been widely applied in machine learning and signal processing problems, including but not limited to time series forecasting, representation learning, graph clustering, max-margin classification, and general few-shot learning. Nevertheless, existing works on LRSM have either assumed a known graph or are restricted to specific applications. In this paper, we start by showing the importance and sensitivity of graph weights in LRSM, and provably show that the sensitivity can be arbitrarily large when the parameter scales and sample sizes are heavily imbalanced across nodes. We then propose a generic approach to jointly learn the graph while fitting the model parameters by solving a single optimization problem. We interpret the proposed formulation from both a graph connectivity viewpoint and an end-to-end Bayesian perspective, and propose an efficient algorithm to solve the problem. Convergence guarantees for the proposed optimization algorithm are also provided despite the lack of global strong smoothness of the Laplacian regularization term typically required in the existing literature, which may be of independent interest. Finally, we illustrate the efficiency of our approach compared to existing methods by various real-world numerical examples.
Submitted 4 May, 2023;
originally announced May 2023.
-
Fast Path Planning Through Large Collections of Safe Boxes
Authors:
Tobia Marcucci,
Parth Nobel,
Russ Tedrake,
Stephen Boyd
Abstract:
We present a fast algorithm for the design of smooth paths (or trajectories) that are constrained to lie in a collection of axis-aligned boxes. We consider the case where the number of these safe boxes is large, and basic preprocessing of them (such as finding their intersections) can be done offline. At runtime we quickly generate a smooth path between given initial and terminal positions. Our algorithm designs trajectories that are guaranteed to be safe at all times, and detects infeasibility whenever such a trajectory does not exist. Our algorithm is based on two subproblems that we can solve very efficiently: finding a shortest path in a weighted graph, and solving (multiple) convex optimal-control problems. We demonstrate the proposed path planner on large-scale numerical examples, and we provide an efficient open-source software implementation, fastpathplanning.
Submitted 2 January, 2024; v1 submitted 1 May, 2023;
originally announced May 2023.
-
Disciplined Saddle Programming
Authors:
Philipp Schiele,
Eric Luxenberg,
Stephen Boyd
Abstract:
We consider convex-concave saddle point problems, and more generally convex optimization problems we refer to as $\textit{saddle problems}$, which include the partial supremum or infimum of convex-concave saddle functions. Saddle problems arise in a wide range of applications, including game theory, machine learning, and finance. It is well known that a saddle problem can be reduced to a single convex optimization problem by dualizing either the convex (min) or concave (max) objectives, reducing a min-max problem into a min-min (or max-max) problem. Carrying out this conversion by hand can be tedious and error prone. In this paper we introduce $\textit{disciplined saddle programming}$ (DSP), a domain specific language (DSL) for specifying saddle problems, for which the dualizing trick can be automated. The language and methods are based on recent work by Juditsky and Nemirovski arXiv:2102.01002 [math.OC], who developed the idea of conic-representable saddle point programs, and showed how to carry out the required dualization automatically using conic duality. Juditsky and Nemirovski's conic representation of saddle problems extends Nesterov and Nemirovski's earlier development of conic representable convex problems; DSP can be thought of as extending disciplined convex programming (DCP) to saddle problems. Just as DCP makes it easy for users to formulate and solve complex convex problems, DSP allows users to easily formulate and solve saddle problems. Our method is implemented in an open-source package, also called DSP.
Submitted 10 January, 2024; v1 submitted 31 January, 2023;
originally announced January 2023.
-
Model-Based Deep Learning: On the Intersection of Deep Learning and Optimization
Authors:
Nir Shlezinger,
Yonina C. Eldar,
Stephen P. Boyd
Abstract:
Decision making algorithms are used in a multitude of different applications. Conventional approaches for designing decision algorithms employ principled and simplified modelling, based on which one can determine decisions via tractable optimization. More recently, deep learning approaches that use highly parametric architectures tuned from data without relying on mathematical models, are becoming increasingly popular. Model-based optimization and data-centric deep learning are often considered to be distinct disciplines. Here, we characterize them as edges of a continuous spectrum varying in specificity and parameterization, and provide a tutorial-style presentation to the methodologies lying in the middle ground of this spectrum, referred to as model-based deep learning. We accompany our presentation with running examples in super-resolution and stochastic control, and show how they are expressed using the provided characterization and specialized in each of the detailed methodologies. The gains of combining model-based optimization and deep learning are demonstrated using experimental results in various applications, ranging from biomedical imaging to digital communications.
Submitted 21 June, 2022; v1 submitted 5 May, 2022;
originally announced May 2022.
-
Approximation Algorithms for Flexible Graph Connectivity
Authors:
Sylvia Boyd,
Joseph Cheriyan,
Arash Haddadan,
Sharat Ibrahimpur
Abstract:
We present approximation algorithms for several network design problems in the model of Flexible Graph Connectivity (Adjiashvili, Hommelsheim and Mühlenthaler, "Flexible Graph Connectivity", Math. Program. pp. 1-33 (2021), and IPCO 2020: pp. 13-26).
Let $k\geq 1$, $p\geq 1$ and $q\geq 0$ be integers. In an instance of the $(p,q)$-Flexible Graph Connectivity problem, denoted $(p,q)$-FGC, we have an undirected connected graph $G = (V,E)$, a partition of $E$ into a set of safe edges $S$ and a set of unsafe edges $U$, and nonnegative costs $c: E\to\Re$ on the edges. A subset $F \subseteq E$ of edges is feasible for the $(p,q)$-FGC problem if for any subset $F'$ of unsafe edges with $|F'|\leq q$, the subgraph $(V, F \setminus F')$ is $p$-edge connected. The algorithmic goal is to find a feasible solution $F$ that minimizes $c(F) = \sum_{e \in F} c_e$. We present a simple $2$-approximation algorithm for the $(1,1)$-FGC problem via a reduction to the minimum-cost rooted $2$-arborescence problem. This improves on the $2.527$-approximation algorithm of Adjiashvili et al. Our $2$-approximation algorithm for the $(1,1)$-FGC problem extends to a $(k+1)$-approximation algorithm for the $(1,k)$-FGC problem. We present a $4$-approximation algorithm for the $(p,1)$-FGC problem, and an $O(q\log|V|)$-approximation algorithm for the $(p,q)$-FGC problem. Finally, we improve on the result of Adjiashvili et al. for the unweighted $(1,1)$-FGC problem by presenting a $16/11$-approximation algorithm.
The $(p,q)$-FGC problem is related to the well-known Capacitated $k$-Connected Subgraph problem (denoted Cap-k-ECSS) that arises in the area of Capacitated Network Design. We give a $\min(k,2 u_{max})$-approximation algorithm for the Cap-k-ECSS problem, where $u_{max}$ denotes the maximum capacity of an edge.
Submitted 27 February, 2022;
originally announced February 2022.
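The feasibility condition for $(p,q)$-FGC stated in the abstract above can be checked directly on small instances. The sketch below is a brute-force checker written only to illustrate the definition; it is exponential in $p$ and $q$ and is not one of the paper's approximation algorithms. The function names and the convention that `unsafe` holds indices into the edge list `F` are assumptions.

```python
from itertools import combinations

def is_connected(nodes, edges):
    """Union-find connectivity check on an undirected graph."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in nodes}) <= 1

def is_fgc_feasible(nodes, F, unsafe, p=1, q=1):
    """F: list of (u, v) edges; unsafe: set of indices into F marking unsafe edges."""
    unsafe_idx = sorted(unsafe)
    for k in range(q + 1):
        # Every way that at most q unsafe edges can fail.
        for failed in combinations(unsafe_idx, k):
            remaining = [i for i in range(len(F)) if i not in failed]
            # p-edge-connected: still connected after deleting any p-1 further edges.
            for r in range(p):
                for cut in combinations(remaining, r):
                    kept = [F[i] for i in remaining if i not in cut]
                    if not is_connected(nodes, kept):
                        return False
    return True
```

For example, a triangle whose three edges are all unsafe is feasible for $(1,1)$-FGC, since losing any one edge leaves a connected path, but it is not feasible for $(1,2)$-FGC.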
-
Signal Decomposition Using Masked Proximal Operators
Authors:
Bennet E. Meyers,
Stephen P. Boyd
Abstract:
We consider the well-studied problem of decomposing a vector time series signal into components with different characteristics, such as smooth, periodic, nonnegative, or sparse. We describe a simple and general framework in which the components are defined by loss functions (which include constraints), and the signal decomposition is carried out by minimizing the sum of losses of the components (subject to the constraints). When each loss function is the negative log-likelihood of a density for the signal component, this framework coincides with maximum a posteriori probability (MAP) estimation; but it also includes many other interesting cases. Summarizing and clarifying prior results, we give two distributed optimization methods for computing the decomposition, which find the optimal decomposition when the component class loss functions are convex, and are good heuristics when they are not. Both methods require only the masked proximal operator of each of the component loss functions, a generalization of the well-known proximal operator that handles missing entries in its argument. Both methods are distributed, i.e., handle each component separately. We derive tractable methods for evaluating the masked proximal operators of some loss functions that, to our knowledge, have not appeared in the literature.
Submitted 20 September, 2022; v1 submitted 18 February, 2022;
originally announced February 2022.
-
A Light-Weight Multi-Objective Asynchronous Hyper-Parameter Optimizer
Authors:
Gabriel Maher,
Stephen Boyd,
Mykel Kochenderfer,
Cristian Matache,
Dylan Reuter,
Alex Ulitsky,
Slava Yukhymuk,
Leonid Kopman
Abstract:
We describe a light-weight yet performant system for hyper-parameter optimization that approximately minimizes an overall scalar cost function that is obtained by combining multiple performance objectives using a target-priority-limit scalarizer. It also supports a trade-off mode, where the goal is to find an appropriate trade-off among objectives by interacting with the user. We focus on the common scenario where there are on the order of tens of hyper-parameters, each with various attributes such as a range of continuous values, or a finite list of values, and whether it should be treated on a linear or logarithmic scale. The system supports multiple asynchronous simulations and is robust to simulation stragglers and failures.
Submitted 7 September, 2022; v1 submitted 15 February, 2022;
originally announced February 2022.
-
Constructing High-Order Signed Distance Maps from Computed Tomography Data with Application to Bone Morphometry
Authors:
Bryce A. Besler,
Tannis D. Kemp,
Nils D. Forkert,
Steven K. Boyd
Abstract:
An algorithm is presented for constructing high-order signed distance fields for two-phase materials imaged with computed tomography. The signed distance field is high-order in that it is free of the quantization artifact associated with the distance transform of sampled signals. The narrowband is solved using a closest point algorithm extended for implicit embeddings that are not a signed distance field. The high-order fast sweeping algorithm is used to extend the narrowband to the remainder of the domain. The order of accuracy of the narrowband and extension methods is verified on ideal implicit surfaces. The method is applied to ten excised cubes of bovine trabecular bone. Localization of the surface, estimation of phase densities, and local morphometry are validated with these subjects. Since the embedding is high-order, gradients and thus curvatures can be accurately estimated locally in the image data.
Submitted 1 November, 2021;
originally announced November 2021.
-
Solving Large-Scale Granular Resource Allocation Problems Efficiently with POP
Authors:
Deepak Narayanan,
Fiodar Kazhamiaka,
Firas Abuzaid,
Peter Kraft,
Akshay Agrawal,
Srikanth Kandula,
Stephen Boyd,
Matei Zaharia
Abstract:
Resource allocation problems in many computer systems can be formulated as mathematical optimization problems. However, finding exact solutions to these problems using off-the-shelf solvers is often intractable for large problem sizes with tight SLAs, leading system designers to rely on cheap, heuristic algorithms. We observe, however, that many allocation problems are granular: they consist of a large number of clients and resources, each client requests a small fraction of the total number of resources, and clients can interchangeably use different resources. For these problems, we propose an alternative approach that reuses the original optimization problem formulation and leads to better allocations than domain-specific heuristics. Our technique, Partitioned Optimization Problems (POP), randomly splits the problem into smaller problems (with a subset of the clients and resources in the system) and coalesces the resulting sub-allocations into a global allocation for all clients. We provide theoretical and empirical evidence as to why random partitioning works well. In our experiments, POP achieves allocations within 1.5% of the optimal with orders-of-magnitude improvements in runtime compared to existing systems for cluster scheduling, traffic engineering, and load balancing.
Submitted 22 October, 2021;
originally announced October 2021.
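The partition-and-coalesce idea described in the abstract above can be sketched in a few lines. This is an illustrative toy, not the POP implementation: `pop_solve`, the `solve` callback, and the `round_robin` stand-in solver are hypothetical names, and a real use would pass the original optimization formulation as `solve`. The sketch assumes `k` is no larger than the number of resources, so that every subproblem receives at least one resource.

```python
import random

def pop_solve(clients, resources, solve, k, seed=0):
    """Randomly split clients and resources into k subproblems, solve each, merge."""
    rng = random.Random(seed)
    clients, resources = list(clients), list(resources)
    rng.shuffle(clients)
    rng.shuffle(resources)
    allocation = {}
    for j in range(k):
        sub_clients = clients[j::k]        # every k-th client after shuffling
        sub_resources = resources[j::k]    # every k-th resource after shuffling
        # Coalesce: sub-allocations are disjoint, so a dict update suffices.
        allocation.update(solve(sub_clients, sub_resources))
    return allocation

def round_robin(sub_clients, sub_resources):
    """Toy stand-in for a real solver: hand out resources in turn."""
    return {c: sub_resources[i % len(sub_resources)]
            for i, c in enumerate(sub_clients)}

clients = [f"client{i}" for i in range(10)]
resources = [f"gpu{i}" for i in range(4)]
allocation = pop_solve(clients, resources, round_robin, k=2)
```

Each subproblem sees only its own clients and resources, so the `k` calls to `solve` are independent and could be run in parallel.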
-
Local Morphometry of Closed, Implicit Surfaces
Authors:
Bryce A Besler,
Tannis D. Kemp,
Andrew S. Michalski,
Nils D. Forkert,
Steven K. Boyd
Abstract:
Anatomical structures such as the hippocampus, liver, and bones can be analyzed as orientable, closed surfaces. This permits the computation of volume, surface area, mean curvature, Gaussian curvature, and the Euler-Poincaré characteristic as well as comparison of these morphometrics between structures of different topology. The structures are commonly represented implicitly in curve evolution problems as the zero level set of an embedding. Practically, binary images of anatomical structures are embedded using a signed distance transform. However, quantization prevents the accurate computation of curvatures, leading to considerable errors in morphometry. This paper presents a fast, simple embedding procedure for accurate local morphometry as the zero crossing of the Gaussian blurred binary image. The proposed method was validated based on the femur and fourth lumbar vertebrae of 50 clinical computed tomography datasets. The results show that the signed distance transform leads to large quantization errors in the computed local curvature. Global validation of morphometry using regression and Bland-Altman analysis revealed that the coefficient of determination for the average mean curvature is improved from 93.8% with the signed distance transform to 100% with the proposed method. For the surface area, the proportional bias is improved from -5.0% for the signed distance transform to +0.6% for the proposed method. The Euler-Poincaré characteristic is improved from unusable in the signed distance transform to 98% accuracy for the proposed method. The proposed method enables an improved local and global evaluation of curvature for purposes of morphometry on closed, implicit surfaces.
Submitted 29 July, 2021;
originally announced August 2021.
-
Connected Learning, Collapsed Contexts
Authors:
Caroline Pitt,
Adam Bell,
Brandyn S. Boyd,
Nikki Demmel,
Katie Davis
Abstract:
Researchers and designers have incorporated social media affordances into learning technologies to engage young people and support personally relevant learning, but youth may reject these attempts because they do not meet user expectations. Through in-depth case studies, we explore the sociotechnical ecosystems of six teens (ages 15-18) working at a science center that had recently introduced a digital badge system to track and recognize their learning. By analyzing interviews, observations, ecological momentary assessments, and system data, we examined tensions in how badges as connected learning technologies operate in teens' sociotechnical ecosystems. We found that, due to issues of unwanted context collapse and incongruent identity representations, youth only used certain affordances of the system and did so sporadically. Additionally, we noted that some features seemed to prioritize values of adult stakeholders over youth. Using badges as a lens, we reveal critical tensions and offer design recommendations for networked learning technologies.
Submitted 5 May, 2021;
originally announced May 2021.
-
Operator Splitting for Adaptive Radiation Therapy with Nonlinear Health Dynamics
Authors:
Anqi Fu,
Lei Xing,
Stephen Boyd
Abstract:
We present an optimization-based approach to radiation treatment planning over time. Our approach formulates treatment planning as an optimal control problem with nonlinear patient health dynamics derived from the standard linear-quadratic cell survival model. As the formulation is nonconvex, we propose a method for obtaining an approximate solution by solving a sequence of convex optimization problems. This method is fast, efficient, and robust to model error, adapting readily to changes in the patient's health between treatment sessions. Moreover, we show that it can be combined with the operator splitting method ADMM to produce an algorithm that is highly scalable and can handle large clinical cases. We introduce an open-source Python implementation of our algorithm, AdaRad, and demonstrate its performance on several examples.
Submitted 13 May, 2022; v1 submitted 4 May, 2021;
originally announced May 2021.
-
Allocation of Fungible Resources via a Fast, Scalable Price Discovery Method
Authors:
Akshay Agrawal,
Stephen Boyd,
Deepak Narayanan,
Fiodar Kazhamiaka,
Matei Zaharia
Abstract:
We consider the problem of assigning or allocating resources to a set of jobs. We consider the case when the resources are fungible, that is, the job can be done with any mix of the resources, but with different efficiencies. In our formulation we maximize a total utility subject to a given limit on the resource usage, which is a convex optimization problem and so is tractable. In this paper we develop a custom, parallelizable algorithm for solving the resource allocation problem that scales to large problems, with millions of jobs. Our algorithm is based on the dual problem, in which the dual variables associated with the resource usage limit can be interpreted as resource prices. Our method updates the resource prices in each iteration, ultimately discovering the optimal resource prices, from which an optimal allocation is obtained. We provide an open-source implementation of our method, which can solve problems with millions of jobs in a few seconds on a CPU, and under a second on a GPU; our software can solve smaller problems in milliseconds. On large problems, our implementation is up to three orders of magnitude faster than a commercial solver for convex optimization.
Submitted 18 April, 2021; v1 submitted 1 April, 2021;
originally announced April 2021.
-
Minimum-Distortion Embedding
Authors:
Akshay Agrawal,
Alnur Ali,
Stephen Boyd
Abstract:
We consider the vector embedding problem. We are given a finite set of items, with the goal of assigning a representative vector to each one, possibly under some constraints (such as the collection of vectors being standardized, i.e., having zero mean and unit covariance). We are given data indicating that some pairs of items are similar, and optionally, some other pairs are dissimilar. For pairs of similar items, we want the corresponding vectors to be near each other, and for dissimilar pairs, we want the corresponding vectors to not be near each other, measured in Euclidean distance. We formalize this by introducing distortion functions, defined for some pairs of the items. Our goal is to choose an embedding that minimizes the total distortion, subject to the constraints. We call this the minimum-distortion embedding (MDE) problem.
The MDE framework is simple but general. It includes a wide variety of embedding methods, such as spectral embedding, principal component analysis, multidimensional scaling, dimensionality reduction methods (like Isomap and UMAP), force-directed layout, and others. It also includes new embeddings, and provides principled ways of validating historical and new embeddings alike.
We develop a projected quasi-Newton method that approximately solves MDE problems and scales to large data sets. We implement this method in PyMDE, an open-source Python package. In PyMDE, users can select from a library of distortion functions and constraints or specify custom ones, making it easy to rapidly experiment with different embeddings. Our software scales to data sets with millions of items and tens of millions of distortion functions. To demonstrate our method, we compute embeddings for several real-world data sets, including images, an academic co-author network, US county demographic data, and single-cell mRNA transcriptomes.
Submitted 24 August, 2021; v1 submitted 3 March, 2021;
originally announced March 2021.
-
A $2$-Approximation Algorithm for Flexible Graph Connectivity
Authors:
Sylvia Boyd,
Joseph Cheriyan,
Arash Haddadan,
Sharat Ibrahimpur
Abstract:
We present a $2$-approximation algorithm for the Flexible Graph Connectivity problem [AHM20] via a reduction to the minimum cost $r$-out $2$-arborescence problem.
Submitted 5 February, 2021;
originally announced February 2021.
-
Covariance Prediction via Convex Optimization
Authors:
Shane Barratt,
Stephen Boyd
Abstract:
We consider the problem of predicting the covariance of a zero mean Gaussian vector, based on another feature vector. We describe a covariance predictor that has the form of a generalized linear model, i.e., an affine function of the features followed by an inverse link function that maps vectors to symmetric positive definite matrices. The log-likelihood is a concave function of the predictor parameters, so fitting the predictor involves convex optimization. Such predictors can be combined with others, or recursively applied to improve performance.
Submitted 29 January, 2021;
originally announced January 2021.
-
Low Rank Forecasting
Authors:
Shane Barratt,
Yining Dong,
Stephen Boyd
Abstract:
We consider the problem of forecasting multiple values of the future of a vector time series, using some past values. This problem, and related ones such as one-step-ahead prediction, have a very long history, and there are a number of well-known methods for it, including vector auto-regressive models, state-space methods, multi-task regression, and others. Our focus is on low rank forecasters, which break forecasting up into two steps: estimating a vector that can be interpreted as a latent state, given the past, and then estimating the future values of the time series, given the latent state estimate. We introduce the concept of forecast consistency, which means that the estimates of the same value made at different times are consistent. We formulate the forecasting problem in general form, and focus on linear forecasters, for which we propose a formulation that can be solved via convex optimization. We describe a number of extensions and variations, including nonlinear forecasters, data weighting, the inclusion of auxiliary data, and additional objective terms. We illustrate our methods with several examples.
Submitted 29 January, 2021;
originally announced January 2021.
-
Sample Efficient Reinforcement Learning with REINFORCE
Authors:
Junzi Zhang,
Jongho Kim,
Brendan O'Donoghue,
Stephen Boyd
Abstract:
Policy gradient methods are among the most effective methods for large-scale reinforcement learning, and their empirical success has prompted several works that develop the foundation of their global convergence theory. However, prior works have either required exact gradients or state-action visitation measure based mini-batch stochastic gradients with a diverging batch size, which limit their applicability in practical scenarios. In this paper, we consider classical policy gradient methods that compute an approximate gradient with a single trajectory or a fixed size mini-batch of trajectories under soft-max parametrization and log-barrier regularization, along with the widely-used REINFORCE gradient estimation procedure. By controlling the number of "bad" episodes and resorting to the classical doubling trick, we establish an anytime sub-linear high probability regret bound as well as almost sure global convergence of the average regret with an asymptotically sub-linear rate. These provide the first set of global convergence and sample efficiency results for the well-known REINFORCE algorithm and contribute to a better understanding of its performance in practice.
Submitted 24 December, 2020; v1 submitted 21 October, 2020;
originally announced October 2020.
-
A $4/3$-Approximation Algorithm for the Minimum $2$-Edge Connected Multisubgraph Problem in the Half-Integral Case
Authors:
S. Boyd,
J. Cheriyan,
R. Cummings,
L. Grout,
S. Ibrahimpur,
Z. Szigeti,
L. Wang
Abstract:
Given a connected undirected graph $\bar{G}$ on $n$ vertices, and non-negative edge costs $c$, the 2ECM problem is that of finding a $2$-edge connected spanning multisubgraph of $\bar{G}$ of minimum cost. The natural linear program (LP) for 2ECM, which coincides with the subtour LP for the Traveling Salesman Problem on the metric closure of $\bar{G}$, gives a lower bound on the optimal cost. For instances where this LP is optimized by a half-integral solution $x$, Carr and Ravi (1998) showed that the integrality gap is at most $\frac43$: they show that the vector $\frac43 x$ dominates a convex combination of incidence vectors of $2$-edge connected spanning multisubgraphs of $\bar{G}$.
We present a simpler proof of the result due to Carr and Ravi by applying an extension of Lovász's splitting-off theorem. Our proof naturally leads to a $\frac43$-approximation algorithm for half-integral instances. Given a half-integral solution $x$ to the LP for 2ECM, we give an $O(n^2)$-time algorithm to obtain a $2$-edge connected spanning multisubgraph of $\bar{G}$ whose cost is at most $\frac43 c^T x$.
Submitted 7 August, 2020;
originally announced August 2020.
-
Learning Convex Optimization Models
Authors:
Akshay Agrawal,
Shane Barratt,
Stephen Boyd
Abstract:
A convex optimization model predicts an output from an input by solving a convex optimization problem. The class of convex optimization models is large, and includes as special cases many well-known models like linear and logistic regression. We propose a heuristic for learning the parameters in a convex optimization model given a dataset of input-output pairs, using recently developed methods for differentiating the solution of a convex optimization problem with respect to its parameters. We describe three general classes of convex optimization models, maximum a posteriori (MAP) models, utility maximization models, and agent models, and present a numerical experiment for each.
Submitted 18 June, 2020; v1 submitted 7 June, 2020;
originally announced June 2020.
-
Optimal Representative Sample Weighting
Authors:
Shane Barratt,
Guillermo Angeris,
Stephen Boyd
Abstract:
We consider the problem of assigning weights to a set of samples or data records, with the goal of achieving a representative weighting, which happens when certain sample averages of the data are close to prescribed values. We frame the problem of finding representative sample weights as an optimization problem, which in many cases is convex and can be efficiently solved. Our formulation includes as a special case the selection of a fixed number of the samples, with equal weights, i.e., the problem of selecting a smaller representative subset of the samples. While this problem is combinatorial and not convex, heuristic methods based on convex optimization seem to perform very well. We describe rsw, an open-source implementation of the ideas described in this paper, and apply it to a skewed sample of the CDC BRFSS dataset.
Submitted 18 May, 2020;
originally announced May 2020.
-
Fitting Laplacian Regularized Stratified Gaussian Models
Authors:
Jonathan Tuck,
Stephen Boyd
Abstract:
We consider the problem of jointly estimating multiple related zero-mean Gaussian distributions from data. We propose to jointly estimate these covariance matrices using Laplacian regularized stratified model fitting, which includes loss and regularization terms for each covariance matrix, and also a term that encourages the different covariance matrices to be close. This method `borrows strength' from the neighboring covariances, to improve its estimate. With well-chosen hyper-parameters, such models can perform very well, especially in the low data regime. We propose a distributed method that scales to large problems, and illustrate the efficacy of the method with examples in finance, radar signal processing, and weather forecasting.
Submitted 22 May, 2020; v1 submitted 4 May, 2020;
originally announced May 2020.
-
Eigen-Stratified Models
Authors:
Jonathan Tuck,
Stephen Boyd
Abstract:
Stratified models depend in an arbitrary way on a selected categorical feature that takes $K$ values, and depend linearly on the other $n$ features. Laplacian regularization with respect to a graph on the feature values can greatly improve the performance of a stratified model, especially in the low-data regime. A significant issue with Laplacian-regularized stratified models is that the model is $K$ times the size of the base model, which can be quite large.
We address this issue by formulating eigen-stratified models, which are stratified models with an additional constraint that the model parameters are linear combinations of some modest number $m$ of bottom eigenvectors of the graph Laplacian, i.e., those associated with the $m$ smallest eigenvalues. With eigen-stratified models, we only need to store the $m$ bottom eigenvectors and the corresponding coefficients as the stratified model parameters. This leads to a reduction, sometimes large, of model size when $m \leq n$ and $m \ll K$. In some cases, the additional regularization implicit in eigen-stratified models can improve out-of-sample performance over standard Laplacian regularized stratified models.
Submitted 27 January, 2020;
originally announced January 2020.
-
Robust Self-Supervised Learning of Deterministic Errors in Single-Plane (Monoplanar) and Dual-Plane (Biplanar) X-ray Fluoroscopy
Authors:
Jacky C. K. Chow,
Steven K. Boyd,
Derek D. Lichti,
Janet L. Ronsky
Abstract:
Fluoroscopic imaging that captures X-ray images at video framerates is advantageous for guiding catheter insertions by vascular surgeons and interventional radiologists. Visualizing the dynamical movements non-invasively allows complex surgical procedures to be performed with less trauma to the patient. To improve surgical precision, endovascular procedures can benefit from more accurate fluoroscopy data via calibration. This paper presents a robust self-calibration algorithm suitable for single-plane and dual-plane fluoroscopy. A three-dimensional (3D) target field was imaged by the fluoroscope in a strong geometric network configuration. The unknown 3D positions of targets and the fluoroscope pose were estimated simultaneously by maximizing the likelihood of the Student-t probability distribution function. A smoothed k-nearest neighbour (kNN) regression is then used to model the deterministic component of the image reprojection error of the robust bundle adjustment. The Maximum Likelihood Estimation step and the kNN regression step are then repeated iteratively until convergence. Four different error modeling schemes were compared while varying the quantity of training images. It was found that using a smoothed kNN regression can automatically model the systematic errors in fluoroscopy with similar accuracy as a human expert using a small training dataset. When all training images were used, the 3D mapping error was reduced from 0.61-0.83 mm to 0.04 mm post-calibration (94.2-95.7% improvement), and the 2D reprojection error was reduced from 1.17-1.31 to 0.20-0.21 pixels (83.2-83.8% improvement). When using biplanar fluoroscopy, the 3D measurement accuracy of the system improved from 0.60 mm to 0.32 mm (47.2% improvement).
Submitted 2 January, 2020;
originally announced January 2020.
-
Learning Convex Optimization Control Policies
Authors:
Akshay Agrawal,
Shane Barratt,
Stephen Boyd,
Bartolomeo Stellato
Abstract:
Many control policies used in various applications determine the input or action by solving a convex optimization problem that depends on the current state and some parameters. Common examples of such convex optimization control policies (COCPs) include the linear quadratic regulator (LQR), convex model predictive control (MPC), and convex control-Lyapunov or approximate dynamic programming (ADP) policies. These types of control policies are tuned by varying the parameters in the optimization problem, such as the LQR weights, to obtain good performance, judged by application-specific metrics. Tuning is often done by hand, or by simple methods such as a crude grid search. In this paper we propose a method to automate this process, by adjusting the parameters using an approximate gradient of the performance metric with respect to the parameters. Our method relies on recently developed methods that can efficiently evaluate the derivative of the solution of a convex optimization problem with respect to its parameters. We illustrate our method on several examples.
Submitted 19 December, 2019;
originally announced December 2019.
-
Differentiable Convex Optimization Layers
Authors:
Akshay Agrawal,
Brandon Amos,
Shane Barratt,
Stephen Boyd,
Steven Diamond,
Zico Kolter
Abstract:
Recent work has shown how to embed differentiable optimization problems (that is, problems whose solutions can be backpropagated through) as layers within deep learning architectures. This method provides a useful inductive bias for certain problems, but existing software for differentiable optimization layers is rigid and difficult to apply to new settings. In this paper, we propose an approach to differentiating through disciplined convex programs, a subclass of convex optimization problems used by domain-specific languages (DSLs) for convex optimization. We introduce disciplined parametrized programming, a subset of disciplined convex programming, and we show that every disciplined parametrized program can be represented as the composition of an affine map from parameters to problem data, a solver, and an affine map from the solver's solution to a solution of the original problem (a new form we refer to as affine-solver-affine form). We then demonstrate how to efficiently differentiate through each of these components, allowing for end-to-end analytical differentiation through the entire convex program. We implement our methodology in version 1.1 of CVXPY, a popular Python-embedded DSL for convex optimization, and additionally implement differentiable layers for disciplined convex programs in PyTorch and TensorFlow 2.0. Our implementation significantly lowers the barrier to using convex optimization problems in differentiable programs. We present applications in linear machine learning models and in stochastic control, and we show that our layer is competitive (in execution time) compared to specialized differentiable solvers from past work.
Submitted 28 October, 2019;
originally announced October 2019.
-
Variable Metric Proximal Gradient Method with Diagonal Barzilai-Borwein Stepsize
Authors:
Youngsuk Park,
Sauptik Dhar,
Stephen Boyd,
Mohak Shah
Abstract:
Variable metric proximal gradient (VM-PG) is a widely used class of convex optimization methods. Lately, there has been a lot of research on the theoretical guarantees of VM-PG with different metric selections. However, most such metric selections are dependent on (an expensive) Hessian, or limited to scalar stepsizes like the Barzilai-Borwein (BB) stepsize with lots of safeguarding. Instead, in this paper we propose an adaptive metric selection strategy called the diagonal Barzilai-Borwein (BB) stepsize. The proposed diagonal selection better captures the local geometry of the problem while keeping per-step computation cost similar to the scalar BB stepsize, i.e., $O(n)$. Under this metric selection for VM-PG, the theoretical convergence is analyzed. Our empirical studies illustrate the improved convergence results under the proposed diagonal BB stepsize, specifically for ill-conditioned machine learning problems for both synthetic and real-world datasets.
Submitted 15 October, 2019;
originally announced October 2019.
-
A General Optimization Framework for Dynamic Time Warping
Authors:
Dave Deriso,
Stephen Boyd
Abstract:
The goal of dynamic time warping is to transform or warp time in order to approximately align two signals together. We pose the choice of warping function as an optimization problem with several terms in the objective. The first term measures the misalignment of the time-warped signals. Two additional regularization terms penalize the cumulative warping and the instantaneous rate of time warping; constraints on the warping can be imposed by assigning the value +inf to the regularization terms. Different choices of the three objective terms yield different time warping functions that trade off signal fit or alignment and properties of the warping function. The optimization problem we formulate is a classical optimal control problem, with initial and terminal constraints, and a state dimension of one. We describe an effective general method that minimizes the objective by discretizing the values of the original and warped time, and using standard dynamic programming to compute the (globally) optimal warping function with the discretized values. Iterated refinement of this scheme yields a high accuracy warping function in just a few iterations. Our method is implemented as an open source Python package GDTW.
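As a rough illustration of the dynamic programming idea mentioned in this abstract, the following minimal Python sketch computes the classical discrete dynamic time warping cost between two short sequences. It is the textbook DTW recursion, not the GDTW method or its API: it has no regularization terms and no iterated refinement, and the function name `dtw_cost` is hypothetical.

```python
import math


def dtw_cost(x, y):
    """Classical DTW: minimum total |x[i] - y[j]| over monotone alignments."""
    n, m = len(x), len(y)
    # cost[i][j] holds the best cost of aligning x[:i] with y[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


a = [0.0, 1.0, 2.0, 1.0, 0.0]
b = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]  # same shape as a, with the first sample repeated
print(dtw_cost(a, a))
print(dtw_cost(a, b))
```

Because `b` only repeats a sample of `a`, a warping exists that aligns the two sequences with zero misalignment cost.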
Submitted 31 May, 2019; v1 submitted 30 May, 2019;
originally announced May 2019.
-
Disciplined Quasiconvex Programming
Authors:
Akshay Agrawal,
Stephen Boyd
Abstract:
We present a composition rule involving quasiconvex functions that generalizes the classical composition rule for convex functions. This rule complements well-known rules for the curvature of quasiconvex functions under increasing functions and pointwise maximums. We refer to the class of optimization problems generated by these rules, along with a base set of quasiconvex and quasiconcave functions, as disciplined quasiconvex programs. Disciplined quasiconvex programming generalizes disciplined convex programming, the class of optimization problems targeted by most modern domain-specific languages for convex optimization. We describe an implementation of disciplined quasiconvex programming that makes it possible to specify and solve quasiconvex programs in CVXPY 1.0.
Submitted 27 February, 2020; v1 submitted 1 May, 2019;
originally announced May 2019.
-
A Distributed Method for Fitting Laplacian Regularized Stratified Models
Authors:
Jonathan Tuck,
Shane Barratt,
Stephen Boyd
Abstract:
Stratified models are models that depend in an arbitrary way on a set of selected categorical features, and depend linearly on the other features. In a basic and traditional formulation a separate model is fit for each value of the categorical feature, using only the data that has the specific categorical value. To this formulation we add Laplacian regularization, which encourages the model parameters for neighboring categorical values to be similar. Laplacian regularization allows us to specify one or more weighted graphs on the stratification feature values. For example, stratifying over the days of the week, we can specify that the Sunday model parameter should be close to the Saturday and Monday model parameters. The regularization improves the performance of the model over the traditional stratified model, since the model for each value of the categorical `borrows strength' from its neighbors. In particular, it produces a model even for categorical values that did not appear in the training data set.
We propose an efficient distributed method for fitting stratified models, based on the alternating direction method of multipliers (ADMM). When the fitting loss functions are convex, the stratified model fitting problem is convex, and our method computes the global minimizer of the loss plus regularization; in other cases it computes a local minimizer. The method is very efficient, and naturally scales to large data sets or numbers of stratified feature values. We illustrate our method with a variety of examples.
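The days-of-the-week example in this abstract can be made concrete with a small sketch. Assuming a cycle graph with unit edge weights and a single scalar parameter per day (both illustrative assumptions, as are the names `cycle_laplacian` and `laplacian_penalty`), the Laplacian regularization term is the quadratic form $\theta^T L \theta$, which equals the sum of squared differences between the parameters of neighboring days.

```python
def cycle_laplacian(k):
    """Graph Laplacian of a cycle on k >= 3 nodes with unit edge weights."""
    L = [[0.0] * k for _ in range(k)]
    for i in range(k):
        j = (i + 1) % k  # each day is linked to the next, Sunday wraps to Monday
        L[i][i] += 1.0
        L[j][j] += 1.0
        L[i][j] -= 1.0
        L[j][i] -= 1.0
    return L


def laplacian_penalty(L, theta):
    """Quadratic form theta^T L theta."""
    k = len(theta)
    return sum(theta[i] * L[i][j] * theta[j] for i in range(k) for j in range(k))


# Hypothetical per-day parameters, Monday through Sunday.
theta = [1.0, 1.0, 1.0, 1.0, 1.0, 3.0, 3.0]
L = cycle_laplacian(7)
penalty = laplacian_penalty(L, theta)
# The same quantity, computed edge by edge.
direct = sum((theta[i] - theta[(i + 1) % 7]) ** 2 for i in range(7))
print(penalty, direct)
```

Only the Friday-Saturday and Sunday-Monday edges contribute here, so the penalty is small when neighboring days have similar parameters.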
Submitted 10 November, 2019; v1 submitted 26 April, 2019;
originally announced April 2019.
-
Least Squares Auto-Tuning
Authors:
Shane Barratt,
Stephen Boyd
Abstract:
Least squares is by far the simplest and most commonly applied computational method in many fields. In almost all applications, the least squares objective is rarely the true objective. We account for this discrepancy by parametrizing the least squares problem and automatically adjusting these parameters using an optimization algorithm. We apply our method, which we call least squares auto-tuning, to data fitting.
Submitted 10 April, 2019;
originally announced April 2019.
-
Disciplined Geometric Programming
Authors:
Akshay Agrawal,
Steven Diamond,
Stephen Boyd
Abstract:
We introduce log-log convex programs, which are optimization problems with positive variables that become convex when the variables, objective functions, and constraint functions are replaced with their logs, which we refer to as a log-log transformation. This class of problems generalizes traditional geometric programming and generalized geometric programming, and it includes interesting problems involving nonnegative matrices. We give examples of log-log convex functions, some well-known and some less so, and we develop an analog of disciplined convex programming, which we call disciplined geometric programming. Disciplined geometric programming is a subclass of log-log convex programming generated by a composition rule and a set of functions with known curvature under the log-log transformation. Finally, we describe an implementation of disciplined geometric programming as a reduction in CVXPY 1.0.
Submitted 20 March, 2019; v1 submitted 10 December, 2018;
originally announced December 2018.
-
Learning Probabilistic Trajectory Models of Aircraft in Terminal Airspace from Position Data
Authors:
Shane Barratt,
Mykel Kochenderfer,
Stephen Boyd
Abstract:
Models for predicting aircraft motion are an important component of modern aeronautical systems. These models help aircraft plan collision avoidance maneuvers and help conduct offline performance and safety analyses. In this article, we develop a method for learning a probabilistic generative model of aircraft motion in terminal airspace, the controlled airspace surrounding a given airport. The method fits the model based on a historical dataset of radar-based position measurements of aircraft landings and takeoffs at that airport. We find that the model generates realistic trajectories, provides accurate predictions, and captures the statistical properties of aircraft trajectories. Furthermore, the model trains quickly, is compact, and allows for efficient real-time inference.
Submitted 22 October, 2018;
originally announced October 2018.
-
Network Optimization for Unified Packet and Circuit Switched Networks
Authors:
Ping Yin,
Steven Diamond,
Bill Lin,
Stephen Boyd
Abstract:
Internet traffic continues to grow relentlessly, driven largely by increasingly high resolution video content. Although studies have shown that the majority of packets processed by Internet routers are pass-through traffic, they nonetheless have to be queued and routed at every hop in current networks, which unnecessarily adds substantial delays and processing costs. Such pass-through traffic can be better circuit-switched through the underlying optical transport network by means of pre-established circuits, which is possible in a unified packet and circuit switched network. In this paper, we propose a novel convex optimization framework based on a new destination-based multicommodity flow formulation for the allocation of circuits in such unified networks. In particular, we consider two deployment settings, one based on real-time traffic monitoring, and the other relying upon history-based traffic predictions. In both cases, we formulate global network optimization objectives as concave functions that capture the fair sharing of network capacity among competing traffic flows. The convexity of our problem formulations ensures globally optimal solutions.
Submitted 22 May, 2019; v1 submitted 1 August, 2018;
originally announced August 2018.
-
Fitting Jump Models
Authors:
A. Bemporad,
V. Breschi,
D. Piga,
S. Boyd
Abstract:
We describe a new framework for fitting jump models to a sequence of data. The key idea is to alternate between minimizing a loss function to fit multiple model parameters, and minimizing a discrete loss function to determine which set of model parameters is active at each data point. The framework is quite general and encompasses popular classes of models, such as hidden Markov models and piecewise affine models. The shapes of the chosen loss functions determine the shape of the resulting jump model.
Submitted 21 May, 2018; v1 submitted 25 November, 2017;
originally announced November 2017.
-
A Rewriting System for Convex Optimization Problems
Authors:
Akshay Agrawal,
Robin Verschueren,
Steven Diamond,
Stephen Boyd
Abstract:
We describe a modular rewriting system for translating optimization problems written in a domain-specific language to forms compatible with low-level solver interfaces. Translation is facilitated by reductions, which accept a category of problems and transform instances of that category to equivalent instances of another category. Our system proceeds in two key phases: analysis, in which we attempt to find a suitable solver for a supplied problem, and canonicalization, in which we rewrite the problem in the selected solver's standard form. We implement the described system in version 1.0 of CVXPY, a domain-specific language for mathematical and especially convex optimization. By treating reductions as first-class objects, our method makes it easy to match problems to solvers well-suited for them and to support solvers with a wide variety of standard forms.
Submitted 21 January, 2019; v1 submitted 13 September, 2017;
originally announced September 2017.
-
On the convergence of mirror descent beyond stochastic convex programming
Authors:
Zhengyuan Zhou,
Panayotis Mertikopoulos,
Nicholas Bambos,
Stephen Boyd,
Peter Glynn
Abstract:
In this paper, we examine the convergence of mirror descent in a class of stochastic optimization problems that are not necessarily convex (or even quasi-convex), and which we call variationally coherent. Since the standard technique of "ergodic averaging" offers no tangible benefits beyond convex programming, we focus directly on the algorithm's last generated sample (its "last iterate"), and we show that it converges with probability $1$ if the underlying problem is coherent. We further consider a localized version of variational coherence which ensures local convergence of stochastic mirror descent (SMD) with high probability. These results contribute to the landscape of non-convex stochastic optimization by showing that (quasi-)convexity is not essential for convergence to a global minimum: rather, variational coherence, a much weaker requirement, suffices. Finally, building on the above, we reveal an interesting insight regarding the convergence speed of SMD: in problems with sharp minima (such as generic linear programs or concave minimization problems), SMD reaches a minimum point in a finite number of steps (a.s.), even in the presence of persistent gradient noise. This result is to be contrasted with existing black-box convergence rate estimates that are only asymptotic.
Submitted 16 July, 2018; v1 submitted 18 June, 2017;
originally announced June 2017.
-
Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data
Authors:
David Hallac,
Sagar Vare,
Stephen Boyd,
Jure Leskovec
Abstract:
Subsequence clustering of multivariate time series is a useful tool for discovering repeated patterns in temporal data. Once these patterns have been discovered, seemingly complicated datasets can be interpreted as a temporal sequence of only a small number of states, or clusters. For example, raw sensor data from a fitness-tracking application can be expressed as a timeline of a select few actions (e.g., walking, sitting, running). However, discovering these patterns is challenging because it requires simultaneous segmentation and clustering of the time series. Furthermore, interpreting the resulting clusters is difficult, especially when the data is high-dimensional. Here we propose a new method of model-based clustering, which we call Toeplitz Inverse Covariance-based Clustering (TICC). Each cluster in the TICC method is defined by a correlation network, or Markov random field (MRF), characterizing the interdependencies between different observations in a typical subsequence of that cluster. Based on this graphical representation, TICC simultaneously segments and clusters the time series data. We solve the TICC problem through alternating minimization, using a variation of the expectation maximization (EM) algorithm. We derive closed-form solutions to efficiently solve the two resulting subproblems in a scalable way, through dynamic programming and the alternating direction method of multipliers (ADMM), respectively. We validate our approach by comparing TICC to several state-of-the-art baselines in a series of synthetic experiments, and we then demonstrate on an automobile sensor dataset how TICC can be used to learn interpretable clusters in real-world scenarios.
Submitted 14 May, 2018; v1 submitted 9 June, 2017;
originally announced June 2017.
-
The Salesman's Improved Tours for Fundamental Classes
Authors:
Sylvia Boyd,
András Sebö
Abstract:
Finding the exact integrality gap $α$ for the LP relaxation of the metric Travelling Salesman Problem (TSP) has been an open problem for over thirty years, with little progress made. It is known that $4/3 \leq α \leq 3/2$, and a famous conjecture states $α = 4/3$. For this problem, essentially two "fundamental" classes of instances have been proposed. This fundamental property means that in order to show that the integrality gap is at most $ρ$ for all instances of metric TSP, it is sufficient to show it only for the instances in the fundamental class. However, despite the importance and the simplicity of such classes, no apparent effort has been deployed for improving the integrality gap bounds for them. In this paper we take a natural first step in this endeavour, and consider the $1/2$-integer points of one such class. We successfully improve the upper bound for the integrality gap from $3/2$ to $10/7$ for a superclass of these points, as well as prove a lower bound of $4/3$ for the superclass. Our methods involve innovative applications of tools from combinatorial optimization which have the potential to be more broadly applied.
Submitted 29 October, 2018; v1 submitted 5 May, 2017;
originally announced May 2017.
-
A Distributed Method for Optimal Capacity Reservation
Authors:
Nicholas Moehle,
Xinyue Shen,
Zhi-Quan Luo,
Stephen Boyd
Abstract:
We consider the problem of reserving link capacity in a network in such a way that any of a given set of flow scenarios can be supported. In the optimal capacity reservation problem, we choose the reserved link capacities to minimize the reservation cost. This problem reduces to a large linear program, with the number of variables and constraints on the order of the number of links times the number of scenarios. Small and medium size problems are within the capabilities of generic linear program solvers. We develop a more scalable, distributed algorithm for the problem that alternates between solving (in parallel) one flow problem per scenario, and coordination steps, which connect the individual flows and the reservation capacities.
Submitted 1 May, 2017;
originally announced May 2017.
-
Network Inference via the Time-Varying Graphical Lasso
Authors:
David Hallac,
Youngsuk Park,
Stephen Boyd,
Jure Leskovec
Abstract:
Many important problems can be modeled as a system of interconnected entities, where each entity is recording time-dependent observations or measurements. In order to spot trends, detect anomalies, and interpret the temporal dynamics of such data, it is essential to understand the relationships between the different entities and how these relationships evolve over time. In this paper, we introduce the time-varying graphical lasso (TVGL), a method of inferring time-varying networks from raw time series data. We cast the problem in terms of estimating a sparse time-varying inverse covariance matrix, which reveals a dynamic network of interdependencies between the entities. Since dynamic network inference is a computationally expensive task, we derive a scalable message-passing algorithm based on the Alternating Direction Method of Multipliers (ADMM) to solve this problem in an efficient way. We also discuss several extensions, including a streaming algorithm to update the model and incorporate new observations in real time. Finally, we evaluate our TVGL algorithm on both real and synthetic datasets, obtaining interpretable results and outperforming state-of-the-art baselines in terms of both accuracy and scalability.
Submitted 9 June, 2017; v1 submitted 6 March, 2017;
originally announced March 2017.
-
Dirty Pixels: Towards End-to-End Image Processing and Perception
Authors:
Steven Diamond,
Vincent Sitzmann,
Frank Julca-Aguilar,
Stephen Boyd,
Gordon Wetzstein,
Felix Heide
Abstract:
Real-world imaging systems acquire measurements that are degraded by noise, optical aberrations, and other imperfections that make image processing for human viewing and higher-level perception tasks challenging. Conventional cameras address this problem by compartmentalizing imaging from high-level task processing. As such, conventional imaging involves processing the RAW sensor measurements in a sequential pipeline of steps, such as demosaicking, denoising, deblurring, tone-mapping and compression. This pipeline is optimized to obtain a visually pleasing image. High-level processing, on the other hand, involves steps such as feature extraction, classification, tracking, and fusion. While this siloed design approach allows for efficient development, it also dictates compartmentalized performance metrics, without knowledge of the higher-level task of the camera system. For example, today's demosaicking and denoising algorithms are designed using perceptual image quality metrics but not with domain-specific tasks such as object detection in mind. We propose an end-to-end differentiable architecture that jointly performs demosaicking, denoising, deblurring, tone-mapping, and classification. The architecture learns processing pipelines whose outputs differ from those of existing ISPs optimized for perceptual quality, preserving fine detail at the cost of increased noise and artifacts. We demonstrate on captured and simulated data that our model substantially improves perception in low light and other challenging conditions, which is imperative for real-world applications. Finally, we found that the proposed model also achieves state-of-the-art accuracy when optimized for image reconstruction in low-light conditions, validating the architecture itself as a potentially useful drop-in network for reconstruction and analysis tasks beyond the applications demonstrated in this work.
Submitted 7 May, 2021; v1 submitted 23 January, 2017;
originally announced January 2017.
-
Linear Programming Heuristics for the Graph Isomorphism Problem
Authors:
Reza Takapoui,
Stephen Boyd
Abstract:
An isomorphism between two graphs is a bijection between their vertices that preserves the edges. We consider the problem of determining whether two finite undirected weighted graphs are isomorphic, and finding an isomorphism relating them if the answer is positive. In this paper we introduce effective probabilistic linear programming (LP) heuristics to solve the graph isomorphism problem. We motivate our heuristics by showing guarantees under some conditions, and present numerical experiments that show effectiveness of these heuristics in the general case.
Submitted 2 November, 2016;
originally announced November 2016.
-
Convolutional Imputation of Matrix Networks
Authors:
Qingyun Sun,
Mengyuan Yan,
David Donoho,
Stephen Boyd
Abstract:
A matrix network is a family of matrices, with relatedness modeled by a weighted graph. We consider the task of completing a partially observed matrix network. We assume a novel sampling scheme where a fraction of matrices might be completely unobserved. How can we recover the entire matrix network from incomplete observations? This mathematical problem arises in many applications including medical imaging and social networks.
To recover the matrix network, we propose a structural assumption that the matrices have a graph Fourier transform which is low-rank. We formulate a convex optimization problem and prove an exact recovery guarantee for the optimization problem. Furthermore, we numerically characterize the exact recovery regime for varying rank and sampling rate and discover a new phase transition phenomenon. Then we give an iterative imputation algorithm to efficiently solve the optimization problem and complete large scale matrix networks. We demonstrate the algorithm with a variety of applications such as MRI and a Facebook user network.
Submitted 7 June, 2018; v1 submitted 2 June, 2016;
originally announced June 2016.