
PlasmoData.jl -- A Julia Framework for Modeling and Analyzing Complex Data as Graphs

Abstract: Datasets encountered in scientific and engineering applications appear in complex formats (e.g., images, multivariate time series, molecules, video, text strings, networks). Graph theory provides a unifying framework to model such datasets and enables the use of powerful tools that can help analyze, visualize, and extract value from data. In this work, we present PlasmoData.jl, an open-source, Julia framework that uses concepts of graph theory to facilitate the modeling and analysis of complex datasets. The core of our framework is a general data modeling abstraction, which we call a DataGraph. We show how the abstraction and software implementation can be used to represent diverse data objects as graphs and to enable the use of tools from topology, graph theory, and machine learning (e.g., graph neural networks) to conduct a variety of tasks. We illustrate the versatility of the framework by using real datasets: i) an image classification problem using topological data analysis to extract features from the graph model to train machine learning models; ii) a disease outbreak problem where we model multivariate time series as graphs to detect abnormal events; and iii) a technology pathway analysis problem where we highlight how we can use graphs to navigate connectivity. Our discussion also highlights how PlasmoData.jl leverages native Julia capabilities to enable compact syntax, scalable computations, and interfaces with diverse packages.

Submitted 10 May, 2024; v1 submitted 21 January, 2024; originally announced January 2024.

Comments: 62 pages, 18 figures, 8 tables

arXiv:2311.11740 [pdf, other]

doi 10.1039/D3DD00226H

A Fast and Scalable Computational Topology Framework for the Euler Characteristic

Authors: Daniel J. Laky, Victor M. Zavala

Abstract: The Euler characteristic (EC) is a powerful topological descriptor that can be used to quantify the shape of data objects that are represented as fields/manifolds. Fast methods for computing the EC are required to enable processing of high-throughput data and real-time implementations. This represents a challenge when processing high-resolution 2D field data (e.g., images) and 3D field data (e.g., video, hyperspectral images, and space-time data obtained from fluid dynamics and molecular simulations). In this work, we present parallel algorithms (and software implementations) to enable fast computations of the EC for 2D and 3D fields using vertex contributions. We test the proposed algorithms using synthetic data objects and data objects arising in real applications such as microscopy, 3D molecular dynamics simulations, and hyperspectral images. Results show that the proposed implementation can compute the EC a couple of orders of magnitude faster than ${\tt GUDHI}$ (an off-the-shelf and state-of-the-art tool) and at speeds comparable to ${\tt CHUNKYEuler}$ (a tool tailored to scalable computation of the EC). The vertex contributions approach is flexible in that it computes the EC as well as other topological descriptors such as perimeter, area, and volume (${\tt CHUNKYEuler}$ can only compute the EC). Scalability with respect to memory use is also addressed by providing low-memory versions of the algorithms; this enables processing of data objects beyond the size of dynamic memory.
All data and software needed for reproducing the results are shared as open-source code.

Submitted 20 November, 2023; originally announced November 2023.

arXiv:2311.11254 [pdf, other]

BOIS: Bayesian Optimization of Interconnected Systems

Authors: Leonardo D. González, Victor M. Zavala

Abstract: Bayesian optimization (BO) has proven to be an effective paradigm for the global optimization of expensive-to-sample systems. One of the main advantages of BO is its use of Gaussian processes (GPs) to characterize model uncertainty which can be leveraged to guide the learning and search process. However, BO typically treats systems as black-boxes and this limits the ability to exploit structural knowledge (e.g., physics and sparse interconnections). Composite functions of the form $f(x, y(x))$, wherein GP modeling is shifted from the performance function $f$ to an intermediate function $y$, offer an avenue for exploiting structural knowledge. However, the use of composite functions in a BO framework is complicated by the need to generate a probability density for $f$ from the Gaussian density of $y$ calculated by the GP (e.g., when $f$ is nonlinear it is not possible to obtain a closed-form expression). Previous work has handled this issue using sampling techniques; these are easy to implement and flexible but are computationally intensive. In this work, we introduce a new paradigm which allows for the efficient use of composite functions in BO; this uses adaptive linearizations of $f$ to obtain closed-form expressions for the statistical moments of the composite function. We show that this simple approach (which we call BOIS) enables the exploitation of structural knowledge, such as that arising in interconnected systems as well as systems that embed multiple GP models and combinations of physics and GP models.
Using a chemical process optimization case study, we benchmark the effectiveness of BOIS against standard BO and sampling approaches. Our results indicate that BOIS achieves performance gains and accurately captures the statistics of composite functions.

Submitted 28 November, 2023; v1 submitted 19 November, 2023; originally announced November 2023.

Comments: 6 pages, 5 figures

arXiv:2307.10438 [pdf]

doi 10.1039/D4DD00088A

Uncertainty Quantification for Molecular Property Predictions with Graph Neural Architecture Search

Authors: Shengli Jiang, Shiyi Qin, Reid C. Van Lehn, Prasanna Balaprakash, Victor M. Zavala

Abstract: Graph Neural Networks (GNNs) have emerged as a prominent class of data-driven methods for molecular property prediction. However, a key limitation of typical GNN models is their inability to quantify uncertainties in the predictions. This capability is crucial for ensuring the trustworthy use and deployment of models in downstream tasks. To that end, we introduce AutoGNNUQ, an automated uncertainty quantification (UQ) approach for molecular property prediction. AutoGNNUQ leverages architecture search to generate an ensemble of high-performing GNNs, enabling the estimation of predictive uncertainties. Our approach employs variance decomposition to separate data (aleatoric) and model (epistemic) uncertainties, providing valuable insights for reducing them. In our computational experiments, we demonstrate that AutoGNNUQ outperforms existing UQ methods in terms of both prediction accuracy and UQ performance on multiple benchmark datasets. Additionally, we utilize t-SNE visualization to explore correlations between molecular features and uncertainty, offering insight for dataset improvement. AutoGNNUQ has broad applicability in domains such as drug discovery and materials science, where accurate uncertainty quantification is crucial for decision-making.

Submitted 28 June, 2024; v1 submitted 19 July, 2023; originally announced July 2023.

arXiv:2302.04991 [pdf, other]

A Graph-Based Modeling Framework for Tracing Hydrological Pollutant Transport in Surface Waters

Authors: David L. Cole, Gerardo J. Ruiz-Mercado, Victor M. Zavala

Abstract: Anthropogenic pollution of hydrological systems affects diverse communities and ecosystems around the world. Data analytics and modeling tools play a key role in fighting this challenge, as they can help identify key sources as well as trace transport and quantify impact within complex hydrological systems. Several tools exist for simulating and tracing pollutant transport throughout surface waters using detailed physical models; these tools are powerful, but can be computationally intensive, require significant amounts of data to be developed, and require expert knowledge for their use (ultimately limiting application scope). In this work, we present a graph modeling framework -- which we call ${\tt HydroGraphs}$ -- for understanding pollutant transport and fate across waterbodies, rivers, and watersheds. This framework uses a simplified representation of hydrological systems that can be constructed based purely on open-source data (National Hydrography Dataset and Watershed Boundary Dataset). The graph representation provides a flexible, intuitive approach for capturing connectivity, identifying upstream pollutant sources, and tracing downstream impacts within small and large hydrological systems. Moreover, the graph representation can facilitate the use of advanced algorithms and tools of graph theory, topology, optimization, and machine learning to aid data analytics and decision-making.
We demonstrate the capabilities of our framework by using case studies in the State of Wisconsin; here, we aim to identify upstream nutrient pollutant sources that arise from agricultural practices and trace downstream impacts to waterbodies, rivers, and streams. Our tool ultimately seeks to help stakeholders design effective pollution prevention/mitigation practices and evaluate how surface waters respond to such practices.

Submitted 22 September, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

Comments: 41 pages, 9 figures; minor update to analysis (e.g., urban land cover to case studies; overall results remain unchanged)

arXiv:2212.11571 [pdf, other]

Scalable Primal Decomposition Schemes for Large-Scale Infrastructure Networks

Authors: Alexander Engelmann, Sungho Shin, François Pacaud, Victor M. Zavala

Abstract: The real-time operation of large-scale infrastructure networks requires scalable optimization capabilities. Decomposition schemes can help achieve scalability; classical decomposition approaches such as the alternating direction method of multipliers (ADMM) and distributed Newton schemes, however, often either suffer from slow convergence or might require high degrees of communication. In this work, we present new primal decomposition schemes for solving large-scale, strongly convex QPs. These approaches have global convergence guarantees and require limited communication. We benchmark their performance against the off-the-shelf interior-point method Ipopt and against ADMM on infrastructure networks that contain up to 300,000 decision variables and constraints. Overall, we find that the proposed approaches solve problems as fast as Ipopt but with reduced communication. Moreover, we find that the proposed schemes achieve higher accuracy than ADMM approaches.

Submitted 22 December, 2022; originally announced December 2022.

arXiv:2210.07848 [pdf, other]

Convolutional Neural Networks: Basic Concepts and Applications in Manufacturing

Authors: Shengli Jiang, Shiyi Qin, Joshua L. Pulsipher, Victor M. Zavala

Abstract: We discuss basic concepts of convolutional neural networks (CNNs) and outline uses in manufacturing. We begin by discussing how different types of data objects commonly encountered in manufacturing (e.g., time series, images, micrographs, videos, spectra, molecular structures) can be represented in a flexible manner using tensors and graphs. We then discuss how CNNs use convolution operations to extract informative features (e.g., geometric patterns and textures) from such representations to predict emergent properties and phenomena and/or to identify anomalies. We also discuss how CNNs can exploit color as a key source of information, which enables the use of modern computer vision hardware (e.g., infrared, thermal, and hyperspectral cameras). We illustrate the concepts using diverse case studies arising in spectral analysis, molecule design, sensor design, image-based control, and multivariate process monitoring.

Submitted 14 October, 2022; originally announced October 2022.

arXiv:2210.01071 [pdf, other]

doi 10.1016/j.compchemeng.2022.108110

New Paradigms for Exploiting Parallel Experiments in Bayesian Optimization

Authors: Leonardo D. González, Victor M. Zavala

Abstract: Bayesian optimization (BO) is one of the most effective methods for closed-loop experimental design and black-box optimization. However, a key limitation of BO is that it is an inherently sequential algorithm (one experiment is proposed per round) and thus cannot directly exploit high-throughput (parallel) experiments. Diverse modifications to the BO framework have been proposed in the literature to enable exploitation of parallel experiments but such approaches are limited in the degree of parallelization that they can achieve and can lead to redundant experiments (thus wasting resources and potentially compromising performance). In this work, we present new parallel BO paradigms that exploit the structure of the system to partition the design space. Specifically, we propose an approach that partitions the design space by following the level sets of the performance function and an approach that exploits partially separable structures found in the performance function. We conduct extensive numerical experiments using a reactor case study to benchmark the effectiveness of these approaches against a variety of state-of-the-art parallel algorithms reported in the literature. Our computational results show that our approaches significantly reduce the required search time and increase the probability of finding a global (rather than local) solution.

Submitted 9 December, 2022; v1 submitted 3 October, 2022; originally announced October 2022.

Comments: 36 pages, 16 figures, 8 algorithms

arXiv:2202.01816 [pdf, other]

SAFE-OCC: A Novelty Detection Framework for Convolutional Neural Network Sensors and its Application in Process Control

Authors: Joshua L. Pulsipher, Luke D. J. Coutinho, Tyler A. Soderstrom, Victor M. Zavala

Abstract: We present a novelty detection framework for Convolutional Neural Network (CNN) sensors that we call Sensor-Activated Feature Extraction One-Class Classification (SAFE-OCC). We show that this framework enables the safe use of computer vision sensors in process control architectures. Emergent control applications use CNN models to map visual data to a state signal that can be interpreted by the controller. Incorporating such sensors introduces a significant system operation vulnerability because CNN sensors can exhibit high prediction errors when exposed to novel (abnormal) visual data. Unfortunately, identifying such novelties in real-time is nontrivial. To address this issue, the SAFE-OCC framework leverages the convolutional blocks of the CNN to create an effective feature space to conduct novelty detection using a desired one-class classification technique. This approach engenders a feature space that directly corresponds to that used by the CNN sensor and avoids the need to derive an independent latent space. We demonstrate the effectiveness of SAFE-OCC via simulated control environments.

Submitted 3 February, 2022; originally announced February 2022.

arXiv:2101.04869 [pdf, ps, other]

doi 10.1002/aic.17282

Convolutional Neural Nets in Chemical Engineering: Foundations, Computations, and Applications

Authors: Shengli Jiang, Victor M. Zavala

Abstract: In this paper we review the mathematical foundations of convolutional neural nets (CNNs) with the goals of: i) highlighting connections with techniques from statistics, signal processing, linear algebra, differential equations, and optimization, ii) demystifying underlying computations, and iii) identifying new types of applications. CNNs are powerful machine learning models that highlight features from grid data to make predictions (regression and classification). The grid data object can be represented as vectors (in 1D), matrices (in 2D), or tensors (in 3D or higher dimensions) and can incorporate multiple channels (thus providing high flexibility in the input data representation). CNNs highlight features from the grid data by performing convolution operations with different types of operators. The operators highlight different types of features (e.g., patterns, gradients, geometrical features) and are learned by using optimization techniques. In other words, CNNs seek to identify optimal operators that best map the input data to the output data. A common misconception is that CNNs are only capable of processing image or video data but their application scope is much wider; specifically, datasets encountered in diverse applications can be expressed as grid data. Here, we show how to apply CNNs to new types of applications such as optimal control, flow cytometry, multivariate process monitoring, and molecular simulations.

Submitted 7 July, 2021; v1 submitted 12 January, 2021; originally announced January 2021.

Journal ref: AIChE J. 2021; e17282

arXiv:2012.11790 [pdf, other]

A Dynamic Penalty Function Approach for Constraints-Handling in Reinforcement Learning

Authors: Haeun Yoo, Victor M. Zavala, Jay H. Lee

Abstract: Reinforcement learning (RL) is attracting attention as an effective way to solve sequential optimization problems that involve high-dimensional state/action spaces and stochastic uncertainties. Many such problems involve inequality constraints. This study focuses on using RL to solve constrained optimal control problems. Most RL application studies have dealt with inequality constraints by adding soft penalty terms for violating the constraints to the reward function. However, while training neural networks to learn the value (or Q) function, one can run into computational issues caused by the sharp change in the function value at the constraint boundary due to the large penalty imposed. This difficulty during training can lead to convergence problems and ultimately lead to poor closed-loop performance. To address this issue, this study proposes a dynamic penalty (DP) approach where the penalty factor is gradually and systematically increased during training as the iteration episodes proceed. We first examine the ability of a neural network to represent a value function when uniform, linear, or DP functions are added to prevent constraint violation. The agent trained by a Deep Q Network (DQN) algorithm with the DP function approach was compared with agents with other constant penalty functions in a simple vehicle control problem. Results show that the proposed approach can improve the neural network approximation accuracy and provide faster convergence when close to a solution.

Submitted 31 March, 2021; v1 submitted 21 December, 2020; originally announced December 2020.

Comments: Submitted to ADCHEM 2021

arXiv:2005.06674 [pdf, other]

On the Convergence of Overlapping Schwarz Decomposition for Nonlinear Optimal Control

Authors: Sen Na, Sungho Shin, Mihai Anitescu, Victor M. Zavala

Abstract: We study the convergence properties of an overlapping Schwarz decomposition algorithm for solving nonlinear optimal control problems (OCPs). The algorithm decomposes the time domain into a set of overlapping subdomains, and solves all subproblems defined over subdomains in parallel. The convergence is attained by updating primal-dual information at the boundaries of overlapping subdomains. We show that the algorithm exhibits local linear convergence, and that the convergence rate improves exponentially with the overlap size. We also establish global convergence results for general quadratic programs, which enables the application of the Schwarz scheme inside second-order optimization algorithms (e.g., sequential quadratic programming). The theoretical foundation of our convergence analysis is a sensitivity result of nonlinear OCPs, which we call "exponential decay of sensitivity" (EDS). Intuitively, EDS states that the impact of perturbations at domain boundaries (i.e., initial and terminal time) on the solution decays exponentially as one moves into the domain. Here, we expand a previous analysis available in the literature by showing that EDS holds for both primal and dual solutions of nonlinear OCPs, under a uniform second-order sufficient condition, a controllability condition, and a boundedness condition. We conduct experiments with a quadrotor motion planning problem and a PDE control problem to validate our theory, and show that the approach is significantly more efficient than ADMM and as efficient as the centralized solver Ipopt.

Submitted 14 March, 2022; v1 submitted 13 May, 2020; originally announced May 2020.

Comments: 16 pages

arXiv:2003.05928 [pdf, ps, other]

On the Convergence of the Dynamic Inner PCA Algorithm

Authors: Sungho Shin, Alex D. Smith, S. Joe Qin, Victor M. Zavala

Abstract: Dynamic inner principal component analysis (DiPCA) is a powerful method for the analysis of time-dependent multivariate data. DiPCA extracts dynamic latent variables that capture the most dominant temporal trends by solving a large-scale, dense, and nonconvex nonlinear program (NLP). A scalable decomposition algorithm has been recently proposed in the literature to solve these challenging NLPs. The decomposition algorithm performs well in practice but its convergence properties are not well understood. In this work, we show that this algorithm is a specialized variant of a coordinate maximization algorithm. This observation allows us to explain why the decomposition algorithm might work (or not) in practice and can guide improvements. We compare the performance of the decomposition strategies with that of the off-the-shelf solver Ipopt. The results show that decomposition is more scalable and, surprisingly, delivers higher quality solutions.

Submitted 12 March, 2020; originally announced March 2020.

Journal ref: In Proceedings of Foundations of Process Analytics and Machine Learning, 2019

arXiv:1606.00350 [pdf, ps, other]

Data Centers as Dispatchable Loads to Harness Stranded Power

Authors: Kibaek Kim, Fan Yang, Victor M. Zavala, Andrew A. Chien

Abstract: We analyze how both traditional data center integration and dispatchable load integration affect power grid efficiency. We use detailed network models, parallel optimization solvers, and thousands of renewable generation scenarios to perform our analysis. Our analysis reveals that significant spillage and stranded power will be observed in power grids as wind power levels are increased. A counter-intuitive finding is that collocating data centers with inflexible loads next to wind farms has limited impacts on renewable portfolio standard (RPS) goals because it provides limited system-level flexibility and can in fact increase stranded power and fossil-fueled generation. In contrast, optimally placing data centers that are dispatchable (with flexible loads) provides system-wide flexibility, reduces stranded power, and improves efficiency. In short, optimally placed dispatchable computing loads can enable better scaling to high RPS. We show that these dispatchable computing loads are powered to 60–80% of their requested capacity, indicating that there are significant economic incentives provided by stranded power.

Submitted 1 June, 2016; originally announced June 2016.

Showing 1–14 of 14 results for author: Zavala, V M