Search | arXiv e-print repository

Physics-Informed Neural Networks for Dynamic Process Operations with Limited Physical Knowledge and Data

Authors: Mehmet Velioglu, Song Zhai, Sophia Rupprecht, Alexander Mitsos, Andreas Jupke, Manuel Dahmen

Abstract: In chemical engineering, process data is often expensive to acquire, and complex phenomena are difficult to model rigorously, rendering both entirely data-driven and purely mechanistic modeling approaches impractical. We explore using physics-informed neural networks (PINNs) for modeling dynamic processes governed by differential-algebraic equation systems when process data is scarce and complete mechanistic knowledge is missing. In particular, we focus on estimating states for which neither direct observational data nor constitutive equations are available. For demonstration purposes, we study a continuously stirred tank reactor and a liquid-liquid separator. We find that PINNs can infer unmeasured states with reasonable accuracy, and they generalize better in low-data scenarios than purely data-driven models. We thus show that PINNs, similar to hybrid mechanistic/data-driven models, are capable of modeling processes when relatively few experimental data and only partially known mechanistic descriptions are available, and conclude that they constitute a promising avenue that warrants further investigation.

Submitted 3 June, 2024; originally announced June 2024.

Comments: manuscript (31 pages, 8 figures, 7 tables), supporting materials (11 pages, 3 figures, 3 tables)

arXiv:2405.14403 [pdf, other]

Representative electricity price profiles for European day-ahead and intraday spot markets

Authors: Chrysanthi Papadimitriou, Jan C. Schulze, Alexander Mitsos

Abstract: We propose a method to construct representative price profiles of the day-ahead (DA) and the intraday (ID) electricity spot markets and use this method to provide examples of ready-to-use price data sets. In contrast to common scenario generation approaches, the method is deterministic and relies on a small number of degrees of freedom, with the aim to be well defined and easy to use. We thereby target an enhanced comparability of future research studies on demand-side management and energy cost optimization. We construct the price profiles based on historical time series from the spot markets of interest, e.g., European Power Exchange (EPEX) spot. To this end, we extract key price components from the data while also accounting for known dominant mechanisms in the price variation. Further, the method is able to preserve key statistical features of the historical data (e.g., mean and standard deviation) when constructing the benchmark profile. Finally, our approach ensures comparability of ID and DA price profiles by design, as their cumulative (integral) price can be made identical if needed.

Submitted 23 May, 2024; originally announced May 2024.

Comments: Supplementary information (SI) included; Manuscript: 27 pages, 9 figures, 4 tables; SI: 7 pages, 5 figures, 2 tables

arXiv:2403.14425 [pdf, other]

Task-optimal data-driven surrogate models for eNMPC via differentiable simulation and optimization

Authors: Daniel Mayfrank, Na Young Ahn, Alexander Mitsos, Manuel Dahmen

Abstract: We present a method for end-to-end learning of Koopman surrogate models for optimal performance in control. In contrast to previous contributions that employ standard reinforcement learning (RL) algorithms, we use a training algorithm that exploits the potential differentiability of environments based on mechanistic simulation models. We evaluate the performance of our method by comparing it to that of other combinations of controller type and training algorithm on an eNMPC case study known from the literature. Our method exhibits superior performance on this problem, thereby constituting a promising avenue towards more capable controllers that employ dynamic surrogate models.

Submitted 21 March, 2024; originally announced March 2024.

Comments: 6 pages, 4 figures, 1 table

arXiv:2403.08376 [pdf, other]

Nonlinear Manifold Learning Determines Microgel Size from Raman Spectroscopy

Authors: Eleni D. Koronaki, Luise F. Kaven, Johannes M. M. Faust, Ioannis G. Kevrekidis, Alexander Mitsos

Abstract: Polymer particle size constitutes a crucial characteristic of product quality in polymerization. Raman spectroscopy is an established and reliable process analytical technology for in-line concentration monitoring. Recent approaches and some theoretical considerations show a correlation between Raman signals and particle sizes but do not determine polymer size from Raman spectroscopic measurements accurately and reliably. With this in mind, we propose three alternative machine learning workflows to perform this task, all involving diffusion maps, a nonlinear manifold learning technique for dimensionality reduction: (i) directly from diffusion maps, (ii) alternating diffusion maps, and (iii) conformal autoencoder neural networks. We apply the workflows to a data set of Raman spectra with associated size measured via dynamic light scattering of 47 microgel (cross-linked polymer) samples in a diameter range of 208 nm to 483 nm. The conformal autoencoders substantially outperform state-of-the-art methods and result, for the first time, in a promising prediction of polymer size from Raman spectra.

Submitted 13 March, 2024; originally announced March 2024.

Comments: 51 pages, 12 figures, 4 tables

arXiv:2403.03767 [pdf, other]

Predicting the Temperature Dependence of Surfactant CMCs Using Graph Neural Networks

Authors: Christoforos Brozos, Jan G. Rittig, Sandip Bhattacharya, Elie Akanny, Christina Kohlmann, Alexander Mitsos

Abstract: The critical micelle concentration (CMC) of surfactant molecules is an essential property for surfactant applications in industry. Recently, classical QSPR and Graph Neural Networks (GNNs), a deep learning technique, have been successfully applied to predict the CMC of surfactants at room temperature. However, these models have not yet considered the temperature dependency of the CMC, which is highly relevant for practical applications. We herein develop a GNN model for temperature-dependent CMC prediction of surfactants. We collect about 1400 data points from public sources for all surfactant classes, i.e., ionic, nonionic, and zwitterionic, at multiple temperatures. We test the predictive quality of the model for the following scenarios: i) when CMC data for surfactants are present in the training of the model for at least one different temperature, and ii) when CMC data for surfactants are not present in the training, i.e., generalizing to unseen surfactants. In both test scenarios, our model exhibits a high predictive performance of $R^2 \geq 0.94$ on test data. We also find that the model performance varies by surfactant class. Finally, we evaluate the model for sugar-based surfactants with complex molecular structures, as these represent a more sustainable alternative to synthetic surfactants and are therefore of great interest for future applications in the personal and home care industries.

Submitted 6 March, 2024; originally announced March 2024.

arXiv:2401.04508 [pdf, ps, other]

doi 10.1109/LCSYS.2022.3181443

Data-driven Nonlinear Model Reduction using Koopman Theory: Integrated Control Form and NMPC Case Study

Authors: Jan C. Schulze, Alexander Mitsos

Abstract: We use Koopman theory for data-driven model reduction of nonlinear dynamical systems with controls. We propose generic model structures combining delay-coordinate encoding of measurements and full-state decoding to integrate reduced Koopman modeling and state estimation. We present a deep-learning approach to train the proposed models. A case study demonstrates that our approach provides accurate control models and enables real-time capable nonlinear model predictive control of a high-purity cryogenic distillation column.

Submitted 9 January, 2024; originally announced January 2024.

Journal ref: IEEE Control Systems Letters, Vol. 6, 2022

arXiv:2401.01874 [pdf, other]

Graph Neural Networks for Surfactant Multi-Property Prediction

Authors: Christoforos Brozos, Jan G. Rittig, Sandip Bhattacharya, Elie Akanny, Christina Kohlmann, Alexander Mitsos

Abstract: Surfactants are of high importance in different industrial sectors such as cosmetics, detergents, oil recovery and drug delivery systems. Therefore, many quantitative structure-property relationship (QSPR) models have been developed for surfactants. Each predictive model typically focuses on one surfactant class, mostly nonionics. Graph Neural Networks (GNNs) have exhibited a great predictive performance for property prediction of ionic liquids, polymers and drugs in general. Specifically for surfactants, GNNs can successfully predict critical micelle concentration (CMC), a key surfactant property associated with micellization. A key factor in the predictive ability of QSPR and GNN models is the data available for training. Based on an extensive literature search, we create the largest available CMC database with 429 molecules and the first large data collection for surface excess concentration ($\Gamma_m$), another surfactant property associated with foaming, with 164 molecules. Then, we develop GNN models to predict the CMC and $\Gamma_m$ and we explore different learning approaches, i.e., single- and multi-task learning, as well as different training strategies, namely ensemble and transfer learning. We find that a multi-task GNN with ensemble learning trained on all $\Gamma_m$ and CMC data performs best. Finally, we test the ability of our CMC model to generalize to industrial-grade pure component surfactants. The GNN yields highly accurate predictions for CMC, showing great potential for future industrial applications.

Submitted 3 January, 2024; originally announced January 2024.

arXiv:2309.05386 [pdf, other]

Data-Driven Model Reduction and Nonlinear Model Predictive Control of an Air Separation Unit by Applied Koopman Theory

Authors: Jan C. Schulze, Danimir T. Doncevic, Nils Erwes, Alexander Mitsos

Abstract: Achieving real-time capability is an essential prerequisite for the industrial implementation of nonlinear model predictive control (NMPC). Data-driven model reduction offers a way to obtain low-order control models from complex digital twins. In particular, data-driven approaches require little expert knowledge of the particular process and its model, and provide reduced models of a well-defined generic structure. Herein, we apply our recently proposed data-driven reduction strategy based on Koopman theory [Schulze et al. (2022), Comput. Chem. Eng.] to generate a low-order control model of an air separation unit (ASU). The reduced Koopman model combines autoencoders and linear latent dynamics and is constructed using machine learning. Further, we present an NMPC implementation that uses derivative computation tailored to the fixed block structure of reduced Koopman models. Our reduction approach with tailored NMPC implementation enables real-time NMPC of an ASU with an average CPU time decrease of 98 %.

Submitted 11 September, 2023; originally announced September 2023.

Journal ref: Foundations of Computer Aided Process Operations / Chemical Process Control (FOCAPO), 2023

arXiv:2308.16724 [pdf, other]

Data-driven Product-Process Optimization of N-isopropylacrylamide Microgel Flow-Synthesis

Authors: Luise F. Kaven, Artur M. Schweidtmann, Jan Keil, Jana Israel, Nadja Wolter, Alexander Mitsos

Abstract: Microgels are cross-linked, colloidal polymer networks with great potential for stimuli-response release in drug-delivery applications, as their size in the nanometer range allows them to pass human cell boundaries. For applications with specified requirements regarding size, producing tailored microgels in a continuous flow reactor is advantageous because the microgel properties can be controlled tightly. However, no fully specified mechanistic models are available for continuous microgel synthesis, as the physical properties of the included components have only been partly studied. To address this gap and accelerate tailor-made microgel development, we propose a data-driven optimization in a hardware-in-the-loop approach to efficiently synthesize microgels with defined sizes. We optimize the synthesis regarding conflicting objectives (maximum production efficiency, minimum energy consumption, and the desired microgel radius) by applying Bayesian optimization via the solver "Thompson sampling efficient multi-objective optimization" (TS-EMO). We validate the optimization using the deterministic global solver "McCormick-based Algorithm for mixed-integer Nonlinear Global Optimization" (MAiNGO) and verify three computed Pareto optimal solutions via experiments. The proposed framework can be applied to other desired microgel properties and reactor setups and has the potential for efficient development by minimizing the number of experiments and the modelling effort needed.

Submitted 31 August, 2023; originally announced August 2023.

Comments: Manuscript: 24 pages, 8 figures; SI: 9 pages, 3 figures

arXiv:2308.01674 [pdf, other]

End-to-End Reinforcement Learning of Koopman Models for Economic Nonlinear Model Predictive Control

Authors: Daniel Mayfrank, Alexander Mitsos, Manuel Dahmen

Abstract: (Economic) nonlinear model predictive control ((e)NMPC) requires dynamic models that are sufficiently accurate and computationally tractable. Data-driven surrogate models for mechanistic models can reduce the computational burden of (e)NMPC; however, such models are typically trained by system identification for maximum prediction accuracy on simulation samples and perform suboptimally in (e)NMPC. We present a method for end-to-end reinforcement learning of Koopman surrogate models for optimal performance as part of (e)NMPC. We apply our method to two applications derived from an established nonlinear continuous stirred-tank reactor model. The controller performance is compared to that of (e)NMPCs utilizing models trained using system identification, and model-free neural network controllers trained using reinforcement learning. We show that the end-to-end trained models outperform those trained using system identification in (e)NMPC, and that, in contrast to the neural network controllers, the (e)NMPC controllers can react to changes in the control setting without retraining.

Submitted 22 March, 2024; v1 submitted 3 August, 2023; originally announced August 2023.

Comments: manuscript (18 pages, 7 figures, 5 tables), supplementary materials (3 pages, 2 tables)

arXiv:2306.07937 [pdf, other]

doi 10.1039/D3DD00103B

Gibbs-Duhem-Informed Neural Networks for Binary Activity Coefficient Prediction

Authors: Jan G. Rittig, Kobi C. Felton, Alexei A. Lapkin, Alexander Mitsos

Abstract: We propose Gibbs-Duhem-informed neural networks for the prediction of binary activity coefficients at varying compositions. That is, we include the Gibbs-Duhem equation explicitly in the loss function for training neural networks, which is straightforward in standard machine learning (ML) frameworks enabling automatic differentiation. In contrast to recent hybrid ML approaches, our approach does not rely on embedding a specific thermodynamic model inside the neural network and thus avoids the corresponding prediction limitations. Rather, Gibbs-Duhem consistency serves as regularization, with the flexibility of ML models being preserved. Our results show increased thermodynamic consistency and generalization capabilities for activity coefficient predictions by Gibbs-Duhem-informed graph neural networks and matrix completion methods. We also find that the model architecture, particularly the activation function, can have a strong influence on the prediction quality. The approach can be easily extended to account for other thermodynamic consistency conditions.

Submitted 14 September, 2023; v1 submitted 31 May, 2023; originally announced June 2023.
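
Note on the abstract above: as a rough illustration of what including the Gibbs-Duhem equation in the loss function can look like, the binary Gibbs-Duhem relation at constant temperature and pressure is written out below, followed by one possible penalized loss. The relation itself is standard thermodynamics; the symbols $\hat{\gamma}_i$ (model predictions), $\mathcal{L}_{\text{data}}$ (data-fit term), $\lambda$ (weighting factor), and $N$ (number of evaluation points) are assumptions introduced here for illustration and are not taken from the paper.

$$
x_1 \left(\frac{\partial \ln \gamma_1}{\partial x_1}\right)_{T,p} + (1 - x_1) \left(\frac{\partial \ln \gamma_2}{\partial x_1}\right)_{T,p} = 0
$$

$$
\mathcal{L} = \mathcal{L}_{\text{data}} + \lambda \, \frac{1}{N} \sum_{k=1}^{N} \left[ x_1^{(k)} \frac{\partial \ln \hat{\gamma}_1}{\partial x_1}\bigg|_{x_1^{(k)}} + \left(1 - x_1^{(k)}\right) \frac{\partial \ln \hat{\gamma}_2}{\partial x_1}\bigg|_{x_1^{(k)}} \right]^2
$$

In a sketch of this kind, the partial derivatives would be obtained by automatic differentiation of the network outputs with respect to the composition input, and the second term vanishes exactly when the predictions are Gibbs-Duhem consistent at the sampled compositions.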

arXiv:2211.12386 [pdf, other]

A Recursively Recurrent Neural Network (R2N2) Architecture for Learning Iterative Algorithms

Authors: Danimir T. Doncevic, Alexander Mitsos, Yue Guo, Qianxiao Li, Felix Dietrich, Manuel Dahmen, Ioannis G. Kevrekidis

Abstract: Meta-learning of numerical algorithms for a given task consists of the data-driven identification and adaptation of an algorithmic structure and the associated hyperparameters. To limit the complexity of the meta-learning problem, neural architectures with a certain inductive bias towards favorable algorithmic structures can, and should, be used. We generalize our previously introduced Runge-Kutta neural network to a recursively recurrent neural network (R2N2) superstructure for the design of customized iterative algorithms. In contrast to off-the-shelf deep learning approaches, it features a distinct division into modules for generation of information and for the subsequent assembly of this information towards a solution. Local information in the form of a subspace is generated by subordinate, inner, iterations of recurrent function evaluations starting at the current outer iterate. The update to the next outer iterate is computed as a linear combination of these evaluations, reducing the residual in this space, and constitutes the output of the network. We demonstrate that regular training of the weight parameters inside the proposed superstructure on input/output data of various computational problem classes yields iterations similar to Krylov solvers for linear equation systems, Newton-Krylov solvers for nonlinear equation systems, and Runge-Kutta integrators for ordinary differential equations. Due to its modularity, the superstructure can be readily extended with functionalities needed to represent more general classes of iterative algorithms traditionally based on Taylor series expansions.

Submitted 6 July, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

Comments: manuscript (22 pages, 9 figures), supporting information (11 pages, 9 figures)

arXiv:2208.04852 [pdf, other]

doi 10.1039/BK9781837670178-00159

Graph neural networks for the prediction of molecular structure-property relationships

Authors: Jan G. Rittig, Qinghe Gao, Manuel Dahmen, Alexander Mitsos, Artur M. Schweidtmann

Abstract: Molecular property prediction is of crucial importance in many disciplines such as drug discovery, molecular biology, or material and process design. The frequently employed quantitative structure-property/activity relationships (QSPRs/QSARs) characterize molecules by descriptors which are then mapped to the properties of interest via a linear or nonlinear model. In contrast, graph neural networks (GNNs), a novel machine learning method, directly work on the molecular graph, i.e., a graph representation where atoms correspond to nodes and bonds correspond to edges. GNNs allow properties to be learned in an end-to-end fashion, thereby avoiding the need for informative descriptors as in QSPRs/QSARs. GNNs have been shown to achieve state-of-the-art prediction performance on various property prediction tasks and represent an active field of research. We describe the fundamentals of GNNs and demonstrate the application of GNNs via two examples for molecular property prediction.

Submitted 25 July, 2022; originally announced August 2022.

Journal ref: Machine Learning and Hybrid Modelling for Reaction Engineering, Royal Society of Chemistry, ISBN 978-1-83916-563-4, 159-181, 2023

arXiv:2207.13779 [pdf, other]

doi 10.1016/j.compchemeng.2023.108202

Physical Pooling Functions in Graph Neural Networks for Molecular Property Prediction

Authors: Artur M. Schweidtmann, Jan G. Rittig, Jana M. Weber, Martin Grohe, Manuel Dahmen, Kai Leonhard, Alexander Mitsos

Abstract: Graph neural networks (GNNs) are emerging in chemical engineering for the end-to-end learning of physicochemical properties based on molecular graphs. A key element of GNNs is the pooling function which combines atom feature vectors into molecular fingerprints. Most previous works use a standard pooling function to predict a variety of properties. However, unsuitable pooling functions can lead to unphysical GNNs that poorly generalize. We compare and select meaningful GNN pooling methods based on physical knowledge about the learned properties. The impact of physical pooling functions is demonstrated with molecular properties calculated from quantum mechanical computations. We also compare our results to the recent set2set pooling approach. We recommend using sum pooling for the prediction of properties that depend on molecular size and compare pooling functions for properties that are molecular size-independent. Overall, we show that the use of physical pooling functions significantly enhances generalization.

Submitted 27 July, 2022; originally announced July 2022.

Journal ref: Computers and Chemical Engineering Volume 172, April 2023, 108202

arXiv:2206.11776 [pdf, other]

doi 10.1016/j.compchemeng.2023.108153

Graph Neural Networks for Temperature-Dependent Activity Coefficient Prediction of Solutes in Ionic Liquids

Authors: Jan G. Rittig, Karim Ben Hicham, Artur M. Schweidtmann, Manuel Dahmen, Alexander Mitsos

Abstract: Ionic liquids (ILs) are important solvents for sustainable processes and predicting activity coefficients (ACs) of solutes in ILs is needed. Recently, matrix completion methods (MCMs), transformers, and graph neural networks (GNNs) have shown high accuracy in predicting ACs of binary mixtures, superior to well-established models, e.g., COSMO-RS and UNIFAC. GNNs are particularly promising here as they learn a molecular graph-to-property relationship without pretraining, typically required for transformers, and are, unlike MCMs, applicable to molecules not included in training. For ILs, however, GNN applications are currently missing. Herein, we present a GNN to predict temperature-dependent infinite dilution ACs of solutes in ILs. We train the GNN on a database including more than 40,000 AC values and compare it to a state-of-the-art MCM. The GNN and MCM achieve similarly high prediction performance, with the GNN additionally enabling high-quality predictions for ACs of solutions that contain ILs and solutes not considered during training.

Submitted 23 June, 2022; originally announced June 2022.

Comments: 16 pages, 4 figures, 5 tables

Journal ref: Computers & Chemical Engineering 171, 108153, 2023

arXiv:2206.00619 [pdf, other]

doi 10.1002/aic.17971

Graph Machine Learning for Design of High-Octane Fuels

Authors: Jan G. Rittig, Martin Ritzert, Artur M. Schweidtmann, Stefanie Winkler, Jana M. Weber, Philipp Morsch, K. Alexander Heufer, Martin Grohe, Alexander Mitsos, Manuel Dahmen

Abstract: Fuels with high knock resistance enable modern spark-ignition engines to achieve high efficiency and thus low CO2 emissions. Identification of molecules with desired autoignition properties indicated by a high research octane number and a high octane sensitivity is therefore of great practical relevance and can be supported by computer-aided molecular design (CAMD). Recent developments in the field of graph machine learning (graph-ML) provide novel, promising tools for CAMD. We propose a modular graph-ML CAMD framework that integrates generative graph-ML models with graph neural networks and optimization, enabling the design of molecules with desired ignition properties in a continuous molecular space. In particular, we explore the potential of Bayesian optimization and genetic algorithms in combination with generative graph-ML models. The graph-ML CAMD framework successfully identifies well-established high-octane components. It also suggests new candidates, one of which we experimentally investigate and use to illustrate the need for further auto-ignition training data.

Submitted 14 October, 2022; v1 submitted 1 June, 2022; originally announced June 2022.

Comments: manuscript (26 pages, 9 figures, 2 tables), supporting information (12 pages, 8 figures, 1 table)

Journal ref: AIChE Journal 69 (4), e17971, 2023

arXiv:2205.13826 [pdf, other]

Multivariate Probabilistic Forecasting of Intraday Electricity Prices using Normalizing Flows

Authors: Eike Cramer, Dirk Witthaut, Alexander Mitsos, Manuel Dahmen

Abstract: Electricity is traded on various markets with different time horizons and regulations. Short-term intraday trading becomes increasingly important due to the higher penetration of renewables. In Germany, the intraday electricity price typically fluctuates around the day-ahead price of the European Power EXchange (EPEX) spot markets in a distinct hourly pattern. This work proposes a probabilistic modeling approach that models the intraday price difference to the day-ahead contracts. The model captures the emerging hourly pattern by considering the four 15 min intervals in each day-ahead price interval as a four-dimensional joint probability distribution. The resulting nontrivial, multivariate price difference distribution is learned using a normalizing flow, i.e., a deep generative model that combines conditional multivariate density estimation and probabilistic regression. Furthermore, this work discusses the influence of different external impact factors based on literature insights and impact analysis using explainable artificial intelligence (XAI). The normalizing flow is compared to an informed selection of historical data and probabilistic forecasts using a Gaussian copula and a Gaussian regression model. Among the different models, the normalizing flow identifies the trends with the highest accuracy and has the narrowest prediction intervals. Both the XAI analysis and the empirical experiments highlight that the immediate history of the price difference realization and the increments of the day-ahead price have the most substantial impact on the price difference.

Submitted 10 March, 2023; v1 submitted 27 May, 2022; originally announced May 2022.

Comments: manuscript (20 pages, 11 figures, 5 tables), supporting information (8 pages, 5 figures, 4 tables)

arXiv:2204.02242 [pdf, other]

Normalizing Flow-based Day-Ahead Wind Power Scenario Generation for Profitable and Reliable Delivery Commitments by Wind Farm Operators

Authors: Eike Cramer, Leonard Paeleke, Alexander Mitsos, Manuel Dahmen

Abstract: We present a specialized scenario generation method that utilizes forecast information to generate scenarios for day-ahead scheduling problems. In particular, we use normalizing flows to generate wind power scenarios by sampling from a conditional distribution that uses wind speed forecasts to tailor the scenarios to a specific day. We apply the generated scenarios in a stochastic day-ahead bidding problem of a wind electricity producer and analyze whether the scenarios yield profitable decisions. Compared to Gaussian copulas and Wasserstein-generative adversarial networks, the normalizing flow successfully narrows the range of scenarios around the daily trends while maintaining a diverse variety of possible realizations. In the stochastic day-ahead bidding problem, the conditional scenarios from all methods lead to significantly more stable profitable results compared to an unconditional selection of historical scenarios. The normalizing flow consistently obtains the highest profits, even for small sets of scenarios.

Submitted 11 July, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

Comments: manuscript (18 pages, 7 figures, 6 tables), supporting information (2 pages, 1 figure, 1 table)

arXiv:2203.03934 [pdf, other]

Nonlinear Isometric Manifold Learning for Injective Normalizing Flows

Authors: Eike Cramer, Felix Rauh, Alexander Mitsos, Raúl Tempone, Manuel Dahmen

Abstract: To model manifold data using normalizing flows, we employ isometric autoencoders to design embeddings with explicit inverses that do not distort the probability distribution. Using isometries separates manifold learning and density estimation and enables training of both parts to high accuracy. Thus, model selection and tuning are simplified compared to existing injective normalizing flows. Applied to data sets on (approximately) flat manifolds, the combined approach generates high-quality data.

Submitted 8 May, 2023; v1 submitted 8 March, 2022; originally announced March 2022.

Comments: 11 pages, 7 figures, 4 tables

arXiv:2110.14451 [pdf, other]

Validation Methods for Energy Time Series Scenarios from Deep Generative Models

Authors: Eike Cramer, Leonardo Rydin Gorjão, Alexander Mitsos, Benjamin Schäfer, Dirk Witthaut, Manuel Dahmen

Abstract: The design and operation of modern energy systems are heavily influenced by time-dependent and uncertain parameters, e.g., renewable electricity generation, load-demand, and electricity prices. These are typically represented by a set of discrete realizations known as scenarios. A popular scenario generation approach uses deep generative models (DGM) that allow scenario generation without prior assumptions about the data distribution. However, the validation of generated scenarios is difficult, and a comprehensive discussion about appropriate validation methods is currently lacking. To start this discussion, we provide a critical assessment of the currently used validation methods in the energy scenario generation literature. In particular, we assess validation methods based on probability density, auto-correlation, and power spectral density. Furthermore, we propose using the multifractal detrended fluctuation analysis (MFDFA) as an additional validation method for non-trivial features like peaks, bursts, and plateaus. As representative examples, we train generative adversarial networks (GANs), Wasserstein GANs (WGANs), and variational autoencoders (VAEs) on two renewable power generation time series (photovoltaic and wind from Germany in 2013 to 2015) and an intra-day electricity price time series from the European Energy Exchange in 2017 to 2019. We apply the four validation methods to both the historical and the generated data and discuss the interpretation of validation results as well as common mistakes, pitfalls, and limitations of the validation methods. Our assessment shows that no single method sufficiently characterizes a scenario but ideally validation should include multiple methods and be interpreted carefully in the context of scenarios over short time periods.

Submitted 15 December, 2021; v1 submitted 27 October, 2021; originally announced October 2021.

Comments: 20 pages, 8 figures, 2 tables

arXiv:2104.10410 [pdf, other]

Principal Component Density Estimation for Scenario Generation Using Normalizing Flows

Authors: Eike Cramer, Alexander Mitsos, Raul Tempone, Manuel Dahmen

Abstract: Neural networks-based learning of the distribution of non-dispatchable renewable electricity generation from sources such as photovoltaics (PV) and wind as well as load demands has recently gained attention. Normalizing flow density models are particularly well suited for this task due to the training through direct log-likelihood maximization. However, research from the field of image generation has shown that standard normalizing flows can only learn smeared-out versions of manifold distributions. Previous works on normalizing flow-based scenario generation do not address this issue, and the smeared-out distributions result in the sampling of noisy time series. In this paper, we exploit the isometry of the principal component analysis (PCA), which sets up the normalizing flow in a lower-dimensional space while maintaining the direct and computationally efficient likelihood maximization. We train the resulting principal component flow (PCF) on data of PV and wind power generation as well as load demand in Germany in the years 2013 to 2015. The results of this investigation show that the PCF preserves critical features of the original distributions, such as the probability density and frequency behavior of the time series. The application of the PCF is, however, not limited to renewable power generation but rather extends to any data set, time series, or otherwise, which can be efficiently reduced using PCA.

Submitted 7 January, 2022; v1 submitted 21 April, 2021; originally announced April 2021.

Comments: 18 pages, 7 figures

arXiv:2102.03782 [pdf, other]

Using Gaussian Processes to Design Dynamic Experiments for Black-Box Model Discrimination under Uncertainty

Authors: Simon Olofsson, Eduardo S. Schultz, Adel Mhamdi, Alexander Mitsos, Marc Peter Deisenroth, Ruth Misener

Abstract: Diverse domains of science and engineering use parameterised mechanistic models. Engineers and scientists can often hypothesise several rival models to explain a specific process or phenomenon. Consider a model discrimination setting where we wish to find the best mechanistic, dynamic model candidate and the best model parameter estimates. Typically, several rival mechanistic models can explain the available data, so design of dynamic experiments for model discrimination helps optimally collect additional data by finding experimental settings that maximise model prediction divergence. We argue there are two main approaches in the literature for solving the optimal design problem: (i) the analytical approach, using linear and Gaussian approximations to find closed-form expressions for the design objective, and (ii) the data-driven approach, which often relies on computationally intensive Monte Carlo techniques. Olofsson et al. (ICML 35, 2018) introduced Gaussian process (GP) surrogate models to hybridise the analytical and data-driven approaches, which allowed for computationally efficient design of experiments for discriminating between black-box models. In this study, we demonstrate that we can extend existing methods for optimal design of dynamic experiments to incorporate a wider range of problem uncertainty. We also extend the Olofsson et al. (2018) method of using GP surrogate models for discriminating between dynamic black-box models. We evaluate our approach on a well-known case study from literature, and explore the consequences of using GP surrogates to approximate gradient-based methods.

Submitted 31 October, 2021; v1 submitted 7 February, 2021; originally announced February 2021.

arXiv:2005.10902 [pdf, other]

doi 10.1007/s12532-021-00204-y

Global Optimization of Gaussian processes

Authors: Artur M. Schweidtmann, Dominik Bongartz, Daniel Grothe, Tim Kerkenhoff, Xiaopeng Lin, Jaromil Najman, Alexander Mitsos

Abstract: Gaussian processes (Kriging) are interpolating data-driven models that are frequently applied in various disciplines. Often, Gaussian processes are trained on datasets and are subsequently embedded as surrogate models in optimization problems. These optimization problems are nonconvex and global optimization is desired. However, previous literature observed computational burdens limiting deterministic global optimization to Gaussian processes trained on few data points. We propose a reduced-space formulation for deterministic global optimization with trained Gaussian processes embedded. For optimization, the branch-and-bound solver branches only on the degrees of freedom and McCormick relaxations are propagated through explicit Gaussian process models. The approach also leads to significantly smaller and computationally cheaper subproblems for lower and upper bounding. To further accelerate convergence, we derive envelopes of common covariance functions for GPs and tight relaxations of acquisition functions used in Bayesian optimization including expected improvement, probability of improvement, and lower confidence bound. In total, we reduce computational time by orders of magnitude compared to state-of-the-art methods, thus overcoming previous computational burdens. We demonstrate the performance and scaling of the proposed method and apply it to Bayesian optimization with global optimization of the acquisition function and chance-constrained programming. The Gaussian process models, acquisition functions, and training scripts are available open-source within the "MeLOn - Machine Learning Models for Optimization" toolbox (https://git.rwth-aachen.de/avt.svt/public/MeLOn).

Submitted 21 May, 2020; originally announced May 2020.

MSC Class: 90C26; 90C30; 90C90; 68T01; 60-04

Journal ref: Math. Prog. Comp. 13, 553-581 (2021)

Showing 1–23 of 23 results for author: Mitsos, A