Recency-Weighted Temporally-Segmented Ensemble for Time-Series Modeling

Authors: Pål V. Johnsen, Eivind Bøhn, Sølve Eidnes, Filippo Remonato, Signe Riemer-Sørensen

Abstract: Time-series modeling in process industries faces the challenge of dealing with complex, multi-faceted, and evolving data characteristics. Conventional single model approaches often struggle to capture the interplay of diverse dynamics, resulting in suboptimal forecasts. Addressing this, we introduce the Recency-Weighted Temporally-Segmented (ReWTS, pronounced 'roots') ensemble model, a novel chunk-based approach for multi-step forecasting. The key characteristics of the ReWTS model are twofold: 1) It facilitates specialization of models into different dynamics by segmenting the training data into 'chunks' of data and training one model per chunk. 2) During inference, an optimization procedure assesses each model on the recent past and selects the active models, such that the appropriate mixture of previously learned dynamics can be recalled to forecast the future. This method not only captures the nuances of each period, but also adapts more effectively to changes over time compared to conventional 'global' models trained on all data in one go. We present a comparative analysis, utilizing two years of data from a wastewater treatment plant and a drinking water treatment plant in Norway, demonstrating the ReWTS ensemble's superiority. It consistently outperforms the global model in terms of mean squared forecasting error across various model architectures by 10-70% on both datasets, notably exhibiting greater resilience to outliers. This approach shows promise in developing automatic, adaptable forecasting models for decision-making and control systems in process industries and other complex systems.

Submitted 4 March, 2024; originally announced March 2024.

Comments: Main article with 23 pages including 12 figures and 4 tables. Supplementary File with 11 pages including 9 figures

arXiv:2208.14037

doi 10.1007/s43681-022-00251-8

Towards Artificial Virtuous Agents: Games, Dilemmas and Machine Learning

Authors: Ajay Vishwanath, Einar Duenger Bøhn, Ole-Christoffer Granmo, Charl Maree, Christian Omlin

Abstract: Machine ethics has received increasing attention over the past few years because of the need to ensure safe and reliable artificial intelligence (AI). The two dominantly used theories in machine ethics are deontological and utilitarian ethics. Virtue ethics, on the other hand, has often been mentioned as an alternative ethical theory. While this interesting approach has certain advantages over popular ethical theories, little effort has been put into engineering artificial virtuous agents due to challenges in their formalization, codifiability, and the resolution of ethical dilemmas to train virtuous agents. We propose to bridge this gap by using role-playing games riddled with moral dilemmas. There are several such games in existence, such as Papers, Please and Life is Strange, where the main character encounters situations where they must choose the right course of action by giving up something else dear to them. We draw inspiration from such games to show how a systemic role-playing game can be designed to develop virtues within an artificial agent. Using modern day AI techniques, such as affinity-based reinforcement learning and explainable AI, we motivate the implementation of virtuous agents that play such role-playing games, and the examination of their decisions through a virtue ethical lens. The development of such agents and environments is a first step towards practically formalizing and demonstrating the value of virtue ethics in the development of ethical agents.

Submitted 10 December, 2022; v1 submitted 30 August, 2022; originally announced August 2022.

Comments: Premature submission of revised revision

arXiv:2206.02660

doi 10.1016/j.physd.2023.133673

Pseudo-Hamiltonian Neural Networks with State-Dependent External Forces

Authors: Sølve Eidnes, Alexander J. Stasik, Camilla Sterud, Eivind Bøhn, Signe Riemer-Sørensen

Abstract: Hybrid machine learning based on Hamiltonian formulations has recently been successfully demonstrated for simple mechanical systems, both energy conserving and not energy conserving. We introduce a pseudo-Hamiltonian formulation that is a generalization of the Hamiltonian formulation via the port-Hamiltonian formulation, and show that pseudo-Hamiltonian neural network models can be used to learn external forces acting on a system. We argue that this property is particularly useful when the external forces are state dependent, in which case it is the pseudo-Hamiltonian structure that facilitates the separation of internal and external forces. Numerical results are provided for a forced and damped mass-spring system and a tank system of higher complexity, and a symmetric fourth-order integration scheme is introduced for improved training on sparse and noisy data.

Submitted 23 January, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

Comments: 23 pages, 13 figures; v4: slight title change, expanded on methodology for more clarity, updated plots

arXiv:2111.04153

doi 10.1109/TNNLS.2023.3263430

Data-Efficient Deep Reinforcement Learning for Attitude Control of Fixed-Wing UAVs: Field Experiments

Authors: Eivind Bøhn, Erlend M. Coates, Dirk Reinhardt, Tor Arne Johansen

Abstract: Attitude control of fixed-wing unmanned aerial vehicles (UAVs) is a difficult control problem in part due to uncertain nonlinear dynamics, actuator constraints, and coupled longitudinal and lateral motions. Current state-of-the-art autopilots are based on linear control and are thus limited in their effectiveness and performance. Deep reinforcement learning (DRL) is a machine learning method to automatically discover optimal control laws through interaction with the controlled system, which can handle complex nonlinear dynamics. We show in this paper that DRL can successfully learn to perform attitude control of a fixed-wing UAV operating directly on the original nonlinear dynamics, requiring as little as three minutes of flight data. We initially train our model in a simulation environment and then deploy the learned controller on the UAV in flight tests, demonstrating comparable performance to the state-of-the-art ArduPlane proportional-integral-derivative (PID) attitude controller with no further online learning required. Learning with significant actuation delay and diversified simulated dynamics were found to be crucial for successful transfer to control of the real UAV. In addition to a qualitative comparison with the ArduPlane autopilot, we present a quantitative assessment based on linear analysis to better understand the learning controller's behavior.

Submitted 19 April, 2023; v1 submitted 7 November, 2021; originally announced November 2021.

Comments: Published in IEEE Transactions on Neural Networks and Learning Systems - Special Issue: Reinforcement Learning Based Control: Data-Efficient and Resilient Methods

arXiv:2111.04146

Optimization of the Model Predictive Control Meta-Parameters Through Reinforcement Learning

Authors: Eivind Bøhn, Sebastien Gros, Signe Moe, Tor Arne Johansen

Abstract: Model predictive control (MPC) is increasingly being considered for control of fast systems and embedded applications. However, the MPC has some significant challenges for such systems. Its high computational complexity results in high power consumption from the control algorithm, which could account for a significant share of the energy resources in battery-powered embedded systems. The MPC parameters must be tuned, which is largely a trial-and-error process that affects the control performance, the robustness and the computational complexity of the controller to a high degree. In this paper, we propose a novel framework in which any parameter of the control algorithm can be jointly tuned using reinforcement learning (RL), with the goal of simultaneously optimizing the control performance and the power usage of the control algorithm. We propose the novel idea of optimizing the meta-parameters of MPC with RL, i.e. parameters affecting the structure of the MPC problem as opposed to the solution to a given problem. Our control algorithm is based on an event-triggered MPC where we learn when the MPC should be re-computed, and a dual mode MPC and linear state feedback control law applied in between MPC computations. We formulate a novel mixture-distribution policy and show that with joint optimization we achieve improvements that do not present themselves when optimizing the same parameters in isolation. We demonstrate our framework on the inverted pendulum control task, reducing the total computation time of the control system by 36% while also improving the control performance by 18.4% over the best-performing MPC baseline.

Submitted 7 November, 2021; originally announced November 2021.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2102.11122

Reinforcement Learning of the Prediction Horizon in Model Predictive Control

Authors: Eivind Bøhn, Sebastien Gros, Signe Moe, Tor Arne Johansen

Abstract: Model predictive control (MPC) is a powerful trajectory optimization control technique capable of controlling complex nonlinear systems while respecting system constraints and ensuring safe operation. The MPC's capabilities come at the cost of a high online computational complexity, the requirement of an accurate model of the system dynamics, and the necessity of tuning its parameters to the specific control application. The main tunable parameter affecting the computational complexity is the prediction horizon length, controlling how far into the future the MPC predicts the system response and thus evaluates the optimality of its computed trajectory. A longer horizon generally increases the control performance, but requires an increasingly powerful computing platform, excluding certain control applications. The performance sensitivity to the prediction horizon length varies over the state space, and this motivated the adaptive horizon model predictive control (AHMPC), which adapts the prediction horizon according to some criteria. In this paper we propose to learn the optimal prediction horizon as a function of the state using reinforcement learning (RL). We show how the RL learning problem can be formulated and test our method on two control tasks, showing clear improvements over the fixed horizon MPC scheme, while requiring only minutes of learning.

Submitted 22 February, 2021; originally announced February 2021.

Comments: This work has been submitted to IFAC NMPC 2021 for possible publication

arXiv:2011.13365

Optimization of the Model Predictive Control Update Interval Using Reinforcement Learning

Authors: Eivind Bøhn, Sebastien Gros, Signe Moe, Tor Arne Johansen

Abstract: In control applications there is often a compromise that needs to be made with regards to the complexity and performance of the controller and the computational resources that are available. For instance, the typical hardware platform in embedded control applications is a microcontroller with limited memory and processing power, and for battery powered applications the control system can account for a significant portion of the energy consumption. We propose a controller architecture in which the computational cost is explicitly optimized along with the control objective. This is achieved by a three-part architecture where a high-level, computationally expensive controller generates plans, which a computationally simpler controller executes by compensating for prediction errors, while a recomputation policy decides when the plan should be recomputed. In this paper, we employ model predictive control (MPC) as the high-level plan-generating controller, a linear state feedback controller as the simpler compensating controller, and reinforcement learning (RL) to learn the recomputation policy. Simulation results for two examples showcase the architecture's ability to improve upon the MPC approach and find reasonable compromises weighing the performance on the control objective and the computational resources expended.

Submitted 26 November, 2020; originally announced November 2020.

Comments: Submitted to 3rd Annual Learning for Dynamics and Control Conference (L4DC 2021)

arXiv:1911.09391

Accelerating Reinforcement Learning with Suboptimal Guidance

Authors: Eivind Bøhn, Signe Moe, Tor Arne Johansen

Abstract: Reinforcement Learning in domains with sparse rewards is a difficult problem, and a large part of the training process is often spent searching the state space in a more or less random fashion for any learning signals. For control problems, we often have some controller readily available which might be suboptimal but nevertheless solves the problem to some degree. This controller can be used to guide the initial exploration phase of the learning controller towards reward yielding states, reducing the time before refinement of a viable policy can be initiated. In our work, the agent is guided through an auxiliary behaviour cloning loss which is made conditional on a Q-filter, i.e. it is only applied in situations where the critic deems the guiding controller to be better than the agent. The Q-filter provides a natural way to adjust the guidance throughout the training process, allowing the agent to exceed the guiding controller in a manner that is adaptive to the task at hand and the proficiency of the guiding controller. The contribution of this paper lies in identifying shortcomings in previously proposed implementations of the Q-filter concept, and in suggesting some ways these issues can be mitigated. These modifications are tested on the OpenAI Gym Fetch environments, showing clear improvements in adaptivity and yielding increased performance in all robotic environments tested.

Submitted 21 November, 2019; originally announced November 2019.

Comments: Submitted to IFAC 2020

arXiv:1911.05478

doi 10.1109/ICUAS.2019.8798254

Deep Reinforcement Learning Attitude Control of Fixed-Wing UAVs Using Proximal Policy Optimization

Authors: Eivind Bøhn, Erlend M. Coates, Signe Moe, Tor Arne Johansen

Abstract: Contemporary autopilot systems for unmanned aerial vehicles (UAVs) are far more limited in their flight envelope as compared to experienced human pilots, thereby restricting the conditions UAVs can operate in and the types of missions they can accomplish autonomously. This paper proposes a deep reinforcement learning (DRL) controller to handle the nonlinear attitude control problem, enabling extended flight envelopes for fixed-wing UAVs. A proof-of-concept controller using the proximal policy optimization (PPO) algorithm is developed, and is shown to be capable of stabilizing a fixed-wing UAV from a large set of initial conditions to reference roll, pitch and airspeed values. The training process is outlined and key factors for its progression rate are considered, with the most important factor found to be limiting the number of variables in the observation vector, and including values for several previous time steps for these variables. The trained reinforcement learning (RL) controller is compared to a proportional-integral-derivative (PID) controller, and is found to converge in more cases than the PID controller, with comparable performance. Furthermore, the RL controller is shown to generalize well to unseen disturbances in the form of wind and turbulence, even in severe disturbance conditions.

Submitted 13 November, 2019; originally announced November 2019.

Comments: 11 pages, 3 figures, 2019 International Conference on Unmanned Aircraft Systems (ICUAS)

Journal ref: In 2019 International Conference on Unmanned Aircraft Systems (ICUAS) (pp. 523-533). IEEE

arXiv:1407.6037

doi 10.1103/PhysRevLett.113.170403

Relaxation dynamics of a Fermi gas in an optical superlattice

Authors: D. Pertot, A. Sheikhan, E. Cocchi, L. A. Miller, J. E. Bohn, M. Koschorreck, M. Köhl, C. Kollath

Abstract: This paper comprises an experimental and theoretical investigation of the time evolution of a Fermi gas following fast and slow quenches of a one-dimensional optical double-well superlattice potential. We investigate both the local tunneling in the connected double wells and the global dynamics towards a steady state. The local observables in the steady-state resemble those of an equilibrium state, whereas the global properties indicate a strong non-equilibrium situation.

Submitted 22 July, 2014; originally announced July 2014.

Journal ref: Phys. Rev. Lett. 113, 170403 (2014)

arXiv:1111.2727

doi 10.1103/PhysRevLett.108.075303

Fermionization of two distinguishable fermions

Authors: G. Zürn, F. Serwane, T. Lompe, A. N. Wenz, M. G. Ries, J. E. Bohn, S. Jochim

Abstract: In this work we study a system of two distinguishable fermions in a 1D harmonic potential. This system has the exceptional property that there is an analytic solution for arbitrary values of the interparticle interaction. We tune the interaction strength via a magnetic offset field and compare the measured properties of the system to the theoretical prediction. At the point where the interaction strength diverges, the energy and square of the wave function for two distinguishable particles are the same as for a system of two identical fermions. This is referred to as fermionization. We have observed this phenomenon by directly comparing two distinguishable fermions with diverging interaction strength with two identical fermions in the same potential. We observe good agreement between experiment and theory. By adding one or more particles our system can be used as a quantum simulator for more complex few-body systems where no theoretical solution is available.

Submitted 15 July, 2013; v1 submitted 11 November, 2011; originally announced November 2011.

Journal ref: Phys. Rev. Lett. 108, 075303 (2012)

Showing 1–11 of 11 results for author: Bøhn, E