Search | arXiv e-print repository

Learning Paradigms and Modelling Methodologies for Digital Twins in Process Industry

Authors: Michael Mayr, Georgios C. Chasparis, Josef Küng

Abstract: Central to the digital transformation of the process industry are Digital Twins (DTs), virtual replicas of physical manufacturing systems that combine sensor data with sophisticated data-based or physics-based models, or a combination thereof, to tackle a variety of industrial-relevant tasks like process monitoring, predictive control or decision support. The backbone of a DT, i.e. the concrete modelling methodologies and architectural frameworks supporting these models, are complex, diverse and evolve fast, necessitating a thorough understanding of the latest state-of-the-art methods and trends to stay on top of a highly competitive market. From a research perspective, despite the high research interest in reviewing various aspects of DTs, structured literature reports specifically focusing on unravelling the utilized learning paradigms (e.g. self-supervised learning) for DT-creation in the process industry are a novel contribution in this field. This study aims to address these gaps by (1) systematically analyzing the modelling methodologies (e.g. Convolutional Neural Network, Encoder-Decoder, Hidden Markov Model) and paradigms (e.g. data-driven, physics-based, hybrid) used for DT-creation; (2) assessing the utilized learning strategies (e.g. supervised, unsupervised, self-supervised); (3) analyzing the type of modelling task (e.g. regression, classification, clustering); and (4) identifying the challenges and research gaps, as well as discussing potential resolutions provided.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.02231 [pdf, other]

Safety-Driven Deep Reinforcement Learning Framework for Cobots: A Sim2Real Approach

Authors: Ammar N. Abbas, Shakra Mehak, Georgios C. Chasparis, John D. Kelleher, Michael Guilfoyle, Maria Chiara Leva, Aswin K Ramasubramanian

Abstract: This study presents a novel methodology incorporating safety constraints into a robotic simulation during the training of deep reinforcement learning (DRL). The framework integrates specific parts of the safety requirements, such as velocity constraints, as specified by ISO 10218, directly within the DRL model that becomes a part of the robot's learning algorithm. The study then evaluated the efficiency of these safety constraints by subjecting the DRL model to various scenarios, including grasping tasks with and without obstacle avoidance. The validation process involved comprehensive simulation-based testing of the DRL model's responses to potential hazards and its compliance. Also, the performance of the system is assessed according to the functional safety standard IEC 61508 to determine the safety integrity level. The study indicated a significant improvement in the safety performance of the robotic system. The proposed DRL model anticipates and mitigates hazards while maintaining operational efficiency. This study was validated in a testbed with a collaborative robotic arm with safety sensors and assessed with metrics such as the average number of safety violations, obstacle avoidance, and the number of successful grasps. The proposed approach outperforms the conventional method by a 16.5% average success rate on the tested scenarios in the simulations and 2.5% in the testbed without safety violations. The project repository is available at https://github.com/ammar-n-abbas/sim2real-ur-gym-gazebo.

Submitted 2 July, 2024; originally announced July 2024.

Comments: This paper has been accepted for publication in the proceedings of the IEEE/IFAC International Conference on Control, Decision, and Information Technologies (CoDIT), 2024

arXiv:2407.02106 [pdf, other]

Automated Knowledge Graph Learning in Industrial Processes

Authors: Lolitta Ammann, Jorge Martinez-Gil, Michael Mayr, Georgios C. Chasparis

Abstract: Industrial processes generate vast amounts of time series data, yet extracting meaningful relationships and insights remains challenging. This paper introduces a framework for automated knowledge graph learning from time series data, specifically tailored for industrial applications. Our framework addresses the complexities inherent in industrial datasets, transforming them into knowledge graphs that improve decision-making, process optimization, and knowledge discovery. Additionally, it employs Granger causality to identify key attributes that can inform the design of predictive models. To illustrate the practical utility of our approach, we also present a motivating use case demonstrating the benefits of our framework in a real-world industrial scenario. Further, we demonstrate how the automated conversion of time series data into knowledge graphs can identify causal influences or dependencies between important process parameters.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2402.13752 [pdf]

AI-Powered Predictions for Electricity Load in Prosumer Communities

Authors: Aleksei Kychkin, Georgios C. Chasparis

Abstract: The flexibility in electricity consumption and production in communities of residential buildings, including those with renewable energy sources and energy storage (a.k.a., prosumers), can effectively be utilized through the advancement of short-term demand response mechanisms. It is known that flexibility can further be increased if demand response is performed at the level of communities of prosumers, since aggregated groups can better coordinate electricity consumption. However, the effectiveness of such short-term optimization is highly dependent on the accuracy of electricity load forecasts both for each building as well as for the whole community. Structural variations in the electricity load profile can be associated with different exogenous factors, such as weather conditions, calendar information and day of the week, as well as user behavior. In this paper, we review a wide range of electricity load forecasting techniques that can provide significant assistance in optimizing load consumption in prosumer communities. We present and test artificial intelligence (AI) powered short-term load forecasting methodologies that operate with black-box time series models, such as Facebook's Prophet and Long Short-Term Memory (LSTM) models; season-based SARIMA and smoothing Holt-Winters models; and empirical regression-based models that utilize domain knowledge. The integration of weather forecasts into data-driven time series forecasts is also tested. Results show that the combination of persistent and regression terms (adapted to the load forecasting task) achieves the best forecast accuracy.

Submitted 21 February, 2024; originally announced February 2024.

Comments: It has been presented in the 18. Symposium Energieinnovation (14.-16.02.2024). Further information can be found at: https://www.tugraz.at/events/eninnov2024/home

arXiv:2310.18811 [pdf]

Hierarchical Framework for Interpretable and Probabilistic Model-Based Safe Reinforcement Learning

Authors: Ammar N. Abbas, Georgios C. Chasparis, John D. Kelleher

Abstract: The difficulty of identifying the physical model of complex systems has led to exploring methods that do not rely on such complex modeling of the systems. Deep reinforcement learning has been the pioneer for solving this problem without the need for relying on the physical model of complex systems by just interacting with it. However, it uses a black-box learning approach that makes it difficult to be applied within real-world and safety-critical systems without providing explanations of the actions derived by the model. Furthermore, an open research question in deep reinforcement learning is how to focus the policy learning of critical decisions within a sparse domain. This paper proposes a novel approach for the use of deep reinforcement learning in safety-critical systems. It combines the advantages of probabilistic modeling and reinforcement learning with the added benefits of interpretability and works in collaboration and synchronization with conventional decision-making strategies. The BC-SRLA is activated in specific situations which are identified autonomously through the fused information of probabilistic model and reinforcement learning, such as abnormal conditions or when the system is near-to-failure. Further, it is initialized with a baseline policy using policy cloning to allow minimum interactions with the environment to address the challenges associated with using RL in safety-critical industries. The effectiveness of the BC-SRLA is demonstrated through a case study in maintenance applied to turbofan engines, where it shows superior performance to the prior art and other baselines.

Submitted 28 October, 2023; originally announced October 2023.

Comments: arXiv admin note: text overlap with arXiv:2206.13433

Journal ref: Data & Knowledge Engineering, 2023

arXiv:2310.14788 [pdf]

Specialized Deep Residual Policy Safe Reinforcement Learning-Based Controller for Complex and Continuous State-Action Spaces

Authors: Ammar N. Abbas, Georgios C. Chasparis, John D. Kelleher

Abstract: Traditional controllers have limitations as they rely on prior knowledge about the physics of the problem, require modeling of dynamics, and struggle to adapt to abnormal situations. Deep reinforcement learning has the potential to address these problems by learning optimal control policies through exploration in an environment. For safety-critical environments, it is impractical to explore randomly, and replacing conventional controllers with black-box models is also undesirable. Also, it is expensive in continuous state and action spaces, unless the search space is constrained. To address these challenges we propose a specialized deep residual policy safe reinforcement learning with a cycle of learning approach adapted for complex and continuous state-action spaces. Residual policy learning allows learning a hybrid control architecture where the reinforcement learning agent acts in synchronous collaboration with the conventional controller. The cycle of learning initiates the policy through the expert trajectory and guides the exploration around it. Further, the specialization through the input-output hidden Markov model helps to optimize policy that lies within the region of interest (such as abnormality), where the reinforcement learning agent is required and is activated. The proposed solution is validated on the Tennessee Eastman process control.

Submitted 15 October, 2023; originally announced October 2023.

arXiv:1803.02751 [pdf, other]

Aspiration-based Perturbed Learning Automata

Authors: Georgios C. Chasparis

Abstract: This paper introduces a novel payoff-based learning scheme for distributed optimization in repeatedly-played strategic-form games. Standard reinforcement-based learning exhibits several limitations with respect to its asymptotic stability. For example, in two-player coordination games, payoff-dominant (or efficient) Nash equilibria may not be stochastically stable. In this work, we present an extension of perturbed learning automata, namely aspiration-based perturbed learning automata (APLA), that overcomes these limitations. We provide a stochastic stability analysis of APLA in multi-player coordination games. We further show that payoff-dominant Nash equilibria are the only stochastically stable states.

Submitted 7 March, 2018; originally announced March 2018.

Comments: arXiv admin note: text overlap with arXiv:1709.05859, arXiv:1702.08334

arXiv:1803.00355 [pdf, other]

Learning-based Dynamic Pinning of Parallelized Applications in Many-Core Systems

Authors: Georgios C. Chasparis, Vladimir Janjic, Michael Rossbory

Abstract: Motivated by the need for adaptive, secure and responsive scheduling in a great range of computing applications, including human-centered and time-critical applications, this paper proposes a scheduling framework that seamlessly adds resource-awareness to any parallel application. In particular, we introduce a learning-based framework for dynamic placement of parallel threads to Non-Uniform Memory Access (NUMA) architectures. Decisions are taken independently by each thread in a decentralized fashion that significantly reduces computational complexity. The advantage of the proposed learning scheme is the ability to easily incorporate any multi-objective criterion and easily adapt to performance variations during runtime. Under the multi-objective criterion of maximizing total completed instructions per second (i.e., both computational and memory-access instructions), we provide analytical guarantees with respect to the expected performance of the parallel application. We also compare the performance of the proposed scheme with the Linux operating system scheduler in an extensive set of applications, including both computationally and memory intensive ones. We have observed that performance improvement could be significant especially under limited availability of resources and under irregular memory-access patterns.

Submitted 11 January, 2020; v1 submitted 1 March, 2018; originally announced March 2018.

Comments: arXiv admin note: text overlap with arXiv:1606.08156

arXiv:1709.05859 [pdf, other]

Stochastic Stability of Perturbed Learning Automata in Positive-Utility Games

Authors: Georgios C. Chasparis

Abstract: This paper considers a class of reinforcement-based learning (namely, perturbed learning automata) and provides a stochastic-stability analysis in repeatedly-played, positive-utility, finite strategic-form games. Prior work in this class of learning dynamics primarily analyzes asymptotic convergence through stochastic approximations, where convergence can be associated with the limit points of an ordinary-differential equation (ODE). However, analyzing global convergence through an ODE-approximation requires the existence of a Lyapunov or a potential function, which naturally restricts the analysis to a fine class of games. To overcome these limitations, this paper introduces an alternative framework for analyzing asymptotic convergence that is based upon an explicit characterization of the invariant probability measure of the induced Markov chain. We further provide a methodology for computing the invariant probability measure in positive-utility games, together with an illustration in the context of coordination games.

Submitted 28 January, 2019; v1 submitted 18 September, 2017; originally announced September 2017.

Comments: arXiv admin note: text overlap with arXiv:1702.08334

arXiv:1707.08776 [pdf, other]

An Evolutionary Stochastic-Local-Search Framework for One-Dimensional Cutting-Stock Problems

Authors: Georgios C. Chasparis, Michael Rossbory, Verena Haunschmid

Abstract: We introduce an evolutionary stochastic-local-search (SLS) algorithm for addressing a generalized version of the so-called 1/V/D/R cutting-stock problem. Cutting-stock problems are encountered often in industrial environments and the ability to address them efficiently usually results in large economic benefits. Traditionally, linear-programming-based techniques have been utilized to address such problems; however, their flexibility might be limited when nonlinear constraints and objective functions are introduced. To this end, this paper proposes an evolutionary SLS algorithm for addressing one-dimensional cutting-stock problems. The contribution lies in the introduction of a flexible structural framework of the optimization that may accommodate a large family of diversification strategies including a novel parallel pattern appropriate for SLS algorithms (not necessarily restricted to cutting-stock problems). We finally demonstrate through experiments in a real-world manufacturing problem the benefit in cost reduction of the considered diversification strategies.

Submitted 27 July, 2017; originally announced July 2017.

arXiv:1702.08334 [pdf, ps, other]

Stochastic Stability Analysis of Perturbed Learning Automata with Constant Step-Size in Strategic-Form Games

Authors: Georgios C. Chasparis

Abstract: This paper considers a class of reinforcement-learning that belongs to the family of Learning Automata and provides a stochastic-stability analysis in strategic-form games. For this class of dynamics, convergence to pure Nash equilibria has been demonstrated only for the fine class of potential games. Prior work primarily provides convergence properties of the dynamics through stochastic approximations, where the asymptotic behavior can be associated with the limit points of an ordinary-differential equation (ODE). However, analyzing global convergence through the ODE-approximation requires the existence of a Lyapunov or a potential function, which naturally restricts the applicability of these algorithms to a fine class of games. To overcome these limitations, this paper introduces an alternative framework for analyzing stochastic-stability that is based upon an explicit characterization of the (unique) invariant probability measure of the induced Markov chain.

Submitted 27 February, 2017; originally announced February 2017.

arXiv:1606.08156 [pdf, other]

Efficient Dynamic Pinning of Parallelized Applications by Distributed Reinforcement Learning

Authors: Georgios C. Chasparis, Michael Rossbory

Abstract: This paper introduces a resource allocation framework specifically tailored for addressing the problem of dynamic placement (or pinning) of parallelized applications to processing units. Under the proposed setup each thread of the parallelized application constitutes an independent decision maker (or agent), which (based on its own prior performance measurements and its own prior CPU-affinities) decides on which processing unit to run next. Decisions are updated recursively for each thread by a resource manager/scheduler which runs in parallel to the application's threads and periodically records their performances and assigns to them new CPU affinities. For updating the CPU-affinities, the scheduler uses a distributed reinforcement-learning algorithm, each branch of which is responsible for assigning a new placement strategy to each thread. According to this algorithm, prior allocations are going to be reinforced in the future proportionally to their prior performance. The proposed resource allocation framework is flexible enough to address alternative optimization criteria, such as maximum average processing speed and minimum speed variance among threads. We demonstrate analytically that convergence to locally-optimal placements is achieved asymptotically. Finally, we validate these results through experiments in Linux platforms.

Submitted 27 June, 2016; originally announced June 2016.
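
Illustration (not part of the arXiv record): the abstract above says that prior allocations are reinforced in proportion to their prior performance. A minimal sketch of one such update for a single thread is given below, assuming a standard linear reward-inaction rule from the learning-automata literature. The function name `reinforce`, the `step` parameter, and the normalization of performance to [0, 1] are assumptions made for this example and are not taken from the paper.

```python
def reinforce(strategy, chosen, performance, step=0.1):
    """Return an updated probability vector over processing units.

    strategy    -- current probabilities of picking each processing unit
    chosen      -- index of the unit the thread actually ran on
    performance -- measured performance, assumed normalized to [0, 1]
    step        -- constant step size (hypothetical default)
    """
    updated = []
    for unit, prob in enumerate(strategy):
        target = 1.0 if unit == chosen else 0.0
        # Move toward the chosen unit by an amount proportional to performance.
        updated.append(prob + step * performance * (target - prob))
    return updated


# A thread that ran on unit 0 with good performance favors unit 0 next time.
print(reinforce([0.5, 0.5], chosen=0, performance=1.0, step=0.2))
```

Because every probability moves toward a one-hot target by the same fraction, the vector still sums to one after the update, and zero performance leaves the strategy unchanged.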

arXiv:1508.04544 [pdf, other]

Design and Implementation of Distributed Resource Management for Time Sensitive Applications

Authors: Georgios C. Chasparis, Martina Maggio, Enrico Bini, Karl-Eric Årzén

Abstract: In this paper, we address distributed convergence to fair allocations of CPU resources for time-sensitive applications. We propose a novel resource management framework where a centralized objective for fair allocations is decomposed into a pair of performance-driven recursive processes for updating: (a) the allocation of computing bandwidth to the applications (resource adaptation), executed by the resource manager, and (b) the service level of each application (service-level adaptation), executed by each application independently. We provide conditions under which the distributed recursive scheme exhibits convergence to solutions of the centralized objective (i.e., fair allocations). Contrary to prior work on centralized optimization schemes, the proposed framework exhibits adaptivity and robustness to changes both in the number and nature of applications, while it assumes minimum information available to both applications and the resource manager. We finally validate our framework with simulations using the TrueTime toolbox in MATLAB/Simulink.

Submitted 19 August, 2015; originally announced August 2015.

arXiv:1110.4412 [pdf, ps, other]

doi 10.1137/110852462

Aspiration Learning in Coordination Games

Authors: Georgios C. Chasparis, Ari Arapostathis, Jeff S. Shamma

Abstract: We consider the problem of distributed convergence to efficient outcomes in coordination games through dynamics based on aspiration learning. Under aspiration learning, a player continues to play an action as long as the rewards received exceed a specified aspiration level. Here, the aspiration level is a fading memory average of past rewards, and these levels also are subject to occasional random perturbations. A player becomes dissatisfied whenever a received reward is less than the aspiration level, in which case the player experiments with a probability proportional to the degree of dissatisfaction. Our first contribution is the characterization of the asymptotic behavior of the induced Markov chain of the iterated process in terms of an equivalent finite-state Markov chain. We then characterize explicitly the behavior of the proposed aspiration learning in a generalized version of coordination games, examples of which include network formation and common-pool games. In particular, we show that in generic coordination games the frequency at which an efficient action profile is played can be made arbitrarily large. Although convergence to efficient outcomes is desirable, in several coordination games, such as common-pool games, attainability of fair outcomes, i.e., sequences of plays at which players experience highly rewarding returns with the same frequency, might also be of special interest. To this end, we demonstrate through analysis and simulations that aspiration learning also establishes fair outcomes in all symmetric coordination games, including common-pool games.

Submitted 19 October, 2011; originally announced October 2011.

Comments: 27 pages

MSC Class: 68T05; 91A26; 91A22; 93E35; 60J05; 91A80

Journal ref: SIAM J. Control Optim. 51 (2013), no. 1, 465-490
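
Illustration (not part of the arXiv record): the aspiration-learning rule described in the abstract above can be sketched for a single player as follows. This is a rough sketch under stated assumptions, not the authors' exact dynamics. The function name `aspiration_step` and the parameters `memory`, `scale`, `perturb_prob`, and `perturb_size` are hypothetical, as are the uniform choice of the experimental action and the uniform perturbation.

```python
import random


def aspiration_step(action, aspiration, reward, n_actions, rng,
                    memory=0.1, scale=1.0,
                    perturb_prob=0.01, perturb_size=0.1):
    """One update for one player; returns (next_action, next_aspiration)."""
    # The player is dissatisfied only when the reward is below the aspiration level.
    dissatisfaction = max(0.0, aspiration - reward)

    # Experiment with probability proportional to dissatisfaction (capped at 1);
    # otherwise keep playing the same action.
    if rng.random() < min(1.0, scale * dissatisfaction):
        next_action = rng.randrange(n_actions)  # assumed: uniform experimentation
    else:
        next_action = action

    # Aspiration level as a fading-memory average of past rewards.
    next_aspiration = aspiration + memory * (reward - aspiration)

    # Occasional random perturbation of the aspiration level.
    if rng.random() < perturb_prob:
        next_aspiration += rng.uniform(-perturb_size, perturb_size)

    return next_action, next_aspiration


# A satisfied player (reward above aspiration) keeps its action.
rng = random.Random(0)
print(aspiration_step(1, 0.5, 1.0, n_actions=3, rng=rng, perturb_prob=0.0))
```

In a full simulation each player would run this step on its own reward after every round of the game, with no knowledge of the other players' actions or payoffs.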

Showing 1–14 of 14 results for author: Chasparis, G C