Search | arXiv e-print repository

Synthesis of separation processes with reinforcement learning

Authors: Stephan C. P. A. van Kalmthout, Laurence I. Midgley, Meik B. Franke

Abstract: This paper shows the implementation of reinforcement learning (RL) in commercial flowsheet simulator software (Aspen Plus V12) for designing and optimising a distillation sequence. The aim of the SAC agent was to separate a hydrocarbon mixture in its individual components by utilising distillation. While doing so it tries to maximise the profit produced by the distillation sequence. All actions of the agent were set by the SAC agent in Python and communicated in Aspen Plus via an API. Here the distillation column was simulated by use of the build-in RADFRAC column. With this a connection was established for data transfer between Python and Aspen and the agent succeeded to show learning behaviour, while increasing profit. Although results were generated, the use of Aspen was slow (190 hours) and Aspen was found unsuitable for parallelisation. This makes that Aspen is incompatible for solving RL problems. Code and thesis are available at https://github.com/lollcat/Aspen-RL

Submitted 3 November, 2022; originally announced November 2022.

arXiv:2112.10558 [pdf, other]

doi 10.1016/j.neunet.2023.04.022

Lifelong Learning on Evolving Graphs Under the Constraints of Imbalanced Classes and New Classes

Authors: Lukas Galke, Iacopo Vagliano, Benedikt Franke, Tobias Zielke, Marcel Hoffmann, Ansgar Scherp

Abstract: Lifelong graph learning deals with the problem of continually adapting graph neural network (GNN) models to changes in evolving graphs. We address two critical challenges of lifelong graph learning in this work: dealing with new classes and tackling imbalanced class distributions. The combination of these two challenges is particularly relevant since newly emerging classes typically resemble only a tiny fraction of the data, adding to the already skewed class distribution. We make several contributions: First, we show that the amount of unlabeled data does not influence the results, which is an essential prerequisite for lifelong learning on a sequence of tasks. Second, we experiment with different label rates and show that our methods can perform well with only a tiny fraction of annotated nodes. Third, we propose the gDOC method to detect new classes under the constraint of having an imbalanced class distribution. The critical ingredient is a weighted binary cross-entropy loss function to account for the class imbalance. Moreover, we demonstrate combinations of gDOC with various base GNN models such as GraphSAGE, Simplified Graph Convolution, and Graph Attention Networks. Lastly, our k-neighborhood time difference measure provably normalizes the temporal changes across different graph datasets. With extensive experimentation, we find that the proposed gDOC method is consistently better than a naive adaption of DOC to graphs.
Specifically, in experiments using the smallest history size, the out-of-distribution detection score of gDOC is 0.09 compared to 0.01 for DOC. Furthermore, gDOC achieves an Open-F1 score, a combined measure of in-distribution classification and out-of-distribution detection, of 0.33 compared to 0.25 of DOC (32% increase).

Submitted 9 May, 2023; v1 submitted 20 December, 2021; originally announced December 2021.

Comments: Accepted manuscript (after peer review, before copy-editing). Published article available at https://doi.org/10.1016/j.neunet.2023.04.022

ACM Class: I.2.6

Journal ref: Neural Networks 164 (2023) 156-176

arXiv:2107.13057 [pdf]

doi 10.1038/s41928-021-00705-7

Neuromorphic scaling advantages for energy-efficient random walk computation

Authors: J. Darby Smith, Aaron J. Hill, Leah E. Reeder, Brian C. Franke, Richard B. Lehoucq, Ojas Parekh, William Severa, James B. Aimone

Abstract: Computing stands to be radically improved by neuromorphic computing (NMC) approaches inspired by the brain's incredible efficiency and capabilities. Most NMC research, which aims to replicate the brain's computational structure and architecture in man-made hardware, has focused on artificial intelligence; however, less explored is whether this brain-inspired hardware can provide value beyond cognitive tasks. We demonstrate that high-degree parallelism and configurability of spiking neuromorphic architectures makes them well-suited to implement random walks via discrete time Markov chains. Such random walks are useful in Monte Carlo methods, which represent a fundamental computational tool for solving a wide range of numerical computing tasks. Additionally, we show how the mathematical basis for a probabilistic solution involving a class of stochastic differential equations can leverage those simulations to provide solutions for a range of broadly applicable computational tasks. Despite being in an early development stage, we find that NMC platforms, at a sufficient scale, can drastically reduce the energy demands of high-performance computing (HPC) platforms.

Submitted 27 July, 2021; originally announced July 2021.

Comments: Paper, figures, supplement

Report number: SAND2021-9085 O

Journal ref: Nature Electronics 2022

arXiv:2006.14422 [pdf, other]

doi 10.1109/IJCNN52387.2021.9533412

Lifelong Learning of Graph Neural Networks for Open-World Node Classification

Authors: Lukas Galke, Benedikt Franke, Tobias Zielke, Ansgar Scherp

Abstract: Graph neural networks (GNNs) have emerged as the standard method for numerous tasks on graph-structured data such as node classification. However, real-world graphs are often evolving over time and even new classes may arise. We model these challenges as an instance of lifelong learning, in which a learner faces a sequence of tasks and may take over knowledge acquired in past tasks. Such knowledge may be stored explicitly as historic data or implicitly within model parameters. In this work, we systematically analyze the influence of implicit and explicit knowledge. Therefore, we present an incremental training method for lifelong learning on graphs and introduce a new measure based on $k$-neighborhood time differences to address variances in the historic data. We apply our training method to five representative GNN architectures and evaluate them on three new lifelong node classification datasets. Our results show that no more than 50% of the GNN's receptive field is necessary to retain at least 95% accuracy compared to training over the complete history of the graph data. Furthermore, our experiments confirm that implicit knowledge becomes more important when fewer explicit knowledge is available.

Submitted 20 December, 2021; v1 submitted 25 June, 2020; originally announced June 2020.

Comments: 9 pages, 4 figures, major update compared to v2, as appeared in IEEE International Joint Conference on Neural Networks (IJCNN) 2021

ACM Class: I.2.6

arXiv:2005.10904 [pdf, other]

Solving a steady-state PDE using spiking networks and neuromorphic hardware

Authors: J. Darby Smith, William Severa, Aaron J. Hill, Leah Reeder, Brian Franke, Richard B. Lehoucq, Ojas D. Parekh, James B. Aimone

Abstract: The widely parallel, spiking neural networks of neuromorphic processors can enable computationally powerful formulations. While recent interest has focused on primarily machine learning tasks, the space of appropriate applications is wide and continually expanding. Here, we leverage the parallel and event-driven structure to solve a steady state heat equation using a random walk method. The random walk can be executed fully within a spiking neural network using stochastic neuron behavior, and we provide results from both IBM TrueNorth and Intel Loihi implementations. Additionally, we position this algorithm as a potential scalable benchmark for neuromorphic systems.

Submitted 21 May, 2020; originally announced May 2020.

Comments: Submitted to 2020 International Conference on Neuromorphic Systems (2020 ICONS)

Report number: SAND2020-5296 O

arXiv:2002.08697 [pdf, other]

Performance Aware Convolutional Neural Network Channel Pruning for Embedded GPUs

Authors: Valentin Radu, Kuba Kaszyk, Yuan Wen, Jack Turner, Jose Cano, Elliot J. Crowley, Bjorn Franke, Amos Storkey, Michael O'Boyle

Abstract: Convolutional Neural Networks (CNN) are becoming a common presence in many applications and services, due to their superior recognition accuracy. They are increasingly being used on mobile devices, many times just by porting large models designed for server space, although several model compression techniques have been considered. One model compression technique intended to reduce computations is channel pruning. Mobile and embedded systems now have GPUs which are ideal for the parallel computations of neural networks and for their lower energy cost per operation. Specialized libraries perform these neural network computations through highly optimized routines. As we find in our experiments, these libraries are optimized for the most common network shapes, making uninstructed channel pruning inefficient. We evaluate higher level libraries, which analyze the input characteristics of a convolutional layer, based on which they produce optimized OpenCL (Arm Compute Library and TVM) and CUDA (cuDNN) code. However, in reality, these characteristics and subsequent choices intended for optimization can have the opposite effect. We show that a reduction in the number of convolutional channels, pruning 12% of the initial size, is in some cases detrimental to performance, leading to 2x slowdown. On the other hand, we also find examples where performance-aware pruning achieves the intended results, with performance speedups of 3x with cuDNN and above 10x with Arm Compute Library and TVM. Our findings expose the need for hardware-instructed neural network pruning.

Submitted 20 February, 2020; originally announced February 2020.

Comments: A copy of this was published in IISWC'19

arXiv:1808.06352 [pdf, other]

doi 10.1109/JPROC.2018.2856739

Navigating the Landscape for Real-time Localisation and Mapping for Robotics and Virtual and Augmented Reality

Authors: Sajad Saeedi, Bruno Bodin, Harry Wagstaff, Andy Nisbet, Luigi Nardi, John Mawer, Nicolas Melot, Oscar Palomar, Emanuele Vespa, Tom Spink, Cosmin Gorgovan, Andrew Webb, James Clarkson, Erik Tomusk, Thomas Debrunner, Kuba Kaszyk, Pablo Gonzalez-de-Aledo, Andrey Rodchenko, Graham Riley, Christos Kotselidis, Björn Franke, Michael F. P. O'Boyle, Andrew J. Davison, Paul H. J. Kelly, Mikel Luján, et al. (1 additional author not shown)

Abstract: Visual understanding of 3D environments in real-time, at low power, is a huge computational challenge. Often referred to as SLAM (Simultaneous Localisation and Mapping), it is central to applications spanning domestic and industrial robotics, autonomous vehicles, virtual and augmented reality. This paper describes the results of a major research effort to assemble the algorithms, architectures, tools, and systems software needed to enable delivery of SLAM, by supporting applications specialists in selecting and configuring the appropriate algorithm and the appropriate hardware, and compilation pathway, to meet their performance, accuracy, and energy consumption goals. The major contributions we present are (1) tools and methodology for systematic quantitative evaluation of SLAM algorithms, (2) automated, machine-learning-guided exploration of the algorithmic and implementation design space with respect to multiple objectives, (3) end-to-end simulation tools to enable optimisation of heterogeneous, accelerated architectures for the specific algorithmic requirements of the various SLAM algorithmic approaches, and (4) tools for delivering, where appropriate, accelerated, adaptive SLAM solutions in a managed, JIT-compiled, adaptive runtime context.

Submitted 20 August, 2018; originally announced August 2018.

Comments: Proceedings of the IEEE 2018

arXiv:1603.01214 [pdf, other]

Network modularity in the presence of covariates

Authors: Beate Franke, Patrick J. Wolfe

Abstract: We characterize the large-sample properties of network modularity in the presence of covariates, under a natural and flexible nonparametric null model. This provides for the first time an objective measure of whether or not a particular value of modularity is meaningful. In particular, our results quantify the strength of the relation between observed community structure and the interactions in a network. Our technical contribution is to provide limit theorems for modularity when a community assignment is given by nodal features or covariates. These theorems hold for a broad class of network models over a range of sparsity regimes, as well as weighted, multi-edge, and power-law networks. This allows us to assign $p$-values to observed community structure, which we validate using several benchmark examples in the literature. We conclude by applying this methodology to investigate a multi-edge network of corporate email interactions.

Submitted 3 March, 2016; originally announced March 2016.

Comments: 56 pages, 4 figures; submitted for publication

arXiv:1509.02900 [pdf]

doi 10.1111/insr.12176

Statistical Inference, Learning and Models in Big Data

Authors: Beate Franke, Jean-François Plante, Ribana Roscher, Annie Lee, Cathal Smyth, Armin Hatefi, Fuqi Chen, Einat Gil, Alexander Schwing, Alessandro Selvitella, Michael M. Hoffman, Roger Grosse, Dieter Hendricks, Nancy Reid

Abstract: The need for new methods to deal with big data is a common theme in most scientific fields, although its definition tends to vary with the context. Statistical ideas are an essential part of this, and as a partial response, a thematic program on statistical inference, learning, and models in big data was held in 2015 in Canada, under the general direction of the Canadian Statistical Sciences Institute, with major funding from, and most activities located at, the Fields Institute for Research in Mathematical Sciences. This paper gives an overview of the topics covered, describing challenges and strategies that seem common to many different areas of application, and including some examples of applications to make these challenges and strategies more concrete.

Submitted 28 January, 2016; v1 submitted 9 September, 2015; originally announced September 2015.

Comments: Thematic Program on Statistical Inference, Learning, and Models for Big Data, Fields Institute; 23 pages, 2 figures

MSC Class: 62-07 ACM Class: I.2.6; I.2.3; I.5.1; G.3

Journal ref: Int Stat Rev 84 (2017) 371-389

Showing 1–9 of 9 results for author: Franke, B