-
Temporally Layered Architecture for Efficient Continuous Control
Authors:
Devdhar Patel,
Terrence Sejnowski,
Hava Siegelmann
Abstract:
We present a temporally layered architecture (TLA) for temporally adaptive control with minimal energy expenditure. The TLA layers a fast and a slow policy together to achieve temporal abstraction that allows each layer to focus on a different time scale. Our design draws on the energy-saving mechanism of the human brain, which executes actions at different timescales depending on the environment's demands. We demonstrate that beyond energy saving, TLA provides many additional advantages, including persistent exploration, fewer required decisions, reduced jerk, and increased action repetition. We evaluate our method on a suite of continuous control tasks and demonstrate the significant advantages of TLA over existing methods when measured over multiple important metrics. We also introduce a multi-objective score to qualitatively assess continuous control policies and demonstrate a significantly better score for TLA. Our training algorithm uses minimal communication between the slow and fast layers to train both policies simultaneously, making it viable for future applications in distributed control.
Submitted 8 August, 2023; v1 submitted 29 May, 2023;
originally announced May 2023.
-
Temporally Layered Architecture for Adaptive, Distributed and Continuous Control
Authors:
Devdhar Patel,
Joshua Russell,
Francesca Walsh,
Tauhidur Rahman,
Terrence Sejnowski,
Hava Siegelmann
Abstract:
We present temporally layered architecture (TLA), a biologically inspired system for temporally adaptive distributed control. TLA layers a fast and a slow controller together to achieve temporal abstraction that allows each layer to focus on a different timescale. Our design draws on the architecture of the human brain, which executes actions at different timescales depending on the environment's demands. Such distributed control design is widespread across biological systems because it increases survivability and accuracy in certain and uncertain environments. We demonstrate that TLA can provide many advantages over existing approaches, including persistent exploration, adaptive control, explainable temporal behavior, compute efficiency and distributed control. We present two different algorithms for training TLA: (a) closed-loop control, where the fast controller is trained over a pre-trained slow controller, which allows better exploration for the fast controller, and where the fast controller decides whether to "act-or-not" at each timestep; and (b) partially open-loop control, where the slow controller is trained over a pre-trained fast controller and either picks a temporally extended action or defers the next n actions to the fast controller. We evaluate our method on a suite of continuous control tasks and demonstrate the advantages of TLA over several strong baselines.
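The closed-loop "act-or-not" idea from the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the 1-D point environment, the proportional controllers, the gains, the error threshold used as the gate, and the names `rollout`, `slow_period`, and `threshold`. In the paper both controllers are learned policies; here they are hand-written rules.

```python
def rollout(steps=20, slow_period=4, threshold=0.5):
    """Toy closed-loop layering: a slow action is held, a fast gate may override it."""
    x, target = 0.0, 1.0          # hypothetical 1-D state and goal
    held = 0.0                    # last action chosen by the slow controller
    stats = {"slow": 0, "fast": 0}
    for t in range(steps):
        if t == 10:
            target = -1.0         # sudden change in the environment's demands
        if t % slow_period == 0:
            # Slow controller: decides only once per period (assumed rule).
            held = 0.2 * (target - x)
            stats["slow"] += 1
        if abs(target - x) > threshold:
            # Fast controller chooses to act and overrides the held action.
            action = 0.8 * (target - x)
            stats["fast"] += 1
        else:
            # Fast controller chooses not to act; the slow action is repeated.
            action = held
        x += action
    return x, stats


final_x, stats = rollout()
print(final_x, stats)
```

In this toy run the fast layer only intervenes when the error is large (at the start and right after the target flips), while the slow layer's held action covers the remaining steps, which is the kind of decision sparsity and action repetition the abstract describes.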
Submitted 5 February, 2023; v1 submitted 25 December, 2022;
originally announced January 2023.
-
Insulin Regimen ML-based control for T2DM patients
Authors:
Mark Shifrin,
Hava Siegelmann
Abstract:
We model the blood glucose level (BGL) of an individual T2DM patient as a stochastic process with a discrete number of states, governed mainly but not solely by the medication regimen (e.g. insulin injections). BGL states otherwise change according to various physiological triggers, which make the process stochastic and statistically unknown, though it is assumed to be quasi-stationary. To express the incentive for staying at a desired healthy BGL, we heuristically define a reward function that returns positive values for desirable BGLs and negative values for undesirable ones. The state space contains enough states to justify the memoryless assumption. This, in turn, allows us to formulate a Markov Decision Process (MDP) whose objective is to maximize the total reward accumulated over a long run. The probability law is found by model-based reinforcement learning (RL), and the optimal insulin treatment policy is retrieved from the MDP solution.
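A minimal sketch of the pipeline the abstract outlines (estimate the probability law from observed transitions, then solve the MDP) is given below. It is illustrative only: the three BGL states, the two actions, the transition counts, the reward values, and the discount factor `GAMMA` are invented toy numbers rather than clinical data or the authors' model, and value iteration is used here as one standard way to solve an MDP, which the abstract does not specify.

```python
GAMMA = 0.95                      # assumed discount factor
REWARD = [-1.0, 1.0, -1.0]        # states: 0 = low, 1 = healthy, 2 = high BGL

# Hypothetical observed transition counts: COUNTS[action][state][next_state].
# Action 0 = no insulin dose, action 1 = insulin dose.
COUNTS = {
    0: [[5, 5, 0], [1, 6, 3], [0, 2, 8]],
    1: [[9, 1, 0], [4, 5, 1], [1, 7, 2]],
}


def estimate_transitions(counts):
    """Model-based step: turn counts into empirical transition probabilities."""
    return {a: [[c / sum(row) for c in row] for row in rows]
            for a, rows in counts.items()}


def q_value(P, V, reward, gamma, s, a):
    """Expected return of taking action a in state s, then following V."""
    return reward[s] + gamma * sum(p * v for p, v in zip(P[a][s], V))


def value_iteration(P, reward, gamma, sweeps=500):
    """Solve the estimated MDP and return state values and a greedy policy."""
    n = len(reward)
    V = [0.0] * n
    for _ in range(sweeps):
        V = [max(q_value(P, V, reward, gamma, s, a) for a in P)
             for s in range(n)]
    policy = [max(P, key=lambda a: q_value(P, V, reward, gamma, s, a))
              for s in range(n)]
    return V, policy


P = estimate_transitions(COUNTS)
V, policy = value_iteration(P, REWARD, GAMMA)
print(policy)
```

With these toy counts the greedy policy doses only in the high-BGL state, which matches the intuition that the reward function pulls the process toward the healthy state.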
Submitted 21 October, 2017;
originally announced October 2017.