-
Improving the Robustness of Reinforcement Learning Policies with $\mathcal{L}_{1}$ Adaptive Control
Authors:
Y. Cheng,
P. Zhao,
F. Wang,
D. J. Block,
N. Hovakimyan
Abstract:
A reinforcement learning (RL) control policy could fail in a new/perturbed environment that is different from the training environment, due to the presence of dynamic variations. For controlling systems with continuous state and action spaces, we propose an add-on approach to robustifying a pre-trained RL policy by augmenting it with an $\mathcal{L}_{1}$ adaptive controller ($\mathcal{L}_{1}$AC). Leveraging the capability of an $\mathcal{L}_{1}$AC for fast estimation and active compensation of dynamic variations, the proposed approach can improve the robustness of an RL policy which is trained either in a simulator or in the real world without consideration of a broad class of dynamic variations. Numerical and real-world experiments empirically demonstrate the efficacy of the proposed approach in robustifying RL policies trained using both model-free and model-based methods.
Submitted 29 August, 2022; v1 submitted 3 December, 2021;
originally announced December 2021.
-
HOPPY: An Open-source Kit for Education with Dynamic Legged Robots
Authors:
Joao Ramos,
Yanran Ding,
Young-woo Sim,
Kevin Murphy,
Daniel Block
Abstract:
This paper introduces HOPPY, an open-source, low-cost, robust, and modular kit for robotics education. The robot dynamically hops around a rotating gantry with a fixed base. The kit is intended to lower the entry barrier for studying dynamic robots and legged locomotion with real systems. It bridges the theoretical content of fundamental robotics courses with real dynamic robots by facilitating and guiding the software and hardware integration. This paper describes the topics which can be studied using the kit, lists its components, discusses preferred practices for implementation, presents results from experiments with the simulator and the real system, and suggests further improvements. A simple heuristic-based controller is described to achieve velocities up to 1.7 m/s, navigate small objects, and mitigate external disturbances when the robot is aided by a counterweight. HOPPY was utilized as the subject of a semester-long project for the Robot Dynamics and Control course at the University of Illinois at Urbana-Champaign. The positive feedback from the students and instructors about the hands-on activities during the course motivates us to share this kit and continue improving it in the future.
Submitted 15 March, 2021;
originally announced March 2021.
-
HOPPY: An open-source and low-cost kit for dynamic robotics education
Authors:
Joao Ramos,
Yanran Ding,
Young-woo Sim,
Kevin Murphy,
Daniel Block
Abstract:
This letter introduces HOPPY, an open-source, low-cost, robust, and modular kit for robotics education. The robot dynamically hops around a rotating gantry with a fixed base. The kit lowers the entry barrier for studying dynamic robots and legged locomotion in real systems. The kit bridges the theoretical content of fundamental robotics courses and real dynamic robots by facilitating and guiding the software and hardware integration. This letter describes the topics which can be studied using the kit, lists its components, discusses best practices for implementation, presents results from experiments with the simulator and the real system, and suggests further improvements. A simple controller is described to achieve velocities up to 2 m/s, navigate small objects, and mitigate external disturbances (kicks). HOPPY was utilized as the topic of a semester-long project for the Robot Dynamics and Control course at the University of Illinois at Urbana-Champaign. Students provided overwhelmingly positive feedback on the hands-on activities during the course, and the instructors will continue to improve the kit for upcoming semesters.
Submitted 27 October, 2020;
originally announced October 2020.
-
Resource Allocation for a Wireless Coexistence Management System Based on Reinforcement Learning
Authors:
Philip Soeffker,
Dimitri Block,
Nico Wiebusch,
Uwe Meier
Abstract:
In industrial environments, an increasing number of wireless devices that utilize license-free bands are in use. As a consequence, mutual interference between wireless systems might degrade the state of coexistence. Therefore, a central coexistence management system is needed, which allocates conflict-free resources to wireless systems. To ensure conflict-free resource utilization, it is useful to predict the prospective medium utilization before resources are allocated. This paper presents a self-learning concept based on reinforcement learning. A simulation-based evaluation of reinforcement learning agents based on neural networks, called deep Q-networks and double deep Q-networks, was carried out for exemplary and practically relevant coexistence scenarios. The evaluation of the double deep Q-network showed that a prediction accuracy of at least 98 % can be reached in all investigated scenarios.
Submitted 24 May, 2018;
originally announced June 2018.
-
Multi-Label Wireless Interference Identification with Convolutional Neural Networks
Authors:
Sergej Grunau,
Dimitri Block,
Uwe Meier
Abstract:
The steadily growing use of license-free frequency bands requires reliable coexistence management and therefore proper wireless interference identification (WII). In this work, we propose a WII approach based upon a deep convolutional neural network (CNN) which classifies multiple IEEE 802.15.1, IEEE 802.11 b/g and IEEE 802.15.4 interfering signals in the presence of a utilized signal. The generated multi-label dataset contains frequency- and time-limited sensing snapshots with a bandwidth of 10 MHz and a duration of 12.8 μs, respectively. Each snapshot combines one utilized signal with up to multiple interfering signals. The approach shows promising results for same-technology interference with a classification accuracy of approximately 100 % for IEEE 802.15.1 and IEEE 802.15.4 signals. For IEEE 802.11 b/g signals the accuracy increases for cross-technology interference with at least 90 %.
Submitted 12 April, 2018;
originally announced April 2018.
-
Wireless Interference Identification with Convolutional Neural Networks
Authors:
Malte Schmidt,
Dimitri Block,
Uwe Meier
Abstract:
The steadily growing use of license-free frequency bands requires reliable coexistence management for deterministic medium utilization. For interference mitigation, proper wireless interference identification (WII) is essential. In this work we propose the first WII approach based upon deep convolutional neural networks (CNNs). The CNN naively learns its features through self-optimization during an extensive data-driven GPU-based training process. We propose a CNN example which is based upon sensing snapshots with a limited duration of 12.8 μs and an acquisition bandwidth of 10 MHz. The CNN distinguishes between 15 classes. They represent packet transmissions of IEEE 802.11 b/g, IEEE 802.15.4 and IEEE 802.15.1 with overlapping frequency channels within the 2.4 GHz ISM band. We show that the CNN outperforms state-of-the-art WII approaches and has a classification accuracy greater than 95% for a signal-to-noise ratio of at least -5 dB.
Submitted 2 March, 2017;
originally announced March 2017.
-
Simulations on Consumer Tests: Systematic Evaluation of Tolerance Ranges by Model-Based Generation of Simulation Scenarios
Authors:
Christian Berger,
Delf Block,
Sönke Heeren,
Christian Hons,
Stefan Kühnel,
André Leschke,
Dimitri Plotnikov,
Bernhard Rumpe
Abstract:
Context: Since 2014, several modern cars have been rated on the performance of their active safety systems by the European New Car Assessment Programme (EuroNCAP). Nowadays, consumer tests play a significant role in an OEM's series development with a worldwide perspective, because a top rating is needed to underline the worthiness of active safety features from the customers' point of view. Furthermore, EuroNCAP has already published its roadmap 2020, which outlines further extensions to today's testing and rating procedures that will tighten the current requirements placed on those systems. In particular, Autonomous Emergency Braking/Forward Collision Warning (AEB/FCW) systems are going to face a broader field of application, such as pedestrian detection or two-way traffic scenarios. Objective: This work focuses on the systematic generation of test scenarios, concentrating on specific parameters that can vary within certain tolerance ranges, for example the lateral position of the vehicle-under-test (VUT) and its test velocity. It is of high interest to examine the effect of the tolerance ranges on the braking points in different test cases representing different trajectories and velocities, because they will significantly influence the later scoring during the assessments and thus the safety abilities of the car in question. Method: We present a formal model using a graph to represent the allowed variances based on the relevant points in time. Varying velocities of the VUT are then added to the model while the vehicle is approaching a target vehicle. The derived trajectories were used as test cases for a simulation environment. By selecting interesting test cases and processing them with the simulation environment, the influence of different test parameters on the system's performance will be investigated.
Submitted 9 September, 2015;
originally announced September 2015.
-
Meta-Metrics for Simulations in Software Engineering on the Example of Integral Safety Systems
Authors:
Christian Berger,
Delf Block,
Christian Hons,
Stefan Kühnel,
André Leschke,
Bernhard Rumpe,
Torsten Strutz
Abstract:
Vehicle passengers and other traffic participants are protected more and more by integral safety systems. These systems continuously perceive the vehicle's environment to prevent dangerous situations, e.g. by emergency braking. Furthermore, increasingly intelligent vehicle functions are still of major interest in research and development to reduce the risk of accidents. However, the development and testing of these functions should not rely only on validations on proving grounds and on long-term test-runs in real traffic; instead, they should be extended by virtual testing approaches to model potentially dangerous situations or to re-run specific traffic situations easily. This article outlines meta-metrics as one of today's challenges for the software engineering of these cyber-physical systems to provide guidance during the system development: For example, unstable results of simulation test-runs over the revision history of a vehicle function are elaborated as a metric indicating where to focus with real or further virtual test-runs; furthermore, varying acting time points for the same virtual traffic situation indicate problems with the reliability of interpreting the specific situation. In this article, several such meta-metrics are discussed and assigned both to different phases during the series development and to different levels of detail of virtual testing approaches.
Submitted 25 August, 2014;
originally announced August 2014.
-
Simulations on Consumer Tests: A Perspective for Driver Assistance Systems
Authors:
Delf Block,
Sönke Heeren,
Stefan Kühnel,
André Leschke,
Bernhard Rumpe,
Vladislavs Serebro
Abstract:
This article discusses new challenges for series development regarding vehicle safety that arise from the recently published AEB test protocol by the consumer-test organisation EuroNCAP for driver assistance systems [6]. The tests from the test protocol are of great significance for an OEM that sells millions of cars each year, because a positive rating of the vehicle-under-test (VUT) in safety-relevant aspects is important for the reputation of a car manufacturer. The further intensification and tightening of the test requirements for those systems is one of the challenges that has to be mastered in order to continuously make significant contributions to safety for high-volume cars. Therefore, it is to be shown how a simulation approach may support the development process, especially with tolerance analysis. This article discusses the current stage of work, steps that are planned for the future, and results that can be expected at the end of such an analysis.
Submitted 14 August, 2014;
originally announced August 2014.