Search | arXiv e-print repository

arXiv:2112.01953 [pdf, other]

doi 10.1109/LRA.2022.3169309

Improving the Robustness of Reinforcement Learning Policies with $\mathcal{L}_{1}$ Adaptive Control

Authors: Y. Cheng, P. Zhao, F. Wang, D. J. Block, N. Hovakimyan

Abstract: A reinforcement learning (RL) control policy could fail in a new/perturbed environment that is different from the training environment, due to the presence of dynamic variations. For controlling systems with continuous state and action spaces, we propose an add-on approach to robustifying a pre-trained RL policy by augmenting it with an $\mathcal{L}_{1}$ adaptive controller ($\mathcal{L}_{1}$AC). Leveraging the capability of an $\mathcal{L}_{1}$AC for fast estimation and active compensation of dynamic variations, the proposed approach can improve the robustness of an RL policy which is trained either in a simulator or in the real world without consideration of a broad class of dynamic variations. Numerical and real-world experiments empirically demonstrate the efficacy of the proposed approach in robustifying RL policies trained using both model-free and model-based methods.

Submitted 29 August, 2022; v1 submitted 3 December, 2021; originally announced December 2021.

Comments: Included extended work for the journal version https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9761728. arXiv admin note: substantial text overlap with arXiv:2106.02249

Journal ref: IEEE ROBOTICS AND AUTOMATION LETTERS, VOL. 7, NO. 3, JULY 2022

arXiv:2103.08433 [pdf, other]

HOPPY: An Open-source Kit for Education with Dynamic Legged Robots

Authors: Joao Ramos, Yanran Ding, Young-woo Sim, Kevin Murphy, Daniel Block

Abstract: This paper introduces HOPPY, an open-source, low-cost, robust, and modular kit for robotics education. The robot dynamically hops around a rotating gantry with a fixed base. The kit is intended to lower the entry barrier for studying dynamic robots and legged locomotion with real systems. It bridges the theoretical content of fundamental robotic courses with real dynamic robots by facilitating and guiding the software and hardware integration. This paper describes the topics which can be studied using the kit, lists its components, discusses preferred practices for implementation, presents results from experiments with the simulator and the real system, and suggests further improvements. A simple heuristic-based controller is described to achieve velocities up to 1.7m/s, navigate small objects, and mitigate external disturbances when the robot is aided by a counterweight. HOPPY was utilized as the subject of a semester-long project for the Robot Dynamics and Control course at the University of Illinois at Urbana-Champaign. The positive feedback from the students and instructors about the hands-on activities during the course motivates us to share this kit and continue improving in the future.

Submitted 15 March, 2021; originally announced March 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2010.14580

arXiv:2010.14580 [pdf, other]

HOPPY: An open-source and low-cost kit for dynamic robotics education

Authors: Joao Ramos, Yanran Ding, Young-woo Sim, Kevin Murphy, Daniel Block

Abstract: This letter introduces HOPPY, an open-source, low-cost, robust, and modular kit for robotics education. The robot dynamically hops around a rotating gantry with a fixed base. The kit lowers the entry barrier for studying dynamic robots and legged locomotion in real systems. The kit bridges the theoretical content of fundamental robotic courses and real dynamic robots by facilitating and guiding the software and hardware integration. This letter describes the topics which can be studied using the kit, lists its components, discusses best practices for implementation, presents results from experiments with the simulator and the real system, and suggests further improvements. A simple controller is described to achieve velocities up to 2m/s, navigate small objects, and mitigate external disturbances (kicks). HOPPY was utilized as the topic of a semester-long project for the Robot Dynamics and Control course at the University of Illinois at Urbana-Champaign. Students provided an overwhelmingly positive feedback from the hands-on activities during the course and the instructors will continue to improve the kit for upcoming semesters.

Submitted 27 October, 2020; originally announced October 2020.

arXiv:1806.04702 [pdf, other]

Resource Allocation for a Wireless Coexistence Management System Based on Reinforcement Learning

Authors: Philip Soeffker, Dimitri Block, Nico Wiebusch, Uwe Meier

Abstract: In industrial environments, an increasing amount of wireless devices are used, which utilize license-free bands. As a consequence of these mutual interferences of wireless systems might decrease the state of coexistence. Therefore, a central coexistence management system is needed, which allocates conflict-free resources to wireless systems. To ensure a conflict-free resource utilization, it is useful to predict the prospective medium utilization before resources are allocated. This paper presents a self-learning concept, which is based on reinforcement learning. A simulative evaluation of reinforcement learning agents based on neural networks, called deep Q-networks and double deep Q-networks, was realized for exemplary and practically relevant coexistence scenarios. The evaluation of the double deep Q-network showed that a prediction accuracy of at least 98 % can be reached in all investigated scenarios.

Submitted 24 May, 2018; originally announced June 2018.

Comments: Submitted to the 23rd IEEE International Conference on Emerging Technologies and Factory Automation (ETFA 2018)

arXiv:1804.04395 [pdf, other]

Multi-Label Wireless Interference Identification with Convolutional Neural Networks

Authors: Sergej Grunau, Dimitri Block, Uwe Meier

Abstract: The steadily growing use of license-free frequency bands require reliable coexistence management and therefore proper wireless interference identification (WII). In this work, we propose a WII approach based upon a deep convolutional neural network (CNN) which classifies multiple IEEE 802.15.1, IEEE 802.11 b/g and IEEE 802.15.4 interfering signals in the presence of a utilized signal. The generated multi-label dataset contains frequency- and time-limited sensing snapshots with the bandwidth of 10 MHz and duration of 12.8 μs, respectively. Each snapshot combines one utilized signal with up to multiple interfering signals. The approach shows promising results for same-technology interference with a classification accuracy of approximately 100 % for IEEE 802.15.1 and IEEE 802.15.4 signals. For IEEE 802.11 b/g signals the accuracy increases for cross-technology interference with at least 90 %.

Submitted 12 April, 2018; originally announced April 2018.

Comments: Submitted to the 16th International Conference on Industrial Informatics (INDIN 2018)

arXiv:1703.00737 [pdf, other]

doi 10.1109/indin.2017.8104767

Wireless Interference Identification with Convolutional Neural Networks

Authors: Malte Schmidt, Dimitri Block, Uwe Meier

Abstract: The steadily growing use of license-free frequency bands requires reliable coexistence management for deterministic medium utilization. For interference mitigation, proper wireless interference identification (WII) is essential. In this work we propose the first WII approach based upon deep convolutional neural networks (CNNs). The CNN naively learns its features through self-optimization during an extensive data-driven GPU-based training process. We propose a CNN example which is based upon sensing snapshots with a limited duration of 12.8 μs and an acquisition bandwidth of 10 MHz. The CNN differs between 15 classes. They represent packet transmissions of IEEE 802.11 b/g, IEEE 802.15.4 and IEEE 802.15.1 with overlapping frequency channels within the 2.4 GHz ISM band. We show that the CNN outperforms state-of-the-art WII approaches and has a classification accuracy greater than 95% for signal-to-noise ratio of at least -5 dB.

Submitted 2 March, 2017; originally announced March 2017.

Journal ref: IEEE 15th International Conference on Industrial Informatics (INDIN)

arXiv:1509.02654 [pdf]

Simulations on Consumer Tests: Systematic Evaluation of Tolerance Ranges by Model-Based Generation of Simulation Scenarios

Authors: Christian Berger, Delf Block, Sönke Heeren, Christian Hons, Stefan Kühnel, André Leschke, Dimitri Plotnikov, Bernhard Rumpe

Abstract: Context: Since 2014 several modern cars were rated regarding the performances of their active safety systems at the European New Car Assessment Programme (EuroNCAP). Nowadays, consumer tests play a significant role for the OEM's series development with worldwide perspective, because a top rating is needed to underline the worthiness of active safety features from the customers' point of view. Furthermore, EuroNCAP already published their roadmap 2020 in which they outline further extensions in today's testing and rating procedures that will aggravate the current requirements addressed to those systems. Especially Autonomous Emergency Braking/Forward Collision Warning systems (AEB/FCW) are going to face a broader field of application as pedestrian detection or two-way traffic scenarios. Objective: This work focuses on the systematic generation of test scenarios concentrating on specific parameters that can vary within certain tolerance ranges like the lateral position of the vehicle-under-test (VUT) and its test velocity for example. It is of high interest to examine the effect of the tolerance ranges on the braking points in different test cases representing different trajectories and velocities because they will influence significantly a later scoring during the assessments and thus the safety abilities of the regarding car. Method: We present a formal model using a graph to represent the allowed variances based on the relevant points in time. Now, varying velocities of the VUT will be added to the model while the vehicle is approaching a target vehicle. The derived trajectories were used as test cases for a simulation environment. Selecting interesting test cases and processing them with the simulation environment, the influence on the system's performance of different test parameters will be investigated.

Submitted 9 September, 2015; originally announced September 2015.

Comments: 15 pages, 6 figures, Fahrerassistenzsysteme und Integrierte Sicherheit, VDI Berichte 2014, pp. 403-418

Journal ref: Fahrerassistenzsysteme und Integrierte Sicherheit, VDI Berichte 2014, pp. 403-418

arXiv:1408.5691 [pdf]

Meta-Metrics for Simulations in Software Engineering on the Example of Integral Safety Systems

Authors: Christian Berger, Delf Block, Christian Hons, Stefan Kühnel, André Leschke, Bernhard Rumpe, Torsten Strutz

Abstract: Vehicles passengers and other traffic participants are protected more and more by integral safety systems. They continuously perceive the vehicles environment to prevent dangerous situations by e.g. emergency braking systems. Furthermore, increasingly intelligent vehicle functions are still of major interest in research and development to reduce the risk of accidents. However, the development and testing of these functions should not rely only on validations on proving grounds and on long-term test-runs in real traffic; instead, they should be extended by virtual testing approaches to model potentially dangerous situations or to re-run specific traffic situations easily. This article outlines meta-metrics as one of todays challenges for the software engineering of these cyber-physical systems to provide guidance during the system development: For example, unstable results of simulation test-runs over the vehicle functions revision history are elaborated as an indicating metric where to focus on with real or further virtual test-runs; furthermore, varying acting time points for the same virtual traffic situation are indicating problems with the reliability to interpret the specific situation. In this article, several of such meta-metrics are discussed and assigned both to different phases during the series development and to different levels of detailedness of virtual testing approaches.

Submitted 25 August, 2014; originally announced August 2014.

Report number: 13 pages, 5 figures

Journal ref: Braunschweiger Symposium AAET 2013, Automatisierungssysteme, Assistenzsysteme und eingebettete Systeme für Transportmittel, 6.-7.2.2013, pp. 136-148, Niedersachsen e.V.(Hrsg.) 2013

arXiv:1408.3231 [pdf]

doi 10.1145/2559627.2559633

Simulations on Consumer Tests: A Perspective for Driver Assistance Systems

Authors: Delf Block, Sönke Heeren, Stefan Kühnel, André Leschke, Bernhard Rumpe, Vladislavs Serebro

Abstract: This article discusses new challenges for series development regarding the vehicle safety that arise from the recently published AEB test protocol by the consumer-test-organisation EuroNCAP for driver assistance systems [6]. The tests from the test protocol are of great significance for an OEM that sells millions of cars each year, due to the fact that a positive rating of the vehicle-under-test (VUT) in safety relevant aspects is important for the reputation of a car manufacturer. The further intensification and aggravation of the test requirements for those systems is one of the challenges, that has to be mastered in order to continuously make significant contributions to safety for high-volume cars. Therefore, it is to be shown how a simulation approach may support the development process, especially with tolerance analysis. This article discusses the current stage of work, steps that are planned for the future and results that can be expected at the end of such an analysis.

Submitted 14 August, 2014; originally announced August 2014.

Comments: 6 pages, 5 figure, Proceedings of International Workshop on Engineering Simulations for Cyber-Physical Systems (ES4CPS '14)

Showing 1–9 of 9 results for author: Block, D