-
Learning Safety Constraints From Demonstration Using One-Class Decision Trees
Authors:
Mattijs Baert,
Sam Leroux,
Pieter Simoens
Abstract:
The alignment of autonomous agents with human values is a pivotal challenge when deploying these agents within physical environments, where safety is an important concern. However, defining the agent's objective as a reward and/or cost function is inherently complex and prone to human errors. In response to this challenge, we present a novel approach that leverages one-class decision trees to facilitate learning from expert demonstrations. These decision trees provide a foundation for representing a set of constraints pertinent to the given environment as a logical formula in disjunctive normal form. The learned constraints are subsequently employed within an oracle constrained reinforcement learning framework, enabling the acquisition of a safe policy. In contrast to other methods, our approach offers an interpretable representation of the constraints, a vital feature in safety-critical environments. To validate the effectiveness of our proposed method, we conduct experiments in synthetic benchmark domains and a realistic driving environment.
Submitted 14 December, 2023;
originally announced December 2023.
-
Maximum Causal Entropy Inverse Constrained Reinforcement Learning
Authors:
Mattijs Baert,
Pietro Mazzaglia,
Sam Leroux,
Pieter Simoens
Abstract:
When deploying artificial agents in real-world environments where they interact with humans, it is crucial that their behavior is aligned with the values, social norms or other requirements of that environment. However, many environments have implicit constraints that are difficult to specify and transfer to a learning agent. To address this challenge, we propose a novel method that utilizes the principle of maximum causal entropy to learn constraints and an optimal policy that adheres to these constraints, using demonstrations of agents that abide by the constraints. We prove convergence in a tabular setting and provide an approximation which scales to complex environments. We evaluate the effectiveness of the learned policy by assessing the reward received and the number of constraint violations, and we evaluate the learned cost function based on its transferability to other agents. Our method has been shown to outperform state-of-the-art approaches across a variety of tasks and environments, and it is able to handle problems with stochastic dynamics and a continuous state-action space.
Submitted 4 May, 2023;
originally announced May 2023.
-
Deep learning for enhanced free-space optical communications
Authors:
Manon P. Bart,
Nicholas J. Savino,
Paras Regmi,
Lior Cohen,
Haleh Safavi,
Harry C. Shaw,
Sanjaya Lohani,
Thomas A. Searles,
Brian T. Kirby,
Hwang Lee,
Ryan T. Glasser
Abstract:
Atmospheric effects, such as turbulence and background thermal noise, inhibit the propagation of coherent light used in ON-OFF keying free-space optical communication. Here we present and experimentally validate a convolutional neural network to reduce the bit error rate of free-space optical communication in post-processing that is significantly simpler and cheaper than existing solutions based on advanced optics. Our approach consists of two neural networks, the first determining the presence of coherent bit sequences in thermal noise and turbulence and the second demodulating the coherent bit sequences. All data used for training and testing our network is obtained experimentally by generating ON-OFF keying bit streams of coherent light, combining these with thermal light, and passing the resultant light through a turbulent water tank which we have verified mimics turbulence in the air to a high degree of accuracy. Our convolutional neural network improves detection accuracy over threshold classification schemes and has the capability to be integrated with current demodulation and error correction schemes.
Submitted 15 August, 2022;
originally announced August 2022.
-
Intelligent Frame Selection as a Privacy-Friendlier Alternative to Face Recognition
Authors:
Mattijs Baert,
Sam Leroux,
Pieter Simoens
Abstract:
The widespread deployment of surveillance cameras for facial recognition gives rise to many privacy concerns. This study proposes a privacy-friendly alternative to large-scale facial recognition. While there are multiple techniques to preserve privacy, our work is based on the minimization principle, which implies minimizing the amount of collected personal data. Instead of running facial recognition software on all video data, we propose to automatically extract a high-quality snapshot of each detected person without revealing his or her identity. This snapshot is then encrypted and access is only granted after legal authorization. We introduce a novel unsupervised face image quality assessment method which is used to select the high-quality snapshots. For this, we train a variational autoencoder on high-quality face images from a publicly available dataset and use the reconstruction probability as a metric to estimate the quality of each face crop. We experimentally confirm that the reconstruction probability can be used as a biometric quality predictor. Unlike most previous studies, we do not rely on a manually defined face quality metric, as everything is learned from data. Our face quality assessment method outperforms supervised, unsupervised and general image quality assessment methods on the task of improving face verification performance by rejecting low-quality images. The effectiveness of the whole system is validated qualitatively on still images and videos.
Submitted 27 January, 2021; v1 submitted 19 January, 2021;
originally announced January 2021.
-
Upset Recovery Control for Quadrotors Subjected to a Complete Rotor Failure from Large Initial Disturbances
Authors:
Sihao Sun,
Matthias Baert,
Bram Adriaan Strack van Schijndel,
Coen de Visser
Abstract:
This study has developed a fault-tolerant controller that is able to recover a quadrotor from arbitrary initial orientations and angular velocities, despite the complete failure of a rotor. This cascaded control method includes a position/altitude controller, an almost-global convergence attitude controller, and a control allocation method based on quadratic programming. As a major novelty, a constraint of undesirable angular velocity is derived and fused into the control allocator, which significantly improves the recovery performance. For validation, we have conducted a set of Monte Carlo simulations to test the reliability of the proposed method in recovering the quadrotor from arbitrary initial attitude/rate conditions. In addition, real-life flight tests have been performed. The results demonstrate that the post-failure quadrotor can recover after being casually tossed into the air.
Submitted 21 February, 2020;
originally announced February 2020.