
Leveraging small language models for Text2SPARQL tasks to improve the resilience of AI assistance

Authors: Felix Brei, Johannes Frey, Lars-Peter Meyer

Abstract: In this work we will show that language models with less than one billion parameters can be used to translate natural language to SPARQL queries after fine-tuning. Using three different datasets ranging from academic to real world, we identify prerequisites that the training data must fulfill in order for the training to be successful. The goal is to empower users of semantic web technology to use AI assistance with affordable commodity hardware, making them more resilient against external factors.

Submitted 27 May, 2024; originally announced May 2024.

Comments: To appear in Proceedings of the Workshop on Linked Data-driven Resilience Research 2024 (D2R2) co-located with Extended Semantic Web Conference 2024 (ESWC 2024)

arXiv:2404.14157 [pdf, other]

Autonomous Forest Inventory with Legged Robots: System Design and Field Deployment

Authors: Matías Mattamala, Nived Chebrolu, Benoit Casseau, Leonard Freißmuth, Jonas Frey, Turcan Tuna, Marco Hutter, Maurice Fallon

Abstract: We present a solution for autonomous forest inventory with a legged robotic platform. Compared to their wheeled and aerial counterparts, legged platforms offer an attractive balance of endurance and low soil impact for forest applications. In this paper, we present the complete system architecture of our forest inventory solution which includes state estimation, navigation, mission planning, and real-time tree segmentation and trait estimation. We present preliminary results for three campaigns in forests in Finland and the UK and summarize the main outcomes, lessons, and challenges. Our UK experiment at the Forest of Dean with the ANYmal D legged platform, achieved an autonomous survey of a 0.96 hectare plot in 20 min, identifying over 100 trees with typical DBH accuracy of 2 cm.

Submitted 22 April, 2024; originally announced April 2024.

Comments: Accepted to the IEEE ICRA Workshop on Field Robotics 2024

arXiv:2404.11735 [pdf, other]

Learning with 3D rotations, a hitchhiker's guide to SO(3)

Authors: A. René Geist, Jonas Frey, Mikel Zobro, Anna Levina, Georg Martius

Abstract: Many settings in machine learning require the selection of a rotation representation. However, choosing a suitable representation from the many available options is challenging. This paper acts as a survey and guide through rotation representations. We walk through their properties that harm or benefit deep learning with gradient-based optimization. By consolidating insights from rotation-based learning, we provide a comprehensive overview of learning functions with rotation representations. We provide guidance on selecting representations based on whether rotations are in the model's input or output and whether the data primarily comprises small angles.

Submitted 19 June, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

Comments: Published at ICML 2024

arXiv:2404.07110 [pdf, other]

Wild Visual Navigation: Fast Traversability Learning via Pre-Trained Models and Online Self-Supervision

Authors: Matías Mattamala, Jonas Frey, Piotr Libera, Nived Chebrolu, Georg Martius, Cesar Cadena, Marco Hutter, Maurice Fallon

Abstract: Natural environments such as forests and grasslands are challenging for robotic navigation because of the false perception of rigid obstacles from high grass, twigs, or bushes. In this work, we present Wild Visual Navigation (WVN), an online self-supervised learning system for visual traversability estimation. The system is able to continuously adapt from a short human demonstration in the field, only using onboard sensing and computing. One of the key ideas to achieve this is the use of high-dimensional features from pre-trained self-supervised models, which implicitly encode semantic information that massively simplifies the learning task. Further, the development of an online scheme for supervision generator enables concurrent training and inference of the learned model in the wild. We demonstrate our approach through diverse real-world deployments in forests, parks, and grasslands. Our system is able to bootstrap the traversable terrain segmentation in less than 5 min of in-field training time, enabling the robot to navigate in complex, previously unseen outdoor terrains. Code: https://bit.ly/498b0CV - Project page: https://bit.ly/3M6nMHH

Submitted 10 April, 2024; originally announced April 2024.

Comments: Extended version of arXiv:2305.08510

arXiv:2403.17340 [pdf, ps, other]

Uniform Preorders and Partial Combinatory Algebras

Authors: Jonas Frey

Abstract: Uniform preorders are a class of combinatory representations of Set-indexed preorders that generalize Pieter Hofstra's basic relational objects. An indexed preorder is representable by a uniform preorder if and only if it has a generic predicate. We study the $\exists$-completion of indexed preorders on the level of uniform preorders, and identify a combinatory condition (called 'relational completeness') which characterizes those uniform preorders with finite meets whose $\exists$-completions are triposes. The class of triposes obtained this way contains relative realizability triposes, for which we derive a characterization as a fibrational analogue of the characterization of realizability toposes given in earlier work. Besides relative partial combinatory algebras, the class of relationally complete uniform preorders contains filtered ordered partial combinatory algebras, and it is unclear if there are any others.

Submitted 25 March, 2024; originally announced March 2024.

Comments: 21 pages

MSC Class: 03G30

arXiv:2402.19341 [pdf, other]

RoadRunner - Learning Traversability Estimation for Autonomous Off-road Driving

Authors: Jonas Frey, Shehryar Khattak, Manthan Patel, Deegan Atha, Julian Nubert, Curtis Padgett, Marco Hutter, Patrick Spieler

Abstract: Autonomous navigation at high speeds in off-road environments necessitates robots to comprehensively understand their surroundings using onboard sensing only. The extreme conditions posed by the off-road setting can cause degraded camera image quality due to poor lighting and motion blur, as well as limited sparse geometric information available from LiDAR sensing when driving at high speeds. In this work, we present RoadRunner, a novel framework capable of predicting terrain traversability and an elevation map directly from camera and LiDAR sensor inputs. RoadRunner enables reliable autonomous navigation, by fusing sensory information, handling of uncertainty, and generation of contextually informed predictions about the geometry and traversability of the terrain while operating at low latency. In contrast to existing methods relying on classifying handcrafted semantic classes and using heuristics to predict traversability costs, our method is trained end-to-end in a self-supervised fashion. The RoadRunner network architecture builds upon popular sensor fusion network architectures from the autonomous driving domain, which embed LiDAR and camera information into a common Bird's Eye View perspective. Training is enabled by utilizing an existing traversability estimation stack to generate training data in hindsight in a scalable manner from real-world off-road driving datasets. Furthermore, RoadRunner improves the system latency by a factor of roughly 4, from 500 ms to 140 ms, while improving the accuracy for traversability costs and elevation map predictions. We demonstrate the effectiveness of RoadRunner in enabling safe and reliable off-road navigation at high speeds in multiple real-world driving scenarios through unstructured desert environments.

Submitted 3 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

Comments: under review for Field Robotics

arXiv:2310.03581 [pdf, other]

Resilient Legged Local Navigation: Learning to Traverse with Compromised Perception End-to-End

Authors: ** **, Chong Zhang, Jonas Frey, Nikita Rudin, Matias Mattamala, Cesar Cadena, Marco Hutter

Abstract: Autonomous robots must navigate reliably in unknown environments even under compromised exteroceptive perception, or perception failures. Such failures often occur when harsh environments lead to degraded sensing, or when the perception algorithm misinterprets the scene due to limited generalization. In this paper, we model perception failures as invisible obstacles and pits, and train a reinforcement learning (RL) based local navigation policy to guide our legged robot. Unlike previous works relying on heuristics and anomaly detection to update navigational information, we train our navigation policy to reconstruct the environment information in the latent space from corrupted perception and react to perception failures end-to-end. To this end, we incorporate both proprioception and exteroception into our policy inputs, thereby enabling the policy to sense collisions on different body parts and pits, prompting corresponding reactions. We validate our approach in simulation and on the real quadruped robot ANYmal running in real-time (<10 ms CPU inference). In a quantitative comparison with existing heuristic-based locally reactive planners, our policy increases the success rate over 30% when facing perception failures. Project Page: https://bit.ly/45NBTuh.

Submitted 5 October, 2023; originally announced October 2023.

Comments: Website and videos are available at our Project Page: https://bit.ly/45NBTuh

arXiv:2309.17122 [pdf, other]

Benchmarking the Abilities of Large Language Models for RDF Knowledge Graph Creation and Comprehension: How Well Do LLMs Speak Turtle?

Authors: Johannes Frey, Lars-Peter Meyer, Natanael Arndt, Felix Brei, Kirill Bulert

Abstract: Large Language Models (LLMs) are advancing at a rapid pace, with significant improvements at natural language processing and coding tasks. Yet, their ability to work with formal languages representing data, specifically within the realm of knowledge graph engineering, remains under-investigated. To evaluate the proficiency of various LLMs, we created a set of five tasks that probe their ability to parse, understand, analyze, and create knowledge graphs serialized in Turtle syntax. These tasks, each embodying distinct degrees of complexity and being able to scale with the size of the problem, have been integrated into our automated evaluation system, the LLM-KG-Bench. The evaluation encompassed four commercially available LLMs - GPT-3.5, GPT-4, Claude 1.3, and Claude 2.0, as well as two freely accessible offline models, GPT4All Vicuna and GPT4All Falcon 13B. This analysis offers an in-depth understanding of the strengths and shortcomings of LLMs in relation to their application within RDF knowledge graph engineering workflows utilizing Turtle representation. While our findings show that the latest commercial models outperform their forerunners in terms of proficiency with the Turtle language, they also reveal an apparent weakness. These models fall short when it comes to adhering strictly to the output formatting constraints, a crucial requirement in this context.

Submitted 29 September, 2023; originally announced September 2023.

Comments: accepted for proceedings of DL4KG Workshop @ ISWC 2023 at ceur-ws.org

arXiv:2309.16818 [pdf, other]

MEM: Multi-Modal Elevation Mapping for Robotics and Learning

Authors: Gian Erni, Jonas Frey, Takahiro Miki, Matias Mattamala, Marco Hutter

Abstract: Elevation maps are commonly used to represent the environment of mobile robots and are instrumental for locomotion and navigation tasks. However, pure geometric information is insufficient for many field applications that require appearance or semantic information, which limits their applicability to other platforms or domains. In this work, we extend a 2.5D robot-centric elevation mapping framework by fusing multi-modal information from multiple sources into a popular map representation. The framework allows inputting data contained in point clouds or images in a unified manner. To manage the different nature of the data, we also present a set of fusion algorithms that can be selected based on the information type and user requirements. Our system is designed to run on the GPU, making it real-time capable for various robotic and learning tasks. We demonstrate the capabilities of our framework by deploying it on multiple robots with varying sensor configurations and showcasing a range of applications that utilize multi-modal layers, including line detection, human detection, and colorization.

Submitted 28 September, 2023; originally announced September 2023.

Comments: Accepted for IROS 2023. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2309.14246 [pdf, other]

Learning Risk-Aware Quadrupedal Locomotion using Distributional Reinforcement Learning

Authors: Lukas Schneider, Jonas Frey, Takahiro Miki, Marco Hutter

Abstract: Deployment in hazardous environments requires robots to understand the risks associated with their actions and movements to prevent accidents. Despite its importance, these risks are not explicitly modeled by currently deployed locomotion controllers for legged robots. In this work, we propose a risk sensitive locomotion training method employing distributional reinforcement learning to consider safety explicitly. Instead of relying on a value expectation, we estimate the complete value distribution to account for uncertainty in the robot's interaction with the environment. The value distribution is consumed by a risk metric to extract risk sensitive value estimates. These are integrated into Proximal Policy Optimization (PPO) to derive our method, Distributional Proximal Policy Optimization (DPPO). The risk preference, ranging from risk-averse to risk-seeking, can be controlled by a single parameter, which enables to adjust the robot's behavior dynamically. Importantly, our approach removes the need for additional reward function tuning to achieve risk sensitivity. We show emergent risk sensitive locomotion behavior in simulation and on the quadrupedal robot ANYmal. Videos of the experiments and code are available at https://sites.google.com/leggedrobotics.com/risk-aware-locomotion.

Submitted 3 May, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

arXiv:2308.16622 [pdf, other]

Developing a Scalable Benchmark for Assessing Large Language Models in Knowledge Graph Engineering

Authors: Lars-Peter Meyer, Johannes Frey, Kurt Junghanns, Felix Brei, Kirill Bulert, Sabine Gründer-Fahrer, Michael Martin

Abstract: As the field of Large Language Models (LLMs) evolves at an accelerated pace, the critical need to assess and monitor their performance emerges. We introduce a benchmarking framework focused on knowledge graph engineering (KGE) accompanied by three challenges addressing syntax and error correction, facts extraction and dataset generation. We show that while being a useful tool, LLMs are yet unfit to assist in knowledge graph generation with zero-shot prompting. Consequently, our LLM-KG-Bench framework provides automatic evaluation and storage of LLM responses as well as statistical data and visualization tools to support tracking of prompt engineering and model performance.

Submitted 31 August, 2023; originally announced August 2023.

Comments: To be published in SEMANTICS 2023 poster track proceedings. SEMANTICS 2023 EU: 19th International Conference on Semantic Systems, September 20-22, 2023, Leipzig, Germany

arXiv:2308.11967 [pdf, ps, other]

Duality for Clans: an Extension of Gabriel-Ulmer Duality

Authors: Jonas Frey

Abstract: Clans are representations of generalized algebraic theories that contain more information than the finite-limit categories associated to the locally finitely presentable categories of models via Gabriel-Ulmer duality. Extending Gabriel-Ulmer duality to account for this additional information, we present a duality theory between clans and locally finitely presentable categories equipped with a weak factorization system of a certain kind.

Submitted 29 October, 2023; v1 submitted 23 August, 2023; originally announced August 2023.

Comments: 38 pages

MSC Class: 03G30; 03B38

arXiv:2307.07522 [pdf, other]

The Future of Fundamental Science Led by Generative Closed-Loop Artificial Intelligence

Authors: Hector Zenil, Jesper Tegnér, Felipe S. Abrahão, Alexander Lavin, Vipin Kumar, Jeremy G. Frey, Adrian Weller, Larisa Soldatova, Alan R. Bundy, Nicholas R. Jennings, Koichi Takahashi, Lawrence Hunter, Saso Dzeroski, Andrew Briggs, Frederick D. Gregory, Carla P. Gomes, Jon Rowe, James Evans, Hiroaki Kitano, Ross King

Abstract: Recent advances in machine learning and AI, including Generative AI and LLMs, are disrupting technological innovation, product development, and society as a whole. AI's contribution to technology can come from multiple approaches that require access to large training data sets and clear performance evaluation criteria, ranging from pattern recognition and classification to generative models. Yet, AI has contributed less to fundamental science in part because large data sets of high-quality data for scientific practice and model discovery are more difficult to access. Generative AI, in general, and Large Language Models in particular, may represent an opportunity to augment and accelerate the scientific discovery of fundamental deep science with quantitative models. Here we explore and investigate aspects of an AI-driven, automated, closed-loop approach to scientific discovery, including self-driven hypothesis generation and open-ended autonomous exploration of the hypothesis space. Integrating AI-driven automation into the practice of science would mitigate current problems, including the replication of findings, systematic production of data, and ultimately democratisation of the scientific process. Realising these possibilities requires a vision for augmented AI coupled with a diversity of AI approaches able to deal with fundamental aspects of causality analysis and model discovery while enabling unbiased search across the space of putative explanations. These advances hold the promise to unleash AI's potential for searching and discovering the fundamental structure of our world beyond what human scientists have been able to achieve. Such a vision would push the boundaries of new fundamental science rather than automatize current workflows and instead open doors for technological innovation to tackle some of the greatest challenges facing humanity today.

Submitted 29 August, 2023; v1 submitted 9 July, 2023; originally announced July 2023.

Comments: 35 pages, first draft of the final report from the Alan Turing Institute on AI for Scientific Discovery

arXiv:2307.06917 [pdf, ps, other]

doi 10.1007/978-3-658-43705-3_8

LLM-assisted Knowledge Graph Engineering: Experiments with ChatGPT

Authors: Lars-Peter Meyer, Claus Stadler, Johannes Frey, Norman Radtke, Kurt Junghanns, Roy Meissner, Gordian Dziwis, Kirill Bulert, Michael Martin

Abstract: Knowledge Graphs (KG) provide us with a structured, flexible, transparent, cross-system, and collaborative way of organizing our knowledge and data across various domains in society and industrial as well as scientific disciplines. KGs surpass any other form of representation in terms of effectiveness. However, Knowledge Graph Engineering (KGE) requires in-depth experiences of graph structures, web technologies, existing models and vocabularies, rule sets, logic, as well as best practices. It also demands a significant amount of work. Considering the advancements in large language models (LLMs) and their interfaces and applications in recent years, we have conducted comprehensive experiments with ChatGPT to explore its potential in supporting KGE. In this paper, we present a selection of these experiments and their results to demonstrate how ChatGPT can assist us in the development and management of KGs.

Submitted 13 July, 2023; originally announced July 2023.

Comments: to appear in conference proceedings of AI-Tomorrow-23, 29.+30.6.2023 in Leipzig, Germany

Journal ref: Informatik aktuell. First Working Conference on Artificial Intelligence Development for a Resilient and Sustainable Tomorrow 2023. AIDRST 2023. p. 103-115

arXiv:2306.05309 [pdf, other]

SMUG Planner: A Safe Multi-Goal Planner for Mobile Robots in Challenging Environments

Authors: Changan Chen, Jonas Frey, Philip Arm, Marco Hutter

Abstract: Robotic exploration or monitoring missions require mobile robots to autonomously and safely navigate between multiple target locations in potentially challenging environments. Currently, this type of multi-goal mission often relies on humans designing a set of actions for the robot to follow in the form of a path or waypoints. In this work, we consider the multi-goal problem of visiting a set of pre-defined targets, each of which could be visited from multiple potential locations. To increase autonomy in these missions, we propose a safe multi-goal (SMUG) planner that generates an optimal motion path to visit those targets. To increase safety and efficiency, we propose a hierarchical state validity checking scheme, which leverages robot-specific traversability learned in simulation. We use LazyPRM* with an informed sampler to accelerate collision-free path generation. Our iterative dynamic programming algorithm enables the planner to generate a path visiting more than ten targets within seconds. Moreover, the proposed hierarchical state validity checking scheme reduces the planning time by 30% compared to pure volumetric collision checking and increases safety by avoiding high-risk regions. We deploy the SMUG planner on the quadruped robot ANYmal and show its capability to guide the robot in multi-goal missions fully autonomously on rough terrain.

Submitted 8 June, 2023; originally announced June 2023.

arXiv:2305.08510 [pdf, other]

Fast Traversability Estimation for Wild Visual Navigation

Authors: Jonas Frey, Matias Mattamala, Nived Chebrolu, Cesar Cadena, Maurice Fallon, Marco Hutter

Abstract: Natural environments such as forests and grasslands are challenging for robotic navigation because of the false perception of rigid obstacles from high grass, twigs, or bushes. In this work, we propose Wild Visual Navigation (WVN), an online self-supervised learning system for traversability estimation which uses only vision. The system is able to continuously adapt from a short human demonstration in the field. It leverages high-dimensional features from self-supervised visual transformer models, with an online scheme for supervision generation that runs in real-time on the robot. We demonstrate the advantages of our approach with experiments and ablation studies in challenging environments in forests, parks, and grasslands. Our system is able to bootstrap the traversable terrain segmentation in less than 5 min of in-field training time, enabling the robot to navigate in complex outdoor terrains - negotiating obstacles in high grass as well as a 1.4 km footpath following. While our experiments were executed with a quadruped robot, ANYmal, the approach presented can generalize to any ground robot.

Submitted 16 May, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

Comments: Accepted for Robotics: Science and Systems 2023

arXiv:2305.07995 [pdf, other]

Seeing Through the Grass: Semantic Pointcloud Filter for Support Surface Learning

Authors: Anqiao Li, Chenyu Yang, Jonas Frey, Joonho Lee, Cesar Cadena, Marco Hutter

Abstract: Mobile ground robots require perceiving and understanding their surrounding support surface to move around autonomously and safely. The support surface is commonly estimated based on exteroceptive depth measurements, e.g., from LiDARs. However, the measured depth fails to align with the true support surface in the presence of high grass or other penetrable vegetation. In this work, we present the Semantic Pointcloud Filter (SPF), a Convolutional Neural Network (CNN) that learns to adjust LiDAR measurements to align with the underlying support surface. The SPF is trained in a semi-self-supervised manner and takes as an input a LiDAR pointcloud and RGB image. The network predicts a binary segmentation mask that identifies the specific points requiring adjustment, along with estimating their corresponding depth values. To train the segmentation task, 300 distinct images are manually labeled into rigid and non-rigid terrain. The depth estimation task is trained in a self-supervised manner by utilizing the future footholds of the robot to estimate the support surface based on a Gaussian process. Our method can correctly adjust the support surface prior to interacting with the terrain and is extensively tested on the quadruped robot ANYmal. We show the qualitative benefits of SPF in natural environments for elevation mapping and traversability estimation compared to using raw sensor measurements and existing smoothing methods. Quantitative analysis is performed in various natural environments, and an improvement by 48% RMSE is achieved within a meadow terrain.

Submitted 13 May, 2023; originally announced May 2023.

Comments: 8 pages, 9 figures

arXiv:2304.01782 [pdf, other]

Imitation Learning from Nonlinear MPC via the Exact Q-Loss and its Gauss-Newton Approximation

Authors: Andrea Ghezzi, Jasper Hoffman, Jonathan Frey, Joschka Boedecker, Moritz Diehl

Abstract: This work presents a novel loss function for learning nonlinear Model Predictive Control policies via Imitation Learning. Standard approaches to Imitation Learning neglect information about the expert and generally adopt a loss function based on the distance between expert and learned controls. In this work, we present a loss based on the Q-function directly embedding the performance objectives and constraint satisfaction of the associated Optimal Control Problem (OCP). However, training a Neural Network with the Q-loss requires solving the associated OCP for each new sample. To alleviate the computational burden, we derive a second Q-loss based on the Gauss-Newton approximation of the OCP resulting in a faster training time. We validate our losses against Behavioral Cloning, the standard approach to Imitation Learning, on the control of a nonlinear system with constraints. The final results show that the Q-function-based losses significantly reduce the amount of constraint violations while achieving comparable or better closed-loop costs.

Submitted 3 April, 2023; originally announced April 2023.

Comments: Submitted to Conference on Decision and Control (CDC) 2023. The paper contains 6 pages

arXiv:2212.13115 [pdf, other]

Frenet-Cartesian Model Representations for Automotive Obstacle Avoidance within Nonlinear MPC

Authors: Rudolf Reiter, Armin Nurkanović, Jonathan Frey, Moritz Diehl

Abstract: In recent years, nonlinear model predictive control (NMPC) has been extensively used for solving automotive motion control and planning tasks. In order to formulate the NMPC problem, different coordinate systems can be used with different advantages. We propose and compare formulations for the NMPC related optimization problem, involving a Cartesian and a Frenet coordinate frame (CCF/FCF) in a single nonlinear program (NLP). We specify costs and collision avoidance constraints in the more advantageous coordinate frame, derive appropriate formulations and compare different obstacle constraints. With this approach, we exploit the simpler formulation of opponent vehicle constraints in the CCF, as well as road aligned costs and constraints related to the FCF. Comparisons to other approaches in a simulation framework highlight the advantages of the proposed approaches.

Submitted 22 December, 2022; originally announced December 2022.

arXiv:2211.13969 [pdf, other]

Unsupervised Continual Semantic Adaptation through Neural Rendering

Authors: Zhizheng Liu, Francesco Milano, Jonas Frey, Roland Siegwart, Hermann Blum, Cesar Cadena

Abstract: An increasing amount of applications rely on data-driven models that are deployed for perception tasks across a sequence of scenes. Due to the mismatch between training and deployment data, adapting the model on the new scenes is often crucial to obtain good performance. In this work, we study continual multi-scene adaptation for the task of semantic segmentation, assuming that no ground-truth labels are available during deployment and that performance on the previous scenes should be maintained. We propose training a Semantic-NeRF network for each scene by fusing the predictions of a segmentation model and then using the view-consistent rendered semantic labels as pseudo-labels to adapt the model. Through joint training with the segmentation model, the Semantic-NeRF model effectively enables 2D-3D knowledge transfer. Furthermore, due to its compact size, it can be stored in a long-term memory and subsequently used to render data from arbitrary viewpoints to reduce forgetting. We evaluate our approach on ScanNet, where we outperform both a voxel-based baseline and a state-of-the-art unsupervised domain adaptation method.

Submitted 24 March, 2023; v1 submitted 25 November, 2022; originally announced November 2022.

Comments: Accepted by the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023. Zhizheng Liu and Francesco Milano share first authorship. Hermann Blum and Cesar Cadena share senior authorship. 18 pages, 8 figures, 9 tables

arXiv:2210.12583 [pdf, other]

doi 10.1109/TRO.2023.3339543

Active Learning of Discrete-Time Dynamics for Uncertainty-Aware Model Predictive Control

Authors: Alessandro Saviolo, Jonathan Frey, Abhishek Rathod, Moritz Diehl, Giuseppe Loianno

Abstract: Model-based control requires an accurate model of the system dynamics for precisely and safely controlling the robot in complex and dynamic environments. Moreover, in the presence of variations in the operating conditions, the model should be continuously refined to compensate for dynamics changes. In this paper, we present a self-supervised learning approach that actively models the dynamics of nonlinear robotic systems. We combine offline learning from past experience and online learning from current robot interaction with the unknown environment. These two ingredients enable a highly sample-efficient and adaptive learning process, capable of accurately inferring model dynamics in real-time even in operating regimes that greatly differ from the training distribution. Moreover, we design an uncertainty-aware model predictive controller that is heuristically conditioned to the aleatoric (data) uncertainty of the learned dynamics. This controller actively chooses the optimal control actions that (i) optimize the control performance and (ii) improve the efficiency of online learning sample collection. We demonstrate the effectiveness of our method through a series of challenging real-world experiments using a quadrotor system. Our approach showcases high resilience and generalization capabilities by consistently adapting to unseen flight conditions, while it significantly outperforms classical and adaptive control baselines.

Submitted 7 December, 2023; v1 submitted 22 October, 2022; originally announced October 2022.

Journal ref: IEEE Transactions on Robotics

arXiv:2209.07899 [pdf, other]

Versatile Skill Control via Self-supervised Adversarial Imitation of Unlabeled Mixed Motions

Authors: Chenhao Li, Sebastian Blaes, Pavel Kolev, Marin Vlastelica, Jonas Frey, Georg Martius

Abstract: Learning diverse skills is one of the main challenges in robotics. To this end, imitation learning approaches have achieved impressive results. These methods require explicitly labeled datasets or assume consistent skill execution to enable learning and active control of individual behaviors, which limits their applicability. In this work, we propose a cooperative adversarial method for obtaining single versatile policies with controllable skill sets from unlabeled datasets containing diverse state transition patterns by maximizing their discriminability. Moreover, we show that by utilizing unsupervised skill discovery in the generative adversarial imitation learning framework, novel and useful skills emerge with successful task fulfillment. Finally, the obtained versatile policies are tested on an agile quadruped robot called Solo 8 and present faithful replications of diverse skills encoded in the demonstrations.

Submitted 11 February, 2023; v1 submitted 16 September, 2022; originally announced September 2022.

arXiv:2206.11693 [pdf, other]

Learning Agile Skills via Adversarial Imitation of Rough Partial Demonstrations

Authors: Chenhao Li, Marin Vlastelica, Sebastian Blaes, Jonas Frey, Felix Grimminger, Georg Martius

Abstract: Learning agile skills is one of the main challenges in robotics. To this end, reinforcement learning approaches have achieved impressive results. These methods require explicit task information in terms of a reward function or an expert that can be queried in simulation to provide a target control output, which limits their applicability. In this work, we propose a generative adversarial method for inferring reward functions from partial and potentially physically incompatible demonstrations for successful skill acquirement where reference or expert demonstrations are not easily accessible. Moreover, we show that by using a Wasserstein GAN formulation and transitions from demonstrations with rough and partial information as input, we are able to extract policies that are robust and capable of imitating demonstrated behaviors. Finally, the obtained skills such as a backflip are tested on an agile quadruped robot called Solo 8 and present faithful replication of hand-held human demonstrations.

Submitted 21 November, 2022; v1 submitted 23 June, 2022; originally announced June 2022.

arXiv:2203.15854 [pdf, other]

doi 10.1109/IROS47612.2022.9982190

Locomotion Policy Guided Traversability Learning using Volumetric Representations of Complex Environments

Authors: Jonas Frey, David Hoeller, Shehryar Khattak, Marco Hutter

Abstract: Despite the progress in legged robotic locomotion, autonomous navigation in unknown environments remains an open problem. Ideally, the navigation system utilizes the full potential of the robots' locomotion capabilities while operating within safety limits under uncertainty. The robot must sense and analyze the traversability of the surrounding terrain, which depends on the hardware, locomotion control, and terrain properties. It may contain information about the risk, energy, or time consumption needed to traverse the terrain. To avoid hand-crafted traversability cost functions we propose to collect traversability information about the robot and locomotion policy by simulating the traversal over randomly generated terrains using a physics simulator. Thousands of robots are simulated in parallel controlled by the same locomotion policy used in reality to acquire 57 years of real-world locomotion experience equivalent. For deployment on the real robot, a sparse convolutional network is trained to predict the simulated traversability cost, which is tailored to the deployed locomotion policy, from an entirely geometric representation of the environment in the form of a 3D voxel-occupancy map. This representation avoids the need for commonly used elevation maps, which are error-prone in the presence of overhanging obstacles and multi-floor or low-ceiling scenarios. The effectiveness of the proposed traversability prediction network is demonstrated for path planning for the legged robot ANYmal in various indoor and natural environments.

Submitted 21 August, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

Comments: accepted for 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022)

Journal ref: 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

arXiv:2112.12399 [pdf, other]

doi 10.1109/TBME.2021.3113854

Towards identifying optimal biased feedback for various user states and traits in motor imagery BCI

Authors: Jelena Mladenović, Jeremy Frey, Smeety Pramij, Jeremie Mattout, Fabien Lotte

Abstract: Objective. Neural self-regulation is necessary for achieving control over brain-computer interfaces (BCIs). This can be an arduous learning process especially for motor imagery BCI. Various training methods were proposed to assist users in accomplishing BCI control and increase performance. Notably the use of biased feedback, i.e. non-realistic representation of performance. Benefits of biased feedback on performance and learning vary between users (e.g. depending on their initial level of BCI control) and remain speculative. To disentangle the speculations, we investigate what personality type, initial state and calibration performance (CP) could benefit from a biased feedback. Methods. We conduct an experiment (n=30 for 2 sessions). The feedback provided to each group (n=10) is either positively, negatively or not biased. Results. Statistical analyses suggest that interactions between bias and: 1) workload, 2) anxiety, and 3) self-control significantly affect online performance. For instance, low initial workload paired with negative bias is associated to higher peak performances (86%) than without any bias (69%). High anxiety relates negatively to performance no matter the bias (60%), while low anxiety matches best with negative bias (76%). For low CP, learning rate (LR) increases with negative bias only short term (LR=2%) as during the second session it severely drops (LR=-1%). Conclusion. We unveil many interactions between said human factors and bias. Additionally, we use prediction models to confirm and reveal even more interactions. Significance. This paper is a first step towards identifying optimal biased feedback for a personality type, state, and CP in order to maximize BCI performance and learning.

Submitted 23 December, 2021; originally announced December 2021.

Comments: IEEE Transactions on Biomedical Engineering, Institute of Electrical and Electronics Engineers, 2021

arXiv:2111.02156 [pdf, other]

doi 10.1109/LRA.2022.3203812

Continual Adaptation of Semantic Segmentation using Complementary 2D-3D Data Representations

Authors: Jonas Frey, Hermann Blum, Francesco Milano, Roland Siegwart, Cesar Cadena

Abstract: Semantic segmentation networks are usually pre-trained once and not updated during deployment. As a consequence, misclassifications commonly occur if the distribution of the training data deviates from the one encountered during the robot's operation. We propose to mitigate this problem by adapting the neural network to the robot's environment during deployment, without any need for external supervision. Leveraging complementary data representations, we generate a supervision signal, by probabilistically accumulating consecutive 2D semantic predictions in a volumetric 3D map. We then train the network on renderings of the accumulated semantic map, effectively resolving ambiguities and enforcing multi-view consistency through the 3D representation. In contrast to scene adaptation methods, we aim to retain the previously-learned knowledge, and therefore employ a continual learning experience replay strategy to adapt the network. Through extensive experimental evaluation, we show successful adaptation to real-world indoor scenes both on the ScanNet dataset and on in-house data recorded with an RGB-D sensor. Our method increases the segmentation accuracy on average by 9.9% compared to the fixed pre-trained neural network, while retaining knowledge from the pre-training dataset.

Submitted 20 August, 2022; v1 submitted 3 November, 2021; originally announced November 2021.

Comments: Accepted for IEEE Robotics and Automation Letters (R-AL 2022)

Report number: 9874976

Journal ref: IEEE Robotics and Automation Letters 2022

arXiv:2003.09333 [pdf, other]

doi 10.1145/3313831.3376643

Physiologically Driven Storytelling: Concept and Software Tool

Authors: Jérémy Frey, Gilad Ostrin, May Grabli, Jessica Cauchard

Abstract: We put forth Physiologically Driven Storytelling, a new approach to interactive storytelling where narratives adaptively unfold based on the reader's physiological state. We first describe a taxonomy framing how physiological signals can be used to drive interactive systems both as input and output. We then propose applications to interactive storytelling and describe the implementation of a software tool to create Physiological Interactive Fiction (PIF). The results of an online study (N=140) provided guidelines towards augmenting the reading experience. PIF was then evaluated in a lab study (N=14) to determine how physiological signals can be used to infer a reader's state. Our results show that breathing, electrodermal activity, and eye tracking can help differentiate positive from negative tones, and monotonous from exciting events. This work demonstrates how PIF can support storytelling in creating engaging content and experience tailored to the reader. Moreover, it opens the space to future physiologically driven systems within broader application areas.

Submitted 20 March, 2020; originally announced March 2020.

Comments: CHI '20 - SIGCHI Conference on Human Factors in Computing System, Apr 2020, Honolulu, United States

arXiv:1901.02198 [pdf, other]

doi 10.1145/3282894.3289740

Interactive Narrative in Virtual Reality

Authors: Gilad Ostrin, Jérémy Frey, Jessica Cauchard

Abstract: Interactive fiction is a literary genre that is rapidly gaining popularity. In this genre, readers are able to explicitly take actions in order to guide the course of the story. With the recent popularity of narrative focused games, we propose to design and develop an interactive narrative tool for content creators. In this extended abstract, we show how we leverage this interactive medium to present a tool for interactive storytelling in virtual reality. Using a simple markup language, content creators and researchers are now able to create interactive narratives in a virtual reality environment. We further discuss the potential future directions for a virtual reality storytelling engine.

Submitted 8 January, 2019; originally announced January 2019.

Journal ref: MUM 2018 Proceedings of the 17th International Conference on Mobile and Ubiquitous Multimedia, Nov 2018, Cairo, Egypt. pp.463-467

arXiv:1808.08713 [pdf, other]

doi 10.1145/3267305.3267701

Remote Biofeedback Sharing, Opportunities and Challenges

Authors: Jérémy Frey, Jessica Cauchard

Abstract: Biofeedback is commonly used to regulate one's state, for example to manage stress. The underlying idea is that by perceiving a feedback about their physiological activity, a user can act upon it. In this paper we describe through two recent projects how biofeedback could be leveraged to share one's state at distance. Such extension of biofeedback could answer to the need of belonging, further widening the applications of the technology in terms of well-being.

Submitted 27 August, 2018; originally announced August 2018.

Comments: WellComp - UbiComp/ISWC'18 Adjunct, Oct 2018, Singapore, Singapore. http://ubicomp.org/ubicomp2018/

arXiv:1808.08711 [pdf, other]

Exploring Biofeedback with a Tangible Interface Designed for Relaxation

Authors: Morgane Hamon, Rémy Ramadour, Jérémy Frey

Abstract: Anxiety is a common health issue that can occur throughout one's existence. In this pilot study we explore an alternative technique to regulate it: biofeedback. The long-term objective is to offer an ecological device that could help people cope with anxiety, by exposing their inner state in a comprehensive manner. We propose a first iteration of this device, "Inner Flower", that uses heart rate to adapt a breathing guide to the user, and we investigate its efficiency and usability. Traditionally, such device requires user's full attention. We propose an ambient modality during which the device operates in the peripheral vision. Beside comparing "Ambient" and "Focus" conditions, we also compare the biofeedback with a sham feedback (fixed breathing guide). We found that the Focus group demonstrated higher relaxation and performance on a cognitive task (N-back). However, there was no noticeable effect of the Ambient feedback, and the biofeedback condition did not yield any significant difference when compared to the sham feedback. These results, while promising, highlight the pitfalls of any research related to biofeedback, where it is difficult to fully comprehend the underlying mechanisms of such technique.

Submitted 27 August, 2018; originally announced August 2018.

Comments: PhyCS - International Conference on Physiological Computing Systems, Sep 2018, Seville, Spain. SCITEPRESS, 2018, http://www.phycs.org/?y=2018

arXiv:1805.09109 [pdf, other]

Active Inference for Adaptive BCI: application to the P300 Speller

Authors: Jelena Mladenović, Jérémy Frey, Emmanuel Maby, Mateus Joffily, Fabien Lotte, Jeremie Mattout

Abstract: Adaptive Brain-Computer interfaces (BCIs) have been shown to improve performance, however a general and flexible framework to implement adaptive features is still lacking. We appeal to a generic Bayesian approach, called Active Inference (AI), to infer user's intentions or states and act in a way that optimizes performance. In realistic P300-speller simulations, AI outperforms traditional algorithms with an increase in bit rate between 18% and 59%, while offering a possibility of unifying various adaptive implementations within one generic framework.

Submitted 22 May, 2018; originally announced May 2018.

Journal ref: International BCI meeting, May 2018, Asilomar, United States. 2018, http://bcisociety.org/

arXiv:1805.07064 [pdf, other]

Evaluation of a congruent auditory feedback for Motor Imagery BCI

Authors: Emmanuel Christophe, Jérémy Frey, Richard Kronland-Martinet, Jean-Arthur Micoulaud-Franchi, Jelena Mladenović, Gaëlle Mougin, Jean Vion-Dury, Solvi Ystad, Mitsuko Aramaki

Abstract: Designing a feedback that helps participants to achieve higher performances is an important concern in brain-computer interface (BCI) research. In a pilot study, we demonstrate how a congruent auditory feedback could improve classification in an electroencephalography (EEG) motor imagery BCI. This is a promising result for creating alternate feedback modality.

Submitted 22 May, 2018; v1 submitted 18 May, 2018; originally announced May 2018.

Journal ref: International BCI meeting, May 2018, Asilomar, United States. http://bcisociety.org/

arXiv:1803.00296 [pdf, other]

doi 10.1145/3170427.3186517

Dišimo: Anchoring Our Breath

Authors: Jelena Mladenovic, Jérémy Frey, Jessica Cauchard

Abstract: We present a system that raises awareness about users' inner state. Dišimo is a multimodal ambient display that provides feedback about one's stress level, which is assessed through heart rate monitoring. Upon detecting a low heart rate variability for a prolonged period of time, Dišimo plays an audio track, setting the pace of a regular and deep breathing. Users can then choose to take a moment to focus on their breath. By doing so, they will activate the Dišimo devices belonging to their close ones, who can then join for a shared relaxation session.

Submitted 1 March, 2018; originally announced March 2018.

Journal ref: CHI '18 Interactivity - SIGCHI Conference on Human Factors in Computing System, Apr 2018, Montreal, Canada

arXiv:1802.04995 [pdf, other]

doi 10.1145/3173574.3174219

Breeze: Sharing Biofeedback Through Wearable Technologies

Authors: Jérémy Frey, May Grabli, Ronit Slyper, Jessica Cauchard

Abstract: Digitally presenting physiological signals as biofeedback to users raises awareness of both body and mind. This paper describes the effectiveness of conveying a physiological signal often overlooked for communication: breathing. We present the design and development of digital breathing patterns and their evaluation along three output modalities: visual, audio, and haptic. We also present Breeze, a wearable pendant placed around the neck that measures breathing and sends biofeedback in real-time. We evaluated how the breathing patterns were interpreted in a fixed environment and gathered qualitative data on the wearable device's design. We found that participants intentionally modified their own breathing to match the biofeedback, as a technique for understanding the underlying emotion. Our results describe how the features of the breathing patterns and the feedback modalities influenced participants' perception. We include guidelines and suggested use cases, such as Breeze being used by loved ones to increase connectedness and empathy.

Submitted 14 February, 2018; originally announced February 2018.

Journal ref: CHI '18 - SIGCHI Conference on Human Factors in Computing System, Apr 2018, Montreal, Canada. 2018, https://chi2018.acm.org/

arXiv:1802.02820 [pdf, ps, other]

doi 10.1145/3209108.3209130

Impredicative Encodings of (Higher) Inductive Types

Authors: Steve Awodey, Jonas Frey, Sam Speight

Abstract: Postulating an impredicative universe in dependent type theory allows System F style encodings of finitary inductive types, but these fail to satisfy the relevant η-equalities and consequently do not admit dependent eliminators. To recover η and dependent elimination, we present a method to construct refinements of these impredicative encodings, using ideas from homotopy type theory. We then extend our method to construct impredicative encodings of some higher inductive types, such as 1-truncation and the unit circle S1.

Submitted 8 February, 2018; originally announced February 2018.

arXiv:1712.06148 [pdf, other]

Generating and designing DNA with deep generative models

Authors: Nathan Killoran, Leo J. Lee, Andrew Delong, David Duvenaud, Brendan J. Frey

Abstract: We propose generative neural network methods to generate DNA sequences and tune them to have desired properties. We present three approaches: creating synthetic DNA sequences using a generative adversarial network; a DNA-based variant of the activation maximization ("deep dream") design method; and a joint procedure which combines these two approaches together. We show that these tools capture important structures of the data and, when applied to designing probes for protein binding microarrays, allow us to generate new sequences whose properties are estimated to be superior to those found in the training data. We believe that these results open the door for applying deep generative models to advance genomics research.

Submitted 17 December, 2017; originally announced December 2017.

Comments: NIPS 2017 Computational Biology Workshop

arXiv:1706.01728 [pdf, other]

The Impact of Flow in an EEG-based Brain Computer Interface

Authors: Jelena Mladenović, Jérémy Frey, Manon Bonnet-Save, Jérémie Mattout, Fabien Lotte

Abstract: Major issues in Brain Computer Interfaces (BCIs) include low usability and poor user performance. This paper tackles them by ensuring that users are in a state of immersion, control and motivation, called the state of flow. Indeed, in various disciplines, being in the state of flow was shown to improve performance and learning. Hence, we intended to draw BCI users into a flow state to improve both their subjective experience and their performance. In a Motor Imagery BCI game, we manipulated flow in two ways: 1) by adapting the task difficulty and 2) by using background music. Results showed that the difficulty adaptation induced a higher flow state; however, music had no effect. There was a positive correlation between subjective flow scores and offline performance, although the flow factors had no effect (adaptation) or negative effect (music) on online performance. Overall, favouring the flow state seems a promising approach for enhancing users' satisfaction, although its complexity requires more thorough investigations.

Submitted 6 June, 2017; originally announced June 2017.

arXiv:1703.02365 [pdf, other]

doi 10.1145/3027063.3052971

Scientific Outreach with Teegi, a Tangible EEG Interface to Talk about Neurotechnologies

Authors: Jérémy Frey, Renaud Gervais, Thibault Lainé, Maxime Duluc, Hugo Germain, Stéphanie Fleck, Fabien Lotte, Martin Hachet

Abstract: Teegi is an anthropomorphic and tangible avatar exposing a user's brain activity in real time. It is connected to a device sensing the brain by means of electroencephalography (EEG). Teegi moves its hands and feet and closes its eyes along with the person being monitored. It also displays on its scalp the associated EEG signals, thanks to a semi-spherical display made of LEDs. Attendees can interact directly with Teegi -- e.g. move its limbs -- to discover by themselves the underlying brain processes. Teegi can be used for scientific outreach to introduce neurotechnologies in general and brain-computer interfaces (BCI) in particular.

Submitted 7 March, 2017; originally announced March 2017.

Journal ref: CHI '17 Interactivity - SIGCHI Conference on Human Factors in Computing System, May 2017, Denver, United States

arXiv:1606.02438 [pdf, other]

Comparison of an open-hardware electroencephalography amplifier with medical grade device in brain-computer interface applications

Authors: Jérémy Frey

Abstract: Brain-computer interfaces (BCI) are promising communication devices between humans and machines. BCI based on non-invasive neuroimaging techniques such as electroencephalography (EEG) have many applications; however, the dissemination of the technology is limited, in part because of the price of the hardware. In this paper we compare side by side two EEG amplifiers, the consumer grade OpenBCI and the medical grade g.tec g.USBamp. For this purpose, we employed an original montage, based on the simultaneous recording of the same set of electrodes. Two sets of recordings were performed. During the first experiment a simple adapter with a direct connection between the amplifiers and the electrodes was used. Then, in a second experiment, we attempted to discard any possible interference that one amplifier could cause to the other by adding "ideal" diodes to the adapter. Both spectral and temporal features were tested -- the former with a workload monitoring task, the latter with a visual P300 speller task. Overall, the results suggest that the OpenBCI board -- or a similar solution based on the Texas Instruments ADS1299 chip -- could be an effective alternative to traditional EEG devices. Even though medical grade equipment still outperforms the OpenBCI, the latter gives very close EEG readings, resulting in practice in a classification accuracy that may be suitable for popularizing BCI uses.

Submitted 8 June, 2016; originally announced June 2016.

Comments: PhyCS - International Conference on Physiological Computing Systems, Jul 2016, Lisbon, Portugal. SCITEPRESS, 2016

arXiv:1606.02427 [pdf, other]

VIF: Virtual Interactive Fiction (with a twist)

Authors: Jérémy Frey

Abstract: Nowadays computer science can create digital worlds that deeply immerse users; it can also process in real time brain activity to infer their inner states. What marvels can we achieve with such technologies? Go back to displaying text. And unfold a story that follows and molds users as never before.

Submitted 8 June, 2016; originally announced June 2016.

Comments: Pervasive Play - CHI '16 Workshop, May 2016, San Jose, United States

arXiv:1603.04581 [pdf, other]

Introspectibles: Tangible Interaction to Foster Introspection

Authors: Renaud Gervais, Joan Sol Roo, Jérémy Frey, Martin Hachet

Abstract: Digital devices are now ubiquitous and have the potential to be used to support positive changes in human lives and promote psychological well-being. This paper presents three interactive systems that we created focusing on introspection activities, leveraging tangible interaction and spatial augmented reality. More specifically, we describe anthropomorphic augmented avatars that display the users' inner states using physiological sensors. We also present a first prototype of an augmented sandbox specifically dedicated to promoting mindfulness activities.

Submitted 15 March, 2016; originally announced March 2016.

Comments: in CHI '16 - SIGCHI Conference on Human Factors in Computing System - Computing and Mental Health Workshop, May 2016, San Jose, United States

arXiv:1602.08358 [pdf, other]

doi 10.1145/2851581.2892391

Remote Heart Rate Sensing and Projection to Renew Traditional Board Games and Foster Social Interactions

Authors: Jérémy Frey

Abstract: While physiological sensors enter the mass market and reach the general public, they are still mainly employed to monitor health -- whether it is for medical purposes or sports. We describe an application that uses heart rate feedback as an incentive for social interactions. A traditional board game has been "augmented" through remote physiological sensing, using webcams. Projection helped to conceal the technological aspects from users. We detail how players reacted -- stressful situations could emerge when users are deprived of their own signals -- and we give directions for game designers to integrate physiological sensors.

Submitted 26 February, 2016; originally announced February 2016.

Journal ref: CHI '16 Extended Abstracts, May 2016, San Jose, United States. 2016

arXiv:1601.02768 [pdf, other]

doi 10.1145/2858036.2858525

Framework for Electroencephalography-based Evaluation of User Experience

Authors: Jérémy Frey, Maxime Daniel, Julien Castet, Martin Hachet, Fabien Lotte

Abstract: Measuring brain activity with electroencephalography (EEG) is mature enough to assess mental states. Combined with existing methods, such a tool can be used to strengthen the understanding of user experience. We contribute a set of methods to estimate continuously the user's mental workload, attention and recognition of interaction errors during different interaction tasks. We validate these measures on a controlled virtual environment and show how they can be used to compare different interaction techniques or devices, by comparing here a keyboard and a touch-based interface. Thanks to such a framework, EEG becomes a promising method to improve the overall usability of complex computer systems.

Submitted 12 January, 2016; originally announced January 2016.

Comments: in ACM. CHI '16 - SIGCHI Conference on Human Factors in Computing System, May 2016, San Jose, United States

arXiv:1511.06510 [pdf, other]

doi 10.1145/2839462.2839486

TOBE: Tangible Out-of-Body Experience

Authors: Renaud Gervais, Jérémy Frey, Alexis Gay, Fabien Lotte, Martin Hachet

Abstract: We propose a toolkit for creating Tangible Out-of-Body Experiences: exposing the inner states of users using physiological signals such as heart rate or brain activity. Tobe can take the form of a tangible avatar displaying live physiological readings to reflect on ourselves and others. Such a toolkit could be used by researchers and designers to create a multitude of potential tangible applications, including (but not limited to) educational tools about Science Technologies Engineering and Mathematics (STEM) and cognitive science, medical applications or entertainment and social experiences with one or several users or Tobes involved. Through a co-design approach, we investigated how everyday people picture their physiology and we validated the acceptability of Tobe in a scientific museum. We also give a practical example where two users relax together, with insights on how Tobe helped them to synchronize their signals and share a moment.

Submitted 20 November, 2015; originally announced November 2015.

Journal ref: Tangible, Embedded and Embodied Interaction (TEI), Feb 2016, Eindhoven, Netherlands. 2016, <http://www.tei-conf.org/16/>. <10.1145/2839462.2839486>

arXiv:1505.07940 [pdf]

Continuous Mental Effort Evaluation during 3D Object Manipulation Tasks based on Brain and Physiological Signals

Authors: Dennis Wobrock, Jérémy Frey, Delphine Graeff, Jean-Baptiste De La Rivière, Julien Castet, Fabien Lotte

Abstract: Designing 3D User Interfaces (UI) requires adequate evaluation tools to ensure good usability and user experience. While many evaluation tools are already available and widely used, existing approaches generally cannot provide continuous and objective measures of usability qualities during interaction without interrupting the user. In this paper, we propose to use brain (with ElectroEncephaloGraphy) and physiological (ElectroCardioGraphy, Galvanic Skin Response) signals to continuously assess the mental effort made by the user to perform 3D object manipulation tasks. We first show how this mental effort (a.k.a., mental workload) can be estimated from such signals, and then measure it on 8 participants during an actual 3D object manipulation task with an input device known as the CubTile. Our results suggest that monitoring workload enables us to continuously assess the 3DUI and/or interaction technique ease-of-use. Overall, this suggests that this new measure could become a useful addition to the repertoire of available evaluation tools, enabling a finer grain assessment of the ergonomic qualities of a given 3D user interface.

Submitted 29 May, 2015; originally announced May 2015.

Comments: Published in INTERACT, Sep 2015, Bamberg, Germany

arXiv:1505.07783 [pdf, other]

Estimating Visual Comfort in Stereoscopic Displays Using Electroencephalography: A Proof-of-Concept

Authors: Jérémy Frey, Aurélien Appriou, Fabien Lotte, Martin Hachet

Abstract: With stereoscopic displays, a depth sensation that is too strong could impede visual comfort and result in fatigue or pain. Electroencephalography (EEG) is a technology which records brain activity. We used it to develop a novel brain-computer interface that monitors users' states in order to reduce visual strain. We present the first proof-of-concept system that discriminates comfortable conditions from uncomfortable ones during stereoscopic vision using EEG. It reacts within 1s to depth variations, achieving 63% accuracy on average and 74% when 7 consecutive variations are measured. This study could lead to adaptive systems that automatically suit stereoscopic displays to users and viewing conditions.

Submitted 28 May, 2015; originally announced May 2015.

Comments: INTERACT, Sep 2015, Bamberg, Germany

arXiv:1412.1790 [pdf, other]

doi 10.1145/2642918.2647368

Teegi: Tangible EEG Interface

Authors: Jérémy Frey, Renaud Gervais, Stéphanie Fleck, Fabien Lotte, Martin Hachet

Abstract: We introduce Teegi, a Tangible ElectroEncephaloGraphy (EEG) Interface that enables novice users to get to know more about something as complex as brain signals, in an easy, engaging and informative way. To this end, we have designed a new system based on a unique combination of spatial augmented reality, tangible interaction and real-time neurotechnologies. With Teegi, a user can visualize and analyze his or her own brain activity in real-time, on a tangible character that can be easily manipulated, and with which it is possible to interact. An exploration study has shown that interacting with Teegi seems to be easy, motivating, reliable and informative. Overall, this suggests that Teegi is a promising and relevant training and mediation tool for the general public.

Submitted 4 December, 2014; originally announced December 2014.

Comments: to appear in UIST-ACM User Interface Software and Technology Symposium, Oct 2014, Honolulu, United States

arXiv:1412.1772 [pdf, other]

Heart Rate Monitoring as an Easy Way to Increase Engagement in Human-Agent Interaction

Authors: Jérémy Frey

Abstract: Physiological sensors are gaining the attention of manufacturers and users, as denoted by devices such as smartwatches or the newly released Kinect 2 -- which can covertly measure heartbeats -- or by the popularity of smartphone apps that track heart rate during fitness activities. Soon, physiological monitoring could become widely accessible and transparent to users. We demonstrate how one could take advantage of this situation to increase users' engagement and enhance user experience in human-agent interaction. We created an experimental protocol involving embodied agents -- "virtual avatars". Those agents were displayed alongside a beating heart. We compared a condition in which this feedback was simply duplicating the heart rates of users to another condition in which it was set to an average heart rate. Results suggest a superior social presence of agents when they display feedback similar to users' internal state. This physiological "similarity-attraction" effect may lead, with little effort, to a better acceptance of agents and robots by the general public.

Submitted 4 December, 2014; originally announced December 2014.

Comments: PhyCS - International Conference on Physiological Computing Systems, Feb 2015, Angers, France. SCITEPRESS, <http://www.phycs.org/>

arXiv:1404.6222 [pdf, other]

doi 10.1145/2559206.2581191

Assessing the Zone of Comfort in Stereoscopic Displays using EEG

Authors: Jérémy Frey, Léonard Pommereau, Fabien Lotte, Martin Hachet

Abstract: The conflict between vergence (eye movement) and accommodation (crystalline lens deformation) occurs in every stereoscopic display. It could cause important stress outside the "zone of comfort", when stereoscopic effect is too strong. This conflict has already been studied using questionnaires, during viewing sessions of several minutes. The present pilot study describes an experimental protocol which compares two different comfort conditions using electroencephalography (EEG) over short viewing sequences. Analyses showed significant differences both in event-related potentials (ERP) and in frequency bands power. An uncomfortable stereoscopy correlates with a weaker negative component and a delayed positive component in ERP. It also induces a power decrease in the alpha band and increases in theta and beta bands. With fast responses to stimuli, EEG is likely to enable the conception of adaptive systems, which could tune the stereoscopic experience according to each viewer.

Submitted 24 April, 2014; originally announced April 2014.

Journal ref: ACM SIGCHI Conference on Human Factors in Computing Systems (2014)

arXiv:1311.2222 [pdf, other]

Review of the Use of Electroencephalography as an Evaluation Method for Human-Computer Interaction

Authors: Jérémy Frey, Christian Mühl, Fabien Lotte, Martin Hachet

Abstract: Evaluating human-computer interaction is essential as a broadening population uses machines, sometimes in sensitive contexts. However, traditional evaluation methods may fail to combine real-time measures, an "objective" approach and data contextualization. In this review we look at how adding neuroimaging techniques can respond to such needs. We focus on electroencephalography (EEG), as it could be handled effectively during a dedicated evaluation phase. We identify workload, attention, vigilance, fatigue, error recognition, emotions, engagement, flow and immersion as being recognizable by EEG. We find that workload, attention and emotions assessments would benefit the most from EEG. Moreover, we advocate to study further error recognition through neuroimaging to enhance usability and increase user experience.

Submitted 9 November, 2013; originally announced November 2013.

Comments: PhyCS 2014 - International Conference on Physiological Computing Systems (2014)
