-
Legibot: Generating Legible Motions for Service Robots Using Cost-Based Local Planners
Authors:
Javad Amirian,
Mouad Abrini,
Mohamed Chetouani
Abstract:
With the increasing presence of social robots in various environments and applications, there is an increasing need for these robots to exhibit socially-compliant behaviors. Legible motion, characterized by the ability of a robot to clearly and quickly convey intentions and goals to the individuals in its vicinity, through its motion, holds significant importance in this context. This will improve the overall user experience and acceptance of robots in human environments. In this paper, we introduce a novel approach to incorporate legibility into local motion planning for mobile robots. This can enable robots to generate legible motions in real-time and dynamic environments. To demonstrate the effectiveness of our proposed methodology, we also provide a robotic stack designed for deploying legibility-aware motion planning in a social robot, by integrating perception and localization components.
Submitted 7 April, 2024;
originally announced April 2024.
-
Utility-based Adaptive Teaching Strategies using Bayesian Theory of Mind
Authors:
Clémence Grislain,
Hugo Caselles-Dupré,
Olivier Sigaud,
Mohamed Chetouani
Abstract:
Good teachers always tailor their explanations to the learners. Cognitive scientists model this process under the rationality principle: teachers try to maximise the learner's utility while minimising teaching costs. To this end, human teachers seem to build mental models of the learner's internal state, a capacity known as Theory of Mind (ToM). Inspired by cognitive science, we build on Bayesian ToM mechanisms to design teacher agents that, like humans, tailor their teaching strategies to the learners. Our ToM-equipped teachers construct models of learners' internal states from observations and leverage them to select demonstrations that maximise the learners' rewards while minimising teaching costs. Our experiments in simulated environments demonstrate that learners taught this way are more efficient than those taught in a learner-agnostic way. This effect gets stronger when the teacher's model of the learner better aligns with the actual learner's state, either using a more accurate prior or after accumulating observations of the learner's behaviour. This work is a first step towards social machines that teach us and each other, see https://teacher-with-tom.github.io.
Submitted 29 September, 2023;
originally announced September 2023.
-
Enhancing Agent Communication and Learning through Action and Language
Authors:
Hugo Caselles-Dupré,
Olivier Sigaud,
Mohamed Chetouani
Abstract:
We introduce a novel category of GC-agents capable of functioning as both teachers and learners. Leveraging action-based demonstrations and language-based instructions, these agents enhance communication efficiency. We investigate the incorporation of pedagogy and pragmatism, essential elements in human communication and goal achievement, enhancing the agents' teaching and learning capabilities. Furthermore, we explore the impact of combining communication modes (action and language) on learning outcomes, highlighting the benefits of a multi-modal approach.
Submitted 27 September, 2023; v1 submitted 18 August, 2023;
originally announced August 2023.
-
SLOT-V: Supervised Learning of Observer Models for Legible Robot Motion Planning in Manipulation
Authors:
Sebastian Wallkotter,
Mohamed Chetouani,
Ginevra Castellano
Abstract:
We present SLOT-V, a novel supervised learning framework that learns observer models (human preferences) from robot motion trajectories in a legibility context. Legibility measures how easily a (human) observer can infer the robot's goal from a robot motion trajectory. When generating such trajectories, existing planners often rely on an observer model that estimates the quality of trajectory candidates. These observer models are frequently hand-crafted or, occasionally, learned from demonstrations. Here, we propose to learn them in a supervised manner using the same data format that is frequently used during the evaluation of aforementioned approaches. We then demonstrate the generality of SLOT-V using a Franka Emika in a simulated manipulation environment. For this, we show that it can learn to closely predict various hand-crafted observer models, i.e., that SLOT-V's hypothesis space encompasses existing handcrafted models. Next, we showcase SLOT-V's ability to generalize by showing that a trained model continues to perform well in environments with unseen goal configurations and/or goal counts. Finally, we benchmark SLOT-V's sample efficiency (and performance) against an existing IRL approach and show that SLOT-V learns better observer models with less data. Combined, these results suggest that SLOT-V can learn viable observer models. Better observer models imply more legible trajectories, which may - in turn - lead to better and more transparent human-robot interaction.
Submitted 4 October, 2022;
originally announced October 2022.
-
Automatic Context-Driven Inference of Engagement in HMI: A Survey
Authors:
Hanan Salam,
Oya Celiktutan,
Hatice Gunes,
Mohamed Chetouani
Abstract:
An integral part of seamless human-human communication is engagement, the process by which two or more participants establish, maintain, and end their perceived connection. Therefore, to develop successful human-centered human-machine interaction applications, automatic engagement inference is one of the tasks required to achieve engaging interactions between humans and machines, and to make machines attuned to their users, hence enhancing user satisfaction and technology acceptance. Several factors contribute to engagement state inference, which include the interaction context and interactants' behaviours and identity. Indeed, engagement is a multi-faceted and multi-modal construct that requires high accuracy in the analysis and interpretation of contextual, verbal and non-verbal cues. Thus, the development of an automated and intelligent system that accomplishes this task has proven to be challenging so far. This paper presents a comprehensive survey on previous work in engagement inference for human-machine interaction, entailing interdisciplinary definition, engagement components and factors, publicly available datasets, ground truth assessment, and most commonly used features and methods, serving as a guide for the development of future human-machine interaction interfaces with reliable context-aware engagement inference capability. An in-depth review across embodied and disembodied interaction modes, and an emphasis on the interaction context in which engagement perception modules are integrated, set the presented survey apart from existing surveys.
Submitted 30 September, 2022;
originally announced September 2022.
-
How unitizing affects annotation of cohesion
Authors:
Eleonora Ceccaldi,
Nale Lehmann-Willenbrock,
Erica Volta,
Mohamed Chetouani,
Gualtiero Volpe,
Giovanna Varni
Abstract:
This paper investigates how unitizing affects external observers' annotation of group cohesion. We compared unitizing techniques belonging to these categories: interval coding, continuous coding, and a technique inspired by a cognitive theory on event perception. We applied such techniques for sampling coding units from a set of recordings of social interactions rich in behaviors related to cohesion. Then, we compared the cohesion scores the observers assigned to each coding unit. Results show that the three techniques can lead to suitable ratings and that the technique inspired by cognitive theories leads to scores reflecting variability in cohesion better than the other ones.
Submitted 28 September, 2022;
originally announced September 2022.
-
Overcoming Referential Ambiguity in Language-Guided Goal-Conditioned Reinforcement Learning
Authors:
Hugo Caselles-Dupré,
Olivier Sigaud,
Mohamed Chetouani
Abstract:
Teaching an agent to perform new tasks using natural language can easily be hindered by ambiguities in interpretation. When a teacher provides an instruction to a learner about an object by referring to its features, the learner can misunderstand the teacher's intentions, for instance if the instruction ambiguously refers to features of the object, a phenomenon called referential ambiguity. We study how two concepts derived from cognitive sciences can help resolve those referential ambiguities: pedagogy (selecting the right instructions) and pragmatism (learning the preferences of the other agents using inductive reasoning). We apply those ideas to a teacher/learner setup with two artificial agents on a simulated robotic task (block-stacking). We show that these concepts improve sample efficiency for training the learner.
Submitted 27 September, 2023; v1 submitted 26 September, 2022;
originally announced September 2022.
-
Pragmatically Learning from Pedagogical Demonstrations in Multi-Goal Environments
Authors:
Hugo Caselles-Dupré,
Olivier Sigaud,
Mohamed Chetouani
Abstract:
Learning from demonstration methods usually leverage close to optimal demonstrations to accelerate training. By contrast, when demonstrating a task, human teachers deviate from optimal demonstrations and pedagogically modify their behavior by giving demonstrations that best disambiguate the goal they want to demonstrate. Analogously, human learners excel at pragmatically inferring the intent of the teacher, facilitating communication between the two agents. These mechanisms are critical in the few demonstrations regime, where inferring the goal is more difficult. In this paper, we implement pedagogy and pragmatism mechanisms by leveraging a Bayesian model of Goal Inference from demonstrations (BGI). We highlight the benefits of this model in multi-goal teacher-learner setups with two artificial agents that learn with goal-conditioned Reinforcement Learning. We show that combining BGI-agents (a pedagogical teacher and a pragmatic learner) results in faster learning and reduced goal ambiguity over standard learning from demonstrations, especially in the few demonstrations regime. We provide the code for our experiments (https://github.com/Caselles/NeurIPS22-demonstrations-pedagogy-pragmatism), as well as an illustrative video explaining our approach (https://youtu.be/V4n16IjkNyw).
Submitted 27 September, 2023; v1 submitted 9 June, 2022;
originally announced June 2022.
-
Two ways to make your robot proactive: reasoning about human intentions, or reasoning about possible futures
Authors:
Sera Buyukgoz,
Jasmin Grosinger,
Mohamed Chetouani,
Alessandro Saffiotti
Abstract:
Robots sharing their space with humans need to be proactive in order to be helpful. Proactive robots are able to act on their own initiative in an anticipatory way to benefit humans. In this work, we investigate two ways to make robots proactive. One way is to recognize humans' intentions and to act to fulfill them, like opening the door that you are about to cross. The other way is to reason about possible future threats or opportunities and to act to prevent or to foster them, like recommending you to take an umbrella since rain has been forecasted. In this paper, we present approaches to realize these two types of proactive behavior. We then present an integrated system that can generate proactive robot behavior by reasoning on both factors: intentions and predictions. We illustrate our system on a sample use case including a domestic robot and a human. We first run this use case with the two separate proactive systems, intention-based and prediction-based, and then run it with our integrated system. The results show that the integrated system is able to take into account a broader variety of aspects that are needed for proactivity.
Submitted 11 May, 2022;
originally announced May 2022.
-
A New Nonlinear speaker parameterization algorithm for speaker identification
Authors:
Mohamed Chetouani,
Marcos Faundez-Zanuy,
Bruno Gas,
Jean-Luc Zarader
Abstract:
In this paper we propose a new parameterization algorithm based on nonlinear prediction, which is an extension of the classical LPC parameters. The parameters' performance is estimated by two different methods: the Arithmetic-Harmonic Sphericity (AHS) and the Auto-Regressive Vector Model (ARVM). Two different methods are proposed for the parameterization based on the Neural Predictive Coding (NPC): classical neural networks initialization and linear initialization. We applied these two parameters to speaker identification. The first parameters obtained smaller rates. We show for the first parameters how they can be combined with the classical parameters (LPCC, MFCC, etc.) in order to improve the results of only one classical parameterization (MFCC provides 97.55% and MFCC+NPC 98.78%). For the linear initialization, we obtain 100%, which is a great improvement. This study opens a new way towards different parameterization schemes that offer better accuracy on speaker recognition tasks.
Submitted 6 April, 2022;
originally announced April 2022.
-
Pedagogical Demonstrations and Pragmatic Learning in Artificial Tutor-Learner Interactions
Authors:
Hugo Caselles-Dupré,
Mohamed Chetouani,
Olivier Sigaud
Abstract:
When demonstrating a task, human tutors pedagogically modify their behavior by either "showing" the task rather than just "doing" it (exaggerating on relevant parts of the demonstration) or by giving demonstrations that best disambiguate the communicated goal. Analogously, human learners pragmatically infer the communicative intent of the tutor: they interpret what the tutor is trying to teach them and deduce relevant information for learning. Without such mechanisms, traditional Learning from Demonstration (LfD) algorithms will consider such demonstrations as sub-optimal. In this paper, we investigate the implementation of such mechanisms in a tutor-learner setup where both participants are artificial agents in an environment with multiple goals. Using pedagogy from the tutor and pragmatism from the learner, we show substantial improvements over standard learning from demonstrations.
Submitted 27 September, 2023; v1 submitted 28 February, 2022;
originally announced March 2022.
-
Learning Collective Action under Risk Diversity
Authors:
Ramona Merhej,
Fernando P. Santos,
Francisco S. Melo,
Mohamed Chetouani,
Francisco C. Santos
Abstract:
Collective risk dilemmas (CRDs) are a class of n-player games that represent societal challenges where groups need to coordinate to avoid the risk of a disastrous outcome. Multi-agent systems incurring such dilemmas face difficulties achieving cooperation and often converge to sub-optimal, risk-dominant solutions where everyone defects. In this paper we investigate the consequences of risk diversity in groups of agents learning to play CRDs. We find that risk diversity poses new challenges to cooperation that are not observed in homogeneous groups. We show that increasing risk diversity significantly reduces overall cooperation and hinders collective target achievement. It leads to asymmetrical changes in agents' policies -- i.e. the increase in contributions from individuals at high risk is unable to compensate for the decrease in contributions from individuals at low risk -- which overall reduces the total contributions in a population. When comparing RL behaviors to rational individualistic and social behaviors, we find that RL populations converge to fairer contributions among agents. Our results highlight the need for aligning risk perceptions among agents or developing new learning techniques that explicitly account for risk diversity.
Submitted 30 January, 2022;
originally announced January 2022.
-
A new approach to evaluating legibility: Comparing legibility frameworks using framework-independent robot motion trajectories
Authors:
Sebastian Wallkotter,
Mohamed Chetouani,
Ginevra Castellano
Abstract:
Robots that share an environment with humans may communicate their intent using a variety of different channels. Movement is one of these channels and, particularly in manipulation tasks, intent communication via movement is called legibility. It alters a robot's trajectory to make it intent expressive. Here we propose a novel evaluation method that improves the data efficiency of collected experimental data when benchmarking approaches generating such legible behavior. The primary novelty of the proposed method is that it uses trajectories that were generated independently of the framework being tested. This makes evaluation easier, enables N-way comparisons between approaches, and allows easier comparison across papers. We demonstrate the efficiency of the new evaluation method by comparing 10 legibility frameworks in 2 scenarios. The paper, thus, provides readers with (1) a novel approach to investigate and/or benchmark legibility, (2) an overview of existing frameworks, (3) an evaluation of 10 legibility frameworks (from 6 papers), and (4) evidence that viewing angle and trajectory progression matter when users evaluate the legibility of a motion.
Submitted 15 January, 2022;
originally announced January 2022.
-
Towards Teachable Autotelic Agents
Authors:
Olivier Sigaud,
Ahmed Akakzia,
Hugo Caselles-Dupré,
Cédric Colas,
Pierre-Yves Oudeyer,
Mohamed Chetouani
Abstract:
Autonomous discovery and direct instruction are two distinct sources of learning in children but education sciences demonstrate that mixed approaches such as assisted discovery or guided play result in improved skill acquisition. In the field of Artificial Intelligence, these extremes respectively map to autonomous agents learning from their own signals and interactive learning agents fully taught by their teachers. In between should stand teachable autotelic agents (TAA): agents that learn from both internal and teaching signals to benefit from the higher efficiency of assisted discovery. Designing such agents will enable real-world non-expert users to orient the learning trajectories of agents towards their expectations. More fundamentally, this may also be a key step to build agents with human-level intelligence. This paper presents a roadmap towards the design of teachable autonomous agents. Building on developmental psychology and education sciences, we start by identifying key features enabling assisted discovery processes in child-tutor interactions. This leads to the production of a checklist of features that future TAA will need to demonstrate. The checklist allows us to precisely pinpoint the various limitations of current reinforcement learning agents and to identify the promising first steps towards TAA. It also shows the way forward by highlighting key research directions towards the design of autonomous agents that can be taught by ordinary people via natural pedagogy.
Submitted 20 March, 2023; v1 submitted 25 May, 2021;
originally announced May 2021.
-
Does the Goal Matter? Emotion Recognition Tasks Can Change the Social Value of Facial Mimicry towards Artificial Agents
Authors:
Giulia Perugia,
Maike Paetzel-Prüssman,
Isabelle Hupont,
Giovanna Varni,
Mohamed Chetouani,
Christopher Edward Peters,
Ginevra Castellano
Abstract:
In this paper, we present a study aimed at understanding whether the embodiment and humanlikeness of an artificial agent can affect people's spontaneous and instructed mimicry of its facial expressions. The study followed a mixed experimental design and revolved around an emotion recognition task. Participants were randomly assigned to one level of humanlikeness (between-subject variable: humanlike, characterlike, or morph facial texture of the artificial agents) and observed the facial expressions displayed by a human (control) and three artificial agents differing in embodiment (within-subject variable: video-recorded robot, physical robot, and virtual agent). To study both spontaneous and instructed facial mimicry, we divided the experimental sessions into two phases. In the first phase, we asked participants to observe and recognize the emotions displayed by the agents. In the second phase, we asked them to look at the agents' facial expressions, replicate their dynamics as closely as possible, and then identify the observed emotions. In both cases, we assessed participants' facial expressions with an automated Action Unit (AU) intensity detector. Contrary to our hypotheses, our results disclose that the agent that was perceived as the least uncanny, and most anthropomorphic, likable, and co-present, was the one spontaneously mimicked the least. Moreover, they show that instructed facial mimicry negatively predicts spontaneous facial mimicry. Further exploratory analyses revealed that spontaneous facial mimicry appeared when participants were less certain of the emotion they recognized. Hence, we postulate that an emotion recognition goal can flip the social value of facial mimicry as it transforms a likable artificial agent into a distractor.
Submitted 5 May, 2021;
originally announced May 2021.
-
AudVowelConsNet: A Phoneme-Level Based Deep CNN Architecture for Clinical Depression Diagnosis
Authors:
Muhammad Muzammel,
Hanan Salam,
Yann Hoffmann,
Mohamed Chetouani,
Alice Othmani
Abstract:
Depression is a common and serious mood disorder that negatively affects the patient's capacity to function normally in daily tasks. Speech is proven to be a vigorous tool in depression diagnosis. Research in psychiatry concentrated on performing fine-grained analysis on word-level speech components contributing to the manifestation of depression in speech and revealed significant variations at the phoneme-level in depressed speech. On the other hand, research in Machine Learning-based automatic recognition of depression from speech focused on the exploration of various acoustic features for the detection of depression and its severity level. Few have focused on incorporating phoneme-level speech components in automatic assessment systems. In this paper, we propose an Artificial Intelligence (AI) based application for clinical depression recognition and assessment from speech. We investigate the acoustic characteristics of phoneme units, specifically vowels and consonants, for depression recognition via Deep Learning. We present and compare three spectrogram-based Deep Neural Network architectures, trained on phoneme consonant and vowel units and their fusion respectively. Our experiments show that the deep learned consonant-based acoustic characteristics lead to better recognition results than vowel-based ones. The fusion of vowel and consonant speech characteristics through a deep network significantly outperforms the single space networks as well as the state-of-the-art deep learning approaches on the DAIC-WOZ database.
Submitted 4 November, 2020; v1 submitted 30 October, 2020;
originally announced October 2020.
-
Grounding Language to Autonomously-Acquired Skills via Goal Generation
Authors:
Ahmed Akakzia,
Cédric Colas,
Pierre-Yves Oudeyer,
Mohamed Chetouani,
Olivier Sigaud
Abstract:
We are interested in the autonomous acquisition of repertoires of skills. Language-conditioned reinforcement learning (LC-RL) approaches are great tools in this quest, as they allow abstract goals to be expressed as sets of constraints on the states. However, most LC-RL agents are not autonomous and cannot learn without external instructions and feedback. Besides, their direct language condition cannot account for the goal-directed behavior of pre-verbal infants and strongly limits the expression of behavioral diversity for a given language input. To resolve these issues, we propose a new conceptual approach to language-conditioned RL: the Language-Goal-Behavior architecture (LGB). LGB decouples skill learning and language grounding via an intermediate semantic representation of the world. To showcase the properties of LGB, we present a specific implementation called DECSTR. DECSTR is an intrinsically motivated learning agent endowed with an innate semantic representation describing spatial relations between physical objects. In a first stage (G -> B), it freely explores its environment and targets self-generated semantic configurations. In a second stage (L -> G), it trains a language-conditioned goal generator to generate semantic goals that match the constraints expressed in language-based inputs. We showcase the additional properties of LGB w.r.t. both an end-to-end LC-RL approach and a similar approach leveraging non-semantic, continuous intermediate representations. Intermediate semantic representations help satisfy language commands in a diversity of ways, enable strategy switching after a failure and facilitate language grounding.
Submitted 25 January, 2021; v1 submitted 12 June, 2020;
originally announced June 2020.
-
Language-Conditioned Goal Generation: a New Approach to Language Grounding for RL
Authors:
Cédric Colas,
Ahmed Akakzia,
Pierre-Yves Oudeyer,
Mohamed Chetouani,
Olivier Sigaud
Abstract:
In the real world, linguistic agents are also embodied agents: they perceive and act in the physical world. The notion of Language Grounding questions the interactions between language and embodiment: how do learning agents connect or ground linguistic representations to the physical world? This question has recently been approached by the Reinforcement Learning community under the framework of instruction-following agents. In these agents, behavioral policies or reward functions are conditioned on the embedding of an instruction expressed in natural language. This paper proposes another approach: using language to condition goal generators. Given any goal-conditioned policy, one could train a language-conditioned goal generator to generate language-agnostic goals for the agent. This method allows to decouple sensorimotor learning from language acquisition and enable agents to demonstrate a diversity of behaviors for any given instruction. We propose a particular instantiation of this approach and demonstrate its benefits.
Submitted 12 June, 2020;
originally announced June 2020.
-
Reinforcement learning with human advice: a survey
Authors:
Anis Najar,
Mohamed Chetouani
Abstract:
In this paper, we provide an overview of the existing methods for integrating human advice into a Reinforcement Learning process. We first propose a taxonomy of the different forms of advice that can be provided to a learning agent. We then describe the methods that can be used for interpreting advice when its meaning is not determined beforehand. Finally, we review different approaches for integrating advice into the learning process.
Submitted 24 November, 2020; v1 submitted 22 May, 2020;
originally announced May 2020.
-
MobiAxis: An Embodied Learning Task for Teaching Multiplication with a Social Robot
Authors:
Karen Tatarian,
Sebastian Wallkotter,
Sera Buyukgoz,
Rebecca Stower,
Mohamed Chetouani
Abstract:
The use of robots in educational settings is growing increasingly popular. Yet, many of the learning tasks involving social robots do not take full advantage of their physical embodiment. MobiAxis is a proposed learning task which uses the physical capabilities of a Pepper robot to teach the concepts of positive and negative multiplication along a number line. The robot is embodied with a number of multi-modal socially intelligent features and behaviours which are designed to enhance learning. This paper is a position paper describing the technical and theoretical implementation of the task, as well as proposed directions for future studies.
Submitted 16 April, 2020;
originally announced April 2020.
-
Explainable Agents Through Social Cues: A Review
Authors:
Sebastian Wallkotter,
Silvia Tulli,
Ginevra Castellano,
Ana Paiva,
Mohamed Chetouani
Abstract:
The issue of how to make embodied agents explainable has experienced a surge of interest over the last three years, and, there are many terms that refer to this concept, e.g., transparency or legibility. One reason for this high variance in terminology is the unique array of social cues that embodied agents can access in contrast to that accessed by non-embodied agents. Another reason is that different authors use these terms in different ways. Hence, we review the existing literature on explainability and organize it by (1) providing an overview of existing definitions, (2) showing how explainability is implemented and how it exploits different social cues, and (3) showing how the impact of explainability is measured. Additionally, we present a list of open questions and challenges that highlight areas that require further investigation by the community. This provides the interested reader with an overview of the current state-of-the-art.
Submitted 18 February, 2021; v1 submitted 11 March, 2020;
originally announced March 2020.
-
Interactively shaping robot behaviour with unlabeled human instructions
Authors:
Anis Najar,
Olivier Sigaud,
Mohamed Chetouani
Abstract:
In this paper, we propose a framework that enables a human teacher to shape a robot behaviour by interactively providing it with unlabeled instructions. We ground the meaning of instruction signals in the task-learning process, and use them simultaneously for guiding the latter. We implement our framework as a modular architecture, named TICS (Task-Instruction-Contingency-Shaping) that combines different information sources: a predefined reward function, human evaluative feedback and unlabeled instructions. This approach provides a novel perspective for robotic task learning that lies between Reinforcement Learning and Supervised Learning paradigms. We evaluate our framework both in simulation and with a real robot. The experimental results demonstrate the effectiveness of our framework in accelerating the task-learning process and in reducing the number of required teaching signals.
Submitted 24 November, 2020; v1 submitted 5 February, 2019;
originally announced February 2019.
-
CLIC: Curriculum Learning and Imitation for object Control in non-rewarding environments
Authors:
Pierre Fournier,
Olivier Sigaud,
Cédric Colas,
Mohamed Chetouani
Abstract:
In this paper we study a new reinforcement learning setting where the environment is non-rewarding, contains several possibly related objects of various controllability, and where an apt agent Bob acts independently, with non-observable intentions. We argue that this setting defines a realistic scenario and we present a generic discrete-state discrete-action model of such environments. To learn in this environment, we propose an unsupervised reinforcement learning agent called CLIC for Curriculum Learning and Imitation for Control. CLIC learns to control individual objects in its environment, and imitates Bob's interactions with these objects. It selects objects to focus on when training and imitating by maximizing its learning progress. We show that CLIC is an effective baseline in our new setting. It can effectively observe Bob to gain control of objects faster, even if Bob is not explicitly teaching. It can also follow Bob when he acts as a mentor and provides ordered demonstrations. Finally, when Bob controls objects that the agent cannot, or in presence of a hierarchy between objects in the environment, we show that CLIC ignores non-reproducible and already mastered interactions with objects, resulting in a greater benefit from imitation.
Submitted 25 March, 2019; v1 submitted 28 January, 2019;
originally announced January 2019.
-
CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning
Authors:
Cédric Colas,
Pierre Fournier,
Olivier Sigaud,
Mohamed Chetouani,
Pierre-Yves Oudeyer
Abstract:
In open-ended environments, autonomous learning agents must set their own goals and build their own curriculum through an intrinsically motivated exploration. They may consider a large diversity of goals, aiming to discover what is controllable in their environments, and what is not. Because some goals might prove easy and some impossible, agents must actively select which goal to practice at any moment, to maximize their overall mastery on the set of learnable goals. This paper proposes CURIOUS, an algorithm that leverages 1) a modular Universal Value Function Approximator with hindsight learning to achieve a diversity of goals of different kinds within a unique policy and 2) an automated curriculum learning mechanism that biases the attention of the agent towards goals maximizing the absolute learning progress. Agents focus sequentially on goals of increasing complexity, and focus back on goals that are being forgotten. Experiments conducted in a new modular-goal robotic environment show the resulting developmental self-organization of a learning curriculum, and demonstrate properties of robustness to distracting goals, forgetting and changes in body properties.
Submitted 29 May, 2019; v1 submitted 15 October, 2018;
originally announced October 2018.
-
Accuracy-based Curriculum Learning in Deep Reinforcement Learning
Authors:
Pierre Fournier,
Olivier Sigaud,
Mohamed Chetouani,
Pierre-Yves Oudeyer
Abstract:
In this paper, we investigate a new form of automated curriculum learning based on adaptive selection of accuracy requirements, called accuracy-based curriculum learning. Using a reinforcement learning agent based on the Deep Deterministic Policy Gradient algorithm and addressing the Reacher environment, we first show that an agent trained with various accuracy requirements sampled randomly learns more efficiently than when asked to be very accurate at all times. Then we show that adaptive selection of accuracy requirements, based on a local measure of competence progress, automatically generates a curriculum where difficulty progressively increases, resulting in a better learning efficiency than sampling randomly.
Submitted 21 September, 2018; v1 submitted 25 June, 2018;
originally announced June 2018.
-
Trust as indicator of robot functional and social acceptance. An experimental study on user conformation to the iCub's answers
Authors:
Ilaria Gaudiello,
Elisabetta Zibetti,
Sebastien Lefort,
Mohamed Chetouani,
Serena Ivaldi
Abstract:
To investigate the functional and social acceptance of a humanoid robot, we carried out an experimental study with 56 adult participants and the iCub robot. Trust in the robot has been considered as a main indicator of acceptance in decision-making tasks characterized by perceptual uncertainty (e.g., evaluating the weight of two objects) and socio-cognitive uncertainty (e.g., evaluating which is the most suitable item in a specific context), and measured by the participants' conformation to the iCub's answers to specific questions. In particular, we were interested in understanding whether specific (i) user-related features (i.e. desire for control), (ii) robot-related features (i.e., attitude towards social influence of robots), and (iii) context-related features (i.e., collaborative vs. competitive scenario), may influence their trust towards the iCub robot. We found that participants conformed more to the iCub's answers when their decisions were about functional issues than when they were about social issues. Moreover, the few participants conforming to the iCub's answers for social issues also conformed less for functional issues. Trust in the robot's functional savvy does not thus seem to be a pre-requisite for trust in its social savvy. Finally, desire for control, attitude towards social influence of robots and type of interaction scenario did not influence the trust in iCub. Results are discussed with relation to methodology of HRI research.
Submitted 13 October, 2015;
originally announced October 2015.
-
Towards engagement models that consider individual factors in HRI: on the relation of extroversion and negative attitude towards robots to gaze and speech during a human-robot assembly task
Authors:
Serena Ivaldi,
Sebastien Lefort,
Jan Peters,
Mohamed Chetouani,
Joelle Provasi,
Elisabetta Zibetti
Abstract:
Estimating the engagement is critical for human-robot interaction. Engagement measures typically rely on the dynamics of the social signals exchanged by the partners, especially speech and gaze. However, the dynamics of these signals is likely to be influenced by individual and social factors, such as personality traits, as it is well documented that they critically influence how two humans interact with each other. Here, we assess the influence of two factors, namely extroversion and negative attitude toward robots, on speech and gaze during a cooperative task, where a human must physically manipulate a robot to assemble an object. We evaluate if the scores of extroversion and negative attitude towards robots co-variate with the duration and frequency of gaze and speech cues. The experiments were carried out with the humanoid robot iCub and N=56 adult participants. We found that the more people are extrovert, the more and longer they tend to talk with the robot; and the more people have a negative attitude towards robots, the less they will look at the robot face and the more they will look at the robot hands where the assembly and the contacts occur. Our results confirm and provide evidence that the engagement models classically used in human-robot interaction should take into account attitudes and personality traits.
Submitted 19 August, 2015;
originally announced August 2015.