-
Artificial consciousness. Some logical and conceptual preliminaries
Authors:
K. Evers,
M. Farisco,
R. Chatila,
B. D. Earp,
I. T. Freire,
F. Hamker,
E. Nemeth,
P. F. M. J. Verschure,
M. Khamassi
Abstract:
Is artificial consciousness theoretically possible? Is it plausible? If so, is it technically feasible? To make progress on these questions, it is necessary to lay some groundwork clarifying the logical and empirical conditions for artificial consciousness to arise and the meaning of relevant terms involved. Consciousness is a polysemic word: researchers from different fields, including neuroscien…
▽ More
Is artificial consciousness theoretically possible? Is it plausible? If so, is it technically feasible? To make progress on these questions, it is necessary to lay some groundwork clarifying the logical and empirical conditions for artificial consciousness to arise and the meaning of relevant terms involved. Consciousness is a polysemic word: researchers from different fields, including neuroscience, Artificial Intelligence, robotics, and philosophy, among others, sometimes use different terms in order to refer to the same phenomena or the same terms to refer to different phenomena. In fact, if we want to pursue artificial consciousness, a proper definition of the key concepts is required. Here, after some logical and conceptual preliminaries, we argue for the necessity of using dimensions and profiles of consciousness for a balanced discussion about their possible instantiation or realisation in artificial systems. Our primary goal in this paper is to review the main theoretical questions that arise in the domain of artificial consciousness. On the basis of this review, we propose to assess the issue of artificial consciousness within a multidimensional account. The theoretical possibility of artificial consciousness is already presumed within some theoretical frameworks; however, empirical possibility cannot simply be deduced from these frameworks but needs independent empirical validation. We break down the complexity of consciousness by identifying constituents, components, and dimensions, and reflect pragmatically about the general challenges confronting the creation of artificial consciousness. Despite these challenges, we outline a research strategy for showing how "awareness" as we propose to understand it could plausibly be realised in artificial systems.
△ Less
Submitted 29 March, 2024;
originally announced March 2024.
-
Purpose for Open-Ended Learning Robots: A Computational Taxonomy, Definition, and Operationalisation
Authors:
Gianluca Baldassarre,
Richard J. Duro,
Emilio Cartoni,
Mehdi Khamassi,
Alejandro Romero,
Vieri Giuliano Santucci
Abstract:
Autonomous open-ended learning (OEL) robots are able to cumulatively acquire new skills and knowledge through direct interaction with the environment, for example relying on the guidance of intrinsic motivations and self-generated goals. OEL robots have a high relevance for applications as they can use the autonomously acquired knowledge to accomplish tasks relevant for their human users. OEL robo…
▽ More
Autonomous open-ended learning (OEL) robots are able to cumulatively acquire new skills and knowledge through direct interaction with the environment, for example relying on the guidance of intrinsic motivations and self-generated goals. OEL robots have a high relevance for applications as they can use the autonomously acquired knowledge to accomplish tasks relevant for their human users. OEL robots, however, encounter an important limitation: this may lead to the acquisition of knowledge that is not so much relevant to accomplish the users' tasks. This work analyses a possible solution to this problem that pivots on the novel concept of `purpose'. Purposes indicate what the designers and/or users want from the robot. The robot should use internal representations of purposes, called here `desires', to focus its open-ended exploration towards the acquisition of knowledge relevant to accomplish them. This work contributes to develop a computational framework on purpose in two ways. First, it formalises a framework on purpose based on a three-level motivational hierarchy involving: (a) the purposes; (b) the desires, which are domain independent; (c) specific domain dependent state-goals. Second, the work highlights key challenges highlighted by the framework such as: the `purpose-desire alignment problem', the `purpose-goal grounding problem', and the `arbitration between desires'. Overall, the approach enables OEL robots to learn in an autonomous way but also to focus on acquiring goals and skills that meet the purposes of the designers and users.
△ Less
Submitted 4 March, 2024;
originally announced March 2024.
-
DREAM Architecture: a Developmental Approach to Open-Ended Learning in Robotics
Authors:
Stephane Doncieux,
Nicolas Bredeche,
Léni Le Goff,
Benoît Girard,
Alexandre Coninx,
Olivier Sigaud,
Mehdi Khamassi,
Natalia Díaz-Rodríguez,
David Filliat,
Timothy Hospedales,
A. Eiben,
Richard Duro
Abstract:
Robots are still limited to controlled conditions, that the robot designer knows with enough details to endow the robot with the appropriate models or behaviors. Learning algorithms add some flexibility with the ability to discover the appropriate behavior given either some demonstrations or a reward to guide its exploration with a reinforcement learning algorithm. Reinforcement learning algorithm…
▽ More
Robots are still limited to controlled conditions, that the robot designer knows with enough details to endow the robot with the appropriate models or behaviors. Learning algorithms add some flexibility with the ability to discover the appropriate behavior given either some demonstrations or a reward to guide its exploration with a reinforcement learning algorithm. Reinforcement learning algorithms rely on the definition of state and action spaces that define reachable behaviors. Their adaptation capability critically depends on the representations of these spaces: small and discrete spaces result in fast learning while large and continuous spaces are challenging and either require a long training period or prevent the robot from converging to an appropriate behavior. Beside the operational cycle of policy execution and the learning cycle, which works at a slower time scale to acquire new policies, we introduce the redescription cycle, a third cycle working at an even slower time scale to generate or adapt the required representations to the robot, its environment and the task. We introduce the challenges raised by this cycle and we present DREAM (Deferred Restructuring of Experience in Autonomous Machines), a developmental cognitive architecture to bootstrap this redescription process stage by stage, build new state representations with appropriate motivations, and transfer the acquired knowledge across domains or tasks or even across robots. We describe results obtained so far with this approach and end up with a discussion of the questions it raises in Neuroscience.
△ Less
Submitted 13 May, 2020;
originally announced May 2020.
-
Co** with the variability in humans reward during simulated human-robot interactions through the coordination of multiple learning strategies
Authors:
Rémi Dromnelle,
Benoît Girard,
Erwan Renaudo,
Raja Chatila,
Mehdi Khamassi
Abstract:
An important current challenge in Human-Robot Interaction (HRI) is to enable robots to learn on-the-fly from human feedback. However, humans show a great variability in the way they reward robots. We propose to address this issue by enabling the robot to combine different learning strategies, namely model-based (MB) and model-free (MF) reinforcement learning. We simulate two HRI scenarios: a simpl…
▽ More
An important current challenge in Human-Robot Interaction (HRI) is to enable robots to learn on-the-fly from human feedback. However, humans show a great variability in the way they reward robots. We propose to address this issue by enabling the robot to combine different learning strategies, namely model-based (MB) and model-free (MF) reinforcement learning. We simulate two HRI scenarios: a simple task where the human congratulates the robot for putting the right cubes in the right boxes, and a more complicated version of this task where cubes have to be placed in a specific order. We show that our existing MB-MF coordination algorithm previously tested in robot navigation works well here without retuning parameters. It leads to the maximal performance while producing the same minimal computational cost as MF alone. Moreover, the algorithm gives a robust performance no matter the variability of the simulated human feedback, while each strategy alone is impacted by this variability. Overall, the results suggest a promising way to promote robot learning flexibility when facing variable human feedback.
△ Less
Submitted 6 May, 2020;
originally announced May 2020.
-
How to reduce computation time while sparing performance during robot navigation? A neuro-inspired architecture for autonomous shifting between model-based and model-free learning
Authors:
Rémi Dromnelle,
Erwan Renaudo,
Guillaume Pourcel,
Raja Chatila,
Benoît Girard,
Mehdi Khamassi
Abstract:
Taking inspiration from how the brain coordinates multiple learning systems is an appealing strategy to endow robots with more flexibility. One of the expected advantages would be for robots to autonomously switch to the least costly system when its performance is satisfying. However, to our knowledge no study on a real robot has yet shown that the measured computational cost is reduced while perf…
▽ More
Taking inspiration from how the brain coordinates multiple learning systems is an appealing strategy to endow robots with more flexibility. One of the expected advantages would be for robots to autonomously switch to the least costly system when its performance is satisfying. However, to our knowledge no study on a real robot has yet shown that the measured computational cost is reduced while performance is maintained with such brain-inspired algorithms. We present navigation experiments involving paths of different lengths to the goal, dead-end, and non-stationarity (i.e., change in goal location and apparition of obstacles). We present a novel arbitration mechanism between learning systems that explicitly measures performance and cost. We find that the robot can adapt to environment changes by switching between learning systems so as to maintain a high performance. Moreover, when the task is stable, the robot also autonomously shifts to the least costly system, which leads to a drastic reduction in computation cost while kee** a high performance. Overall, these results illustrates the interest of using multiple learning systems.
△ Less
Submitted 16 July, 2020; v1 submitted 30 April, 2020;
originally announced April 2020.
-
A Deep Learning Approach for Multi-View Engagement Estimation of Children in a Child-Robot Joint Attention task
Authors:
Jack Hadfield,
Georgia Chalvatzaki,
Petros Koutras,
Mehdi Khamassi,
Costas S. Tzafestas,
Petros Maragos
Abstract:
In this work we tackle the problem of child engagement estimation while children freely interact with a robot in their room. We propose a deep-based multi-view solution that takes advantage of recent developments in human pose detection. We extract the child's pose from different RGB-D cameras placed elegantly in the room, fuse the results and feed them to a deep neural network trained for classif…
▽ More
In this work we tackle the problem of child engagement estimation while children freely interact with a robot in their room. We propose a deep-based multi-view solution that takes advantage of recent developments in human pose detection. We extract the child's pose from different RGB-D cameras placed elegantly in the room, fuse the results and feed them to a deep neural network trained for classifying engagement levels. The deep network contains a recurrent layer, in order to exploit the rich temporal information contained in the pose data. The resulting method outperforms a number of baseline classifiers, and provides a promising tool for better automatic understanding of a child's attitude, interest and attention while cooperating with a robot. The goal is to integrate this model in next generation social robots as an attention monitoring tool during various CRI tasks both for Typically Developed (TD) children and children affected by autism (ASD).
△ Less
Submitted 1 December, 2018;
originally announced December 2018.
-
Prioritized Swee** Neural DynaQ with Multiple Predecessors, and Hippocampal Replays
Authors:
Lise Aubin,
Mehdi Khamassi,
Benoît Girard
Abstract:
During sleep and awake rest, the hippocampus replays sequences of place cells that have been activated during prior experiences. These have been interpreted as a memory consolidation process, but recent results suggest a possible interpretation in terms of reinforcement learning. The Dyna reinforcement learning algorithms use off-line replays to improve learning. Under limited replay budget, a pri…
▽ More
During sleep and awake rest, the hippocampus replays sequences of place cells that have been activated during prior experiences. These have been interpreted as a memory consolidation process, but recent results suggest a possible interpretation in terms of reinforcement learning. The Dyna reinforcement learning algorithms use off-line replays to improve learning. Under limited replay budget, a prioritized swee** approach, which requires a model of the transitions to the predecessors, can be used to improve performance. We investigate whether such algorithms can explain the experimentally observed replays. We propose a neural network version of prioritized swee** Q-learning, for which we developed a growing multiple expert algorithm, able to cope with multiple predecessors. The resulting architecture is able to improve the learning of simulated agents confronted to a navigation task. We predict that, in animals, learning the world model should occur during rest periods, and that the corresponding replays should be shuffled.
△ Less
Submitted 13 August, 2018; v1 submitted 15 February, 2018;
originally announced February 2018.
-
Adaptive coordination of working-memory and reinforcement learning in non-human primates performing a trial-and-error problem solving task
Authors:
Guillaume Viejo,
Benoît Girard,
Emmanuel Procyk,
Mehdi Khamassi
Abstract:
Accumulating evidence suggest that human behavior in trial-and-error learning tasks based on decisions between discrete actions may involve a combination of reinforcement learning (RL) and working-memory (WM). While the understanding of brain activity at stake in this type of tasks often involve the comparison with non-human primate neurophysiological results, it is not clear whether monkeys use s…
▽ More
Accumulating evidence suggest that human behavior in trial-and-error learning tasks based on decisions between discrete actions may involve a combination of reinforcement learning (RL) and working-memory (WM). While the understanding of brain activity at stake in this type of tasks often involve the comparison with non-human primate neurophysiological results, it is not clear whether monkeys use similar combined RL and WM processes to solve these tasks. Here we analyzed the behavior of five monkeys with computational models combining RL and WM. Our model-based analysis approach enables to not only fit trial-by-trial choices but also transient slowdowns in reaction times, indicative of WM use. We found that the behavior of the five monkeys was better explained in terms of a combination of RL and WM despite inter-individual differences. The same coordination dynamics we used in a previous study in humans best explained the behavior of some monkeys while the behavior of others showed the opposite pattern, revealing a possible different dynamics of WM process. We further analyzed different variants of the tested models to open a discussion on how the long pretraining in these tasks may have favored particular coordination dynamics between RL and WM. This points towards either inter-species differences or protocol differences which could be further tested in humans.
△ Less
Submitted 2 November, 2017;
originally announced November 2017.
-
Sustainable computational science: the ReScience initiative
Authors:
Nicolas P. Rougier,
Konrad Hinsen,
Frédéric Alexandre,
Thomas Arildsen,
Lorena Barba,
Fabien C. Y. Benureau,
C. Titus Brown,
Pierre de Buyl,
Ozan Caglayan,
Andrew P. Davison,
Marc André Delsuc,
Georgios Detorakis,
Alexandra K. Diem,
Damien Drix,
Pierre Enel,
Benoît Girard,
Olivia Guest,
Matt G. Hall,
Rafael Neto Henriques,
Xavier Hinaut,
Kamil S Jaron,
Mehdi Khamassi,
Almar Klein,
Tiina Manninen,
Pietro Marchesi
, et al. (20 additional authors not shown)
Abstract:
Computer science offers a large set of tools for prototy**, writing, running, testing, validating, sharing and reproducing results, however computational science lags behind. In the best case, authors may provide their source code as a compressed archive and they may feel confident their research is reproducible. But this is not exactly true. James Buckheit and David Donoho proposed more than tw…
▽ More
Computer science offers a large set of tools for prototy**, writing, running, testing, validating, sharing and reproducing results, however computational science lags behind. In the best case, authors may provide their source code as a compressed archive and they may feel confident their research is reproducible. But this is not exactly true. James Buckheit and David Donoho proposed more than two decades ago that an article about computational results is advertising, not scholarship. The actual scholarship is the full software environment, code, and data that produced the result. This implies new workflows, in particular in peer-reviews. Existing journals have been slow to adapt: source codes are rarely requested, hardly ever actually executed to check that they produce the results advertised in the article. ReScience is a peer-reviewed journal that targets computational research and encourages the explicit replication of already published research, promoting new and open-source implementations in order to ensure that the original research can be replicated from its description. To achieve this goal, the whole publishing chain is radically different from other traditional scientific journals. ReScience resides on GitHub where each new implementation of a computational study is made available together with comments, explanations, and software tests.
△ Less
Submitted 11 November, 2017; v1 submitted 14 July, 2017;
originally announced July 2017.
-
Active exploration in parameterized reinforcement learning
Authors:
Mehdi Khamassi,
Costas Tzafestas
Abstract:
Online model-free reinforcement learning (RL) methods with continuous actions are playing a prominent role when dealing with real-world applications such as Robotics. However, when confronted to non-stationary environments, these methods crucially rely on an exploration-exploitation trade-off which is rarely dynamically and automatically adjusted to changes in the environment. Here we propose an a…
▽ More
Online model-free reinforcement learning (RL) methods with continuous actions are playing a prominent role when dealing with real-world applications such as Robotics. However, when confronted to non-stationary environments, these methods crucially rely on an exploration-exploitation trade-off which is rarely dynamically and automatically adjusted to changes in the environment. Here we propose an active exploration algorithm for RL in structured (parameterized) continuous action space. This framework deals with a set of discrete actions, each of which is parameterized with continuous variables. Discrete exploration is controlled through a Boltzmann softmax function with an inverse temperature $β$ parameter. In parallel, a Gaussian exploration is applied to the continuous action parameters. We apply a meta-learning algorithm based on the comparison between variations of short-term and long-term reward running averages to simultaneously tune $β$ and the width of the Gaussian distribution from which continuous action parameters are drawn. When applied to a simple virtual human-robot interaction task, we show that this algorithm outperforms continuous parameterized RL both without active exploration and with active exploration based on uncertainty variations measured by a Kalman-Q-learning algorithm.
△ Less
Submitted 6 October, 2016;
originally announced October 2016.