-
Mind the gap: Challenges of deep learning approaches to Theory of Mind
Authors:
Jaan Aru,
Aqeel Labash,
Oriol Corcoll,
Raul Vicente
Abstract:
Theory of Mind is an essential ability of humans to infer the mental states of others. Here we provide a coherent summary of the potential, current progress, and problems of deep learning approaches to Theory of Mind. We highlight that many current findings can be explained through shortcuts. These shortcuts arise because the tasks used to investigate Theory of Mind in deep learning systems have been too narrow. Thus, we encourage researchers to investigate Theory of Mind in complex open-ended environments. Furthermore, to inspire future deep learning systems we provide a concise overview of prior work done in humans. We further argue that when studying Theory of Mind with deep learning, the research's main focus and contribution ought to be opening up the network's representations. We recommend researchers use tools from the field of interpretability of AI to study the relationship between different network components and aspects of Theory of Mind.
Submitted 12 December, 2022; v1 submitted 30 March, 2022;
originally announced March 2022.
-
Semantic Image Cropping
Authors:
Oriol Corcoll
Abstract:
Automatic image cropping techniques are commonly used to enhance the aesthetic quality of an image; they do so by detecting the most beautiful or the most salient parts of the image and removing the unwanted content, yielding a smaller image that is more visually pleasing. In this thesis, I introduce an additional dimension to the problem of cropping: semantics. I argue that image cropping can also enhance the image's relevancy for a given entity by using the semantic information contained in the image. I call this problem Semantic Image Cropping. To support my argument, I provide a new dataset containing 100 images with at least two different entities per image and four ground truth croppings collected using Amazon Mechanical Turk. I use this dataset to show that state-of-the-art cropping algorithms that only take aesthetics into account do not perform well on the problem of semantic image cropping. Additionally, I provide a new deep learning system that takes not just aesthetics but also semantics into account to generate image croppings, and I evaluate its performance using my new semantic cropping dataset, showing that using the semantic information of an image can help to produce better croppings.
Submitted 15 July, 2021;
originally announced July 2021.
-
Did I do that? Blame as a means to identify controlled effects in reinforcement learning
Authors:
Oriol Corcoll,
Youssef Mohamed,
Raul Vicente
Abstract:
Identifying controllable aspects of the environment has proven to be an extraordinary intrinsic motivator for reinforcement learning agents. Despite repeatedly achieving state-of-the-art results, this approach has only been studied as a proxy for a reward-based task and has not yet been evaluated on its own. Current methods are based on action prediction. Humans, on the other hand, assign blame to their actions to decide what they controlled. This work proposes Controlled Effect Network (CEN), an unsupervised method based on counterfactual measures of blame to identify effects on the environment controlled by the agent. CEN is evaluated in a wide range of environments, showing that it can accurately identify controlled effects. Moreover, we demonstrate CEN's capabilities as an intrinsic motivator by integrating it into a state-of-the-art exploration method, achieving substantially better performance than action-prediction models.
Submitted 17 February, 2022; v1 submitted 1 June, 2021;
originally announced June 2021.
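The counterfactual idea behind blame in the abstract above can be illustrated with a toy sketch. This is not the CEN method itself, which is a learned, unsupervised network; here a known deterministic transition function is assumed, and the names `effect`, `controlled_mask`, and `toy_step` are hypothetical. A state dimension is marked as controlled when at least one alternative action would have produced a different effect on it.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[int, ...]
StepFn = Callable[[State, int], State]


def effect(step: StepFn, state: State, action: int) -> State:
    """Per-dimension change in the state caused by taking `action`."""
    nxt = step(state, action)
    return tuple(n - s for n, s in zip(nxt, state))


def controlled_mask(
    step: StepFn, state: State, action: int, actions: Sequence[int]
) -> Tuple[bool, ...]:
    """Simple but-for test (an illustrative assumption, not the paper's measure).

    A dimension counts as controlled if some alternative action would have
    changed the effect observed on that dimension.
    """
    actual = effect(step, state, action)
    counterfactuals = [effect(step, state, a) for a in actions if a != action]
    return tuple(
        any(cf[i] != actual[i] for cf in counterfactuals)
        for i in range(len(actual))
    )


def toy_step(state: State, action: int) -> State:
    """Hypothetical environment: (agent position, clock).

    The agent moves by `action`; the clock ticks regardless of the action.
    """
    agent, clock = state
    return (agent + action, clock + 1)


mask = controlled_mask(toy_step, (0, 0), 1, [-1, 0, 1])
```

In this toy setting the agent's position is attributed to the action, while the clock tick, which happens under every action, is not.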
-
Disentangling causal effects for hierarchical reinforcement learning
Authors:
Oriol Corcoll,
Raul Vicente
Abstract:
Exploration and credit assignment under sparse rewards are still challenging problems. We argue that these challenges arise in part due to the intrinsic rigidity of operating at the level of actions. Actions can precisely define how to perform an activity but are ill-suited to describe what activity to perform. Instead, causal effects are inherently composable and temporally abstract, making them ideal for descriptive tasks. By leveraging a hierarchy of causal effects, this study aims to expedite the learning of task-specific behavior and aid exploration. Borrowing counterfactual and normality measures from causal literature, we disentangle controllable effects from effects caused by other dynamics of the environment. We propose CEHRL, a hierarchical method that models the distribution of controllable effects using a Variational Autoencoder. This distribution is used by a high-level policy to 1) explore the environment via random effect exploration so that novel effects are continuously discovered and learned, and to 2) learn task-specific behavior by prioritizing the effects that maximize a given reward function. In comparison to exploring with random actions, experimental results show that random effect exploration is a more efficient mechanism and that by assigning credit to few effects rather than many actions, CEHRL learns tasks more rapidly.
Submitted 21 February, 2022; v1 submitted 3 October, 2020;
originally announced October 2020.
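The two roles of the high-level policy described in the abstract above (random effect exploration, and prioritizing effects that maximize reward) can be sketched as a simple choice rule. This is only an illustration under stated assumptions: the learned VAE distribution over controllable effects is replaced by a fixed list of effect labels, the reward estimates are a plain dictionary, and `choose_effect` and `explore_prob` are hypothetical names not taken from the paper.

```python
import random
from typing import Dict, Hashable, Sequence


def choose_effect(
    effects: Sequence[Hashable],
    value: Dict[Hashable, float],
    explore_prob: float,
    rng: random.Random,
) -> Hashable:
    """Pick a target effect for a lower-level policy to bring about.

    With probability `explore_prob`, sample an effect uniformly at random
    (random effect exploration). Otherwise pick the effect with the highest
    estimated reward (task-specific prioritization). Unseen effects default
    to an estimate of 0.0, which is an assumption of this sketch.
    """
    if rng.random() < explore_prob:
        return rng.choice(list(effects))
    return max(effects, key=lambda e: value.get(e, 0.0))


# Hypothetical effect labels and reward estimates for illustration only.
effects = ["open_door", "pick_key", "move_box"]
value = {"open_door": 1.0, "pick_key": 0.4}

rng = random.Random(0)
greedy_choice = choose_effect(effects, value, explore_prob=0.0, rng=rng)
random_choice = choose_effect(effects, value, explore_prob=1.0, rng=rng)
```

Selecting among a handful of effects rather than long action sequences is the sense in which credit is assigned to few effects rather than many actions; how the chosen effect is then realized by low-level actions is outside this sketch.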