
Showing 1–35 of 35 results for author: Chaplot, D S

Searching in archive cs.
  1. arXiv:2404.06609  [pdf, other]

    cs.AI cs.RO

    GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation

    Authors: Mukul Khanna, Ram Ramrakhya, Gunjan Chhablani, Sriram Yenamandra, Theophile Gervet, Matthew Chang, Zsolt Kira, Devendra Singh Chaplot, Dhruv Batra, Roozbeh Mottaghi

    Abstract: The Embodied AI community has made significant strides in visual navigation tasks, exploring targets from 3D coordinates, objects, language descriptions, and images. However, these navigation models often handle only a single input modality as the target. With the progress achieved so far, it is time to move towards universal navigation models capable of handling various goal types, enabling more…

    Submitted 9 April, 2024; originally announced April 2024.

  2. arXiv:2401.04088  [pdf, other]

    cs.LG cs.CL

    Mixtral of Experts

    Authors: Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, et al. (1 additional author not shown)

    Abstract: We introduce Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model. Mixtral has the same architecture as Mistral 7B, with the difference that each layer is composed of 8 feedforward blocks (i.e. experts). For every token, at each layer, a router network selects two experts to process the current state and combine their outputs. Even though each token only sees two experts, the selected e…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: See more details at https://mistral.ai/news/mixtral-of-experts/

  3. arXiv:2311.06430  [pdf, other]

    cs.RO

    GOAT: GO to Any Thing

    Authors: Matthew Chang, Theophile Gervet, Mukul Khanna, Sriram Yenamandra, Dhruv Shah, So Yeon Min, Kavit Shah, Chris Paxton, Saurabh Gupta, Dhruv Batra, Roozbeh Mottaghi, Jitendra Malik, Devendra Singh Chaplot

    Abstract: In deployment scenarios such as homes and warehouses, mobile robots are expected to autonomously navigate for extended periods, seamlessly executing tasks articulated in terms that are intuitively understandable by human operators. We present GO To Any Thing (GOAT), a universal navigation system capable of tackling these requirements with three key features: a) Multimodal: it can tackle goals spec…

    Submitted 10 November, 2023; originally announced November 2023.

  4. arXiv:2310.13724  [pdf, other]

    cs.HC cs.AI cs.CV cs.GR cs.MA cs.RO

    Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots

    Authors: Xavier Puig, Eric Undersander, Andrew Szot, Mikael Dallaire Cote, Tsung-Yen Yang, Ruslan Partsey, Ruta Desai, Alexander William Clegg, Michal Hlavac, So Yeon Min, Vladimír Vondruš, Theophile Gervet, Vincent-Pierre Berges, John M. Turner, Oleksandr Maksymets, Zsolt Kira, Mrinal Kalakrishnan, Jitendra Malik, Devendra Singh Chaplot, Unnat Jain, Dhruv Batra, Akshara Rai, Roozbeh Mottaghi

    Abstract: We present Habitat 3.0: a simulation platform for studying collaborative human-robot tasks in home environments. Habitat 3.0 offers contributions across three dimensions: (1) Accurate humanoid simulation: addressing challenges in modeling complex deformable bodies and diversity in appearance and motion, all while ensuring high simulation speed. (2) Human-in-the-loop infrastructure: enabling real h…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: Project page: http://aihabitat.org/habitat3

  5. arXiv:2310.06825  [pdf, other]

    cs.CL cs.AI cs.LG

    Mistral 7B

    Authors: Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed

    Abstract: We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency. Mistral 7B outperforms Llama 2 13B across all evaluated benchmarks, and Llama 1 34B in reasoning, mathematics, and code generation. Our model leverages grouped-query attention (GQA) for faster inference, coupled with sliding window attention (SWA) to effectively handle sequences o…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Models and code are available at https://mistral.ai/news/announcing-mistral-7b/

  6. arXiv:2306.11565  [pdf, other]

    cs.RO cs.AI cs.CV

    HomeRobot: Open-Vocabulary Mobile Manipulation

    Authors: Sriram Yenamandra, Arun Ramachandran, Karmesh Yadav, Austin Wang, Mukul Khanna, Theophile Gervet, Tsung-Yen Yang, Vidhi Jain, Alexander William Clegg, John Turner, Zsolt Kira, Manolis Savva, Angel Chang, Devendra Singh Chaplot, Dhruv Batra, Roozbeh Mottaghi, Yonatan Bisk, Chris Paxton

    Abstract: HomeRobot (noun): An affordable compliant robot that navigates homes and manipulates a wide range of objects in order to complete everyday tasks. Open-Vocabulary Mobile Manipulation (OVMM) is the problem of picking any object in any unseen environment, and placing it in a commanded location. This is a foundational challenge for robots to be useful assistants in human environments, because it invol…

    Submitted 10 January, 2024; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: 37 pages, 22 figures, 8 tables

  7. arXiv:2306.07552  [pdf, other]

    cs.LG cs.AI cs.RO

    Galactic: Scaling End-to-End Reinforcement Learning for Rearrangement at 100k Steps-Per-Second

    Authors: Vincent-Pierre Berges, Andrew Szot, Devendra Singh Chaplot, Aaron Gokaslan, Roozbeh Mottaghi, Dhruv Batra, Eric Undersander

    Abstract: We present Galactic, a large-scale simulation and reinforcement-learning (RL) framework for robotic mobile manipulation in indoor environments. Specifically, a Fetch robot (equipped with a mobile base, 7DoF arm, RGBD camera, egomotion, and onboard sensing) is spawned in a home environment and asked to rearrange objects - by navigating to an object, picking it up, navigating to a target location, a…

    Submitted 13 June, 2023; originally announced June 2023.

  8. arXiv:2304.11241  [pdf, other]

    cs.CV cs.LG cs.RO

    AutoNeRF: Training Implicit Scene Representations with Autonomous Agents

    Authors: Pierre Marza, Laetitia Matignon, Olivier Simonin, Dhruv Batra, Christian Wolf, Devendra Singh Chaplot

    Abstract: Implicit representations such as Neural Radiance Fields (NeRF) have been shown to be very effective at novel view synthesis. However, these models typically require manual and careful human data collection for training. In this paper, we present AutoNeRF, a method to collect data required to train NeRFs using autonomous embodied agents. Our method allows an agent to explore an unseen environment e…

    Submitted 22 December, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

  9. arXiv:2304.01192  [pdf, other]

    cs.CV cs.RO

    Navigating to Objects Specified by Images

    Authors: Jacob Krantz, Theophile Gervet, Karmesh Yadav, Austin Wang, Chris Paxton, Roozbeh Mottaghi, Dhruv Batra, Jitendra Malik, Stefan Lee, Devendra Singh Chaplot

    Abstract: Images are a convenient way to specify which particular object instance an embodied agent should navigate to. Solving this task requires semantic visual reasoning and exploration of unknown environments. We present a system that can perform this task in both simulation and the real world. Our modular method solves sub-tasks of exploration, goal instance re-identification, goal localization, and lo…

    Submitted 3 April, 2023; originally announced April 2023.

  10. arXiv:2212.00922  [pdf, other]

    cs.RO cs.CV cs.LG

    Navigating to Objects in the Real World

    Authors: Theophile Gervet, Soumith Chintala, Dhruv Batra, Jitendra Malik, Devendra Singh Chaplot

    Abstract: Semantic navigation is necessary to deploy mobile robots in uncontrolled environments like our homes, schools, and hospitals. Many learning-based approaches have been proposed in response to the lack of semantic understanding of the classical pipeline for spatial navigation, which builds a geometric map using depth sensors and plans to reach point goals. Broadly, end-to-end learning approaches rea…

    Submitted 1 December, 2022; originally announced December 2022.

    Comments: 39 pages, 19 figures and tables, submitted to Science Robotics

  11. arXiv:2211.15876  [pdf, other]

    cs.CV

    Instance-Specific Image Goal Navigation: Training Embodied Agents to Find Object Instances

    Authors: Jacob Krantz, Stefan Lee, Jitendra Malik, Dhruv Batra, Devendra Singh Chaplot

    Abstract: We consider the problem of embodied visual navigation given an image-goal (ImageNav) where an agent is initialized in an unfamiliar environment and tasked with navigating to a location 'described' by an image. Unlike related navigation tasks, ImageNav does not have a standardized task definition which makes comparison across methods difficult. Further, existing formulations have two problematic pr…

    Submitted 28 November, 2022; originally announced November 2022.

  12. arXiv:2210.06849  [pdf, other]

    cs.CV

    Retrospectives on the Embodied AI Workshop

    Authors: Matt Deitke, Dhruv Batra, Yonatan Bisk, Tommaso Campari, Angel X. Chang, Devendra Singh Chaplot, Changan Chen, Claudia Pérez D'Arpino, Kiana Ehsani, Ali Farhadi, Li Fei-Fei, Anthony Francis, Chuang Gan, Kristen Grauman, David Hall, Winson Han, Unnat Jain, Aniruddha Kembhavi, Jacob Krantz, Stefan Lee, Chengshu Li, Sagnik Majumder, Oleksandr Maksymets, Roberto Martín-Martín, Roozbeh Mottaghi, et al. (14 additional authors not shown)

    Abstract: We present a retrospective on the state of Embodied AI research. Our analysis focuses on 13 challenges presented at the Embodied AI Workshop at CVPR. These challenges are grouped into three themes: (1) visual navigation, (2) rearrangement, and (3) embodied vision-and-language. We discuss the dominant datasets within each theme, evaluation metrics for the challenges, and the performance of state-of…

    Submitted 4 December, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

  13. arXiv:2210.05633  [pdf, other]

    cs.CV

    Habitat-Matterport 3D Semantics Dataset

    Authors: Karmesh Yadav, Ram Ramrakhya, Santhosh Kumar Ramakrishnan, Theo Gervet, John Turner, Aaron Gokaslan, Noah Maestre, Angel Xuan Chang, Dhruv Batra, Manolis Savva, Alexander William Clegg, Devendra Singh Chaplot

    Abstract: We present the Habitat-Matterport 3D Semantics (HM3DSEM) dataset. HM3DSEM is the largest dataset of 3D real-world spaces with densely annotated semantics that is currently available to the academic community. It consists of 142,646 object instance annotations across 216 3D spaces and 3,100 rooms within those spaces. The scale, quality, and diversity of object annotations far exceed those of prior…

    Submitted 12 October, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: 15 Pages, 11 Figures, 6 Tables

  14. arXiv:2209.02778  [pdf, other]

    cs.RO cs.LG

    Multi-skill Mobile Manipulation for Object Rearrangement

    Authors: Jiayuan Gu, Devendra Singh Chaplot, Hao Su, Jitendra Malik

    Abstract: We study a modular approach to tackle long-horizon mobile manipulation tasks for object rearrangement, which decomposes a full task into a sequence of subtasks. To tackle the entire task, prior work chains multiple stationary manipulation skills with a point-goal navigation skill, which are learned individually on subtasks. Although more effective than monolithic end-to-end RL policies, this frame…

    Submitted 6 September, 2022; originally announced September 2022.

    Comments: Project website: https://sites.google.com/view/hab-m3

  15. arXiv:2201.10029  [pdf, other]

    cs.CV cs.AI

    PONI: Potential Functions for ObjectGoal Navigation with Interaction-free Learning

    Authors: Santhosh Kumar Ramakrishnan, Devendra Singh Chaplot, Ziad Al-Halah, Jitendra Malik, Kristen Grauman

    Abstract: State-of-the-art approaches to ObjectGoal navigation rely on reinforcement learning and typically require significant computational resources and time for learning. We propose Potential functions for ObjectGoal Navigation with Interaction-free learning (PONI), a modular approach that disentangles the skills of `where to look?' for an object and `how to navigate to (x, y)?'. Our key insight is that…

    Submitted 17 June, 2022; v1 submitted 24 January, 2022; originally announced January 2022.

    Comments: 8 pages + supplementary. Accepted in CVPR 2022

  16. arXiv:2112.01520  [pdf, other]

    cs.CV

    Recognizing Scenes from Novel Viewpoints

    Authors: Shengyi Qian, Alexander Kirillov, Nikhila Ravi, Devendra Singh Chaplot, Justin Johnson, David F. Fouhey, Georgia Gkioxari

    Abstract: Humans can perceive scenes in 3D from a handful of 2D views. For AI agents, the ability to recognize a scene from any viewpoint given only a few images enables them to efficiently interact with the scene and its objects. In this work, we attempt to endow machines with this ability. We propose a model which takes as input a few RGB images of a new scene and recognizes the scene from novel viewpoint…

    Submitted 2 December, 2021; originally announced December 2021.

  17. arXiv:2112.01010  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO

    Differentiable Spatial Planning using Transformers

    Authors: Devendra Singh Chaplot, Deepak Pathak, Jitendra Malik

    Abstract: We consider the problem of spatial path planning. In contrast to the classical solutions which optimize a new plan from scratch and assume access to the full map with ground truth obstacle locations, we learn a planner from the data in a differentiable manner that allows us to leverage statistical regularities from past data. We propose Spatial Planning Transformers (SPT), which given an obstacle…

    Submitted 2 December, 2021; originally announced December 2021.

    Comments: Published at ICML 2021. See project webpage at https://devendrachaplot.github.io/projects/spatial-planning-transformers

  18. arXiv:2112.01001  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    SEAL: Self-supervised Embodied Active Learning using Exploration and 3D Consistency

    Authors: Devendra Singh Chaplot, Murtaza Dalal, Saurabh Gupta, Jitendra Malik, Ruslan Salakhutdinov

    Abstract: In this paper, we explore how we can build upon the data and models of Internet images and use them to adapt to robot vision without requiring any extra labels. We present a framework called Self-supervised Embodied Active Learning (SEAL). It utilizes perception models trained on internet images to learn an active exploration policy. The observations gathered by this exploration policy are labelle…

    Submitted 2 December, 2021; originally announced December 2021.

    Comments: Published at NeurIPS 2021. See project webpage at https://devendrachaplot.github.io/projects/seal

  19. arXiv:2110.07342  [pdf, other]

    cs.CL cs.LG

    FILM: Following Instructions in Language with Modular Methods

    Authors: So Yeon Min, Devendra Singh Chaplot, Pradeep Ravikumar, Yonatan Bisk, Ruslan Salakhutdinov

    Abstract: Recent methods for embodied instruction following are typically trained end-to-end using imitation learning. This often requires the use of expert trajectories and low-level language instructions. Such approaches assume that neural states will integrate multimodal semantics to perform state tracking, building spatial memory, exploration, and long-term planning. In contrast, we propose a modular me…

    Submitted 16 March, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: Published as a conference paper at International Conference on Learning Representations (ICLR) 2022

  20. arXiv:2106.13415  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    Building Intelligent Autonomous Navigation Agents

    Authors: Devendra Singh Chaplot

    Abstract: Breakthroughs in machine learning in the last decade have led to `digital intelligence', i.e. machine learning models capable of learning from vast amounts of labeled data to perform several digital tasks such as speech recognition, face recognition, machine translation and so on. The goal of this thesis is to make progress towards designing algorithms capable of `physical intelligence', i.e. buil…

    Submitted 25 June, 2021; originally announced June 2021.

    Comments: CMU Ph.D. Thesis, March 2021. For more details see http://devendrachaplot.github.io/

    Report number: CMU-ML-21-101

  21. arXiv:2010.14543  [pdf, other]

    cs.LG cs.CV cs.RO

    Unsupervised Domain Adaptation for Visual Navigation

    Authors: Shangda Li, Devendra Singh Chaplot, Yao-Hung Hubert Tsai, Yue Wu, Louis-Philippe Morency, Ruslan Salakhutdinov

    Abstract: Advances in visual navigation methods have led to intelligent embodied navigation agents capable of learning meaningful representations from raw RGB images and perform a wide variety of tasks involving structural and semantic reasoning. However, most learning-based navigation policies are trained and tested in simulation environments. In order for these policies to be practically useful, they need…

    Submitted 12 November, 2020; v1 submitted 27 October, 2020; originally announced October 2020.

    Comments: Deep Reinforcement Learning Workshop at NeurIPS 2020. Camera Ready Version

  22. arXiv:2010.11863  [pdf, other]

    cs.AI cs.LG

    Planning with Submodular Objective Functions

    Authors: Ruosong Wang, Hanrui Zhang, Devendra Singh Chaplot, Denis Garagić, Ruslan Salakhutdinov

    Abstract: We study planning with submodular objective functions, where instead of maximizing the cumulative reward, the goal is to maximize the objective value induced by a submodular function. Our framework subsumes standard planning and submodular maximization with cardinality constraints as special cases, and thus many practical applications can be naturally formulated within our framework. Based on the…

    Submitted 22 October, 2020; originally announced October 2020.

  23. arXiv:2007.00643  [pdf, other]

    cs.CV cs.LG cs.RO

    Object Goal Navigation using Goal-Oriented Semantic Exploration

    Authors: Devendra Singh Chaplot, Dhiraj Gandhi, Abhinav Gupta, Ruslan Salakhutdinov

    Abstract: This work studies the problem of object goal navigation which involves navigating to an instance of the given object category in unseen environments. End-to-end learning-based navigation methods struggle at this task as they are ineffective at exploration and long-term planning. We propose a modular system called, `Goal-Oriented Semantic Exploration' which builds an episodic semantic map and uses…

    Submitted 1 July, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

    Comments: Winner of the CVPR 2020 AI-Habitat Object Goal Navigation Challenge. See the project webpage at https://devendrachaplot.github.io/projects/semantic-exploration.html

  24. arXiv:2006.09367  [pdf, other]

    cs.CV cs.AI cs.LG

    Semantic Curiosity for Active Visual Learning

    Authors: Devendra Singh Chaplot, Helen Jiang, Saurabh Gupta, Abhinav Gupta

    Abstract: In this paper, we study the task of embodied interactive learning for object detection. Given a set of environments (and some labeling budget), our goal is to learn an object detector by having an agent select what data to obtain labels for. How should an exploration policy decide which trajectory should be labeled? One possibility is to use a trained object detector's failure cases as an external…

    Submitted 16 June, 2020; originally announced June 2020.

    Comments: See project webpage at https://devendrachaplot.github.io/projects/SemanticCuriosity

  25. arXiv:2005.12256  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    Neural Topological SLAM for Visual Navigation

    Authors: Devendra Singh Chaplot, Ruslan Salakhutdinov, Abhinav Gupta, Saurabh Gupta

    Abstract: This paper studies the problem of image-goal navigation which involves navigating to the location indicated by a goal image in a novel previously unseen environment. To tackle this problem, we design topological representations for space that effectively leverage semantics and afford approximate geometric reasoning. At the heart of our representations are nodes with associated semantic features, t…

    Submitted 28 May, 2020; v1 submitted 25 May, 2020; originally announced May 2020.

    Comments: Published in CVPR 2020. See the project webpage at https://devendrachaplot.github.io/projects/Neural-Topological-SLAM

  26. arXiv:2004.05155  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    Learning to Explore using Active Neural SLAM

    Authors: Devendra Singh Chaplot, Dhiraj Gandhi, Saurabh Gupta, Abhinav Gupta, Ruslan Salakhutdinov

    Abstract: This work presents a modular and hierarchical approach to learn policies for exploring 3D environments, called `Active Neural SLAM'. Our approach leverages the strengths of both classical and learning-based methods, by using analytical path planners with learned SLAM module, and global and local policies. The use of learning provides flexibility with respect to input modalities (in the SLAM module…

    Submitted 10 April, 2020; originally announced April 2020.

    Comments: Published in ICLR-2020. See the project webpage at https://devendrachaplot.github.io/projects/Neural-SLAM for supplementary videos. The code is available at https://github.com/devendrachaplot/Neural-SLAM

  27. arXiv:1902.01385  [pdf, other]

    cs.LG cs.AI cs.CL cs.RO stat.ML

    Embodied Multimodal Multitask Learning

    Authors: Devendra Singh Chaplot, Lisa Lee, Ruslan Salakhutdinov, Devi Parikh, Dhruv Batra

    Abstract: Recent efforts on training visual navigation agents conditioned on language using deep reinforcement learning have been successful in learning policies for different multimodal tasks, such as semantic goal navigation and embodied question answering. In this paper, we propose a multitask model capable of jointly learning these multimodal tasks, and transferring knowledge of words and their groundin…

    Submitted 4 February, 2019; originally announced February 2019.

    Comments: See https://devendrachaplot.github.io/projects/EMML for demo videos

  28. arXiv:1807.06757  [pdf, other]

    cs.AI cs.CV cs.LG cs.RO

    On Evaluation of Embodied Navigation Agents

    Authors: Peter Anderson, Angel Chang, Devendra Singh Chaplot, Alexey Dosovitskiy, Saurabh Gupta, Vladlen Koltun, Jana Kosecka, Jitendra Malik, Roozbeh Mottaghi, Manolis Savva, Amir R. Zamir

    Abstract: Skillful mobile operation in three-dimensional environments is a primary topic of study in Artificial Intelligence. The past two years have seen a surge of creative work on navigation. This creative output has produced a plethora of sometimes incompatible task definitions and evaluation protocols. To coordinate ongoing and future research in this area, we have convened a working group to study emp…

    Submitted 17 July, 2018; originally announced July 2018.

    Comments: Report of a working group on empirical methodology in navigation research. Authors are listed in alphabetical order

  29. arXiv:1806.08065  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning Cognitive Models using Neural Networks

    Authors: Devendra Singh Chaplot, Christopher MacLellan, Ruslan Salakhutdinov, Kenneth Koedinger

    Abstract: A cognitive model of human learning provides information about skills a learner must acquire to perform accurately in a task domain. Cognitive models of learning are not only of scientific interest, but are also valuable in adaptive online tutoring systems. A more accurate model yields more effective tutoring through better instructional decisions. Prior methods of automated cognitive model discov…

    Submitted 21 June, 2018; originally announced June 2018.

  30. arXiv:1806.06408  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Gated Path Planning Networks

    Authors: Lisa Lee, Emilio Parisotto, Devendra Singh Chaplot, Eric Xing, Ruslan Salakhutdinov

    Abstract: Value Iteration Networks (VINs) are effective differentiable path planning modules that can be used by agents to perform navigation while still maintaining end-to-end differentiability of the entire architecture. Despite their effectiveness, they suffer from several disadvantages including training instability, random seed sensitivity, and other optimization problems. In this work, we reframe VINs…

    Submitted 17 June, 2018; originally announced June 2018.

    Comments: ICML 2018

  31. arXiv:1802.06857  [pdf, other]

    cs.CV cs.LG cs.RO

    Global Pose Estimation with an Attention-based Recurrent Network

    Authors: Emilio Parisotto, Devendra Singh Chaplot, Jian Zhang, Ruslan Salakhutdinov

    Abstract: The ability for an agent to localize itself within an environment is crucial for many real-world applications. For unknown environments, Simultaneous Localization and Mapping (SLAM) enables incremental and concurrent building of and localizing within a map. We present a new, differentiable architecture, Neural Graph Optimizer, progressing towards a complete neural network solution for SLAM by desi…

    Submitted 19 February, 2018; originally announced February 2018.

    Comments: First two authors contributed equally

  32. arXiv:1801.08214  [pdf, other]

    cs.LG cs.AI cs.RO

    Active Neural Localization

    Authors: Devendra Singh Chaplot, Emilio Parisotto, Ruslan Salakhutdinov

    Abstract: Localization is the problem of estimating the location of an autonomous agent from an observation and a map of the environment. Traditional methods of localization, which filter the belief based on the observations, are sub-optimal in the number of steps required, as they do not decide the actions taken by the agent. We propose "Active Neural Localizer", a fully differentiable neural network that…

    Submitted 24 January, 2018; originally announced January 2018.

    Comments: Under Review at ICLR-18, 15 pages, 7 figures

  33. arXiv:1801.01900  [pdf, other]

    cs.CL cs.LG

    Knowledge-based Word Sense Disambiguation using Topic Models

    Authors: Devendra Singh Chaplot, Ruslan Salakhutdinov

    Abstract: Word Sense Disambiguation is an open problem in Natural Language Processing which is particularly challenging and useful in the unsupervised setting where all the words in any given text need to be disambiguated without using any labeled data. Typically WSD systems use the sentence or a small window of words around the target word as the context for disambiguation because their computational compl…

    Submitted 5 January, 2018; originally announced January 2018.

    Comments: To appear in AAAI-18

  34. arXiv:1706.07230  [pdf, other]

    cs.LG cs.AI cs.CL cs.RO

    Gated-Attention Architectures for Task-Oriented Language Grounding

    Authors: Devendra Singh Chaplot, Kanthashree Mysore Sathyendra, Rama Kumar Pasumarthi, Dheeraj Rajagopal, Ruslan Salakhutdinov

    Abstract: To perform tasks specified by natural language instructions, autonomous agents need to extract semantically meaningful representations of language and map it to visual elements and actions in the environment. This problem is called task-oriented language grounding. We propose an end-to-end trainable neural architecture for task-oriented language grounding in 3D environments which assumes no prior…

    Submitted 8 January, 2018; v1 submitted 22 June, 2017; originally announced June 2017.

    Comments: To appear in AAAI-18

  35. arXiv:1609.05521  [pdf, other]

    cs.AI cs.LG

    Playing FPS Games with Deep Reinforcement Learning

    Authors: Guillaume Lample, Devendra Singh Chaplot

    Abstract: Advances in deep reinforcement learning have allowed autonomous agents to perform well on Atari games, often outperforming humans, using only raw pixels to make their decisions. However, most of these games take place in 2D environments that are fully observable to the agent. In this paper, we present the first architecture to tackle 3D environments in first-person shooter games, that involve part…

    Submitted 29 January, 2018; v1 submitted 18 September, 2016; originally announced September 2016.

    Comments: The authors contributed equally to this work