BEHAVIOR-1K: A Human-Centered, Embodied AI Benchmark with 1,000 Everyday Activities and Realistic Simulation

Authors: Chengshu Li, Ruohan Zhang, Josiah Wong, Cem Gokmen, Sanjana Srivastava, Roberto Martín-Martín, Chen Wang, Gabrael Levine, Wensi Ai, Benjamin Martinez, Hang Yin, Michael Lingelbach, Minjune Hwang, Ayano Hiranaka, Sujay Garlanka, Arman Aydin, Sharon Lee, Jiankai Sun, Mona Anvari, Manasi Sharma, Dhruva Bansal, Samuel Hunter, Kyu-Young Kim, Alan Lou, Caleb R Matthews, et al. (10 additional authors not shown)

Abstract: We present BEHAVIOR-1K, a comprehensive simulation benchmark for human-centered robotics. BEHAVIOR-1K includes two components, guided and motivated by the results of an extensive survey on "what do you want robots to do for you?". The first is the definition of 1,000 everyday activities, grounded in 50 scenes (houses, gardens, restaurants, offices, etc.) with more than 9,000 objects annotated with rich physical and semantic properties. The second is OMNIGIBSON, a novel simulation environment that supports these activities via realistic physics simulation and rendering of rigid bodies, deformable bodies, and liquids. Our experiments indicate that the activities in BEHAVIOR-1K are long-horizon and dependent on complex manipulation skills, both of which remain a challenge for even state-of-the-art robot learning solutions. To calibrate the simulation-to-reality gap of BEHAVIOR-1K, we provide an initial study on transferring solutions learned with a mobile manipulator in a simulated apartment to its real-world counterpart. We hope that BEHAVIOR-1K's human-grounded nature, diversity, and realism make it valuable for embodied AI and robot learning research. Project website: https://behavior.stanford.edu.

Submitted 14 March, 2024; originally announced March 2024.

Comments: A preliminary version was published at 6th Conference on Robot Learning (CoRL 2022)

arXiv:2312.05250 [pdf, other]

TaskMet: Task-Driven Metric Learning for Model Learning

Authors: Dishank Bansal, Ricky T. Q. Chen, Mustafa Mukadam, Brandon Amos

Abstract: Deep learning models are often deployed in downstream tasks that the training procedure may not be aware of. For example, models solely trained to achieve accurate predictions may struggle to perform well on downstream tasks because seemingly small prediction errors may incur drastic task errors. The standard end-to-end learning approach is to make the task loss differentiable or to introduce a differentiable surrogate that the model can be trained on. In these settings, the task loss needs to be carefully balanced with the prediction loss because they may have conflicting objectives. We propose to take the task loss signal one level deeper than the parameters of the model and use it to learn the parameters of the loss function the model is trained on, which can be done by learning a metric in the prediction space. This approach does not alter the optimal prediction model itself, but rather changes the model learning to emphasize the information important for the downstream task. This enables us to achieve the best of both worlds: a prediction model trained in the original prediction space while also being valuable for the desired downstream task. We validate our approach through experiments conducted in two main settings: 1) decision-focused model learning scenarios involving portfolio optimization and budget allocation, and 2) reinforcement learning in noisy environments with distracting states. The source code to reproduce our experiments is available at https://github.com/facebookresearch/taskmet

Submitted 8 December, 2023; originally announced December 2023.

Comments: NeurIPS 2023

arXiv:2109.13913 [pdf, other]

$f$-Cal: Calibrated aleatoric uncertainty estimation from neural networks for robot perception

Authors: Dhaivat Bhatt, Kaustubh Mani, Dishank Bansal, Krishna Murthy, Hanju Lee, Liam Paull

Abstract: While modern deep neural networks are performant perception modules, performance (accuracy) alone is insufficient, particularly for safety-critical robotic applications such as self-driving vehicles. Robot autonomy stacks also require these otherwise blackbox models to produce reliable and calibrated measures of confidence on their predictions. Existing approaches estimate uncertainty from these neural network perception stacks by modifying network architectures, inference procedure, or loss functions. However, in general, these methods lack calibration, meaning that the predictive uncertainties do not faithfully represent the true underlying uncertainties (process noise). Our key insight is that calibration is only achieved by imposing constraints across multiple examples, such as those in a mini-batch; as opposed to existing approaches which only impose constraints per-sample, often leading to overconfident (thus miscalibrated) uncertainty estimates. By enforcing the distribution of outputs of a neural network to resemble a target distribution by minimizing an $f$-divergence, we obtain significantly better-calibrated models compared to prior approaches. Our approach, $f$-Cal, outperforms existing uncertainty calibration approaches on robot perception tasks such as object detection and monocular depth estimation over multiple real-world benchmarks.

Submitted 28 September, 2021; originally announced September 2021.

Comments: For more details about $f$-Cal, visit https://f-cal.github.io

arXiv:2103.10730 [pdf, other]

MuRIL: Multilingual Representations for Indian Languages

Authors: Simran Khanuja, Diksha Bansal, Sarvesh Mehtani, Savya Khosla, Atreyee Dey, Balaji Gopalan, Dilip Kumar Margam, Pooja Aggarwal, Rajiv Teja Nagipogu, Shachi Dave, Shruti Gupta, Subhash Chandra Bose Gali, Vish Subramanian, Partha Talukdar

Abstract: India is a multilingual society with 1369 rationalized languages and dialects being spoken across the country (INDIA, 2011). Of these, the 22 scheduled languages have a staggering total of 1.17 billion speakers and 121 languages have more than 10,000 speakers (INDIA, 2011). India also has the second largest (and an ever growing) digital footprint (Statista, 2020). Despite this, today's state-of-the-art multilingual systems perform suboptimally on Indian (IN) languages. This can be explained by the fact that multilingual language models (LMs) are often trained on 100+ languages together, leading to a small representation of IN languages in their vocabulary and training data. Multilingual LMs are substantially less effective in resource-lean scenarios (Wu and Dredze, 2020; Lauscher et al., 2020), as limited data doesn't help capture the various nuances of a language. One also commonly observes IN language text transliterated to Latin or code-mixed with English, especially in informal settings (for example, on social media platforms) (Rijhwani et al., 2017). This phenomenon is not adequately handled by current state-of-the-art multilingual LMs. To address the aforementioned gaps, we propose MuRIL, a multilingual LM specifically built for IN languages. MuRIL is trained on significantly large amounts of IN text corpora only. We explicitly augment monolingual text corpora with both translated and transliterated document pairs, that serve as supervised cross-lingual signals in training. MuRIL significantly outperforms multilingual BERT (mBERT) on all tasks in the challenging cross-lingual XTREME benchmark (Hu et al., 2020). We also present results on transliterated (native to Latin script) test sets of the chosen datasets and demonstrate the efficacy of MuRIL in handling transliterated data.

Submitted 2 April, 2021; v1 submitted 19 March, 2021; originally announced March 2021.

arXiv:2006.03614 [pdf, other]

Anticipatory Human-Robot Collaboration via Multi-Objective Trajectory Optimization

Authors: Abhinav Jain, Daphne Chen, Dhruva Bansal, Sam Scheele, Mayank Kishore, Hritik Sapra, David Kent, Harish Ravichandar, Sonia Chernova

Abstract: We address the problem of adapting robot trajectories to improve safety, comfort, and efficiency in human-robot collaborative tasks. To this end, we propose CoMOTO, a trajectory optimization framework that utilizes stochastic motion prediction models to anticipate the human's motion and adapt the robot's joint trajectory accordingly. We design a multi-objective cost function that simultaneously optimizes for i) separation distance, ii) visibility of the end-effector, iii) legibility, iv) efficiency, and v) smoothness. We evaluate CoMOTO against three existing methods for robot trajectory generation when in close proximity to humans. Our experimental results indicate that our approach consistently outperforms existing methods over a combined set of safety, comfort, and efficiency metrics.

Submitted 30 July, 2020; v1 submitted 5 June, 2020; originally announced June 2020.

Comments: To be published at the International Conference on Intelligent Robots and Systems (IROS), 2020

arXiv:1907.12368 [pdf, other]

Detecting Radical Text over Online Media using Deep Learning

Authors: Armaan Kaur, Jaspal Kaur Saini, Divya Bansal

Abstract: Social media has influenced the way people socially connect, interact and opinionize. The growth in technology has enhanced communication and dissemination of information. Unfortunately, many terror groups like jihadist communities have started consolidating a virtual community online for various purposes such as recruitment, online donations, targeting youth online and spread of extremist ideologies. Every day a large number of articles, tweets, posts, posters, blogs, comments, views and news are posted online without a check, which in turn poses a threat to the security of any nation. However, different agencies are working on taking down this radical content from various online social media platforms. The aim of our paper is to utilise a deep learning algorithm for detecting radicalization, in contrast to the existing works based on machine learning algorithms. An LSTM based feed forward neural network is employed to detect radical content. We collected a total of 61601 records from various online sources constituting news, articles and blogs. These records are annotated by domain experts into three categories: Radical (R), Non-Radical (NR) and Irrelevant (I), which are then fed to an LSTM based network to classify radical content. A precision of 85.9% has been achieved with the proposed approach.

Submitted 30 July, 2019; v1 submitted 22 July, 2019; originally announced July 2019.

Comments: The paper consists of 7 pages with 5 figures. The paper is accepted for oral presentation at the Intelligent Information Feed Workshop of the 25th ACM SIGKDD Conference 2019

arXiv:1801.03425 [pdf, other]

doi 10.1145/3191477.3191487

Design, Analysis & Prototyping of a Semi-Automated Staircase-Climbing Rehabilitation Robot

Authors: Siddharth Jha, Himanshu Chaudhary, Swapnil Satardey, Piyush Kumar, Ankush Roy, Aditya Deshmukh, Dishank Bansal, Gopabandhu Hota, Saurabh Mirani

Abstract: In this paper, we describe the mechanical design, system overview, integration and control techniques associated with SKALA, a unique large-sized robot for carrying a person with physical disabilities, up and down staircases. As a regular wheelchair is unable to perform such a maneuver, the system functions as a non-conventional wheelchair with several intelligent features. We describe the unique mechanical design and the design choices associated with it. We showcase the embedded control architecture that allows for several different modes of teleoperation, all of which have been described in detail. We further investigate the architecture associated with the autonomous operation of the system.

Submitted 21 January, 2020; v1 submitted 10 January, 2018; originally announced January 2018.

arXiv:cs/0104012 [pdf, ps, other]

System Support for Bandwidth Management and Content Adaptation in Internet Applications

Authors: David G. Andersen, Deepak Bansal, Dorothy Curtis, Srinivasan Seshan, Hari Balakrishnan

Abstract: This paper describes the implementation and evaluation of an operating system module, the Congestion Manager (CM), which provides integrated network flow management and exports a convenient programming interface that allows applications to be notified of, and adapt to, changing network conditions. We describe the API by which applications interface with the CM, and the architectural considerations that factored into the design. To evaluate the architecture and API, we describe our implementations of TCP; a streaming layered audio/video application; and an interactive audio application using the CM, and show that they achieve adaptive behavior without incurring much end-system overhead. All flows including TCP benefit from the sharing of congestion information, and applications are able to incorporate new functionality such as congestion control and adaptive behavior.

Submitted 7 April, 2001; originally announced April 2001.

Comments: 14 pages, appeared in OSDI 2000

ACM Class: D.4.4

Journal ref: Proc. OSDI 2000

Showing 1–8 of 8 results for author: Bansal, D