Showing 1–42 of 42 results for author: Mousavian, A

  1. arXiv:2406.10721  [pdf, other]

    cs.RO cs.AI cs.CV

    RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics

    Authors: Wentao Yuan, Jiafei Duan, Valts Blukis, Wilbert Pumacay, Ranjay Krishna, Adithyavairavan Murali, Arsalan Mousavian, Dieter Fox

    Abstract: From rearranging objects on a table to putting groceries into shelves, robots must plan precise action points to perform tasks accurately and reliably. In spite of the recent adoption of vision language models (VLMs) to control robot behavior, VLMs struggle to precisely articulate robot actions using language. We introduce an automatic synthetic data generation pipeline that instruction-tunes VLMs…

    Submitted 15 June, 2024; originally announced June 2024.

  2. arXiv:2311.00926  [pdf, other]

    cs.RO cs.AI cs.CV

    M2T2: Multi-Task Masked Transformer for Object-centric Pick and Place

    Authors: Wentao Yuan, Adithyavairavan Murali, Arsalan Mousavian, Dieter Fox

    Abstract: With the advent of large language models and large-scale robotic datasets, there has been tremendous progress in high-level decision-making for object manipulation. These generic models are able to interpret complex tasks using language commands, but they often have difficulties generalizing to out-of-distribution objects due to the inability of low-level action primitives. In contrast, existing t…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: 12 pages, 8 figures, accepted by CoRL 2023

  3. arXiv:2304.09302  [pdf, other]

    cs.RO cs.CV

    CabiNet: Scaling Neural Collision Detection for Object Rearrangement with Procedural Scene Generation

    Authors: Adithyavairavan Murali, Arsalan Mousavian, Clemens Eppner, Adam Fishman, Dieter Fox

    Abstract: We address the important problem of generalizing robotic rearrangement to clutter without any explicit object models. We first generate over 650K cluttered scenes - orders of magnitude more than prior work - in diverse everyday environments, such as cabinets and shelves. We render synthetic partial point clouds from this data and use it to train our CabiNet model architecture. CabiNet is a collisi…

    Submitted 18 April, 2023; originally announced April 2023.

  4. arXiv:2302.10745  [pdf, other]

    cs.RO

    Constrained Generative Sampling of 6-DoF Grasps

    Authors: Jens Lundell, Francesco Verdoja, Tran Nguyen Le, Arsalan Mousavian, Dieter Fox, Ville Kyrki

    Abstract: Most state-of-the-art data-driven grasp sampling methods propose stable and collision-free grasps uniformly on the target object. For bin-picking, executing any of those reachable grasps is sufficient. However, for completing specific tasks, such as squeezing out liquid from a bottle, we want the grasp to be on a specific part of the object's body while avoiding other locations, such as the cap. T…

    Submitted 17 August, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: Accepted at the International Conference on Intelligent Robots and Systems (IROS 2023)

  5. arXiv:2212.06870  [pdf, other]

    cs.CV cs.RO

    MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare

    Authors: Yann Labbé, Lucas Manuelli, Arsalan Mousavian, Stephen Tyree, Stan Birchfield, Jonathan Tremblay, Justin Carpentier, Mathieu Aubry, Dieter Fox, Josef Sivic

    Abstract: We introduce MegaPose, a method to estimate the 6D pose of novel objects, that is, objects unseen during training. At inference time, the method only assumes knowledge of (i) a region of interest displaying the object in the image and (ii) a CAD model of the observed object. The contributions of this work are threefold. First, we present a 6D pose refiner based on a render&compare strategy which c…

    Submitted 13 December, 2022; originally announced December 2022.

    Comments: CoRL 2022

  6. arXiv:2210.13638  [pdf, other]

    cs.RO

    Learning Robust Real-World Dexterous Grasping Policies via Implicit Shape Augmentation

    Authors: Zoey Qiuyu Chen, Karl Van Wyk, Yu-Wei Chao, Wei Yang, Arsalan Mousavian, Abhishek Gupta, Dieter Fox

    Abstract: Dexterous robotic hands have the capability to interact with a wide variety of household objects to perform tasks like grasping. However, learning robust real world grasping policies for arbitrary objects has proven challenging due to the difficulty of generating high quality training data. In this work, we propose a learning system (ISAGrasp) for leveraging a small number of human demonstrations…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: Accepted by CoRL2022

  7. arXiv:2209.14284  [pdf, other]

    cs.CV

    DexTransfer: Real World Multi-fingered Dexterous Grasping with Minimal Human Demonstrations

    Authors: Zoey Qiuyu Chen, Karl Van Wyk, Yu-Wei Chao, Wei Yang, Arsalan Mousavian, Abhishek Gupta, Dieter Fox

    Abstract: Teaching a multi-fingered dexterous robot to grasp objects in the real world has been a challenging problem due to its high dimensional state and action space. We propose a robot-learning system that can take a small number of human demonstrations and learn to grasp unseen object poses given partially occluded observations. Our system leverages a small motion capture dataset and generates a large…

    Submitted 28 September, 2022; originally announced September 2022.

  8. arXiv:2209.11302  [pdf, other]

    cs.RO cs.AI cs.CL cs.LG

    ProgPrompt: Generating Situated Robot Task Plans using Large Language Models

    Authors: Ishika Singh, Valts Blukis, Arsalan Mousavian, Ankit Goyal, Danfei Xu, Jonathan Tremblay, Dieter Fox, Jesse Thomason, Animesh Garg

    Abstract: Task planning can require defining myriad domain knowledge about the world in which a robot needs to act. To ameliorate that effort, large language models (LLMs) can be used to score potential next actions during task planning, and even generate action sequences directly, given an instruction in natural language with no additional domain information. However, such methods either require enumeratin…

    Submitted 22 September, 2022; originally announced September 2022.

  9. arXiv:2207.02556  [pdf, other]

    cs.RO

    Deep Learning Approaches to Grasp Synthesis: A Review

    Authors: Rhys Newbury, Morris Gu, Lachlan Chumbley, Arsalan Mousavian, Clemens Eppner, Jürgen Leitner, Jeannette Bohg, Antonio Morales, Tamim Asfour, Danica Kragic, Dieter Fox, Akansel Cosgun

    Abstract: Grasping is the process of picking up an object by applying forces and torques at a set of contacts. Recent advances in deep-learning methods have allowed rapid progress in robotic object grasping. In this systematic review, we surveyed the publications over the last decade, with a particular interest in grasping an object using all 6 degrees of freedom of the end-effector pose. Our review found f…

    Submitted 4 May, 2023; v1 submitted 6 July, 2022; originally announced July 2022.

    Comments: 20 pages. Accepted to T-RO

  10. arXiv:2202.00732  [pdf, other]

    cs.RO cs.CV

    IFOR: Iterative Flow Minimization for Robotic Object Rearrangement

    Authors: Ankit Goyal, Arsalan Mousavian, Chris Paxton, Yu-Wei Chao, Brian Okorn, Jia Deng, Dieter Fox

    Abstract: Accurate object rearrangement from vision is a crucial problem for a wide variety of real-world robotics applications in unstructured environments. We propose IFOR, Iterative Flow Minimization for Robotic Object Rearrangement, an end-to-end method for the challenging problem of object rearrangement for unknown objects given an RGBD image of the original and final scenes. First, we learn an optical…

    Submitted 1 February, 2022; originally announced February 2022.

  11. arXiv:2106.15711  [pdf, other]

    cs.CV cs.RO

    RICE: Refining Instance Masks in Cluttered Environments with Graph Neural Networks

    Authors: Christopher Xie, Arsalan Mousavian, Yu Xiang, Dieter Fox

    Abstract: Segmenting unseen object instances in cluttered environments is an important capability that robots need when functioning in unstructured environments. While previous methods have exhibited promising results, they still tend to provide incorrect results in highly cluttered scenes. We postulate that a network architecture that encodes relations between objects at a high-level can be beneficial. Thu…

    Submitted 29 June, 2021; originally announced June 2021.

  12. arXiv:2106.01352  [pdf, other]

    cs.RO cs.AI cs.LG

    NeRP: Neural Rearrangement Planning for Unknown Objects

    Authors: Ahmed H. Qureshi, Arsalan Mousavian, Chris Paxton, Michael C. Yip, Dieter Fox

    Abstract: Robots will be expected to manipulate a wide variety of objects in complex and arbitrary ways as they become more widely used in human environments. As such, the rearrangement of objects has been noted to be an important benchmark for AI capabilities in recent years. We propose NeRP (Neural Rearrangement Planning), a deep learning based approach for multi-step neural object rearrangement planning…

    Submitted 4 June, 2021; v1 submitted 2 June, 2021; originally announced June 2021.

    Comments: Please refer to our supplementary video: https://youtu.be/CJb1IzH94eo

  13. arXiv:2104.13542  [pdf, other]

    cs.RO

    STORM: An Integrated Framework for Fast Joint-Space Model-Predictive Control for Reactive Manipulation

    Authors: Mohak Bhardwaj, Balakumar Sundaralingam, Arsalan Mousavian, Nathan Ratliff, Dieter Fox, Fabio Ramos, Byron Boots

    Abstract: Sampling-based model-predictive control (MPC) is a promising tool for feedback control of robots with complex, non-smooth dynamics, and cost functions. However, the computationally demanding nature of sampling-based MPC algorithms has been a key bottleneck in their application to high-dimensional robotic manipulation problems in the real world. Previous methods have addressed this issue by running…

    Submitted 14 September, 2021; v1 submitted 27 April, 2021; originally announced April 2021.

    Comments: Accepted for oral presentation at the Conference on Robot Learning (CoRL), 2021. Code available at: https://github.com/NVlabs/storm

    Journal ref: 5th Annual Conference on Robot Learning, 2021

  14. arXiv:2104.00622  [pdf, other]

    cs.CV cs.RO

    RGB-D Local Implicit Function for Depth Completion of Transparent Objects

    Authors: Luyang Zhu, Arsalan Mousavian, Yu Xiang, Hammad Mazhar, Jozef van Eenbergen, Shoubhik Debnath, Dieter Fox

    Abstract: Majority of the perception methods in robotics require depth information provided by RGB-D cameras. However, standard 3D sensors fail to capture depth of transparent objects due to refraction and absorption of light. In this paper, we introduce a new approach for depth completion of transparent objects from a single RGB-D image. Key to our approach is a local implicit neural representation built o…

    Submitted 1 April, 2021; originally announced April 2021.

    Comments: CVPR 2021

  15. arXiv:2103.16747  [pdf, other]

    cs.RO

    Sim-to-Real for Robotic Tactile Sensing via Physics-Based Simulation and Learned Latent Projections

    Authors: Yashraj Narang, Balakumar Sundaralingam, Miles Macklin, Arsalan Mousavian, Dieter Fox

    Abstract: Tactile sensing is critical for robotic grasping and manipulation of objects under visual occlusion. However, in contrast to simulations of robot arms and cameras, current simulations of tactile sensors have limited accuracy, speed, and utility. In this work, we develop an efficient 3D finite element method (FEM) model of the SynTouch BioTac sensor using an open-access, GPU-based robotics simulato…

    Submitted 30 March, 2021; originally announced March 2021.

    Comments: To be published in Proc. IEEE International Conference on Robotics and Automation (ICRA)

  16. arXiv:2103.14127  [pdf, other]

    cs.RO cs.CV

    Contact-GraspNet: Efficient 6-DoF Grasp Generation in Cluttered Scenes

    Authors: Martin Sundermeyer, Arsalan Mousavian, Rudolph Triebel, Dieter Fox

    Abstract: Grasping unseen objects in unconstrained, cluttered environments is an essential skill for autonomous robotic manipulation. Despite recent progress in full 6-DoF grasp learning, existing approaches often consist of complex sequential pipelines that possess several potential failure points and run-times unsuitable for closed-loop grasping. Therefore, we propose an end-to-end network that efficientl…

    Submitted 25 March, 2021; originally announced March 2021.

    Comments: ICRA 2021. Video of the real world experiments and code are available at https://research.nvidia.com/publication/2021-03_Contact-GraspNet%3A--Efficient

  17. arXiv:2101.05452  [pdf, other]

    cs.RO

    Interpreting and Predicting Tactile Signals for the SynTouch BioTac

    Authors: Yashraj S. Narang, Balakumar Sundaralingam, Karl Van Wyk, Arsalan Mousavian, Dieter Fox

    Abstract: In the human hand, high-density contact information provided by afferent neurons is essential for many human grasping and manipulation capabilities. In contrast, robotic tactile sensors, including the state-of-the-art SynTouch BioTac, are typically used to provide low-density contact information, such as contact location, center of pressure, and net force. Although useful, these data do not convey…

    Submitted 13 January, 2021; originally announced January 2021.

    Comments: Submitted to International Journal of Robotics Research (IJRR)

  18. arXiv:2011.10726  [pdf, other]

    cs.RO cs.CV

    Object Rearrangement Using Learned Implicit Collision Functions

    Authors: Michael Danielczuk, Arsalan Mousavian, Clemens Eppner, Dieter Fox

    Abstract: Robotic object rearrangement combines the skills of picking and placing objects. When object models are unavailable, typical collision-checking models may be unable to predict collisions in partial point clouds with occlusions, making generation of collision-free grasping or placement trajectories challenging. We propose a learned collision model that accepts scene and query object point clouds an…

    Submitted 26 March, 2021; v1 submitted 21 November, 2020; originally announced November 2020.

    Comments: First two authors contributed equally. 2021 IEEE International Conference on Robotics and Automation. 8 pages, 4 figures, 3 tables

  19. arXiv:2011.09584  [pdf, other]

    cs.RO cs.CV

    ACRONYM: A Large-Scale Grasp Dataset Based on Simulation

    Authors: Clemens Eppner, Arsalan Mousavian, Dieter Fox

    Abstract: We introduce ACRONYM, a dataset for robot grasp planning based on physics simulation. The dataset contains 17.7M parallel-jaw grasps, spanning 8872 objects from 262 different categories, each labeled with the grasp result obtained from a physics simulator. We show the value of this large and diverse dataset by using it to train two state-of-the-art learning-based grasp planning algorithms. Grasp p…

    Submitted 18 November, 2020; originally announced November 2020.

  20. arXiv:2011.08961  [pdf, other]

    cs.RO cs.CV

    Reactive Human-to-Robot Handovers of Arbitrary Objects

    Authors: Wei Yang, Chris Paxton, Arsalan Mousavian, Yu-Wei Chao, Maya Cakmak, Dieter Fox

    Abstract: Human-robot object handovers have been an actively studied area of robotics over the past decade; however, very few techniques and systems have addressed the challenge of handing over diverse objects with arbitrary appearance, size, shape, and rigidity. In this paper, we present a vision-based system that enables reactive human-to-robot handovers of unknown objects. Our approach combines closed-lo…

    Submitted 3 June, 2021; v1 submitted 17 November, 2020; originally announced November 2020.

    Comments: Accepted to the International Conference on Robotics and Automation (ICRA) 2021

  21. arXiv:2011.08694  [pdf, other]

    cs.RO cs.LG

    Reactive Long Horizon Task Execution via Visual Skill and Precondition Models

    Authors: Shohin Mukherjee, Chris Paxton, Arsalan Mousavian, Adam Fishman, Maxim Likhachev, Dieter Fox

    Abstract: Zero-shot execution of unseen robotic tasks is important to allowing robots to perform a wide variety of tasks in human environments, but collecting the amounts of data necessary to train end-to-end policies in the real-world is often infeasible. We describe an approach for sim-to-real training that can accomplish unseen robotic tasks using models learned in simulation to ground components of a si…

    Submitted 14 July, 2021; v1 submitted 17 November, 2020; originally announced November 2020.

    Comments: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2021

  22. arXiv:2010.00824  [pdf, other]

    cs.RO cs.CV

    Goal-Auxiliary Actor-Critic for 6D Robotic Grasping with Point Clouds

    Authors: Lirui Wang, Yu Xiang, Wei Yang, Arsalan Mousavian, Dieter Fox

    Abstract: 6D robotic grasping beyond top-down bin-picking scenarios is a challenging task. Previous solutions based on 6D grasp synthesis with robot motion planning usually operate in an open-loop setting, which are sensitive to grasp synthesis errors. In this work, we propose a new method for learning closed-loop control policies for 6D grasping. Our policy takes a segmented point cloud of an object from a…

    Submitted 30 June, 2021; v1 submitted 2 October, 2020; originally announced October 2020.

    Journal ref: The Conference on Robot Learning (CoRL 2021)

  23. arXiv:2009.01949  [pdf, ps, other]

    physics.optics

    Strong-Field Terahertz Control of Plasmon Induced Opacity in Photoexcited Metamaterial

    Authors: Ali Mousavian, Zachary J. Thompson, Byounghwak Lee, Alden N. Bradley, Milo X. Sprague, Yun-Shik Lee

    Abstract: A terahertz metamaterial consisting of radiative slot antennas and subradiant complementary split-ring resonators exhibits plasmon induced opacity in a narrow spectral range due to the destructive interference between the bright and dark modes of the coupled oscillators. Femtosecond optical excitations instantly quench the mode coupling and plasmon oscillations, injecting photocarriers into the me…

    Submitted 3 September, 2020; originally announced September 2020.

    Comments: 5 pages, 4 figures

  24. arXiv:2007.15157  [pdf, other]

    cs.RO cs.CV

    Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation

    Authors: Yu Xiang, Christopher Xie, Arsalan Mousavian, Dieter Fox

    Abstract: Segmenting unseen objects in cluttered scenes is an important skill that robots need to acquire in order to perform tasks in new environments. In this work, we propose a new method for unseen object instance segmentation by learning RGB-D feature embeddings from synthetic data. A metric learning loss function is utilized to learn to produce pixel-wise feature embeddings such that pixels from the s…

    Submitted 3 March, 2021; v1 submitted 29 July, 2020; originally announced July 2020.

    Comments: CoRL 2020

  25. arXiv:2007.08073  [pdf, other]

    cs.CV cs.RO

    Unseen Object Instance Segmentation for Robotic Environments

    Authors: Christopher Xie, Yu Xiang, Arsalan Mousavian, Dieter Fox

    Abstract: In order to function in unstructured environments, robots need the ability to recognize unseen objects. We take a step in this direction by tackling the problem of segmenting unseen object instances in tabletop environments. However, the type of large-scale real-world dataset required for this task typically does not exist for most robotic settings, which motivates the use of synthetic data. Our p…

    Submitted 13 October, 2021; v1 submitted 15 July, 2020; originally announced July 2020.

    Comments: Journal version of arXiv:1907.13236

  26. arXiv:2006.03777  [pdf, other]

    cs.RO

    Interpreting and Predicting Tactile Signals via a Physics-Based and Data-Driven Framework

    Authors: Yashraj S. Narang, Karl Van Wyk, Arsalan Mousavian, Dieter Fox

    Abstract: High-density afferents in the human hand have long been regarded as essential for human grasping and manipulation abilities. In contrast, robotic tactile sensors are typically used to provide low-density contact data, such as center-of-pressure and resultant force. Although useful, this data does not exploit the rich information content that some tactile sensors (e.g., the SynTouch BioTac) natural…

    Submitted 6 June, 2020; originally announced June 2020.

    Comments: To be published in Proc. Robotics: Science and Systems (RSS)

  27. arXiv:1912.05604  [pdf, other]

    cs.RO cs.AI cs.CV

    A Billion Ways to Grasp: An Evaluation of Grasp Sampling Schemes on a Dense, Physics-based Grasp Data Set

    Authors: Clemens Eppner, Arsalan Mousavian, Dieter Fox

    Abstract: Robot grasping is often formulated as a learning problem. With the increasing speed and quality of physics simulations, generating large-scale grasping data sets that feed learning algorithms is becoming more and more popular. An often overlooked question is how to generate the grasps that make up these data sets. In this paper, we review, classify, and compare different grasp sampling strategies.…

    Submitted 11 December, 2019; originally announced December 2019.

    Comments: For associated web page, see https://sites.google.com/view/abillionwaystograsp . 19th International Symposium of Robotics Research (ISRR)

  28. arXiv:1912.03628  [pdf, other]

    cs.RO cs.CV

    6-DOF Grasping for Target-driven Object Manipulation in Clutter

    Authors: Adithyavairavan Murali, Arsalan Mousavian, Clemens Eppner, Chris Paxton, Dieter Fox

    Abstract: Grasping in cluttered environments is a fundamental but challenging robotic skill. It requires both reasoning about unseen object parts and potential collisions with the manipulator. Most existing data-driven approaches avoid this problem by limiting themselves to top-down planar grasps which is insufficient for many real-world scenarios and greatly limits possible grasps. We present a method that…

    Submitted 20 May, 2020; v1 submitted 8 December, 2019; originally announced December 2019.

    Comments: Accepted to the International Conference on Robotics and Automation (ICRA) 2020

  29. arXiv:1912.00416  [pdf, other]

    cs.CV cs.GR cs.RO

    LatentFusion: End-to-End Differentiable Reconstruction and Rendering for Unseen Object Pose Estimation

    Authors: Keunhong Park, Arsalan Mousavian, Yu Xiang, Dieter Fox

    Abstract: Current 6D object pose estimation methods usually require a 3D model for each object. These methods also require additional training in order to incorporate new objects. As a result, they are difficult to scale to a large number of objects and cannot be directly applied to unseen objects. We propose a novel framework for 6D pose estimation of unseen objects. We present a network that reconstruct…

    Submitted 11 June, 2020; v1 submitted 1 December, 2019; originally announced December 2019.

    Comments: CVPR 2020, Project Page: https://keunhong.com/publications/latentfusion/ , Video: https://youtu.be/tlzcq1KYXd8 , Code: https://github.com/NVlabs/latentfusion . We have added experiments for LINEMOD and have updated the experiments on MOPED. We've also added more technical and implementation details to the methods section

  30. arXiv:1909.10159  [pdf, other]

    cs.RO

    Self-supervised 6D Object Pose Estimation for Robot Manipulation

    Authors: Xinke Deng, Yu Xiang, Arsalan Mousavian, Clemens Eppner, Timothy Bretl, Dieter Fox

    Abstract: To teach robots skills, it is crucial to obtain data with supervision. Since annotating real world data is time-consuming and expensive, enabling robots to learn in a self-supervised way is important. In this work, we introduce a robot system for self-supervised 6D object pose estimation. Starting from modules trained in simulation, our system is able to label real world images with accurate 6D ob…

    Submitted 6 March, 2020; v1 submitted 23 September, 2019; originally announced September 2019.

    Comments: Accepted to International Conference on Robotics and Automation (ICRA), 2020

  31. arXiv:1907.13236  [pdf, other]

    cs.CV cs.RO

    The Best of Both Modes: Separately Leveraging RGB and Depth for Unseen Object Instance Segmentation

    Authors: Christopher Xie, Yu Xiang, Arsalan Mousavian, Dieter Fox

    Abstract: In order to function in unstructured environments, robots need the ability to recognize unseen novel objects. We take a step in this direction by tackling the problem of segmenting unseen object instances in tabletop environments. However, the type of large-scale real-world dataset required for this task typically does not exist for most robotic settings, which motivates the use of synthetic data.…

    Submitted 15 July, 2020; v1 submitted 30 July, 2019; originally announced July 2019.

  32. arXiv:1905.10520  [pdf, other]

    cs.CV cs.RO

    6-DOF GraspNet: Variational Grasp Generation for Object Manipulation

    Authors: Arsalan Mousavian, Clemens Eppner, Dieter Fox

    Abstract: Generating grasp poses is a crucial component for any robot object manipulation task. In this work, we formulate the problem of grasp generation as sampling a set of grasps using a variational autoencoder and assess and refine the sampled grasps using a grasp evaluator model. Both Grasp Sampler and Grasp Refinement networks take 3D point clouds observed by a depth camera as input. We evaluate our…

    Submitted 16 August, 2019; v1 submitted 25 May, 2019; originally announced May 2019.

    Comments: Accepted to ICCV 2019. Extended camera ready version with additional experiments

  33. arXiv:1905.09304  [pdf, other]

    cs.CV cs.RO

    PoseRBPF: A Rao-Blackwellized Particle Filter for 6D Object Pose Tracking

    Authors: Xinke Deng, Arsalan Mousavian, Yu Xiang, Fei Xia, Timothy Bretl, Dieter Fox

    Abstract: Tracking 6D poses of objects from videos provides rich information to a robot in performing different tasks such as manipulation and navigation. In this work, we formulate the 6D object pose tracking problem in the Rao-Blackwellized particle filtering framework, where the 3D rotation and the 3D translation of an object are decoupled. This factorization allows our approach, called PoseRBPF, to effi…

    Submitted 22 May, 2019; originally announced May 2019.

    Comments: Accepted to RSS 2019

  34. arXiv:1805.06066  [pdf, other]

    cs.CV

    Visual Representations for Semantic Target Driven Navigation

    Authors: Arsalan Mousavian, Alexander Toshev, Marek Fiser, Jana Kosecka, Ayzaan Wahid, James Davidson

    Abstract: What is a good visual representation for autonomous agents? We address this question in the context of semantic visual navigation, which is the problem of a robot finding its way through a complex environment to a target object, e.g. go to the refrigerator. Instead of acquiring a metric semantic map of an environment and using planning for navigation, our approach learns navigation policies on top…

    Submitted 2 July, 2019; v1 submitted 15 May, 2018; originally announced May 2018.

    Comments: Accepted to ICRA 2019 and ECCV 2018 Workshop on Visual Learning and Embodied Agents in Simulation Environments

  35. arXiv:1702.07836  [pdf, other]

    cs.CV cs.RO

    Synthesizing Training Data for Object Detection in Indoor Scenes

    Authors: Georgios Georgakis, Arsalan Mousavian, Alexander C. Berg, Jana Kosecka

    Abstract: Detection of objects in cluttered indoor environments is one of the key enabling functionalities for service robots. The best performing object detection approaches in computer vision exploit deep Convolutional Neural Networks (CNN) to simultaneously detect and categorize the objects of interest in cluttered scenes. Training of such models typically requires large amounts of annotated training dat…

    Submitted 7 September, 2017; v1 submitted 25 February, 2017; originally announced February 2017.

    Comments: Added more experiments and link to project webpage

  36. arXiv:1612.00496  [pdf, other]

    cs.CV

    3D Bounding Box Estimation Using Deep Learning and Geometry

    Authors: Arsalan Mousavian, Dragomir Anguelov, John Flynn, Jana Kosecka

    Abstract: We present a method for 3D object detection and pose estimation from a single image. In contrast to current techniques that only regress the 3D orientation of an object, our method first regresses relatively stable 3D object properties using a deep convolutional neural network and then combines these estimates with geometric constraints provided by a 2D object bounding box to produce a complete 3D…

    Submitted 10 April, 2017; v1 submitted 1 December, 2016; originally announced December 2016.

    Comments: To appear in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017

  37. arXiv:1609.07826  [pdf, other]

    cs.CV cs.RO

    Multiview RGB-D Dataset for Object Instance Detection

    Authors: Georgios Georgakis, Md Alimoor Reza, Arsalan Mousavian, Phi-Hung Le, Jana Kosecka

    Abstract: This paper presents a new multi-view RGB-D dataset of nine kitchen scenes, each containing several objects in realistic cluttered environments including a subset of objects from the BigBird dataset. The viewpoints of the scenes are densely sampled and objects in the scenes are annotated with bounding boxes and in the 3D point cloud. Also, an approach for detection and recognition is presented, whi…

    Submitted 25 September, 2016; originally announced September 2016.

  38. arXiv:1609.00278  [pdf, other]

    cs.CV cs.RO

    Semantic Image Based Geolocation Given a Map

    Authors: Arsalan Mousavian, Jana Kosecka

    Abstract: The problem visual place recognition is commonly used strategy for localization. Most successful appearance based methods typically rely on a large database of views endowed with local or global image descriptors and strive to retrieve the views of the same location. The quality of the results is often affected by the density of the reference views and the robustness of the image representation wi…

    Submitted 1 September, 2016; originally announced September 2016.

  39. arXiv:1604.07480  [pdf, other]

    cs.CV

    Joint Semantic Segmentation and Depth Estimation with Deep Convolutional Networks

    Authors: Arsalan Mousavian, Hamed Pirsiavash, Jana Kosecka

    Abstract: Multi-scale deep CNNs have been used successfully for problems mapping each pixel to a label, such as depth estimation and semantic segmentation. It has also been shown that such architectures are reusable and can be used for multiple tasks. These networks are typically trained independently for each task by varying the output layer(s) and training objective. In this work we present a new model fo…

    Submitted 19 September, 2016; v1 submitted 25 April, 2016; originally announced April 2016.

  40. arXiv:1509.06033  [pdf, other]

    cs.CV

    Deep Convolutional Features for Image Based Retrieval and Scene Categorization

    Authors: Arsalan Mousavian, Jana Kosecka

    Abstract: Several recent approaches showed how the representations learned by Convolutional Neural Networks can be repurposed for novel tasks. Most commonly it has been shown that the activation features of the last fully connected layers (fc7 or fc6) of the network, followed by a linear classifier outperform the state-of-the-art on several recognition challenge datasets. Instead of recognition, this paper…

    Submitted 20 September, 2015; originally announced September 2015.

  41. arXiv:1311.7215  [pdf]

    cs.AI cs.DM

    Solving Minimum Vertex Cover Problem Using Learning Automata

    Authors: Aylin Mousavian, Alireza Rezvanian, Mohammad Reza Meybodi

    Abstract: Minimum vertex cover problem is an NP-Hard problem with the aim of finding minimum number of vertices to cover graph. In this paper, a learning automaton based algorithm is proposed to find minimum vertex cover in graph. In the proposed algorithm, each vertex of graph is equipped with a learning automaton that has two actions in the candidate or non-candidate of the corresponding vertex cover set.…

    Submitted 28 November, 2013; originally announced November 2013.

    Comments: 5 pages, 5 figures, conference

  42. The Resource Theory of Stabilizer Computation

    Authors: Victor Veitch, Seyed Ali Hamed Mousavian, Daniel Gottesman, Joseph Emerson

    Abstract: Recent results on the non-universality of fault-tolerant gate sets underline the critical role of resource states, such as magic states, to power scalable, universal quantum computation. Here we develop a resource theory, analogous to the theory of entanglement, for resources for stabilizer codes. We introduce two quantitative measures - monotones - for the amount of non-stabilizer resource. As an…

    Submitted 26 July, 2013; originally announced July 2013.

    Journal ref: New J. Phys. 16 (2014) 013009