Search | arXiv e-print repository

SegICP-DSR: Dense Semantic Scene Reconstruction and Registration

Authors: Jay M. Wong, Syler Wagner, Connor Lawson, Vincent Kee, Mitchell Hebert, Justin Rooney, Gian-Luca Mariottini, Rebecca Russell, Abraham Schneider, Rahul Chipalkatty, David M. S. Johnson

Abstract: To enable autonomous robotic manipulation in unstructured environments, we present SegICP-DSR, a real-time, dense, semantic scene reconstruction and pose estimation algorithm that achieves mm-level pose accuracy and standard deviation (7.9 mm, σ=7.6 mm and 1.7 deg, σ=0.7 deg) and successfully identified the object pose in 97% of test cases. This represents a 29% increase in accuracy, and a 14% increase in success rate compared to SegICP in cluttered, unstructured environments. The performance increase of SegICP-DSR arises from (1) improved deep semantic segmentation under adversarial training, (2) precise automated calibration of the camera intrinsic and extrinsic parameters, (3) viewpoint specific ray-casting of the model geometry, and (4) dense semantic ElasticFusion point clouds for registration. We benchmark the performance of SegICP-DSR on thousands of pose-annotated video frames and demonstrate its accuracy and efficacy on two tight tolerance grasping and insertion tasks using a KUKA LBR iiwa robotic arm.

Submitted 6 November, 2017; originally announced November 2017.

arXiv:1703.01661 [pdf, other]

doi 10.1109/IROS.2017.8206470

SegICP: Integrated Deep Semantic Segmentation and Pose Estimation

Authors: Jay M. Wong, Vincent Kee, Tiffany Le, Syler Wagner, Gian-Luca Mariottini, Abraham Schneider, Lei Hamilton, Rahul Chipalkatty, Mitchell Hebert, David M. S. Johnson, Jimmy Wu, Bolei Zhou, Antonio Torralba

Abstract: Recent robotic manipulation competitions have highlighted that sophisticated robots still struggle to achieve fast and reliable perception of task-relevant objects in complex, realistic scenarios. To improve these systems' perceptive speed and robustness, we present SegICP, a novel integrated solution to object recognition and pose estimation. SegICP couples convolutional neural networks and multi-hypothesis point cloud registration to achieve both robust pixel-wise semantic segmentation as well as accurate and real-time 6-DOF pose estimation for relevant objects. Our architecture achieves 1 cm position error and $<5^\circ$ angle error in real time without an initial seed. We evaluate and benchmark SegICP against an annotated dataset generated by motion capture.

Submitted 5 September, 2017; v1 submitted 5 March, 2017; originally announced March 2017.

Comments: IROS camera-ready

arXiv:1611.00201 [pdf, other]

Towards Lifelong Self-Supervision: A Deep Learning Direction for Robotics

Authors: Jay M. Wong

Abstract: Despite outstanding success in vision amongst other domains, many of the recent deep learning approaches have evident drawbacks for robots. This manuscript surveys recent work in the literature that pertains to applying deep learning systems to the robotics domain, either as a means of estimation or as a tool to resolve motor commands directly from raw percepts. These recent advances are only a piece of the puzzle. We suggest that deep learning as a tool alone is insufficient in building a unified framework to acquire general intelligence. For this reason, we complement our survey with insights from cognitive development and refer to ideas from classical control theory, producing an integrated direction for a lifelong learning architecture.

Submitted 1 November, 2016; originally announced November 2016.

arXiv:1607.04376 [pdf, other]

Intrinsically Motivated Multimodal Structure Learning

Authors: Jay Ming Wong, Roderic A. Grupen

Abstract: We present a long-term intrinsically motivated structure learning method for modeling transition dynamics during controlled interactions between a robot and semi-permanent structures in the world. In particular, we discuss how partially-observable state is represented using distributions over a Markovian state and build models of objects that predict how state distributions change in response to interactions with such objects. These structures serve as the basis for a number of possible future tasks defined as Markov Decision Processes (MDPs). The approach is an example of a structure learning technique applied to a multimodal affordance representation that yields a population of forward models for use in planning. We evaluate the approach using experiments on a bimanual mobile manipulator (uBot-6) that show the performance of model acquisition as the number of transition actions increases.

Submitted 15 July, 2016; originally announced July 2016.

Showing 1–4 of 4 results for author: Wong, J M