Transferable Tactile Transformers for Representation Learning Across Diverse Sensors and Tasks

Authors: Jialiang Zhao, Yuxiang Ma, Lirui Wang, Edward H. Adelson

Abstract: This paper presents T3: Transferable Tactile Transformers, a framework for tactile representation learning that scales across multi-sensors and multi-tasks. T3 is designed to overcome the contemporary issue that camera-based tactile sensing is extremely heterogeneous, i.e. sensors are built into different form factors, and existing datasets were collected for disparate tasks. T3 captures the shared latent information across different sensor-task pairings by constructing a shared trunk transformer with sensor-specific encoders and task-specific decoders. The pre-training of T3 utilizes a novel Foundation Tactile (FoTa) dataset, which is aggregated from several open-sourced datasets and it contains over 3 million data points gathered from 13 sensors and 11 tasks. FoTa is the largest and most diverse dataset in tactile sensing to date and it is made publicly available in a unified format. Across various sensors and tasks, experiments show that T3 pre-trained with FoTa achieved zero-shot transferability in certain sensor-task pairings, can be further fine-tuned with small amounts of domain-specific data, and its performance scales with bigger network sizes. T3 is also effective as a tactile encoder for long horizon contact-rich manipulation. Results from sub-millimeter multi-pin electronics insertion tasks show that T3 achieved a task success rate 25% higher than that of policies trained with tactile encoders trained from scratch, or 53% higher than without tactile sensing. Data, code, and model checkpoints are open-sourced at https://t3.alanz.info.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2404.08227 [pdf, other]

A Passively Bendable, Compliant Tactile Palm with RObotic Modular Endoskeleton Optical (ROMEO) Fingers

Authors: Sandra Q. Liu, Edward H. Adelson

Abstract: Many robotic hands currently rely on extremely dexterous robotic fingers and a thumb joint to envelop themselves around an object. Few hands focus on the palm even though human hands greatly benefit from their central fold and soft surface. As such, we develop a novel structurally compliant soft palm, which enables more surface area contact for the objects that are pressed into it. Moreover, this design, along with the development of a new low-cost, flexible illumination system, is able to incorporate a high-resolution tactile sensing system inspired by the GelSight sensors. Concurrently, we design RObotic Modular Endoskeleton Optical (ROMEO) fingers, which are underactuated two-segment soft fingers that are able to house the new illumination system, and we integrate them into these various palm configurations. The resulting robotic hand is slightly bigger than a baseball and represents one of the first soft robotic hands with actuated fingers and a passively compliant palm, all of which have high-resolution tactile sensing. This design also potentially helps researchers discover and explore more soft-rigid tactile robotic hand designs with greater capabilities in the future. The supplementary video can be found here: https://youtu.be/RKfIFiewqsg

Submitted 11 April, 2024; originally announced April 2024.

Comments: Accepted to ICRA 2024

arXiv:2403.14887 [pdf, other]

GelLink: A Compact Multi-phalanx Finger with Vision-based Tactile Sensing and Proprioception

Authors: Yuxiang Ma, Jialiang Zhao, Edward Adelson

Abstract: Compared to fully-actuated robotic end-effectors, underactuated ones are generally more adaptive, robust, and cost-effective. However, state estimation for underactuated hands is usually more challenging. Vision-based tactile sensors, like GelSight, can mitigate this issue by providing high-resolution tactile sensing and accurate proprioceptive sensing. As such, we present GelLink, a compact, underactuated, linkage-driven robotic finger with low-cost, high-resolution vision-based tactile sensing and proprioceptive sensing capabilities. In order to reduce the amount of embedded hardware, i.e. the cameras and motors, we optimize the linkage transmission with a planar linkage mechanism simulator and develop a planar reflection simulator to simplify the tactile sensing hardware. As a result, GelLink only requires one motor to actuate the three phalanges, and one camera to capture tactile signals along the entire finger. Overall, GelLink is a compact robotic finger that shows adaptability and robustness when performing grasping tasks. The integration of vision-based tactile sensors can significantly enhance the capabilities of underactuated fingers and potentially broaden their future usage.

Submitted 25 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

Comments: Supplement video: https://www.youtube.com/watch?v=hZwUpAig5C0 . 7 pages, 9 figures. ICRA 2024 (IEEE International Conference on Robotics and Automation)

arXiv:2403.04638 [pdf, other]

Scalable, Simulation-Guided Compliant Tactile Finger Design

Authors: Yuxiang Ma, Arpit Agarwal, Sandra Q. Liu, Wenzhen Yuan, Edward H. Adelson

Abstract: Compliant grippers enable robots to work with humans in unstructured environments. In general, these grippers can improve with tactile sensing to estimate the state of objects around them to precisely manipulate objects. However, co-designing compliant structures with high-resolution tactile sensing is a challenging task. We propose a simulation framework for the end-to-end forward design of GelSight Fin Ray sensors. Our simulation framework consists of mechanical simulation using the finite element method (FEM) and optical simulation including physically based rendering (PBR). To simulate the fluorescent paint used in these GelSight Fin Rays, we propose an efficient method that can be directly integrated in PBR. Using the simulation framework, we investigate design choices available in the compliant grippers, namely gel pad shapes, illumination conditions, Fin Ray gripper sizes, and Fin Ray stiffness. This infrastructure enables faster design and prototype time frames of new Fin Ray sensors that have various sensing areas, ranging from 48 mm $\times$ 18 mm to 70 mm $\times$ 35 mm. Given the parameters we choose, we can thus optimize different Fin Ray designs and show their utility in grasping day-to-day objects.

Submitted 7 March, 2024; originally announced March 2024.

Comments: Yuxiang Ma, Arpit Agarwal, and Sandra Q. Liu contributed equally to this work. Project video: https://youtu.be/CnTUTA5cfMw . 7 pages, 11 figures, 2024 IEEE International Conference on Soft Robotics (RoboSoft)

arXiv:2402.02511 [pdf, other]

PoCo: Policy Composition from and for Heterogeneous Robot Learning

Authors: Lirui Wang, Jialiang Zhao, Yilun Du, Edward H. Adelson, Russ Tedrake

Abstract: Training general robotic policies from heterogeneous data for different tasks is a significant challenge. Existing robotic datasets vary in different modalities such as color, depth, tactile, and proprioceptive information, and are collected in different domains such as simulation, real robots, and human videos. Current methods usually collect and pool all data from one domain to train a single policy to handle such heterogeneity in tasks and domains, which is prohibitively expensive and difficult. In this work, we present a flexible approach, dubbed Policy Composition, to combine information across such diverse modalities and domains for learning scene-level and task-level generalized manipulation skills, by composing different data distributions represented with diffusion models. Our method can use task-level composition for multi-task manipulation and be composed with analytic cost functions to adapt policy behaviors at inference time. We train our method on simulation, human, and real robot data and evaluate in tool-use tasks. The composed policy achieves robust and dexterous performance under varying scenes and tasks and outperforms baselines from a single data source in both simulation and real-world experiments. See https://liruiw.github.io/policycomp for more details.

Submitted 27 May, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

arXiv:2309.10886 [pdf, other]

GelSight Svelte Hand: A Three-finger, Two-DoF, Tactile-rich, Low-cost Robot Hand for Dexterous Manipulation

Authors: Jialiang Zhao, Edward H. Adelson

Abstract: This paper presents GelSight Svelte Hand, a novel 3-finger 2-DoF tactile robotic hand that is capable of performing precision grasps, power grasps, and intermediate grasps. Rich tactile signals are obtained from one camera on each finger, with an extended sensing area similar to the full length of a human finger. Each finger of GelSight Svelte Hand is supported by a semi-rigid endoskeleton and covered with soft silicone materials, which provide both rigidity and compliance. We describe the design, fabrication, functionalities, and tactile sensing capability of GelSight Svelte Hand in this paper. More information is available on our website: https://gelsight-svelte.alanz.info

Submitted 19 September, 2023; originally announced September 2023.

Comments: Submitted and accepted to IROS 2023 workshop on Visuo-Tactile Perception, Learning, Control for Manipulation and HRI (IROS RoboTac 2023)

arXiv:2309.10885 [pdf, other]

GelSight Svelte: A Human Finger-shaped Single-camera Tactile Robot Finger with Large Sensing Coverage and Proprioceptive Sensing

Authors: Jialiang Zhao, Edward H. Adelson

Abstract: Camera-based tactile sensing is a low-cost, popular approach to obtain highly detailed contact geometry information. However, most existing camera-based tactile sensors are fingertip sensors, and longer fingers often require extraneous elements to obtain an extended sensing area similar to the full length of a human finger. Moreover, existing methods to estimate proprioceptive information such as total forces and torques applied on the finger from camera-based tactile sensors are not effective when the contact geometry is complex. We introduce GelSight Svelte, a curved, human finger-sized, single-camera tactile sensor that is capable of both tactile and proprioceptive sensing over a large area. GelSight Svelte uses curved mirrors to achieve the desired shape and sensing coverage. Proprioceptive information, such as the total bending and twisting torques applied on the finger, is reflected as deformations on the flexible backbone of GelSight Svelte, which are also captured by the camera. We train a convolutional neural network to estimate the bending and twisting torques from the captured images. We conduct gel deformation experiments at various locations of the finger to evaluate the tactile sensing capability and proprioceptive sensing accuracy. To demonstrate the capability and potential uses of GelSight Svelte, we conduct an object holding task with three different grasping modes that utilize different areas of the finger. More information is available on our website: https://gelsight-svelte.alanz.info

Submitted 19 September, 2023; originally announced September 2023.

Comments: Submitted and accepted to 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023)

arXiv:2306.09946 [pdf, other]

Tactile-Reactive Roller Grasper

Authors: Shenli Yuan, Shaoxiong Wang, Radhen Patel, Megha Tippur, Connor Yako, Edward Adelson, Kenneth Salisbury

Abstract: Manipulation of objects within a robot's hand is one of the most important challenges in achieving robot dexterity. The "Roller Graspers" refers to a family of non-anthropomorphic hands utilizing motorized, rolling fingertips to achieve in-hand manipulation. These graspers manipulate grasped objects by commanding the rollers to exert forces that propel the object in the desired motion directions. In this paper, we explore the possibility of robot in-hand manipulation through tactile-guided rolling. We do so by developing the Tactile-Reactive Roller Grasper (TRRG), which incorporates camera-based tactile sensing with compliant, steerable cylindrical fingertips, with accompanying sensor information processing and control strategies. We demonstrated that the combination of tactile feedback and the actively rolling surfaces enables a variety of robust in-hand manipulation applications. In addition, we also demonstrated object reconstruction techniques using tactile-guided rolling. A controlled experiment was conducted to provide insights on the benefits of tactile-reactive rollers for manipulation. We considered two manipulation cases: when the fingers are manipulating purely through rolling and when they are periodically breaking and reestablishing contact as in regrasping. We found that tactile-guided rolling can improve the manipulation robustness by allowing the grasper to perform necessary fine grip adjustments in both manipulation cases, indicating that hybrid rolling fingertip and finger-gaiting designs may be a promising research direction.

Submitted 16 June, 2023; originally announced June 2023.

arXiv:2304.04268 [pdf, other]

GelSight360: An Omnidirectional Camera-Based Tactile Sensor for Dexterous Robotic Manipulation

Authors: Megha H. Tippur, Edward H. Adelson

Abstract: Camera-based tactile sensors have shown great promise in enhancing a robot's ability to perform a variety of dexterous manipulation tasks. Advantages of their use can be attributed to the high resolution tactile data and 3D depth map reconstructions they can provide. Unfortunately, many of these tactile sensors use either a flat sensing surface, sense on only one side of the sensor's body, or have a bulky form-factor, making it difficult to integrate the sensors with a variety of robotic grippers. Of the camera-based sensors that do have all-around, curved sensing surfaces, many cannot provide 3D depth maps; those that do often require optical designs specified to a particular sensor geometry. In this work, we introduce GelSight360, a fingertip-like, omnidirectional, camera-based tactile sensor capable of producing depth maps of objects deforming the sensor's surface. In addition, we introduce a novel cross-LED lighting scheme that can be implemented in different all-around sensor geometries and sizes, allowing the sensor to easily be reconfigured and attached to different grippers of varying DOFs. With this work, we enable roboticists to quickly and easily customize high resolution tactile sensors to fit their robotic system's needs.

Submitted 9 April, 2023; originally announced April 2023.

arXiv:2303.17935 [pdf, other]

GelSight EndoFlex: A Soft Endoskeleton Hand with Continuous High-Resolution Tactile Sensing

Authors: Sandra Q. Liu, Leonardo Zamora Yañez, Edward H. Adelson

Abstract: We describe a novel three-finger robot hand that has high resolution tactile sensing along the entire length of each finger. The fingers are compliant, constructed with a soft shell supported with a flexible endoskeleton. Each finger contains two cameras, allowing tactile data to be gathered along the front and side surfaces of the fingers. The gripper can perform an enveloping grasp of an object and extract a large amount of rich tactile data in a single grasp. By capturing data from many parts of the grasped object at once, we can do object recognition with a single grasp rather than requiring multiple touches. We describe our novel design and construction techniques which allow us to simultaneously satisfy the requirements of compliance and strength, and high resolution tactile sensing over large areas. The supplementary video can be found here: https://youtu.be/H1OYADtgj9k

Submitted 31 March, 2023; originally announced March 2023.

Comments: Accepted to IEEE Conference on Soft Robotics (RoboSoft) 2023

arXiv:2303.14883 [pdf, other]

GelSight Baby Fin Ray: A Compact, Compliant, Flexible Finger with High-Resolution Tactile Sensing

Authors: Sandra Q. Liu, Yuxiang Ma, Edward H. Adelson

Abstract: The synthesis of tactile sensing with compliance is essential to many fields, from agricultural usages like fruit picking, to sustainability practices such as sorting recycling, to the creation of safe home-care robots for the elderly to age with dignity. From tactile sensing, we can discern material properties, recognize textures, and determine softness, while with compliance, we are able to securely and safely interact with the objects and the environment around us. These two abilities can culminate into a useful soft robotic gripper, such as the original GelSight Fin Ray, which is able to grasp a large variety of different objects and also perform a simple household manipulation task: wine glass reorientation. Although the original GelSight Fin Ray solves the problem of interfacing a generally rigid, high-resolution sensor with a soft, compliant structure, we can improve the robustness of the sensor and implement techniques that make such camera-based tactile sensors applicable to a wider variety of soft robot designs. We first integrate flexible mirrors and incorporate the rigid electronic components into the base of the gripper, which greatly improves the compliance of the Fin Ray structure. Then, we synthesize a flexible and high-elongation silicone adhesive-based fluorescent paint, which can provide good quality 2D tactile localization results for our sensor. Finally, we incorporate all of these techniques into a new design: the Baby Fin Ray, which we use to dig through clutter, and perform successful classification of nuts in their shells. The supplementary video can be found here: https://youtu.be/_oD_QFtYTPM

Submitted 26 March, 2023; originally announced March 2023.

Comments: Accepted to IEEE Conference of Soft Robotics (RoboSoft) 2023

arXiv:2303.13482 [pdf, other]

TactoFind: A Tactile Only System for Object Retrieval

Authors: Sameer Pai, Tao Chen, Megha Tippur, Edward Adelson, Abhishek Gupta, Pulkit Agrawal

Abstract: We study the problem of object retrieval in scenarios where visual sensing is absent, object shapes are unknown beforehand and objects can move freely, like grabbing objects out of a drawer. Successful solutions require localizing free objects, identifying specific object instances, and then grasping the identified objects, only using touch feedback. Unlike vision, where cameras can observe the entire scene, touch sensors are local and only observe parts of the scene that are in contact with the manipulator. Moreover, information gathering via touch sensors necessitates applying forces on the touched surface which may disturb the scene itself. Reasoning with touch, therefore, requires careful exploration and integration of information over time -- a challenge we tackle. We present a system capable of using sparse tactile feedback from fingertip touch sensors on a dexterous hand to localize, identify and grasp novel objects without any visual feedback. Videos are available at https://taochenshh.github.io/projects/tactofind.

Submitted 23 March, 2023; originally announced March 2023.

Comments: Accepted in ICRA 2023

arXiv:2303.07997 [pdf, other]

FingerSLAM: Closed-loop Unknown Object Localization and Reconstruction from Visuo-tactile Feedback

Authors: Jialiang Zhao, Maria Bauza, Edward H. Adelson

Abstract: In this paper, we address the problem of using visuo-tactile feedback for 6-DoF localization and 3D reconstruction of unknown in-hand objects. We propose FingerSLAM, a closed-loop factor graph-based pose estimator that combines local tactile sensing at finger-tip and global vision sensing from a wrist-mount camera. FingerSLAM is constructed with two constituent pose estimators: a multi-pass refined tactile-based pose estimator that captures movements from detailed local textures, and a single-pass vision-based pose estimator that predicts from a global view of the object. We also design a loop closure mechanism that actively matches current vision and tactile images to previously stored key-frames to reduce accumulated error. FingerSLAM incorporates the two sensing modalities of tactile and vision, as well as the loop closure mechanism with a factor graph-based optimization framework. Such a framework produces an optimized pose estimation solution that is more accurate than the standalone estimators. The estimated poses are then used to reconstruct the shape of the unknown object incrementally by stitching the local point clouds recovered from tactile images. We train our system on real-world data collected with 20 objects. We demonstrate reliable visuo-tactile pose estimation and shape reconstruction through quantitative and qualitative real-world evaluations on 6 objects that are unseen during training.

Submitted 14 March, 2023; originally announced March 2023.

Comments: Submitted and accepted to 2023 IEEE International Conference on Robotics and Automation (ICRA 2023)

arXiv:2212.05108 [pdf, other]

Visuotactile Affordances for Cloth Manipulation with Local Control

Authors: Neha Sunil, Shaoxiong Wang, Yu She, Edward Adelson, Alberto Rodriguez

Abstract: Cloth in the real world is often crumpled, self-occluded, or folded in on itself such that key regions, such as corners, are not directly graspable, making manipulation difficult. We propose a system that leverages visual and tactile perception to unfold the cloth via grasping and sliding on edges. By doing so, the robot is able to grasp two adjacent corners, enabling subsequent manipulation tasks like folding or hanging. As components of this system, we develop tactile perception networks that classify whether an edge is grasped and estimate the pose of the edge. We use the edge classification network to supervise a visuotactile edge grasp affordance network that can grasp edges with a 90% success rate. Once an edge is grasped, we demonstrate that the robot can slide along the cloth to the adjacent corner using tactile pose estimation/control in real time. See http://nehasunil.com/visuotactile/visuotactile.html for videos.

Submitted 9 December, 2022; originally announced December 2022.

Comments: Accepted at CoRL 2022. Project website: http://nehasunil.com/visuotactile/visuotactile.html

arXiv:2212.03858 [pdf, other]

See, Hear, and Feel: Smart Sensory Fusion for Robotic Manipulation

Authors: Hao Li, Yizhi Zhang, Junzhe Zhu, Shaoxiong Wang, Michelle A Lee, Huazhe Xu, Edward Adelson, Li Fei-Fei, Ruohan Gao, Jiajun Wu

Abstract: Humans use all of their senses to accomplish different tasks in everyday activities. In contrast, existing work on robotic manipulation mostly relies on one, or occasionally two modalities, such as vision and touch. In this work, we systematically study how visual, auditory, and tactile perception can jointly help robots to solve complex manipulation tasks. We build a robot system that can see with a camera, hear with a contact microphone, and feel with a vision-based tactile sensor, with all three sensory modalities fused with a self-attention model. Results on two challenging tasks, dense packing and pouring, demonstrate the necessity and power of multisensory perception for robotic manipulation: vision displays the global status of the robot but can often suffer from occlusion, audio provides immediate feedback of key moments that are even invisible, and touch offers precise local geometry for decision making. Leveraging all three modalities, our robotic system significantly outperforms prior methods.

Submitted 8 December, 2022; v1 submitted 7 December, 2022; originally announced December 2022.

Comments: In CoRL 2022. Li and Zhang equal contribution; Gao and Wu equal advising. Project page: https://ai.stanford.edu/~rhgao/see_hear_feel/

arXiv:2211.11744 [pdf, other]

doi 10.1126/scirobotics.adc9244

Visual Dexterity: In-Hand Reorientation of Novel and Complex Object Shapes

Authors: Tao Chen, Megha Tippur, Siyang Wu, Vikash Kumar, Edward Adelson, Pulkit Agrawal

Abstract: In-hand object reorientation is necessary for performing many dexterous manipulation tasks, such as tool use in less structured environments that remain beyond the reach of current robots. Prior works built reorientation systems assuming one or many of the following: reorienting only specific objects with simple shapes, limited range of reorientation, slow or quasistatic manipulation, simulation-only results, the need for specialized and costly sensor suites, and other constraints which make the system infeasible for real-world deployment. We present a general object reorientation controller that does not make these assumptions. It uses readings from a single commodity depth camera to dynamically reorient complex and new object shapes by any rotation in real-time, with the median reorientation time being close to seven seconds. The controller is trained using reinforcement learning in simulation and evaluated in the real world on new object shapes not used for training, including the most challenging scenario of reorienting objects held in the air by a downward-facing hand that must counteract gravity during reorientation. Our hardware platform only uses open-source components that cost less than five thousand dollars. Although we demonstrate the ability to overcome assumptions in prior work, there is ample scope for improving absolute performance. For instance, the challenging duck-shaped object not used for training was dropped in 56 percent of the trials. When it was not dropped, our controller reoriented the object within 0.4 radians (23 degrees) 75 percent of the time. Videos are available at: https://taochenshh.github.io/projects/visual-dexterity.

Submitted 24 November, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

Comments: Published in Science Robotics: https://www.science.org/doi/10.1126/scirobotics.adc9244

Journal ref: Science Robotics, 8(84): eadc9244, 2023

arXiv:2204.07146 [pdf, other]

GelSight Fin Ray: Incorporating Tactile Sensing into a Soft Compliant Robotic Gripper

Authors: Sandra Q. Liu, Edward H. Adelson

Abstract: To adapt to constantly changing environments and be safe for human interaction, robots should have compliant and soft characteristics as well as the ability to sense the world around them. Even so, the incorporation of tactile sensing into a soft compliant robot, like the Fin Ray finger, is difficult due to its deformable structure. Not only does the frame need to be modified to allow room for a vision sensor, which enables intricate tactile sensing, the robot must also retain its original mechanically compliant properties. However, adding high-resolution tactile sensors to soft fingers is difficult since many sensorized fingers, such as GelSight-based ones, are rigid and function under the assumption that changes in the sensing region are only from tactile contact and not from finger compliance. A sensorized soft robotic finger needs to be able to separate its overall proprioceptive changes from its tactile information. To this end, this paper introduces the novel design of a GelSight Fin Ray, which embodies both the ability to passively adapt to any object it grasps and the ability to perform high-resolution tactile reconstruction, object orientation estimation, and marker tracking for shear and torsional forces. Having these capabilities allows soft and compliant robots to perform more manipulation tasks that require sensing. One such task the finger is able to perform successfully is a kitchen task: wine glass reorientation and placement, which is difficult to do with external vision sensors but is easy with tactile sensing. The development of this sensing technology could also potentially be applied to other soft compliant grippers, increasing their viability in many different fields.

Submitted 14 April, 2022; originally announced April 2022.

Comments: 2022 IEEE 5th International Conference on Soft Robotics (RoboSoft)

arXiv:2106.08851 [pdf, other]

GelSight Wedge: Measuring High-Resolution 3D Contact Geometry with a Compact Robot Finger

Authors: Shaoxiong Wang, Yu She, Branden Romero, Edward Adelson

Abstract: Vision-based tactile sensors have the potential to provide important contact geometry to localize the objective with visual occlusion. However, it is challenging to measure high-resolution 3D contact geometry for a compact robot finger, to simultaneously meet optical and mechanical constraints. In this work, we present the GelSight Wedge sensor, which is optimized to have a compact shape for robot fingers, while achieving high-resolution 3D reconstruction. We evaluate the 3D reconstruction under different lighting configurations, and extend the method from 3 lights to 1 or 2 lights. We demonstrate the flexibility of the design by shrinking the sensor to the size of a human finger for fine manipulation tasks. We also show the effectiveness and potential of the reconstructed 3D geometry for pose tracking in the 3D space.

Submitted 16 June, 2021; originally announced June 2021.

Comments: ICRA 2021

arXiv:2102.10230 [pdf, other]

Digger Finger: GelSight Tactile Sensor for Object Identification Inside Granular Media

Authors: Radhen Patel, Rui Ouyang, Branden Romero, Edward Adelson

Abstract: In this paper we present an early prototype of the Digger Finger that is designed to easily penetrate granular media and is equipped with the GelSight sensor. Identifying objects buried in granular media using tactile sensors is a challenging task. First, particle jamming in granular media prevents downward movement. Second, the granular media particles tend to get stuck between the sensing surface and the object of interest, distorting the actual shape of the object. To tackle these challenges we present a Digger Finger prototype. It is capable of fluidizing granular media during penetration using mechanical vibrations. It is equipped with high resolution vision based tactile sensing to identify objects buried inside granular media. We describe the experimental procedures we use to evaluate these fluidizing and buried shape recognition capabilities. A robot with such fingers can perform explosive ordnance disposal and Improvised Explosive Device (IED) detection tasks at a much finer resolution compared to techniques like Ground Penetration Radars (GPRs). Sensors like the Digger Finger will allow robotic manipulation research to move beyond only manipulating rigid objects.

Submitted 19 February, 2021; originally announced February 2021.

Comments: To appear in 17th International Symposium on Experimental Robotics. For supplemental video see https://sites.google.com/view/diggerfinger/

arXiv:2101.11812 [pdf, other]

SwingBot: Learning Physical Features from In-hand Tactile Exploration for Dynamic Swing-up Manipulation

Authors: Chen Wang, Shaoxiong Wang, Branden Romero, Filipe Veiga, Edward Adelson

Abstract: Several robot manipulation tasks are extremely sensitive to variations of the physical properties of the manipulated objects. One such task is manipulating objects by using gravity or arm accelerations, increasing the importance of mass, center of mass, and friction information. We present SwingBot, a robot that is able to learn the physical features of a held object through tactile exploration. Two exploration actions (tilting and shaking) provide the tactile information used to create a physical feature embedding space. With this embedding, SwingBot is able to predict the swing angle achieved by a robot performing dynamic swing-up manipulations on a previously unseen object. Using these predictions, it is able to search for the optimal control parameters for a desired swing-up angle. We show that with the learned physical features our end-to-end self-supervised learning pipeline is able to substantially improve the accuracy of swinging up unseen objects. We also show that objects with similar dynamics are closer to each other on the embedding space and that the embedding can be disentangled into values of specific physical properties.

Submitted 27 January, 2021; originally announced January 2021.

Comments: IROS 2020

arXiv:2005.09068 [pdf, other]

Soft, Round, High Resolution Tactile Fingertip Sensors for Dexterous Robotic Manipulation

Authors: Branden Romero, Filipe Veiga, Edward Adelson

Abstract: High resolution tactile sensors are often bulky and have shape profiles that make them awkward for use in manipulation. This becomes important when using such sensors as fingertips for dexterous multi-fingered hands, where boxy or planar fingertips limit the available set of smooth manipulation strategies. High resolution optical based sensors such as GelSight have until now been constrained to relatively flat geometries due to constraints on illumination geometry. Here, we show how to construct a rounded fingertip that utilizes a form of light piping for directional illumination. Our sensors can replace the standard rounded fingertips of the Allegro hand. They can capture high resolution maps of the contact surfaces, and can be used to support various dexterous manipulation tasks.

Submitted 18 May, 2020; originally announced May 2020.

arXiv:2002.02474 [pdf, other]

Design of a Fully Actuated Robotic Hand With Multiple Gelsight Tactile Sensors

Authors: Achu Wilson, Shaoxiong Wang, Branden Romero, Edward Adelson

Abstract: This work details the design of a novel two finger robot gripper with multiple Gelsight based optical-tactile sensors covering the inner surface of the hand. The multiple Gelsight sensors can gather the surface topology of the object from multiple views simultaneously and can also track the shear and tensile stress. In addition, other sensing modalities enable the hand to gather the thermal, acoustic and vibration information from the object being grasped. The force controlled gripper is fully actuated so that it can be used for various grasp configurations and can also be used for in-hand manipulation tasks. Here we present the design of such a gripper.

Submitted 6 February, 2020; originally announced February 2020.

arXiv:1910.02860 [pdf, other]

Cable Manipulation with a Tactile-Reactive Gripper

Authors: Yu She, Shaoxiong Wang, Siyuan Dong, Neha Sunil, Alberto Rodriguez, Edward Adelson

Abstract: Cables are complex, high dimensional, and dynamic objects. Standard approaches to manipulate them often rely on conservative strategies that involve long series of very slow and incremental deformations, or various mechanical fixtures such as clamps, pins or rings. We are interested in manipulating freely moving cables, in real time, with a pair of robotic grippers, and with no added mechanical constraints. The main contribution of this paper is a perception and control framework that moves in that direction, and uses real-time tactile feedback to accomplish the task of following a dangling cable. The approach relies on a vision-based tactile sensor, GelSight, that estimates the pose of the cable in the grip, and the friction forces during cable sliding. We achieve the behavior by combining two tactile-based controllers: 1) Cable grip controller, where a PD controller combined with a leaky integrator regulates the gripping force to maintain the frictional sliding forces close to a suitable value; and 2) Cable pose controller, where an LQR controller based on a learned linear model of the cable sliding dynamics keeps the cable centered and aligned on the fingertips to prevent the cable from falling from the grip. This behavior is possible by a reactive gripper fitted with GelSight-based high-resolution tactile sensors. The robot can follow one meter of cable in random configurations within 2-3 hand regrasps, adapting to cables of different materials and thicknesses. We demonstrate a robot grasping a headphone cable, sliding the fingers to the jack connector, and inserting it. To the best of our knowledge, this is the first implementation of real-time cable following without the aid of mechanical fixtures.

Submitted 23 June, 2020; v1 submitted 2 October, 2019; originally announced October 2019.

Comments: Accepted to RSS 2020

arXiv:1910.01287 [pdf, other]

Exoskeleton-covered soft finger with vision-based proprioception and tactile sensing

Authors: Yu She, Sandra Q. Liu, Peiyu Yu, Edward Adelson

Abstract: Soft robots offer significant advantages in adaptability, safety, and dexterity compared to conventional rigid-body robots. However, it is challenging to equip soft robots with accurate proprioception and tactile sensing due to their high flexibility and elasticity. In this work, we describe the development of a vision-based proprioceptive and tactile sensor for soft robots called GelFlex, which is inspired by previous GelSight sensing techniques. More specifically, we develop a novel exoskeleton-covered soft finger with embedded cameras and deep learning methods that enable high-resolution proprioceptive sensing and rich tactile sensing. To do so, we design features along the axial direction of the finger, which enable high-resolution proprioceptive sensing, and incorporate a reflective ink coating on the surface of the finger to enable rich tactile sensing. We design a highly underactuated exoskeleton with a tendon-driven mechanism to actuate the finger. Finally, we assemble 2 of the fingers together to form a robotic gripper and successfully perform a bar stock classification task, which requires both shape and tactile information. We train neural networks for proprioception and shape (box versus cylinder) classification using data from the embedded sensors. The proprioception CNN had over 99% accuracy on our testing set (all six joint angles were within 1 degree of error) and had an average accumulative distance error of 0.77 mm during live testing, which is better than human finger proprioception. These proposed techniques offer soft robots the high-level ability to simultaneously perceive their proprioceptive state and peripheral environment, providing potential solutions for soft robots to solve everyday manipulation tasks. We believe the methods developed in this work can be widely applied to different designs and applications.

Submitted 23 June, 2020; v1 submitted 2 October, 2019; originally announced October 2019.

Comments: Accepted to ICRA2020

arXiv:1808.03247 [pdf, other]

3D Shape Perception from Monocular Vision, Touch, and Shape Priors

Authors: Shaoxiong Wang, Jiajun Wu, Xingyuan Sun, Wenzhen Yuan, William T. Freeman, Joshua B. Tenenbaum, Edward H. Adelson

Abstract: Perceiving accurate 3D object shape is important for robots to interact with the physical world. Current research along this direction has been primarily relying on visual observations. Vision, however useful, has inherent limitations due to occlusions and the 2D-3D ambiguities, especially for perception with a monocular camera. In contrast, touch gets precise local shape information, though its efficiency for reconstructing the entire shape could be low. In this paper, we propose a novel paradigm that efficiently perceives accurate 3D object shape by incorporating visual and tactile observations, as well as prior knowledge of common object shapes learned from large-scale shape repositories. We use vision first, applying neural networks with learned shape priors to predict an object's 3D shape from a single-view color image. We then use tactile sensing to refine the shape; the robot actively touches the object regions where the visual prediction has high uncertainty. Our method efficiently builds the 3D shape of common objects from a color image and a small number of tactile explorations (around 10). Our setup is easy to apply and has the potential to help robots better perform grasping or manipulation tasks on real-world objects.

Submitted 9 August, 2018; originally announced August 2018.

Comments: IROS 2018. The first two authors contributed equally to this work

arXiv:1805.11085 [pdf, other]

doi 10.1109/LRA.2018.2852779

More Than a Feeling: Learning to Grasp and Regrasp using Vision and Touch

Authors: Roberto Calandra, Andrew Owens, Dinesh Jayaraman, Justin Lin, Wenzhen Yuan, Jitendra Malik, Edward H. Adelson, Sergey Levine

Abstract: For humans, the process of grasping an object relies heavily on rich tactile feedback. Most recent robotic grasping work, however, has been based only on visual input, and thus cannot easily benefit from feedback after initiating contact. In this paper, we investigate how a robot can learn to use tactile information to iteratively and efficiently adjust its grasp. To this end, we propose an end-to-end action-conditional model that learns regrasping policies from raw visuo-tactile data. This model -- a deep, multimodal convolutional network -- predicts the outcome of a candidate grasp adjustment, and then executes a grasp by iteratively selecting the most promising actions. Our approach requires neither calibration of the tactile sensors, nor any analytical modeling of contact forces, thus reducing the engineering effort required to obtain efficient grasping policies. We train our model with data from about 6,450 grasping trials on a two-finger gripper equipped with GelSight high-resolution tactile sensors on each finger. Across extensive experiments, our approach outperforms a variety of baselines at (i) estimating grasp adjustment outcomes, (ii) selecting efficient grasp adjustments for quick grasping, and (iii) reducing the amount of force applied at the fingers, while maintaining competitive performance. Finally, we study the choices made by our model and show that it has successfully acquired useful and interpretable grasping behaviors.

Submitted 26 July, 2018; v1 submitted 28 May, 2018; originally announced May 2018.

Comments: 8 pages. Published on IEEE Robotics and Automation Letters (RAL). Website: https://sites.google.com/view/more-than-a-feeling

arXiv:1803.00628 [pdf, other]

GelSlim: A High-Resolution, Compact, Robust, and Calibrated Tactile-sensing Finger

Authors: Elliott Donlon, Siyuan Dong, Melody Liu, Jianhua Li, Edward Adelson, Alberto Rodriguez

Abstract: This work describes the development of a high-resolution tactile-sensing finger for robot grasping. This finger, inspired by previous GelSight sensing techniques, features an integration that is slimmer, more robust, and with more homogeneous output than previous vision-based tactile sensors. To achieve a compact integration, we redesign the optical path from illumination source to camera by combining light guides and an arrangement of mirror reflections. We parameterize the optical path with geometric design variables and describe the tradeoffs between the finger thickness, the depth of field of the camera, and the size of the tactile sensing area. The sensor sustains the wear from continuous use -- and abuse -- in grasping tasks by combining tougher materials for the compliant soft gel, a textured fabric skin, a structurally rigid body, and a calibration process that maintains homogeneous illumination and contrast of the tactile images during use. Finally, we evaluate the sensor's durability along four metrics that track the signal quality during more than 3000 grasping experiments.

Submitted 15 May, 2018; v1 submitted 1 March, 2018; originally announced March 2018.

Comments: RA-L Pre-print. 8 pages

MSC Class: 70B15

arXiv:1802.10153 [pdf, other]

Slip Detection with Combined Tactile and Visual Information

Authors: Jianhua Li, Siyuan Dong, Edward Adelson

Abstract: Slip detection plays a vital role in robotic manipulation and it has long been a challenging problem in the robotic community. In this paper, we propose a new method based on deep neural network (DNN) to detect slip. The training data is acquired by a GelSight tactile sensor and a camera mounted on a gripper when we use a robot arm to grasp and lift 94 daily objects with different grasping forces and grasping positions. The DNN is trained to classify whether a slip occurred or not. To evaluate the performance of the DNN, we test 10 unseen objects in 152 grasps. A detection accuracy as high as 88.03% is achieved. It is anticipated that the accuracy can be further improved with a larger dataset. This method is beneficial for robots to make stable grasps, which can be widely applied to automatic force control, grasping strategy selection and fine manipulation.

Submitted 27 February, 2018; originally announced February 2018.

Comments: International Conference on Robotics and Automation (ICRA) 2018

arXiv:1802.07490 [pdf, other]

ViTac: Feature Sharing between Vision and Tactile Sensing for Cloth Texture Recognition

Authors: Shan Luo, Wenzhen Yuan, Edward Adelson, Anthony G. Cohn, Raul Fuentes

Abstract: Vision and touch are two of the important sensing modalities for humans and they offer complementary information for sensing the environment. Robots could also benefit from such multi-modal sensing ability. In this paper, addressing for the first time (to the best of our knowledge) texture recognition from tactile images and vision, we propose a new fusion method named Deep Maximum Covariance Analysis (DMCA) to learn a joint latent space for sharing features through vision and tactile sensing. The features of camera images and tactile data acquired from a GelSight sensor are learned by deep neural networks. But the learned features are of a high dimensionality and are redundant due to the differences between the two sensing modalities, which deteriorates the perception performance. To address this, the learned features are paired using maximum covariance analysis. Results of the algorithm on a newly collected dataset of paired visual and tactile data relating to cloth textures show that a good recognition performance of greater than 90% can be achieved by using the proposed DMCA framework. In addition, we find that the perception performance of either vision or tactile sensing can be improved by employing the shared representation space, compared to learning from unimodal data.

Submitted 13 March, 2018; v1 submitted 21 February, 2018; originally announced February 2018.

Comments: 6 pages, 5 figures, Accepted for 2018 IEEE International Conference on Robotics and Automation

arXiv:1711.00574 [pdf, other]

Active Clothing Material Perception using Tactile Sensing and Deep Learning

Authors: Wenzhen Yuan, Yuchen Mo, Shaoxiong Wang, Edward Adelson

Abstract: Humans represent and discriminate the objects in the same category using their properties, and an intelligent robot should be able to do the same. In this paper, we build a robot system that can autonomously perceive the object properties through touch. We work on the common object category of clothing. The robot moves under the guidance of an external Kinect sensor, and squeezes the clothes with a GelSight tactile sensor, then it recognizes the 11 properties of the clothing according to the tactile data. Those properties include the physical properties, like thickness, fuzziness, softness and durability, and semantic properties, like wearing season and preferred washing methods. We collect a dataset of 153 varied pieces of clothes, and conduct 6616 robot exploring iterations on them. To extract the useful information from the high-dimensional sensory output, we applied Convolutional Neural Networks (CNN) on the tactile data for recognizing the clothing properties, and on the Kinect depth images for selecting exploration locations. Experiments show that using the trained neural networks, the robot can autonomously explore the unknown clothes and learn their properties. This work proposes a new framework for active tactile perception system with vision-touch system, and has potential to enable robots to help humans with varied clothing related housework.

Submitted 25 February, 2018; v1 submitted 1 November, 2017; originally announced November 2017.

Comments: ICRA 2018 accepted

arXiv:1710.05512 [pdf, other]

The Feeling of Success: Does Touch Sensing Help Predict Grasp Outcomes?

Authors: Roberto Calandra, Andrew Owens, Manu Upadhyaya, Wenzhen Yuan, Justin Lin, Edward H. Adelson, Sergey Levine

Abstract: A successful grasp requires careful balancing of the contact forces. Deducing whether a particular grasp will be successful from indirect measurements, such as vision, is therefore quite challenging, and direct sensing of contacts through touch sensing provides an appealing avenue toward more successful and consistent robotic grasping. However, in order to fully evaluate the value of touch sensing for grasp outcome prediction, we must understand how touch sensing can influence outcome prediction accuracy when combined with other modalities. Doing so using conventional model-based techniques is exceptionally difficult. In this work, we investigate the question of whether touch sensing aids in predicting grasp outcomes within a multimodal sensing framework that combines vision and touch. To that end, we collected more than 9,000 grasping trials using a two-finger gripper equipped with GelSight high-resolution tactile sensors on each finger, and evaluated visuo-tactile deep neural network models to directly predict grasp outcomes from either modality individually, and from both modalities together. Our experimental results indicate that incorporating tactile readings substantially improves grasping performance.

Submitted 16 October, 2017; originally announced October 2017.

Comments: 10 pages, accepted at the 1st Annual Conference on Robot Learning (CoRL)

arXiv:1708.00922 [pdf, other]

doi 10.1109/IROS.2017.8202149

Improved GelSight Tactile Sensor for Measuring Geometry and Slip

Authors: Siyuan Dong, Wenzhen Yuan, Edward Adelson

Abstract: A GelSight sensor uses an elastomeric slab covered with a reflective membrane to measure tactile signals. It measures the 3D geometry and contact force information with high spatial resolution, and successfully helped many challenging robot tasks. A previous sensor, based on a semi-specular membrane, produces high resolution but with limited geometry accuracy. In this paper, we describe a new design of GelSight for robot gripper, using a Lambertian membrane and new illumination system, which gives greatly improved geometric accuracy while retaining the compact size. We demonstrate its use in measuring surface normals and reconstructing height maps using photometric stereo. We also use it for the task of slip detection, using a combination of information about relative motions on the membrane surface and the shear distortions. Using a robotic arm and a set of 37 everyday objects with varied properties, we find that the sensor can detect translational and rotational slip in general cases, and can be used to improve the stability of the grasp.

Submitted 2 August, 2017; originally announced August 2017.

Comments: IEEE/RSJ International Conference on Intelligent Robots and Systems

arXiv:1704.03955 [pdf, other]

doi 10.1109/ICRA.2017.7989116

Shape-independent Hardness Estimation Using Deep Learning and a GelSight Tactile Sensor

Authors: Wenzhen Yuan, Chenzhuo Zhu, Andrew Owens, Mandayam A. Srinivasan, Edward H. Adelson

Abstract: Hardness is among the most important attributes of an object that humans learn about through touch. However, approaches for robots to estimate hardness are limited, due to the lack of information provided by current tactile sensors. In this work, we address these limitations by introducing a novel method for hardness estimation, based on the GelSight tactile sensor, and the method does not require accurate control of contact conditions or the shape of objects. A GelSight has a soft contact interface, and provides high resolution tactile images of contact geometry, as well as contact force and slip conditions. In this paper, we try to use the sensor to measure hardness of objects with multiple shapes, under a loosely controlled contact condition. The contact is made manually or by a robot hand, while the force and trajectory are unknown and uneven. We analyze the data using a deep convolutional (and recurrent) neural network. Experiments show that the neural net model can estimate the hardness of objects with different shapes and hardness ranging from 8 to 87 in Shore 00 scale.

Submitted 12 April, 2017; originally announced April 2017.

arXiv:1704.03822 [pdf, other]

Connecting Look and Feel: Associating the visual and tactile properties of physical materials

Authors: Wenzhen Yuan, Shaoxiong Wang, Siyuan Dong, Edward Adelson

Abstract: For machines to interact with the physical world, they must understand the physical properties of objects and materials they encounter. We use fabrics as an example of a deformable material with a rich set of mechanical properties. A thin flexible fabric, when draped, tends to look different from a heavy stiff fabric. It also feels different when touched. Using a collection of 118 fabric samples, we captured color and depth images of draped fabrics along with tactile data from a high resolution touch sensor. We then sought to associate the information from vision and touch by jointly training CNNs across the three modalities. Through the CNN, each input, regardless of the modality, generates an embedding vector that records the fabric's physical property. By comparing the embeddings, our system is able to look at a fabric image and predict how it will feel, and vice versa. We also show that a system jointly trained on vision and touch data can outperform a similar system trained only on visual data when tested purely with visual inputs.

Submitted 12 April, 2017; originally announced April 2017.

arXiv:1512.08512 [pdf, other]

Visually Indicated Sounds

Authors: Andrew Owens, Phillip Isola, Josh McDermott, Antonio Torralba, Edward H. Adelson, William T. Freeman

Abstract: Objects make distinctive sounds when they are hit or scratched. These sounds reveal aspects of an object's material properties, as well as the actions that produced them. In this paper, we propose the task of predicting what sound an object makes when struck as a way of studying physical interactions within a visual scene. We present an algorithm that synthesizes sound from silent videos of people hitting and scratching objects with a drumstick. This algorithm uses a recurrent neural network to predict sound features from videos and then produces a waveform from these features with an example-based synthesis procedure. We show that the sounds predicted by our model are realistic enough to fool participants in a "real or fake" psychophysical experiment, and that they convey significant information about material properties and physical interactions.

Submitted 29 April, 2016; v1 submitted 28 December, 2015; originally announced December 2015.

arXiv:1511.06811 [pdf, other]

Learning visual groups from co-occurrences in space and time

Authors: Phillip Isola, Daniel Zoran, Dilip Krishnan, Edward H. Adelson

Abstract: We propose a self-supervised framework that learns to group visual entities based on their rate of co-occurrence in space and time. To model statistical dependencies between the entities, we set up a simple binary classification problem in which the goal is to predict if two visual primitives occur in the same spatial or temporal context. We apply this framework to three domains: learning patch affinities from spatial adjacency in images, learning frame affinities from temporal adjacency in videos, and learning photo affinities from geospatial proximity in image collections. We demonstrate that in each case the learned affinities uncover meaningful semantic groupings. From patch affinities we generate object proposals that are competitive with state-of-the-art supervised methods. From frame affinities we generate movie scene segmentations that correlate well with DVD chapter structure. Finally, from geospatial affinities we learn groups that relate well to semantic place categories.

Submitted 20 November, 2015; originally announced November 2015.

arXiv:1412.7884 [pdf, other]

Sparkle Vision: Seeing the World through Random Specular Microfacets

Authors: Zhengdong Zhang, Phillip Isola, Edward H. Adelson

Abstract: In this paper, we study the problem of reproducing the world lighting from a single image of an object covered with random specular microfacets on the surface. We show that such reflectors can be interpreted as a randomized mapping from the lighting to the image. Such specular objects have very different optical properties from both diffuse surfaces and smooth specular objects like metals, so we design a special imaging system to robustly and effectively photograph them. We present simple yet reliable algorithms to calibrate the proposed system and do the inference. We conduct experiments to verify the correctness of our model assumptions and prove the effectiveness of our pipeline.

Submitted 25 December, 2014; originally announced December 2014.

Showing 1–37 of 37 results for author: Adelson, E