-
SAID-NeRF: Segmentation-AIDed NeRF for Depth Completion of Transparent Objects
Authors:
Avinash Ummadisingu,
Jongkeum Choi,
Koki Yamane,
Shimpei Masuda,
Naoki Fukaya,
Kuniyuki Takahashi
Abstract:
Acquiring accurate depth information of transparent objects using off-the-shelf RGB-D cameras is a well-known challenge in Computer Vision and Robotics. Depth estimation/completion methods are typically employed and trained on datasets with quality depth labels acquired from either simulation, additional sensors or specialized data collection setups and known 3D models. However, acquiring reliable depth information for datasets at scale is not straightforward, limiting training scalability and generalization. Neural Radiance Fields (NeRFs) are learning-free approaches and have demonstrated wide success in novel view synthesis and shape recovery. However, heuristics and controlled environments (lights, backgrounds, etc.) are often required to accurately capture specular surfaces. In this paper, we propose using Visual Foundation Models (VFMs) for segmentation in a zero-shot, label-free way to guide the NeRF reconstruction process for these objects via the simultaneous reconstruction of semantic fields and extensions to increase robustness. Our proposed method Segmentation-AIDed NeRF (SAID-NeRF) shows significant performance on depth completion datasets for transparent objects and robotic grasping.
Submitted 28 March, 2024;
originally announced March 2024.
-
Precise Well-plate Placing Utilizing Contact During Sliding with Tactile-based Pose Estimation for Laboratory Automation
Authors:
Sameer Pai,
Kuniyuki Takahashi,
Shimpei Masuda,
Naoki Fukaya,
Koki Yamane,
Avinash Ummadisingu
Abstract:
Micro well-plates are an apparatus commonly used in chemical and biological experiments that are a few centimeters thick and contain wells or divots. In this paper, we aim to solve the task of placing the well-plate onto a well-plate holder (referred to as holder). This task is challenging due to the holder's raised grooves being a few millimeters in height, with a clearance of less than 1 mm between the well-plate and holder, thus requiring precise control during placing. Our placing task has the following challenges: 1) the holder's detected pose is uncertain; 2) the required accuracy is at the millimeter to sub-millimeter level due to the raised groove's shallow height and small clearance; 3) the holder is not fixed to a desk and is susceptible to movement from external forces. To address these challenges, we developed methods including a) using tactile sensors for accurate pose estimation of the grasped well-plate to handle issue (1); b) sliding the well-plate onto the target holder while maintaining contact with the holder's groove and estimating its orientation for accurate alignment. This allows for high precision control (addressing issue (2)) and prevents displacement of the holder during placement (addressing issue (3)). We demonstrate a high success rate for the well-plate placing task, even under noisy observation of the holder's pose.
Submitted 31 March, 2024; v1 submitted 28 September, 2023;
originally announced September 2023.
-
Two-fingered Hand with Gear-type Synchronization Mechanism with Magnet for Improved Small and Offset Objects Grasping: F2 Hand
Authors:
Naoki Fukaya,
Avinash Ummadisingu,
Kuniyuki Takahashi,
Guilherme Maeda,
Shin-ichi Maeda
Abstract:
A problem that plagues robotic grasping is the misalignment of the object and gripper due to difficulties in precise localization, actuation, etc. Under-actuated robotic hands with compliant mechanisms are used to adapt and compensate for these inaccuracies. However, these mechanisms come at the cost of controllability and coordination. For instance, adaptive functions that let the fingers of a two-fingered gripper adapt independently may affect the coordination necessary for grasping small objects. In this work, we develop a two-fingered robotic hand capable of grasping objects that are offset from the gripper's center, while still having the requisite coordination for grasping small objects via a novel gear-type synchronization mechanism with a magnet. This gear synchronization mechanism allows the adaptive finger's tips to be aligned, enabling it to grasp objects as small as toothpicks and washers. The magnetic component allows this coordination to automatically turn off when needed, allowing for the grasping of objects that are offset/misaligned from the gripper. This equips the hand with the capability of grasping light, fragile objects (strawberries, cream puffs, etc.) to heavy frying pan lids, all while maintaining their position and posture, which is vital in numerous applications that require precise positioning or careful manipulation.
Submitted 20 September, 2023; v1 submitted 15 September, 2023;
originally announced September 2023.
-
F3 Hand: A Versatile Robot Hand Inspired by Human Thumb and Index Fingers
Authors:
Naoki Fukaya,
Avinash Ummadisingu,
Guilherme Maeda,
Shin-ichi Maeda
Abstract:
It is challenging to grasp numerous objects with varying sizes and shapes with a single robot hand. To address this, we propose a new robot hand called the 'F3 hand' inspired by the complex movements of the human index finger and thumb. The F3 hand attempts to realize complex human-like grasping movements by combining a parallel motion finger and a rotational motion finger with an adaptive function. In order to confirm the performance of our hand, we attached it to a mobile manipulator - the Toyota Human Support Robot (HSR) and conducted grasping experiments. In our results, we show that it is able to grasp all YCB objects (82 in total), including washers with outer diameters as small as 6.4 mm. We also built a system for intuitive operation with a 3D mouse and grasp an additional 24 objects, including small toothpicks and paper clips and large pitchers and cracker boxes. The F3 hand is able to achieve a 98% success rate in grasping even under imprecise control and positional offsets. Furthermore, owing to the finger's adaptive function, we demonstrate characteristics of the F3 hand that facilitate the grasping of soft objects such as strawberries in a desirable posture.
Submitted 16 June, 2022; v1 submitted 13 June, 2022;
originally announced June 2022.
-
Cluttered Food Grasping with Adaptive Fingers and Synthetic-Data Trained Object Detection
Authors:
Avinash Ummadisingu,
Kuniyuki Takahashi,
Naoki Fukaya
Abstract:
The food packaging industry handles an immense variety of food products with wide-ranging shapes and sizes, even within one kind of food. Menus are also diverse and change frequently, making automation of pick-and-place difficult. A popular approach to bin-picking is to first identify each piece of food in the tray by using an instance segmentation method. However, human annotations to train these methods are unreliable and error-prone since foods are packed close together with unclear boundaries and visual similarity, making separation of pieces difficult. To address this problem, we propose a method that trains purely on synthetic data and successfully transfers to the real world using sim2real methods by creating datasets of filled food trays using high-quality 3D models of real pieces of food for training instance segmentation models. Another concern is that foods are easily damaged during grasping. We address this by introducing two additional methods -- a novel adaptive finger mechanism to passively retract when a collision occurs, and a method to filter grasps that are likely to cause damage to neighbouring pieces of food during a grasp. We demonstrate the effectiveness of the proposed method on several kinds of real foods.
Submitted 10 March, 2022;
originally announced March 2022.
-
Target-mass Grasping of Entangled Food using Pre-grasping & Post-grasping
Authors:
Kuniyuki Takahashi,
Naoki Fukaya,
Avinash Ummadisingu
Abstract:
Food packing industries typically use seasonal ingredients with immense variety that factory workers manually pack. For small pieces of food picked by volume or weight that tend to get entangled, stick or clump together, it is difficult to predict how intertwined they are from a visual examination, making it a challenge to grasp the requisite target mass accurately. Workers rely on a combination of weighing scales and a sequence of complex maneuvers to separate out the food and reach the target mass. This makes automation of the process a non-trivial affair. In this study, we propose methods that combine 1) pre-grasping to reduce the degree of the entanglement, 2) post-grasping to adjust the grasped mass using a novel gripper mechanism to carefully discard excess food when the grasped amount is larger than the target mass, and 3) selecting the grasping point to grasp an amount likely to be reasonably higher than the target grasping mass with confidence. We evaluate the methods on a variety of foods that entangle, stick and clump, each of which has a different size, shape, and material properties such as volumetric mass density. We show significant improvement in grasp accuracy of user-specified target masses using our proposed methods.
Submitted 2 March, 2022; v1 submitted 3 January, 2022;
originally announced January 2022.
-
Uncertainty-Aware Self-Supervised Target-Mass Grasping of Granular Foods
Authors:
Kuniyuki Takahashi,
Wilson Ko,
Avinash Ummadisingu,
Shin-ichi Maeda
Abstract:
Food packing industry workers typically pick a target amount of food by hand from a food tray and place them in containers. Since menus are diverse and change frequently, robots must adapt and learn to handle new foods in a short time-span. Learning to grasp a specific amount of granular food requires a large training dataset, which is challenging to collect reasonably quickly. In this study, we propose ways to reduce the necessary amount of training data by augmenting a deep neural network with models that estimate its uncertainty through self-supervised learning. To further reduce human effort, we devise a data collection system that automatically generates labels. We build on the idea that we can grasp sufficiently well if there is at least one low-uncertainty (high-confidence) grasp point among the various grasp point candidates. We evaluate the methods we propose in this work on a variety of granular foods -- coffee beans, rice, oatmeal and peanuts -- each of which has a different size, shape and material properties such as volumetric mass density or friction. For these foods, we show significantly improved grasp accuracy of user-specified target masses using smaller datasets by incorporating uncertainty.
Submitted 27 May, 2021;
originally announced May 2021.
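
The selection idea stated in the abstract above (a grasp can succeed if at least one low-uncertainty candidate exists) can be illustrated with a minimal sketch. This is not the authors' implementation: the `GraspCandidate` type, the `select_grasp` function, the uncertainty threshold, and the tie-breaking by closeness to the target mass are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GraspCandidate:
    """Hypothetical container for one grasp point candidate."""
    point: Tuple[int, int]   # (x, y) location in the tray image
    predicted_mass: float    # predicted grasped mass, in grams
    uncertainty: float       # predicted uncertainty of that mass, in grams


def select_grasp(
    candidates: List[GraspCandidate],
    target_mass: float,
    max_uncertainty: float,
) -> Optional[GraspCandidate]:
    """Pick a confident candidate whose predicted mass is closest to the target.

    Assumed rule (not taken from the paper): discard candidates whose
    uncertainty exceeds `max_uncertainty`, then minimise the absolute
    error to `target_mass`. Returns None if no candidate is confident.
    """
    confident = [c for c in candidates if c.uncertainty <= max_uncertainty]
    if not confident:
        return None
    return min(confident, key=lambda c: abs(c.predicted_mass - target_mass))


if __name__ == "__main__":
    candidates = [
        GraspCandidate((0, 0), predicted_mass=10.0, uncertainty=5.0),
        GraspCandidate((1, 1), predicted_mass=12.0, uncertainty=0.5),
        GraspCandidate((2, 2), predicted_mass=9.0, uncertainty=0.8),
    ]
    best = select_grasp(candidates, target_mass=10.0, max_uncertainty=1.0)
    print(best)
```

In this sketch the first candidate predicts the target mass exactly but is rejected because its uncertainty is above the threshold, which is the trade-off the abstract describes.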
-
The MineRL 2020 Competition on Sample Efficient Reinforcement Learning using Human Priors
Authors:
William H. Guss,
Mario Ynocente Castro,
Sam Devlin,
Brandon Houghton,
Noboru Sean Kuno,
Crissman Loomis,
Stephanie Milani,
Sharada Mohanty,
Keisuke Nakata,
Ruslan Salakhutdinov,
John Schulman,
Shinya Shiroshita,
Nicholay Topin,
Avinash Ummadisingu,
Oriol Vinyals
Abstract:
Although deep reinforcement learning has led to breakthroughs in many difficult domains, these successes have required an ever-increasing number of samples, affording only a shrinking segment of the AI community access to their development. Resolution of these limitations requires new, sample-efficient methods. To facilitate research in this direction, we propose this second iteration of the MineRL Competition. The primary goal of the competition is to foster the development of algorithms which can efficiently leverage human demonstrations to drastically reduce the number of samples needed to solve complex, hierarchical, and sparse environments. To that end, participants compete under a limited environment sample-complexity budget to develop systems which solve the MineRL ObtainDiamond task in Minecraft, a sequential decision-making environment requiring long-term planning, hierarchical control, and efficient exploration methods. The competition is structured into two rounds in which competitors are provided several paired versions of the dataset and environment with different game textures and shaders. At the end of each round, competitors submit containerized versions of their learning algorithms to the AIcrowd platform where they are trained from scratch on a hold-out dataset-environment pair for a total of 4 days on a pre-specified hardware platform. In this follow-up iteration to the NeurIPS 2019 MineRL Competition, we implement new features to expand the scale and reach of the competition. In response to the feedback of the previous participants, we introduce a second minor track focusing on solutions without access to environment interactions of any kind except during test-time. Further, we aim to prompt domain-agnostic submissions by implementing several novel competition mechanics including action-space randomization and desemantization of observations and actions.
Submitted 26 January, 2021;
originally announced January 2021.
-
Distributed Reinforcement Learning of Targeted Grasping with Active Vision for Mobile Manipulators
Authors:
Yasuhiro Fujita,
Kota Uenishi,
Avinash Ummadisingu,
Prabhat Nagarajan,
Shimpei Masuda,
Mario Ynocente Castro
Abstract:
Developing personal robots that can perform a diverse range of manipulation tasks in unstructured environments necessitates solving several challenges for robotic grasping systems. We take a step towards this broader goal by presenting the first RL-based system, to our knowledge, for a mobile manipulator that can (a) achieve targeted grasping generalizing to unseen target objects, (b) learn complex grasping strategies for cluttered scenes with occluded objects, and (c) perform active vision through its movable wrist camera to better locate objects. The system is informed of the desired target object in the form of a single, arbitrary-pose RGB image of that object, enabling the system to generalize to unseen objects without retraining. To achieve such a system, we combine several advances in deep reinforcement learning and present a large-scale distributed training system using synchronous SGD that seamlessly scales to multi-node, multi-GPU infrastructure to make rapid prototyping easier. We train and evaluate our system in a simulated environment, identify key components for improving performance, analyze its behaviors, and transfer to a real-world setup.
Submitted 14 October, 2020; v1 submitted 15 July, 2020;
originally announced July 2020.
-
Hindsight policy gradients
Authors:
Paulo Rauber,
Avinash Ummadisingu,
Filipe Mutz,
Juergen Schmidhuber
Abstract:
A reinforcement learning agent that needs to pursue different goals across episodes requires a goal-conditional policy. In addition to their potential to generalize desirable behavior to unseen goals, such policies may also enable higher-level planning based on subgoals. In sparse-reward environments, the capacity to exploit information about the degree to which an arbitrary goal has been achieved while another goal was intended appears crucial to enable sample efficient learning. However, reinforcement learning agents have only recently been endowed with such capacity for hindsight. In this paper, we demonstrate how hindsight can be introduced to policy gradient methods, generalizing this idea to a broad class of successful algorithms. Our experiments on a diverse selection of sparse-reward environments show that hindsight leads to a remarkable increase in sample efficiency.
Submitted 20 February, 2019; v1 submitted 16 November, 2017;
originally announced November 2017.
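
The notion of hindsight in the abstract above (asking how well a trajectory achieved goals other than the one intended) can be illustrated with a small sketch. It is not the paper's policy gradient estimator: it only re-scores one recorded trajectory under alternative goals with a sparse reward. The names `sparse_reward` and `hindsight_returns`, and the tuple state encoding, are assumptions made for illustration.

```python
from typing import Dict, Sequence, Tuple

State = Tuple[int, ...]  # hypothetical state encoding, e.g. grid coordinates


def sparse_reward(state: State, goal: State) -> float:
    """Sparse reward: 1.0 only when the state matches the goal exactly."""
    return 1.0 if state == goal else 0.0


def hindsight_returns(
    visited_states: Sequence[State],
    alternative_goals: Sequence[State],
) -> Dict[State, float]:
    """Undiscounted return of one trajectory, re-evaluated per alternative goal.

    A trajectory that earned nothing for its intended goal may still
    earn reward for a goal it happened to reach along the way.
    """
    return {
        goal: sum(sparse_reward(s, goal) for s in visited_states)
        for goal in alternative_goals
    }


if __name__ == "__main__":
    # The agent intended to reach (5, 5) but only got as far as (1, 1).
    trajectory = [(0, 0), (0, 1), (1, 1)]
    print(hindsight_returns(trajectory, [(5, 5), (0, 1), (1, 1)]))
```

Under the intended goal the trajectory carries no learning signal, while the same trajectory is informative for the goals it actually reached; using that information inside a policy gradient update requires the corrections developed in the paper, which this sketch omits.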