-
TCuPGAN: A novel framework developed for optimizing human-machine interactions in citizen science
Authors:
Ramanakumar Sankar,
Kameswara Mantha,
Lucy Fortson,
Helen Spiers,
Thomas Pengo,
Douglas Mashek,
Myat Mo,
Mark Sanders,
Trace Christensen,
Jeffrey Salisbury,
Laura Trouille
Abstract:
In the era of big data in scientific research, there is a necessity to leverage techniques which reduce human effort in labeling and categorizing large datasets by involving sophisticated machine tools. To combat this problem, we present a novel, general purpose model for 3D segmentation that leverages patch-wise adversariality and Long Short-Term Memory to encode sequential information. Using this model alongside citizen science projects which use 3D datasets (image cubes) on the Zooniverse platforms, we propose an iterative human-machine optimization framework where only a fraction of the 2D slices from these cubes are seen by the volunteers. We leverage the patch-wise discriminator in our model to provide an estimate of which slices within these image cubes have poorly generalized feature representations, and correspondingly poor machine performance. These images with corresponding machine proposals would be presented to volunteers on Zooniverse for correction, leading to a drastic reduction in the volunteer effort on citizen science projects. We trained our model on ~2300 liver tissue 3D electron micrographs. Lipid droplets were segmented within these images through human annotation via the 'Etch A Cell - Fat Checker' citizen science project, hosted on the Zooniverse platform. In this work, we demonstrate this framework and the selection methodology which resulted in a measured reduction in volunteer effort by more than 60%. We envision this type of joint human-machine partnership will be of great use on future Zooniverse projects.
Submitted 23 November, 2023;
originally announced November 2023.
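The slice-selection idea in the TCuPGAN abstract (show volunteers only the slices where the patch-wise discriminator suggests poor machine performance) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_slices_for_review`, the convention that higher patch scores mean more convincing machine output, the use of a per-slice mean, and the `fraction` parameter are all assumptions made for illustration.

```python
from typing import List, Sequence


def select_slices_for_review(
    patch_scores: Sequence[Sequence[float]],
    fraction: float = 0.4,
) -> List[int]:
    """Pick the slices with the lowest mean discriminator patch score.

    patch_scores[i] holds the (hypothetical) per-patch discriminator
    outputs for slice i of one image cube; lower values are assumed to
    indicate poorer machine performance. `fraction` is the share of
    slices sent to volunteers (an illustrative value, not from the paper).
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if not patch_scores:
        return []

    # Collapse each slice's patch grid to a single score.
    mean_scores = [sum(patches) / len(patches) for patches in patch_scores]

    # Always select at least one slice.
    n_select = max(1, round(fraction * len(mean_scores)))

    # Rank slices from worst (lowest score) to best, keep the worst ones.
    ranked = sorted(range(len(mean_scores)), key=lambda i: mean_scores[i])
    return sorted(ranked[:n_select])


if __name__ == "__main__":
    cube_scores = [
        [0.9, 0.8],
        [0.2, 0.3],
        [0.6, 0.7],
        [0.1, 0.5],
        [0.95, 0.9],
    ]
    print(select_slices_for_review(cube_scores, fraction=0.4))
```

In this toy setup, showing 40% of the slices corresponds to a 60% cut in slices seen by volunteers; the abstract's reported reduction of more than 60% was measured by the authors and does not follow from this sketch.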
-
From fat droplets to floating forests: cross-domain transfer learning using a PatchGAN-based segmentation model
Authors:
Kameswara Bharadwaj Mantha,
Ramanakumar Sankar,
Yuping Zheng,
Lucy Fortson,
Thomas Pengo,
Douglas Mashek,
Mark Sanders,
Trace Christensen,
Jeffrey Salisbury,
Laura Trouille,
Jarrett E. K. Byrnes,
Isaac Rosenthal,
Henry Houskeeper,
Kyle Cavanaugh
Abstract:
Many scientific domains gather sufficient labels to train machine algorithms through human-in-the-loop techniques provided by the Zooniverse.org citizen science platform. As the range of projects, task types and data rates increase, acceleration of model training is of paramount concern to focus volunteer effort where most needed. The application of Transfer Learning (TL) between Zooniverse projects holds promise as a solution. However, understanding the effectiveness of TL approaches that pretrain on large-scale generic image sets vs. images with similar characteristics possibly from similar tasks is an open challenge. We apply a generative segmentation model on two Zooniverse project-based data sets: (1) to identify fat droplets in liver cells (FatChecker; FC) and (2) the identification of kelp beds in satellite images (Floating Forests; FF) through transfer learning from the first project. We compare and contrast its performance with a TL model based on the COCO image set, and subsequently with baseline counterparts. We find that both the FC and COCO TL models perform better than the baseline cases when using >75% of the original training sample size. The COCO-based TL model generally performs better than the FC-based one, likely due to its generalized features. Our investigations provide important insights into usage of TL approaches on multi-domain data hosted across different Zooniverse projects, enabling future projects to accelerate task completion.
Submitted 7 November, 2022;
originally announced November 2022.
-
Designing Underactuated Graspers with Dynamically Variable Geometry Using Potential Energy Map Based Analysis
Authors:
Connor L. Yako,
Shenli Yuan,
J. Kenneth Salisbury
Abstract:
In this paper we present a potential energy map based approach that provides a framework for the design and control of a robotic grasper. Unlike other potential energy map approaches, our framework is able to consider friction for a more realistic perspective on grasper performance. Our analysis establishes the importance of including variable geometry in a grasper design, namely with regards to palm width, link lengths, and transmission ratio. We demonstrate the use of this method specifically for a two-phalanx tendon-pulley underactuated grasper, and show how various design parameters - palm width, link lengths, and transmission ratios - impact the grasping and manipulation performance of a specific design across a range of object sizes and friction coefficients. Optimal grasping designs have palms that scale with object size, and transmission ratios that scale with the coefficient of friction. Using a custom manipulation metric we compared a grasper that only dynamically varied its geometry to a grasper with a variable palm and distinct actuation commands. The analysis revealed the advantage of the compliant reconfiguration ability intrinsic to underactuated mechanisms; by varying only the geometry of the grasper, manipulation of a wide range of objects could be performed.
Submitted 14 March, 2022;
originally announced March 2022.
-
Design and Control of Roller Grasper V2 for In-Hand Manipulation
Authors:
Shenli Yuan,
Lin Shao,
Connor L. Yako,
Alex Gruebele,
J. Kenneth Salisbury
Abstract:
The ability to perform in-hand manipulation still remains an unsolved problem; having this capability would allow robots to perform sophisticated tasks requiring repositioning and reorienting of grasped objects. In this work, we present a novel non-anthropomorphic robot grasper with the ability to manipulate objects by means of active surfaces at the fingertips. Active surfaces are achieved by spherical rolling fingertips with two degrees of freedom (DoF) -- a pivoting motion for surface reorientation -- and a continuous rolling motion for moving the object. A further DoF is in the base of each finger, allowing the fingers to grasp objects over a range of size and shapes. Instantaneous kinematics was derived and objects were successfully manipulated both with a custom handcrafted control scheme as well as one learned through imitation learning, in simulation and experimentally on the hardware.
Submitted 17 November, 2020; v1 submitted 17 April, 2020;
originally announced April 2020.
-
Scene Recognition Through Visual and Acoustic Cues Using K-Means
Authors:
Sidharth Rajaram,
J. Kenneth Salisbury
Abstract:
We propose a K-Means based prediction system, nicknamed SERVANT (Scene Recognition Through Visual and Acoustic Cues), that is capable of recognizing environmental scenes through analysis of ambient sound and color cues. The concept and implementation originated within the Learning branch of the Intelligent Wearable Robotics Project (also known as the Third Arm project) at the Stanford Artificial Intelligence Lab-Toyota Center (SAIL-TC). The Third Arm Project focuses on the development and conceptualization of a robotic arm that can aid users in a whole array of situations: i.e. carrying a cup of coffee, holding a flashlight. Servant uses a K-Means fit-and-predict architecture to classify environmental scenes, such as that of a coffee shop or a basketball gym, using visual and auditory cues. Following such classification, Servant can recommend contextual actions based on prior training.
Submitted 25 November, 2018;
originally announced November 2018.
-
Learning to Represent Haptic Feedback for Partially-Observable Tasks
Authors:
Jaeyong Sung,
J. Kenneth Salisbury,
Ashutosh Saxena
Abstract:
The sense of touch, being the earliest sensory system to develop in a human body [1], plays a critical part of our daily interaction with the environment. In order to successfully complete a task, many manipulation interactions require incorporating haptic feedback. However, manually designing a feedback mechanism can be extremely challenging. In this work, we consider manipulation tasks that need to incorporate tactile sensor feedback in order to modify a provided nominal plan. To incorporate partial observation, we present a new framework that models the task as a partially observable Markov decision process (POMDP) and learns an appropriate representation of haptic feedback which can serve as the state for a POMDP model. The model, that is parametrized by deep recurrent neural networks, utilizes variational Bayes methods to optimize the approximate posterior. Finally, we build on deep Q-learning to be able to select the optimal action in each state without access to a simulator. We test our model on a PR2 robot for multiple tasks of turning a knob until it clicks.
Submitted 17 May, 2017;
originally announced May 2017.
-
Optimal prediction and natural scene statistics in the retina
Authors:
Jared Salisbury,
Stephanie E. Palmer
Abstract:
Almost all neural computations involve making predictions. Whether an organism is trying to catch prey, avoid predators, or simply move through a complex environment, the data it collects through its senses can guide its actions only to the extent that it can extract from these data information about the future state of the world. An essential aspect of the problem in all these forms is that not all features of the past carry predictive power. Since there are costs associated with representing and transmitting information, a natural hypothesis is that sensory systems have developed coding strategies that are optimized to minimize these costs, keeping only a limited number of bits of information about the past and ensuring that these bits are maximally informative about the future. Another important feature of the prediction problem is that the physics of the world is diverse enough to contain a wide range of possible statistical ensembles, yet not all motion is probable. Thus, the brain might not be a generalized predictive machine; it might have evolved to specifically solve the prediction problems most common in the natural environment. This paper reviews recent results on predictive coding and optimal predictive information in the retina and suggests approaches for quantifying prediction in response to natural motion.
Submitted 1 July, 2015;
originally announced July 2015.
-
The CMS Tracker Readout Front End Driver
Authors:
C. Foudas,
R. Bainbridge,
D. Ballard,
I. Church,
E. Corrin,
J. A. Coughlan,
C. P. Day,
E. J. Freeman,
J. Fulcher,
W. J. F. Gannon,
G. Hall,
R. N. J. Halsall,
G. Iles,
J. Jones,
J. Leaver,
M. Noy,
M. Pearson,
M. Raymond,
I. Reid,
G. Rogers,
J. Salisbury,
S. Taghavi,
I. R. Tomalin,
O. Zorba
Abstract:
The Front End Driver, FED, is a 9U 400mm VME64x card designed for reading out the Compact Muon Solenoid, CMS, silicon tracker signals transmitted by the APV25 analogue pipeline Application Specific Integrated Circuits. The FED receives the signals via 96 optical fibers at a total input rate of 3.4 GB/sec. The signals are digitized and processed by applying algorithms for pedestal and common mode noise subtraction. Algorithms that search for clusters of hits are used to further reduce the input rate. Only the cluster data along with trigger information of the event are transmitted to the CMS data acquisition system using the S-LINK64 protocol at a maximum rate of 400 MB/sec. All data processing algorithms on the FED are executed in large on-board Field Programmable Gate Arrays. Results on the design, performance, testing and quality control of the FED are presented and discussed.
Submitted 25 October, 2005;
originally announced October 2005.
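The data rates quoted in the FED abstract imply a per-fiber input rate and a minimum reduction factor, which the short calculation below works out. It assumes decimal units (1 GB = 1000 MB) and an even split of the input across the 96 fibers; neither assumption is stated in the abstract, and the function names are illustrative only.

```python
# Figures taken from the abstract.
INPUT_RATE_MB_S = 3400     # 3.4 GB/sec total input, assuming 1 GB = 1000 MB
N_FIBERS = 96              # optical fibers feeding one FED
MAX_OUTPUT_RATE_MB_S = 400 # maximum S-LINK64 output rate


def per_fiber_rate_mb_s() -> float:
    """Average input rate per fiber, assuming an even split across fibers."""
    return INPUT_RATE_MB_S / N_FIBERS


def min_reduction_factor() -> float:
    """Smallest input-to-output reduction that fits the maximum output rate
    when the FED runs at the full quoted input rate."""
    return INPUT_RATE_MB_S / MAX_OUTPUT_RATE_MB_S


if __name__ == "__main__":
    print(f"per-fiber input rate: {per_fiber_rate_mb_s():.1f} MB/s")
    print(f"minimum reduction factor: {min_reduction_factor():.1f}x")
```

Under these assumptions each fiber carries roughly 35 MB/sec, and the pedestal subtraction, common mode noise subtraction, and cluster finding together must reduce the data volume by at least a factor of 8.5 at full input rate.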