-
Optimization of Trajectories for Machine Learning Training in Robot Accuracy Modeling
Authors:
Blake Hannaford
Abstract:
Recently, machine learning (ML) methods have been developed for increasing the accuracy of robot mechanisms. Complex mechanical issues such as non-linear friction, backlash, and flexibility of structure transmission elements can cause these errors, and they are hard to model. ML requires training data, and the above mechanical phenomena are highly dependent on the position of the robot in the workspace and also on its velocity, especially near zero velocity in both directions, where non-linearities such as Stribeck and Coulomb friction are most pronounced. It is well known that the success of ML methods depends on the amount of training data, and it is expensive and time-consuming to collect data from physical robot motion. We therefore address the problem of searching for trajectories in the 6D space of positions and velocities which collect the most information in the least amount of time. This reduces to a special case of the traveling-salesman problem (TSP) in that the robot must be programmed to visit sampled points in the position-velocity phase space most efficiently. Two goals of this work are to 1) computationally study the difficulty of the TSP in this application by applying it to X, Y, Z motion in 3D space (6D phase space) and 2) assess the effectiveness of an extremely simple Nearest Neighbor search algorithm compared to random sampling of the search space. Results confirm that Nearest Neighbor heuristic searching produces significantly better trajectories than random sampling in this application.
Submitted 21 June, 2024;
originally announced June 2024.
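The nearest-neighbor idea in the abstract above can be illustrated with a minimal sketch. This is not the paper's code: the sample count, the uniform sampling range, and the use of plain Euclidean distance in the 6D phase space as a stand-in for travel time are all assumptions made for illustration.

```python
import math
import random

def tour_length(points, order):
    """Sum of Euclidean distances between consecutive points in the visiting order."""
    return sum(math.dist(points[a], points[b]) for a, b in zip(order, order[1:]))

def nearest_neighbor_order(points, start=0):
    """Greedy tour: from the current point, always move to the closest unvisited one."""
    unvisited = set(range(len(points)))
    unvisited.remove(start)
    order = [start]
    while unvisited:
        current = points[order[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(current, points[i]))
        unvisited.remove(nxt)
        order.append(nxt)
    return order

rng = random.Random(0)
# Hypothetical samples (x, y, z, vx, vy, vz), drawn uniformly from [-1, 1]^6.
points = [tuple(rng.uniform(-1, 1) for _ in range(6)) for _ in range(200)]

nn_order = nearest_neighbor_order(points)

# Baseline: visit the same points in a random order.
random_order = list(range(len(points)))
rng.shuffle(random_order)

nn_cost = tour_length(points, nn_order)
random_cost = tour_length(points, random_order)
```

Comparing `nn_cost` with `random_cost` gives a rough feel for the gap between the two strategies. A real cost function would need to account for the time a robot takes to move between phase-space points, which Euclidean distance does not capture.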
-
Ablation Study on Features in Learning-based Joints Calibration of Cable-driven Surgical Robots
Authors:
Haonan Peng,
Andrew Lewis,
Blake Hannaford
Abstract:
With worldwide implementation, millions of surgeries are assisted by surgical robots. The cable-drive mechanism on many surgical robots allows flexible, light, and compact arms and tools. However, the slack and stretch of the cables and the backlash of the gears introduce inevitable errors from motor poses to joint poses, which are then propagated to the position and orientation of the end-effector. In this paper, a learning-based calibration using a deep neural network (DNN) is proposed, which reduces the unloaded pose RMSE of joints 1, 2, 3 to 0.3003 deg, 0.2888 deg, 0.1565 mm, and the loaded pose RMSE of joints 1, 2, 3 to 0.4456 deg, 0.3052 deg, 0.1900 mm, respectively. Then, removal ablation and inaccurate ablation are performed to study which features of the DNN model contribute to the calibration accuracy. The results suggest that raw joint poses and motor torques are the most important features. For joint poses, the removal ablation shows that the DNN model can derive this information from the end-effector position and orientation. For motor torques, the direction is much more important than the amplitude.
Submitted 14 February, 2023; v1 submitted 9 February, 2023;
originally announced February 2023.
-
Real-time Virtual Intraoperative CT for Image Guided Surgery
Authors:
Yangming Li,
Neeraja Konuthula,
Ian M. Humphreys,
Kris Moe,
Blake Hannaford,
Randall Bly
Abstract:
Purpose: This paper presents a scheme for generating virtual intraoperative CT scans in order to improve surgical completeness in Endoscopic Sinus Surgeries (ESS). Approach: The work presents three methods, the tip motion-based, the tip trajectory-based, and the instrument-based, along with non-parametric smoothing and Gaussian Process Regression, for virtual intraoperative CT generation. Results: The proposed methods were studied and compared on ESS performed on cadavers. Surgical results show all three methods improve the Dice Similarity Coefficients to > 86%, with F-score > 92% and precision > 89.91%. The tip trajectory-based method was found to have the best performance and reached 96.87% precision in surgical completeness evaluation. Conclusions: This work demonstrated that virtual intraoperative CT scans improve the consistency between the actual surgical scene and the reference model, and improve surgical completeness in ESS. Compared with actual intraoperative CT scans, the proposed scheme has no impact on existing surgical protocols, does not require extra hardware beyond what is already available in most ESS, overcomes the high costs, the repeated radiation, and the elongated anesthesia caused by actual intraoperative CTs, and is practical in ESS.
Submitted 5 December, 2021;
originally announced December 2021.
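For readers unfamiliar with the overlap metrics named in the abstract above, the following is a small sketch of the standard definitions of the Dice Similarity Coefficient and precision for binary masks. The toy `pred` and `truth` vectors are hypothetical, and the paper's own evaluation protocol may differ from these textbook formulas.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives, and false negatives for binary labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    return tp, fp, fn

def dice(pred, truth):
    """Dice = 2*TP / (2*TP + FP + FN); defined as 1.0 when both masks are empty."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0

def precision(pred, truth):
    """Precision = TP / (TP + FP); defined as 1.0 when nothing is predicted positive."""
    tp, fp, _ = confusion_counts(pred, truth)
    return tp / (tp + fp) if tp + fp else 1.0

# Hypothetical flattened masks (1 = voxel marked as removed tissue, 0 = not).
pred = [1, 1, 1, 0, 0, 1]
truth = [1, 1, 0, 0, 1, 1]

dice_score = dice(pred, truth)            # 3 TP, 1 FP, 1 FN -> 6 / 8 = 0.75
precision_score = precision(pred, truth)  # 3 / (3 + 1) = 0.75
```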
-
Real-time Informative Surgical Skill Assessment with Gaussian Process Learning
Authors:
Yangming Li,
Randall Bly,
Sarah Akkina,
Rajeev C. Saxena,
Ian Humphreys,
Mark Whipple,
Kris Moe,
Blake Hannaford
Abstract:
Endoscopic Sinus and Skull Base Surgeries (ESSBSs) are challenging and potentially dangerous surgical procedures, and objective skill assessment is a key component in improving the effectiveness of surgical training, re-validating surgeons' skills, and decreasing surgical trauma and the complication rate in operating rooms. Because of the complexity of surgical procedures, the variation of operation styles, and the fast development of new surgical skills, surgical skill assessment remains a challenging problem. This work presents a novel Gaussian Process Learning-based heuristic automatic objective surgical skill assessment method for ESSBSs. Unlike classical surgical skill assessment algorithms, the proposed method 1) utilizes the kinematic features in surgical instrument relative movements, instead of using specific surgical tasks or the statistics, to assess skills in real time; 2) provides informative feedback, instead of a summative score; 3) has the ability to incrementally learn from new data, instead of depending on a fixed dataset. The proposed method projects the instrument movements into the endoscope coordinate frame to reduce the data dimensionality. It then extracts the kinematic features of the projected data and learns the relationship between surgical skill levels and the features with the Gaussian Process learning technique. The proposed method was verified in full endoscopic skull base and sinus surgeries on cadavers. These surgeries have different pathologies, require different treatments, and have different complexities. The experimental results show that the proposed method reaches 100% prediction precision for complete surgical procedures and 90% precision for real-time prediction assessment.
Submitted 5 December, 2021;
originally announced December 2021.
-
Reducing Annotating Load: Active Learning with Synthetic Images in Surgical Instrument Segmentation
Authors:
Haonan Peng,
Shan Lin,
Daniel King,
Yun-Hsuan Su,
Randall A. Bly,
Kris S. Moe,
Blake Hannaford
Abstract:
Accurate instrument segmentation in endoscopic vision of robot-assisted surgery is challenging due to reflection on the instruments and frequent contacts with tissue. Deep neural networks (DNNs) show competitive performance and have been favored in recent years. However, the hunger of DNNs for labeled data imposes a huge annotation workload. Motivated by alleviating this workload, we propose a general embeddable method to decrease the usage of labeled real images, using actively generated synthetic images. In each active learning iteration, the most informative unlabeled images are first queried by active learning and then labeled. Next, synthetic images are generated based on these selected images. The instruments and backgrounds are cropped out and randomly combined with each other with blending and fusion near the boundary. The effectiveness of the proposed method is validated on 2 sinus surgery datasets and 1 intraabdominal surgery dataset. The results indicate a considerable improvement in performance, especially when the budget for annotation is small. The effectiveness of different types of synthetic images, blending methods, and external backgrounds is also studied. All the code is open-sourced at: https://github.com/HaonanPeng/active_syn_generator.
Submitted 7 August, 2021;
originally announced August 2021.
-
Evaluation of an Inflated Beam Model Applied to Everted Tubes
Authors:
Joel Hwee,
Andrew Lewis,
Allison Raines,
Blake Hannaford
Abstract:
Everted tubes have often been modeled as inflated beams to determine transverse and axial buckling conditions. This paper seeks to validate the assumption that an everted tube can be modeled in this way. The tip deflections of everted and uneverted beams under transverse cantilever loads are compared with a tip deflection model that was first developed for aerospace applications. LDPE and silicone coated nylon beams were tested; everted and uneverted beams showed similar tip deflection. The literature model best fit the tip deflection of LDPE tubes with an average tip deflection error of 6 mm, while the nylon tubes had an average tip deflection error of 16.4 mm. Everted beams of both materials buckled at 83% of the theoretical buckling condition while straight beams collapsed at 109% of the theoretical buckling condition. The curvature of everted beams was estimated from a tip load and a known displacement showing relative errors of 14.2% and 17.3% for LDPE and nylon beams respectively. This paper shows a numerical method for determining inflated beam deflection. It also provides an iterative method for computing static tip pose and applied wall forces in a known environment.
Submitted 12 July, 2021;
originally announced July 2021.
-
Multi-frame Feature Aggregation for Real-time Instrument Segmentation in Endoscopic Video
Authors:
Shan Lin,
Fangbo Qin,
Haonan Peng,
Randall A. Bly,
Kris S. Moe,
Blake Hannaford
Abstract:
Deep learning-based methods have achieved promising results on surgical instrument segmentation. However, the high computation cost may limit the application of deep models to time-sensitive tasks such as online surgical video analysis for robotic-assisted surgery. Moreover, current methods may still suffer from challenging conditions in surgical images such as various lighting conditions and the presence of blood. We propose a novel Multi-frame Feature Aggregation (MFFA) module to aggregate video frame features temporally and spatially in a recurrent mode. By distributing the computation load of deep feature extraction over sequential frames, we can use a lightweight encoder to reduce the computation costs at each time step. Moreover, public surgical videos usually are not labeled frame by frame, so we develop a method that can randomly synthesize a surgical frame sequence from a single labeled frame to assist network training. We demonstrate that our approach achieves superior performance to corresponding deeper segmentation models on two public surgery datasets.
Submitted 25 July, 2021; v1 submitted 17 November, 2020;
originally announced November 2020.
-
LC-GAN: Image-to-image Translation Based on Generative Adversarial Network for Endoscopic Images
Authors:
Shan Lin,
Fangbo Qin,
Yangming Li,
Randall A. Bly,
Kris S. Moe,
Blake Hannaford
Abstract:
Intelligent vision is appealing in computer-assisted and robotic surgeries. Vision-based analysis with deep learning usually requires large labeled datasets, but manual data labeling is expensive and time-consuming in medical problems. We investigate a novel cross-domain strategy to reduce the need for manual data labeling by proposing an image-to-image translation model live-cadaver GAN (LC-GAN) based on generative adversarial networks (GANs). We consider a situation when a labeled cadaveric surgery dataset is available while the task is instrument segmentation on an unlabeled live surgery dataset. We train LC-GAN to learn the mappings between the cadaveric and live images. For live image segmentation, we first translate the live images to fake-cadaveric images with LC-GAN and then perform segmentation on the fake-cadaveric images with models trained on the real cadaveric dataset. The proposed method fully makes use of the labeled cadaveric dataset for live image segmentation without the need to label the live dataset. LC-GAN has two generators with different architectures that leverage the deep feature representation learned from the cadaveric image based segmentation task. Moreover, we propose the structural similarity loss and segmentation consistency loss to improve the semantic consistency during translation. Our model achieves better image-to-image translation and leads to improved segmentation performance in the proposed cross-domain segmentation task.
Submitted 13 August, 2020; v1 submitted 10 March, 2020;
originally announced March 2020.
-
Towards Better Surgical Instrument Segmentation in Endoscopic Vision: Multi-Angle Feature Aggregation and Contour Supervision
Authors:
Fangbo Qin,
Shan Lin,
Yangming Li,
Randall A. Bly,
Kris S. Moe,
Blake Hannaford
Abstract:
Accurate and real-time surgical instrument segmentation is important in the endoscopic vision of robot-assisted surgery, and significant challenges are posed by frequent instrument-tissue contacts and continuous change of observation perspective. For these challenging tasks, more and more deep neural network (DNN) models have been designed in recent years. We are motivated to propose a general embeddable approach to improve these current DNN segmentation models without increasing the model parameter number. Firstly, observing the limited rotation-invariance performance of DNNs, we propose the Multi-Angle Feature Aggregation (MAFA) method, leveraging active image rotation to gain richer visual cues and make the prediction more robust to instrument orientation changes. Secondly, in the end-to-end training stage, auxiliary contour supervision is utilized to guide the model to learn boundary awareness, so that the contour shape of the segmentation mask is more precise. The proposed method is validated with ablation experiments on the novel Sinus-Surgery datasets collected from surgeons' operations, and is compared to the existing methods on a public dataset collected with a da Vinci Xi Robot.
Submitted 10 August, 2020; v1 submitted 25 February, 2020;
originally announced February 2020.
-
Real-time Data Driven Precision Estimator for RAVEN-II Surgical Robot End Effector Position
Authors:
Haonan Peng,
Xingjian Yang,
Yun-Hsuan Su,
Blake Hannaford
Abstract:
Surgical robots have been introduced to operating rooms over the past few decades due to their high sensitivity, small size, and remote controllability. The cable-driven nature of many surgical robots allows the systems to be dexterous and lightweight, with diameters as low as 5 mm. However, due to the slack and stretch of the cables and the backlash of the gears, inevitable uncertainties are introduced into the kinematics calculation. Since the reported end effector position of surgical robots like RAVEN-II is directly calculated using the motor encoder measurements and forward kinematics, it may contain relatively large errors of up to 10 mm, whereas semi-autonomous functions being introduced into abdominal surgeries require position inaccuracy of at most 1 mm. To resolve the problem, a cost-effective, real-time and data-driven pipeline for robot end effector position precision estimation is proposed and tested on RAVEN-II. Analysis shows an improved end effector position error of around 1 mm RMS traversing through the entire robot workspace without a high-resolution motion tracker.
Submitted 14 October, 2019;
originally announced October 2019.
-
Hidden Markov Models derived from Behavior Trees
Authors:
Blake Hannaford
Abstract:
Behavior trees are rapidly attracting interest in robotics and human task-related motion tracking. However no algorithms currently exist to track or identify parameters of BTs under noisy observations. We report a new relationship between BTs, augmented with statistical information, and Hidden Markov Models. Exploiting this relationship will allow application of many algorithms for HMMs (and dynamic Bayesian networks) to data acquired from BT-based systems.
Submitted 23 July, 2019;
originally announced July 2019.
-
Raven: Open Surgical Robotic Platforms
Authors:
Yangming Li,
Blake Hannaford,
Jacob Rosen
Abstract:
The Raven I and the Raven II surgical robots, as open research platforms, have been serving the robotic surgery research community for ten years. The paper 1) briefly presents the Raven I and the Raven II robots, 2) reviews the recent publications that are built upon the Raven robots, aim to be applied to the Raven robots, or are directly compared with the Raven robots, and 3) uses the Raven robots as a case study to discuss the popular research problems in the research community and the trend of robotic surgery study. Instead of being a thorough literature review, this work only reviews the works formally published in the past three years and uses these recent publications to analyze the research interests, the popular open research problems, and opportunities in the topic of robotic surgery.
Submitted 27 June, 2019;
originally announced June 2019.
-
Behavior Trees as a Representation for Medical Procedures
Authors:
Blake Hannaford,
Randall Bly,
Ian Humphreys,
Mark Whipple
Abstract:
Objective: Effective collaboration between machines and clinicians requires flexible data structures to represent medical processes and clinical practice guidelines. Such a data structure could enable effective turn-taking between human and automated components of a complex treatment, accurate on-line monitoring of clinical treatments (for example to detect medical errors), or automated treatment systems (such as future medical robots) whose overall treatment plan is understandable and auditable by human experts.
Materials and Methods: Behavior trees (BTs) emerged from video game development as a graphical language for modeling intelligent agent behavior. BTs have several properties which are attractive for modeling medical procedures including human-readability, authoring tools, and composability.
Results: This paper will illustrate construction of BTs for exemplary medical procedures and clinical protocols.
Discussion and Conclusion: Behavior Trees thus form a useful, and human authorable/readable bridge between clinical practice guidelines and AI systems.
Submitted 27 August, 2018;
originally announced August 2018.
-
Behavior Trees as a Representation for Medical Procedures
Authors:
Blake Hannaford
Abstract:
Behavior trees (BTs) emerged from video game development as a graphical language for modeling intelligent agent behavior. BTs have several properties which are attractive for modeling medical procedures including human-readability, authoring tools, and composability. This paper will illustrate construction of BTs for exemplary medical procedures. We are pleased to acknowledge support from National Science Foundation grant #IIS-1637444 and collaborations on that project with Johns Hopkins University and Worcester Polytechnic Institute.
Submitted 24 January, 2018;
originally announced January 2018.
-
IKBT: solving closed-form Inverse Kinematics with Behavior Tree
Authors:
Dianmu Zhang,
Blake Hannaford
Abstract:
Serial robot arms have complicated kinematic equations which must be solved to write effective arm planning and control software (the Inverse Kinematics Problem). Existing software packages for inverse kinematics often rely on numerical methods which have significant shortcomings. Here we report a new symbolic inverse kinematics solver which overcomes the limitations of numerical methods, and the shortcomings of previous symbolic software packages. We integrate Behavior Trees, an execution planning framework previously used for controlling intelligent robot behavior, to organize the equation solving process, and a modular architecture for each solution technique. The system successfully solved, generated a LaTeX report, and generated a Python code template for 18 out of 19 example robots of 4-6 DOF. The system is readily extensible, maintainable, and multi-platform with few dependencies. The complete package is available with a Modified BSD license on GitHub.
Submitted 7 December, 2017; v1 submitted 15 November, 2017;
originally announced November 2017.
-
Simulation Results on Selector Adaptation in Behavior Trees
Authors:
Blake Hannaford,
Danying Hu,
Dianmu Zhang,
Yangming Li
Abstract:
Behavior trees (BTs) emerged from video game development as a graphical language for modeling intelligent agent behavior. However, as initially implemented, behavior trees are static plans. This paper adds to recent literature exploring the ability of BTs to adapt to their success or failure in achieving tasks. The "Selector" node of a BT tries alternative strategies (its children) and returns failure only if all of its children return failure. This paper studies several means by which Selector nodes can learn from experience, in particular, learn conditional probabilities of success based on sensor information, and modify the execution order based on the learned information. Furthermore, a "Greedy Selector" is studied which only tries the child having the highest success probability. Simulation results indicate significantly increased task performance, especially when the frequentist probability estimate is conditioned on sensor information. The Greedy Selector was ineffective unless it was preceded by a period of training in which all children were exercised.
Submitted 30 June, 2016; v1 submitted 29 June, 2016;
originally announced June 2016.
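The adaptive Selector described in the abstract above can be sketched roughly as follows. This is an illustrative simplification, not the paper's implementation: the class name `AdaptiveSelector`, the optimistic default estimate for untried children, and the omission of any conditioning on sensor information are all assumptions.

```python
class AdaptiveSelector:
    """Selector that reorders its children by their estimated success probability."""

    def __init__(self, children):
        self.children = children              # callables returning True (success) or False
        self.tries = [0] * len(children)
        self.successes = [0] * len(children)

    def estimate(self, i):
        # Frequentist estimate; untried children get an optimistic 1.0 so they get exercised.
        return self.successes[i] / self.tries[i] if self.tries[i] else 1.0

    def tick(self):
        # Try children from highest to lowest estimated success probability.
        order = sorted(range(len(self.children)), key=self.estimate, reverse=True)
        for i in order:
            ok = self.children[i]()
            self.tries[i] += 1
            self.successes[i] += ok
            if ok:
                return True                   # Selector succeeds on the first successful child
        return False                          # Selector fails only if every child failed

# Hypothetical deterministic children, for illustration only.
always_fail = lambda: False
always_succeed = lambda: True

selector = AdaptiveSelector([always_fail, always_succeed])
first = selector.tick()    # tries the failing child, then the succeeding one
second = selector.tick()   # the succeeding child is now tried first
```

After the two ticks, `selector.tries` is `[1, 2]`: the failing child was attempted only once because its estimate dropped to zero. A "Greedy Selector" in the abstract's sense would instead try only the first child in `order`.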