Optimization of Trajectories for Machine Learning Training in Robot Accuracy Modeling

Abstract: Recently, machine learning (ML) methods have been developed for increasing the accuracy of robot mechanisms. Complex mechanical issues such as non-linear friction, backlash, and flexibility of structure transmission elements cause accuracy errors and are hard to model. ML requires training data, and the above mechanical phenomena are highly dependent on the position of the robot in the workspace and also on its velocity, especially near zero velocity in both directions where non-linearities such as Stribeck and Coulomb friction are most pronounced. It is well known that the success of ML methods depends on the amount of training data, and it is expensive and time-consuming to collect data from physical robot motion. We therefore address the problem of searching for trajectories in the 6D space of positions and velocities which collect the most information in the least amount of time. This reduces to a special case of the traveling-salesman problem (TSP) in that the robot must be programmed to visit sampled points in the position-velocity phase space most efficiently. Two goals of this work are 1) to computationally study the difficulty of the TSP in this application by applying it to X, Y, Z motion in 3D space (6D phase space) and 2) to assess the effectiveness of an extremely simple Nearest Neighbor search algorithm compared to random sampling of the search space. Results confirm that Nearest Neighbor heuristic searching produces significantly better trajectories than random sampling in this application.

Submitted 21 June, 2024; originally announced June 2024.
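
The comparison described in the abstract above (a Nearest Neighbor ordering versus a random ordering of sampled phase-space points) can be illustrated with a minimal sketch. This is not the authors' code. It assumes points are drawn uniformly from a normalized 6D box and that Euclidean distance stands in for the travel cost between points, whereas the paper is concerned with trajectory time; the function names are hypothetical.

```python
import math
import random


def sample_phase_points(n, rng):
    # Each point is (x, y, z, vx, vy, vz); uniform sampling in [-1, 1] is an assumption.
    return [tuple(rng.uniform(-1.0, 1.0) for _ in range(6)) for _ in range(n)]


def tour_cost(points, order):
    # Cost of an open path: sum of distances between consecutively visited points.
    # Euclidean distance is an assumed stand-in for travel time.
    return sum(math.dist(points[a], points[b]) for a, b in zip(order, order[1:]))


def nearest_neighbor_order(points, start=0):
    # Greedy heuristic: from the current point, always move to the closest unvisited point.
    unvisited = set(range(len(points)))
    unvisited.remove(start)
    order = [start]
    while unvisited:
        last = points[order[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
        unvisited.remove(nxt)
        order.append(nxt)
    return order


rng = random.Random(0)
points = sample_phase_points(200, rng)

# Baseline: visit the same points in a random order.
random_order = list(range(len(points)))
rng.shuffle(random_order)

nn_order = nearest_neighbor_order(points)
random_cost = tour_cost(points, random_order)
nn_cost = tour_cost(points, nn_order)
print(f"random order: {random_cost:.1f}, nearest neighbor: {nn_cost:.1f}")
```

The greedy search costs O(n²) distance evaluations, which is cheap for a few hundred sample points; it gives no optimality guarantee, only a tour that is typically far shorter than a random one.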

arXiv:2302.04999 [pdf, other]

Ablation Study on Features in Learning-based Joints Calibration of Cable-driven Surgical Robots

Authors: Haonan Peng, Andrew Lewis, Blake Hannaford

Abstract: With worldwide implementation, millions of surgeries are assisted by surgical robots. The cable-drive mechanism on many surgical robots allows flexible, light, and compact arms and tools. However, the slack and stretch of the cables and the backlash of the gears introduce inevitable errors from motor poses to joint poses, and thus forwarded to the pose and orientation of the end-effector. In this paper, a learning-based calibration using a deep neural network is proposed, which reduces the unloaded pose RMSE of joints 1, 2, 3 to 0.3003 deg, 0.2888 deg, 0.1565 mm, and loaded pose RMSE of joints 1, 2, 3 to 0.4456 deg, 0.3052 deg, 0.1900 mm, respectively. Then, removal ablation and inaccurate ablation are performed to study which features of the DNN model contribute to the calibration accuracy. The results suggest that raw joint poses and motor torques are the most important features. For joint poses, the removal ablation shows that DNN model can derive this information from end-effector pose and orientation. For motor torques, the direction is much more important than amplitude.

Submitted 14 February, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

arXiv:2112.02608 [pdf]

Real-time Virtual Intraoperative CT for Image Guided Surgery

Authors: Yangming Li, Neeraja Konuthula, Ian M. Humphreys, Kris Moe, Blake Hannaford, Randall Bly

Abstract: Purpose: This paper presents a scheme for generating virtual intraoperative CT scans in order to improve surgical completeness in Endoscopic Sinus Surgeries (ESS). Approach: The work presents three methods, the tip motion-based, the tip trajectory-based, and the instrument-based, along with non-parametric smoothing and Gaussian Process Regression, for virtual intraoperative CT generation. Results: The proposed methods were studied and compared on ESS performed on cadavers. Surgical results show all three methods improve the Dice Similarity Coefficients > 86%, with F-score > 92% and precision > 89.91%. The tip trajectory-based method was found to have the best performance and reached 96.87% precision in surgical completeness evaluation. Conclusions: This work demonstrated that virtual intraoperative CT scans improve the consistency between the actual surgical scene and the reference model, and improve surgical completeness in ESS. Compared with actual intraoperative CT scans, the proposed scheme has no impact on existing surgical protocols, does not require extra hardware other than that already available in most ESS, overcomes the high costs, the repeated radiation, and the elongated anesthesia caused by actual intraoperative CTs, and is practical in ESS.

Submitted 5 December, 2021; originally announced December 2021.

arXiv:2112.02598 [pdf]

Real-time Informative Surgical Skill Assessment with Gaussian Process Learning

Authors: Yangming Li, Randall Bly, Sarah Akkina, Rajeev C. Saxena, Ian Humphreys, Mark Whipple, Kris Moe, Blake Hannaford

Abstract: Endoscopic Sinus and Skull Base Surgeries (ESSBSs) are challenging and potentially dangerous surgical procedures, and objective skill assessment is a key component in improving the effectiveness of surgical training, re-validating surgeons' skills, and decreasing surgical trauma and the complication rate in operating rooms. Because of the complexity of surgical procedures, the variation of operation styles, and the fast development of new surgical skills, surgical skill assessment remains a challenging problem. This work presents a novel Gaussian Process Learning-based heuristic automatic objective surgical skill assessment method for ESSBSs. Unlike classical surgical skill assessment algorithms, the proposed method 1) utilizes the kinematic features in surgical instrument relative movements, instead of using specific surgical tasks or the statistics, to assess skills in real-time; 2) provides informative feedback, instead of a summative score; 3) has the ability to incrementally learn from new data, instead of depending on a fixed dataset. The proposed method projects the instrument movements into the endoscope coordinates to reduce the data dimensionality. It then extracts the kinematic features of the projected data and learns the relationship between surgical skill levels and the features with the Gaussian Process learning technique. The proposed method was verified in full endoscopic skull base and sinus surgeries on cadavers. These surgeries have different pathologies, require different treatments, and have different complexities. The experimental results show that the proposed method reaches 100% prediction precision for complete surgical procedures and 90% precision for real-time prediction assessment.

Submitted 5 December, 2021; originally announced December 2021.

arXiv:2108.03534 [pdf, other]

doi 10.1016/j.media.2024.103246

Reducing Annotating Load: Active Learning with Synthetic Images in Surgical Instrument Segmentation

Authors: Haonan Peng, Shan Lin, Daniel King, Yun-Hsuan Su, Randall A. Bly, Kris S. Moe, Blake Hannaford

Abstract: Accurate instrument segmentation in endoscopic vision of robot-assisted surgery is challenging due to reflection on the instruments and frequent contacts with tissue. Deep neural networks (DNN) show competitive performance and have been favored in recent years. However, the hunger of DNN for labeled data poses a huge workload of annotation. Motivated by alleviating this workload, we propose a general embeddable method to decrease the usage of labeled real images, using actively generated synthetic images. In each active learning iteration, the most informative unlabeled images are first queried by active learning and then labeled. Next, synthetic images are generated based on these selected images. The instruments and backgrounds are cropped out and randomly combined with each other with blending and fusion near the boundary. The effectiveness of the proposed method is validated on 2 sinus surgery datasets and 1 intraabdominal surgery dataset. The results indicate a considerable improvement in performance, especially when the budget for annotation is small. The effectiveness of different types of synthetic images, blending methods, and external background are also studied. All the code is open-sourced at: https://github.com/HaonanPeng/active_syn_generator.

Submitted 7 August, 2021; originally announced August 2021.

arXiv:2107.05748 [pdf, other]

Evaluation of an Inflated Beam Model Applied to Everted Tubes

Authors: Joel Hwee, Andrew Lewis, Allison Raines, Blake Hannaford

Abstract: Everted tubes have often been modeled as inflated beams to determine transverse and axial buckling conditions. This paper seeks to validate the assumption that an everted tube can be modeled in this way. The tip deflections of everted and uneverted beams under transverse cantilever loads are compared with a tip deflection model that was first developed for aerospace applications. LDPE and silicone coated nylon beams were tested; everted and uneverted beams showed similar tip deflection. The literature model best fit the tip deflection of LDPE tubes with an average tip deflection error of 6 mm, while the nylon tubes had an average tip deflection error of 16.4 mm. Everted beams of both materials buckled at 83% of the theoretical buckling condition while straight beams collapsed at 109% of the theoretical buckling condition. The curvature of everted beams was estimated from a tip load and a known displacement showing relative errors of 14.2% and 17.3% for LDPE and nylon beams respectively. This paper shows a numerical method for determining inflated beam deflection. It also provides an iterative method for computing static tip pose and applied wall forces in a known environment.

Submitted 12 July, 2021; originally announced July 2021.

arXiv:2011.08752 [pdf, other]

doi 10.1109/LRA.2021.3096156

Multi-frame Feature Aggregation for Real-time Instrument Segmentation in Endoscopic Video

Authors: Shan Lin, Fangbo Qin, Haonan Peng, Randall A. Bly, Kris S. Moe, Blake Hannaford

Abstract: Deep learning-based methods have achieved promising results on surgical instrument segmentation. However, the high computation cost may limit the application of deep models to time-sensitive tasks such as online surgical video analysis for robotic-assisted surgery. Moreover, current methods may still suffer from challenging conditions in surgical images such as various lighting conditions and the presence of blood. We propose a novel Multi-frame Feature Aggregation (MFFA) module to aggregate video frame features temporally and spatially in a recurrent mode. By distributing the computation load of deep feature extraction over sequential frames, we can use a lightweight encoder to reduce the computation costs at each time step. Moreover, public surgical videos usually are not labeled frame by frame, so we develop a method that can randomly synthesize a surgical frame sequence from a single labeled frame to assist network training. We demonstrate that our approach achieves superior performance to corresponding deeper segmentation models on two public surgery datasets.

Submitted 25 July, 2021; v1 submitted 17 November, 2020; originally announced November 2020.

Comments: Published in IEEE Robotics and Automation Letters (Early Access)

arXiv:2003.04949 [pdf, other]

LC-GAN: Image-to-image Translation Based on Generative Adversarial Network for Endoscopic Images

Authors: Shan Lin, Fangbo Qin, Yangming Li, Randall A. Bly, Kris S. Moe, Blake Hannaford

Abstract: Intelligent vision is appealing in computer-assisted and robotic surgeries. Vision-based analysis with deep learning usually requires large labeled datasets, but manual data labeling is expensive and time-consuming in medical problems. We investigate a novel cross-domain strategy to reduce the need for manual data labeling by proposing an image-to-image translation model live-cadaver GAN (LC-GAN) based on generative adversarial networks (GANs). We consider a situation when a labeled cadaveric surgery dataset is available while the task is instrument segmentation on an unlabeled live surgery dataset. We train LC-GAN to learn the mappings between the cadaveric and live images. For live image segmentation, we first translate the live images to fake-cadaveric images with LC-GAN and then perform segmentation on the fake-cadaveric images with models trained on the real cadaveric dataset. The proposed method fully makes use of the labeled cadaveric dataset for live image segmentation without the need to label the live dataset. LC-GAN has two generators with different architectures that leverage the deep feature representation learned from the cadaveric image based segmentation task. Moreover, we propose the structural similarity loss and segmentation consistency loss to improve the semantic consistency during translation. Our model achieves better image-to-image translation and leads to improved segmentation performance in the proposed cross-domain segmentation task.

Submitted 13 August, 2020; v1 submitted 10 March, 2020; originally announced March 2020.

Comments: Accepted by 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

arXiv:2002.10675 [pdf]

doi 10.1109/LRA.2020.3009073

Towards Better Surgical Instrument Segmentation in Endoscopic Vision: Multi-Angle Feature Aggregation and Contour Supervision

Authors: Fangbo Qin, Shan Lin, Yangming Li, Randall A. Bly, Kris S. Moe, Blake Hannaford

Abstract: Accurate and real-time surgical instrument segmentation is important in the endoscopic vision of robot-assisted surgery, and significant challenges are posed by frequent instrument-tissue contacts and continuous change of observation perspective. For these challenging tasks more and more deep neural networks (DNN) models are designed in recent years. We are motivated to propose a general embeddable approach to improve these current DNN segmentation models without increasing the model parameter number. Firstly, observing the limited rotation-invariance performance of DNN, we proposed the Multi-Angle Feature Aggregation (MAFA) method, leveraging active image rotation to gain richer visual cues and make the prediction more robust to instrument orientation changes. Secondly, in the end-to-end training stage, the auxiliary contour supervision is utilized to guide the model to learn the boundary awareness, so that the contour shape of segmentation mask is more precise. The proposed method is validated with ablation experiments on the novel Sinus-Surgery datasets collected from surgeons' operations, and is compared to the existing methods on a public dataset collected with a da Vinci Xi Robot.

Submitted 10 August, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

Comments: Accepted by IEEE Robotics and Automation Letters

arXiv:1910.06425 [pdf, other]

Real-time Data Driven Precision Estimator for RAVEN-II Surgical Robot End Effector Position

Authors: Haonan Peng, Xingjian Yang, Yun-Hsuan Su, Blake Hannaford

Abstract: Surgical robots have been introduced to operating rooms over the past few decades due to their high sensitivity, small size, and remote controllability. The cable-driven nature of many surgical robots allows the systems to be dexterous and lightweight, with diameters as low as 5mm. However, due to the slack and stretch of the cables and the backlash of the gears, inevitable uncertainties are brought into the kinematics calculation. Since the reported end effector position of surgical robots like RAVEN-II is directly calculated using the motor encoder measurements and forward kinematics, it may contain relatively large error up to 10mm, whereas semi-autonomous functions being introduced into abdominal surgeries require position inaccuracy of at most 1mm. To resolve the problem, a cost-effective, real-time and data-driven pipeline for robot end effector position precision estimation is proposed and tested on RAVEN-II. Analysis shows an improved end effector position error of around 1mm RMS traversing through the entire robot workspace without high-resolution motion tracker.

Submitted 14 October, 2019; originally announced October 2019.

Comments: 6 pages, 10 figures, ICRA2020(under review)

arXiv:1907.10029 [pdf, other]

Hidden Markov Models derived from Behavior Trees

Authors: Blake Hannaford

Abstract: Behavior trees are rapidly attracting interest in robotics and human task-related motion tracking. However no algorithms currently exist to track or identify parameters of BTs under noisy observations. We report a new relationship between BTs, augmented with statistical information, and Hidden Markov Models. Exploiting this relationship will allow application of many algorithms for HMMs (and dynamic Bayesian networks) to data acquired from BT-based systems.

Submitted 23 July, 2019; originally announced July 2019.

Comments: Submitted to IEEE Transactions on Robotics and Automation, 23-Jul-2019

arXiv:1906.11747 [pdf, other]

Raven: Open Surgical Robotic Platforms

Authors: Yangming Li, Blake Hannaford, Jacob Rosen

Abstract: The Raven I and the Raven II surgical robots, as open research platforms, have been serving the robotic surgery research community for ten years. The paper 1) briefly presents the Raven I and the Raven II robots, 2) reviews the recent publications that are built upon the Raven robots, aim to be applied to the Raven robots, or are directly compared with the Raven robots, and 3) uses the Raven robots as a case study to discuss the popular research problems in the research community and the trend of robotic surgery study. Instead of being a thorough literature review, this work only reviews the works formally published in the past three years and uses these recent publications to analyze the research interests, the popular open research problems, and opportunities in the topic of robotic surgery.

Submitted 27 June, 2019; originally announced June 2019.

Journal ref: published by Acta Polytechnica Hungarica 2019

arXiv:1808.08954 [pdf, other]

Behavior Trees as a Representation for Medical Procedures

Authors: Blake Hannaford, Randall Bly, Ian Humphreys, Mark Whipple

Abstract: Objective: Effective collaboration between machines and clinicians requires flexible data structures to represent medical processes and clinical practice guidelines. Such a data structure could enable effective turn-taking between human and automated components of a complex treatment, accurate on-line monitoring of clinical treatments (for example to detect medical errors), or automated treatment systems (such as future medical robots) whose overall treatment plan is understandable and auditable by human experts. Materials and Methods: Behavior trees (BTs) emerged from video game development as a graphical language for modeling intelligent agent behavior. BTs have several properties which are attractive for modeling medical procedures including human-readability, authoring tools, and composability. Results: This paper will illustrate construction of BTs for exemplary medical procedures and clinical protocols. Discussion and Conclusion: Behavior Trees thus form a useful, and human authorable/readable bridge between clinical practice guidelines and AI systems.

Submitted 27 August, 2018; originally announced August 2018.

Comments: We are pleased to acknowledge support from National Science Foundation grant #IIS-1637444. arXiv admin note: text overlap with arXiv:1801.07864

arXiv:1801.07864 [pdf, other]

Behavior Trees as a Representation for Medical Procedures

Authors: Blake Hannaford

Abstract: Behavior trees (BTs) emerged from video game development as a graphical language for modeling intelligent agent behavior. BTs have several properties which are attractive for modeling medical procedures including human-readability, authoring tools, and composability. This paper will illustrate construction of BTs for exemplary medical procedures. We are pleased to acknowledge support from National Science Foundation grant #IIS-1637444 and collaborations on that project with Johns Hopkins University and Worcester Polytechnic Institute.

Submitted 24 January, 2018; originally announced January 2018.

Comments: 8 pages, 3 figures, 22 references

arXiv:1711.05412 [pdf, ps, other]

IKBT: solving closed-form Inverse Kinematics with Behavior Tree

Authors: Dianmu Zhang, Blake Hannaford

Abstract: Serial robot arms have complicated kinematic equations which must be solved to write effective arm planning and control software (the Inverse Kinematics Problem). Existing software packages for inverse kinematics often rely on numerical methods which have significant shortcomings. Here we report a new symbolic inverse kinematics solver which overcomes the limitations of numerical methods, and the shortcomings of previous symbolic software packages. We integrate Behavior Trees, an execution planning framework previously used for controlling intelligent robot behavior, to organize the equation solving process, and a modular architecture for each solution technique. The system successfully solved, generated a LaTeX report, and generated a Python code template for 18 out of 19 example robots of 4-6 DOF. The system is readily extensible, maintainable, and multi-platform with few dependencies. The complete package is available with a Modified BSD license on GitHub.

Submitted 7 December, 2017; v1 submitted 15 November, 2017; originally announced November 2017.

Comments: 14 pages, 6 figures

arXiv:1606.09219 [pdf, other]

Simulation Results on Selector Adaptation in Behavior Trees

Authors: Blake Hannaford, Danying Hu, Dianmu Zhang, Yangming Li

Abstract: Behavior trees (BTs) emerged from video game development as a graphical language for modeling intelligent agent behavior. However, as initially implemented, behavior trees are static plans. This paper adds to recent literature exploring the ability of BTs to adapt to their success or failure in achieving tasks. The "Selector" node of a BT tries alternative strategies (its children) and returns failure only if all of its children return failure. This paper studies several means by which Selector nodes can learn from experience, in particular, learn conditional probabilities of success based on sensor information, and modify the execution order based on the learned information. Furthermore, a "Greedy Selector" is studied which only tries the child having the highest success probability. Simulation results indicate significantly increased task performance, especially when the frequentist probability estimate is conditioned on sensor information. The Greedy Selector was ineffective unless it was preceded by a period of training in which all children were exercised.

Submitted 30 June, 2016; v1 submitted 29 June, 2016; originally announced June 2016.
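
The Selector behavior described in the abstract above can be sketched as follows. This is an illustrative sketch, not the paper's simulation code: the class and function names, the `grasp` and `push` leaves, and the default estimate of 0.5 for an untried child are all assumptions, and the estimate here is unconditioned, whereas the paper also conditions it on sensor information.

```python
class Leaf:
    """A leaf action that records its own success statistics."""

    def __init__(self, name, action):
        self.name = name
        self.action = action  # callable returning True (success) or False (failure)
        self.attempts = 0
        self.successes = 0

    def tick(self):
        ok = bool(self.action())
        self.attempts += 1
        self.successes += ok
        return ok

    def p_success(self):
        # Frequentist estimate; 0.5 for an untried child is an assumed default.
        return self.successes / self.attempts if self.attempts else 0.5


def selector(children):
    # Standard Selector: succeed on the first child that succeeds, fail only if all fail.
    for child in children:
        if child.tick():
            return True
    return False


def adaptive_selector(children):
    # Try children in order of estimated success probability (stable sort keeps ties in order).
    return selector(sorted(children, key=Leaf.p_success, reverse=True))


def greedy_selector(children):
    # Tick only the child with the highest estimated success probability.
    return max(children, key=Leaf.p_success).tick()


grasp = Leaf("grasp", lambda: False)  # hypothetical strategy that always fails
push = Leaf("push", lambda: True)     # hypothetical strategy that always succeeds
results = [adaptive_selector([grasp, push]) for _ in range(3)]
print(results, grasp.attempts, push.attempts)
```

After the first tick the failing child drops to the back of the order and is no longer exercised. Note that `greedy_selector` relies entirely on the estimates it is given, which is consistent with the abstract's observation that the Greedy Selector needs a training period in which all children are tried.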

Showing 1–16 of 16 results for author: Hannaford, B