doi 10.1016/j.cmpb.2019.105004

3D Reconstruction and Alignment by Consumer RGB-D Sensors and Fiducial Planar Markers for Patient Positioning in Radiation Therapy

Authors: Hamid Sarmadi, Rafael Muñoz-Salinas, M. Álvaro Berbís, Antonio Luna, Rafael Medina-Carnicer

Abstract: BACKGROUND AND OBJECTIVE: Patient positioning is a crucial step in radiation therapy, for which non-invasive methods have been developed based on surface reconstruction using optical 3D imaging. However, most solutions need expensive specialized hardware and a careful calibration procedure that must be repeated over time. This paper proposes a fast and cheap patient positioning method based on inexpensive consumer-level RGB-D sensors. METHODS: The proposed method relies on a 3D reconstruction approach that fuses, in real time, artificial and natural visual landmarks recorded from a hand-held RGB-D sensor. The video sequence is transformed into a set of keyframes with known poses, which are later refined to obtain a realistic 3D reconstruction of the patient. The use of artificial landmarks allows our method to automatically align the reconstruction to a reference one, without the need to calibrate the system with respect to the linear accelerator coordinate system. RESULTS: The experiments conducted show that our method obtains a median of 1 cm in translational error and 1 degree in rotational error with respect to the reference pose. Additionally, the proposed method shows as visual output overlaid poses (from the reference and the current scene) and an error map that can be used to correct the patient's current pose to match the reference pose. CONCLUSIONS: A novel approach to obtain 3D body reconstructions for patient positioning without requiring expensive hardware or dedicated graphics cards is proposed. The method can be used to align in real time the patient's current pose to a preview pose, which is a relevant step in radiation therapy.

Submitted 22 March, 2021; originally announced March 2021.

Comments: 11 pages, 5 figures

arXiv:2103.09141

doi 10.1109/ACCESS.2019.2896648

Simultaneous Multi-View Camera Pose Estimation and Object Tracking with Square Planar Markers

Authors: Hamid Sarmadi, Rafael Muñoz-Salinas, M. A. Berbís, R. Medina-Carnicer

Abstract: Object tracking is a key aspect in many applications such as augmented reality in medicine (e.g. tracking a surgical instrument) or robotics. Squared planar markers have become popular tools for tracking since their pose can be estimated from their four corners. While using a single marker and a single camera limits the working area considerably, using multiple markers attached to an object requires estimating their relative position, which is not trivial, for high-accuracy tracking. Likewise, using multiple cameras requires estimating their extrinsic parameters, also a tedious process that must be repeated whenever a camera is moved. This work proposes a novel method to simultaneously solve the above-mentioned problems. From a video sequence showing a rigid set of planar markers recorded from multiple cameras, the proposed method is able to automatically obtain the three-dimensional configuration of the markers, the extrinsic parameters of the cameras, and the relative pose between the markers and the cameras at each frame. Our experiments show that our approach can obtain highly accurate results for estimating these parameters using low-resolution cameras. Once the parameters are obtained, tracking of the object can be done in real time with a low computational cost. The proposed method is a step forward in the development of cost-effective solutions for object tracking.

Submitted 16 March, 2021; originally announced March 2021.

Comments: Some errors in the IEEE Access version (regarding object's rotational accuracy, and the definition of Equation 14) have been corrected in this version. IEEE Access paper: https://doi.org/10.1109/ACCESS.2019.2896648

arXiv:2012.06516

doi 10.1109/ACCESS.2021.3058423

Detection of Binary Square Fiducial Markers Using an Event Camera

Authors: Hamid Sarmadi, Rafael Muñoz-Salinas, Miguel A. Olivares-Mendez, Rafael Medina-Carnicer

Abstract: Event cameras are a new type of image sensor that outputs changes in light intensity (events) instead of absolute intensity values. They have a very high temporal resolution and a high dynamic range. In this paper, we propose a method to detect and decode binary square markers using an event camera. We detect the edges of the markers by detecting line segments in an image created from events in the current packet. The line segments are combined to form marker candidates. The bit value of marker cells is decoded using the events on their borders. To the best of our knowledge, no other approach exists for detecting square binary markers directly from an event camera using only the CPU in real time. Experimental results show that the performance of our proposal is far superior to that of the RGB ArUco marker detector. The proposed method can achieve real-time performance on a single CPU thread.

Submitted 15 March, 2021; v1 submitted 11 December, 2020; originally announced December 2020.

Comments: An error in the abstract of the IEEE Access version has been corrected in this version. Link to the IEEE Access paper: https://doi.org/10.1109/ACCESS.2021.3058423

arXiv:2010.01895

doi 10.1016/j.cmpb.2021.106296

Joint Scene and Object Tracking for Cost-Effective Augmented Reality Assisted Patient Positioning in Radiation Therapy

Authors: Hamid Sarmadi, Rafael Muñoz-Salinas, M. Álvaro Berbís, Antonio Luna, R. Medina-Carnicer

Abstract: BACKGROUND AND OBJECTIVE: The research done in the field of Augmented Reality (AR) for patient positioning in radiation therapy is scarce. We propose an efficient and cost-effective algorithm for tracking the scene and the patient to interactively assist the patient's positioning process by providing visual feedback to the operator. To our knowledge, this is the first framework that can be employed for mobile interactive AR to guide patient positioning. METHODS: We propose a point cloud processing method that, combined with a fiducial marker-mapper algorithm and the generalized ICP algorithm, tracks the patient and the camera precisely and efficiently using only the CPU. The alignment between the 3D reference model and body marker map is calculated employing an efficient body reconstruction algorithm. RESULTS: Our quantitative evaluation shows that the proposed method achieves a translational and rotational error of 4.17 mm/0.82 deg at 9 fps. Furthermore, the qualitative results demonstrate the usefulness of our algorithm in patient positioning on different human subjects. CONCLUSION: Since our algorithm achieves a relatively high frame rate and accuracy employing a regular laptop (without the use of a dedicated GPU), it is a very cost-effective AR-based patient positioning method. It also opens the way for other researchers by introducing a framework that could be improved upon for better mobile interactive AR patient positioning solutions in the future.

Submitted 11 January, 2021; v1 submitted 5 October, 2020; originally announced October 2020.

Comments: 16 pages, 7 figures
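
Illustrative note on the error figures quoted in the abstract above: the listing does not say how the translational and rotational errors were computed, so the following is only a minimal sketch of one common convention (Euclidean distance between translations, and the angle of the relative rotation between two rotation matrices). The function name `pose_error` and the example poses are hypothetical and are not taken from the paper.

```python
import math


def pose_error(R_est, t_est, R_ref, t_ref):
    """Return (translational error, rotational error in degrees).

    Assumed convention (not confirmed by the paper):
    - translation error is the Euclidean distance between t_est and t_ref;
    - rotation error is the angle of the relative rotation R_ref^T * R_est.
    Rotations are 3x3 nested lists, translations are length-3 lists.
    """
    trans_err = math.sqrt(sum((a - b) ** 2 for a, b in zip(t_est, t_ref)))

    # trace(R_ref^T * R_est) equals the sum of element-wise products.
    trace = sum(R_ref[i][j] * R_est[i][j] for i in range(3) for j in range(3))

    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    rot_err_deg = math.degrees(math.acos(cos_angle))
    return trans_err, rot_err_deg


# Hypothetical example: estimate is rotated 90 degrees about z and shifted.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rot_z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
trans_err, rot_err_deg = pose_error(rot_z_90, [3.0, 4.0, 0.0],
                                    identity, [0.0, 0.0, 0.0])
print(trans_err, rot_err_deg)
```

The units of the translational error follow whatever units the translation vectors use; the abstract reports millimetres.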

arXiv:1902.03729

UcoSLAM: Simultaneous Localization and Mapping by Fusion of KeyPoints and Squared Planar Markers

Authors: Rafael Munoz-Salinas, Rafael Medina-Carnicer

Abstract: This paper proposes a novel approach for Simultaneous Localization and Mapping by fusing natural and artificial landmarks. Most SLAM approaches use natural landmarks (such as keypoints). However, they are unstable over time, repetitive in many cases or insufficient for robust tracking (e.g. in indoor buildings). On the other hand, other approaches have employed artificial landmarks (such as squared fiducial markers) placed in the environment to help tracking and relocalization. We propose a method that integrates both approaches in order to achieve long-term robust tracking in many scenarios. Our method has been compared to the state-of-the-art methods ORB-SLAM2 and LDSO on the public datasets Kitti, Euroc-MAV, TUM and SPM, obtaining better precision, robustness and speed. Our tests also show that the combination of markers and keypoints achieves better accuracy than each one of them independently.

Submitted 10 February, 2019; originally announced February 2019.

Comments: Paper submitted to Pattern Recognition

arXiv:1807.05389

3D human pose estimation from depth maps using a deep combination of poses

Authors: Manuel J. Marin-Jimenez, Francisco J. Romero-Ramirez, Rafael Muñoz-Salinas, Rafael Medina-Carnicer

Abstract: Many real-world applications require the estimation of human body joints for higher-level tasks such as human behaviour understanding. In recent years, depth sensors have become a popular approach to obtain three-dimensional information. The depth maps generated by these sensors provide information that can be employed to disambiguate the poses observed in two-dimensional images. This work addresses the problem of 3D human pose estimation from depth maps employing a Deep Learning approach. We propose a model, named Deep Depth Pose (DDP), which receives a depth map containing a person and a set of predefined 3D prototype poses and returns the 3D position of the body joints of the person. In particular, DDP is defined as a ConvNet that computes the specific weights needed to linearly combine the prototypes for the given input. We have thoroughly evaluated DDP on the challenging 'ITOP' and 'UBC3V' datasets, which respectively depict realistic and synthetic samples, defining a new state-of-the-art on them.

Submitted 14 July, 2018; originally announced July 2018.

Comments: Accepted for publication at "Journal of Visual Communication and Image Representation"
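
Illustrative note on the DDP abstract above: the abstract states that the output pose is a linear combination of predefined prototype poses, with weights computed by a ConvNet. The sketch below shows only that final combination step, assuming the weights are already available; the function name `combine_prototypes`, the toy two-joint prototypes, and the hard-coded weights are hypothetical and are not taken from the paper.

```python
def combine_prototypes(weights, prototypes):
    """Linearly combine K prototype poses into one 3D pose.

    weights:    list of K floats (in DDP these would come from the ConvNet;
                here they are simply given, which is an assumption).
    prototypes: list of K poses, each a list of J joints, each joint [x, y, z].
    Returns a list of J joints, each [x, y, z].
    """
    if len(weights) != len(prototypes):
        raise ValueError("need exactly one weight per prototype pose")

    n_joints = len(prototypes[0])
    pose = [[0.0, 0.0, 0.0] for _ in range(n_joints)]
    for weight, prototype in zip(weights, prototypes):
        for j, joint in enumerate(prototype):
            for axis in range(3):
                pose[j][axis] += weight * joint[axis]
    return pose


# Toy example: two prototype poses with two joints each (made-up values).
prototypes = [
    [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[1.0, 0.0, 0.0], [1.0, 1.0, 2.0]],
]
weights = [0.25, 0.75]
pose = combine_prototypes(weights, prototypes)
print(pose)  # [[0.75, 0.0, 0.0], [0.75, 1.0, 1.5]]
```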

arXiv:1606.00151

Mapping and Localization from Planar Markers

Authors: Rafael Muñoz-Salinas, Manuel J. Marín-Jimenez, Enrique Yeguas-Bolivar, Rafael Medina-Carnicer

Abstract: Squared planar markers are a popular tool for fast, accurate and robust camera localization, but their use is frequently limited to a single marker, or at most, to a small set of them for which their relative pose is known beforehand. Mapping and localization from a large set of planar markers is still a scarcely treated problem in favour of keypoint-based approaches. However, while keypoint detectors are not robust to rapid motion, large changes in viewpoint, or significant changes in appearance, fiducial markers can be robustly detected under a wider range of conditions. This paper proposes a novel method to simultaneously solve the problems of mapping and localization from a set of squared planar markers. First, a quiver of pairwise relative marker poses is created, from which an initial pose graph is obtained. The pose graph may contain small pairwise pose errors that, when propagated, lead to large errors. Thus, we distribute the rotational and translational error along the basis cycles of the graph so as to obtain a corrected pose graph. Finally, we perform a global pose optimization by minimizing the reprojection errors of the planar markers in all observed frames. The experiments conducted show that our method performs better than Structure from Motion and visual SLAM techniques.

Submitted 25 January, 2017; v1 submitted 1 June, 2016; originally announced June 2016.

Comments: Paper submitted to journal. Code available. See webpage http://www.uco.es/investiga/grupos/ava/node/57/

arXiv:1403.6950

Pyramidal Fisher Motion for Multiview Gait Recognition

Authors: F. M. Castro, M. J. Marin-Jimenez, R. Medina-Carnicer

Abstract: The goal of this paper is to identify individuals by analyzing their gait. Instead of using binary silhouettes as input data (as done in many previous works) we propose and evaluate the use of motion descriptors based on densely sampled short-term trajectories. We take advantage of state-of-the-art people detectors to define custom spatial configurations of the descriptors around the target person, thus obtaining a pyramidal representation of the gait motion. The local motion features (described by the Divergence-Curl-Shear descriptor) extracted on the different spatial areas of the person are combined into a single high-level gait descriptor by using the Fisher Vector encoding. The proposed approach, coined Pyramidal Fisher Motion, is experimentally validated on the recent 'AVA Multiview Gait' dataset. The results show that this new approach achieves promising results in the problem of gait recognition.

Submitted 27 March, 2014; originally announced March 2014.

Comments: Submitted to International Conference on Pattern Recognition, ICPR, 2014

Showing 1–8 of 8 results for author: Medina-Carnicer, R