Search | arXiv e-print repository

An Android Robot Head as Embodied Conversational Agent

Authors: Marcel Heisler, Christian Becker-Asano

Abstract: This paper describes, how current Machine Learning (ML) techniques combined with simple rule-based animation routines make an android robot head an embodied conversational agent with ChatGPT as its core component. The android robot head is described, technical details are given of how lip-sync animation is being achieved, and general software design decisions are presented. A public presentation o… ▽ More This paper describes, how current Machine Learning (ML) techniques combined with simple rule-based animation routines make an android robot head an embodied conversational agent with ChatGPT as its core component. The android robot head is described, technical details are given of how lip-sync animation is being achieved, and general software design decisions are presented. A public presentation of the system revealed improvement opportunities that are reported and that lead our iterative implementation approach. △ Less

Submitted 18 May, 2023; originally announced May 2023.

Journal ref: ISR Europe 2023, 56th International Symposium on Robotics, Stuttgart, Germany, 2023, pp. 93-99

arXiv:2208.07407 [pdf, other]

SemAug: Semantically Meaningful Image Augmentations for Object Detection Through Language Grounding

Authors: Morgan Heisler, Amin Banitalebi-Dehkordi, Yong Zhang

Abstract: Data augmentation is an essential technique in improving the generalization of deep neural networks. The majority of existing image-domain augmentations either rely on geometric and structural transformations, or apply different kinds of photometric distortions. In this paper, we propose an effective technique for image augmentation by injecting contextually meaningful knowledge into the scenes. O… ▽ More Data augmentation is an essential technique in improving the generalization of deep neural networks. The majority of existing image-domain augmentations either rely on geometric and structural transformations, or apply different kinds of photometric distortions. In this paper, we propose an effective technique for image augmentation by injecting contextually meaningful knowledge into the scenes. Our method of semantically meaningful image augmentation for object detection via language grounding, SemAug, starts by calculating semantically appropriate new objects that can be placed into relevant locations in the image (the what and where problems). Then it embeds these objects into their relevant target locations, thereby promoting diversity of object instance distribution. Our method allows for introducing new object instances and categories that may not even exist in the training set. Furthermore, it does not require the additional overhead of training a context network, so it can be easily added to existing architectures. Our comprehensive set of evaluations showed that the proposed method is very effective in improving the generalization, while the overhead is negligible. In particular, for a wide range of model architectures, our method achieved ~2-4% and ~1-2% mAP improvements for the task of object detection on the Pascal VOC and COCO datasets, respectively. △ Less

Submitted 15 August, 2022; originally announced August 2022.

arXiv:2003.09033 [pdf]

doi 10.1167/tvst.9.2.38

Microvasculature Segmentation and Inter-capillary Area Quantification of the Deep Vascular Complex using Transfer Learning

Authors: Julian Lo, Morgan Heisler, Vinicius Vanzan, Sonja Karst, Ivana Zadro Matovinovic, Sven Loncaric, Eduardo V. Navajas, Mirza Faisal Beg, Marinko V. Sarunic

Abstract: Purpose: Optical Coherence Tomography Angiography (OCT-A) permits visualization of the changes to the retinal circulation due to diabetic retinopathy (DR), a microvascular complication of diabetes. We demonstrate accurate segmentation of the vascular morphology for the superficial capillary plexus and deep vascular complex (SCP and DVC) using a convolutional neural network (CNN) for quantitative a… ▽ More Purpose: Optical Coherence Tomography Angiography (OCT-A) permits visualization of the changes to the retinal circulation due to diabetic retinopathy (DR), a microvascular complication of diabetes. We demonstrate accurate segmentation of the vascular morphology for the superficial capillary plexus and deep vascular complex (SCP and DVC) using a convolutional neural network (CNN) for quantitative analysis. Methods: Retinal OCT-A with a 6x6mm field of view (FOV) were acquired using a Zeiss PlexElite. Multiple-volume acquisition and averaging enhanced the vessel network contrast used for training the CNN. We used transfer learning from a CNN trained on 76 images from smaller FOVs of the SCP acquired using different OCT systems. Quantitative analysis of perfusion was performed on the automated vessel segmentations in representative patients with DR. Results: The automated segmentations of the OCT-A images maintained the hierarchical branching and lobular morphologies of the SCP and DVC, respectively. The network segmented the SCP with an accuracy of 0.8599, and a Dice index of 0.8618. For the DVC, the accuracy was 0.7986, and the Dice index was 0.8139. The inter-rater comparisons for the SCP had an accuracy and Dice index of 0.8300 and 0.6700, respectively, and 0.6874 and 0.7416 for the DVC. Conclusions: Transfer learning reduces the amount of manually-annotated images required, while producing high quality automatic segmentations of the SCP and DVC. Using high quality training data preserves the characteristic appearance of the capillary networks in each layer. Translational Relevance: Accurate retinal microvasculature segmentation with the CNN results in improved perfusion analysis in diabetic retinopathy. △ Less

Submitted 19 March, 2020; originally announced March 2020.

Comments: 27 pages, 8 figures

arXiv:1912.03418 [pdf, other]

Cascaded Deep Neural Networks for Retinal Layer Segmentation of Optical Coherence Tomography with Fluid Presence

Authors: Donghuan Lu, Morgan Heisler, Da Ma, Setareh Dabiri, Sieun Lee, Gavin Weiguang Ding, Marinko V. Sarunic, Mirza Faisal Beg

Abstract: Optical coherence tomography (OCT) is a non-invasive imaging technology which can provide micrometer-resolution cross-sectional images of the inner structures of the eye. It is widely used for the diagnosis of ophthalmic diseases with retinal alteration, such as layer deformation and fluid accumulation. In this paper, a novel framework was proposed to segment retinal layers with fluid presence. Th… ▽ More Optical coherence tomography (OCT) is a non-invasive imaging technology which can provide micrometer-resolution cross-sectional images of the inner structures of the eye. It is widely used for the diagnosis of ophthalmic diseases with retinal alteration, such as layer deformation and fluid accumulation. In this paper, a novel framework was proposed to segment retinal layers with fluid presence. The main contribution of this study is two folds: 1) we developed a cascaded network framework to incorporate the prior structural knowledge; 2) we proposed a novel deep neural network based on U-Net and fully convolutional network, termed LF-UNet. Cross validation experiments proved that the proposed LF-UNet has superior performance comparing with the state-of-the-art methods, and incorporating the relative distance map structural prior information could further improve the performance regardless the network. △ Less

Submitted 6 December, 2019; originally announced December 2019.

arXiv:1909.11289 [pdf]

Deep learning vessel segmentation and quantification of the foveal avascular zone using commercial and prototype OCT-A platforms

Authors: Morgan Heisler, Forson Chan, Zaid Mammo, Chandrakumar Balaratnasingam, Pavle Prentasic, Gavin Docherty, Myeong** Ju, Sanjeeva Rajapakse, Sieun Lee, Andrew Merkur, Andrew Kirker, David Albiani, David Maberley, K. Bailey Freund, Mirza Faisal Beg, Sven Loncaric, Marinko V. Sarunic, Eduardo V. Navajas

Abstract: Automatic quantification of perifoveal vessel densities in optical coherence tomography angiography (OCT-A) images face challenges such as variable intra- and inter-image signal to noise ratios, projection artefacts from outer vasculature layers, and motion artefacts. This study demonstrates the utility of deep neural networks for automatic quantification of foveal avascular zone (FAZ) parameters… ▽ More Automatic quantification of perifoveal vessel densities in optical coherence tomography angiography (OCT-A) images face challenges such as variable intra- and inter-image signal to noise ratios, projection artefacts from outer vasculature layers, and motion artefacts. This study demonstrates the utility of deep neural networks for automatic quantification of foveal avascular zone (FAZ) parameters and perifoveal vessel density of OCT-A images in healthy and diabetic eyes. OCT-A images of the foveal region were acquired using three OCT-A systems: a 1060nm Swept Source (SS)-OCT prototype, RTVue XR Avanti (Optovue Inc., Fremont, CA), and the ZEISS Angioplex (Carl Zeiss Meditec, Dublin, CA). Automated segmentation was then performed using a deep neural network. Four FAZ morphometric parameters (area, min/max diameter, and eccentricity) and perifoveal vessel density were used as outcome measures. The accuracy, sensitivity and specificity of the DNN vessel segmentations were comparable across all three device platforms. No significant difference between the means of the measurements from automated and manual segmentations were found for any of the outcome measures on any system. The intraclass correlation coefficient (ICC) was also good (> 0.51) for all measurements. Automated deep learning vessel segmentation of OCT-A may be suitable for both commercial and research purposes for better quantification of the retinal circulation. △ Less

Submitted 25 September, 2019; originally announced September 2019.

Comments: 22 pages, 5 figures

arXiv:1710.04778 [pdf, other]

Retinal Fluid Segmentation and Detection in Optical Coherence Tomography Images using Fully Convolutional Neural Network

Authors: Donghuan Lu, Morgan Heisler, Sieun Lee, Gavin Ding, Marinko V. Sarunic, Mirza Faisal Beg

Abstract: As a non-invasive imaging modality, optical coherence tomography (OCT) can provide micrometer-resolution 3D images of retinal structures. Therefore it is commonly used in the diagnosis of retinal diseases associated with edema in and under the retinal layers. In this paper, a new framework is proposed for the task of fluid segmentation and detection in retinal OCT images. Based on the raw images a… ▽ More As a non-invasive imaging modality, optical coherence tomography (OCT) can provide micrometer-resolution 3D images of retinal structures. Therefore it is commonly used in the diagnosis of retinal diseases associated with edema in and under the retinal layers. In this paper, a new framework is proposed for the task of fluid segmentation and detection in retinal OCT images. Based on the raw images and layers segmented by a graph-cut algorithm, a fully convolutional neural network was trained to recognize and label the fluid pixels. Random forest classification was performed on the segmented fluid regions to detect and reject the falsely labeled fluid regions. The leave-one-out cross validation experiments on the RETOUCH database show that our method performs well in both segmentation (mean Dice: 0.7317) and detection (mean AUC: 0.985) tasks. △ Less

Submitted 12 October, 2017; originally announced October 2017.

Comments: 9 pages, 5 figures, MICCAI Retinal OCT Fluid Challenge 2017

Showing 1–6 of 6 results for author: Heisler, M