Knowing the Distance: Understanding the Gap Between Synthetic and Real Data For Face Parsing

Authors: Eli Friedman, Assaf Lehr, Alexey Gruzdev, Vladimir Loginov, Max Kogan, Moran Rubin, Orly Zvitia

Abstract: The use of synthetic data for training computer vision algorithms has become increasingly popular due to its cost-effectiveness, scalability, and ability to provide accurate multi-modality labels. Although recent studies have demonstrated impressive results when training networks solely on synthetic data, there remains a performance gap between synthetic and real data that is commonly attributed to lack of photorealism. The aim of this study is to investigate the gap in greater detail for the face parsing task. We differentiate between three types of gaps: distribution gap, label gap, and photorealism gap. Our findings show that the distribution gap is the largest contributor to the performance gap, accounting for over 50% of the gap. By addressing this gap and accounting for the labels gap, we demonstrate that a model trained on synthetic data achieves comparable results to one trained on a similar amount of real data. This suggests that synthetic data is a viable alternative to real data, especially when real data is limited or difficult to obtain. Our study highlights the importance of content diversity in synthetic datasets and challenges the notion that the photorealism gap is the most critical factor affecting the performance of computer vision models trained on synthetic data.

Submitted 27 March, 2023; originally announced March 2023.

arXiv:2010.03975

doi 10.1007/s42979-021-00720-7

Evaluating the Clinical Realism of Synthetic Chest X-Rays Generated Using Progressively Growing GANs

Authors: Bradley Segal, David M. Rubin, Grace Rubin, Adam Pantanowitz

Abstract: Chest x-rays are a vital tool in the workup of many patients. Similar to most medical imaging modalities, they are profoundly multi-modal and are capable of visualising a variety of combinations of conditions. There is an ever pressing need for greater quantities of labelled data to develop new diagnostic tools, however this is in direct opposition to concerns regarding patient confidentiality which constrains access through permission requests and ethics approvals. Previous work has sought to address these concerns by creating class-specific GANs that synthesise images to augment training data. These approaches cannot be scaled as they introduce computational trade offs between model size and class number which places fixed limits on the quality that such generates can achieve. We address this concern by introducing latent class optimisation which enables efficient, multi-modal sampling from a GAN and with which we synthesise a large archive of labelled generates. We apply a PGGAN to the task of unsupervised x-ray synthesis and have radiologists evaluate the clinical realism of the resultant samples. We provide an in depth review of the properties of varying pathologies seen on generates as well as an overview of the extent of disease diversity captured by the model. We validate the application of the Fréchet Inception Distance (FID) to measure the quality of x-ray generates and find that they are similar to other high resolution tasks. We quantify x-ray clinical realism by asking radiologists to distinguish between real and fake scans and find that generates are more likely to be classed as real than by chance, but there is still progress required to achieve true realism. We confirm these findings by evaluating synthetic classification model performance on real scans. We conclude by discussing the limitations of PGGAN generates and how to achieve controllable, realistic generates.

Submitted 10 March, 2021; v1 submitted 7 October, 2020; originally announced October 2020.

Comments: 18 pages, 8 figures, 2 tables

arXiv:2009.02132

doi 10.1016/j.imu.2020.100488

Addressing the eye-fixation problem in gaze tracking for human computer interface using the Vestibulo-ocular Reflex

Authors: Adam Pantanowitz, Kimoon Kim, Chelsey Chewins, Isabel N. K. Tollman, David M. Rubin

Abstract: A custom head-mounted system to track smooth eye movements for control of a mouse cursor is implemented and evaluated. The system comprises a head-mounted infrared camera, an infrared light source, and a computer. Software-based image processing techniques, implemented in Microsoft Visual Studio, OpenCV, and Pupil, detect the pupil position and direction of pupil movement in near real-time. The identified direction is used to determine the desired positioning of the cursor, and the cursor moves towards the target. Two users participated in three tests to quantify the differences between incremental tracking of smooth eye movement resulting from the Vestibulo-ocular Reflex versus step-change tracking of saccadic eye movement. Tracking smooth eye movements was four times more accurate than tracking saccadic eye movements, with an average position resolution of 0.80 cm away from the target. In contrast, tracking saccadic eye movements was measured with an average position resolution of 3.21 cm. Using the incremental tracking of smooth eye movements, the user was able to place the cursor within a target as small as a 9 x 9 pixel square 90% of the time. However, when using the step change tracking of saccadic eye movements, the user was unable to position the cursor within the 9 x 9 pixel target. The average time for the incremental tracking of smooth eye movements to track a target was 6.45 s, whereas for the step change tracking of saccadic eye movements, it was 2.61 s.

Submitted 24 November, 2020; v1 submitted 2 September, 2020; originally announced September 2020.

Comments: 16 pages, 4 figures, 2 tables

arXiv:1908.11694

Estimation of Body Mass Index from Photographs using Deep Convolutional Neural Networks

Authors: Adam Pantanowitz, Emmanuel Cohen, Philippe Gradidge, Nigel Crowther, Vered Aharonson, Benjamin Rosman, David M Rubin

Abstract: Obesity is an important concern in public health, and Body Mass Index is one of the useful (and proliferant) measures. We use Convolutional Neural Networks to determine Body Mass Index from photographs in a study with 161 participants. Low data, a common problem in medicine, is addressed by reducing the information in the photographs by generating silhouette images. Results present with high correlation when tested on unseen data.

Submitted 29 August, 2019; originally announced August 2019.

Comments: 7 pages, 4 figures, preprint journal

arXiv:1812.11006

TOP-GAN: Label-Free Cancer Cell Classification Using Deep Learning with a Small Training Set

Authors: Moran Rubin, Omer Stein, Nir A. Turko, Yoav Nygate, Darina Roitshtain, Lidor Karako, Itay Barnea, Raja Giryes, Natan T. Shaked

Abstract: We propose a new deep learning approach for medical imaging that copes with the problem of a small training set, the main bottleneck of deep learning, and apply it for classification of healthy and cancer cells acquired by quantitative phase imaging. The proposed method, called transferring of pre-trained generative adversarial network (TOP-GAN), is a hybridization between transfer learning and generative adversarial networks (GANs). Healthy cells and cancer cells of different metastatic potential have been imaged by low-coherence off-axis holography. After the acquisition, the optical path delay maps of the cells have been extracted and directly used as an input to the deep networks. In order to cope with the small number of classified images, we have used GANs to train a large number of unclassified images from another cell type (sperm cells). After this preliminary training, and after transforming the last layer of the network with new ones, we have designed an automatic classifier for the correct cell type (healthy/primary cancer/metastatic cancer) with 90-99% accuracy, although small training sets of down to several images have been used. These results are better in comparison to other classic methods that aim at coping with the same problem of a small training set. We believe that our approach makes the combination of holographic microscopy and deep learning networks more accessible to the medical field by enabling a rapid, automatic and accurate classification in stain-free imaging flow cytometry. Furthermore, our approach is expected to be applicable to many other medical image classification tasks, suffering from a small training set.

Submitted 17 December, 2018; originally announced December 2018.

arXiv:0705.0828

Enhancement of Noisy Planar Nuclear Medicine Images using Mean Field Annealing

Authors: D. L. Falk, D. M. Rubin, T. Marwala

Abstract: Nuclear medicine (NM) images inherently suffer from large amounts of noise and blur. The purpose of this research is to reduce the noise and blur while maintaining image integrity for improved diagnosis. The proposed solution is to increase image quality after the standard pre- and post-processing undertaken by a gamma camera system. Mean Field Annealing (MFA) is the image processing technique used in this research. It is a computational iterative technique that makes use of the Point Spread Function (PSF) and the noise associated with the NM image. MFA is applied to NM images with the objective of reducing noise while not compromising edge integrity. Using a sharpening filter as a post-processing technique (after MFA) yields image enhancement of planar NM images.

Submitted 6 May, 2007; originally announced May 2007.

Comments: 4 pages

Showing 1–6 of 6 results for author: Rubin, M