Gamified Crowdsourcing as a Novel Approach to Lung Ultrasound Dataset Labeling
Authors:
Nicole M Duggan,
Mike Jin,
Maria Alejandra Duran Mendicuti,
Stephen Hallisey,
Denie Bernier,
Lauren A Selame,
Ameneh Asgari-Targhi,
Chanel E Fischetti,
Ruben Lucassen,
Anthony E Samir,
Erik Duhaime,
Tina Kapur,
Andrew J Goldsmith
Abstract:
Study Objective: Machine learning models have advanced medical image processing and can yield faster, more accurate diagnoses. Despite a wealth of available medical imaging data, high-quality labeled data for model training is lacking. We investigated whether a gamified crowdsourcing platform enhanced with inbuilt quality control metrics can produce lung ultrasound clip labels comparable to those from clinical experts.
Methods: 2,384 lung ultrasound clips were retrospectively collected from 203 patients. Six lung ultrasound experts classified 393 of these clips as having no B-lines, one or more discrete B-lines, or confluent B-lines to create two sets of reference standard labels (195 training set clips and 198 test set clips). Sets were respectively used to A) train users on a gamified crowdsourcing platform, and B) compare concordance of the resulting crowd labels to the concordance of individual experts to reference standards.
Results: 99,238 crowdsourced opinions on 2,384 lung ultrasound clips were collected from 426 unique users over 8 days. On the 198 test set clips, mean labeling concordance of individual experts relative to the reference standard was 85.0% +/- 2.0 (SEM), compared to 87.9% crowdsourced label concordance (p=0.15). When individual experts' opinions were compared to reference standard labels created by majority vote excluding their own opinion, crowd concordance was higher than the mean concordance of individual experts to reference standards (87.4% vs. 80.8% +/- 1.6; p<0.001).
Conclusion: Crowdsourced labels for B-line classification via a gamified approach achieved expert-level quality. Scalable, high-quality labeling approaches may facilitate training dataset creation for machine learning model development.
Submitted 11 June, 2023;
originally announced June 2023.
Weakly Supervised Context Encoder using DICOM metadata in Ultrasound Imaging
Authors:
Szu-Yeu Hu,
Shuhang Wang,
Wei-Hung Weng,
JingChao Wang,
XiaoHong Wang,
Arinc Ozturk,
Qian Li,
Viksit Kumar,
Anthony E. Samir
Abstract:
Modern deep learning algorithms geared towards clinical adaptation rely on a significant amount of high-fidelity labeled data. In low-resource settings, acquiring such data is difficult and becomes the bottleneck for developing artificial intelligence applications. Ultrasound images, stored in Digital Imaging and Communications in Medicine (DICOM) format, carry additional metadata corresponding to ultrasound image parameters and medical exams. In this work, we leverage DICOM metadata from ultrasound images to help learn representations of the ultrasound image. We demonstrate that the proposed method outperforms non-metadata-based approaches across different downstream tasks.
Submitted 19 March, 2020;
originally announced March 2020.