
Advancing Multimodal Medical Capabilities of Gemini

Authors: Lin Yang, Shawn Xu, Andrew Sellergren, Timo Kohlberger, Yuchen Zhou, Ira Ktena, Atilla Kiraly, Faruk Ahmed, Farhad Hormozdiari, Tiam Jaroensri, Eric Wang, Ellery Wulczyn, Fayaz Jamil, Theo Guidroz, Chuck Lau, Siyuan Qiao, Yun Liu, Akshay Goel, Kendall Park, Arnav Agharwal, Nick George, Yang Wang, Ryutaro Tanno, David G. T. Barrett, Wei-Hung Weng, et al. (22 additional authors not shown)

Abstract: Many clinical tasks require an understanding of specialized data, such as medical images and genomics, which is not typically found in general-purpose large multimodal models. Building upon Gemini's multimodal models, we develop several models within the new Med-Gemini family that inherit core capabilities of Gemini and are optimized for medical use via fine-tuning with 2D and 3D radiology, histopathology, ophthalmology, dermatology and genomic data. Med-Gemini-2D sets a new standard for AI-based chest X-ray (CXR) report generation based on expert evaluation, exceeding previous best results across two separate datasets by an absolute margin of 1% and 12%, where 57% and 96% of AI reports on normal cases, and 43% and 65% on abnormal cases, are evaluated as "equivalent or better" than the original radiologists' reports. We demonstrate the first ever large multimodal model-based report generation for 3D computed tomography (CT) volumes using Med-Gemini-3D, with 53% of AI reports considered clinically acceptable, although additional research is needed to meet expert radiologist reporting quality. Beyond report generation, Med-Gemini-2D surpasses the previous best performance in CXR visual question answering (VQA) and performs well in CXR classification and radiology VQA, exceeding SoTA or baselines on 17 of 20 tasks. In histopathology, ophthalmology, and dermatology image classification, Med-Gemini-2D surpasses baselines across 18 out of 20 tasks and approaches task-specific model performance. Beyond imaging, Med-Gemini-Polygenic outperforms the standard linear polygenic risk score-based approach for disease risk prediction and generalizes to genetically correlated diseases for which it has never been trained. Although further development and evaluation are necessary in the safety-critical medical domain, our results highlight the potential of Med-Gemini across a wide range of medical tasks.

Submitted 6 May, 2024; originally announced May 2024.

arXiv:1912.11027 [pdf, other]

Robust breast cancer detection in mammography and digital breast tomosynthesis using annotation-efficient deep learning approach

Authors: William Lotter, Abdul Rahman Diab, Bryan Haslam, Jiye G. Kim, Giorgia Grisot, Eric Wu, Kevin Wu, Jorge Onieva Onieva, Jerrold L. Boxerman, Meiyun Wang, Mack Bandler, Gopal Vijayaraghavan, A. Gregory Sorensen

Abstract: Breast cancer remains a global challenge, causing over 1 million deaths globally in 2018. To achieve earlier breast cancer detection, screening x-ray mammography is recommended by health organizations worldwide and has been estimated to decrease breast cancer mortality by 20-40%. Nevertheless, significant false positive and false negative rates, as well as high interpretation costs, leave opportunities for improving quality and access. To address these limitations, there has been much recent interest in applying deep learning to mammography; however, obtaining large amounts of annotated data poses a challenge for training deep learning models for this purpose, as does ensuring generalization beyond the populations represented in the training dataset. Here, we present an annotation-efficient deep learning approach that 1) achieves state-of-the-art performance in mammogram classification, 2) successfully extends to digital breast tomosynthesis (DBT; "3D mammography"), 3) detects cancers in clinically-negative prior mammograms of cancer patients, 4) generalizes well to a population with low screening rates, and 5) outperforms five-out-of-five full-time breast imaging specialists by improving absolute sensitivity by an average of 14%. Our results demonstrate promise towards software that can improve the accuracy of and access to screening mammography worldwide.

Submitted 27 December, 2019; v1 submitted 23 December, 2019; originally announced December 2019.

arXiv:1911.00364 [pdf, other]

Validation of a deep learning mammography model in a population with low screening rates

Authors: Kevin Wu, Eric Wu, Yaping Wu, Hongna Tan, Greg Sorensen, Meiyun Wang, Bill Lotter

Abstract: A key promise of AI applications in healthcare is in increasing access to quality medical care in under-served populations and emerging markets. However, deep learning models are often only trained on data from advantaged populations that have the infrastructure and resources required for large-scale data collection. In this paper, we aim to empirically investigate the potential impact of such biases on breast cancer detection in mammograms. We specifically explore how a deep learning algorithm trained on screening mammograms from the US and UK generalizes to mammograms collected at a hospital in China, where screening is not widely implemented. For the evaluation, we use a top-scoring model developed for the Digital Mammography DREAM Challenge. Despite the change in institution and population composition, we find that the model generalizes well, exhibiting similar performance to that achieved in the DREAM Challenge, even when controlling for tumor size. We also illustrate a simple but effective method for filtering predictions based on model variance, which can be particularly useful for deployment in new settings. While there are many components in developing a clinically effective system, these results represent a promising step towards increasing access to life-saving screening mammography in populations where screening rates are currently low.

Submitted 1 November, 2019; originally announced November 2019.

Journal ref: NeurIPS 2019. Fair ML for Health Workshop

arXiv:1809.01652 [pdf]

Current potentials and challenges using Sentinel-1 for broadacre field remote sensing

Authors: Martin Peter Christiansen, Morten Stigaard Laursen, Birgitte Feld Mikkelsen, Nima Teimouri, Rasmus Nyholm Jørgensen, Claus Aage Grøn Sørensen

Abstract: ESA operates the Sentinel-1 satellites, which provide Synthetic Aperture Radar (SAR) data of Earth. Recorded Sentinel-1 data have shown potential for remotely observing and monitoring local conditions on broad acre fields. Remote sensing using Sentinel-1 has the potential to provide daily updates on the current conditions in the individual fields and at the same time give an overview of the agricultural areas in the region. Research depends on the ability to independently validate presented results. In the case of the Sentinel-1 satellites, every researcher has access to the same base dataset, and therefore independent validation is possible. Well-documented research performed with Sentinel-1 allows other researchers to redo the experiments and either validate or falsify presented findings. Based on current state-of-the-art research, we have chosen to provide a service for researchers in the agricultural domain. The service allows researchers to monitor local conditions by using the Sentinel-1 information combined with a priori knowledge from broad acre fields. Correlating processed Sentinel-1 data to the actual conditions is still a task the individual researchers must perform to benefit from the service. In this paper, we present our methodology for translating Sentinel-1 data to a level that is more accessible to researchers in the agricultural field. The goal is to make the data more easily available, so the primary focus can be on correlating and comparing to measurements collected in the broadacre fields. We illustrate the value of the service with three examples of the possible application areas. The presented application examples are all based on Denmark, where we have processed all Sentinel-1 scans since 2016.

Submitted 4 September, 2018; originally announced September 2018.

Comments: 9 pages, 5 figures, conference (AGENG2018)

Journal ref: EurAgEng 2018

arXiv:1707.06978 [pdf, other]

A Multi-Scale CNN and Curriculum Learning Strategy for Mammogram Classification

Authors: William Lotter, Greg Sorensen, David Cox

Abstract: Screening mammography is an important front-line tool for the early detection of breast cancer, and some 39 million exams are conducted each year in the United States alone. Here, we describe a multi-scale convolutional neural network (CNN) trained with a curriculum learning strategy that achieves high levels of accuracy in classifying mammograms. Specifically, we first train CNN-based patch classifiers on segmentation masks of lesions in mammograms, and then use the learned features to initialize a scanning-based model that renders a decision on the whole image, trained end-to-end on outcome data. We demonstrate that our approach effectively handles the "needle in a haystack" nature of full-image mammogram classification, achieving 0.92 AUROC on the DDSM dataset.

Submitted 21 July, 2017; originally announced July 2017.

Comments: Accepted to MICCAI 2017 Workshop on Deep Learning in Medical Image Analysis

Showing 1–5 of 5 results for author: Sorensen, G