Skip to main content

Showing 1–19 of 19 results for author: Irvin, J

Searching in archive cs. Search in all archives.
.
  1. arXiv:2405.09798  [pdf, other

    cs.LG cs.AI cs.CL cs.CV

    Many-Shot In-Context Learning in Multimodal Foundation Models

    Authors: Yixing Jiang, Jeremy Irvin, Ji Hun Wang, Muhammad Ahmed Chaudhry, Jonathan H. Chen, Andrew Y. Ng

    Abstract: Large language models are well-known to be effective at few-shot in-context learning (ICL). Recent advancements in multimodal foundation models have enabled unprecedentedly long context windows, presenting an opportunity to explore their capability to perform ICL with many more demonstrating examples. In this work, we evaluate the performance of multimodal foundation models scaling from few-shot t… ▽ More

    Submitted 16 May, 2024; originally announced May 2024.

  2. arXiv:2401.14486  [pdf, other

    cs.CV cs.LG

    CloudTracks: A Dataset for Localizing Ship Tracks in Satellite Images of Clouds

    Authors: Muhammad Ahmed Chaudhry, Lyna Kim, Jeremy Irvin, Yuzu Ido, Sonia Chu, Jared Thomas Isobe, Andrew Y. Ng, Duncan Watson-Parris

    Abstract: Clouds play a significant role in global temperature regulation through their effect on planetary albedo. Anthropogenic emissions of aerosols can alter the albedo of clouds, but the extent of this effect, and its consequent impact on temperature change, remains uncertain. Human-induced clouds caused by ship aerosol emissions, commonly referred to as ship tracks, provide visible manifestations of t… ▽ More

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: 11 pages, 5 figures, submitted to Journal of Machine Learning Research

  3. arXiv:2312.02200  [pdf, other

    cs.CV cs.AI stat.AP

    An Empirical Study of Automated Mislabel Detection in Real World Vision Datasets

    Authors: Maya Srikanth, Jeremy Irvin, Brian Wesley Hill, Felipe Godoy, Ishan Sabane, Andrew Y. Ng

    Abstract: Major advancements in computer vision can primarily be attributed to the use of labeled datasets. However, acquiring labels for datasets often results in errors which can harm model performance. Recent works have proposed methods to automatically identify mislabeled images, but develo** strategies to effectively implement them in real world datasets has been sparsely explored. Towards improved d… ▽ More

    Submitted 2 December, 2023; originally announced December 2023.

  4. arXiv:2312.02199  [pdf, other

    cs.CV cs.AI cs.LG eess.IV stat.AP

    USat: A Unified Self-Supervised Encoder for Multi-Sensor Satellite Imagery

    Authors: Jeremy Irvin, Lucas Tao, Joanne Zhou, Yuntao Ma, Langston Nashold, Benjamin Liu, Andrew Y. Ng

    Abstract: Large, self-supervised vision models have led to substantial advancements for automatically interpreting natural images. Recent works have begun tailoring these methods to remote sensing data which has rich structure with multi-sensor, multi-spectral, and temporal information providing massive amounts of self-labeled data that can be used for self-supervised pre-training. In this work, we develop… ▽ More

    Submitted 2 December, 2023; originally announced December 2023.

  5. arXiv:2311.17449  [pdf, other

    cs.CV

    Weakly-semi-supervised object detection in remotely sensed imagery

    Authors: Ji Hun Wang, Jeremy Irvin, Beri Kohen Behar, Ha Tran, Raghav Samavedam, Quentin Hsu, Andrew Y. Ng

    Abstract: Deep learning for detecting objects in remotely sensed imagery can enable new technologies for important applications including mitigating climate change. However, these models often require large datasets labeled with bounding box annotations which are expensive to curate, prohibiting the development of models for new tasks and geographies. To address this challenge, we develop weakly-semi-superv… ▽ More

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: Tackling Climate Change with Machine Learning at NeurIPS 2023

  6. arXiv:2306.03831  [pdf, other

    cs.LG cs.CV

    GEO-Bench: Toward Foundation Models for Earth Monitoring

    Authors: Alexandre Lacoste, Nils Lehmann, Pau Rodriguez, Evan David Sherwin, Hannah Kerner, Björn Lütjens, Jeremy Andrew Irvin, David Dao, Hamed Alemohammad, Alexandre Drouin, Mehmet Gunturkun, Gabriel Huang, David Vazquez, Dava Newman, Yoshua Bengio, Stefano Ermon, Xiao Xiang Zhu

    Abstract: Recent progress in self-supervision has shown that pre-training large neural networks on vast amounts of unsupervised data can lead to substantial increases in generalization to downstream tasks. Such models, recently coined foundation models, have been transformational to the field of natural language processing. Variants have also been proposed for image data, but their applicability to remote s… ▽ More

    Submitted 23 December, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: arXiv admin note: text overlap with arXiv:2112.00570

  7. arXiv:2305.08017  [pdf, other

    cs.CV

    How to Train Your CheXDragon: Training Chest X-Ray Models for Transfer to Novel Tasks and Healthcare Systems

    Authors: Cara Van Uden, Jeremy Irvin, Mars Huang, Nathan Dean, Jason Carr, Andrew Ng, Curtis Langlotz

    Abstract: Self-supervised learning (SSL) enables label efficient training for machine learning models. This is essential for domains such as medical imaging, where labels are costly and time-consuming to curate. However, the most effective supervised or SSL strategy for transferring models to different healthcare systems or novel tasks is not well understood. In this work, we systematically experiment with… ▽ More

    Submitted 13 May, 2023; originally announced May 2023.

    Comments: 13 pages, 12 figures

  8. arXiv:2208.13027  [pdf, other

    cs.LG cs.AI

    Improving debris flow evacuation alerts in Taiwan using machine learning

    Authors: Yi-Lin Tsai, Jeremy Irvin, Suhas Chundi, Andrew Y. Ng, Christopher B. Field, Peter K. Kitanidis

    Abstract: Taiwan has the highest susceptibility to and fatalities from debris flows worldwide. The existing debris flow warning system in Taiwan, which uses a time-weighted measure of rainfall, leads to alerts when the measure exceeds a predefined threshold. However, this system generates many false alarms and misses a substantial fraction of the actual debris flows. Towards improving this system, we implem… ▽ More

    Submitted 2 September, 2022; v1 submitted 27 August, 2022; originally announced August 2022.

    Comments: Supplementary information: https://drive.google.com/file/d/1Y17YxXo5rhIbUuZzwLo99pmttbh28v9X/view?usp=sharing

  9. arXiv:2207.11166  [pdf, other

    cs.CV

    METER-ML: A Multi-Sensor Earth Observation Benchmark for Automated Methane Source Map**

    Authors: Bryan Zhu, Nicholas Lui, Jeremy Irvin, Jimmy Le, Sahil Tadwalkar, Chenghao Wang, Zutao Ouyang, Frankie Y. Liu, Andrew Y. Ng, Robert B. Jackson

    Abstract: Reducing methane emissions is essential for mitigating global warming. To attribute methane emissions to their sources, a comprehensive dataset of methane source infrastructure is necessary. Recent advancements with deep learning on remotely sensed imagery have the potential to identify the locations and characteristics of methane sources, but there is a substantial lack of publicly available data… ▽ More

    Submitted 15 August, 2022; v1 submitted 22 July, 2022; originally announced July 2022.

    Comments: Workshop on Complex Data Challenges in Earth Observation at IJCAI-ECAI 2022

  10. arXiv:2112.00570  [pdf, other

    cs.LG physics.geo-ph

    Toward Foundation Models for Earth Monitoring: Proposal for a Climate Change Benchmark

    Authors: Alexandre Lacoste, Evan David Sherwin, Hannah Kerner, Hamed Alemohammad, Björn Lütjens, Jeremy Irvin, David Dao, Alex Chang, Mehmet Gunturkun, Alexandre Drouin, Pau Rodriguez, David Vazquez

    Abstract: Recent progress in self-supervision shows that pre-training large neural networks on vast amounts of unsupervised data can lead to impressive increases in generalisation for downstream tasks. Such models, recently coined as foundation models, have been transformational to the field of natural language processing. While similar models have also been trained on large corpuses of images, they are not… ▽ More

    Submitted 1 December, 2021; originally announced December 2021.

  11. arXiv:2105.03020  [pdf, other

    eess.IV cs.CV cs.LG

    Structured dataset documentation: a datasheet for CheXpert

    Authors: Christian Garbin, Pranav Rajpurkar, Jeremy Irvin, Matthew P. Lungren, Oge Marques

    Abstract: Billions of X-ray images are taken worldwide each year. Machine learning, and deep learning in particular, has shown potential to help radiologists triage and diagnose images. However, deep learning requires large datasets with reliable labels. The CheXpert dataset was created with the participation of board-certified radiologists, resulting in the strong ground truth needed to train deep learning… ▽ More

    Submitted 6 May, 2021; originally announced May 2021.

  12. arXiv:2011.07227  [pdf, other

    cs.CV cs.AI cs.LG

    OGNet: Towards a Global Oil and Gas Infrastructure Database using Deep Learning on Remotely Sensed Imagery

    Authors: Hao Sheng, Jeremy Irvin, Sasankh Munukutla, Shawn Zhang, Christopher Cross, Kyle Story, Rose Rustowicz, Cooper Elsworth, Zutao Yang, Mark Omara, Ritesh Gautam, Robert B. Jackson, Andrew Y. Ng

    Abstract: At least a quarter of the warming that the Earth is experiencing today is due to anthropogenic methane emissions. There are multiple satellites in orbit and planned for launch in the next few years which can detect and quantify these emissions; however, to attribute methane emissions to their sources on the ground, a comprehensive database of the locations and characteristics of emission sources w… ▽ More

    Submitted 14 November, 2020; originally announced November 2020.

    Comments: Tackling Climate Change with Machine Learning at NeurIPS 2020 (Spotlight talk)

  13. arXiv:2011.06129  [pdf, other

    eess.IV cs.CV cs.LG

    CheXphotogenic: Generalization of Deep Learning Models for Chest X-ray Interpretation to Photos of Chest X-rays

    Authors: Pranav Rajpurkar, Anirudh Joshi, Anuj Pareek, Jeremy Irvin, Andrew Y. Ng, Matthew Lungren

    Abstract: The use of smartphones to take photographs of chest x-rays represents an appealing solution for scaled deployment of deep learning models for chest x-ray interpretation. However, the performance of chest x-ray algorithms on photos of chest x-rays has not been thoroughly investigated. In this study, we measured the diagnostic performance for 8 different chest x-ray models when applied to photos of… ▽ More

    Submitted 11 November, 2020; originally announced November 2020.

    Comments: Machine Learning for Health (ML4H) at NeurIPS 2020 - Extended Abstract

  14. arXiv:2011.05479  [pdf, other

    cs.CV cs.LG eess.IV

    ForestNet: Classifying Drivers of Deforestation in Indonesia using Deep Learning on Satellite Imagery

    Authors: Jeremy Irvin, Hao Sheng, Neel Ramachandran, Sonja Johnson-Yu, Sharon Zhou, Kyle Story, Rose Rustowicz, Cooper Elsworth, Kemen Austin, Andrew Y. Ng

    Abstract: Characterizing the processes leading to deforestation is critical to the development and implementation of targeted forest conservation and management policies. In this work, we develop a deep learning model called ForestNet to classify the drivers of primary forest loss in Indonesia, a country with one of the highest deforestation rates in the world. Using satellite imagery, ForestNet identifies… ▽ More

    Submitted 10 November, 2020; originally announced November 2020.

    Comments: Tackling Climate Change with Machine Learning at NeurIPS 2020

  15. arXiv:2010.04715  [pdf, other

    cs.LG stat.AP stat.ML

    Short-Term Solar Irradiance Forecasting Using Calibrated Probabilistic Models

    Authors: Eric Zelikman, Sharon Zhou, Jeremy Irvin, Cooper Raterink, Hao Sheng, Anand Avati, Jack Kelly, Ram Rajagopal, Andrew Y. Ng, David Gagne

    Abstract: Advancing probabilistic solar forecasting methods is essential to supporting the integration of solar energy into the electricity grid. In this work, we develop a variety of state-of-the-art probabilistic models for forecasting solar irradiance. We investigate the use of post-hoc calibration techniques for ensuring well-calibrated probabilistic predictions. We train and evaluate the models using p… ▽ More

    Submitted 14 October, 2020; v1 submitted 9 October, 2020; originally announced October 2020.

  16. arXiv:2002.11379  [pdf, other

    eess.IV cs.CV cs.LG

    CheXpedition: Investigating Generalization Challenges for Translation of Chest X-Ray Algorithms to the Clinical Setting

    Authors: Pranav Rajpurkar, Anirudh Joshi, Anuj Pareek, Phil Chen, Amirhossein Kiani, Jeremy Irvin, Andrew Y. Ng, Matthew P. Lungren

    Abstract: Although there have been several recent advances in the application of deep learning algorithms to chest x-ray interpretation, we identify three major challenges for the translation of chest x-ray algorithms to the clinical setting. We examine the performance of the top 10 performing models on the CheXpert challenge leaderboard on three tasks: (1) TB detection, (2) pathology detection on photos of… ▽ More

    Submitted 11 March, 2020; v1 submitted 26 February, 2020; originally announced February 2020.

    Comments: Accepted as workshop paper at ACM Conference on Health, Inference, and Learning (CHIL) 2020

  17. arXiv:1901.07031  [pdf, other

    cs.CV cs.AI cs.LG eess.IV

    CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison

    Authors: Jeremy Irvin, Pranav Rajpurkar, Michael Ko, Yifan Yu, Silviana Ciurea-Ilcus, Chris Chute, Henrik Marklund, Behzad Haghgoo, Robyn Ball, Katie Shpanskaya, Jayne Seekins, David A. Mong, Safwan S. Halabi, Jesse K. Sandberg, Ricky Jones, David B. Larson, Curtis P. Langlotz, Bhavik N. Patel, Matthew P. Lungren, Andrew Y. Ng

    Abstract: Large, labeled datasets have driven deep learning methods to achieve expert-level performance on a variety of medical imaging tasks. We present CheXpert, a large dataset that contains 224,316 chest radiographs of 65,240 patients. We design a labeler to automatically detect the presence of 14 observations in radiology reports, capturing uncertainties inherent in radiograph interpretation. We invest… ▽ More

    Submitted 21 January, 2019; originally announced January 2019.

    Comments: Published in AAAI 2019

  18. arXiv:1712.06957  [pdf, other

    physics.med-ph cs.AI

    MURA: Large Dataset for Abnormality Detection in Musculoskeletal Radiographs

    Authors: Pranav Rajpurkar, Jeremy Irvin, Aarti Bagul, Daisy Ding, Tony Duan, Hershel Mehta, Brandon Yang, Kaylie Zhu, Dillon Laird, Robyn L. Ball, Curtis Langlotz, Katie Shpanskaya, Matthew P. Lungren, Andrew Y. Ng

    Abstract: We introduce MURA, a large dataset of musculoskeletal radiographs containing 40,561 images from 14,863 studies, where each study is manually labeled by radiologists as either normal or abnormal. To evaluate models robustly and to get an estimate of radiologist performance, we collect additional labels from six board-certified Stanford radiologists on the test set, consisting of 207 musculoskeletal… ▽ More

    Submitted 22 May, 2018; v1 submitted 11 December, 2017; originally announced December 2017.

    Comments: 1st Conference on Medical Imaging with Deep Learning (MIDL 2018)

  19. arXiv:1711.05225  [pdf, other

    cs.CV cs.LG stat.ML

    CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning

    Authors: Pranav Rajpurkar, Jeremy Irvin, Kaylie Zhu, Brandon Yang, Hershel Mehta, Tony Duan, Daisy Ding, Aarti Bagul, Curtis Langlotz, Katie Shpanskaya, Matthew P. Lungren, Andrew Y. Ng

    Abstract: We develop an algorithm that can detect pneumonia from chest X-rays at a level exceeding practicing radiologists. Our algorithm, CheXNet, is a 121-layer convolutional neural network trained on ChestX-ray14, currently the largest publicly available chest X-ray dataset, containing over 100,000 frontal-view X-ray images with 14 diseases. Four practicing academic radiologists annotate a test set, on w… ▽ More

    Submitted 25 December, 2017; v1 submitted 14 November, 2017; originally announced November 2017.