Search | arXiv e-print repository

M&M: Tackling False Positives in Mammography with a Multi-view and Multi-instance Learning Sparse Detector

Authors: Yen Nhi Truong Vu, Dan Guo, Ahmed Taha, Jason Su, Thomas Paul Matthews

Abstract: Deep-learning-based object detection methods show promise for improving screening mammography, but high rates of false positives can hinder their effectiveness in clinical practice. To reduce false positives, we identify three challenges: (1) unlike natural images, a malignant mammogram typically contains only one malignant finding; (2) mammography exams contain two views of each breast, and both views ought to be considered to make a correct assessment; (3) most mammograms are negative and do not contain any findings. In this work, we tackle the three aforementioned challenges by: (1) leveraging Sparse R-CNN and showing that sparse detectors are more appropriate than dense detectors for mammography; (2) including a multi-view cross-attention module to synthesize information from different views; (3) incorporating multi-instance learning (MIL) to train with unannotated images and perform breast-level classification. The resulting model, M&M, is a Multi-view and Multi-instance learning system that can both localize malignant findings and provide breast-level predictions. We validate M&M's detection and classification performance using five mammography datasets. In addition, we demonstrate the effectiveness of each proposed component through comprehensive ablation studies.

Submitted 11 August, 2023; originally announced August 2023.

Comments: MICCAI 2023 with supplementary materials

arXiv:2303.16417 [pdf]

Problems and shortcuts in deep learning for screening mammography

Authors: Trevor Tsue, Brent Mombourquette, Ahmed Taha, Thomas Paul Matthews, Yen Nhi Truong Vu, Jason Su

Abstract: This work reveals undiscovered challenges in the performance and generalizability of deep learning models. We (1) identify spurious shortcuts and evaluation issues that can inflate performance and (2) propose training and analysis methods to address them. We trained an AI model to classify cancer on a retrospective dataset of 120,112 US exams (3,467 cancers) acquired from 2008 to 2017 and 16,693 UK exams (5,655 cancers) acquired from 2011 to 2015. We evaluated on a screening mammography test set of 11,593 US exams (102 cancers; 7,594 women; age 57.1 ± 11.0) and 1,880 UK exams (590 cancers; 1,745 women; age 63.3 ± 7.2). A model trained on images of only view markers (no breast) achieved a 0.691 AUC. The original model trained on both datasets achieved a 0.945 AUC on the combined US+UK dataset but paradoxically only 0.838 and 0.892 on the US and UK datasets, respectively. Sampling cancers equally from both datasets during training mitigated this shortcut. A similar AUC paradox (0.903) occurred when evaluating diagnostic exams vs screening exams (0.862 vs 0.861, respectively). Removing diagnostic exams during training alleviated this bias. Finally, the model did not exhibit the AUC paradox over scanner models but still exhibited a bias toward Selenia Dimension (SD) over Hologic Selenia (HS) exams. Analysis showed that this AUC paradox occurred when a dataset attribute had values with a higher cancer prevalence (dataset bias) and the model consequently assigned a higher probability to these attribute values (model bias).
Stratification and balancing cancer prevalence can mitigate shortcuts during evaluation. Dataset and model bias can introduce shortcuts and the AUC paradox, potentially pervasive issues within the healthcare AI space. Our methods can verify and mitigate shortcuts while providing a clear understanding of performance.

Submitted 28 March, 2023; originally announced March 2023.
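
The AUC paradox described in the abstract above (a pooled AUC that exceeds the AUC of every subgroup) can be reproduced with a toy example. The sketch below is illustrative only: the scores, the group names `us` and `uk`, and the helper `auc` are assumptions made for this example and are not taken from the paper. One group has low prevalence and low scores, the other has high prevalence and high scores, so most cross-group (cancer, non-cancer) pairs are ranked correctly by group membership alone.

```python
def auc(pos_scores, neg_scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores. "us": low prevalence, scores shifted low.
us_neg = [0.1, 0.2, 0.3, 0.4]
us_pos = [0.35]
# "uk": high prevalence, scores shifted high (the model bias).
uk_neg = [0.65]
uk_pos = [0.6, 0.7, 0.8, 0.9]

auc_us = auc(us_pos, us_neg)
auc_uk = auc(uk_pos, uk_neg)
auc_pooled = auc(us_pos + uk_pos, us_neg + uk_neg)

print(f"us={auc_us:.2f} uk={auc_uk:.2f} pooled={auc_pooled:.2f}")
```

In this toy data each group has an AUC of 0.75 while the pooled AUC is 0.88, which is why stratified evaluation gives a clearer picture than a single combined number.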

arXiv:2208.06066 [pdf, other]

Deep is a Luxury We Don't Have

Authors: Ahmed Taha, Yen Nhi Truong Vu, Brent Mombourquette, Thomas Paul Matthews, Jason Su, Sadanand Singh

Abstract: Medical images come in high resolutions. A high resolution is vital for finding malignant tissues at an early stage. Yet, this resolution presents a challenge in terms of modeling long range dependencies. Shallow transformers eliminate this problem, but they suffer from quadratic complexity. In this paper, we tackle this complexity by leveraging a linear self-attention approximation. Through this approximation, we propose an efficient vision model called HCT that stands for High resolution Convolutional Transformer. HCT brings transformers' merits to high resolution images at a significantly lower cost. We evaluate HCT using a high resolution mammography dataset. HCT is significantly superior to its CNN counterpart. Furthermore, we demonstrate HCT's fitness for medical images by evaluating its effective receptive field. Code available at https://bit.ly/3ykBhhf

Submitted 11 August, 2022; originally announced August 2022.

Comments: MICCAI 2022 + Extra Experiments

arXiv:2204.06671 [pdf]

A deep learning algorithm for reducing false positives in screening mammography

Authors: Stefano Pedemonte, Trevor Tsue, Brent Mombourquette, Yen Nhi Truong Vu, Thomas Matthews, Rodrigo Morales Hoil, Meet Shah, Nikita Ghare, Naomi Zingman-Daniels, Susan Holley, Catherine M. Appleton, Jason Su, Richard L. Wahl

Abstract: Screening mammography improves breast cancer outcomes by enabling early detection and treatment. However, false positive callbacks for additional imaging from screening exams cause unnecessary procedures, patient anxiety, and financial burden. This work demonstrates an AI algorithm that reduces false positives by identifying mammograms not suspicious for breast cancer. We trained the algorithm to determine the absence of cancer using 123,248 2D digital mammograms (6,161 cancers) and performed a retrospective study on 14,831 screening exams (1,026 cancers) from 15 US and 3 UK sites. Retrospective evaluation of the algorithm on the largest of the US sites (11,592 mammograms, 101 cancers) a) left the cancer detection rate unaffected (p=0.02, non-inferiority margin 0.25 cancers per 1000 exams), b) reduced callbacks for diagnostic exams by 31.1% compared to standard clinical readings, c) reduced benign needle biopsies by 7.4%, and d) reduced screening exams requiring radiologist interpretation by 41.6% in the simulated clinical workflow. This work lays the foundation for semi-autonomous breast cancer screening systems that could benefit patients and healthcare systems by reducing false positives, unnecessary procedures, patient anxiety, and expenses.

Submitted 13 April, 2022; originally announced April 2022.

arXiv:2102.10663 [pdf, other]

MedAug: Contrastive learning leveraging patient metadata improves representations for chest X-ray interpretation

Authors: Yen Nhi Truong Vu, Richard Wang, Niranjan Balachandar, Can Liu, Andrew Y. Ng, Pranav Rajpurkar

Abstract: Self-supervised contrastive learning between pairs of multiple views of the same image has been shown to successfully leverage unlabeled data to produce meaningful visual representations for both natural and medical images. However, there has been limited work on determining how to select pairs for medical images, where availability of patient metadata can be leveraged to improve representations. In this work, we develop a method to select positive pairs coming from views of possibly different images through the use of patient metadata. We compare strategies for selecting positive pairs for chest X-ray interpretation including requiring them to be from the same patient, imaging study or laterality. We evaluate downstream task performance by fine-tuning the linear layer on 1% of the labeled dataset for pleural effusion classification. Our best performing positive pair selection strategy, which involves using images from the same patient from the same study across all lateralities, achieves a performance increase of 14.4% in mean AUC from the ImageNet pretrained baseline. Our controlled experiments show that the keys to improving downstream performance on disease classification are (1) using patient metadata to appropriately create positive pairs from different images with the same underlying pathologies, and (2) maximizing the number of different images used in query pairing. In addition, we explore leveraging patient metadata to select hard negative pairs for contrastive learning, but do not find improvement over baselines that do not use metadata.
Our method is broadly applicable to medical image interpretation and allows flexibility for incorporating medical insights in choosing pairs for contrastive learning.

Submitted 17 October, 2021; v1 submitted 21 February, 2021; originally announced February 2021.
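
The positive-pair selection strategies compared in the abstract above (same patient, optionally same study, optionally same laterality) can be sketched as a simple metadata filter. This is a minimal illustration under assumed names: the `Image` record, its fields, and the `positive_pairs` function are hypothetical and are not the authors' code or data schema.

```python
from itertools import combinations
from typing import NamedTuple


class Image(NamedTuple):
    # Hypothetical metadata record for one chest X-ray image.
    image_id: str
    patient_id: str
    study_id: str
    laterality: str  # e.g. "frontal" or "lateral"


def positive_pairs(images, same_study=True, same_laterality=False):
    """Return pairs of distinct images that count as positives.

    Pairs always share a patient; the two flags optionally also require
    the same imaging study and the same laterality.
    """
    pairs = []
    for a, b in combinations(images, 2):
        if a.patient_id != b.patient_id:
            continue
        if same_study and a.study_id != b.study_id:
            continue
        if same_laterality and a.laterality != b.laterality:
            continue
        pairs.append((a.image_id, b.image_id))
    return pairs


images = [
    Image("a", "p1", "s1", "frontal"),
    Image("b", "p1", "s1", "lateral"),
    Image("c", "p1", "s2", "frontal"),
    Image("d", "p2", "s3", "frontal"),
]

# Same patient, same study, any laterality (the best strategy per the abstract).
print(positive_pairs(images))
```

Relaxing `same_study` enlarges the pool of candidate pairs, while setting `same_laterality=True` shrinks it; the abstract reports that the same-patient, same-study, all-lateralities setting performed best.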

arXiv:1706.08424 [pdf, ps, other]

On algorithms to calculate integer complexity

Authors: Katherine Cordwell, Alyssa Epstein, Anand Hemmady, Steven J. Miller, Eyvindur A. Palsson, Aaditya Sharma, Stefan Steinerberger, Yen Nhi Truong Vu

Abstract: We consider a problem first proposed by Mahler and Popken in 1953 and later developed by Coppersmith, Erdős, Guy, Isbell, Selfridge, and others. Let $f(n)$ be the complexity of $n \in \mathbb{Z^{+}}$, where $f(n)$ is defined as the least number of $1$'s needed to represent $n$ in conjunction with an arbitrary number of $+$'s, $*$'s, and parentheses. Several algorithms have been developed to calculate the complexity of all integers up to $n$. Currently, the fastest known algorithm runs in time $\mathcal{O}(n^{1.230175})$ and was given by J. Arias de Reyna and J. van de Lune in 2014. This algorithm makes use of a recursive definition given by Guy and iterates through products, $f(d) + f\left(\frac{n}{d}\right)$, for $d \ |\ n$, and sums, $f(a) + f(n - a)$, for $a$ up to some function of $n$. The rate-limiting factor is iterating through the sums. We discuss potential improvements to this algorithm via a method that provides a strong uniform bound on the number of summands that must be calculated for almost all $n$. We also develop code to run J. Arias de Reyna and J. van de Lune's analysis in higher bases and thus reduce their runtime of $\mathcal{O}(n^{1.230175})$ to $\mathcal{O}(n^{1.222911236})$. All of our code can be found online at: https://github.com/kcordwel/Integer-Complexity.

Submitted 18 December, 2018; v1 submitted 26 June, 2017; originally announced June 2017.

Comments: 8 pages; more details were added for the complexity analysis and a link added to the code on GitHub; minor typos corrected

MSC Class: 11Y55; 11Y16 (primary); 11B75; 11A67; 68Q25 (secondary)
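
The recursive definition mentioned in the integer complexity abstract above, minimizing over products $f(d) + f(n/d)$ and sums $f(a) + f(n - a)$, can be sketched as a straightforward dynamic program. This is a naive illustration that checks every summand $a \le n/2$ and therefore runs in roughly quadratic time; it is not the Arias de Reyna and van de Lune algorithm, which bounds the number of summands to check. The function name `integer_complexity` is an assumption for this example.

```python
def integer_complexity(limit):
    """Return a list f where f[n] is the integer complexity of n for 1 <= n <= limit.

    f[n] is the least number of 1's needed to write n using +, * and parentheses.
    Index 0 is unused and left as 0.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    f = [0] * (limit + 1)
    f[1] = 1
    for n in range(2, limit + 1):
        best = n  # n written as 1 + 1 + ... + 1
        # Sums: n = a + (n - a); by symmetry a <= n // 2 suffices.
        for a in range(1, n // 2 + 1):
            best = min(best, f[a] + f[n - a])
        # Products: n = d * (n // d) for divisors d with d * d <= n.
        d = 2
        while d * d <= n:
            if n % d == 0:
                best = min(best, f[d] + f[n // d])
            d += 1
        f[n] = best
    return f


f = integer_complexity(12)
print(f[1:])
```

For example, $f(6) = 5$ because $6 = (1+1)(1+1+1)$, and $f(11) = 8$ because $11 = 1 + (1+1)(1+1+1+1+1)$.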

arXiv:1609.03120 [pdf, ps, other]

doi 10.1142/S2010326318500065

Random Matrix Ensembles with Split Limiting Behavior

Authors: Paula Burkhardt, Peter Cohen, Jonathan Dewitt, Max Hlavacek, Steven J. Miller, Carsten Sprunger, Yen Nhi Truong Vu, Roger Van Peski, Kevin Yang

Abstract: We introduce a new family of $N\times N$ random real symmetric matrix ensembles, the $k$-checkerboard matrices, whose limiting spectral measure has two components which can be determined explicitly. All but $k$ eigenvalues are in the bulk, and their behavior, appropriately normalized, converges to the semi-circle as $N\to\infty$; the remaining $k$ are tightly constrained near $N/k$ and their distribution converges to the $k \times k$ hollow GOE ensemble (this is the density arising by modifying the GOE ensemble by forcing all entries on the main diagonal to be zero). Similar results hold for complex and quaternionic analogues. We isolate the two regimes by using matrix perturbation results and a nonstandard weight function for the eigenvalues, then derive their limiting distributions using a modification of the method of moments and analysis of the resulting combinatorics.

Submitted 15 September, 2016; v1 submitted 11 September, 2016; originally announced September 2016.

Comments: Version 1.1, 31 pages, 3 figures, one appendix joint with Manuel Fernandez and Nicholas Sieger

MSC Class: 15B52 (primary); 15B57; 15B33 (secondary)

arXiv:1608.08764 [pdf, ps, other]

doi 10.1007/s40993-018-0137-7

Summand minimality and asymptotic convergence of generalized Zeckendorf decompositions

Authors: Katherine Cordwell, Max Hlavacek, Chi Huynh, Steven J. Miller, Carsten Peterson, Yen Nhi Truong Vu

Abstract: Given a recurrence sequence $H$, with $H_n = c_1 H_{n-1} + \dots + c_t H_{n-t}$ where $c_i \in \mathbb{N}_0$ for all $i$ and $c_1, c_t \geq 1$, the generalized Zeckendorf decomposition (gzd) of $m \in \mathbb{N}_0$ is the unique representation of $m$ using $H$ composed of blocks lexicographically less than $σ= (c_1, \dots, c_t)$. We prove that the gzd of $m$ uses the fewest number of summands among all representations of $m$ using $H$, for all $m$, if and only if $σ$ is weakly decreasing. We develop an algorithm for moving from any representation of $m$ to the gzd, the analysis of which proves that $σ$ weakly decreasing implies summand minimality. We prove that the gzds of numbers of the form $v_0 H_n + \dots + v_\ell H_{n-\ell}$ converge in a suitable sense as $n \to \infty$, furthermore we classify three distinct behaviors for this convergence. We use this result, together with the irreducibility of certain families of polynomials, to exhibit a representation with fewer summands than the gzd if $σ$ is not weakly decreasing.

Submitted 14 October, 2018; v1 submitted 31 August, 2016; originally announced August 2016.

Comments: Version 3.0, 27 pages

MSC Class: 11B37 (primary); 11B39; 65Q30 (secondary)

Journal ref: Res. number theory (2018) 4: 43

arXiv:1501.00519 [pdf, other]

Some case example exact solutions for quadratically nonlinear optical media with $\mathcal{PT}$-symmetric potentials

Authors: Y. N. Truong Vu, J. D'Ambroise, P. G. Kevrekidis, F. Kh. Abdullaev

Abstract: In the present paper we consider an optical system with a $χ^{(2)}$-type nonlinearity and unspecified $\mathcal{PT}$-symmetric potential functions. Considering this as an inverse problem and positing a family of exact solutions in terms of cnoidal functions, we solve for the resulting potential functions in a way that ensures the potentials obey the requirements of $\mathcal{PT}$-symmetry. We then focus on case examples of soliton and periodic solutions for which we present a stability analysis as a function of their amplitude parameters. Finally, we numerically explore the nonlinear dynamics of the associated waveforms to identify the outcome of the relevant dynamical instabilities of localized and extended states.

Submitted 2 January, 2015; originally announced January 2015.

MSC Class: 37K40; 35Q55

Showing 1–9 of 9 results for author: Vu, Y N T