Search | arXiv e-print repository

InterNeRF: Scaling Radiance Fields via Parameter Interpolation

Authors: Clinton Wang, Peter Hedman, Polina Golland, Jonathan T. Barron, Daniel Duckworth

Abstract: Neural Radiance Fields (NeRFs) have unmatched fidelity on large, real-world scenes. A common approach for scaling NeRFs is to partition the scene into regions, each of which is assigned its own parameters. When implemented naively, such an approach is limited by poor test-time scaling and inconsistent appearance and geometry. We instead propose InterNeRF, a novel architecture for rendering a targe… ▽ More Neural Radiance Fields (NeRFs) have unmatched fidelity on large, real-world scenes. A common approach for scaling NeRFs is to partition the scene into regions, each of which is assigned its own parameters. When implemented naively, such an approach is limited by poor test-time scaling and inconsistent appearance and geometry. We instead propose InterNeRF, a novel architecture for rendering a target view using a subset of the model's parameters. Our approach enables out-of-core training and rendering, increasing total model capacity with only a modest increase to training time. We demonstrate significant improvements in multi-room scenes while remaining competitive on standard benchmarks. △ Less

Submitted 17 June, 2024; originally announced June 2024.

Comments: Presented at CVPR 2024 Neural Rendering Intelligence Workshop

arXiv:2404.00132 [pdf, other]

FetalDiffusion: Pose-Controllable 3D Fetal MRI Synthesis with Conditional Diffusion Model

Authors: Molin Zhang, Polina Golland, Patricia Ellen Grant, Elfar Adalsteinsson

Abstract: The quality of fetal MRI is significantly affected by unpredictable and substantial fetal motion, leading to the introduction of artifacts even when fast acquisition sequences are employed. The development of 3D real-time fetal pose estimation approaches on volumetric EPI fetal MRI opens up a promising avenue for fetal motion monitoring and prediction. Challenges arise in fetal pose estimation due… ▽ More The quality of fetal MRI is significantly affected by unpredictable and substantial fetal motion, leading to the introduction of artifacts even when fast acquisition sequences are employed. The development of 3D real-time fetal pose estimation approaches on volumetric EPI fetal MRI opens up a promising avenue for fetal motion monitoring and prediction. Challenges arise in fetal pose estimation due to limited number of real scanned fetal MR training images, hindering model generalization when the acquired fetal MRI lacks adequate pose. In this study, we introduce FetalDiffusion, a novel approach utilizing a conditional diffusion model to generate 3D synthetic fetal MRI with controllable pose. Additionally, an auxiliary pose-level loss is adopted to enhance model performance. Our work demonstrates the success of this proposed model by producing high-quality synthetic fetal MRI images with accurate and recognizable fetal poses, comparing favorably with in-vivo real fetal MRI. Furthermore, we show that the integration of synthetic fetal MR images enhances the fetal pose estimation model's performance, particularly when the number of available real scanned data is limited resulting in 15.4% increase in PCK and 50.2% reduced in mean error. All experiments are done on a single 32GB V100 GPU. Our method holds promise for improving real-time tracking models, thereby addressing fetal motion issues more effectively. △ Less

Submitted 29 March, 2024; originally announced April 2024.

Comments: 8 pages, 3 figures, 2 tables, submitted to MICCAI 2024, code available if accepted

arXiv:2402.02318 [pdf, other]

Diversity Measurement and Subset Selection for Instruction Tuning Datasets

Authors: Peiqi Wang, Yikang Shen, Zhen Guo, Matthew Stallone, Yoon Kim, Polina Golland, Rameswar Panda

Abstract: We aim to select data subsets for the fine-tuning of large language models to more effectively follow instructions. Prior work has emphasized the importance of diversity in dataset curation but relied on heuristics such as the number of tasks. In this paper, we use determinantal point processes to capture the diversity and quality of instruction tuning datasets for subset selection. We propose to… ▽ More We aim to select data subsets for the fine-tuning of large language models to more effectively follow instructions. Prior work has emphasized the importance of diversity in dataset curation but relied on heuristics such as the number of tasks. In this paper, we use determinantal point processes to capture the diversity and quality of instruction tuning datasets for subset selection. We propose to measure dataset diversity with log determinant distance that is the distance between the dataset of interest and a maximally diverse reference dataset. Our experiments demonstrate that the proposed diversity measure in the normalized weight gradient space is correlated with downstream instruction-following performance. Consequently, it can be used to inform when data selection is the most helpful and to analyze dataset curation strategies. We demonstrate the utility of our approach on various instruction tuning datasets. △ Less

Submitted 3 February, 2024; originally announced February 2024.

arXiv:2312.13534 [pdf, other]

doi 10.1109/TMI.2024.3411989

SE(3)-Equivariant and Noise-Invariant 3D Rigid Motion Tracking in Brain MRI

Authors: Benjamin Billot, Neel Dey, Daniel Moyer, Malte Hoffmann, Esra Abaci Turk, Borjan Gagoski, Ellen Grant, Polina Golland

Abstract: Rigid motion tracking is paramount in many medical imaging applications where movements need to be detected, corrected, or accounted for. Modern strategies rely on convolutional neural networks (CNN) and pose this problem as rigid registration. Yet, CNNs do not exploit natural symmetries in this task, as they are equivariant to translations (their outputs shift with their inputs) but not to rotati… ▽ More Rigid motion tracking is paramount in many medical imaging applications where movements need to be detected, corrected, or accounted for. Modern strategies rely on convolutional neural networks (CNN) and pose this problem as rigid registration. Yet, CNNs do not exploit natural symmetries in this task, as they are equivariant to translations (their outputs shift with their inputs) but not to rotations. Here we propose EquiTrack, the first method that uses recent steerable SE(3)-equivariant CNNs (E-CNN) for motion tracking. While steerable E-CNNs can extract corresponding features across different poses, testing them on noisy medical images reveals that they do not have enough learning capacity to learn noise invariance. Thus, we introduce a hybrid architecture that pairs a denoiser with an E-CNN to decouple the processing of anatomically irrelevant intensity features from the extraction of equivariant spatial features. Rigid transforms are then estimated in closed-form. EquiTrack outperforms state-of-the-art learning and optimisation methods for motion tracking in adult brain MRI and fetal MRI time series. Our code is available at https://github.com/BBillot/EquiTrack. △ Less

Submitted 12 June, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

Comments: Published at IEEE transactions on Medical Imaging

arXiv:2312.06358 [pdf, other]

Intraoperative 2D/3D Image Registration via Differentiable X-ray Rendering

Authors: Vivek Gopalakrishnan, Neel Dey, Polina Golland

Abstract: Surgical decisions are informed by aligning rapid portable 2D intraoperative images (e.g., X-rays) to a high-fidelity 3D preoperative reference scan (e.g., CT). 2D/3D image registration often fails in practice: conventional optimization methods are prohibitively slow and susceptible to local minima, while neural networks trained on small datasets fail on new patients or require impractical landmar… ▽ More Surgical decisions are informed by aligning rapid portable 2D intraoperative images (e.g., X-rays) to a high-fidelity 3D preoperative reference scan (e.g., CT). 2D/3D image registration often fails in practice: conventional optimization methods are prohibitively slow and susceptible to local minima, while neural networks trained on small datasets fail on new patients or require impractical landmark supervision. We present DiffPose, a self-supervised approach that leverages patient-specific simulation and differentiable physics-based rendering to achieve accurate 2D/3D registration without relying on manually labeled data. Preoperatively, a CNN is trained to regress the pose of a randomly oriented synthetic X-ray rendered from the preoperative CT. The CNN then initializes rapid intraoperative test-time optimization that uses the differentiable X-ray renderer to refine the solution. Our work further proposes several geometrically principled methods for sampling camera poses from $\mathbf{SE}(3)$, for sparse differentiable rendering, and for driving registration in the tangent space $\mathfrak{se}(3)$ with geodesic and multiscale locality-sensitive losses. DiffPose achieves sub-millimeter accuracy across surgical datasets at intraoperative speeds, improving upon existing unsupervised methods by an order of magnitude and even outperforming supervised baselines. Our code is available at https://github.com/eigenvivek/DiffPose. △ Less

Submitted 27 March, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

Comments: CVPR 2024

arXiv:2312.05148 [pdf, other]

doi 10.59275/j.melba.2023-g3f8

Shape-aware Segmentation of the Placenta in BOLD Fetal MRI Time Series

Authors: S. Mazdak Abulnaga, Neel Dey, Sean I. Young, Eileen Pan, Katherine I. Hobgood, Clinton J. Wang, P. Ellen Grant, Esra Abaci Turk, Polina Golland

Abstract: Blood oxygen level dependent (BOLD) MRI time series with maternal hyperoxia can assess placental oxygenation and function. Measuring precise BOLD changes in the placenta requires accurate temporal placental segmentation and is confounded by fetal and maternal motion, contractions, and hyperoxia-induced intensity changes. Current BOLD placenta segmentation methods warp a manually annotated subject-… ▽ More Blood oxygen level dependent (BOLD) MRI time series with maternal hyperoxia can assess placental oxygenation and function. Measuring precise BOLD changes in the placenta requires accurate temporal placental segmentation and is confounded by fetal and maternal motion, contractions, and hyperoxia-induced intensity changes. Current BOLD placenta segmentation methods warp a manually annotated subject-specific template to the entire time series. However, as the placenta is a thin, elongated, and highly non-rigid organ subject to large deformations and obfuscated edges, existing work cannot accurately segment the placental shape, especially near boundaries. In this work, we propose a machine learning segmentation framework for placental BOLD MRI and apply it to segmenting each volume in a time series. We use a placental-boundary weighted loss formulation and perform a comprehensive evaluation across several popular segmentation objectives. Our model is trained and tested on a cohort of 91 subjects containing healthy fetuses, fetuses with fetal growth restriction, and mothers with high BMI. Biomedically, our model performs reliably in segmenting volumes in both normoxic and hyperoxic points in the BOLD time series. We further find that boundary-weighting increases placental segmentation performance by 8.3% and 6.0% Dice coefficient for the cross-entropy and signed distance transform objectives, respectively. Our code and trained model is available at https://github.com/mabulnaga/automatic-placenta-segmentation. △ Less

Submitted 8 December, 2023; originally announced December 2023.

Comments: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://melba-journal.org/2023:017. arXiv admin note: substantial text overlap with arXiv:2208.02895

Journal ref: Machine.Learning.for.Biomedical.Imaging. 2 (2023)

arXiv:2312.03102 [pdf]

Fully Convolutional Slice-to-Volume Reconstruction for Single-Stack MRI

Authors: Sean I. Young, Yaël Balbastre, Bruce Fischl, Polina Golland, Juan Eugenio Iglesias

Abstract: In magnetic resonance imaging (MRI), slice-to-volume reconstruction (SVR) refers to computational reconstruction of an unknown 3D magnetic resonance volume from stacks of 2D slices corrupted by motion. While promising, current SVR methods require multiple slice stacks for accurate 3D reconstruction, leading to long scans and limiting their use in time-sensitive applications such as fetal fMRI. Her… ▽ More In magnetic resonance imaging (MRI), slice-to-volume reconstruction (SVR) refers to computational reconstruction of an unknown 3D magnetic resonance volume from stacks of 2D slices corrupted by motion. While promising, current SVR methods require multiple slice stacks for accurate 3D reconstruction, leading to long scans and limiting their use in time-sensitive applications such as fetal fMRI. Here, we propose a SVR method that overcomes the shortcomings of previous work and produces state-of-the-art reconstructions in the presence of extreme inter-slice motion. Inspired by the recent success of single-view depth estimation methods, we formulate SVR as a single-stack motion estimation task and train a fully convolutional network to predict a motion stack for a given slice stack, producing a 3D reconstruction as a byproduct of the predicted motion. Extensive experiments on the SVR of adult and fetal brains demonstrate that our fully convolutional method is twice as accurate as previous SVR methods. Our code is available at github.com/seannz/svr. △ Less

Submitted 28 February, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

Comments: Accepted to CVPR 2024

arXiv:2311.02874 [pdf, other]

doi 10.48550/arXiv.2311.02874

Dynamic Neural Fields for Learning Atlases of 4D Fetal MRI Time-series

Authors: Zeen Chi, Zhongxiao Cong, Clinton J. Wang, Yingcheng Liu, Esra Abaci Turk, P. Ellen Grant, S. Mazdak Abulnaga, Polina Golland, Neel Dey

Abstract: We present a method for fast biomedical image atlas construction using neural fields. Atlases are key to biomedical image analysis tasks, yet conventional and deep network estimation methods remain time-intensive. In this preliminary work, we frame subject-specific atlas building as learning a neural field of deformable spatiotemporal observations. We apply our method to learning subject-specific… ▽ More We present a method for fast biomedical image atlas construction using neural fields. Atlases are key to biomedical image analysis tasks, yet conventional and deep network estimation methods remain time-intensive. In this preliminary work, we frame subject-specific atlas building as learning a neural field of deformable spatiotemporal observations. We apply our method to learning subject-specific atlases and motion stabilization of dynamic BOLD MRI time-series of fetuses in utero. Our method yields high-quality atlases of fetal BOLD time-series with $\sim$5-7$\times$ faster convergence compared to existing work. While our method slightly underperforms well-tuned baselines in terms of anatomical overlap, it estimates templates significantly faster, thus enabling rapid processing and stabilization of large databases of 4D dynamic MRI acquisitions. Code is available at https://github.com/Kidrauh/neural-atlasing △ Less

Submitted 6 November, 2023; originally announced November 2023.

Comments: 6 pages, 2 figures. Accepted by Medical Imaging Meets NeurIPS 2023

arXiv:2310.19635 [pdf, other]

Bidirectional Captioning for Clinically Accurate and Interpretable Models

Authors: Keegan Quigley, Miriam Cha, Josh Barua, Geeticka Chauhan, Seth Berkowitz, Steven Horng, Polina Golland

Abstract: Vision-language pretraining has been shown to produce high-quality visual encoders which transfer efficiently to downstream computer vision tasks. While generative language models have gained widespread attention, image captioning has thus far been mostly overlooked as a form of cross-modal pretraining in favor of contrastive learning, especially in medical image analysis. In this paper, we experi… ▽ More Vision-language pretraining has been shown to produce high-quality visual encoders which transfer efficiently to downstream computer vision tasks. While generative language models have gained widespread attention, image captioning has thus far been mostly overlooked as a form of cross-modal pretraining in favor of contrastive learning, especially in medical image analysis. In this paper, we experiment with bidirectional captioning of radiology reports as a form of pretraining and compare the quality and utility of learned embeddings with those from contrastive pretraining methods. We optimize a CNN encoder, transformer decoder architecture named RadTex for the radiology domain. Results show that not only does captioning pretraining yield visual encoders that are competitive with contrastive pretraining (CheXpert competition multi-label AUC of 89.4%), but also that our transformer decoder is capable of generating clinically relevant reports (captioning macro-F1 score of 0.349 using CheXpert labeler) and responding to prompts with targeted, interactive outputs. △ Less

Submitted 30 October, 2023; originally announced October 2023.

Comments: 12 pages, 7 figures. Code release to follow

arXiv:2310.05133 [pdf, other]

Geometry Aware Field-to-field Transformations for 3D Semantic Segmentation

Authors: Dominik Hollidt, Clinton Wang, Polina Golland, Marc Pollefeys

Abstract: We present a novel approach to perform 3D semantic segmentation solely from 2D supervision by leveraging Neural Radiance Fields (NeRFs). By extracting features along a surface point cloud, we achieve a compact representation of the scene which is sample-efficient and conducive to 3D reasoning. Learning this feature space in an unsupervised manner via masked autoencoding enables few-shot segmentati… ▽ More We present a novel approach to perform 3D semantic segmentation solely from 2D supervision by leveraging Neural Radiance Fields (NeRFs). By extracting features along a surface point cloud, we achieve a compact representation of the scene which is sample-efficient and conducive to 3D reasoning. Learning this feature space in an unsupervised manner via masked autoencoding enables few-shot segmentation. Our method is agnostic to the scene parameterization, working on scenes fit with any type of NeRF. △ Less

Submitted 8 October, 2023; originally announced October 2023.

Comments: 8 pages

arXiv:2310.03870 [pdf, other]

Consistency Regularization Improves Placenta Segmentation in Fetal EPI MRI Time Series

Authors: Yingcheng Liu, Neerav Karani, Neel Dey, S. Mazdak Abulnaga, Junshen Xu, P. Ellen Grant, Esra Abaci Turk, Polina Golland

Abstract: The placenta plays a crucial role in fetal development. Automated 3D placenta segmentation from fetal EPI MRI holds promise for advancing prenatal care. This paper proposes an effective semi-supervised learning method for improving placenta segmentation in fetal EPI MRI time series. We employ consistency regularization loss that promotes consistency under spatial transformation of the same image a… ▽ More The placenta plays a crucial role in fetal development. Automated 3D placenta segmentation from fetal EPI MRI holds promise for advancing prenatal care. This paper proposes an effective semi-supervised learning method for improving placenta segmentation in fetal EPI MRI time series. We employ consistency regularization loss that promotes consistency under spatial transformation of the same image and temporal consistency across nearby images in a time series. The experimental results show that the method improves the overall segmentation accuracy and provides better performance for outliers and hard samples. The evaluation also indicates that our method improves the temporal coherency of the prediction, which could lead to more accurate computation of temporal placental biomarkers. This work contributes to the study of the placenta and prenatal clinical decision-making. Code is available at https://github.com/firstmover/cr-seg. △ Less

Submitted 15 October, 2023; v1 submitted 5 October, 2023; originally announced October 2023.

arXiv:2307.12560 [pdf, other]

Interpolating between Images with Diffusion Models

Authors: Clinton J. Wang, Polina Golland

Abstract: One little-explored frontier of image generation and editing is the task of interpolating between two input images, a feature missing from all currently deployed image generation pipelines. We argue that such a feature can expand the creative applications of such models, and propose a method for zero-shot interpolation using latent diffusion models. We apply interpolation in the latent space at a… ▽ More One little-explored frontier of image generation and editing is the task of interpolating between two input images, a feature missing from all currently deployed image generation pipelines. We argue that such a feature can expand the creative applications of such models, and propose a method for zero-shot interpolation using latent diffusion models. We apply interpolation in the latent space at a sequence of decreasing noise levels, then perform denoising conditioned on interpolated text embeddings derived from textual inversion and (optionally) subject poses. For greater consistency, or to specify additional criteria, we can generate several candidates and use CLIP to select the highest quality image. We obtain convincing interpolations across diverse subject poses, image styles, and image content, and show that standard quantitative metrics such as FID are insufficient to measure the quality of an interpolation. Code and data are available at https://clintonjwang.github.io/interpolation. △ Less

Submitted 24 July, 2023; originally announced July 2023.

Comments: Presented at ICML 2023 Workshop on Challenges of Deploying Generative AI

arXiv:2307.08163 [pdf, other]

Boundary-weighted logit consistency improves calibration of segmentation networks

Authors: Neerav Karani, Neel Dey, Polina Golland

Abstract: Neural network prediction probabilities and accuracy are often only weakly-correlated. Inherent label ambiguity in training data for image segmentation aggravates such miscalibration. We show that logit consistency across stochastic transformations acts as a spatially varying regularizer that prevents overconfident predictions at pixels with ambiguous labels. Our boundary-weighted extension of thi… ▽ More Neural network prediction probabilities and accuracy are often only weakly-correlated. Inherent label ambiguity in training data for image segmentation aggravates such miscalibration. We show that logit consistency across stochastic transformations acts as a spatially varying regularizer that prevents overconfident predictions at pixels with ambiguous labels. Our boundary-weighted extension of this regularizer provides state-of-the-art calibration for prostate and heart MRI segmentation. △ Less

Submitted 16 July, 2023; originally announced July 2023.

Comments: Accepted for publication at MICCAI 2023

arXiv:2307.07044 [pdf, other]

AnyStar: Domain randomized universal star-convex 3D instance segmentation

Authors: Neel Dey, S. Mazdak Abulnaga, Benjamin Billot, Esra Abaci Turk, P. Ellen Grant, Adrian V. Dalca, Polina Golland

Abstract: Star-convex shapes arise across bio-microscopy and radiology in the form of nuclei, nodules, metastases, and other units. Existing instance segmentation networks for such structures train on densely labeled instances for each dataset, which requires substantial and often impractical manual annotation effort. Further, significant reengineering or finetuning is needed when presented with new dataset… ▽ More Star-convex shapes arise across bio-microscopy and radiology in the form of nuclei, nodules, metastases, and other units. Existing instance segmentation networks for such structures train on densely labeled instances for each dataset, which requires substantial and often impractical manual annotation effort. Further, significant reengineering or finetuning is needed when presented with new datasets and imaging modalities due to changes in contrast, shape, orientation, resolution, and density. We present AnyStar, a domain-randomized generative model that simulates synthetic training data of blob-like objects with randomized appearance, environments, and imaging physics to train general-purpose star-convex instance segmentation networks. As a result, networks trained using our generative model do not require annotated images from unseen datasets. A single network trained on our synthesized data accurately 3D segments C. elegans and P. dumerilii nuclei in fluorescence microscopy, mouse cortical nuclei in micro-CT, zebrafish brain nuclei in EM, and placental cotyledons in human fetal MRI, all without any retraining, finetuning, transfer learning, or domain adaptation. Code is available at https://github.com/neel-dey/AnyStar. △ Less

Submitted 13 July, 2023; originally announced July 2023.

Comments: Code available at https://github.com/neel-dey/AnyStar

arXiv:2305.03413 [pdf, other]

doi 10.1007/978-3-031-43993-3_24

Domain-agnostic segmentation of thalamic nuclei from joint structural and diffusion MRI

Authors: Henry F. J. Tregidgo, Sonja Soskic, Mark D. Olchanyi, Juri Althonayan, Benjamin Billot, Chiara Maffei, Polina Golland, Anastasia Yendiki, Daniel C. Alexander, Martina Bocchetta, Jonathan D. Rohrer, Juan Eugenio Iglesias

Abstract: The human thalamus is a highly connected subcortical grey-matter structure within the brain. It comprises dozens of nuclei with different function and connectivity, which are affected differently by disease. For this reason, there is growing interest in studying the thalamic nuclei in vivo with MRI. Tools are available to segment the thalamus from 1 mm T1 scans, but the contrast of the lateral and… ▽ More The human thalamus is a highly connected subcortical grey-matter structure within the brain. It comprises dozens of nuclei with different function and connectivity, which are affected differently by disease. For this reason, there is growing interest in studying the thalamic nuclei in vivo with MRI. Tools are available to segment the thalamus from 1 mm T1 scans, but the contrast of the lateral and internal boundaries is too faint to produce reliable segmentations. Some tools have attempted to incorporate information from diffusion MRI in the segmentation to refine these boundaries, but do not generalise well across diffusion MRI acquisitions. Here we present the first CNN that can segment thalamic nuclei from T1 and diffusion data of any resolution without retraining or fine tuning. Our method builds on a public histological atlas of the thalamic nuclei and silver standard segmentations on high-quality diffusion data obtained with a recent Bayesian adaptive segmentation tool. We combine these with an approximate degradation model for fast domain randomisation during training. Our CNN produces a segmentation at 0.7 mm isotropic resolution, irrespective of the resolution of the input. Moreover, it uses a parsimonious model of the diffusion signal at each voxel (fractional anisotropy and principal eigenvector) that is compatible with virtually any set of directions and b-values, including huge amounts of legacy data. We show results of our proposed method on three heterogeneous datasets acquired on dozens of different scanners. An implementation of the method is publicly available at https://freesurfer.net/fswiki/ThalamicNucleiDTI. △ Less

Submitted 5 May, 2023; originally announced May 2023.

Comments: Under review

arXiv:2304.13181 [pdf, other]

Sample-Specific Debiasing for Better Image-Text Models

Authors: Peiqi Wang, Yingcheng Liu, Ching-Yun Ko, William M. Wells, Seth Berkowitz, Steven Horng, Polina Golland

Abstract: Self-supervised representation learning on image-text data facilitates crucial medical applications, such as image classification, visual grounding, and cross-modal retrieval. One common approach involves contrasting semantically similar (positive) and dissimilar (negative) pairs of data points. Drawing negative samples uniformly from the training data set introduces false negatives, i.e., samples… ▽ More Self-supervised representation learning on image-text data facilitates crucial medical applications, such as image classification, visual grounding, and cross-modal retrieval. One common approach involves contrasting semantically similar (positive) and dissimilar (negative) pairs of data points. Drawing negative samples uniformly from the training data set introduces false negatives, i.e., samples that are treated as dissimilar but belong to the same class. In healthcare data, the underlying class distribution is nonuniform, implying that false negatives occur at a highly variable rate. To improve the quality of learned representations, we develop a novel approach that corrects for false negatives. Our method can be viewed as a variant of debiased contrastive learning that uses estimated sample-specific class probabilities. We provide theoretical analysis of the objective function and demonstrate the proposed approach on both image and paired image-text data sets. Our experiments illustrate empirical advantages of sample-specific debiasing. △ Less

Submitted 12 August, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

Comments: Machine Learning for Healthcare Conference 2023

arXiv:2301.10365 [pdf, other]

Data Consistent Deep Rigid MRI Motion Correction

Authors: Nalini M. Singh, Neel Dey, Malte Hoffmann, Bruce Fischl, Elfar Adalsteinsson, Robert Frost, Adrian V. Dalca, Polina Golland

Abstract: Motion artifacts are a pervasive problem in MRI, leading to misdiagnosis or mischaracterization in population-level imaging studies. Current retrospective rigid intra-slice motion correction techniques jointly optimize estimates of the image and the motion parameters. In this paper, we use a deep network to reduce the joint image-motion parameter search to a search over rigid motion parameters alo… ▽ More Motion artifacts are a pervasive problem in MRI, leading to misdiagnosis or mischaracterization in population-level imaging studies. Current retrospective rigid intra-slice motion correction techniques jointly optimize estimates of the image and the motion parameters. In this paper, we use a deep network to reduce the joint image-motion parameter search to a search over rigid motion parameters alone. Our network produces a reconstruction as a function of two inputs: corrupted k-space data and motion parameters. We train the network using simulated, motion-corrupted k-space data generated with known motion parameters. At test-time, we estimate unknown motion parameters by minimizing a data consistency loss between the motion parameters, the network-based image reconstruction given those parameters, and the acquired measurements. Intra-slice motion correction experiments on simulated and realistic 2D fast spin echo brain MRI achieve high reconstruction fidelity while providing the benefits of explicit data consistency optimization. Our code is publicly available at https://www.github.com/nalinimsingh/neuroMoCo. △ Less

Submitted 16 November, 2023; v1 submitted 24 January, 2023; originally announced January 2023.

Comments: Presented at MIDL 2023. 14 pages, 6 figures. Keywords: motion correction, magnetic resonance imaging, deep learning

arXiv:2212.05561 [pdf, other]

doi 10.1007/978-3-031-34048-2_35

Using Multiple Instance Learning to Build Multimodal Representations

Authors: Peiqi Wang, William M. Wells, Seth Berkowitz, Steven Horng, Polina Golland

Abstract: Image-text multimodal representation learning aligns data across modalities and enables important medical applications, e.g., image classification, visual grounding, and cross-modal retrieval. In this work, we establish a connection between multimodal representation learning and multiple instance learning. Based on this connection, we propose a generic framework for constructing permutation-invari… ▽ More Image-text multimodal representation learning aligns data across modalities and enables important medical applications, e.g., image classification, visual grounding, and cross-modal retrieval. In this work, we establish a connection between multimodal representation learning and multiple instance learning. Based on this connection, we propose a generic framework for constructing permutation-invariant score functions with many existing multimodal representation learning approaches as special cases. Furthermore, we use the framework to derive a novel contrastive learning approach and demonstrate that our method achieves state-of-the-art results in several downstream tasks. △ Less

Submitted 9 March, 2023; v1 submitted 11 December, 2022; originally announced December 2022.

arXiv:2209.14267 [pdf, ps, other]

Less is More: Rethinking Few-Shot Learning and Recurrent Neural Nets

Authors: Deborah Pereg, Martin Villiger, Brett Bouma, Polina Golland

Abstract: The statistical supervised learning framework assumes an input-output set with a joint probability distribution that is reliably represented by the training dataset. The learner is then required to output a prediction rule learned from the training dataset's input-output pairs. In this work, we provide meaningful insights into the asymptotic equipartition property (AEP) \citep{Shannon:1948} in the… ▽ More The statistical supervised learning framework assumes an input-output set with a joint probability distribution that is reliably represented by the training dataset. The learner is then required to output a prediction rule learned from the training dataset's input-output pairs. In this work, we provide meaningful insights into the asymptotic equipartition property (AEP) \citep{Shannon:1948} in the context of machine learning, and illuminate some of its potential ramifications for few-shot learning. We provide theoretical guarantees for reliable learning under the information-theoretic AEP, and for the generalization error with respect to the sample size. We then focus on a highly efficient recurrent neural net (RNN) framework and propose a reduced-entropy algorithm for few-shot learning. We also propose a mathematical intuition for the RNN as an approximation of a sparse coding solver. We verify the applicability, robustness, and computational efficiency of the proposed approach with image deblurring and optical coherence tomography (OCT) speckle suppression. Our experimental results demonstrate significant potential for improving learning models' sample efficiency, generalization, and time complexity, that can therefore be leveraged for practical real-time applications. △ Less

Submitted 25 February, 2023; v1 submitted 28 September, 2022; originally announced September 2022.

arXiv:2208.12737 [pdf, other]

Fast Auto-Differentiable Digitally Reconstructed Radiographs for Solving Inverse Problems in Intraoperative Imaging

Authors: Vivek Gopalakrishnan, Polina Golland

Abstract: The use of digitally reconstructed radiographs (DRRs) to solve inverse problems such as slice-to-volume registration and 3D reconstruction is well-studied in preoperative settings. In intraoperative imaging, the utility of DRRs is limited by the challenges in generating them in real-time and supporting optimization procedures that rely on repeated DRR synthesis. While immense progress has been mad… ▽ More The use of digitally reconstructed radiographs (DRRs) to solve inverse problems such as slice-to-volume registration and 3D reconstruction is well-studied in preoperative settings. In intraoperative imaging, the utility of DRRs is limited by the challenges in generating them in real-time and supporting optimization procedures that rely on repeated DRR synthesis. While immense progress has been made in accelerating the generation of DRRs through algorithmic refinements and GPU implementations, DRR-based optimization remains slow because most DRR generators do not offer a straightforward way to obtain gradients with respect to the imaging parameters. To make DRRs interoperable with gradient-based optimization and deep learning frameworks, we have reformulated Siddon's method, the most popular ray-tracing algorithm used in DRR generation, as a series of vectorized tensor operations. We implemented this vectorized version of Siddon's method in PyTorch, taking advantage of the library's strong automatic differentiation engine to make this DRR generator fully differentiable with respect to its parameters. Additionally, using GPU-accelerated tensor computation enables our vectorized implementation to achieve rendering speeds equivalent to state-of-the-art DRR generators implemented in CUDA and C++. We illustrate the resulting method in the context of slice-to-volume registration. Moreover, our simulations suggest that the loss landscapes for the slice-to-volume registration problem are convex in the neighborhood of the optimal solution, and gradient-based registration promises a much faster solution than prevailing gradient-free optimization strategies. The proposed DRR generator enables fast computer vision algorithms to support image guidance in minimally invasive procedures. Our implementation is publically available at https://github.com/v715/DiffDRR. △ Less

Submitted 26 August, 2022; originally announced August 2022.

Comments: Published in Clinical Image-based Procedures (CLIP) workshop at MICCAI 2022

arXiv:2208.03218 [pdf, other]

doi 10.1007/978-3-031-16876-5_3

RadTex: Learning Efficient Radiograph Representations from Text Reports

Authors: Keegan Quigley, Miriam Cha, Ruizhi Liao, Geeticka Chauhan, Steven Horng, Seth Berkowitz, Polina Golland

Abstract: Automated analysis of chest radiography using deep learning has tremendous potential to enhance the clinical diagnosis of diseases in patients. However, deep learning models typically require large amounts of annotated data to achieve high performance -- often an obstacle to medical domain adaptation. In this paper, we build a data-efficient learning framework that utilizes radiology reports to im… ▽ More Automated analysis of chest radiography using deep learning has tremendous potential to enhance the clinical diagnosis of diseases in patients. However, deep learning models typically require large amounts of annotated data to achieve high performance -- often an obstacle to medical domain adaptation. In this paper, we build a data-efficient learning framework that utilizes radiology reports to improve medical image classification performance with limited labeled data (fewer than 1000 examples). Specifically, we examine image-captioning pretraining to learn high-quality medical image representations that train on fewer examples. Following joint pretraining of a convolutional encoder and transformer decoder, we transfer the learned encoder to various classification tasks. Averaged over 9 pathologies, we find that our model achieves higher classification performance than ImageNet-supervised and in-domain supervised pretraining when labeled training data is limited. △ Less

Submitted 7 April, 2023; v1 submitted 5 August, 2022; originally announced August 2022.

Comments: Awarded Best Paper at Resource Efficient Medical Image Analysis (REMIA) Workshop, MICCAI 2022

arXiv:2208.02895 [pdf, other]

Automatic Segmentation of the Placenta in BOLD MRI Time Series

Authors: S. Mazdak Abulnaga, Sean I. Young, Katherine Hobgood, Eileen Pan, Clinton J. Wang, P. Ellen Grant, Esra Abaci Turk, Polina Golland

Abstract: Blood oxygen level dependent (BOLD) MRI with maternal hyperoxia can assess oxygen transport within the placenta and has emerged as a promising tool to study placental function. Measuring signal changes over time requires segmenting the placenta in each volume of the time series. Due to the large number of volumes in the BOLD time series, existing studies rely on registration to map all volumes to… ▽ More Blood oxygen level dependent (BOLD) MRI with maternal hyperoxia can assess oxygen transport within the placenta and has emerged as a promising tool to study placental function. Measuring signal changes over time requires segmenting the placenta in each volume of the time series. Due to the large number of volumes in the BOLD time series, existing studies rely on registration to map all volumes to a manually segmented template. As the placenta can undergo large deformation due to fetal motion, maternal motion, and contractions, this approach often results in a large number of discarded volumes, where the registration approach fails. In this work, we propose a machine learning model based on a U-Net neural network architecture to automatically segment the placenta in BOLD MRI and apply it to segmenting each volume in a time series. We use a boundary-weighted loss function to accurately capture the placental shape. Our model is trained and tested on a cohort of 91 subjects containing healthy fetuses, fetuses with fetal growth restriction, and mothers with high BMI. We achieve a Dice score of 0.83+/-0.04 when matching with ground truth labels and our model performs reliably in segmenting volumes in both normoxic and hyperoxic points in the BOLD time series. Our code and trained model are available at https://github.com/mabulnaga/automatic-placenta-segmentation. △ Less

Submitted 4 August, 2022; originally announced August 2022.

Comments: Accepted at MICCAI PIPPI 2022

arXiv:2206.10802 [pdf, other]

SVoRT: Iterative Transformer for Slice-to-Volume Registration in Fetal Brain MRI

Authors: Junshen Xu, Daniel Moyer, P. Ellen Grant, Polina Golland, Juan Eugenio Iglesias, Elfar Adalsteinsson

Abstract: Volumetric reconstruction of fetal brains from multiple stacks of MR slices, acquired in the presence of almost unpredictable and often severe subject motion, is a challenging task that is highly sensitive to the initialization of slice-to-volume transformations. We propose a novel slice-to-volume registration method using Transformers trained on synthetically transformed data, which model multipl… ▽ More Volumetric reconstruction of fetal brains from multiple stacks of MR slices, acquired in the presence of almost unpredictable and often severe subject motion, is a challenging task that is highly sensitive to the initialization of slice-to-volume transformations. We propose a novel slice-to-volume registration method using Transformers trained on synthetically transformed data, which model multiple stacks of MR slices as a sequence. With the attention mechanism, our model automatically detects the relevance between slices and predicts the transformation of one slice using information from other slices. We also estimate the underlying 3D volume to assist slice-to-volume registration and update the volume and transformations alternately to improve accuracy. Results on synthetic data show that our method achieves lower registration error and better reconstruction quality compared with existing state-of-the-art methods. Experiments with real-world MRI data are also performed to demonstrate the ability of the proposed model to improve the quality of 3D reconstruction under severe fetal motion. △ Less

Submitted 21 June, 2022; originally announced June 2022.

Comments: Accepted by MICCAI 2022

arXiv:2206.01178 [pdf, other]

Discretization Invariant Networks for Learning Maps between Neural Fields

Authors: Clinton J. Wang, Polina Golland

Abstract: With the emergence of powerful representations of continuous data in the form of neural fields, there is a need for discretization invariant learning: an approach for learning maps between functions on continuous domains without being sensitive to how the function is sampled. We present a new framework for understanding and designing discretization invariant neural networks (DI-Nets), which genera… ▽ More With the emergence of powerful representations of continuous data in the form of neural fields, there is a need for discretization invariant learning: an approach for learning maps between functions on continuous domains without being sensitive to how the function is sampled. We present a new framework for understanding and designing discretization invariant neural networks (DI-Nets), which generalizes many discrete networks such as convolutional neural networks as well as continuous networks such as neural operators. Our analysis establishes upper bounds on the deviation in model outputs under different finite discretizations, and highlights the central role of point set discrepancy in characterizing such bounds. This insight leads to the design of a family of neural networks driven by numerical integration via quasi-Monte Carlo sampling with discretizations of low discrepancy. We prove by construction that DI-Nets universally approximate a large class of maps between integrable function spaces, and show that discretization invariance also describes backpropagation through such models. Applied to neural fields, convolutional DI-Nets can learn to classify and segment visual data under various discretizations, and sometimes generalize to new types of discretizations at test time. Code: https://github.com/clintonjwang/DI-net. △ Less

Submitted 19 October, 2023; v1 submitted 2 June, 2022; originally announced June 2022.

Comments: Published in Transactions on Machine Learning Research 2023

arXiv:2202.02952 [pdf]

doi 10.1109/TPAMI.2023.3299789

Supervision by Denoising for Medical Image Segmentation

Authors: Sean I. Young, Adrian V. Dalca, Enzo Ferrante, Polina Golland, Christopher A. Metzler, Bruce Fischl, Juan Eugenio Iglesias

Abstract: Learning-based image reconstruction models, such as those based on the U-Net, require a large set of labeled images if good generalization is to be guaranteed. In some imaging domains, however, labeled data with pixel- or voxel-level label accuracy are scarce due to the cost of acquiring them. This problem is exacerbated further in domains like medical imaging, where there is no single ground trut… ▽ More Learning-based image reconstruction models, such as those based on the U-Net, require a large set of labeled images if good generalization is to be guaranteed. In some imaging domains, however, labeled data with pixel- or voxel-level label accuracy are scarce due to the cost of acquiring them. This problem is exacerbated further in domains like medical imaging, where there is no single ground truth label, resulting in large amounts of repeat variability in the labels. Therefore, training reconstruction networks to generalize better by learning from both labeled and unlabeled examples (called semi-supervised learning) is problem of practical and theoretical interest. However, traditional semi-supervised learning methods for image reconstruction often necessitate handcrafting a differentiable regularizer specific to some given imaging problem, which can be extremely time-consuming. In this work, we propose "supervision by denoising" (SUD), a framework that enables us to supervise reconstruction models using their own denoised output as soft labels. SUD unifies stochastic averaging and spatial denoising techniques under a spatio-temporal denoising framework and alternates denoising and model weight update steps in an optimization framework for semi-supervision. As example applications, we apply SUD to two problems arising from biomedical imaging -- anatomical brain reconstruction (3D) and cortical parcellation (2D) -- to demonstrate a significant improvement in the image reconstructions over supervised-only and stochastic averaging baselines. △ Less

Submitted 4 January, 2024; v1 submitted 7 February, 2022; originally announced February 2022.

Comments: To appear in the IEEE Transactions on Pattern Analysis and Machine Intelligence

arXiv:2202.02568 [pdf, other]

doi 10.1145/3572897

Symmetric Volume Maps: Order-Invariant Volumetric Mesh Correspondence with Free Boundary

Authors: S. Mazdak Abulnaga, Oded Stein, Polina Golland, Justin Solomon

Abstract: Although shape correspondence is a central problem in geometry processing, most methods for this task apply only to two-dimensional surfaces. The neglected task of volumetric correspondence--a natural extension relevant to shapes extracted from simulation, medical imaging, and volume rendering--presents unique challenges that do not appear in the two-dimensional case. In this work, we propose a me… ▽ More Although shape correspondence is a central problem in geometry processing, most methods for this task apply only to two-dimensional surfaces. The neglected task of volumetric correspondence--a natural extension relevant to shapes extracted from simulation, medical imaging, and volume rendering--presents unique challenges that do not appear in the two-dimensional case. In this work, we propose a method for map** between volumes represented as tetrahedral meshes. Our formulation minimizes a distortion energy designed to extract maps symmetrically, i.e., without dependence on the ordering of the source and target domains. We accompany our method with theoretical discussion describing the consequences of this symmetry assumption, leading us to select a symmetrized ARAP energy that favors isometric correspondences. Our final formulation optimizes for near-isometry while matching the boundary. We demonstrate our method on a diverse geometric dataset, producing low-distortion matchings that align closely to the boundary. △ Less

Submitted 16 November, 2022; v1 submitted 5 February, 2022; originally announced February 2022.

Comments: Accepted to ACM Transactions on Graphics. Our code is available at https://github.com/mabulnaga/symmetric-volume-maps

arXiv:2112.06693 [pdf, other]

Hypernet-Ensemble Learning of Segmentation Probability for Medical Image Segmentation with Ambiguous Labels

Authors: Sungmin Hong, Anna K. Bonkhoff, Andrew Hoopes, Martin Bretzner, Markus D. Schirmer, Anne-Katrin Giese, Adrian V. Dalca, Polina Golland, Natalia S. Rost

Abstract: Despite the superior performance of Deep Learning (DL) on numerous segmentation tasks, the DL-based approaches are notoriously overconfident about their prediction with highly polarized label probability. This is often not desirable for many applications with the inherent label ambiguity even in human annotations. This challenge has been addressed by leveraging multiple annotations per image and t… ▽ More Despite the superior performance of Deep Learning (DL) on numerous segmentation tasks, the DL-based approaches are notoriously overconfident about their prediction with highly polarized label probability. This is often not desirable for many applications with the inherent label ambiguity even in human annotations. This challenge has been addressed by leveraging multiple annotations per image and the segmentation uncertainty. However, multiple per-image annotations are often not available in a real-world application and the uncertainty does not provide full control on segmentation results to users. In this paper, we propose novel methods to improve the segmentation probability estimation without sacrificing performance in a real-world scenario that we have only one ambiguous annotation per image. We marginalize the estimated segmentation probability maps of networks that are encouraged to under-/over-segment with the varying Tversky loss without penalizing balanced segmentation. Moreover, we propose a unified hypernetwork ensemble method to alleviate the computational burden of training multiple networks. Our approaches successfully estimated the segmentation probability maps that reflected the underlying structures and provided the intuitive control on segmentation for the challenging 3D medical image segmentation. Although the main focus of our proposed methods is not to improve the binary segmentation performance, our approaches marginally outperformed the state-of-the-arts. The codes are available at \url{https://github.com/sh4174/HypernetEnsemble}. △ Less

Submitted 13 December, 2021; originally announced December 2021.

MSC Class: 68T07 (Primary) 92C55; 94A08 (Secondary) ACM Class: I.4; I.4.6; I.2; I.2.1; I.5.1; I.5.4; J.3

arXiv:2111.07900 [pdf, other]

Volumetric Parameterization of the Placenta to a Flattened Template

Authors: S. Mazdak Abulnaga, Esra Abaci Turk, Mikhail Bessmeltsev, P. Ellen Grant, Justin Solomon, Polina Golland

Abstract: We present a volumetric mesh-based algorithm for parameterizing the placenta to a flattened template to enable effective visualization of local anatomy and function. MRI shows potential as a research tool as it provides signals directly related to placental function. However, due to the curved and highly variable in vivo shape of the placenta, interpreting and visualizing these images is difficult… ▽ More We present a volumetric mesh-based algorithm for parameterizing the placenta to a flattened template to enable effective visualization of local anatomy and function. MRI shows potential as a research tool as it provides signals directly related to placental function. However, due to the curved and highly variable in vivo shape of the placenta, interpreting and visualizing these images is difficult. We address interpretation challenges by map** the placenta so that it resembles the familiar ex vivo shape. We formulate the parameterization as an optimization problem for map** the placental shape represented by a volumetric mesh to a flattened template. We employ the symmetric Dirichlet energy to control local distortion throughout the volume. Local injectivity in the map** is enforced by a constrained line search during the gradient descent optimization. We validate our method using a research study of 111 placental shapes extracted from BOLD MRI images. Our map** achieves sub-voxel accuracy in matching the template while maintaining low distortion throughout the volume. We demonstrate how the resulting flattening of the placenta improves visualization of anatomy and function. Our code is freely available at https://github.com/mabulnaga/placenta-flattening . △ Less

Submitted 15 November, 2021; originally announced November 2021.

Comments: Accepted to IEEE TMI ( (c) IEEE). This manuscript expands the MICCAI 2019 paper (ar** additional template models and extensions to improve robustness, expanded evaluation on a significantly larger dataset, and experiments and discussion demonstrating utility for clinical research. Code is available at https://github.com/mabulnaga/placenta-flattening

arXiv:2111.07048 [pdf, other]

Image Classification with Consistent Supporting Evidence

Authors: Peiqi Wang, Ruizhi Liao, Daniel Moyer, Seth Berkowitz, Steven Horng, Polina Golland

Abstract: Adoption of machine learning models in healthcare requires end users' trust in the system. Models that provide additional supportive evidence for their predictions promise to facilitate adoption. We define consistent evidence to be both compatible and sufficient with respect to model predictions. We propose measures of model inconsistency and regularizers that promote more consistent evidence. We… ▽ More Adoption of machine learning models in healthcare requires end users' trust in the system. Models that provide additional supportive evidence for their predictions promise to facilitate adoption. We define consistent evidence to be both compatible and sufficient with respect to model predictions. We propose measures of model inconsistency and regularizers that promote more consistent evidence. We demonstrate our ideas in the context of edema severity grading from chest radiographs. We demonstrate empirically that consistent models provide competitive performance while supporting interpretation. △ Less

Submitted 13 November, 2021; originally announced November 2021.

Comments: 13 pages, 6 figures, proceedings of the Machine Learning for Health NeurIPS Workshop, 2021

arXiv:2107.09700 [pdf, other]

3D-StyleGAN: A Style-Based Generative Adversarial Network for Generative Modeling of Three-Dimensional Medical Images

Authors: Sungmin Hong, Razvan Marinescu, Adrian V. Dalca, Anna K. Bonkhoff, Martin Bretzner, Natalia S. Rost, Polina Golland

Abstract: Image synthesis via Generative Adversarial Networks (GANs) of three-dimensional (3D) medical images has great potential that can be extended to many medical applications, such as, image enhancement and disease progression modeling. However, current GAN technologies for 3D medical image synthesis need to be significantly improved to be readily adapted to real-world medical problems. In this paper,… ▽ More Image synthesis via Generative Adversarial Networks (GANs) of three-dimensional (3D) medical images has great potential that can be extended to many medical applications, such as, image enhancement and disease progression modeling. However, current GAN technologies for 3D medical image synthesis need to be significantly improved to be readily adapted to real-world medical problems. In this paper, we extend the state-of-the-art StyleGAN2 model, which natively works with two-dimensional images, to enable 3D image synthesis. In addition to the image synthesis, we investigate the controllability and interpretability of the 3D-StyleGAN via style vectors inherited form the original StyleGAN2 that are highly suitable for medical applications: (i) the latent space projection and reconstruction of unseen real images, and (ii) style mixing. We demonstrate the 3D-StyleGAN's performance and feasibility with ~12,000 three-dimensional full brain MR T1 images, although it can be applied to any 3D volumetric images. Furthermore, we explore different configurations of hyperparameters to investigate potential improvement of the image synthesis with larger networks. The codes and pre-trained networks are available online: https://github.com/sh4174/3DStyleGAN. △ Less

Submitted 20 July, 2021; originally announced July 2021.

Comments: 11 pages, 6 figures, 2 tables. Provisionally Accepted at DGM4MICCAI workshop in MICCAI 2021

MSC Class: 68T07 (Primary) 68T01 (Secondary) ACM Class: I.2; I.4

arXiv:2106.12407 [pdf, other]

STRESS: Super-Resolution for Dynamic Fetal MRI using Self-Supervised Learning

Authors: Junshen Xu, Esra Abaci Turk, P. Ellen Grant, Polina Golland, Elfar Adalsteinsson

Abstract: Fetal motion is unpredictable and rapid on the scale of conventional MR scan times. Therefore, dynamic fetal MRI, which aims at capturing fetal motion and dynamics of fetal function, is limited to fast imaging techniques with compromises in image quality and resolution. Super-resolution for dynamic fetal MRI is still a challenge, especially when multi-oriented stacks of image slices for oversampli… ▽ More Fetal motion is unpredictable and rapid on the scale of conventional MR scan times. Therefore, dynamic fetal MRI, which aims at capturing fetal motion and dynamics of fetal function, is limited to fast imaging techniques with compromises in image quality and resolution. Super-resolution for dynamic fetal MRI is still a challenge, especially when multi-oriented stacks of image slices for oversampling are not available and high temporal resolution for recording the dynamics of the fetus or placenta is desired. Further, fetal motion makes it difficult to acquire high-resolution images for supervised learning methods. To address this problem, in this work, we propose STRESS (Spatio-Temporal Resolution Enhancement with Simulated Scans), a self-supervised super-resolution framework for dynamic fetal MRI with interleaved slice acquisitions. Our proposed method simulates an interleaved slice acquisition along the high-resolution axis on the originally acquired data to generate pairs of low- and high-resolution images. Then, it trains a super-resolution network by exploiting both spatial and temporal correlations in the MR time series, which is used to enhance the resolution of the original data. Evaluations on both simulated and in utero data show that our proposed method outperforms other self-supervised super-resolution methods and improves image quality, which is beneficial to other downstream tasks and evaluations. △ Less

Submitted 30 June, 2021; v1 submitted 23 June, 2021; originally announced June 2021.

arXiv:2105.05137 [pdf, other]

Segmentation of Anatomical Layers and Artifacts in Intravascular Polarization Sensitive Optical Coherence Tomography Using Attending Physician and Boundary Cardinality Losses

Authors: Mohammad Haft-Javaherian, Martin Villiger, Kenichiro Otsuka, Joost Daemen, Peter Libby, Polina Golland, Brett E. Bouma

Abstract: Intravascular ultrasound and optical coherence tomography are widely available for characterizing coronary stenoses and provide critical vessel parameters to optimize percutaneous intervention. Intravascular polarization-sensitive optical coherence tomography (PS-OCT) simultaneously provides high-resolution cross-sectional images of vascular structures while also revealing preponderant tissue comp… ▽ More Intravascular ultrasound and optical coherence tomography are widely available for characterizing coronary stenoses and provide critical vessel parameters to optimize percutaneous intervention. Intravascular polarization-sensitive optical coherence tomography (PS-OCT) simultaneously provides high-resolution cross-sectional images of vascular structures while also revealing preponderant tissue components such as collagen and smooth muscle and thereby enhances plaque characterization. Automated interpretation of these features promises to facilitate the objective clinical investigation of the natural history and significance of coronary atheromas. Here, we propose a convolutional neural network model, optimized using a new multi-term loss function, to classify the lumen, intima, and media layers in addition to the guidewire and plaque shadows. We demonstrate that our multi-class classification model outperforms state-of-the-art methods in detecting the coronary anatomical layers. Furthermore, the proposed model segments two classes of common imaging artifacts and detects the anatomical layers within the thickened vessel wall regions that were excluded from analysis by other studies. The source code and the trained model are publicly available at https://github.com/mhaft/OCTseg △ Less

Submitted 13 September, 2022; v1 submitted 11 May, 2021; originally announced May 2021.

arXiv:2103.14696 [pdf, other]

BrainPainter v2: Mouse Brain Visualization Software

Authors: Vedvayas Mallela, Polina Golland, Razvan V. Marinescu

Abstract: BrainPainter is a software for the 3D visualization of human brain structures; it generates colored brain images using user-defined biomarker data for each brain region. However, BrainPainter is only able to generate human brain images. In this paper, we present updates to the existing BrainPainter software which enables the generation of mouse brain images. We load meshes for each mouse brain reg… ▽ More BrainPainter is a software for the 3D visualization of human brain structures; it generates colored brain images using user-defined biomarker data for each brain region. However, BrainPainter is only able to generate human brain images. In this paper, we present updates to the existing BrainPainter software which enables the generation of mouse brain images. We load meshes for each mouse brain region, based on the Allen Mouse Brain Atlas, into Blender, a powerful 3D computer graphics engine. We then use Blender to color each region and generate images of subcortical, outer-cortical, inner-cortical, top and bottom view renders. In addition to those views, we add new render angles and separate visualization settings for the left and right hemispheres. While BrainPainter traditionally ran from the browser ( https://brainpainter.csail.mit.edu ), we also created a graphical user interface that launches image-generation requests in a user-friendly way, by connecting to the Blender backend via a Docker API. We illustrate a use case of BrainPainter for modeling the progression of tau protein accumulation in a mouse study. Our contributions can help neuroscientists visualize brains in mouse studies and show disease progression. In addition, integration into Blender can subsequently enable the generation of complex animations using a moving camera, generation of complex mesh deformations that simulate tumors and other pathologies, as well as visualization of toxic proteins using Blender's particle system. △ Less

Submitted 24 March, 2021; originally announced March 2021.

Comments: 9 pages, 4 figures, 2 movies (see ancillary files)

ACM Class: I.3.5

arXiv:2103.10255 [pdf, other]

Equivariant Filters for Efficient Tracking in 3D Imaging

Authors: Daniel Moyer, Esra Abaci Turk, P Ellen Grant, William M. Wells, Polina Golland

Abstract: We demonstrate an object tracking method for 3D images with fixed computational cost and state-of-the-art performance. Previous methods predicted transformation parameters from convolutional layers. We instead propose an architecture that does not include either flattening of convolutional features or fully connected layers, but instead relies on equivariant filters to preserve transformations bet… ▽ More We demonstrate an object tracking method for 3D images with fixed computational cost and state-of-the-art performance. Previous methods predicted transformation parameters from convolutional layers. We instead propose an architecture that does not include either flattening of convolutional features or fully connected layers, but instead relies on equivariant filters to preserve transformations between inputs and outputs (e.g. rot./trans. of inputs rotate/translate outputs). The transformation is then derived in closed form from the outputs of the filters. This method is useful for applications requiring low latency, such as real-time tracking. We demonstrate our model on synthetically augmented adult brain MRI, as well as fetal brain MRI, which is the intended use-case. △ Less

Submitted 27 September, 2021; v1 submitted 18 March, 2021; originally announced March 2021.

Comments: MICCAI 2021 Early Accept

arXiv:2103.04537 [pdf, other]

doi 10.1007/978-3-030-87196-3_26

Multimodal Representation Learning via Maximization of Local Mutual Information

Authors: Ruizhi Liao, Daniel Moyer, Miriam Cha, Keegan Quigley, Seth Berkowitz, Steven Horng, Polina Golland, William M. Wells

Abstract: We propose and demonstrate a representation learning approach by maximizing the mutual information between local features of images and text. The goal of this approach is to learn useful image representations by taking advantage of the rich information contained in the free text that describes the findings in the image. Our method trains image and text encoders by encouraging the resulting represe… ▽ More We propose and demonstrate a representation learning approach by maximizing the mutual information between local features of images and text. The goal of this approach is to learn useful image representations by taking advantage of the rich information contained in the free text that describes the findings in the image. Our method trains image and text encoders by encouraging the resulting representations to exhibit high local mutual information. We make use of recent advances in mutual information estimation with neural network discriminators. We argue that the sum of local mutual information is typically a lower bound on the global mutual information. Our experimental results in the downstream image classification tasks demonstrate the advantages of using local features for image-text representation learning. △ Less

Submitted 14 December, 2021; v1 submitted 7 March, 2021; originally announced March 2021.

Comments: In Proceedings of International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2021

Journal ref: In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 273-283. Springer, Cham, 2021

arXiv:2101.06255 [pdf, ps, other]

Harmonization and the Worst Scanner Syndrome

Authors: Daniel Moyer, Polina Golland

Abstract: We show that for a wide class of harmonization/domain-invariance schemes several undesirable properties are unavoidable. If a predictive machine is made invariant to a set of domains, the accuracy of the output predictions (as measured by mutual information) is limited by the domain with the least amount of information to begin with. If a real label value is highly informative about the source dom… ▽ More We show that for a wide class of harmonization/domain-invariance schemes several undesirable properties are unavoidable. If a predictive machine is made invariant to a set of domains, the accuracy of the output predictions (as measured by mutual information) is limited by the domain with the least amount of information to begin with. If a real label value is highly informative about the source domain, it cannot be accurately predicted by an invariant predictor. These results are simple and intuitive, but we believe that it is beneficial to state them for medical imaging harmonization. △ Less

Submitted 21 April, 2021; v1 submitted 15 January, 2021; originally announced January 2021.

Comments: Med-NeurIPS 2020 Workshop Paper, updated 4/2021

arXiv:2012.13340 [pdf, other]

Joint super-resolution and synthesis of 1 mm isotropic MP-RAGE volumes from clinical MRI exams with scans of different orientation, resolution and contrast

Authors: Juan Eugenio Iglesias, Benjamin Billot, Yael Balbastre, Azadeh Tabari, John Conklin, Daniel C. Alexander, Polina Golland, Brian L. Edlow, Bruce Fischl

Abstract: Most existing algorithms for automatic 3D morphometry of human brain MRI scans are designed for data with near-isotropic voxels at approximately 1 mm resolution, and frequently have contrast constraints as well - typically requiring T1 scans (e.g., MP-RAGE). This limitation prevents the analysis of millions of MRI scans acquired with large inter-slice spacing ("thick slice") in clinical settings e… ▽ More Most existing algorithms for automatic 3D morphometry of human brain MRI scans are designed for data with near-isotropic voxels at approximately 1 mm resolution, and frequently have contrast constraints as well - typically requiring T1 scans (e.g., MP-RAGE). This limitation prevents the analysis of millions of MRI scans acquired with large inter-slice spacing ("thick slice") in clinical settings every year. The inability to quantitatively analyze these scans hinders the adoption of quantitative neuroimaging in healthcare, and precludes research studies that could attain huge sample sizes and hence greatly improve our understanding of the human brain. Recent advances in CNNs are producing outstanding results in super-resolution and contrast synthesis of MRI. However, these approaches are very sensitive to the contrast, resolution and orientation of the input images, and thus do not generalize to diverse clinical acquisition protocols - even within sites. Here we present SynthSR, a method to train a CNN that receives one or more thick-slice scans with different contrast, resolution and orientation, and produces an isotropic scan of canonical contrast (typically a 1 mm MP-RAGE). The presented method does not require any preprocessing, e.g., skull strip** or bias field correction. Crucially, SynthSR trains on synthetic input images generated from 3D segmentations, and can thus be used to train CNNs for any combination of contrasts, resolutions and orientations without high-resolution training data. We test the images generated with SynthSR in an array of common downstream analyses, and show that they can be reliably used for subcortical segmentation and volumetry, image registration (e.g., for tensor-based morphometry), and, if some image quality requirements are met, even cortical thickness morphometry. The source code is publicly available at github.com/BBillot/SynthSR. △ Less

Submitted 24 December, 2020; originally announced December 2020.

arXiv:2012.04567 [pdf, other]

Bayesian Image Reconstruction using Deep Generative Models

Authors: Razvan V Marinescu, Daniel Moyer, Polina Golland

Abstract: Machine learning models are commonly trained end-to-end and in a supervised setting, using paired (input, output) data. Examples include recent super-resolution methods that train on pairs of (low-resolution, high-resolution) images. However, these end-to-end approaches require re-training every time there is a distribution shift in the inputs (e.g., night images vs daylight) or relevant latent va… ▽ More Machine learning models are commonly trained end-to-end and in a supervised setting, using paired (input, output) data. Examples include recent super-resolution methods that train on pairs of (low-resolution, high-resolution) images. However, these end-to-end approaches require re-training every time there is a distribution shift in the inputs (e.g., night images vs daylight) or relevant latent variables (e.g., camera blur or hand motion). In this work, we leverage state-of-the-art (SOTA) generative models (here StyleGAN2) for building powerful image priors, which enable application of Bayes' theorem for many downstream reconstruction tasks. Our method, Bayesian Reconstruction through Generative Models (BRGM), uses a single pre-trained generator model to solve different image restoration tasks, i.e., super-resolution and in-painting, by combining it with different forward corruption models. We keep the weights of the generator model fixed, and reconstruct the image by estimating the Bayesian maximum a-posteriori (MAP) estimate over the input latent vector that generated the reconstructed image. We further use variational inference to approximate the posterior distribution over the latent vectors, from which we sample multiple solutions. We demonstrate BRGM on three large and diverse datasets: (i) 60,000 images from the Flick Faces High Quality dataset (ii) 240,000 chest X-rays from MIMIC III and (iii) a combined collection of 5 brain MRI datasets with 7,329 scans. Across all three datasets and without any dataset-specific hyperparameter tuning, our simple approach yields performance competitive with current task-specific state-of-the-art methods on super-resolution and in-painting, while being more generalisable and without requiring any training. Our source code and pre-trained models are available online: https://razvanmarinescu.github.io/brgm/. △ Less

Submitted 9 December, 2021; v1 submitted 8 December, 2020; originally announced December 2020.

Comments: 27 pages, 17 figures, 5 tables

Journal ref: NeurIPS 2021 Workshop on Deep Generative Models and Downstream Applications

arXiv:2010.12721 [pdf, other]

PEP: Parameter Ensembling by Perturbation

Authors: Alireza Mehrtash, Purang Abolmaesumi, Polina Golland, Tina Kapur, Demian Wassermann, William M. Wells III

Abstract: Ensembling is now recognized as an effective approach for increasing the predictive performance and calibration of deep networks. We introduce a new approach, Parameter Ensembling by Perturbation (PEP), that constructs an ensemble of parameter values as random perturbations of the optimal parameter set from training by a Gaussian with a single variance parameter. The variance is chosen to maximize… ▽ More Ensembling is now recognized as an effective approach for increasing the predictive performance and calibration of deep networks. We introduce a new approach, Parameter Ensembling by Perturbation (PEP), that constructs an ensemble of parameter values as random perturbations of the optimal parameter set from training by a Gaussian with a single variance parameter. The variance is chosen to maximize the log-likelihood of the ensemble average ($\mathbb{L}$) on the validation data set. Empirically, and perhaps surprisingly, $\mathbb{L}$ has a well-defined maximum as the variance grows from zero (which corresponds to the baseline model). Conveniently, calibration level of predictions also tends to grow favorably until the peak of $\mathbb{L}$ is reached. In most experiments, PEP provides a small improvement in performance, and, in some cases, a substantial improvement in empirical calibration. We show that this "PEP effect" (the gain in log-likelihood) is related to the mean curvature of the likelihood function and the empirical Fisher information. Experiments on ImageNet pre-trained networks including ResNet, DenseNet, and Inception showed improved calibration and likelihood. We further observed a mild improvement in classification accuracy on these networks. Experiments on classification benchmarks such as MNIST and CIFAR-10 showed improved calibration and likelihood, as well as the relationship between the PEP effect and overfitting; this demonstrates that PEP can be used to probe the level of overfitting that occurred during training. In general, no special training procedure or network architecture is needed, and in the case of pre-trained networks, no additional training is needed. △ Less

Submitted 23 October, 2020; originally announced October 2020.

Comments: NeurIPS 2020

arXiv:2010.04757 [pdf, other]

Predictive Modeling of Anatomy with Genetic and Clinical Data

Authors: Adrian V. Dalca, Ramesh Sridharan, Mert R. Sabuncu, Polina Golland

Abstract: We present a semi-parametric generative model for predicting anatomy of a patient in subsequent scans following a single baseline image. Such predictive modeling promises to facilitate novel analyses in both voxel-level studies and longitudinal biomarker evaluation. We capture anatomical change through a combination of population-wide regression and a non-parametric model of the subject's health b… ▽ More We present a semi-parametric generative model for predicting anatomy of a patient in subsequent scans following a single baseline image. Such predictive modeling promises to facilitate novel analyses in both voxel-level studies and longitudinal biomarker evaluation. We capture anatomical change through a combination of population-wide regression and a non-parametric model of the subject's health based on individual genetic and clinical indicators. In contrast to classical correlation and longitudinal analysis, we focus on predicting new observations from a single subject observation. We demonstrate prediction of follow-up anatomical scans in the ADNI cohort, and illustrate a novel analysis approach that compares a patient's scans to the predicted subject-specific healthy anatomical trajectory. The code is available at https://github.com/adalca/voxelorb. △ Less

Submitted 9 October, 2020; originally announced October 2020.

Comments: MICCAI 2015. Keywords: Neuroimaging, Anatomical prediction, Synthesis, Simulation, Genetics, Generative model, Linear mixed effects, Kernel Machines

arXiv:2010.01766 [pdf, other]

DEMI: Discriminative Estimator of Mutual Information

Authors: Ruizhi Liao, Daniel Moyer, Polina Golland, William M. Wells

Abstract: Estimating mutual information between continuous random variables is often intractable and extremely challenging for high-dimensional data. Recent progress has leveraged neural networks to optimize variational lower bounds on mutual information. Although showing promise for this difficult problem, the variational methods have been theoretically and empirically proven to have serious statistical li… ▽ More Estimating mutual information between continuous random variables is often intractable and extremely challenging for high-dimensional data. Recent progress has leveraged neural networks to optimize variational lower bounds on mutual information. Although showing promise for this difficult problem, the variational methods have been theoretically and empirically proven to have serious statistical limitations: 1) many methods struggle to produce accurate estimates when the underlying mutual information is either low or high; 2) the resulting estimators may suffer from high variance. Our approach is based on training a classifier that provides the probability that a data sample pair is drawn from the joint distribution rather than from the product of its marginal distributions. Moreover, we establish a direct connection between mutual information and the average log odds estimate produced by the classifier on a test set, leading to a simple and accurate estimator of mutual information. We show theoretically that our method and other variational approaches are equivalent when they achieve their optimum, while our method sidesteps the variational bound. Empirical results demonstrate high accuracy of our approach and the advantages of our estimator in the context of representation learning. Our demo is available at https://github.com/RayRuizhiLiao/demi_mi_estimator. △ Less

Submitted 29 November, 2020; v1 submitted 5 October, 2020; originally announced October 2020.

arXiv:2008.09884 [pdf, other]

Joint Modeling of Chest Radiographs and Radiology Reports for Pulmonary Edema Assessment

Authors: Geeticka Chauhan, Ruizhi Liao, William Wells, Jacob Andreas, Xin Wang, Seth Berkowitz, Steven Horng, Peter Szolovits, Polina Golland

Abstract: We propose and demonstrate a novel machine learning algorithm that assesses pulmonary edema severity from chest radiographs. While large publicly available datasets of chest radiographs and free-text radiology reports exist, only limited numerical edema severity labels can be extracted from radiology reports. This is a significant challenge in learning such models for image classification. To take… ▽ More We propose and demonstrate a novel machine learning algorithm that assesses pulmonary edema severity from chest radiographs. While large publicly available datasets of chest radiographs and free-text radiology reports exist, only limited numerical edema severity labels can be extracted from radiology reports. This is a significant challenge in learning such models for image classification. To take advantage of the rich information present in the radiology reports, we develop a neural network model that is trained on both images and free-text to assess pulmonary edema severity from chest radiographs at inference time. Our experimental results suggest that the joint image-text representation learning improves the performance of pulmonary edema assessment compared to a supervised model trained on images only. We also show the use of the text for explaining the image classification by the joint model. To the best of our knowledge, our approach is the first to leverage free-text radiology reports for improving the image model performance in this application. Our code is available at https://github.com/RayRuizhiLiao/joint_chestxray. △ Less

Submitted 22 August, 2020; originally announced August 2020.

Comments: The two first authors contributed equally. To be published in the proceedings of MICCAI 2020

arXiv:2008.05975 [pdf]

doi 10.1148/ryai.2021190228

Deep Learning to Quantify Pulmonary Edema in Chest Radiographs

Authors: Steven Horng, Ruizhi Liao, Xin Wang, Sandeep Dalal, Polina Golland, Seth J Berkowitz

Abstract: Purpose: To develop a machine learning model to classify the severity grades of pulmonary edema on chest radiographs. Materials and Methods: In this retrospective study, 369,071 chest radiographs and associated radiology reports from 64,581 (mean age, 51.71; 54.51% women) patients from the MIMIC-CXR chest radiograph dataset were included. This dataset was split into patients with and without con… ▽ More Purpose: To develop a machine learning model to classify the severity grades of pulmonary edema on chest radiographs. Materials and Methods: In this retrospective study, 369,071 chest radiographs and associated radiology reports from 64,581 (mean age, 51.71; 54.51% women) patients from the MIMIC-CXR chest radiograph dataset were included. This dataset was split into patients with and without congestive heart failure (CHF). Pulmonary edema severity labels from the associated radiology reports were extracted from patients with CHF as four different ordinal levels: 0, no edema; 1, vascular congestion; 2, interstitial edema; and 3, alveolar edema. Deep learning models were developed using two approaches: a semi-supervised model using a variational autoencoder and a pre-trained supervised learning model using a dense neural network. Receiver operating characteristic curve analysis was performed on both models. Results: The area under the receiver operating characteristic curve (AUC) for differentiating alveolar edema from no edema was 0.99 for the semi-supervised model and 0.87 for the pre-trained models. Performance of the algorithm was inversely related to the difficulty in categorizing milder states of pulmonary edema (shown as AUCs for semi-supervised model and pre-trained model, respectively): 2 versus 0, 0.88 and 0.81; 1 versus 0, 0.79 and 0.66; 3 versus 1, 0.93 and 0.82; 2 versus 1, 0.69 and 0.73; and, 3 versus 2, 0.88 and 0.63. Conclusion: Deep learning models were trained on a large chest radiograph dataset and could grade the severity of pulmonary edema on chest radiographs with high performance. △ Less

Submitted 7 January, 2021; v1 submitted 13 August, 2020; originally announced August 2020.

Comments: The two first authors contributed equally

arXiv:2007.08146 [pdf, other]

Enhanced detection of fetal pose in 3D MRI by Deep Reinforcement Learning with physical structure priors on anatomy

Authors: Molin Zhang, Junshen Xu, Esra Abaci Turk, P. Ellen Grant, Polina Golland, Elfar Adalsteinsson

Abstract: Fetal MRI is heavily constrained by unpredictable and substantial fetal motion that causes image artifacts and limits the set of viable diagnostic image contrasts. Current mitigation of motion artifacts is predominantly performed by fast, single-shot MRI and retrospective motion correction. Estimation of fetal pose in real time during MRI stands to benefit prospective methods to detect and mitigat… ▽ More Fetal MRI is heavily constrained by unpredictable and substantial fetal motion that causes image artifacts and limits the set of viable diagnostic image contrasts. Current mitigation of motion artifacts is predominantly performed by fast, single-shot MRI and retrospective motion correction. Estimation of fetal pose in real time during MRI stands to benefit prospective methods to detect and mitigate fetal motion artifacts where inferred fetal motion is combined with online slice prescription with low-latency decision making. Current developments of deep reinforcement learning (DRL), offer a novel approach for fetal landmarks detection. In this task 15 agents are deployed to detect 15 landmarks simultaneously by DRL. The optimization is challenging, and here we propose an improved DRL that incorporates priors on physical structure of the fetal body. First, we use graph communication layers to improve the communication among agents based on a graph where each node represents a fetal-body landmark. Further, additional reward based on the distance between agents and physical structures such as the fetal limbs is used to fully exploit physical structure. Evaluation of this method on a repository of 3-mm resolution in vivo data demonstrates a mean accuracy of landmark estimation within 10 mm of ground truth as 87.3%, and a mean error of 6.9 mm. The proposed DRL for fetal pose landmark search demonstrates a potential clinical utility for online detection of fetal motion that guides real-time mitigation of motion artifacts as well as health diagnosis during MRI of the pregnant mother. △ Less

Submitted 16 July, 2020; originally announced July 2020.

Comments: 10 pages, 3 figures, MICCAI 2020

arXiv:2007.01441 [pdf, other]

doi 10.59275/j.melba.2022-16cc

Joint Frequency and Image Space Learning for MRI Reconstruction and Analysis

Authors: Nalini M. Singh, Juan Eugenio Iglesias, Elfar Adalsteinsson, Adrian V. Dalca, Polina Golland

Abstract: We propose neural network layers that explicitly combine frequency and image feature representations and show that they can be used as a versatile building block for reconstruction from frequency space data. Our work is motivated by the challenges arising in MRI acquisition where the signal is a corrupted Fourier transform of the desired image. The proposed joint learning schemes enable both corre… ▽ More We propose neural network layers that explicitly combine frequency and image feature representations and show that they can be used as a versatile building block for reconstruction from frequency space data. Our work is motivated by the challenges arising in MRI acquisition where the signal is a corrupted Fourier transform of the desired image. The proposed joint learning schemes enable both correction of artifacts native to the frequency space and manipulation of image space representations to reconstruct coherent image structures at every layer of the network. This is in contrast to most current deep learning approaches for image reconstruction that treat frequency and image space features separately and often operate exclusively in one of the two spaces. We demonstrate the advantages of joint convolutional learning for a variety of tasks, including motion correction, denoising, reconstruction from undersampled acquisitions, and combined undersampling and motion correction on simulated and real world multicoil MRI data. The joint models produce consistently high quality output images across all tasks and datasets. When integrated into a state of the art unrolled optimization network with physics-inspired data consistency constraints for undersampled reconstruction, the proposed architectures significantly improve the optimization landscape, which yields an order of magnitude reduction of training time. This result suggests that joint representations are particularly well suited for MRI signals in deep learning networks. Our code and pretrained models are publicly available at https://github.com/nalinimsingh/interlacer. △ Less

Submitted 17 June, 2022; v1 submitted 2 July, 2020; originally announced July 2020.

Comments: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://www.melba-journal.org/papers/2022:018.html

arXiv:2006.12704 [pdf, other]

Semi-Supervised Learning for Fetal Brain MRI Quality Assessment with ROI consistency

Authors: Junshen Xu, Sayeri Lala, Borjan Gagoski, Esra Abaci Turk, P. Ellen Grant, Polina Golland, Elfar Adalsteinsson

Abstract: Fetal brain MRI is useful for diagnosing brain abnormalities but is challenged by fetal motion. The current protocol for T2-weighted fetal brain MRI is not robust to motion so image volumes are degraded by inter- and intra- slice motion artifacts. Besides, manual annotation for fetal MR image quality assessment are usually time-consuming. Therefore, in this work, a semi-supervised deep learning me… ▽ More Fetal brain MRI is useful for diagnosing brain abnormalities but is challenged by fetal motion. The current protocol for T2-weighted fetal brain MRI is not robust to motion so image volumes are degraded by inter- and intra- slice motion artifacts. Besides, manual annotation for fetal MR image quality assessment are usually time-consuming. Therefore, in this work, a semi-supervised deep learning method that detects slices with artifacts during the brain volume scan is proposed. Our method is based on the mean teacher model, where we not only enforce consistency between student and teacher models on the whole image, but also adopt an ROI consistency loss to guide the network to focus on the brain region. The proposed method is evaluated on a fetal brain MR dataset with 11,223 labeled images and more than 200,000 unlabeled images. Results show that compared with supervised learning, the proposed method can improve model accuracy by about 6\% and outperform other state-of-the-art semi-supervised learning methods. The proposed method is also implemented and evaluated on an MR scanner, which demonstrates the feasibility of online image quality assessment and image reacquisition during fetal MR scans. △ Less

Submitted 22 June, 2020; originally announced June 2020.

arXiv:2004.12926 [pdf]

A New Age of Computing and the Brain

Authors: Polina Golland, Jack Gallant, Greg Hager, Hanspeter Pfister, Christos Papadimitriou, Stefan Schaal, Joshua T. Vogelstein

Abstract: The history of computer science and brain sciences are intertwined. In his unfinished manuscript "The Computer and the Brain," von Neumann debates whether or not the brain can be thought of as a computing machine and identifies some of the similarities and differences between natural and artificial computation. Turing, in his 1950 article in Mind, argues that computing devices could ultimately emu… ▽ More The history of computer science and brain sciences are intertwined. In his unfinished manuscript "The Computer and the Brain," von Neumann debates whether or not the brain can be thought of as a computing machine and identifies some of the similarities and differences between natural and artificial computation. Turing, in his 1950 article in Mind, argues that computing devices could ultimately emulate intelligence, leading to his proposed Turing test. Herbert Simon predicted in 1957 that most psychological theories would take the form of a computer program. In 1976, David Marr proposed that the function of the visual system could be abstracted and studied at computational and algorithmic levels that did not depend on the underlying physical substrate. In December 2014, a two-day workshop supported by the Computing Community Consortium (CCC) and the National Science Foundation's Computer and Information Science and Engineering Directorate (NSF CISE) was convened in Washington, DC, with the goal of bringing together computer scientists and brain researchers to explore these new opportunities and connections, and develop a new, modern dialogue between the two research communities. Specifically, our objectives were: 1. To articulate a conceptual framework for research at the interface of brain sciences and computing and to identify key problems in this interface, presented in a way that will attract both CISE and brain researchers into this space. 2. To inform and excite researchers within the CISE research community about brain research opportunities and to identify and explain strategic roles they can play in advancing this initiative. 3. To develop new connections, conversations and collaborations between brain sciences and CISE researchers that will lead to highly relevant and competitive proposals, high-impact research, and influential publications. △ Less

Submitted 27 April, 2020; originally announced April 2020.

Comments: A Computing Community Consortium (CCC) workshop report, 24 pages

Report number: ccc2014report_5

arXiv:1907.07783 [pdf, other]

Patient-specific Conditional Joint Models of Shape, Image Features and Clinical Indicators

Authors: Bernhard Egger, Markus D. Schirmer, Florian Dubost, Marco J. Nardin, Natalia S. Rost, Polina Golland

Abstract: We propose and demonstrate a joint model of anatomical shapes, image features and clinical indicators for statistical shape modeling and medical image analysis. The key idea is to employ a copula model to separate the joint dependency structure from the marginal distributions of variables of interest. This separation provides flexibility on the assumptions made during the modeling process. The pro… ▽ More We propose and demonstrate a joint model of anatomical shapes, image features and clinical indicators for statistical shape modeling and medical image analysis. The key idea is to employ a copula model to separate the joint dependency structure from the marginal distributions of variables of interest. This separation provides flexibility on the assumptions made during the modeling process. The proposed method can handle binary, discrete, ordinal and continuous variables. We demonstrate a simple and efficient way to include binary, discrete and ordinal variables into the modeling. We build Bayesian conditional models based on observed partial clinical indicators, features or shape based on Gaussian processes capturing the dependency structure. We apply the proposed method on a stroke dataset to jointly model the shape of the lateral ventricles, the spatial distribution of the white matter hyperintensity associated with periventricular white matter disease, and clinical indicators. The proposed method yields interpretable joint models for data exploration and patient-specific statistical shape models for medical image analysis. △ Less

Submitted 17 July, 2019; originally announced July 2019.

Comments: Supplementary material: https://www.youtube.com/watch?v=gPoHP_iFQIA

Journal ref: MICCAI 2019, the 22nd International Conference on Medical Image Computing and Computer Assisted Intervention, in Shenzhen, China

arXiv:1907.04500 [pdf, other]

Fetal Pose Estimation in Volumetric MRI using a 3D Convolution Neural Network

Authors: Junshen Xu, Molin Zhang, Esra Abaci Turk, Larry Zhang, Ellen Grant, Kui Ying, Polina Golland, Elfar Adalsteinsson

Abstract: The performance and diagnostic utility of magnetic resonance imaging (MRI) in pregnancy is fundamentally constrained by fetal motion. Motion of the fetus, which is unpredictable and rapid on the scale of conventional imaging times, limits the set of viable acquisition techniques to single-shot imaging with severe compromises in signal-to-noise ratio and diagnostic contrast, and frequently results… ▽ More The performance and diagnostic utility of magnetic resonance imaging (MRI) in pregnancy is fundamentally constrained by fetal motion. Motion of the fetus, which is unpredictable and rapid on the scale of conventional imaging times, limits the set of viable acquisition techniques to single-shot imaging with severe compromises in signal-to-noise ratio and diagnostic contrast, and frequently results in unacceptable image quality. Surprisingly little is known about the characteristics of fetal motion during MRI and here we propose and demonstrate methods that exploit a growing repository of MRI observations of the gravid abdomen that are acquired at low spatial resolution but relatively high temporal resolution and over long durations (10-30 minutes). We estimate fetal pose per frame in MRI volumes of the pregnant abdomen via deep learning algorithms that detect key fetal landmarks. Evaluation of the proposed method shows that our framework achieves quantitatively an average error of 4.47 mm and 96.4\% accuracy (with error less than 10 mm). Fetal pose estimation in MRI time series yields novel means of quantifying fetal movements in health and disease, and enables the learning of kinematic models that may enhance prospective mitigation of fetal motion artifacts during MRI acquisition. △ Less

Submitted 9 July, 2019; originally announced July 2019.

Comments: MICCAI 2019

arXiv:1905.08627 [pdf, other]

BrainPainter: A software for the visualisation of brain structures, biomarkers and associated pathological processes

Authors: Razvan V. Marinescu, Arman Eshaghi, Daniel C. Alexander, Polina Golland

Abstract: We present BrainPainter, a software that automatically generates images of highlighted brain structures given a list of numbers corresponding to the output colours of each region. Compared to existing visualisation software (i.e. Freesurfer, SPM, 3D Slicer), BrainPainter has three key advantages: (1) it does not require the input data to be in a specialised format, allowing BrainPainter to be used… ▽ More We present BrainPainter, a software that automatically generates images of highlighted brain structures given a list of numbers corresponding to the output colours of each region. Compared to existing visualisation software (i.e. Freesurfer, SPM, 3D Slicer), BrainPainter has three key advantages: (1) it does not require the input data to be in a specialised format, allowing BrainPainter to be used in combination with any neuroimaging analysis tools, (2) it can visualise both cortical and subcortical structures and (3) it can be used to generate movies showing dynamic processes, e.g. propagation of pathology on the brain. We highlight three use cases where BrainPainter was used in existing neuroimaging studies: (1) visualisation of the degree of atrophy through interpolation along a user-defined gradient of colours, (2) visualisation of the progression of pathology in Alzheimer's disease as well as (3) visualisation of pathology in subcortical regions in Huntington's disease. Moreover, through the design of BrainPainter we demonstrate the possibility of using a powerful 3D computer graphics engine such as Blender to generate brain visualisations for the neuroscience community. Blender's capabilities, e.g. particle simulations, motion graphics, UV unwrap**, raster graphics editing, raytracing and illumination effects, open a wealth of possibilities for brain visualisation not available in current neuroimaging software. BrainPainter is customisable, easy to use, and can run straight from the web browser: https://brainpainter.csail.mit.edu , as well as from source-code packaged in a docker container: https://github.com/mrazvan22/brain-coloring . It can be used to visualise biomarker data from any brain imaging modality, or simply to highlight a particular brain structure for e.g. anatomy courses. △ Less

Submitted 22 August, 2019; v1 submitted 21 May, 2019; originally announced May 2019.

Comments: Accepted at the MICCAI Multimodal Brain Imaging Analysis (MBIA) workshop, 2019

Showing 1–50 of 65 results for author: Golland, P