
Adapt and Diffuse: Sample-adaptive Reconstruction via Latent Diffusion Models

Authors: Zalan Fabian, Berk Tinaz, Mahdi Soltanolkotabi

Abstract: Inverse problems arise in a multitude of applications, where the goal is to recover a clean signal from noisy and possibly (non)linear observations. The difficulty of a reconstruction problem depends on multiple factors, such as the structure of the ground truth signal, the severity of the degradation and the complex interactions between the above. This results in natural sample-by-sample variation in the difficulty of a reconstruction task, which is often overlooked by contemporary techniques. Our key observation is that most existing inverse problem solvers lack the ability to adapt their compute power to the difficulty of the reconstruction task, resulting in subpar performance and wasteful resource allocation. We propose a novel method that we call severity encoding, to estimate the degradation severity of noisy, degraded signals in the latent space of an autoencoder. We show that the estimated severity has strong correlation with the true corruption level and can give useful hints at the difficulty of reconstruction problems on a sample-by-sample basis. Furthermore, we propose a reconstruction method based on latent diffusion models that leverages the predicted degradation severities to fine-tune the reverse diffusion sampling trajectory and thus achieve sample-adaptive inference times. Our framework acts as a wrapper that can be combined with any latent diffusion-based baseline solver, imbuing it with sample-adaptivity and acceleration. We perform numerical experiments on both linear and nonlinear inverse problems and demonstrate that our technique greatly improves the performance of the baseline solver and achieves up to $10\times$ acceleration in mean sampling speed.

Submitted 4 February, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

Comments: 30 pages, 21 figures, preliminary version

ACM Class: I.2.6; I.4.5

arXiv:2303.14353

DiracDiffusion: Denoising and Incremental Reconstruction with Assured Data-Consistency

Authors: Zalan Fabian, Berk Tinaz, Mahdi Soltanolkotabi

Abstract: Diffusion models have established new state of the art in a multitude of computer vision tasks, including image restoration. Diffusion-based inverse problem solvers generate reconstructions of exceptional visual quality from heavily corrupted measurements. However, in what is widely known as the perception-distortion trade-off, the price of perceptually appealing reconstructions is often paid in declined distortion metrics, such as PSNR. Distortion metrics measure faithfulness to the observation, a crucial requirement in inverse problems. In this work, we propose a novel framework for inverse problem solving, namely we assume that the observation comes from a stochastic degradation process that gradually degrades and noises the original clean image. We learn to reverse the degradation process in order to recover the clean image. Our technique maintains consistency with the original measurement throughout the reverse process, and allows for great flexibility in trading off perceptual quality for improved distortion metrics and sampling speedup via early-stopping. We demonstrate the efficiency of our method on different high-resolution datasets and inverse problems, achieving great improvements over other state-of-the-art diffusion-based methods with respect to both perceptual and distortion metrics. Source code and pre-trained models will be released soon.

Submitted 25 March, 2023; originally announced March 2023.

Comments: 28 pages, 13 figures, preliminary version

ACM Class: I.2.6; I.4.4; I.4.5

arXiv:2203.08213

HUMUS-Net: Hybrid unrolled multi-scale network architecture for accelerated MRI reconstruction

Authors: Zalan Fabian, Berk Tinaz, Mahdi Soltanolkotabi

Abstract: In accelerated MRI reconstruction, the anatomy of a patient is recovered from a set of under-sampled and noisy measurements. Deep learning approaches have been proven to be successful in solving this ill-posed inverse problem and are capable of producing very high quality reconstructions. However, current architectures heavily rely on convolutions, which are content-independent and have difficulties modeling long-range dependencies in images. Recently, Transformers, the workhorse of contemporary natural language processing, have emerged as powerful building blocks for a multitude of vision tasks. These models split input images into non-overlapping patches, embed the patches into lower-dimensional tokens and utilize a self-attention mechanism that does not suffer from the aforementioned weaknesses of convolutional architectures. However, Transformers incur extremely high compute and memory cost when 1) the input image resolution is high and 2) the image needs to be split into a large number of patches to preserve fine detail information. Both conditions are typical in low-level vision problems such as MRI reconstruction, and they have a compounding effect. To tackle these challenges, we propose HUMUS-Net, a hybrid architecture that combines the beneficial implicit bias and efficiency of convolutions with the power of Transformer blocks in an unrolled and multi-scale network. HUMUS-Net extracts high-resolution features via convolutional blocks and refines low-resolution features via a novel Transformer-based multi-scale feature extractor. Features from both levels are then synthesized into a high-resolution output reconstruction. Our network establishes new state of the art on the largest publicly available MRI dataset, the fastMRI dataset. We further demonstrate the performance of HUMUS-Net on two other popular MRI datasets and perform fine-grained ablation studies to validate our design.

Submitted 16 March, 2023; v1 submitted 15 March, 2022; originally announced March 2022.

Comments: 18 pages, 11 figures, NeurIPS 2022

ACM Class: I.2; I.4; J.3

arXiv:2011.14347

Semi-Supervised Learning of Mutually Accelerated MRI Synthesis without Fully-Sampled Ground Truths

Authors: Mahmut Yurt, Salman Ul Hassan Dar, Muzaffer Özbey, Berk Tınaz, Kader Karlı Oğuz, Tolga Çukur

Abstract: Learning-based synthetic multi-contrast MRI commonly involves deep models trained using high-quality images of source and target contrasts, regardless of whether source and target domain samples are paired or unpaired. This results in undesirable reliance on fully-sampled acquisitions of all MRI contrasts, which might prove impractical due to limitations on scan costs and time. Here, we propose a novel semi-supervised deep generative model that instead learns to recover high-quality target images directly from accelerated acquisitions of source and target contrasts. To achieve this, the proposed model introduces novel multi-coil tensor losses in image, k-space and adversarial domains. These selective losses are based only on acquired k-space samples, and randomized sampling masks are used across subjects to capture relationships among acquired and non-acquired k-space regions. Comprehensive experiments on multi-contrast neuroimaging datasets demonstrate that our semi-supervised approach yields equivalent performance to gold-standard fully-supervised models, while outperforming a cascaded approach that learns to synthesize based on reconstructions of undersampled data. Therefore, the proposed approach holds great promise to improve the feasibility and utility of accelerated MRI acquisitions mutually undersampled across both contrast sets and k-space.

Submitted 12 May, 2021; v1 submitted 29 November, 2020; originally announced November 2020.

arXiv:2011.13913

Progressively Volumetrized Deep Generative Models for Data-Efficient Contextual Learning of MR Image Recovery

Authors: Mahmut Yurt, Muzaffer Özbey, Salman Ul Hassan Dar, Berk Tınaz, Kader Karlı Oğuz, Tolga Çukur

Abstract: Magnetic resonance imaging (MRI) offers the flexibility to image a given anatomic volume under a multitude of tissue contrasts. Yet, scan time considerations put stringent limits on the quality and diversity of MRI data. The gold-standard approach to alleviate this limitation is to recover high-quality images from data undersampled across various dimensions, most commonly the Fourier domain or contrast sets. A primary distinction among recovery methods is whether the anatomy is processed per volume or per cross-section. Volumetric models offer enhanced capture of global contextual information, but they can suffer from suboptimal learning due to elevated model complexity. Cross-sectional models with lower complexity offer improved learning behavior, yet they ignore contextual information across the longitudinal dimension of the volume. Here, we introduce a novel progressive volumetrization strategy for generative models (ProvoGAN) that serially decomposes complex volumetric image recovery tasks into successive cross-sectional mappings task-optimally ordered across individual rectilinear dimensions. ProvoGAN effectively captures global context and recovers fine-structural details across all dimensions, while maintaining low model complexity and improved learning behaviour. Comprehensive demonstrations on mainstream MRI reconstruction and synthesis tasks show that ProvoGAN yields superior performance to state-of-the-art volumetric and cross-sectional models.

Submitted 12 March, 2022; v1 submitted 27 November, 2020; originally announced November 2020.

Showing 1–5 of 5 results for author: Tınaz, B