-
Improving Diffusion Models for Inverse Problems using Manifold Constraints
Authors:
Hyung** Chung,
Byeongsu Sim,
Dohoon Ryu,
Jong Chul Ye
Abstract:
Recently, diffusion models have been used to solve various inverse problems in an unsupervised manner with appropriate modifications to the sampling process. However, the current solvers, which recursively apply a reverse diffusion step followed by a projection-based measurement consistency step, often produce suboptimal results. By studying the generative sampling path, here we show that current…
▽ More
Recently, diffusion models have been used to solve various inverse problems in an unsupervised manner with appropriate modifications to the sampling process. However, the current solvers, which recursively apply a reverse diffusion step followed by a projection-based measurement consistency step, often produce suboptimal results. By studying the generative sampling path, here we show that current solvers throw the sample path off the data manifold, and hence the error accumulates. To address this, we propose an additional correction term inspired by the manifold constraint, which can be used synergistically with the previous solvers to make the iterations close to the manifold. The proposed manifold constraint is straightforward to implement within a few lines of code, yet boosts the performance by a surprisingly large margin. With extensive experiments, we show that our method is superior to the previous methods both theoretically and empirically, producing promising results in many applications such as image inpainting, colorization, and sparse-view computed tomography. Code available https://github.com/HJ-harry/MCG_diffusion
△ Less
Submitted 20 May, 2024; v1 submitted 2 June, 2022;
originally announced June 2022.
-
Support Vectors and Gradient Dynamics of Single-Neuron ReLU Networks
Authors:
Sangmin Lee,
Byeongsu Sim,
Jong Chul Ye
Abstract:
Understanding implicit bias of gradient descent for generalization capability of ReLU networks has been an important research topic in machine learning research. Unfortunately, even for a single ReLU neuron trained with the square loss, it was recently shown impossible to characterize the implicit regularization in terms of a norm of model parameters (Vardi & Shamir, 2021). In order to close the g…
▽ More
Understanding implicit bias of gradient descent for generalization capability of ReLU networks has been an important research topic in machine learning research. Unfortunately, even for a single ReLU neuron trained with the square loss, it was recently shown impossible to characterize the implicit regularization in terms of a norm of model parameters (Vardi & Shamir, 2021). In order to close the gap toward understanding intriguing generalization behavior of ReLU networks, here we examine the gradient flow dynamics in the parameter space when training single-neuron ReLU networks. Specifically, we discover an implicit bias in terms of support vectors, which plays a key role in why and how ReLU networks generalize well. Moreover, we analyze gradient flows with respect to the magnitude of the norm of initialization, and show that the norm of the learned weight strictly increases through the gradient flow. Lastly, we prove the global convergence of single ReLU neuron for $d = 2$ case.
△ Less
Submitted 13 June, 2022; v1 submitted 11 February, 2022;
originally announced February 2022.
-
Come-Closer-Diffuse-Faster: Accelerating Conditional Diffusion Models for Inverse Problems through Stochastic Contraction
Authors:
Hyung** Chung,
Byeongsu Sim,
Jong Chul Ye
Abstract:
Diffusion models have recently attained significant interest within the community owing to their strong performance as generative models. Furthermore, its application to inverse problems have demonstrated state-of-the-art performance. Unfortunately, diffusion models have a critical downside - they are inherently slow to sample from, needing few thousand steps of iteration to generate images from p…
▽ More
Diffusion models have recently attained significant interest within the community owing to their strong performance as generative models. Furthermore, its application to inverse problems have demonstrated state-of-the-art performance. Unfortunately, diffusion models have a critical downside - they are inherently slow to sample from, needing few thousand steps of iteration to generate images from pure Gaussian noise. In this work, we show that starting from Gaussian noise is unnecessary. Instead, starting from a single forward diffusion with better initialization significantly reduces the number of sampling steps in the reverse conditional diffusion. This phenomenon is formally explained by the contraction theory of the stochastic difference equations like our conditional diffusion strategy - the alternating applications of reverse diffusion followed by a non-expansive data consistency step. The new sampling strategy, dubbed Come-Closer-Diffuse-Faster (CCDF), also reveals a new insight on how the existing feed-forward neural network approaches for inverse problems can be synergistically combined with the diffusion models. Experimental results with super-resolution, image inpainting, and compressed sensing MRI demonstrate that our method can achieve state-of-the-art reconstruction performance at significantly reduced sampling steps.
△ Less
Submitted 19 March, 2022; v1 submitted 8 December, 2021;
originally announced December 2021.
-
Unpaired Deep Learning for Accelerated MRI using Optimal Transport Driven CycleGAN
Authors:
Gyutaek Oh,
Byeongsu Sim,
Hyung** Chung,
Leonard Sunwoo,
Jong Chul Ye
Abstract:
Recently, deep learning approaches for accelerated MRI have been extensively studied thanks to their high performance reconstruction in spite of significantly reduced runtime complexity. These neural networks are usually trained in a supervised manner, so matched pairs of subsampled and fully sampled k-space data are required. Unfortunately, it is often difficult to acquire matched fully sampled k…
▽ More
Recently, deep learning approaches for accelerated MRI have been extensively studied thanks to their high performance reconstruction in spite of significantly reduced runtime complexity. These neural networks are usually trained in a supervised manner, so matched pairs of subsampled and fully sampled k-space data are required. Unfortunately, it is often difficult to acquire matched fully sampled k-space data, since the acquisition of fully sampled k-space data requires long scan time and often leads to the change of the acquisition protocol. Therefore, unpaired deep learning without matched label data has become a very important research topic. In this paper, we propose an unpaired deep learning approach using a optimal transport driven cycle-consistent generative adversarial network (OT-cycleGAN) that employs a single pair of generator and discriminator. The proposed OT-cycleGAN architecture is rigorously derived from a dual formulation of the optimal transport formulation using a specially designed penalized least squares cost. The experimental results show that our method can reconstruct high resolution MR images from accelerated k- space data from both single and multiple coil acquisition, without requiring matched reference data.
△ Less
Submitted 29 August, 2020;
originally announced August 2020.
-
Optimal Transport driven CycleGAN for Unsupervised Learning in Inverse Problems
Authors:
Byeongsu Sim,
Gyutaek Oh,
Jeongsol Kim,
Chanyong Jung,
Jong Chul Ye
Abstract:
To improve the performance of classical generative adversarial network (GAN), Wasserstein generative adversarial networks (W-GAN) was developed as a Kantorovich dual formulation of the optimal transport (OT) problem using Wasserstein-1 distance. However, it was not clear how cycleGAN-type generative models can be derived from the optimal transport theory. Here we show that a novel cycleGAN archite…
▽ More
To improve the performance of classical generative adversarial network (GAN), Wasserstein generative adversarial networks (W-GAN) was developed as a Kantorovich dual formulation of the optimal transport (OT) problem using Wasserstein-1 distance. However, it was not clear how cycleGAN-type generative models can be derived from the optimal transport theory. Here we show that a novel cycleGAN architecture can be derived as a Kantorovich dual OT formulation if a penalized least square (PLS) cost with deep learning-based inverse path penalty is used as a transportation cost. One of the most important advantages of this formulation is that depending on the knowledge of the forward problem, distinct variations of cycleGAN architecture can be derived: for example, one with two pairs of generators and discriminators, and the other with only a single pair of generator and discriminator. Even for the two generator cases, we show that the structural knowledge of the forward operator can lead to a simpler generator architecture which significantly simplifies the neural network training. The new cycleGAN formulation, what we call the OT-cycleGAN, have been applied for various biomedical imaging problems, such as accelerated magnetic resonance imaging (MRI), super-resolution microscopy, and low-dose x-ray computed tomography (CT). Experimental results confirm the efficacy and flexibility of the theory.
△ Less
Submitted 30 August, 2020; v1 submitted 25 September, 2019;
originally announced September 2019.