Towards Exact Gradient-based Training on Analog In-memory Computing

Authors: Zhaoxian Wu, Tayfun Gokmen, Malte J. Rasch, Tianyi Chen

Abstract: Given the high economic and environmental costs of using large vision or language models, analog in-memory accelerators present a promising solution for energy-efficient AI. While inference on analog accelerators has been studied recently, the training perspective is underexplored. Recent studies have shown that the "workhorse" of digital AI training - stochastic gradient descent (SGD) algorithm converges inexactly when applied to model training on non-ideal devices. This paper puts forth a theoretical foundation for gradient-based training on analog devices. We begin by characterizing the non-convergent issue of SGD, which is caused by the asymmetric updates on the analog devices. We then provide a lower bound of the asymptotic error to show that there is a fundamental performance limit of SGD-based analog training rather than an artifact of our analysis. To address this issue, we study a heuristic analog algorithm called Tiki-Taka that has recently exhibited superior empirical performance compared to SGD and rigorously show its ability to exactly converge to a critical point and hence eliminates the asymptotic error. The simulations verify the correctness of the analyses.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 10 pages, 5 figures, 2 tables

arXiv:1806.10038 [pdf, other]

doi 10.1088/1361-6420/aaf6f5

Convergence rates and structure of solutions of inverse problems with imperfect forward models

Authors: Martin Burger, Yury Korolev, Julian Rasch

Abstract: The goal of this paper is to further develop an approach to inverse problems with imperfect forward operators that is based on partially ordered spaces. Studying the dual problem yields useful insights into the convergence of the regularised solutions and allow us to obtain convergence rates in terms of Bregman distances - as usual in inverse problems, under an additional assumption on the exact solution called the source condition. These results are obtained for general absolutely one-homogeneous functionals. In the special case of TV-based regularisation we also study the structure of regularised solutions and prove convergence of their level sets to those of an exact solution. Finally, using the developed theory, we adapt the concept of debiasing to inverse problems with imperfect operators and propose an approach to pointwise error estimation in TV-based regularisation.

Submitted 20 November, 2018; v1 submitted 26 June, 2018; originally announced June 2018.

MSC Class: 65J20; 94A08; 49N45; 49N30

arXiv:1803.10576 [pdf, other]

Inexact First-Order Primal-Dual Algorithms

Authors: Julian Rasch, Antonin Chambolle

Abstract: In this paper we investigate the convergence of a recently popular class of first-order primal-dual algorithms for saddle point problems under the presence of errors occurring in the proximal maps and gradients. We study several types of errors and show that, provided a sufficient decay of these errors, the same convergence rates as for the error-free algorithm can be established. More precisely, we prove the (optimal) $O(1/N)$ convergence to a saddle point in finite dimensions for the class of non-smooth problems considered in this paper, and prove a $O(1/N^2)$ or even linear $O(θ^N)$ convergence rate if either the primal or dual objective respectively both are strongly convex. Moreover we show that also under a slower decay of errors we can establish rates, however slower and directly depending on the decay of the errors. We demonstrate the performance and practical use of the algorithms on the example of nested algorithms and show how they can be used to split the global objective more efficiently.

Submitted 24 February, 2020; v1 submitted 28 March, 2018; originally announced March 2018.

Comments: update after revision

arXiv:1712.00099 [pdf, other]

Dynamic MRI Reconstruction from Undersampled Data with an Anatomical Prescan

Authors: Julian Rasch, Ville Kolehmainen, Riikka Nivajärvi, Mikko Kettunen, Olli Gröhn, Martin Burger, Eva-Maria Brinkmann

Abstract: The goal of dynamic magnetic resonance imaging (dynamic MRI) is to visualize tissue properties and their local changes over time that are traceable in the MR signal. We propose a new variational approach for the reconstruction of subsampled dynamic MR data, which combines smooth, temporal regularization with spatial total variation regularization. In particular, it furthermore uses the infimal convolution of two total variation Bregman distances to incorporate structural a-priori information from an anatomical MRI prescan into the reconstruction of the dynamic image sequence. The method promotes the reconstructed image sequence to have a high structural similarity to the anatomical prior, while still allowing for local intensity changes which are smooth in time. The approach is evaluated using artificial data simulating functional magnetic resonance imaging (fMRI), and experimental dynamic contrast-enhanced magnetic resonance data from small animal imaging using radial golden angle sampling of the k-space.

Submitted 30 November, 2017; originally announced December 2017.

arXiv:1710.05705 [pdf, other]

doi 10.1088/1361-6420/aaaf63

Blind Image Fusion for Hyperspectral Imaging with the Directional Total Variation

Authors: Leon Bungert, David A. Coomes, Matthias J. Ehrhardt, Jennifer Rasch, Rafael Reisenhofer, Carola-Bibiane Schönlieb

Abstract: Hyperspectral imaging is a cutting-edge type of remote sensing used for mapping vegetation properties, rock minerals and other materials. A major drawback of hyperspectral imaging devices is their intrinsic low spatial resolution. In this paper, we propose a method for increasing the spatial resolution of a hyperspectral image by fusing it with an image of higher spatial resolution that was obtained with a different imaging modality. This is accomplished by solving a variational problem in which the regularization functional is the directional total variation. To accommodate for possible mis-registrations between the two images, we consider a non-convex blind super-resolution problem where both a fused image and the corresponding convolution kernel are estimated. Using this approach, our model can realign the given images if needed. Our experimental results indicate that the non-convexity is negligible in practice and that reliable solutions can be computed using a variety of different optimization algorithms. Numerical results on real remote sensing data from plant sciences and urban monitoring show the potential of the proposed method and suggests that it is robust with respect to the regularization parameters, mis-registration and the shape of the kernel.

Submitted 9 April, 2018; v1 submitted 4 October, 2017; originally announced October 2017.

Comments: 24 pages, 18 figures, published in Inverse Problems, typo corrected, figure added

MSC Class: 49M37; 65K10; 90C30; 90C90

Journal ref: Inverse Problems, 34(4), 044003, 2018

arXiv:1704.06073 [pdf, other]

doi 10.1088/1361-6420/aa9425

Joint Reconstruction via Coupled Bregman Iterations with Applications to PET-MR Imaging

Authors: Julian Rasch, Eva-Maria Brinkmann, Martin Burger

Abstract: Joint reconstruction has recently attracted a lot of attention, especially in the field of medical multi-modality imaging such as PET-MRI. Most of the developed methods rely on the comparison of image gradients, or more precisely their location, direction and magnitude, to make use of structural similarities between the images. A challenge and still an open issue for most of the methods is to handle images in entirely different scales, i.e. different magnitudes of gradients that cannot be dealt with by a global scaling of the data. We propose the use of generalized Bregman distances and infimal convolutions thereof with regard to the well-known total variation functional. The use of a total variation subgradient respectively the involved vector field rather than an image gradient naturally excludes the magnitudes of gradients, which in particular solves the scaling behavior. Additionally, the presented method features a weighting that allows to control the amount of interaction between channels. We give insights into the general behavior of the method, before we further tailor it to a particular application, namely PET-MRI joint reconstruction. To do so, we compute joint reconstruction results from blurry Poisson data for PET and undersampled Fourier data from MRI and show that we can gain a mutual benefit for both modalities. In particular, the results are superior to the respective separate reconstructions and other joint reconstruction methods.

Submitted 26 September, 2017; v1 submitted 20 April, 2017; originally announced April 2017.

Comments: Submitted

arXiv:1606.05113 [pdf, other]

Bias-Reduction in Variational Regularization

Authors: Eva-Maria Brinkmann, Martin Burger, Julian Rasch, Camille Sutour

Abstract: The aim of this paper is to introduce and study a two-step debiasing method for variational regularization. After solving the standard variational problem, the key idea is to add a consecutive debiasing step minimizing the data fidelity on an appropriate set, the so-called model manifold. The latter is defined by Bregman distances or infimal convolutions thereof, using the (uniquely defined) subgradient appearing in the optimality condition of the variational method. For particular settings, such as anisotropic $\ell^1$ and TV-type regularization, previously used debiasing techniques are shown to be special cases. The proposed approach is however easily applicable to a wider range of regularizations. The two-step debiasing is shown to be well-defined and to optimally reduce bias in a certain setting. In addition to visual and PSNR-based evaluations, different notions of bias and variance decompositions are investigated in numerical studies. The improvements offered by the proposed scheme are demonstrated and its performance is shown to be comparable to optimal results obtained with Bregman iterations.

Submitted 22 June, 2017; v1 submitted 16 June, 2016; originally announced June 2016.

Comments: Accepted by JMIV

Showing 1–7 of 7 results for author: Rasch, J