Search | arXiv e-print repository

Test like you Train in Implicit Deep Learning

Authors: Zaccharie Ramzi, Pierre Ablin, Gabriel Peyré, Thomas Moreau

Abstract: Implicit deep learning has recently gained popularity with applications ranging from meta-learning to Deep Equilibrium Networks (DEQs). In its general formulation, it relies on expressing some components of deep learning pipelines implicitly, typically via a root equation called the inner problem. In practice, the solution of the inner problem is approximated during training with an iterative proc… ▽ More Implicit deep learning has recently gained popularity with applications ranging from meta-learning to Deep Equilibrium Networks (DEQs). In its general formulation, it relies on expressing some components of deep learning pipelines implicitly, typically via a root equation called the inner problem. In practice, the solution of the inner problem is approximated during training with an iterative procedure, usually with a fixed number of inner iterations. During inference, the inner problem needs to be solved with new data. A popular belief is that increasing the number of inner iterations compared to the one used during training yields better performance. In this paper, we question such an assumption and provide a detailed theoretical analysis in a simple setting. We demonstrate that overparametrization plays a key role: increasing the number of iterations at test time cannot improve performance for overparametrized networks. We validate our theory on an array of implicit deep-learning problems. DEQs, which are typically overparametrized, do not benefit from increasing the number of iterations at inference while meta-learning, which is typically not overparametrized, benefits from it. △ Less

Submitted 24 May, 2023; originally announced May 2023.

arXiv:2206.13424 [pdf, other]

Benchopt: Reproducible, efficient and collaborative optimization benchmarks

Authors: Thomas Moreau, Mathurin Massias, Alexandre Gramfort, Pierre Ablin, Pierre-Antoine Bannier, Benjamin Charlier, Mathieu Dagréou, Tom Dupré la Tour, Ghislain Durif, Cassio F. Dantas, Quentin Klopfenstein, Johan Larsson, En Lai, Tanguy Lefort, Benoit Malézieux, Badr Moufad, Binh T. Nguyen, Alain Rakotomamonjy, Zaccharie Ramzi, Joseph Salmon, Samuel Vaiter

Abstract: Numerical validation is at the core of machine learning research as it allows to assess the actual impact of new methods, and to confirm the agreement between theory and practice. Yet, the rapid development of the field poses several challenges: researchers are confronted with a profusion of methods to compare, limited transparency and consensus on best practices, as well as tedious re-implementat… ▽ More Numerical validation is at the core of machine learning research as it allows to assess the actual impact of new methods, and to confirm the agreement between theory and practice. Yet, the rapid development of the field poses several challenges: researchers are confronted with a profusion of methods to compare, limited transparency and consensus on best practices, as well as tedious re-implementation work. As a result, validation is often very partial, which can lead to wrong conclusions that slow down the progress of research. We propose Benchopt, a collaborative framework to automate, reproduce and publish optimization benchmarks in machine learning across programming languages and hardware architectures. Benchopt simplifies benchmarking for the community by providing an off-the-shelf tool for running, sharing and extending experiments. To demonstrate its broad usability, we showcase benchmarks on three standard learning tasks: $\ell_2$-regularized logistic regression, Lasso, and ResNet18 training for image classification. These benchmarks highlight key practical findings that give a more nuanced view of the state-of-the-art for these problems, showing that for practical evaluation, the devil is in the details. We hope that Benchopt will foster collaborative work in the community hence improving the reproducibility of research findings. △ Less

Submitted 28 October, 2022; v1 submitted 27 June, 2022; originally announced June 2022.

Comments: Accepted in proceedings of NeurIPS 22; Benchopt library documentation is available at https://benchopt.github.io/

arXiv:2106.00553 [pdf, other]

SHINE: SHaring the INverse Estimate from the forward pass for bi-level optimization and implicit models

Authors: Zaccharie Ramzi, Florian Mannel, Shaojie Bai, Jean-Luc Starck, Philippe Ciuciu, Thomas Moreau

Abstract: In recent years, implicit deep learning has emerged as a method to increase the effective depth of deep neural networks. While their training is memory-efficient, they are still significantly slower to train than their explicit counterparts. In Deep Equilibrium Models (DEQs), the training is performed as a bi-level problem, and its computational complexity is partially driven by the iterative inve… ▽ More In recent years, implicit deep learning has emerged as a method to increase the effective depth of deep neural networks. While their training is memory-efficient, they are still significantly slower to train than their explicit counterparts. In Deep Equilibrium Models (DEQs), the training is performed as a bi-level problem, and its computational complexity is partially driven by the iterative inversion of a huge Jacobian matrix. In this paper, we propose a novel strategy to tackle this computational bottleneck from which many bi-level problems suffer. The main idea is to use the quasi-Newton matrices from the forward pass to efficiently approximate the inverse Jacobian matrix in the direction needed for the gradient computation. We provide a theorem that motivates using our method with the original forward algorithms. In addition, by modifying these forward algorithms, we further provide theoretical guarantees that our method asymptotically estimates the true implicit gradient. We empirically study this approach and the recent Jacobian-Free method in different settings, ranging from hyperparameter optimization to large Multiscale DEQs (MDEQs) applied to CIFAR and ImageNet. Both methods reduce significantly the computational cost of the backward pass. While SHINE has a clear advantage on hyperparameter optimization problems, both methods attain similar computational performances for larger scale problems such as MDEQs at the cost of a limited performance drop compared to the original models. △ Less

Submitted 10 March, 2023; v1 submitted 1 June, 2021; originally announced June 2021.

Comments: Accepted as a spotlight to ICLR 2022

arXiv:2101.01570 [pdf, other]

Density Compensated Unrolled Networks for Non-Cartesian MRI Reconstruction

Authors: Zaccharie Ramzi, Jean-Luc Starck, Philippe Ciuciu

Abstract: Deep neural networks have recently been thoroughly investigated as a powerful tool for MRI reconstruction. There is a lack of research, however, regarding their use for a specific setting of MRI, namely non-Cartesian acquisitions. In this work, we introduce a novel kind of deep neural networks to tackle this problem, namely density compensated unrolled neural networks, which rely on Density Compen… ▽ More Deep neural networks have recently been thoroughly investigated as a powerful tool for MRI reconstruction. There is a lack of research, however, regarding their use for a specific setting of MRI, namely non-Cartesian acquisitions. In this work, we introduce a novel kind of deep neural networks to tackle this problem, namely density compensated unrolled neural networks, which rely on Density Compensation to correct the uneven weighting of the k-space. We assess their efficiency on the publicly available fastMRI dataset, and perform a small ablation study. Our results show that the density-compensated unrolled neural networks outperform the different baselines, and that all parts of the design are needed. We also open source our code, in particular a Non-Uniform Fast Fourier transform for TensorFlow. △ Less

Submitted 8 February, 2021; v1 submitted 5 January, 2021; originally announced January 2021.

arXiv:2011.08698 [pdf, other]

Denoising Score-Matching for Uncertainty Quantification in Inverse Problems

Authors: Zaccharie Ramzi, Benjamin Remy, Francois Lanusse, Jean-Luc Starck, Philippe Ciuciu

Abstract: Deep neural networks have proven extremely efficient at solving a wide rangeof inverse problems, but most often the uncertainty on the solution they provideis hard to quantify. In this work, we propose a generic Bayesian framework forsolving inverse problems, in which we limit the use of deep neural networks tolearning a prior distribution on the signals to recover. We adopt recent denoisingscore… ▽ More Deep neural networks have proven extremely efficient at solving a wide rangeof inverse problems, but most often the uncertainty on the solution they provideis hard to quantify. In this work, we propose a generic Bayesian framework forsolving inverse problems, in which we limit the use of deep neural networks tolearning a prior distribution on the signals to recover. We adopt recent denoisingscore matching techniques to learn this prior from data, and subsequently use it aspart of an annealed Hamiltonian Monte-Carlo scheme to sample the full posteriorof image inverse problems. We apply this framework to Magnetic ResonanceImage (MRI) reconstruction and illustrate how this approach not only yields highquality reconstructions but can also be used to assess the uncertainty on particularfeatures of a reconstructed image. △ Less

Submitted 16 November, 2020; originally announced November 2020.

arXiv:2010.07290 [pdf, other]

XPDNet for MRI Reconstruction: an application to the 2020 fastMRI challenge

Authors: Zaccharie Ramzi, Philippe Ciuciu, Jean-Luc Starck

Abstract: We present a new neural network, the XPDNet, for MRI reconstruction from periodically under-sampled multi-coil data. We inform the design of this network by taking best practices from MRI reconstruction and computer vision. We show that this network can achieve state-of-the-art reconstruction results, as shown by its ranking of second in the fastMRI 2020 challenge. We present a new neural network, the XPDNet, for MRI reconstruction from periodically under-sampled multi-coil data. We inform the design of this network by taking best practices from MRI reconstruction and computer vision. We show that this network can achieve state-of-the-art reconstruction results, as shown by its ranking of second in the fastMRI 2020 challenge. △ Less

Submitted 7 July, 2021; v1 submitted 15 October, 2020; originally announced October 2020.

Comments: 8 pages, 3 figures, presented as an oral to the 2021 ISMRM conference

Showing 1–6 of 6 results for author: Ramzi, Z