Showing 1–9 of 9 results for author: Marion, P

Searching in archive cs.
  1. arXiv:2405.13456  [pdf, other]

    stat.ML cs.LG

    Deep linear networks for regression are implicitly regularized towards flat minima

    Authors: Pierre Marion, Lénaïc Chizat

    Abstract: The largest eigenvalue of the Hessian, or sharpness, of neural networks is a key quantity to understand their optimization dynamics. In this paper, we study the sharpness of deep linear networks for overdetermined univariate regression. Minimizers can have arbitrarily large sharpness, but not an arbitrarily small one. Indeed, we show a lower bound on the sharpness of minimizers, which grows linear…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 46 pages, 4 figures

  2. arXiv:2402.05468  [pdf, other]

    cs.LG

    Implicit Diffusion: Efficient Optimization through Stochastic Sampling

    Authors: Pierre Marion, Anna Korba, Peter Bartlett, Mathieu Blondel, Valentin De Bortoli, Arnaud Doucet, Felipe Llinares-López, Courtney Paquette, Quentin Berthet

    Abstract: We present a new algorithm to optimize distributions defined implicitly by parameterized stochastic diffusions. Doing so allows us to modify the outcome distribution of sampling processes by optimizing over their parameters. We introduce a general framework for first-order optimization of these processes, that performs jointly, in a single loop, optimization and sampling steps. This approach is in…

    Submitted 22 May, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: 38 pages, 16 figures. Updated with additional experiments

  3. arXiv:2309.01213  [pdf, other]

    stat.ML cs.LG

    Implicit regularization of deep residual networks towards neural ODEs

    Authors: Pierre Marion, Yu-Han Wu, Michael E. Sander, Gérard Biau

    Abstract: Residual neural networks are state-of-the-art deep learning models. Their continuous-depth analog, neural ordinary differential equations (ODEs), are also widely used. Despite their success, the link between the discrete and continuous models still lacks a solid mathematical foundation. In this article, we take a step in this direction by establishing an implicit regularization of deep residual ne…

    Submitted 1 March, 2024; v1 submitted 3 September, 2023; originally announced September 2023.

    Comments: ICLR 2024 (spotlight). 40 pages, 3 figures

  4. arXiv:2305.06648  [pdf, other]

    stat.ML cs.LG

    Generalization bounds for neural ordinary differential equations and deep residual networks

    Authors: Pierre Marion

    Abstract: Neural ordinary differential equations (neural ODEs) are a popular family of continuous-depth deep learning models. In this work, we consider a large family of parameterized ODEs with continuous-in-time parameters, which include time-dependent neural ODEs. We derive a generalization bound for this class by a Lipschitz-based argument. By leveraging the analogy between neural ODEs and deep residual…

    Submitted 11 October, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023, 21 pages, 2 figures

  5. arXiv:2304.09576  [pdf, other]

    math.OC cs.LG stat.ML

    Leveraging the two timescale regime to demonstrate convergence of neural networks

    Authors: Pierre Marion, Raphaël Berthier

    Abstract: We study the training dynamics of shallow neural networks, in a two-timescale regime in which the stepsizes for the inner layer are much smaller than those for the outer layer. In this regime, we prove convergence of the gradient flow to a global optimum of the non-convex optimization problem in a simple univariate setting. The number of neurons need not be asymptotically large for our result to h…

    Submitted 25 October, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

    Comments: NeurIPS 2023. 34 pages, 10 figures

  6. arXiv:2206.06929  [pdf, other]

    cs.LG stat.ML

    Scaling ResNets in the Large-depth Regime

    Authors: Pierre Marion, Adeline Fermanian, Gérard Biau, Jean-Philippe Vert

    Abstract: Deep ResNets are recognized for achieving state-of-the-art results in complex machine learning tasks. However, the remarkable performance of these architectures relies on a training procedure that needs to be carefully crafted to avoid vanishing or exploding gradients, particularly as the depth $L$ increases. No consensus has been reached on how to mitigate this issue, although a widely discussed…

    Submitted 10 June, 2024; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: 44 pages, 9 figures. Updated with clarifications and additional references

  7. arXiv:2109.00269  [pdf, other]

    cs.CL

    Structured Context and High-Coverage Grammar for Conversational Question Answering over Knowledge Graphs

    Authors: Pierre Marion, Paweł Krzysztof Nowak, Francesco Piccinno

    Abstract: We tackle the problem of weakly-supervised conversational Question Answering over large Knowledge Graphs using a neural semantic parsing approach. We introduce a new Logical Form (LF) grammar that can model a wide range of queries on the graph while remaining sufficiently simple to generate supervision data efficiently. Our Transformer-based model takes a JSON-like structure as input, allowing us…

    Submitted 1 September, 2021; originally announced September 2021.

    Comments: 16 pages, 1 figure. Accepted to EMNLP 2021

    ACM Class: I.2.7

  8. arXiv:2106.01202  [pdf, other]

    stat.ML cs.LG

    Framing RNN as a kernel method: A neural ODE approach

    Authors: Adeline Fermanian, Pierre Marion, Jean-Philippe Vert, Gérard Biau

    Abstract: Building on the interpretation of a recurrent neural network (RNN) as a continuous-time neural differential equation, we show, under appropriate conditions, that the solution of a RNN can be viewed as a linear function of a specific feature set of the input sequence, known as the signature. This connection allows us to frame a RNN as a kernel method in a suitable reproducing kernel Hilbert space.…

    Submitted 29 October, 2021; v1 submitted 2 June, 2021; originally announced June 2021.

    Comments: 33 pages, 7 figures, accepted for an oral presentation at NeurIPS 2021

  9. arXiv:1707.04796  [pdf, other]

    cs.CV cs.RO

    LabelFusion: A Pipeline for Generating Ground Truth Labels for Real RGBD Data of Cluttered Scenes

    Authors: Pat Marion, Peter R. Florence, Lucas Manuelli, Russ Tedrake

    Abstract: Deep neural network (DNN) architectures have been shown to outperform traditional pipelines for object segmentation and pose estimation using RGBD data, but the performance of these DNN pipelines is directly tied to how representative the training data is of the true data. Hence a key requirement for employing these methods in practice is to have a large set of labeled data for your specific robot…

    Submitted 26 September, 2017; v1 submitted 15 July, 2017; originally announced July 2017.