Skip to main content

Showing 1–8 of 8 results for author: Cyr, E C

Searching in archive cs. Search in all archives.
.
  1. arXiv:2310.14084  [pdf, other

    math.NA cs.CE cs.LG

    Graph Neural Networks and Applied Linear Algebra

    Authors: Nicholas S. Moore, Eric C. Cyr, Peter Ohm, Christopher M. Siefert, Raymond S. Tuminaro

    Abstract: Sparse matrix computations are ubiquitous in scientific computing. With the recent interest in scientific machine learning, it is natural to ask how sparse matrix computations can leverage neural networks (NN). Unfortunately, multi-layer perceptron (MLP) neural networks are typically not natural for either graph or sparse matrix computations. The issue lies with the fact that MLPs require fixed-si… ▽ More

    Submitted 21 October, 2023; originally announced October 2023.

    Report number: SAND2023-10755O

  2. arXiv:2203.04738  [pdf, other

    cs.CV cs.DC cs.LG

    Parallel Training of GRU Networks with a Multi-Grid Solver for Long Sequences

    Authors: Gordon Euhyun Moon, Eric C. Cyr

    Abstract: Parallelizing Gated Recurrent Unit (GRU) networks is a challenging task, as the training procedure of GRU is inherently sequential. Prior efforts to parallelize GRU have largely focused on conventional parallelization strategies such as data-parallel and model-parallel training algorithms. However, when the given sequences are very long, existing approaches are still inevitably performance limited… ▽ More

    Submitted 7 March, 2022; originally announced March 2022.

    Comments: Accepted at ICLR 2022

  3. arXiv:2101.11256  [pdf, other

    cs.LG math.NA stat.ML

    Partition of unity networks: deep hp-approximation

    Authors: Kook** Lee, Nathaniel A. Trask, Ravi G. Patel, Mamikon A. Gulian, Eric C. Cyr

    Abstract: Approximation theorists have established best-in-class optimal approximation rates of deep neural networks by utilizing their ability to simultaneously emulate partitions of unity and monomials. Motivated by this, we propose partition of unity networks (POUnets) which incorporate these elements directly into the architecture. Classification architectures of the type used to learn probability measu… ▽ More

    Submitted 27 January, 2021; originally announced January 2021.

    Comments: 8 pages, 5 figures

  4. arXiv:2009.11992  [pdf, other

    physics.comp-ph cs.LG math.NA stat.ML

    A physics-informed operator regression framework for extracting data-driven continuum models

    Authors: Ravi G. Patel, Nathaniel A. Trask, Mitchell A. Wood, Eric C. Cyr

    Abstract: The application of deep learning toward discovery of data-driven models requires careful application of inductive biases to obtain a description of physics which is both accurate and robust. We present here a framework for discovering continuum models from high fidelity molecular simulation data. Our approach applies a neural network parameterization of governing physics in modal space, allowing a… ▽ More

    Submitted 24 September, 2020; originally announced September 2020.

    Comments: 37 pages, 15 figures

  5. arXiv:2006.10123  [pdf, other

    cs.LG stat.ML

    A block coordinate descent optimizer for classification problems exploiting convexity

    Authors: Ravi G. Patel, Nathaniel A. Trask, Mamikon A. Gulian, Eric C. Cyr

    Abstract: Second-order optimizers hold intriguing potential for deep learning, but suffer from increased cost and sensitivity to the non-convexity of the loss surface as compared to gradient-based approaches. We introduce a coordinate descent method to train deep neural networks for classification tasks that exploits global convexity of the cross-entropy loss in the weights of the linear layer. Our hybrid N… ▽ More

    Submitted 17 June, 2020; originally announced June 2020.

    Comments: 10 pages, 4 figures

  6. arXiv:1912.08974  [pdf, other

    cs.LG math.OC stat.ML

    Multilevel Initialization for Layer-Parallel Deep Neural Network Training

    Authors: Eric C. Cyr, Stefanie Günther, Jacob B. Schroder

    Abstract: This paper investigates multilevel initialization strategies for training very deep neural networks with a layer-parallel multigrid solver. The scheme is based on the continuous interpretation of the training problem as a problem of optimal control, in which neural networks are represented as discretizations of time-dependent ordinary differential equations. A key goal is to develop a method able… ▽ More

    Submitted 18 December, 2019; originally announced December 2019.

  7. arXiv:1912.04862  [pdf, other

    cs.LG math.NA stat.ML

    Robust Training and Initialization of Deep Neural Networks: An Adaptive Basis Viewpoint

    Authors: Eric C. Cyr, Mamikon A. Gulian, Ravi G. Patel, Mauro Perego, Nathaniel A. Trask

    Abstract: Motivated by the gap between theoretical optimal approximation rates of deep neural networks (DNNs) and the accuracy realized in practice, we seek to improve the training of DNNs. The adoption of an adaptive basis viewpoint of DNNs leads to novel initializations and a hybrid least squares/gradient descent optimizer. We provide analysis of these techniques and illustrate via numerical examples dram… ▽ More

    Submitted 10 December, 2019; originally announced December 2019.

    Comments: 26 pages

  8. arXiv:1812.04352  [pdf, other

    math.OC cs.LG

    Layer-Parallel Training of Deep Residual Neural Networks

    Authors: S. Günther, L. Ruthotto, J. B. Schroder, E. C. Cyr, N. R. Gauger

    Abstract: Residual neural networks (ResNets) are a promising class of deep neural networks that have shown excellent performance for a number of learning tasks, e.g., image classification and recognition. Mathematically, ResNet architectures can be interpreted as forward Euler discretizations of a nonlinear initial value problem whose time-dependent control variables represent the weights of the neural netw… ▽ More

    Submitted 25 July, 2019; v1 submitted 11 December, 2018; originally announced December 2018.