Showing 1–10 of 10 results for author: Rusch, K

  1. arXiv:2405.15059

    cs.LG math.NA stat.ML

    Message-Passing Monte Carlo: Generating low-discrepancy point sets via Graph Neural Networks

    Authors: T. Konstantin Rusch, Nathan Kirk, Michael M. Bronstein, Christiane Lemieux, Daniela Rus

    Abstract: Discrepancy is a well-known measure for the irregularity of the distribution of a point set. Point sets with small discrepancy are called low-discrepancy and are known to efficiently fill the space in a uniform manner. Low-discrepancy points play a central role in many problems in science and engineering, including numerical integration, computer vision, machine perception, computer graphics, mach…

    Submitted 23 May, 2024; originally announced May 2024.

  2. arXiv:2306.03589

    cs.LG stat.ML

    How does over-squashing affect the power of GNNs?

    Authors: Francesco Di Giovanni, T. Konstantin Rusch, Michael M. Bronstein, Andreea Deac, Marc Lackenby, Siddhartha Mishra, Petar Veličković

    Abstract: Graph Neural Networks (GNNs) are the state-of-the-art model for machine learning on graph-structured data. The most popular class of GNNs operate by exchanging information between adjacent nodes, and are known as Message Passing Neural Networks (MPNNs). Given their widespread use, understanding the expressive power of MPNNs is a key question. However, existing results typically consider settings w…

    Submitted 12 February, 2024; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: 36 pages; Published in Transactions on Machine Learning Research (TMLR)

  3. arXiv:2302.03580

    cs.LG math.NA stat.ML

    Multi-Scale Message Passing Neural PDE Solvers

    Authors: Léonard Equer, T. Konstantin Rusch, Siddhartha Mishra

    Abstract: We propose a novel multi-scale message passing neural network algorithm for learning the solutions of time-dependent PDEs. Our algorithm possesses both temporal and spatial multi-scale resolution features by incorporating multi-scale sequence models and graph gating modules in the encoder and processor, respectively. Benchmark numerical experiments are presented to demonstrate that the proposed al…

    Submitted 7 February, 2023; originally announced February 2023.

  4. arXiv:2210.00513

    cs.LG stat.ML

    Gradient Gating for Deep Multi-Rate Learning on Graphs

    Authors: T. Konstantin Rusch, Benjamin P. Chamberlain, Michael W. Mahoney, Michael M. Bronstein, Siddhartha Mishra

    Abstract: We present Gradient Gating (G$^2$), a novel framework for improving the performance of Graph Neural Networks (GNNs). Our framework is based on gating the output of GNN layers with a mechanism for multi-rate flow of message passing information across nodes of the underlying graph. Local gradients are harnessed to further modulate message passing updates. Our framework flexibly allows one to use any…

    Submitted 15 March, 2023; v1 submitted 2 October, 2022; originally announced October 2022.

  5. arXiv:2202.02296

    cs.LG math.DS stat.ML

    Graph-Coupled Oscillator Networks

    Authors: T. Konstantin Rusch, Benjamin P. Chamberlain, James Rowbottom, Siddhartha Mishra, Michael M. Bronstein

    Abstract: We propose Graph-Coupled Oscillator Networks (GraphCON), a novel framework for deep learning on graphs. It is based on discretizations of a second-order system of ordinary differential equations (ODEs), which model a network of nonlinear controlled and damped oscillators, coupled via the adjacency structure of the underlying graph. The flexibility of our framework permits any basic GNN layer (e.g.…

    Submitted 23 June, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

    Comments: ICML 2022

  6. arXiv:2110.04744

    cs.LG math.DS stat.ML

    Long Expressive Memory for Sequence Modeling

    Authors: T. Konstantin Rusch, Siddhartha Mishra, N. Benjamin Erichson, Michael W. Mahoney

    Abstract: We propose a novel method called Long Expressive Memory (LEM) for learning long-term sequential dependencies. LEM is gradient-based, it can efficiently process sequential tasks with very long-term dependencies, and it is sufficiently expressive to be able to learn complicated input-output maps. To derive LEM, we consider a system of multiscale ordinary differential equations, as well as a suitable…

    Submitted 25 February, 2022; v1 submitted 10 October, 2021; originally announced October 2021.

    Comments: ICLR 2022

  7. arXiv:2103.05487

    cs.LG math.DS stat.ML

    UnICORNN: A recurrent model for learning very long time dependencies

    Authors: T. Konstantin Rusch, Siddhartha Mishra

    Abstract: The design of recurrent neural networks (RNNs) to accurately process sequential inputs with long-time dependencies is very challenging on account of the exploding and vanishing gradient problem. To overcome this, we propose a novel RNN architecture which is based on a structure preserving discretization of a Hamiltonian system of second-order ordinary differential equations that models networks of…

    Submitted 10 June, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

    Report number: PMLR 139:9168-9178, 2021

  8. arXiv:2010.00951

    cs.LG cs.NE stat.ML

    Coupled Oscillatory Recurrent Neural Network (coRNN): An accurate and (gradient) stable architecture for learning long time dependencies

    Authors: T. Konstantin Rusch, Siddhartha Mishra

    Abstract: Circuits of biological neurons, such as in the functional parts of the brain, can be modeled as networks of coupled oscillators. Inspired by the ability of these systems to express a rich set of outputs while keeping (gradients of) state variables bounded, we propose a novel architecture for recurrent neural networks. Our proposed RNN is based on a time-discretization of a system of second-order or…

    Submitted 14 March, 2021; v1 submitted 2 October, 2020; originally announced October 2020.

  9. arXiv:2005.12564

    cs.LG math.NA physics.flu-dyn stat.ML

    Enhancing accuracy of deep learning algorithms by training with low-discrepancy sequences

    Authors: Siddhartha Mishra, T. Konstantin Rusch

    Abstract: We propose a deep supervised learning algorithm based on low-discrepancy sequences as the training set. By a combination of theoretical arguments and extensive numerical experiments we demonstrate that the proposed algorithm significantly outperforms standard deep learning algorithms that are based on randomly chosen training data, for problems in moderately high dimensions. The proposed algorithm…

    Submitted 26 May, 2020; originally announced May 2020.

  10. arXiv:1911.05035   

    cs.LG math.NA math.OC stat.ML

    Constructing Gradient Controllable Recurrent Neural Networks Using Hamiltonian Dynamics

    Authors: Konstantin Rusch, John W. Pearson, Konstantinos C. Zygalakis

    Abstract: Recurrent neural networks (RNNs) have gained a great deal of attention in solving sequential learning problems. The learning of long-term dependencies, however, remains challenging due to the problem of a vanishing or exploding hidden states gradient. By exploring further the recently established connections between RNNs and dynamical systems we propose a novel RNN architecture, which we call a Ha…

    Submitted 16 March, 2020; v1 submitted 11 November, 2019; originally announced November 2019.

    Comments: Reasons: 1. The theoretical result of bounding the gradient dynamics is highly important when tackling the exploding gradient problem. However, we only proved the boundedness in one dimension and cannot generalize to the higher dimensional case, as the Hamiltonian argument is not valid in the general higher dimensional case. 2. Only medium-strong performance on the widely used sMNIST problem.