Search | arXiv e-print repository

Rewiring Networks for Graph Neural Network Training Using Discrete Geometry

Authors: Jakub Bober, Anthea Monod, Emil Saucan, Kevin N. Webster

Abstract: Information over-squashing is a phenomenon of inefficient information propagation between distant nodes on networks. It is an important problem that is known to significantly impact the training of graph neural networks (GNNs), as the receptive field of a node grows exponentially. To mitigate this problem, a preprocessing procedure known as rewiring is often applied to the input network. In this paper, we investigate the use of discrete analogues of classical geometric notions of curvature to model information flow on networks and rewire them. We show that these classical notions achieve state-of-the-art performance in GNN training accuracy on a variety of real-world network datasets. Moreover, compared to the current state-of-the-art, these classical notions exhibit a clear advantage in computational runtime by several orders of magnitude.

Submitted 16 July, 2022; originally announced July 2022.

Comments: 21 pages, 8 figures, 7 tables

arXiv:2204.12323

Learning reversible symplectic dynamics

Authors: Riccardo Valperga, Kevin Webster, Victoria Klein, Dmitry Turaev, Jeroen S. W. Lamb

Abstract: Time-reversal symmetry arises naturally as a structural property in many dynamical systems of interest. While the importance of hard-wiring symmetry is increasingly recognized in machine learning, to date this has eluded time-reversibility. In this paper we propose a new neural network architecture for learning time-reversible dynamical systems from data. We focus in particular on an adaptation to symplectic systems, because of their importance in physics-informed learning.

Submitted 26 April, 2022; originally announced April 2022.

Comments: Published at the 4th Annual Learning for Dynamics & Control Conference

arXiv:2111.11979

Measurement That Matches Theory: Theory-Driven Identification in IRT Models

Authors: Marco Morucci, Margaret Foster, Kaitlyn Webster, So ** Lee, David Siegel

Abstract: Measurement bridges theory and empirics. Without measures that appropriately capture theoretical concepts, description will fail to represent reality and true causal inference will be impossible. Yet, the social sciences traffic in complex concepts and their measurement is difficult. Item Response Theory (IRT) models reduce variation in multiple variables to continuous variation along one or more latent dimensions intended to capture key theoretical concepts. Unfortunately, those latent dimensions have no intrinsic conceptual meaning. Partial solutions to that problem include limiting the number of dimensions to one or assigning meaning post-analysis, but either can lead to potential bias and a lack of reliability across data sources. We propose, detail, and validate a semi-supervised approach employing Bayesian Item Response Theory on multiple latent dimensions and binary data. Our approach, which we validate on simulated and real data, yields conceptually meaningful latent dimensions that are reliable across different data sources without additional exogenous assumptions.

Submitted 28 May, 2024; v1 submitted 23 November, 2021; originally announced November 2021.

arXiv:2011.03395

Underspecification Presents Challenges for Credibility in Modern Machine Learning

Authors: Alexander D'Amour, Katherine Heller, Dan Moldovan, Ben Adlam, Babak Alipanahi, Alex Beutel, Christina Chen, Jonathan Deaton, Jacob Eisenstein, Matthew D. Hoffman, Farhad Hormozdiari, Neil Houlsby, Shaobo Hou, Ghassen Jerfel, Alan Karthikesalingam, Mario Lucic, Yian Ma, Cory McLean, Diana Mincu, Akinori Mitani, Andrea Montanari, Zachary Nado, Vivek Natarajan, Christopher Nielson, Thomas F. Osborne, et al. (15 additional authors not shown)

Abstract: ML models often exhibit unexpectedly poor behavior when they are deployed in real-world domains. We identify underspecification as a key reason for these failures. An ML pipeline is underspecified when it can return many predictors with equivalently strong held-out performance in the training domain. Underspecification is common in modern ML pipelines, such as those based on deep learning. Predictors returned by underspecified pipelines are often treated as equivalent based on their training domain performance, but we show here that such predictors can behave very differently in deployment domains. This ambiguity can lead to instability and poor model behavior in practice, and is a distinct failure mode from previously identified issues arising from structural mismatch between training and deployment domains. We show that this problem appears in a wide variety of practical ML pipelines, using examples from computer vision, medical imaging, natural language processing, clinical risk prediction based on electronic health records, and medical genomics. Our results show the need to explicitly account for underspecification in modeling pipelines that are intended for real-world deployment in any domain.

Submitted 24 November, 2020; v1 submitted 6 November, 2020; originally announced November 2020.

Comments: Updates: Updated statistical analysis in Section 6; Additional citations

Showing 1–4 of 4 results for author: Webster, K