Search | arXiv e-print repository

On the Limitations of Fractal Dimension as a Measure of Generalization

Authors: Charlie Tan, Inés García-Redondo, Qiquan Wang, Michael M. Bronstein, Anthea Monod

Abstract: Bounding and predicting the generalization gap of overparameterized neural networks remains a central open problem in theoretical machine learning. Neural network optimization trajectories have been proposed to possess fractal structure, leading to bounds and generalization measures based on notions of fractal dimension on these trajectories. Prominently, both the Hausdorff dimension and the persi… ▽ More Bounding and predicting the generalization gap of overparameterized neural networks remains a central open problem in theoretical machine learning. Neural network optimization trajectories have been proposed to possess fractal structure, leading to bounds and generalization measures based on notions of fractal dimension on these trajectories. Prominently, both the Hausdorff dimension and the persistent homology dimension have been proposed to correlate with generalization gap, thus serving as a measure of generalization. This work performs an extended evaluation of these topological generalization measures. We demonstrate that fractal dimension fails to predict generalization of models trained from poor initializations. We further identify that the $\ell^2$ norm of the final parameter iterate, one of the simplest complexity measures in learning theory, correlates more strongly with the generalization gap than these notions of fractal dimension. Finally, our study reveals the intriguing manifestation of model-wise double descent in persistent homology-based generalization measures. This work lays the ground for a deeper investigation of the causal relationships between fractal geometry, topological data analysis, and neural network optimization. △ Less

Submitted 4 June, 2024; originally announced June 2024.

Comments: 17 pages, 6 figures

arXiv:2405.15059 [pdf, other]

Message-Passing Monte Carlo: Generating low-discrepancy point sets via Graph Neural Networks

Authors: T. Konstantin Rusch, Nathan Kirk, Michael M. Bronstein, Christiane Lemieux, Daniela Rus

Abstract: Discrepancy is a well-known measure for the irregularity of the distribution of a point set. Point sets with small discrepancy are called low-discrepancy and are known to efficiently fill the space in a uniform manner. Low-discrepancy points play a central role in many problems in science and engineering, including numerical integration, computer vision, machine perception, computer graphics, mach… ▽ More Discrepancy is a well-known measure for the irregularity of the distribution of a point set. Point sets with small discrepancy are called low-discrepancy and are known to efficiently fill the space in a uniform manner. Low-discrepancy points play a central role in many problems in science and engineering, including numerical integration, computer vision, machine perception, computer graphics, machine learning, and simulation. In this work, we present the first machine learning approach to generate a new class of low-discrepancy point sets named Message-Passing Monte Carlo (MPMC) points. Motivated by the geometric nature of generating low-discrepancy point sets, we leverage tools from Geometric Deep Learning and base our model on Graph Neural Networks. We further provide an extension of our framework to higher dimensions, which flexibly allows the generation of custom-made points that emphasize the uniformity in specific dimensions that are primarily important for the particular problem at hand. Finally, we demonstrate that our proposed model achieves state-of-the-art performance superior to previous methods by a significant margin. In fact, MPMC points are empirically shown to be either optimal or near-optimal with respect to the discrepancy for every dimension and the number of points for which the optimal discrepancy can be determined. △ Less

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.13806 [pdf, other]

Advancing Graph Convolutional Networks via General Spectral Wavelets

Authors: Nian Liu, Xiaoxin He, Thomas Laurent, Francesco Di Giovanni, Michael M. Bronstein, Xavier Bresson

Abstract: Spectral graph convolution, an important tool of data filtering on graphs, relies on two essential decisions; selecting spectral bases for signal transformation and parameterizing the kernel for frequency analysis. While recent techniques mainly focus on standard Fourier transform and vector-valued spectral functions, they fall short in flexibility to describe specific signal distribution for each… ▽ More Spectral graph convolution, an important tool of data filtering on graphs, relies on two essential decisions; selecting spectral bases for signal transformation and parameterizing the kernel for frequency analysis. While recent techniques mainly focus on standard Fourier transform and vector-valued spectral functions, they fall short in flexibility to describe specific signal distribution for each node, and expressivity of spectral function. In this paper, we present a novel wavelet-based graph convolution network, namely WaveGC, which integrates multi-resolution spectral bases and a matrix-valued filter kernel. Theoretically, we establish that WaveGC can effectively capture and decouple short-range and long-range information, providing superior filtering flexibility, surpassing existing graph convolutional networks and graph Transformers (GTs). To instantiate WaveGC, we introduce a novel technique for learning general graph wavelets by separately combining odd and even terms of Chebyshev polynomials. This approach strictly satisfies wavelet admissibility criteria. Our numerical experiments showcase the capabilities of the new network. By replacing the Transformer part in existing architectures with WaveGC, we consistently observe improvements in both short-range and long-range tasks. This underscores the effectiveness of the proposed model in handling different scenarios. Our code is available at https://github.com/liun-online/WaveGC. △ Less

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2402.08480 [pdf, other]

Revealing Decurve Flows for Generalized Graph Propagation

Authors: Chen Lin, Liheng Ma, Yiyang Chen, Wanli Ouyang, Michael M. Bronstein, Philip H. S. Torr

Abstract: This study addresses the limitations of the traditional analysis of message-passing, central to graph learning, by defining {\em \textbf{generalized propagation}} with directed and weighted graphs. The significance manifest in two ways. \textbf{Firstly}, we propose {\em Generalized Propagation Neural Networks} (\textbf{GPNNs}), a framework that unifies most propagation-based graph neural networks.… ▽ More This study addresses the limitations of the traditional analysis of message-passing, central to graph learning, by defining {\em \textbf{generalized propagation}} with directed and weighted graphs. The significance manifest in two ways. \textbf{Firstly}, we propose {\em Generalized Propagation Neural Networks} (\textbf{GPNNs}), a framework that unifies most propagation-based graph neural networks. By generating directed-weighted propagation graphs with adjacency function and connectivity function, GPNNs offer enhanced insights into attention mechanisms across various graph models. We delve into the trade-offs within the design space with empirical experiments and emphasize the crucial role of the adjacency function for model expressivity via theoretical analysis. \textbf{Secondly}, we propose the {\em Continuous Unified Ricci Curvature} (\textbf{CURC}), an extension of celebrated {\em Ollivier-Ricci Curvature} for directed and weighted graphs. Theoretically, we demonstrate that CURC possesses continuity, scale invariance, and a lower bound connection with the Dirichlet isoperimetric constant validating bottleneck analysis for GPNNs. We include a preliminary exploration of learned propagation patterns in datasets, a first in the field. We observe an intriguing ``{\em \textbf{decurve flow}}'' - a curvature reduction during training for models with learnable propagation, revealing the evolution of propagation over time and a deeper connection to over-smoothing and bottleneck trade-off. △ Less

Submitted 13 February, 2024; originally announced February 2024.

Comments: 15 pages, 4 figures

arXiv:2402.04062 [pdf, other]

Link Prediction with Relational Hypergraphs

Authors: Xingyue Huang, Miguel Romero Orth, Pablo Barceló, Michael M. Bronstein, İsmail İlkan Ceylan

Abstract: Link prediction with knowledge graphs has been thoroughly studied in graph machine learning, leading to a rich landscape of graph neural network architectures with successful applications. Nonetheless, it remains challenging to transfer the success of these architectures to relational hypergraphs, where the task of link prediction is over $k$-ary relations, which is substantially harder than link… ▽ More Link prediction with knowledge graphs has been thoroughly studied in graph machine learning, leading to a rich landscape of graph neural network architectures with successful applications. Nonetheless, it remains challenging to transfer the success of these architectures to relational hypergraphs, where the task of link prediction is over $k$-ary relations, which is substantially harder than link prediction with knowledge graphs. In this paper, we propose a framework for link prediction with relational hypergraphs, unlocking applications of graph neural networks to fully relational structures. Theoretically, we conduct a thorough analysis of the expressive power of the resulting model architectures via corresponding relational Weisfeiler-Leman algorithms and also via logical expressiveness. Empirically, we validate the power of the proposed model architectures on various relational hypergraph benchmarks. The resulting model architectures substantially outperform every baseline for inductive link prediction, and lead to state-of-the-art results for transductive link prediction. △ Less

Submitted 23 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

arXiv:2401.04301 [pdf, other]

Setting the Record Straight on Transformer Oversmoothing

Authors: Gbètondji J-S Dovonon, Michael M. Bronstein, Matt J. Kusner

Abstract: Transformer-based models have recently become wildly successful across a diverse set of domains. At the same time, recent work has argued that Transformers are inherently low-pass filters that gradually oversmooth the inputs. This is worrisome as it limits generalization, especially as model depth increases. A natural question is: How can Transformers achieve these successes given this shortcoming… ▽ More Transformer-based models have recently become wildly successful across a diverse set of domains. At the same time, recent work has argued that Transformers are inherently low-pass filters that gradually oversmooth the inputs. This is worrisome as it limits generalization, especially as model depth increases. A natural question is: How can Transformers achieve these successes given this shortcoming? In this work we show that in fact Transformers are not inherently low-pass filters. Instead, whether Transformers oversmooth or not depends on the eigenspectrum of their update equations. Further, depending on the task, smoothing does not harm generalization as model depth increases. Our analysis extends prior work in oversmoothing and in the closely-related phenomenon of rank collapse. Based on this analysis, we derive a simple way to parameterize the weights of the Transformer update equations that allows for control over its filtering behavior. For image classification tasks we show that smoothing, instead of sharpening, can improve generalization. Whereas for text generation tasks Transformers that are forced to either smooth or sharpen have worse generalization. We hope that this work gives ML researchers and practitioners additional insight and leverage when develo** future Transformer models. △ Less

Submitted 13 February, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

arXiv:2306.03589 [pdf, other]

How does over-squashing affect the power of GNNs?

Authors: Francesco Di Giovanni, T. Konstantin Rusch, Michael M. Bronstein, Andreea Deac, Marc Lackenby, Siddhartha Mishra, Petar Veličković

Abstract: Graph Neural Networks (GNNs) are the state-of-the-art model for machine learning on graph-structured data. The most popular class of GNNs operate by exchanging information between adjacent nodes, and are known as Message Passing Neural Networks (MPNNs). Given their widespread use, understanding the expressive power of MPNNs is a key question. However, existing results typically consider settings w… ▽ More Graph Neural Networks (GNNs) are the state-of-the-art model for machine learning on graph-structured data. The most popular class of GNNs operate by exchanging information between adjacent nodes, and are known as Message Passing Neural Networks (MPNNs). Given their widespread use, understanding the expressive power of MPNNs is a key question. However, existing results typically consider settings with uninformative node features. In this paper, we provide a rigorous analysis to determine which function classes of node features can be learned by an MPNN of a given capacity. We do so by measuring the level of pairwise interactions between nodes that MPNNs allow for. This measure provides a novel quantitative characterization of the so-called over-squashing effect, which is observed to occur when a large volume of messages is aggregated into fixed-size vectors. Using our measure, we prove that, to guarantee sufficient communication between pairs of nodes, the capacity of the MPNN must be large enough, depending on properties of the input graph structure, such as commute times. For many relevant scenarios, our analysis results in impossibility statements in practice, showing that over-squashing hinders the expressive power of MPNNs. We validate our theoretical findings through extensive controlled experiments and ablation studies. △ Less

Submitted 12 February, 2024; v1 submitted 6 June, 2023; originally announced June 2023.

Comments: 36 pages; Published in Transactions on Machine Learning Research (TMLR)

arXiv:2305.09605 [pdf, other]

To smooth a cloud or to pin it down: Guarantees and Insights on Score Matching in Denoising Diffusion Models

Authors: Francisco Vargas, Teodora Reu, Anna Kerekes, Michael M Bronstein

Abstract: Denoising diffusion models are a class of generative models which have recently achieved state-of-the-art results across many domains. Gradual noise is added to the data using a diffusion process, which transforms the data distribution into a Gaussian. Samples from the generative model are then obtained by simulating an approximation of the time reversal of this diffusion initialized by Gaussian s… ▽ More Denoising diffusion models are a class of generative models which have recently achieved state-of-the-art results across many domains. Gradual noise is added to the data using a diffusion process, which transforms the data distribution into a Gaussian. Samples from the generative model are then obtained by simulating an approximation of the time reversal of this diffusion initialized by Gaussian samples. Recent research has explored adapting diffusion models for sampling and inference tasks. In this paper, we leverage known connections to stochastic control akin to the Föllmer drift to extend established neural network approximation results for the Föllmer drift to denoising diffusion models and samplers. △ Less

Submitted 26 June, 2024; v1 submitted 16 May, 2023; originally announced May 2023.

Comments: arXiv admin note: text overlap with arXiv:1903.01608 by other authors

arXiv:2303.10993 [pdf, other]

A Survey on Oversmoothing in Graph Neural Networks

Authors: T. Konstantin Rusch, Michael M. Bronstein, Siddhartha Mishra

Abstract: Node features of graph neural networks (GNNs) tend to become more similar with the increase of the network depth. This effect is known as over-smoothing, which we axiomatically define as the exponential convergence of suitable similarity measures on the node features. Our definition unifies previous approaches and gives rise to new quantitative measures of over-smoothing. Moreover, we empirically… ▽ More Node features of graph neural networks (GNNs) tend to become more similar with the increase of the network depth. This effect is known as over-smoothing, which we axiomatically define as the exponential convergence of suitable similarity measures on the node features. Our definition unifies previous approaches and gives rise to new quantitative measures of over-smoothing. Moreover, we empirically demonstrate this behavior for several over-smoothing measures on different graphs (small-, medium-, and large-scale). We also review several approaches for mitigating over-smoothing and empirically test their effectiveness on real-world graph datasets. Through illustrative examples, we demonstrate that mitigating over-smoothing is a necessary but not sufficient condition for building deep GNNs that are expressive on a wide range of graph learning tasks. Finally, we extend our definition of over-smoothing to the rapidly emerging field of continuous-time GNNs. △ Less

Submitted 20 March, 2023; originally announced March 2023.

arXiv:2210.00513 [pdf, other]

Gradient Gating for Deep Multi-Rate Learning on Graphs

Authors: T. Konstantin Rusch, Benjamin P. Chamberlain, Michael W. Mahoney, Michael M. Bronstein, Siddhartha Mishra

Abstract: We present Gradient Gating (G$^2$), a novel framework for improving the performance of Graph Neural Networks (GNNs). Our framework is based on gating the output of GNN layers with a mechanism for multi-rate flow of message passing information across nodes of the underlying graph. Local gradients are harnessed to further modulate message passing updates. Our framework flexibly allows one to use any… ▽ More We present Gradient Gating (G$^2$), a novel framework for improving the performance of Graph Neural Networks (GNNs). Our framework is based on gating the output of GNN layers with a mechanism for multi-rate flow of message passing information across nodes of the underlying graph. Local gradients are harnessed to further modulate message passing updates. Our framework flexibly allows one to use any basic GNN layer as a wrapper around which the multi-rate gradient gating mechanism is built. We rigorously prove that G$^2$ alleviates the oversmoothing problem and allows the design of deep GNNs. Empirical results are presented to demonstrate that the proposed framework achieves state-of-the-art performance on a variety of graph learning tasks, including on large-scale heterophilic graphs. △ Less

Submitted 15 March, 2023; v1 submitted 2 October, 2022; originally announced October 2022.

arXiv:2209.15486 [pdf, other]

Graph Neural Networks for Link Prediction with Subgraph Sketching

Authors: Benjamin Paul Chamberlain, Sergey Shirobokov, Emanuele Rossi, Fabrizio Frasca, Thomas Markovich, Nils Hammerla, Michael M. Bronstein, Max Hansmire

Abstract: Many Graph Neural Networks (GNNs) perform poorly compared to simple heuristics on Link Prediction (LP) tasks. This is due to limitations in expressive power such as the inability to count triangles (the backbone of most LP heuristics) and because they can not distinguish automorphic nodes (those having identical structural roles). Both expressiveness issues can be alleviated by learning link (rath… ▽ More Many Graph Neural Networks (GNNs) perform poorly compared to simple heuristics on Link Prediction (LP) tasks. This is due to limitations in expressive power such as the inability to count triangles (the backbone of most LP heuristics) and because they can not distinguish automorphic nodes (those having identical structural roles). Both expressiveness issues can be alleviated by learning link (rather than node) representations and incorporating structural features such as triangle counts. Since explicit link representations are often prohibitively expensive, recent works resorted to subgraph-based methods, which have achieved state-of-the-art performance for LP, but suffer from poor efficiency due to high levels of redundancy between subgraphs. We analyze the components of subgraph GNN (SGNN) methods for link prediction. Based on our analysis, we propose a novel full-graph GNN called ELPH (Efficient Link Prediction with Hashing) that passes subgraph sketches as messages to approximate the key components of SGNNs without explicit subgraph construction. ELPH is provably more expressive than Message Passing GNNs (MPNNs). It outperforms existing SGNN models on many standard LP benchmarks while being orders of magnitude faster. However, it shares the common GNN limitation that it is only efficient when the dataset fits in GPU memory. Accordingly, we develop a highly scalable model, called BUDDY, which uses feature precomputation to circumvent this limitation without sacrificing predictive performance. Our experiments show that BUDDY also outperforms SGNNs on standard LP benchmarks while being highly scalable and faster than ELPH. △ Less

Submitted 2 May, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

Comments: 29 pages, 19 figures, 6 appendices

Journal ref: The Eleventh International Conference on Learning Representations 2023 (oral - top 5%)

arXiv:2206.11140 [pdf, other]

Understanding and Extending Subgraph GNNs by Rethinking Their Symmetries

Authors: Fabrizio Frasca, Beatrice Bevilacqua, Michael M. Bronstein, Haggai Maron

Abstract: Subgraph GNNs are a recent class of expressive Graph Neural Networks (GNNs) which model graphs as collections of subgraphs. So far, the design space of possible Subgraph GNN architectures as well as their basic theoretical properties are still largely unexplored. In this paper, we study the most prominent form of subgraph methods, which employs node-based subgraph selection policies such as ego-ne… ▽ More Subgraph GNNs are a recent class of expressive Graph Neural Networks (GNNs) which model graphs as collections of subgraphs. So far, the design space of possible Subgraph GNN architectures as well as their basic theoretical properties are still largely unexplored. In this paper, we study the most prominent form of subgraph methods, which employs node-based subgraph selection policies such as ego-networks or node marking and deletion. We address two central questions: (1) What is the upper-bound of the expressive power of these methods? and (2) What is the family of equivariant message passing layers on these sets of subgraphs?. Our first step in answering these questions is a novel symmetry analysis which shows that modelling the symmetries of node-based subgraph collections requires a significantly smaller symmetry group than the one adopted in previous works. This analysis is then used to establish a link between Subgraph GNNs and Invariant Graph Networks (IGNs). We answer the questions above by first bounding the expressive power of subgraph methods by 3-WL, and then proposing a general family of message-passing layers for subgraph methods that generalises all previous node-based Subgraph GNNs. Finally, we design a novel Subgraph GNN dubbed SUN, which theoretically unifies previous architectures while providing better empirical performance on multiple benchmarks. △ Less

Submitted 13 October, 2022; v1 submitted 22 June, 2022; originally announced June 2022.

Comments: NeurIPS 2022, Camera Ready; 48 pages, 7 figures

arXiv:2206.10991 [pdf, other]

Understanding convolution on graphs via energies

Authors: Francesco Di Giovanni, James Rowbottom, Benjamin P. Chamberlain, Thomas Markovich, Michael M. Bronstein

Abstract: Graph Neural Networks (GNNs) typically operate by message-passing, where the state of a node is updated based on the information received from its neighbours. Most message-passing models act as graph convolutions, where features are mixed by a shared, linear transformation before being propagated over the edges. On node-classification tasks, graph convolutions have been shown to suffer from two li… ▽ More Graph Neural Networks (GNNs) typically operate by message-passing, where the state of a node is updated based on the information received from its neighbours. Most message-passing models act as graph convolutions, where features are mixed by a shared, linear transformation before being propagated over the edges. On node-classification tasks, graph convolutions have been shown to suffer from two limitations: poor performance on heterophilic graphs, and over-smoothing. It is common belief that both phenomena occur because such models behave as low-pass filters, meaning that the Dirichlet energy of the features decreases along the layers incurring a smoothing effect that ultimately makes features no longer distinguishable. In this work, we rigorously prove that simple graph-convolutional models can actually enhance high frequencies and even lead to an asymptotic behaviour we refer to as over-sharpening, opposite to over-smoothing. We do so by showing that linear graph convolutions with symmetric weights minimize a multi-particle energy that generalizes the Dirichlet energy; in this setting, the weight matrices induce edge-wise attraction (repulsion) through their positive (negative) eigenvalues, thereby controlling whether the features are being smoothed or sharpened. We also extend the analysis to non-linear GNNs, and demonstrate that some existing time-continuous GNNs are instead always dominated by the low frequencies. Finally, we validate our theoretical findings through ablations and real-world experiments. △ Less

Submitted 6 September, 2023; v1 submitted 22 June, 2022; originally announced June 2022.

Comments: Accepted at TMLR; First two authors equal contribution; 35 pages

arXiv:2206.08119 [pdf, other]

Learning to Infer Structures of Network Games

Authors: Emanuele Rossi, Federico Monti, Yan Leng, Michael M. Bronstein, Xiaowen Dong

Abstract: Strategic interactions between a group of individuals or organisations can be modelled as games played on networks, where a player's payoff depends not only on their actions but also on those of their neighbours. Inferring the network structure from observed game outcomes (equilibrium actions) is an important problem with numerous potential applications in economics and social sciences. Existing m… ▽ More Strategic interactions between a group of individuals or organisations can be modelled as games played on networks, where a player's payoff depends not only on their actions but also on those of their neighbours. Inferring the network structure from observed game outcomes (equilibrium actions) is an important problem with numerous potential applications in economics and social sciences. Existing methods mostly require the knowledge of the utility function associated with the game, which is often unrealistic to obtain in real-world scenarios. We adopt a transformer-like architecture which correctly accounts for the symmetries of the problem and learns a map** from the equilibrium actions to the network structure of the game without explicit knowledge of the utility function. We test our method on three different types of network games using both synthetic and real-world data, and demonstrate its effectiveness in network structure inference and superior performance over existing methods. △ Less

Submitted 18 August, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

arXiv:2205.00354 [pdf, ps, other]

Graph Anisotropic Diffusion

Authors: Ahmed A. A. Elhag, Gabriele Corso, Hannes Stärk, Michael M. Bronstein

Abstract: Traditional Graph Neural Networks (GNNs) rely on message passing, which amounts to permutation-invariant local aggregation of neighbour features. Such a process is isotropic and there is no notion of `direction' on the graph. We present a new GNN architecture called Graph Anisotropic Diffusion. Our model alternates between linear diffusion, for which a closed-form solution is available, and local… ▽ More Traditional Graph Neural Networks (GNNs) rely on message passing, which amounts to permutation-invariant local aggregation of neighbour features. Such a process is isotropic and there is no notion of `direction' on the graph. We present a new GNN architecture called Graph Anisotropic Diffusion. Our model alternates between linear diffusion, for which a closed-form solution is available, and local anisotropic filters to obtain efficient multi-hop anisotropic kernels. We test our model on two common molecular property prediction benchmarks (ZINC and QM9) and show its competitive performance. △ Less

Submitted 30 April, 2022; originally announced May 2022.

Comments: 10 pages, 3 figures, Published at the GTRL and MLDD workshops, ICLR 2022

arXiv:2204.00323 [pdf, other]

Graph-in-Graph (GiG): Learning interpretable latent graphs in non-Euclidean domain for biological and healthcare applications

Authors: Kamilia Mullakaeva, Luca Cosmo, Anees Kazi, Seyed-Ahmad Ahmadi, Nassir Navab, Michael M. Bronstein

Abstract: Graphs are a powerful tool for representing and analyzing unstructured, non-Euclidean data ubiquitous in the healthcare domain. Two prominent examples are molecule property prediction and brain connectome analysis. Importantly, recent works have shown that considering relationships between input data samples have a positive regularizing effect for the downstream task in healthcare applications. Th… ▽ More Graphs are a powerful tool for representing and analyzing unstructured, non-Euclidean data ubiquitous in the healthcare domain. Two prominent examples are molecule property prediction and brain connectome analysis. Importantly, recent works have shown that considering relationships between input data samples have a positive regularizing effect for the downstream task in healthcare applications. These relationships are naturally modeled by a (possibly unknown) graph structure between input samples. In this work, we propose Graph-in-Graph (GiG), a neural network architecture for protein classification and brain imaging applications that exploits the graph representation of the input data samples and their latent relation. We assume an initially unknown latent-graph structure between graph-valued input data and propose to learn end-to-end a parametric model for message passing within and across input graph samples, along with the latent structure connecting the input graphs. Further, we introduce a degree distribution loss that helps regularize the predicted latent relationships structure. This regularization can significantly improve the downstream task. Moreover, the obtained latent graph can represent patient population models or networks of molecule clusters, providing a level of interpretability and knowledge discovery in the input domain of particular value in healthcare. △ Less

Submitted 1 April, 2022; originally announced April 2022.

arXiv:2202.04579 [pdf, other]

Neural Sheaf Diffusion: A Topological Perspective on Heterophily and Oversmoothing in GNNs

Authors: Cristian Bodnar, Francesco Di Giovanni, Benjamin Paul Chamberlain, Pietro Liò, Michael M. Bronstein

Abstract: Cellular sheaves equip graphs with a "geometrical" structure by assigning vector spaces and linear maps to nodes and edges. Graph Neural Networks (GNNs) implicitly assume a graph with a trivial underlying sheaf. This choice is reflected in the structure of the graph Laplacian operator, the properties of the associated diffusion equation, and the characteristics of the convolutional models that dis… ▽ More Cellular sheaves equip graphs with a "geometrical" structure by assigning vector spaces and linear maps to nodes and edges. Graph Neural Networks (GNNs) implicitly assume a graph with a trivial underlying sheaf. This choice is reflected in the structure of the graph Laplacian operator, the properties of the associated diffusion equation, and the characteristics of the convolutional models that discretise this equation. In this paper, we use cellular sheaf theory to show that the underlying geometry of the graph is deeply linked with the performance of GNNs in heterophilic settings and their oversmoothing behaviour. By considering a hierarchy of increasingly general sheaves, we study how the ability of the sheaf diffusion process to achieve linear separation of the classes in the infinite time limit expands. At the same time, we prove that when the sheaf is non-trivial, discretised parametric diffusion processes have greater control than GNNs over their asymptotic behaviour. On the practical side, we study how sheaves can be learned from data. The resulting sheaf diffusion models have many desirable properties that address the limitations of classical graph diffusion equations (and corresponding GNN models) and obtain competitive results in heterophilic settings. Overall, our work provides new connections between GNNs and algebraic topology and would be of interest to both fields. △ Less

Submitted 6 January, 2023; v1 submitted 9 February, 2022; originally announced February 2022.

Comments: Accepted to NeurIPS 2022. Contains 29 pages, 10 figures

arXiv:2202.04202 [pdf, other]

RECOVER: sequential model optimization platform for combination drug repurposing identifies novel synergistic compounds in vitro

Authors: Paul Bertin, Jarrid Rector-Brooks, Deepak Sharma, Thomas Gaudelet, Andrew Anighoro, Torsten Gross, Francisco Martinez-Pena, Eileen L. Tang, Suraj M S, Cristian Regep, Jeremy Hayter, Maksym Korablyov, Nicholas Valiante, Almer van der Sloot, Mike Tyers, Charles Roberts, Michael M. Bronstein, Luke L. Lairson, Jake P. Taylor-King, Yoshua Bengio

Abstract: For large libraries of small molecules, exhaustive combinatorial chemical screens become infeasible to perform when considering a range of disease models, assay conditions, and dose ranges. Deep learning models have achieved state of the art results in silico for the prediction of synergy scores. However, databases of drug combinations are biased towards synergistic agents and these results do not… ▽ More For large libraries of small molecules, exhaustive combinatorial chemical screens become infeasible to perform when considering a range of disease models, assay conditions, and dose ranges. Deep learning models have achieved state of the art results in silico for the prediction of synergy scores. However, databases of drug combinations are biased towards synergistic agents and these results do not necessarily generalise out of distribution. We employ a sequential model optimization search utilising a deep learning model to quickly discover synergistic drug combinations active against a cancer cell line, requiring substantially less screening than an exhaustive evaluation. Our small scale wet lab experiments only account for evaluation of ~5% of the total search space. After only 3 rounds of ML-guided in vitro experimentation (including a calibration round), we find that the set of drug pairs queried is enriched for highly synergistic combinations; two additional rounds of ML-guided experiments were performed to ensure reproducibility of trends. Remarkably, we rediscover drug combinations later confirmed to be under study within clinical trials. Moreover, we find that drug embeddings generated using only structural information begin to reflect mechanisms of action. Prior in silico benchmarking suggests we can enrich search queries by a factor of ~5-10x for highly synergistic drug combinations by using sequential rounds of evaluation when compared to random selection, or by a factor of >3x when using a pretrained model selecting all drug combinations at a single time point. △ Less

Submitted 2 March, 2023; v1 submitted 6 February, 2022; originally announced February 2022.

arXiv:2202.02296 [pdf, other]

Graph-Coupled Oscillator Networks

Authors: T. Konstantin Rusch, Benjamin P. Chamberlain, James Rowbottom, Siddhartha Mishra, Michael M. Bronstein

Abstract: We propose Graph-Coupled Oscillator Networks (GraphCON), a novel framework for deep learning on graphs. It is based on discretizations of a second-order system of ordinary differential equations (ODEs), which model a network of nonlinear controlled and damped oscillators, coupled via the adjacency structure of the underlying graph. The flexibility of our framework permits any basic GNN layer (e.g.… ▽ More We propose Graph-Coupled Oscillator Networks (GraphCON), a novel framework for deep learning on graphs. It is based on discretizations of a second-order system of ordinary differential equations (ODEs), which model a network of nonlinear controlled and damped oscillators, coupled via the adjacency structure of the underlying graph. The flexibility of our framework permits any basic GNN layer (e.g. convolutional or attentional) as the coupling function, from which a multi-layer deep neural network is built up via the dynamics of the proposed ODEs. We relate the oversmoothing problem, commonly encountered in GNNs, to the stability of steady states of the underlying ODE and show that zero-Dirichlet energy steady states are not stable for our proposed ODEs. This demonstrates that the proposed framework mitigates the oversmoothing problem. Moreover, we prove that GraphCON mitigates the exploding and vanishing gradients problem to facilitate training of deep multi-layer GNNs. Finally, we show that our approach offers competitive performance with respect to the state-of-the-art on a variety of graph-based learning tasks. △ Less

Submitted 23 June, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

Comments: ICML 2022

arXiv:2111.14522 [pdf, other]

Understanding over-squashing and bottlenecks on graphs via curvature

Authors: Jake Top**, Francesco Di Giovanni, Benjamin Paul Chamberlain, Xiaowen Dong, Michael M. Bronstein

Abstract: Most graph neural networks (GNNs) use the message passing paradigm, in which node features are propagated on the input graph. Recent works pointed to the distortion of information flowing from distant nodes as a factor limiting the efficiency of message passing for tasks relying on long-distance interactions. This phenomenon, referred to as 'over-squashing', has been heuristically attributed to gr… ▽ More Most graph neural networks (GNNs) use the message passing paradigm, in which node features are propagated on the input graph. Recent works pointed to the distortion of information flowing from distant nodes as a factor limiting the efficiency of message passing for tasks relying on long-distance interactions. This phenomenon, referred to as 'over-squashing', has been heuristically attributed to graph bottlenecks where the number of $k$-hop neighbors grows rapidly with $k$. We provide a precise description of the over-squashing phenomenon in GNNs and analyze how it arises from bottlenecks in the graph. For this purpose, we introduce a new edge-based combinatorial curvature and prove that negatively curved edges are responsible for the over-squashing issue. We also propose and experimentally test a curvature-based graph rewiring method to alleviate the over-squashing. △ Less

Submitted 12 November, 2022; v1 submitted 29 November, 2021; originally announced November 2021.

Comments: First two authors equal contribution; outstanding paper honorable mention at ICLR22; 30 pages. v2: corrected Corollary 3 and rephrased in terms of graph diameter. Any other result is unaffected

arXiv:2110.09443 [pdf, other]

Beltrami Flow and Neural Diffusion on Graphs

Authors: Benjamin Paul Chamberlain, James Rowbottom, Davide Eynard, Francesco Di Giovanni, Xiaowen Dong, Michael M Bronstein

Abstract: We propose a novel class of graph neural networks based on the discretised Beltrami flow, a non-Euclidean diffusion PDE. In our model, node features are supplemented with positional encodings derived from the graph topology and jointly evolved by the Beltrami flow, producing simultaneously continuous feature learning and topology evolution. The resulting model generalises many popular graph neural… ▽ More We propose a novel class of graph neural networks based on the discretised Beltrami flow, a non-Euclidean diffusion PDE. In our model, node features are supplemented with positional encodings derived from the graph topology and jointly evolved by the Beltrami flow, producing simultaneously continuous feature learning and topology evolution. The resulting model generalises many popular graph neural networks and achieves state-of-the-art results on several benchmarks. △ Less

Submitted 18 October, 2021; originally announced October 2021.

Comments: 21 pages, 5 figures. Proceedings of the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS) 2021

arXiv:2110.02910 [pdf, other]

Equivariant Subgraph Aggregation Networks

Authors: Beatrice Bevilacqua, Fabrizio Frasca, Derek Lim, Balasubramaniam Srinivasan, Chen Cai, Gopinath Balamurugan, Michael M. Bronstein, Haggai Maron

Abstract: Message-passing neural networks (MPNNs) are the leading architecture for deep learning on graph-structured data, in large part due to their simplicity and scalability. Unfortunately, it was shown that these architectures are limited in their expressive power. This paper proposes a novel framework called Equivariant Subgraph Aggregation Networks (ESAN) to address this issue. Our main observation is… ▽ More Message-passing neural networks (MPNNs) are the leading architecture for deep learning on graph-structured data, in large part due to their simplicity and scalability. Unfortunately, it was shown that these architectures are limited in their expressive power. This paper proposes a novel framework called Equivariant Subgraph Aggregation Networks (ESAN) to address this issue. Our main observation is that while two graphs may not be distinguishable by an MPNN, they often contain distinguishable subgraphs. Thus, we propose to represent each graph as a set of subgraphs derived by some predefined policy, and to process it using a suitable equivariant architecture. We develop novel variants of the 1-dimensional Weisfeiler-Leman (1-WL) test for graph isomorphism, and prove lower bounds on the expressiveness of ESAN in terms of these new WL variants. We further prove that our approach increases the expressive power of both MPNNs and more expressive architectures. Moreover, we provide theoretical results that describe how design choices such as the subgraph selection policy and equivariant neural architecture affect our architecture's expressive power. To deal with the increased computational cost, we propose a subgraph sampling scheme, which can be viewed as a stochastic version of our framework. A comprehensive set of experiments on real and synthetic datasets demonstrates that our framework improves the expressive power and overall performance of popular GNN architectures. △ Less

Submitted 16 March, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

Comments: Published at ICLR 2022, Spotlight. 46 pages

arXiv:2107.01952 [pdf, other]

Partition and Code: learning how to compress graphs

Authors: Giorgos Bouritsas, Andreas Loukas, Nikolaos Karalias, Michael M. Bronstein

Abstract: Can we use machine learning to compress graph data? The absence of ordering in graphs poses a significant challenge to conventional compression algorithms, limiting their attainable gains as well as their ability to discover relevant patterns. On the other hand, most graph compression approaches rely on domain-dependent handcrafted representations and cannot adapt to different underlying graph dis… ▽ More Can we use machine learning to compress graph data? The absence of ordering in graphs poses a significant challenge to conventional compression algorithms, limiting their attainable gains as well as their ability to discover relevant patterns. On the other hand, most graph compression approaches rely on domain-dependent handcrafted representations and cannot adapt to different underlying graph distributions. This work aims to establish the necessary principles a lossless graph compression method should follow to approach the entropy storage lower bound. Instead of making rigid assumptions about the graph distribution, we formulate the compressor as a probabilistic model that can be learned from data and generalise to unseen instances. Our "Partition and Code" framework entails three steps: first, a partitioning algorithm decomposes the graph into subgraphs, then these are mapped to the elements of a small dictionary on which we learn a probability distribution, and finally, an entropy encoder translates the representation into bits. All the components (partitioning, dictionary and distribution) are parametric and can be trained with gradient descent. We theoretically compare the compression quality of several graph encodings and prove, under mild conditions, that PnC achieves compression gains that grow either linearly or quadratically with the number of vertices. Empirically, PnC yields significant compression improvements on diverse real-world networks. △ Less

Submitted 3 November, 2021; v1 submitted 5 July, 2021; originally announced July 2021.

Comments: Published at NeurIPS 2021

Journal ref: Proc. Adv. Neur. Inf. Process. Syst. (NeurIPS) vol. 34 pp. 18603 - 18619 (2021)

arXiv:2106.10934 [pdf, other]

GRAND: Graph Neural Diffusion

Authors: Benjamin Paul Chamberlain, James Rowbottom, Maria Gorinova, Stefan Webb, Emanuele Rossi, Michael M. Bronstein

Abstract: We present Graph Neural Diffusion (GRAND) that approaches deep learning on graphs as a continuous diffusion process and treats Graph Neural Networks (GNNs) as discretisations of an underlying PDE. In our model, the layer structure and topology correspond to the discretisation choices of temporal and spatial operators. Our approach allows a principled development of a broad new class of GNNs that a… ▽ More We present Graph Neural Diffusion (GRAND) that approaches deep learning on graphs as a continuous diffusion process and treats Graph Neural Networks (GNNs) as discretisations of an underlying PDE. In our model, the layer structure and topology correspond to the discretisation choices of temporal and spatial operators. Our approach allows a principled development of a broad new class of GNNs that are able to address the common plights of graph learning models such as depth, oversmoothing, and bottlenecks. Key to the success of our models are stability with respect to perturbations in the data and this is addressed for both implicit and explicit discretisation schemes. We develop linear and nonlinear versions of GRAND, which achieve competitive results on many standard graph benchmarks. △ Less

Submitted 22 September, 2021; v1 submitted 21 June, 2021; originally announced June 2021.

arXiv:2104.13478 [pdf, other]

Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges

Authors: Michael M. Bronstein, Joan Bruna, Taco Cohen, Petar Veličković

Abstract: The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods. Indeed, many high-dimensional learning tasks previously thought to be beyond reach -- such as computer vision, playing Go, or protein folding -- are in fact feasible with appropriate computational scale. Remarkably, the essence of deep learning is built from two simpl… ▽ More The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods. Indeed, many high-dimensional learning tasks previously thought to be beyond reach -- such as computer vision, playing Go, or protein folding -- are in fact feasible with appropriate computational scale. Remarkably, the essence of deep learning is built from two simple algorithmic principles: first, the notion of representation or feature learning, whereby adapted, often hierarchical, features capture the appropriate notion of regularity for each task, and second, learning by local gradient-descent type methods, typically implemented as backpropagation. While learning generic functions in high dimensions is a cursed estimation problem, most tasks of interest are not generic, and come with essential pre-defined regularities arising from the underlying low-dimensionality and structure of the physical world. This text is concerned with exposing these regularities through unified geometric principles that can be applied throughout a wide spectrum of applications. Such a 'geometric unification' endeavour, in the spirit of Felix Klein's Erlangen Program, serves a dual purpose: on one hand, it provides a common mathematical framework to study the most successful neural network architectures, such as CNNs, RNNs, GNNs, and Transformers. On the other hand, it gives a constructive procedure to incorporate prior physical knowledge into neural architectures and provide principled way to build future architectures yet to be invented. △ Less

Submitted 2 May, 2021; v1 submitted 27 April, 2021; originally announced April 2021.

Comments: 156 pages. Work in progress -- comments welcome!

arXiv:2104.08614 [pdf]

Cetacean Translation Initiative: a roadmap to deciphering the communication of sperm whales

Authors: Jacob Andreas, Gašper Beguš, Michael M. Bronstein, Roee Diamant, Denley Delaney, Shane Gero, Shafi Goldwasser, David F. Gruber, Sarah de Haas, Peter Malkin, Roger Payne, Giovanni Petri, Daniela Rus, Pratyusha Sharma, Dan Tchernov, Pernille Tønnesen, Antonio Torralba, Daniel Vogt, Robert J. Wood

Abstract: The past decade has witnessed a groundbreaking rise of machine learning for human language analysis, with current methods capable of automatically accurately recovering various aspects of syntax and semantics - including sentence structure and grounded word meaning - from large data collections. Recent research showed the promise of such tools for analyzing acoustic communication in nonhuman speci… ▽ More The past decade has witnessed a groundbreaking rise of machine learning for human language analysis, with current methods capable of automatically accurately recovering various aspects of syntax and semantics - including sentence structure and grounded word meaning - from large data collections. Recent research showed the promise of such tools for analyzing acoustic communication in nonhuman species. We posit that machine learning will be the cornerstone of future collection, processing, and analysis of multimodal streams of data in animal communication studies, including bioacoustic, behavioral, biological, and environmental data. Cetaceans are unique non-human model species as they possess sophisticated acoustic communications, but utilize a very different encoding system that evolved in an aquatic rather than terrestrial medium. Sperm whales, in particular, with their highly-developed neuroanatomical features, cognitive abilities, social structures, and discrete click-based encoding make for an excellent starting point for advanced machine learning tools that can be applied to other animals in the future. This paper details a roadmap toward this goal based on currently existing technology and multidisciplinary scientific community effort. We outline the key elements required for the collection and processing of massive bioacoustic data of sperm whales, detecting their basic communication units and language-like higher-level structures, and validating these models through interactive playback experiments. The technological capabilities developed by such an undertaking are likely to yield cross-applications and advancements in broader communities investigating non-human communication and animal behavioral research. △ Less

Submitted 17 April, 2021; originally announced April 2021.

arXiv:2012.09235 [pdf, other]

Shape My Face: Registering 3D Face Scans by Surface-to-Surface Translation

Authors: Mehdi Bahri, Eimear O' Sullivan, Shunwang Gong, Feng Liu, Xiaoming Liu, Michael M. Bronstein, Stefanos Zafeiriou

Abstract: Standard registration algorithms need to be independently applied to each surface to register, following careful pre-processing and hand-tuning. Recently, learning-based approaches have emerged that reduce the registration of new scans to running inference with a previously-trained model. In this paper, we cast the registration task as a surface-to-surface translation problem, and design a model t… ▽ More Standard registration algorithms need to be independently applied to each surface to register, following careful pre-processing and hand-tuning. Recently, learning-based approaches have emerged that reduce the registration of new scans to running inference with a previously-trained model. In this paper, we cast the registration task as a surface-to-surface translation problem, and design a model to reliably capture the latent geometric information directly from raw 3D face scans. We introduce Shape-My-Face (SMF), a powerful encoder-decoder architecture based on an improved point cloud encoder, a novel visual attention mechanism, graph convolutional decoders with skip connections, and a specialized mouth model that we smoothly integrate with the mesh convolutions. Compared to the previous state-of-the-art learning algorithms for non-rigid registration of face scans, SMF only requires the raw data to be rigidly aligned (with scaling) with a pre-defined face template. Additionally, our model provides topologically-sound meshes with minimal supervision, offers faster training time, has orders of magnitude fewer trainable parameters, is more robust to noise, and can generalize to previously unseen datasets. We extensively evaluate the quality of our registrations on diverse data. We demonstrate the robustness and generalizability of our model with in-the-wild face scans across different modalities, sensor types, and resolutions. Finally, we show that, by learning to register scans, SMF produces a hybrid linear and non-linear morphable model. Manipulation of the latent space of SMF allows for shape generation, and morphing applications such as expression transfer in-the-wild. We train SMF on a dataset of human faces comprising 9 large-scale databases on commodity hardware. △ Less

Submitted 10 March, 2021; v1 submitted 16 December, 2020; originally announced December 2020.

Comments: In review with International Journal of Computer Vision (IJCV) - Revision 1 (minor revision)

arXiv:2012.05716 [pdf, other]

Utilising Graph Machine Learning within Drug Discovery and Development

Authors: Thomas Gaudelet, Ben Day, Arian R. Jamasb, Jyothish Soman, Cristian Regep, Gertrude Liu, Jeremy B. R. Hayter, Richard Vickers, Charles Roberts, Jian Tang, David Roblin, Tom L. Blundell, Michael M. Bronstein, Jake P. Taylor-King

Abstract: Graph Machine Learning (GML) is receiving growing interest within the pharmaceutical and biotechnology industries for its ability to model biomolecular structures, the functional relationships between them, and integrate multi-omic datasets - amongst other data types. Herein, we present a multidisciplinary academic-industrial review of the topic within the context of drug discovery and development… ▽ More Graph Machine Learning (GML) is receiving growing interest within the pharmaceutical and biotechnology industries for its ability to model biomolecular structures, the functional relationships between them, and integrate multi-omic datasets - amongst other data types. Herein, we present a multidisciplinary academic-industrial review of the topic within the context of drug discovery and development. After introducing key terms and modelling approaches, we move chronologically through the drug development pipeline to identify and summarise work incorporating: target identification, design of small molecules and biologics, and drug repurposing. Whilst the field is still emerging, key milestones including repurposed drugs entering in vivo studies, suggest graph machine learning will become a modelling framework of choice within biomedical machine learning. △ Less

Submitted 10 February, 2021; v1 submitted 9 December, 2020; originally announced December 2020.

Comments: 19 pages, 7 figures, 2 tables

arXiv:2009.12192 [pdf, other]

doi 10.1145/3383313.3418486

Tuning Word2vec for Large Scale Recommendation Systems

Authors: Benjamin P. Chamberlain, Emanuele Rossi, Dan Shiebler, Suvash Sedhain, Michael M. Bronstein

Abstract: Word2vec is a powerful machine learning tool that emerged from Natural Lan-guage Processing (NLP) and is now applied in multiple domains, including recom-mender systems, forecasting, and network analysis. As Word2vec is often used offthe shelf, we address the question of whether the default hyperparameters are suit-able for recommender systems. The answer is emphatically no. In this paper, wefirst… ▽ More Word2vec is a powerful machine learning tool that emerged from Natural Lan-guage Processing (NLP) and is now applied in multiple domains, including recom-mender systems, forecasting, and network analysis. As Word2vec is often used offthe shelf, we address the question of whether the default hyperparameters are suit-able for recommender systems. The answer is emphatically no. In this paper, wefirst elucidate the importance of hyperparameter optimization and show that un-constrained optimization yields an average 221% improvement in hit rate over thedefault parameters. However, unconstrained optimization leads to hyperparametersettings that are very expensive and not feasible for large scale recommendationtasks. To this end, we demonstrate 138% average improvement in hit rate with aruntime budget-constrained hyperparameter optimization. Furthermore, to makehyperparameter optimization applicable for large scale recommendation problemswhere the target dataset is too large to search over, we investigate generalizinghyperparameters settings from samples. We show that applying constrained hy-perparameter optimization using only a 10% sample of the data still yields a 91%average improvement in hit rate over the default parameters when applied to thefull datasets. Finally, we apply hyperparameters learned using our method of con-strained optimization on a sample to the Who To Follow recommendation serviceat Twitter and are able to increase follow rates by 15%. △ Less

Submitted 24 September, 2020; originally announced September 2020.

Comments: 11 pages, 4 figures, Fourteenth ACM Conference on Recommender Systems

Journal ref: Fourteenth ACM Conference on Recommender Systems (RecSys '20), September 22--26, 2020, Virtual Event, Brazil

arXiv:2006.09252 [pdf, other]

doi 10.1109/TPAMI.2022.3154319

Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting

Authors: Giorgos Bouritsas, Fabrizio Frasca, Stefanos Zafeiriou, Michael M. Bronstein

Abstract: While Graph Neural Networks (GNNs) have achieved remarkable results in a variety of applications, recent studies exposed important shortcomings in their ability to capture the structure of the underlying graph. It has been shown that the expressive power of standard GNNs is bounded by the Weisfeiler-Leman (WL) graph isomorphism test, from which they inherit proven limitations such as the inability… ▽ More While Graph Neural Networks (GNNs) have achieved remarkable results in a variety of applications, recent studies exposed important shortcomings in their ability to capture the structure of the underlying graph. It has been shown that the expressive power of standard GNNs is bounded by the Weisfeiler-Leman (WL) graph isomorphism test, from which they inherit proven limitations such as the inability to detect and count graph substructures. On the other hand, there is significant empirical evidence, e.g. in network science and bioinformatics, that substructures are often intimately related to downstream tasks. To this end, we propose "Graph Substructure Networks" (GSN), a topologically-aware message passing scheme based on substructure encoding. We theoretically analyse the expressive power of our architecture, showing that it is strictly more expressive than the WL test, and provide sufficient conditions for universality. Importantly, we do not attempt to adhere to the WL hierarchy; this allows us to retain multiple attractive properties of standard GNNs such as locality and linear network complexity, while being able to disambiguate even hard instances of graph isomorphism. We perform an extensive experimental evaluation on graph classification and regression tasks and obtain state-of-the-art results in diverse real-world settings including molecular graphs and social networks. The code is publicly available at https://github.com/gbouritsas/graph-substructure-networks. △ Less

Submitted 5 July, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

Journal ref: IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) vol. 45 (2023) pp. 657 - 668

arXiv:2004.02658 [pdf, other]

Geometrically Principled Connections in Graph Neural Networks

Authors: Shunwang Gong, Mehdi Bahri, Michael M. Bronstein, Stefanos Zafeiriou

Abstract: Graph convolution operators bring the advantages of deep learning to a variety of graph and mesh processing tasks previously deemed out of reach. With their continued success comes the desire to design more powerful architectures, often by adapting existing deep learning techniques to non-Euclidean data. In this paper, we argue geometry should remain the primary driving force behind innovation in… ▽ More Graph convolution operators bring the advantages of deep learning to a variety of graph and mesh processing tasks previously deemed out of reach. With their continued success comes the desire to design more powerful architectures, often by adapting existing deep learning techniques to non-Euclidean data. In this paper, we argue geometry should remain the primary driving force behind innovation in the emerging field of geometric deep learning. We relate graph neural networks to widely successful computer graphics and data approximation models: radial basis functions (RBFs). We conjecture that, like RBFs, graph convolution layers would benefit from the addition of simple functions to the powerful convolution kernels. We introduce affine skip connections, a novel building block formed by combining a fully connected layer with any graph convolution operator. We experimentally demonstrate the effectiveness of our technique and show the improved performance is the consequence of more than the increased number of parameters. Operators equipped with the affine skip connection markedly outperform their base performance on every task we evaluated, i.e., shape reconstruction, dense shape correspondence, and graph classification. We hope our simple and effective approach will serve as a solid baseline and help ease future research in graph neural networks. △ Less

Submitted 6 April, 2020; originally announced April 2020.

Comments: Presented at Computer Vision and Pattern Recognition (CVPR), 2020

arXiv:1909.06609 [pdf, ps, other]

Learning Interpretable Disease Self-Representations for Drug Repositioning

Authors: Fabrizio Frasca, Diego Galeano, Guadalupe Gonzalez, Ivan Laponogov, Kirill Veselkov, Alberto Paccanaro, Michael M. Bronstein

Abstract: Drug repositioning is an attractive cost-efficient strategy for the development of treatments for human diseases. Here, we propose an interpretable model that learns disease self-representations for drug repositioning. Our self-representation model represents each disease as a linear combination of a few other diseases. We enforce proximity in the learnt representations in a way to preserve the ge… ▽ More Drug repositioning is an attractive cost-efficient strategy for the development of treatments for human diseases. Here, we propose an interpretable model that learns disease self-representations for drug repositioning. Our self-representation model represents each disease as a linear combination of a few other diseases. We enforce proximity in the learnt representations in a way to preserve the geometric structure of the human phenome network - a domain-specific knowledge that naturally adds relational inductive bias to the disease self-representations. We prove that our method is globally optimal and show results outperforming state-of-the-art drug repositioning approaches. We further show that the disease self-representations are biologically interpretable. △ Less

Submitted 20 October, 2019; v1 submitted 14 September, 2019; originally announced September 2019.

Comments: 10 pages, 2 figures, v2 corresponds to the camera ready version accepted at the Graph Representation Learning Workshop, NeurIPS 2019

arXiv:1907.12972 [pdf, other]

Transferability of Spectral Graph Convolutional Neural Networks

Authors: Ron Levie, Wei Huang, Lorenzo Bucci, Michael M. Bronstein, Gitta Kutyniok

Abstract: This paper focuses on spectral graph convolutional neural networks (ConvNets), where filters are defined as elementwise multiplication in the frequency domain of a graph. In machine learning settings where the dataset consists of signals defined on many different graphs, the trained ConvNet should generalize to signals on graphs unseen in the training set. It is thus important to transfer ConvNets… ▽ More This paper focuses on spectral graph convolutional neural networks (ConvNets), where filters are defined as elementwise multiplication in the frequency domain of a graph. In machine learning settings where the dataset consists of signals defined on many different graphs, the trained ConvNet should generalize to signals on graphs unseen in the training set. It is thus important to transfer ConvNets between graphs. Transferability, which is a certain type of generalization capability, can be loosely defined as follows: if two graphs describe the same phenomenon, then a single filter or ConvNet should have similar repercussions on both graphs. This paper aims at debunking the common misconception that spectral filters are not transferable. We show that if two graphs discretize the same "continuous" space, then a spectral filter or ConvNet has approximately the same repercussion on both graphs. Our analysis is more permissive than the standard analysis. Transferability is typically described as the robustness of the filter to small graph perturbations and re-indexing of the vertices. Our analysis accounts also for large graph perturbations. We prove transferability between graphs that can have completely different dimensions and topologies, only requiring that both graphs discretize the same underlying space in some generic sense. △ Less

Submitted 12 June, 2021; v1 submitted 30 July, 2019; originally announced July 2019.

arXiv:1902.06673 [pdf, other]

Fake News Detection on Social Media using Geometric Deep Learning

Authors: Federico Monti, Fabrizio Frasca, Davide Eynard, Damon Mannion, Michael M. Bronstein

Abstract: Social media are nowadays one of the main news sources for millions of people around the globe due to their low cost, easy access and rapid dissemination. This however comes at the cost of dubious trustworthiness and significant risk of exposure to 'fake news', intentionally written to mislead the readers. Automatically detecting fake news poses challenges that defy existing content-based analysis… ▽ More Social media are nowadays one of the main news sources for millions of people around the globe due to their low cost, easy access and rapid dissemination. This however comes at the cost of dubious trustworthiness and significant risk of exposure to 'fake news', intentionally written to mislead the readers. Automatically detecting fake news poses challenges that defy existing content-based analysis approaches. One of the main reasons is that often the interpretation of the news requires the knowledge of political or social context or 'common sense', which current NLP algorithms are still missing. Recent studies have shown that fake and real news spread differently on social media, forming propagation patterns that could be harnessed for the automatic fake news detection. Propagation-based approaches have multiple advantages compared to their content-based counterparts, among which is language independence and better resilience to adversarial attacks. In this paper we show a novel automatic fake news detection model based on geometric deep learning. The underlying core algorithms are a generalization of classical CNNs to graphs, allowing the fusion of heterogeneous data such as content, user profile and activity, social graph, and news propagation. Our model was trained and tested on news stories, verified by professional fact-checking organizations, that were spread on Twitter. Our experiments indicate that social network structure and propagation are important features allowing highly accurate (92.7% ROC AUC) fake news detection. Second, we observe that fake news can be reliably detected at an early stage, after just a few hours of propagation. Third, we test the aging of our model on training and testing data separated in time. Our results point to the promise of propagation-based approaches for fake news detection as an alternative or complementary strategy to content-based approaches. △ Less

Submitted 10 February, 2019; originally announced February 2019.

arXiv:1811.11465 [pdf, other]

doi 10.1109/CVPR.2019.00771

Isospectralization, or how to hear shape, style, and correspondence

Authors: Luca Cosmo, Mikhail Panine, Arianna Rampini, Maks Ovsjanikov, Michael M. Bronstein, Emanuele Rodolà

Abstract: The question whether one can recover the shape of a geometric object from its Laplacian spectrum ('hear the shape of the drum') is a classical problem in spectral geometry with a broad range of implications and applications. While theoretically the answer to this question is negative (there exist examples of iso-spectral but non-isometric manifolds), little is known about the practical possibility… ▽ More The question whether one can recover the shape of a geometric object from its Laplacian spectrum ('hear the shape of the drum') is a classical problem in spectral geometry with a broad range of implications and applications. While theoretically the answer to this question is negative (there exist examples of iso-spectral but non-isometric manifolds), little is known about the practical possibility of using the spectrum for shape reconstruction and optimization. In this paper, we introduce a numerical procedure called isospectralization, consisting of deforming one shape to make its Laplacian spectrum match that of another. We implement the isospectralization procedure using modern differentiable programming techniques and exemplify its applications in some of the classical and notoriously hard problems in geometry processing, computer vision, and graphics such as shape reconstruction, pose and style transfer, and dense deformable correspondence. △ Less

Submitted 5 April, 2019; v1 submitted 28 November, 2018; originally announced November 2018.

Comments: Accepted to CVPR 2019

arXiv:1809.10940 [pdf, other]

Functional Maps Representation on Product Manifolds

Authors: Emanuele Rodolà, Zorah Lähner, Alex M. Bronstein, Michael M. Bronstein, Justin Solomon

Abstract: We consider the tasks of representing, analyzing and manipulating maps between shapes. We model maps as densities over the product manifold of the input shapes; these densities can be treated as scalar functions and therefore are manipulable using the language of signal processing on manifolds. Being a manifold itself, the product space endows the set of maps with a geometry of its own, which we e… ▽ More We consider the tasks of representing, analyzing and manipulating maps between shapes. We model maps as densities over the product manifold of the input shapes; these densities can be treated as scalar functions and therefore are manipulable using the language of signal processing on manifolds. Being a manifold itself, the product space endows the set of maps with a geometry of its own, which we exploit to define map operations in the spectral domain; we also derive relationships with other existing representations (soft maps and functional maps). To apply these ideas in practice, we discretize product manifolds and their Laplace--Beltrami operators, and we introduce localized spectral analysis of the product manifold as a novel tool for map processing. Our framework applies to maps defined between and across 2D and 3D shapes without requiring special adjustment, and it can be implemented efficiently with simple operations on sparse matrices. △ Less

Submitted 9 January, 2019; v1 submitted 28 September, 2018; originally announced September 2018.

Comments: Accepted to Computer Graphics Forum

arXiv:1809.07399 [pdf, other]

Nonisometric Surface Registration via Conformal Laplace-Beltrami Basis Pursuit

Authors: Stefan C. Schonsheck, Michael M. Bronstein, Rongjie Lai

Abstract: Surface registration is one of the most fundamental problems in geometry processing. Many approaches have been developed to tackle this problem in cases where the surfaces are nearly isometric. However, it is much more challenging to compute correspondence between surfaces which are intrinsically less similar. In this paper, we propose a variational model to align the Laplace-Beltrami (LB) eigensy… ▽ More Surface registration is one of the most fundamental problems in geometry processing. Many approaches have been developed to tackle this problem in cases where the surfaces are nearly isometric. However, it is much more challenging to compute correspondence between surfaces which are intrinsically less similar. In this paper, we propose a variational model to align the Laplace-Beltrami (LB) eigensytems of two non-isometric genus zero shapes via conformal deformations. This method enables us compute to geometric meaningful point-to-point maps between non-isometric shapes. Our model is based on a novel basis pursuit scheme whereby we simultaneously compute a conformal deformation of a 'target shape' and its deformed LB eigensytem. We solve the model using an proximal alternating minimization algorithm hybridized with the augmented Lagrangian method which produces accurate correspondences given only a few landmark points. We also propose a reinitialization scheme to overcome some of the difficulties caused by the non-convexity of the variational problem. Intensive numerical experiments illustrate the effectiveness and robustness of the proposed method to handle non-isometric surfaces with large deformation with respect to both noise on the underlying manifolds and errors within the given landmarks. △ Less

Submitted 19 September, 2018; originally announced September 2018.

Comments: 21 pages, 7 figures

MSC Class: 65K10; 58J50; 68U05

arXiv:1809.06166 [pdf, other]

Graph Neural Networks for IceCube Signal Classification

Authors: Nicholas Choma, Federico Monti, Lisa Gerhardt, Tomasz Palczewski, Zahra Ronaghi, Prabhat, Wahid Bhimji, Michael M. Bronstein, Spencer R. Klein, Joan Bruna

Abstract: Tasks involving the analysis of geometric (graph- and manifold-structured) data have recently gained prominence in the machine learning community, giving birth to a rapidly develo** field of geometric deep learning. In this work, we leverage graph neural networks to improve signal detection in the IceCube neutrino observatory. The IceCube detector array is modeled as a graph, where vertices are… ▽ More Tasks involving the analysis of geometric (graph- and manifold-structured) data have recently gained prominence in the machine learning community, giving birth to a rapidly develo** field of geometric deep learning. In this work, we leverage graph neural networks to improve signal detection in the IceCube neutrino observatory. The IceCube detector array is modeled as a graph, where vertices are sensors and edges are a learned function of the sensors' spatial coordinates. As only a subset of IceCube's sensors is active during a given observation, we note the adaptive nature of our GNN, wherein computation is restricted to the input signal support. We demonstrate the effectiveness of our GNN architecture on a task classifying IceCube events, where it outperforms both a traditional physics-based method as well as classical 3D convolution neural networks. △ Less

Submitted 17 September, 2018; originally announced September 2018.

arXiv:1806.00770 [pdf, other]

Dual-Primal Graph Convolutional Networks

Authors: Federico Monti, Oleksandr Shchur, Aleksandar Bojchevski, Or Litany, Stephan Günnemann, Michael M. Bronstein

Abstract: In recent years, there has been a surge of interest in develo** deep learning methods for non-Euclidean structured data such as graphs. In this paper, we propose Dual-Primal Graph CNN, a graph convolutional architecture that alternates convolution-like operations on the graph and its dual. Our approach allows to learn both vertex- and edge features and generalizes the previous graph attention (G… ▽ More In recent years, there has been a surge of interest in develo** deep learning methods for non-Euclidean structured data such as graphs. In this paper, we propose Dual-Primal Graph CNN, a graph convolutional architecture that alternates convolution-like operations on the graph and its dual. Our approach allows to learn both vertex- and edge features and generalizes the previous graph attention (GAT) model. We provide extensive experimental validation showing state-of-the-art results on a variety of tasks tested on established graph benchmarks, including CORA and Citeseer citation networks as well as MovieLens, Flixter, Douban and Yahoo Music graph-guided recommender systems. △ Less

Submitted 3 June, 2018; originally announced June 2018.

arXiv:1806.00088 [pdf, other]

PeerNets: Exploiting Peer Wisdom Against Adversarial Attacks

Authors: Jan Svoboda, Jonathan Masci, Federico Monti, Michael M. Bronstein, Leonidas Guibas

Abstract: Deep learning systems have become ubiquitous in many aspects of our lives. Unfortunately, it has been shown that such systems are vulnerable to adversarial attacks, making them prone to potential unlawful uses. Designing deep neural networks that are robust to adversarial attacks is a fundamental step in making such systems safer and deployable in a broader variety of applications (e.g. autonomous… ▽ More Deep learning systems have become ubiquitous in many aspects of our lives. Unfortunately, it has been shown that such systems are vulnerable to adversarial attacks, making them prone to potential unlawful uses. Designing deep neural networks that are robust to adversarial attacks is a fundamental step in making such systems safer and deployable in a broader variety of applications (e.g. autonomous driving), but more importantly is a necessary step to design novel and more advanced architectures built on new computational paradigms rather than marginally building on the existing ones. In this paper we introduce PeerNets, a novel family of convolutional networks alternating classical Euclidean convolutions with graph convolutions to harness information from a graph of peer samples. This results in a form of non-local forward propagation in the model, where latent features are conditioned on the global structure induced by the graph, that is up to 3 times more robust to a variety of white- and black-box adversarial attacks compared to conventional architectures with almost no drop in accuracy. △ Less

Submitted 31 May, 2018; originally announced June 2018.

arXiv:1802.01572 [pdf, other]

MotifNet: a motif-based Graph Convolutional Network for directed graphs

Authors: Federico Monti, Karl Otness, Michael M. Bronstein

Abstract: Deep learning on graphs and in particular, graph convolutional neural networks, have recently attracted significant attention in the machine learning community. Many of such techniques explore the analogy between the graph Laplacian eigenvectors and the classical Fourier basis, allowing to formulate the convolution as a multiplication in the spectral domain. One of the key drawback of spectral CNN… ▽ More Deep learning on graphs and in particular, graph convolutional neural networks, have recently attracted significant attention in the machine learning community. Many of such techniques explore the analogy between the graph Laplacian eigenvectors and the classical Fourier basis, allowing to formulate the convolution as a multiplication in the spectral domain. One of the key drawback of spectral CNNs is their explicit assumption of an undirected graph, leading to a symmetric Laplacian matrix with orthogonal eigendecomposition. In this work we propose MotifNet, a graph CNN capable of dealing with directed graphs by exploiting local graph motifs. We present experimental evidence showing the advantage of our approach on real data. △ Less

Submitted 3 February, 2018; originally announced February 2018.

arXiv:1801.07829 [pdf, other]

Dynamic Graph CNN for Learning on Point Clouds

Authors: Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, Justin M. Solomon

Abstract: Point clouds provide a flexible geometric representation suitable for countless applications in computer graphics; they also comprise the raw output of most 3D data acquisition devices. While hand-designed features on point clouds have long been proposed in graphics and vision, however, the recent overwhelming success of convolutional neural networks (CNNs) for image analysis suggests the value of… ▽ More Point clouds provide a flexible geometric representation suitable for countless applications in computer graphics; they also comprise the raw output of most 3D data acquisition devices. While hand-designed features on point clouds have long been proposed in graphics and vision, however, the recent overwhelming success of convolutional neural networks (CNNs) for image analysis suggests the value of adapting insight from CNN to the point cloud world. Point clouds inherently lack topological information so designing a model to recover topology can enrich the representation power of point clouds. To this end, we propose a new neural network module dubbed EdgeConv suitable for CNN-based high-level tasks on point clouds including classification and segmentation. EdgeConv acts on graphs dynamically computed in each layer of the network. It is differentiable and can be plugged into existing architectures. Compared to existing modules operating in extrinsic space or treating each point independently, EdgeConv has several appealing properties: It incorporates local neighborhood information; it can be stacked applied to learn global shape properties; and in multi-layer systems affinity in feature space captures semantic characteristics over potentially long distances in the original embedding. We show the performance of our model on standard benchmarks including ModelNet40, ShapeNetPart, and S3DIS. △ Less

Submitted 11 June, 2019; v1 submitted 23 January, 2018; originally announced January 2018.

arXiv:1709.03484 [pdf, other]

doi 10.1007/978-3-319-58771-4_54

Subspace Least Squares Multidimensional Scaling

Authors: Amit Boyarski, Alex M. Bronstein, Michael M. Bronstein

Abstract: Multidimensional Scaling (MDS) is one of the most popular methods for dimensionality reduction and visualization of high dimensional data. Apart from these tasks, it also found applications in the field of geometry processing for the analysis and reconstruction of non-rigid shapes. In this regard, MDS can be thought of as a \textit{shape from metric} algorithm, consisting of finding a configuratio… ▽ More Multidimensional Scaling (MDS) is one of the most popular methods for dimensionality reduction and visualization of high dimensional data. Apart from these tasks, it also found applications in the field of geometry processing for the analysis and reconstruction of non-rigid shapes. In this regard, MDS can be thought of as a \textit{shape from metric} algorithm, consisting of finding a configuration of points in the Euclidean space that realize, as isometrically as possible, some given distance structure. In the present work we cast the least squares variant of MDS (LS-MDS) in the spectral domain. This uncovers a multiresolution property of distance scaling which speeds up the optimization by a significant amount, while producing comparable, and sometimes even better, embeddings. △ Less

Submitted 11 September, 2017; originally announced September 2017.

Comments: Scale Space and Variational Methods in Computer Vision: 6th International Conference, SSVM 2017, Kolding, Denmark, June 4-8, 2017

Journal ref: Scale Space and Variational Methods in Computer Vision: 6th International Conference, SSVM 2017, Kolding, Denmark, June 4-8, 2017, Proceedings, 10302, 681--693

arXiv:1707.02596 [pdf, other]

Localized Manifold Harmonics for Spectral Shape Analysis

Authors: Simone Melzi, Emanuele Rodolà, Umberto Castellani, Michael M. Bronstein

Abstract: The use of Laplacian eigenfunctions is ubiquitous in a wide range of computer graphics and geometry processing applications. In particular, Laplacian eigenbases allow generalizing the classical Fourier analysis to manifolds. A key drawback of such bases is their inherently global nature, as the Laplacian eigenfunctions carry geometric and topological structure of the entire manifold. In this paper… ▽ More The use of Laplacian eigenfunctions is ubiquitous in a wide range of computer graphics and geometry processing applications. In particular, Laplacian eigenbases allow generalizing the classical Fourier analysis to manifolds. A key drawback of such bases is their inherently global nature, as the Laplacian eigenfunctions carry geometric and topological structure of the entire manifold. In this paper, we introduce a new framework for local spectral shape analysis. We show how to efficiently construct localized orthogonal bases by solving an optimization problem that in turn can be posed as the eigendecomposition of a new operator obtained by a modification of the standard Laplacian. We study the theoretical and computational aspects of the proposed framework and showcase our new construction on the classical problems of shape approximation and correspondence. We obtain significant improvement compared to classical Laplacian eigenbases as well as other alternatives for constructing localized bases. △ Less

Submitted 2 November, 2017; v1 submitted 9 July, 2017; originally announced July 2017.

Comments: Accepted to Computer Graphics Forum

arXiv:1705.07664 [pdf, other]

CayleyNets: Graph Convolutional Neural Networks with Complex Rational Spectral Filters

Authors: Ron Levie, Federico Monti, Xavier Bresson, Michael M. Bronstein

Abstract: The rise of graph-structured data such as social networks, regulatory networks, citation graphs, and functional brain networks, in combination with resounding success of deep learning in various applications, has brought the interest in generalizing deep learning models to non-Euclidean domains. In this paper, we introduce a new spectral domain convolutional architecture for deep learning on graph… ▽ More The rise of graph-structured data such as social networks, regulatory networks, citation graphs, and functional brain networks, in combination with resounding success of deep learning in various applications, has brought the interest in generalizing deep learning models to non-Euclidean domains. In this paper, we introduce a new spectral domain convolutional architecture for deep learning on graphs. The core ingredient of our model is a new class of parametric rational complex functions (Cayley polynomials) allowing to efficiently compute spectral filters on graphs that specialize on frequency bands of interest. Our model generates rich spectral filters that are localized in space, scales linearly with the size of the input data for sparsely-connected graphs, and can handle different constructions of Laplacian operators. Extensive experimental results show the superior performance of our approach, in comparison to other spectral domain convolutional architectures, on spectral image classification, community detection, vertex classification and matrix completion tasks. △ Less

Submitted 31 October, 2018; v1 submitted 22 May, 2017; originally announced May 2017.

arXiv:1705.01707 [pdf, other]

Generative Convolutional Networks for Latent Fingerprint Reconstruction

Authors: Jan Svoboda, Federico Monti, Michael M. Bronstein

Abstract: Performance of fingerprint recognition depends heavily on the extraction of minutiae points. Enhancement of the fingerprint ridge pattern is thus an essential pre-processing step that noticeably reduces false positive and negative detection rates. A particularly challenging setting is when the fingerprint images are corrupted or partially missing. In this work, we apply generative convolutional ne… ▽ More Performance of fingerprint recognition depends heavily on the extraction of minutiae points. Enhancement of the fingerprint ridge pattern is thus an essential pre-processing step that noticeably reduces false positive and negative detection rates. A particularly challenging setting is when the fingerprint images are corrupted or partially missing. In this work, we apply generative convolutional networks to denoise visible minutiae and predict the missing parts of the ridge pattern. The proposed enhancement approach is tested as a pre-processing step in combination with several standard feature extraction methods such as MINDTCT, followed by biometric comparison using MCC and BOZORTH3. We evaluate our method on several publicly available latent fingerprint datasets captured using different sensors. △ Less

Submitted 4 May, 2017; originally announced May 2017.

arXiv:1704.08686 [pdf, other]

Deep Functional Maps: Structured Prediction for Dense Shape Correspondence

Authors: Or Litany, Tal Remez, Emanuele Rodolà, Alex M. Bronstein, Michael M. Bronstein

Abstract: We introduce a new framework for learning dense correspondence between deformable 3D shapes. Existing learning based approaches model shape correspondence as a labelling problem, where each point of a query shape receives a label identifying a point on some reference domain; the correspondence is then constructed a posteriori by composing the label predictions of two input shapes. We propose a par… ▽ More We introduce a new framework for learning dense correspondence between deformable 3D shapes. Existing learning based approaches model shape correspondence as a labelling problem, where each point of a query shape receives a label identifying a point on some reference domain; the correspondence is then constructed a posteriori by composing the label predictions of two input shapes. We propose a paradigm shift and design a structured prediction model in the space of functional maps, linear operators that provide a compact representation of the correspondence. We model the learning process via a deep residual network which takes dense descriptor fields defined on two shapes as input, and outputs a soft map between the two given objects. The resulting correspondence is shown to be accurate on several challenging benchmarks comprising multiple categories, synthetic models, real scans with acquisition artifacts, topological noise, and partiality. △ Less

Submitted 30 July, 2017; v1 submitted 27 April, 2017; originally announced April 2017.

Comments: Accepted for publication at ICCV 2017

arXiv:1704.06803 [pdf, other]

Geometric Matrix Completion with Recurrent Multi-Graph Neural Networks

Authors: Federico Monti, Michael M. Bronstein, Xavier Bresson

Abstract: Matrix completion models are among the most common formulations of recommender systems. Recent works have showed a boost of performance of these techniques when introducing the pairwise relationships between users/items in the form of graphs, and imposing smoothness priors on these graphs. However, such techniques do not fully exploit the local stationarity structures of user/item graphs, and the… ▽ More Matrix completion models are among the most common formulations of recommender systems. Recent works have showed a boost of performance of these techniques when introducing the pairwise relationships between users/items in the form of graphs, and imposing smoothness priors on these graphs. However, such techniques do not fully exploit the local stationarity structures of user/item graphs, and the number of parameters to learn is linear w.r.t. the number of users and items. We propose a novel approach to overcome these limitations by using geometric deep learning on graphs. Our matrix completion architecture combines graph convolutional neural networks and recurrent neural networks to learn meaningful statistical graph-structured patterns and the non-linear diffusion process that generates the known ratings. This neural network system requires a constant number of parameters independent of the matrix size. We apply our method on both synthetic and real datasets, showing that it outperforms state-of-the-art techniques. △ Less

Submitted 22 April, 2017; originally announced April 2017.

arXiv:1611.08402 [pdf, other]

Geometric deep learning on graphs and manifolds using mixture model CNNs

Authors: Federico Monti, Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Jan Svoboda, Michael M. Bronstein

Abstract: Deep learning has achieved a remarkable performance breakthrough in several fields, most notably in speech recognition, natural language processing, and computer vision. In particular, convolutional neural network (CNN) architectures currently produce state-of-the-art performance on a variety of image analysis tasks such as object detection and recognition. Most of deep learning research has so fa… ▽ More Deep learning has achieved a remarkable performance breakthrough in several fields, most notably in speech recognition, natural language processing, and computer vision. In particular, convolutional neural network (CNN) architectures currently produce state-of-the-art performance on a variety of image analysis tasks such as object detection and recognition. Most of deep learning research has so far focused on dealing with 1D, 2D, or 3D Euclidean-structured data such as acoustic signals, images, or videos. Recently, there has been an increasing interest in geometric deep learning, attempting to generalize deep learning methods to non-Euclidean structured data such as graphs and manifolds, with a variety of applications from the domains of network analysis, computational social science, or computer graphics. In this paper, we propose a unified framework allowing to generalize CNN architectures to non-Euclidean domains (graphs and manifolds) and learn local, stationary, and compositional task-specific features. We show that various non-Euclidean CNN methods previously proposed in the literature can be considered as particular instances of our framework. We test the proposed method on standard tasks from the realms of image-, graph- and 3D shape analysis and show that it consistently outperforms previous approaches. △ Less

Submitted 6 December, 2016; v1 submitted 25 November, 2016; originally announced November 2016.

arXiv:1611.08097 [pdf, other]

doi 10.1109/MSP.2017.2693418

Geometric deep learning: going beyond Euclidean data

Authors: Michael M. Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, Pierre Vandergheynst

Abstract: Many scientific fields study data with an underlying structure that is a non-Euclidean space. Some examples include social networks in computational social sciences, sensor networks in communications, functional networks in brain imaging, regulatory networks in genetics, and meshed surfaces in computer graphics. In many applications, such geometric data are large and complex (in the case of social… ▽ More Many scientific fields study data with an underlying structure that is a non-Euclidean space. Some examples include social networks in computational social sciences, sensor networks in communications, functional networks in brain imaging, regulatory networks in genetics, and meshed surfaces in computer graphics. In many applications, such geometric data are large and complex (in the case of social networks, on the scale of billions), and are natural targets for machine learning techniques. In particular, we would like to use deep neural networks, which have recently proven to be powerful tools for a broad range of problems from computer vision, natural language processing, and audio analysis. However, these tools have been most successful on data with an underlying Euclidean or grid-like structure, and in cases where the invariances of these structures are built into networks used to model them. Geometric deep learning is an umbrella term for emerging techniques attempting to generalize (structured) deep neural models to non-Euclidean domains such as graphs and manifolds. The purpose of this paper is to overview different examples of geometric deep learning problems and present available solutions, key difficulties, applications, and future research directions in this nascent field. △ Less

Submitted 3 May, 2017; v1 submitted 24 November, 2016; originally announced November 2016.

Showing 1–50 of 74 results for author: Bronstein, M M