-
The Hidden Pitfalls of the Cosine Similarity Loss
Authors:
Andrew Draganov,
Sharvaree Vadgama,
Erik J. Bekkers
Abstract:
We show that the gradient of the cosine similarity between two points goes to zero in two under-explored settings: (1) if a point has large magnitude or (2) if the points are on opposite ends of the latent space. Counterintuitively, we prove that optimizing the cosine similarity between points forces them to grow in magnitude. Thus, (1) is unavoidable in practice. We then observe that these derivations are extremely general -- they hold across deep learning architectures and for many of the standard self-supervised learning (SSL) loss functions. This leads us to propose cut-initialization: a simple change to network initialization that helps all studied SSL methods converge faster.
Submitted 24 June, 2024;
originally announced June 2024.
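A small numerical sketch of the two settings named in the abstract above, offered as an illustration rather than the authors' code. It assumes the standard analytic gradient of cos(x, y) with respect to x, whose norm equals sin(θ)/‖x‖ for angle θ between x and y, so it shrinks when x has large magnitude or when the points are nearly antipodal. The helper names `dot`, `norm`, and `cosine_grad` and the sample vectors are hypothetical.

```python
import math

def dot(a, b):
    """Euclidean inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(a):
    """Euclidean norm of a list of floats."""
    return math.sqrt(dot(a, a))

def cosine_grad(x, y):
    """Analytic gradient of cos(x, y) = x.y / (|x||y|) with respect to x."""
    nx, ny = norm(x), norm(y)
    cos = dot(x, y) / (nx * ny)
    return [(yi / ny - cos * xi / nx) / nx for xi, yi in zip(x, y)]

# Arbitrary example points (assumed for illustration only).
x = [1.0, 2.0, -0.5]
y = [0.3, -1.0, 2.0]

# Setting (1): same direction, 10x the magnitude.
g_small = norm(cosine_grad(x, y))
g_large = norm(cosine_grad([10.0 * xi for xi in x], y))

# Setting (2): y_far points almost exactly opposite to x.
y_far = [-xi + 1e-6 * yi for xi, yi in zip(x, y)]
g_opposite = norm(cosine_grad(x, y_far))

print(g_small, g_large, g_opposite)
```

Scaling x by a factor of 10 leaves the cosine unchanged but divides the gradient norm by 10, and the nearly antipodal pair yields a gradient norm close to zero.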
-
Space-Time Continuous PDE Forecasting using Equivariant Neural Fields
Authors:
David M. Knigge,
David R. Wessels,
Riccardo Valperga,
Samuele Papa,
Jan-Jakob Sonke,
Efstratios Gavves,
Erik J. Bekkers
Abstract:
Recently, Conditional Neural Fields (NeFs) have emerged as a powerful modelling paradigm for PDEs, by learning solutions as flows in the latent space of the Conditional NeF. Although benefiting from favourable properties of NeFs such as grid-agnosticity and space-time-continuous dynamics modelling, this approach limits the ability to impose known constraints of the PDE on the solutions -- e.g. symmetries or boundary conditions -- in favour of modelling flexibility. Instead, we propose a space-time continuous NeF-based solving framework that - by preserving geometric information in the latent space - respects known symmetries of the PDE. We show that modelling solutions as flows of pointclouds over the group of interest $G$ improves generalization and data-efficiency. We validate that our framework readily generalizes to unseen spatial and temporal locations, as well as geometric transformations of the initial conditions - where other NeF-based PDE forecasting methods fail - and that it improves over baselines in a number of challenging geometries.
Submitted 10 June, 2024;
originally announced June 2024.
-
Grounding Continuous Representations in Geometry: Equivariant Neural Fields
Authors:
David R Wessels,
David M Knigge,
Samuele Papa,
Riccardo Valperga,
Sharvaree Vadgama,
Efstratios Gavves,
Erik J Bekkers
Abstract:
Recently, Neural Fields have emerged as a powerful modelling paradigm to represent continuous signals. In a conditional neural field, a field is represented by a latent variable that conditions the NeF, whose parametrisation is otherwise shared over an entire dataset. We propose Equivariant Neural Fields based on cross attention transformers, in which NeFs are conditioned on a geometric conditioning variable, a latent point cloud, that enables an equivariant decoding from latent to field. Our equivariant approach induces a steerability property by which both field and latent are grounded in geometry and amenable to transformation laws: if the field transforms, the latent representation transforms accordingly, and vice versa. Crucially, the equivariance relation ensures that the latent is capable of (1) representing geometric patterns faithfully, allowing for geometric reasoning in latent space, and (2) weight sharing over spatially similar patterns, allowing for efficient learning of datasets of fields. These main properties are validated using classification experiments and a verification of the capability of fitting entire datasets, in comparison to other non-equivariant NeF approaches. We further validate the potential of ENFs by demonstrating unique local field editing properties.
Submitted 17 June, 2024; v1 submitted 9 June, 2024;
originally announced June 2024.
-
E(n) Equivariant Message Passing Cellular Networks
Authors:
Veljko Kovač,
Erik J. Bekkers,
Pietro Liò,
Floor Eijkelboom
Abstract:
This paper introduces E(n) Equivariant Message Passing Cellular Networks (EMPCNs), an extension of E(n) Equivariant Graph Neural Networks to CW-complexes. Our approach addresses two aspects of geometric message passing networks: 1) enhancing their expressiveness by incorporating arbitrary cells, and 2) achieving this in a computationally efficient way with a decoupled EMPCNs technique. We demonstrate that EMPCNs achieve close to state-of-the-art performance on multiple tasks without the need for steerability, including many-body predictions and motion capture. Moreover, ablation studies confirm that decoupled EMPCNs exhibit stronger generalization capabilities than their non-topologically informed counterparts. These findings show that EMPCNs can be used as a scalable and expressive framework for higher-order message passing in geometric and topological graphs.
Submitted 6 June, 2024; v1 submitted 5 June, 2024;
originally announced June 2024.
-
Latent Field Discovery In Interacting Dynamical Systems With Neural Fields
Authors:
Miltiadis Kofinas,
Erik J. Bekkers,
Naveen Shankar Nagaraja,
Efstratios Gavves
Abstract:
Systems of interacting objects often evolve under the influence of field effects that govern their dynamics, yet previous works have abstracted away from such effects, and assume that systems evolve in a vacuum. In this work, we focus on discovering these fields, and infer them from the observed dynamics alone, without directly observing them. We theorize the presence of latent force fields, and propose neural fields to learn them. Since the observed dynamics constitute the net effect of local object interactions and global field effects, recently popularized equivariant networks are inapplicable, as they fail to capture global information. To address this, we propose to disentangle local object interactions -- which are $\mathrm{SE}(n)$ equivariant and depend on relative states -- from external global field effects -- which depend on absolute states. We model interactions with equivariant graph networks, and combine them with neural fields in a novel graph network that integrates field forces. Our experiments show that we can accurately discover the underlying fields in charged particles settings, traffic scenes, and gravitational n-body problems, and effectively use them to learn the system and forecast future trajectories.
Submitted 20 March, 2024; v1 submitted 31 October, 2023;
originally announced October 2023.
-
Fast, Expressive SE$(n)$ Equivariant Networks through Weight-Sharing in Position-Orientation Space
Authors:
Erik J Bekkers,
Sharvaree Vadgama,
Rob D Hesselink,
Putri A van der Linden,
David W Romero
Abstract:
Based on the theory of homogeneous spaces we derive geometrically optimal edge attributes to be used within the flexible message-passing framework. We formalize the notion of weight sharing in convolutional networks as the sharing of message functions over point-pairs that should be treated equally. We define equivalence classes of point-pairs that are identical up to a transformation in the group and derive attributes that uniquely identify these classes. Weight sharing is then obtained by conditioning message functions on these attributes. As an application of the theory, we develop an efficient equivariant group convolutional network for processing 3D point clouds. The theory of homogeneous spaces tells us how to do group convolutions with feature maps over the homogeneous space of positions $\mathbb{R}^3$, positions and orientations $\mathbb{R}^3 {\times} S^2$, and the group $SE(3)$ itself. Among these, $\mathbb{R}^3 {\times} S^2$ is an optimal choice due to the ability to represent directional information, which $\mathbb{R}^3$ methods cannot, and it significantly enhances computational efficiency compared to indexing features on the full $SE(3)$ group. We support this claim with state-of-the-art results -- in accuracy and speed -- on five different benchmarks in 2D and 3D, including interatomic potential energy prediction, trajectory forecasting in N-body systems, and generating molecules via equivariant diffusion models.
Submitted 15 March, 2024; v1 submitted 4 October, 2023;
originally announced October 2023.
-
On genuine invariance learning without weight-tying
Authors:
Artem Moskalev,
Anna Sepliarskaia,
Erik J. Bekkers,
Arnold Smeulders
Abstract:
In this paper, we investigate properties and limitations of invariance learned by neural networks from the data compared to the genuine invariance achieved through invariant weight-tying. To do so, we adopt a group theoretical perspective and analyze invariance learning in neural networks without weight-tying constraints. We demonstrate that even when a network learns to correctly classify samples on a group orbit, the underlying decision-making in such a model does not attain genuine invariance. Instead, learned invariance is strongly conditioned on the input data, rendering it unreliable if the input distribution shifts. We next demonstrate how to guide invariance learning toward genuine invariance by regularizing the invariance of a model during training. To this end, we propose several metrics to quantify learned invariance: (i) predictive distribution invariance, (ii) logit invariance, and (iii) saliency invariance similarity. We show that the invariance learned with the invariance error regularization closely resembles the genuine invariance of weight-tying models and reliably holds even under a severe input distribution shift. Closer analysis of the learned invariance also reveals the spectral decay phenomenon, in which a network chooses to achieve the invariance to a specific transformation group by reducing the sensitivity to any input perturbation.
Submitted 7 August, 2023;
originally announced August 2023.
-
Learned Gridification for Efficient Point Cloud Processing
Authors:
Putri A. van der Linden,
David W. Romero,
Erik J. Bekkers
Abstract:
Neural operations that rely on neighborhood information are much more expensive when deployed on point clouds than on grid data due to the irregular distances between points in a point cloud. In a grid, on the other hand, we can compute the kernel only once and reuse it for all query positions. As a result, operations that rely on neighborhood information scale much worse for point clouds than for grid data, especially for large inputs and large neighborhoods.
In this work, we address the scalability issue of point cloud methods by tackling its root cause: the irregularity of the data. We propose learnable gridification as the first step in a point cloud processing pipeline to transform the point cloud into a compact, regular grid. Thanks to gridification, subsequent layers can use operations defined on regular grids, e.g., Conv3D, which scale much better than native point cloud methods. We then extend gridification to point cloud to point cloud tasks, e.g., segmentation, by adding a learnable de-gridification step at the end of the point cloud processing pipeline to map the compact, regular grid back to its original point cloud form. Through theoretical and empirical analysis, we show that gridified networks scale better in terms of memory and time than networks directly applied on raw point cloud data, while being able to achieve competitive results. Our code is publicly available at https://github.com/computri/gridifier.
Submitted 22 July, 2023;
originally announced July 2023.
-
Regular SE(3) Group Convolutions for Volumetric Medical Image Analysis
Authors:
Thijs P. Kuipers,
Erik J. Bekkers
Abstract:
Regular group convolutional neural networks (G-CNNs) have been shown to increase model performance and improve equivariance to different geometrical symmetries. This work addresses the problem of SE(3), i.e., roto-translation equivariance, on volumetric data. Volumetric image data is prevalent in many medical settings. Motivated by the recent work on separable group convolutions, we devise an SE(3) group convolution kernel separated into a continuous SO(3) (rotation) kernel and a spatial kernel. We approximate equivariance to the continuous setting by sampling uniform SO(3) grids. Our continuous SO(3) kernel is parameterized via RBF interpolation on similarly uniform grids. We demonstrate the advantages of our approach in volumetric medical image analysis. Our SE(3) equivariant models consistently outperform CNNs and regular discrete G-CNNs on challenging medical classification tasks and show significantly improved generalization capabilities. Our approach achieves up to a 16.5% gain in accuracy over regular CNNs.
Submitted 20 July, 2023; v1 submitted 24 June, 2023;
originally announced June 2023.
-
An Exploration of Conditioning Methods in Graph Neural Networks
Authors:
Yeskendir Koishekenov,
Erik J. Bekkers
Abstract:
The flexibility and effectiveness of message passing based graph neural networks (GNNs) induced considerable advances in deep learning on graph-structured data. In such approaches, GNNs recursively update node representations based on their neighbors and they gain expressivity through the use of node and edge attribute vectors. For example, in computational tasks in physics and chemistry, the use of edge attributes such as relative position or distance has proved to be essential. In this work, we address not what kind of attributes to use, but how to condition on this information to improve model performance. We consider three types of conditioning: weak, strong, and pure, which respectively relate to concatenation-based conditioning, gating, and transformations that are causally dependent on the attributes. This categorization provides a unifying viewpoint on different classes of GNNs, from separable convolutions to various forms of message passing networks. We provide an empirical study on the effect of conditioning methods in several tasks in computational chemistry.
Submitted 3 May, 2023;
originally announced May 2023.
-
Modelling Long Range Dependencies in $N$D: From Task-Specific to a General Purpose CNN
Authors:
David M. Knigge,
David W. Romero,
Albert Gu,
Efstratios Gavves,
Erik J. Bekkers,
Jakub M. Tomczak,
Mark Hoogendoorn,
Jan-Jakob Sonke
Abstract:
Performant Convolutional Neural Network (CNN) architectures must be tailored to specific tasks in order to consider the length, resolution, and dimensionality of the input data. In this work, we tackle the need for problem-specific CNN architectures. We present the Continuous Convolutional Neural Network (CCNN): a single CNN able to process data of arbitrary resolution, dimensionality and length without any structural changes. Its key components are its continuous convolutional kernels, which model long-range dependencies at every layer and thus remove the need of current CNN architectures for task-dependent downsampling and depths. We showcase the generality of our method by using the same architecture for tasks on sequential ($1{\rm D}$), visual ($2{\rm D}$) and point-cloud ($3{\rm D}$) data. Our CCNN matches and often outperforms the current state-of-the-art across all tasks considered.
Submitted 16 April, 2023; v1 submitted 25 January, 2023;
originally announced January 2023.
-
Towards a General Purpose CNN for Long Range Dependencies in $N$D
Authors:
David W. Romero,
David M. Knigge,
Albert Gu,
Erik J. Bekkers,
Efstratios Gavves,
Jakub M. Tomczak,
Mark Hoogendoorn
Abstract:
The use of Convolutional Neural Networks (CNNs) is widespread in Deep Learning due to a range of desirable model properties which result in an efficient and effective machine learning framework. However, performant CNN architectures must be tailored to specific tasks in order to incorporate considerations such as the input length, resolution, and dimensionality. In this work, we overcome the need for problem-specific CNN architectures with our Continuous Convolutional Neural Network (CCNN): a single CNN architecture equipped with continuous convolutional kernels that can be used for tasks on data of arbitrary resolution, dimensionality and length without structural changes. Continuous convolutional kernels model long range dependencies at every layer, and remove the need for downsampling layers and task-dependent depths needed in current CNN architectures. We show the generality of our approach by applying the same CCNN to a wide set of tasks on sequential (1$\mathrm{D}$) and visual data (2$\mathrm{D}$). Our CCNN performs competitively and often outperforms the current state-of-the-art across all tasks considered.
Submitted 5 July, 2022; v1 submitted 7 June, 2022;
originally announced June 2022.
-
ChebLieNet: Invariant Spectral Graph NNs Turned Equivariant by Riemannian Geometry on Lie Groups
Authors:
Hugo Aguettaz,
Erik J. Bekkers,
Michaël Defferrard
Abstract:
We introduce ChebLieNet, a group-equivariant method on (anisotropic) manifolds. Building on the success of graph- and group-based neural networks, we take advantage of the recent developments in the geometric deep learning field to derive a new approach to exploit any anisotropies in data. Via discrete approximations of Lie groups, we develop a graph neural network made of anisotropic convolutional layers (Chebyshev convolutions), spatial pooling and unpooling layers, and global pooling layers. Group equivariance is achieved via equivariant and invariant operators on graphs with anisotropic left-invariant Riemannian distance-based affinities encoded on the edges. Thanks to its simple form, the Riemannian metric can model any anisotropies, both in the spatial and orientation domains. This control over the anisotropies of the Riemannian metrics allows us to balance equivariance (anisotropic metric) against invariance (isotropic metric) of the graph convolution layers. Hence we open the doors to a better understanding of anisotropic properties. Furthermore, we empirically prove the existence of (data-dependent) sweet spots for anisotropic parameters on CIFAR10. This crucial result is evidence of the benefit we could get by exploiting anisotropic properties in data. We also evaluate the scalability of this approach on STL10 (image data) and ClimateNet (spherical data), showing its remarkable adaptability to diverse tasks.
Submitted 23 November, 2021;
originally announced November 2021.
-
Towards Lightweight Controllable Audio Synthesis with Conditional Implicit Neural Representations
Authors:
Jan Zuiderveld,
Marco Federici,
Erik J. Bekkers
Abstract:
The high temporal resolution of audio and our perceptual sensitivity to small irregularities in waveforms make synthesizing at high sampling rates a complex and computationally intensive task, prohibiting real-time, controllable synthesis within many approaches. In this work we aim to shed light on the potential of Conditional Implicit Neural Representations (CINRs) as lightweight backbones in generative frameworks for audio synthesis.
Our experiments show that small Periodic Conditional INRs (PCINRs) learn faster and generally produce quantitatively better audio reconstructions than Transposed Convolutional Neural Networks with equal parameter counts. However, their performance is very sensitive to activation scaling hyperparameters. When learning to represent more uniform sets, PCINRs tend to introduce artificial high-frequency components in reconstructions. We show that this noise can be minimized by applying standard weight regularization during training or decreasing the compositional depth of PCINRs, and suggest directions for future research.
Submitted 2 December, 2021; v1 submitted 14 November, 2021;
originally announced November 2021.
-
Exploiting Redundancy: Separable Group Convolutional Networks on Lie Groups
Authors:
David M. Knigge,
David W. Romero,
Erik J. Bekkers
Abstract:
Group convolutional neural networks (G-CNNs) have been shown to increase parameter efficiency and model accuracy by incorporating geometric inductive biases. In this work, we investigate the properties of representations learned by regular G-CNNs, and show considerable parameter redundancy in group convolution kernels. This finding motivates further weight-tying by sharing convolution kernels over subgroups. To this end, we introduce convolution kernels that are separable over the subgroup and channel dimensions. In order to obtain equivariance to arbitrary affine Lie groups we provide a continuous parameterisation of separable convolution kernels. We evaluate our approach across several vision datasets, and show that our weight sharing leads to improved performance and computational efficiency. In many settings, separable G-CNNs outperform their non-separable counterpart, while only using a fraction of their training time. In addition, thanks to the increase in computational efficiency, we are able to implement G-CNNs equivariant to the $\mathrm{Sim(2)}$ group; the group of dilations, rotations and translations. $\mathrm{Sim(2)}$-equivariance further improves performance on all tasks considered.
Submitted 4 April, 2022; v1 submitted 25 October, 2021;
originally announced October 2021.
-
FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes
Authors:
David W. Romero,
Robert-Jan Bruintjes,
Jakub M. Tomczak,
Erik J. Bekkers,
Mark Hoogendoorn,
Jan C. van Gemert
Abstract:
When designing Convolutional Neural Networks (CNNs), one must select the size of the convolutional kernels before training. Recent works show CNNs benefit from different kernel sizes at different layers, but exploring all possible combinations is unfeasible in practice. A more efficient approach is to learn the kernel size during training. However, existing works that learn the kernel size have a limited bandwidth. These approaches scale kernels by dilation, and thus the detail they can describe is limited. In this work, we propose FlexConv, a novel convolutional operation with which high bandwidth convolutional kernels of learnable kernel size can be learned at a fixed parameter cost. FlexNets model long-term dependencies without the use of pooling, achieve state-of-the-art performance on several sequential datasets, outperform recent works with learned kernel sizes, and are competitive with much deeper ResNets on image benchmark datasets. Additionally, FlexNets can be deployed at higher resolutions than those seen during training. To avoid aliasing, we propose a novel kernel parameterization with which the frequency of the kernels can be analytically controlled. Our novel kernel parameterization shows higher descriptive power and faster convergence speed than existing parameterizations. This leads to important improvements in classification accuracy.
Submitted 17 March, 2022; v1 submitted 15 October, 2021;
originally announced October 2021.
-
Geometric and Physical Quantities Improve E(3) Equivariant Message Passing
Authors:
Johannes Brandstetter,
Rob Hesselink,
Elise van der Pol,
Erik J Bekkers,
Max Welling
Abstract:
Including covariant information, such as position, force, velocity or spin, is important in many tasks in computational physics and chemistry. We introduce Steerable E(3) Equivariant Graph Neural Networks (SEGNNs) that generalise equivariant graph networks, such that node and edge attributes are not restricted to invariant scalars, but can contain covariant information, such as vectors or tensors. This model, composed of steerable MLPs, is able to incorporate geometric and physical information in both the message and update functions. Through the definition of steerable node attributes, the MLPs provide a new class of activation functions for general use with steerable feature fields. We discuss our work and related work through the lens of equivariant non-linear convolutions, which further allows us to pinpoint the successful components of SEGNNs: non-linear message aggregation improves upon classic linear (steerable) point convolutions; steerable messages improve upon recent equivariant graph networks that send invariant messages. We demonstrate the effectiveness of our method on several tasks in computational physics and chemistry and provide extensive ablation studies.
Submitted 26 March, 2022; v1 submitted 6 October, 2021;
originally announced October 2021.
-
CKConv: Continuous Kernel Convolution For Sequential Data
Authors:
David W. Romero,
Anna Kuzina,
Erik J. Bekkers,
Jakub M. Tomczak,
Mark Hoogendoorn
Abstract:
Conventional neural architectures for sequential data present important limitations. Recurrent networks suffer from exploding and vanishing gradients, small effective memory horizons, and must be trained sequentially. Convolutional networks are unable to handle sequences of unknown size and their memory horizon must be defined a priori. In this work, we show that all these problems can be solved by formulating convolutional kernels in CNNs as continuous functions. The resulting Continuous Kernel Convolution (CKConv) allows us to model arbitrarily long sequences in a parallel manner, within a single operation, and without relying on any form of recurrence. We show that Continuous Kernel Convolutional Networks (CKCNNs) obtain state-of-the-art results in multiple datasets, e.g., permuted MNIST, and, thanks to their continuous nature, are able to handle non-uniformly sampled datasets and irregularly-sampled data natively. CKCNNs match or perform better than neural ODEs designed for these purposes in a faster and simpler manner.
Submitted 17 March, 2022; v1 submitted 4 February, 2021;
originally announced February 2021.
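To illustrate the continuous-kernel idea described in the CKConv abstract above, here is a minimal sketch, not the authors' implementation. It assumes a tiny fixed tanh network (`kernel`, with made-up weights) standing in for a learned kernel function, and a hypothetical helper `ckconv` that evaluates a causal convolution directly at the observed timestamps. Because the kernel is a function of the time offset, irregularly spaced samples need no resampling.

```python
import math


def kernel(dt, w1=(1.0, -2.0), b1=(0.0, 0.5), w2=(0.7, 0.3)):
    """Hypothetical kernel network: maps a time offset to a kernel value.

    The weights are arbitrary placeholders; in a trained model they
    would be learned parameters.
    """
    hidden = [math.tanh(w * dt + b) for w, b in zip(w1, b1)]
    return sum(v * h for v, h in zip(w2, hidden))


def ckconv(times, values):
    """Causal convolution y(t_i) = sum_{j <= i} x(t_j) * kernel(t_i - t_j).

    `times` is assumed to be sorted in increasing order and may be
    irregularly spaced; the sequence length is not fixed in advance.
    """
    out = []
    for i, t_i in enumerate(times):
        out.append(sum(values[j] * kernel(t_i - times[j]) for j in range(i + 1)))
    return out


if __name__ == "__main__":
    # Irregularly sampled sequence: no resampling onto a grid is needed.
    times = [0.0, 0.1, 0.35, 1.2]
    values = [1.0, 0.5, -0.25, 2.0]
    print(ckconv(times, values))
```

The same two functions work for a sequence of any length, since the kernel is evaluated on demand at whatever offsets occur in the data rather than stored as a fixed-size array.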
-
Wavelet Networks: Scale-Translation Equivariant Learning From Raw Time-Series
Authors:
David W. Romero,
Erik J. Bekkers,
Jakub M. Tomczak,
Mark Hoogendoorn
Abstract:
Leveraging the symmetries inherent to specific data domains for the construction of equivariant neural networks has led to remarkable improvements in terms of data efficiency and generalization. However, most existing research focuses on symmetries arising from planar and volumetric data, leaving a crucial data source largely underexplored: time-series. In this work, we fill this gap by leveraging the symmetries inherent to time-series for the construction of equivariant neural networks. We identify two core symmetries: *scale and translation*, and construct scale-translation equivariant neural networks for time-series learning. Intriguingly, we find that scale-translation equivariant mappings share a strong resemblance with the wavelet transform. Inspired by this resemblance, we term our networks Wavelet Networks, and show that they perform nested non-linear wavelet-like time-frequency transforms. Empirical results show that Wavelet Networks outperform conventional CNNs on raw waveforms, and match strongly engineered spectrogram techniques across several tasks and time-series types, including audio, environmental sounds, and electrical signals. Our code is publicly available at https://github.com/dwromero/wavelet_networks.
Submitted 21 January, 2024; v1 submitted 9 June, 2020;
originally announced June 2020.
-
Roto-Translation Equivariant Convolutional Networks: Application to Histopathology Image Analysis
Authors:
Maxime W. Lafarge,
Erik J. Bekkers,
Josien P. W. Pluim,
Remco Duits,
Mitko Veta
Abstract:
Rotation-invariance is a desired property of machine-learning models for medical image analysis and in particular for computational pathology applications. We propose a framework to encode the geometric structure of the special Euclidean motion group SE(2) in convolutional networks to yield translation and rotation equivariance via the introduction of SE(2)-group convolution layers. This structure enables models to learn feature representations with a discretized orientation dimension that guarantees that their outputs are invariant under a discrete set of rotations. Conventional approaches for rotation invariance rely mostly on data augmentation, but this does not guarantee the robustness of the output when the input is rotated. Moreover, trained conventional CNNs may require test-time rotation augmentation to reach their full capability. This study is focused on histopathology image analysis applications for which it is desirable that the arbitrary global orientation information of the imaged tissues is not captured by the machine learning models. The proposed framework is evaluated on three different histopathology image analysis tasks (mitosis detection, nuclei segmentation and tumor classification). We present a comparative analysis for each problem and show that a consistent increase in performance can be achieved when using the proposed framework.
Submitted 20 February, 2020;
originally announced February 2020.
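The claim in the abstract above, that a discretized orientation dimension can guarantee invariance under a discrete set of rotations, can be illustrated with a toy sketch. This is not the authors' SE(2) framework: it assumes only the four 90-degree rotations, square integer-valued grids, and hypothetical helper names (`rot90`, `correlate`, `lift`, `invariant_score`). The kernel is correlated with the image in each of its four orientations, and a maximum over orientations and positions yields a score that does not change when the input is rotated.

```python
def rot90(grid):
    """Rotate a square grid (list of lists) by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def correlate(image, kernel):
    """Valid (no padding) 2D cross-correlation of square grids."""
    n, k = len(image), len(kernel)
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(k) for b in range(k))
            for j in range(n - k + 1)
        ]
        for i in range(n - k + 1)
    ]


def lift(image, kernel):
    """Correlate the image with the kernel in all four orientations."""
    maps = []
    for _ in range(4):
        maps.append(correlate(image, kernel))
        kernel = rot90(kernel)
    return maps


def invariant_score(image, kernel):
    """Maximum response over orientations and positions."""
    return max(v for fmap in lift(image, kernel) for row in fmap for v in row)


if __name__ == "__main__":
    image = [[0, 1, 2, 0],
             [3, 0, 0, 1],
             [0, 5, 1, 0],
             [2, 0, 0, 4]]
    kernel = [[1, -1],
              [2, 0]]
    # The two printed scores agree: rotating the input only permutes
    # the set of responses that the maximum is taken over.
    print(invariant_score(image, kernel))
    print(invariant_score(rot90(image), kernel))
```

Integer values are used so that the comparison is exact; with floating-point inputs the two scores would agree only up to rounding.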
-
Attentive Group Equivariant Convolutional Networks
Authors:
David W. Romero,
Erik J. Bekkers,
Jakub M. Tomczak,
Mark Hoogendoorn
Abstract:
Although group convolutional networks are able to learn powerful representations based on symmetry patterns, they lack explicit means to learn meaningful relationships among them (e.g., relative positions and poses). In this paper, we present attentive group equivariant convolutions, a generalization of the group convolution, in which attention is applied during the course of convolution to accentuate meaningful symmetry combinations and suppress non-plausible, misleading ones. We indicate that prior work on visual attention can be described as special cases of our proposed framework and show empirically that our attentive group equivariant convolutional networks consistently outperform conventional group convolutional networks on benchmark image datasets. Simultaneously, we provide interpretability to the learned concepts through the visualization of equivariant attention maps.
Submitted 30 June, 2020; v1 submitted 7 February, 2020;
originally announced February 2020.
-
B-Spline CNNs on Lie Groups
Authors:
Erik J Bekkers
Abstract:
Group convolutional neural networks (G-CNNs) can be used to improve classical CNNs by equipping them with the geometric structure of groups. Central to the success of G-CNNs is the lifting of feature maps to higher dimensional disentangled representations, in which data characteristics are effectively learned, geometric data-augmentations are made obsolete, and predictable behavior under geometric transformations (equivariance) is guaranteed via group theory. Currently, however, the practical implementations of G-CNNs are limited to either discrete groups (that leave the grid intact) or continuous compact groups such as rotations (that enable the use of Fourier theory). In this paper we lift these limitations and propose a modular framework for the design and implementation of G-CNNs for arbitrary Lie groups. In our approach the differential structure of Lie groups is used to expand convolution kernels in a generic basis of B-splines that is defined on the Lie algebra. This leads to a flexible framework that enables localized, atrous, and deformable convolutions in G-CNNs by means of respectively localized, sparse and non-uniform B-spline expansions. The impact and potential of our approach are studied on two benchmark datasets: cancer detection in histopathology slides in which rotation equivariance plays a key role and facial landmark localization in which scale equivariance is important. In both cases, G-CNN architectures outperform their classical 2D counterparts and the added value of atrous and localized group convolutions is studied in detail.
Submitted 22 March, 2021; v1 submitted 26 September, 2019;
originally announced September 2019.
-
Fourier Transform on the Homogeneous Space of 3D Positions and Orientations for Exact Solutions to Linear Parabolic and (Hypo-)Elliptic PDEs
Authors:
Remco Duits,
Erik J. Bekkers,
Alexey Mashtakov
Abstract:
Fokker-Planck PDEs (incl. diffusions) for stable Lévy processes (incl. Wiener processes) on the joint space of positions and orientations play a major role in mechanics, robotics, image analysis, directional statistics and probability theory. Exact analytic designs and solutions are known in the 2D case, where they have been obtained using Fourier transform on $SE(2)$. Here we extend these approaches to 3D using Fourier transform on the Lie group $SE(3)$ of rigid body motions. More precisely, we define the homogeneous space of 3D positions and orientations $\mathbb{R}^{3}\rtimes S^{2}:=SE(3)/(\{\mathbf{0}\} \times SO(2))$ as the quotient in $SE(3)$. In our construction, two group elements are equivalent if they are equal up to a rotation around the reference axis. On this quotient we design a specific Fourier transform. We apply this Fourier transform to derive new exact solutions to Fokker-Planck PDEs of $α$-stable Lévy processes on $\mathbb{R}^{3}\rtimes S^{2}$. This reduces classical analysis computations and provides an explicit algebraic spectral decomposition of the solutions. We compare the exact probability kernel for $α= 1$ (the diffusion kernel) to the kernel for $α=\frac12$ (the Poisson kernel). We set up SDEs for the Lévy processes on the quotient and derive corresponding Monte-Carlo methods. We verify that the exact probability kernels arise as the limit of the Monte-Carlo approximations.
Submitted 13 December, 2018; v1 submitted 1 November, 2018;
originally announced November 2018.
-
Roto-Translation Covariant Convolutional Networks for Medical Image Analysis
Authors:
Erik J Bekkers,
Maxime W Lafarge,
Mitko Veta,
Koen AJ Eppenhof,
Josien PW Pluim,
Remco Duits
Abstract:
We propose a framework for rotation and translation covariant deep learning using $SE(2)$ group convolutions. The group product of the special Euclidean motion group $SE(2)$ describes how a concatenation of two roto-translations results in a net roto-translation. We encode this geometric structure into convolutional neural networks (CNNs) via $SE(2)$ group convolutional layers, which fit into the standard 2D CNN framework, and which make it possible to deal generically with rotated input samples without the need for data augmentation.
We introduce three layers: a lifting layer which lifts a 2D (vector valued) image to an $SE(2)$-image, i.e., 3D (vector valued) data whose domain is $SE(2)$; a group convolution layer from and to an $SE(2)$-image; and a projection layer from an $SE(2)$-image to a 2D image. The lifting and group convolution layers are $SE(2)$ covariant (the output roto-translates with the input). The final projection layer, a maximum intensity projection over rotations, makes the full CNN rotation invariant.
We show with three different problems in histopathology, retinal imaging, and electron microscopy that with the proposed group CNNs, state-of-the-art performance can be achieved, without the need for data augmentation by rotation and with increased performance compared to standard CNNs that do rely on augmentation.
Submitted 11 June, 2018; v1 submitted 10 April, 2018;
originally announced April 2018.
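The $SE(2)$ group product mentioned in the abstract above can be written out concretely. The following is a minimal sketch under the standard convention that an element is a triple (x, y, theta) acting on a point by rotating it by theta and then translating by (x, y); the function names `compose`, `inverse` and `act` are illustrative and not taken from the paper.

```python
import math


def act(g, p):
    """Apply the roto-translation g = (x, y, theta) to the 2D point p."""
    x, y, theta = g
    c, s = math.cos(theta), math.sin(theta)
    return (x + c * p[0] - s * p[1], y + s * p[0] + c * p[1])


def compose(g, h):
    """Group product g * h: the net roto-translation of applying h, then g."""
    x1, y1, t1 = g
    x2, y2, t2 = h
    c, s = math.cos(t1), math.sin(t1)
    return (x1 + c * x2 - s * y2,
            y1 + s * x2 + c * y2,
            (t1 + t2) % (2 * math.pi))


def inverse(g):
    """Inverse element: rotate back by -theta and undo the translation."""
    x, y, theta = g
    c, s = math.cos(theta), math.sin(theta)
    return (-(c * x + s * y), s * x - c * y, (-theta) % (2 * math.pi))


if __name__ == "__main__":
    g = (1.0, 2.0, math.pi / 2)
    h = (3.0, 0.0, math.pi / 4)
    p = (0.5, -1.0)
    # Both lines describe the same transformed point, up to rounding.
    print(act(compose(g, h), p))
    print(act(g, act(h, p)))
```

The translation part of the product is rotated by the first element's angle, which is what distinguishes $SE(2)$ from a plain direct product of translations and rotations.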
-
Nilpotent Approximations of Sub-Riemannian Distances for Fast Perceptual Grouping of Blood Vessels in 2D and 3D
Authors:
Erik J. Bekkers,
Da Chen,
Jorg M. Portegies
Abstract:
We propose an efficient approach for the grouping of local orientations (points on vessels) via nilpotent approximations of sub-Riemannian distances in the 2D and 3D roto-translation groups $SE(2)$ and $SE(3)$. In our distance approximations we consider homogeneous norms on nilpotent groups that locally approximate $SE(n)$, and which are obtained via the exponential and logarithmic map on $SE(n)$. In a qualitative validation we show that the norms provide accurate approximations of the true sub-Riemannian distances, and we discuss their relations to the fundamental solution of the sub-Laplacian on $SE(n)$. The quantitative experiments further confirm the accuracy of the approximations. Quantitative results are obtained by evaluating perceptual grouping performance of retinal blood vessels in 2D images and curves in challenging 3D synthetic volumes. The results show that 1) sub-Riemannian geometry is essential in achieving top performance and 2) grouping via the fast analytic approximations performs almost as well as, or better than, data-adaptive fast marching approaches on $\mathbb{R}^n$ and $SE(n)$.
Submitted 8 November, 2017; v1 submitted 10 July, 2017;
originally announced July 2017.
-
Design and Processing of Invertible Orientation Scores of 3D Images for Enhancement of Complex Vasculature
Authors:
M. H. J. Janssen,
A. J. E. M. Janssen,
E. J. Bekkers,
J. Olivan Bescos,
R. Duits
Abstract:
The enhancement and detection of elongated structures in noisy image data is relevant for many biomedical imaging applications. To handle complex crossing structures in 2D images, 2D orientation scores $U: \mathbb{R} ^ 2\times S ^ 1 \rightarrow \mathbb{C}$ were introduced, which already showed their use in a variety of applications. Here we extend this work to 3D orientation scores $U: \mathbb{R} ^ 3 \times S ^ 2\rightarrow \mathbb{C}$. First, we construct the orientation score from a given dataset, which is achieved by an invertible coherent state type of transform. For this transformation we introduce 3D versions of the 2D cake-wavelets, which are complex wavelets that can simultaneously detect oriented structures and oriented edges. Here we introduce two types of cake-wavelets: the first uses a discrete Fourier transform, and the second is designed in the 3D generalized Zernike basis, allowing us to calculate analytical expressions for the spatial filters. Finally, we show two applications of the orientation score transformation. In the first application we propose an extension of crossing-preserving coherence enhancing diffusion via our invertible orientation scores of 3D images which we apply to real medical image data. In the second one we develop a new tubularity measure using 3D orientation scores and apply the tubularity measure to both artificial and real medical data.
Submitted 27 November, 2017; v1 submitted 7 July, 2017;
originally announced July 2017.
-
Vessel Tracking via Sub-Riemannian Geodesics on $\mathbb{R}^2 \times P^{1}$
Authors:
E. J. Bekkers,
R. Duits,
A. Mashtakov,
Yu. Sachkov
Abstract:
We study a data-driven sub-Riemannian (SR) curve optimization model for connecting local orientations in orientation lifts of images. Our model lives on the projective line bundle $\mathbb{R}^{2} \times P^{1}$, with $P^{1}=S^{1}/_{\sim}$ with identification of antipodal points. It extends previous cortical models for contour perception on $\mathbb{R}^{2} \times P^{1}$ to the data-driven case. We provide a complete (mainly numerical) analysis of the dynamics of the 1st Maxwell-set with growing radii of SR-spheres, revealing the cut-locus. Furthermore, a comparison of the cusp-surface in $\mathbb{R}^{2} \times P^{1}$ to its counterpart in $\mathbb{R}^{2} \times S^{1}$ of a previous model, reveals a general and strong reduction of cusps in spatial projections of geodesics. Numerical solutions of the model are obtained by a single wavefront propagation method relying on a simple extension of existing anisotropic fast-marching or iterative morphological scale space methods. Experiments show that the projective line bundle structure greatly reduces the presence of cusps. Another advantage of including $\mathbb{R}^2 \times P^{1}$ instead of $\mathbb{R}^{2} \times S^{1}$ in the wavefront propagation is reduction of computational time.
Submitted 13 April, 2017;
originally announced April 2017.
-
Tracking of Lines in Spherical Images via Sub-Riemannian Geodesics on SO(3)
Authors:
A. Mashtakov,
R. Duits,
Yu. Sachkov,
E. J. Bekkers,
I. Beschastnyi
Abstract:
In order to detect salient lines in spherical images, we consider the problem of minimizing the functional $\int \limits_0^l C(γ(s)) \sqrt{ξ^2 + k_g^2(s)} \, {\rm d}s$ for a curve $γ$ on a sphere with fixed boundary points and directions. The total length $l$ is free, $s$ denotes the spherical arclength, and $k_g$ denotes the geodesic curvature of $γ$. Here the smooth external cost $C\geq δ>0$ is obtained from spherical data. We lift this problem to the sub-Riemannian (SR) problem in the Lie group $SO(3)$ and show that the spherical projection of certain SR geodesics provides a solution to our curve optimization problem. In fact, this holds only for the geodesics whose spherical projection does not exhibit a cusp. The problem is a spherical extension of a well-known contour perception model, where we extend the model by Boscain and Rossi to the general case $ξ> 0$, $C \neq 1$. For $C=1$, we derive SR geodesics and evaluate the first cusp time. We show that these curves have a simpler expression when they are parameterized by spherical arclength rather than by sub-Riemannian arclength. For the case $C \neq 1$ (data-driven SR geodesics), we solve via an SR Fast Marching method. Finally, we show an experiment of vessel tracking in a spherical image of the retina and study the effect of including the spherical geometry in the analysis of vessel curvature.
Submitted 24 March, 2017; v1 submitted 13 April, 2016;
originally announced April 2016.
-
Template Matching via Densities on the Roto-Translation Group
Authors:
Erik J. Bekkers,
Marco Loog,
Bart M. ter Haar Romeny,
Remco Duits
Abstract:
We propose a template matching method for the detection of 2D image objects that are characterized by orientation patterns. Our method is based on data representations via orientation scores, which are functions on the space of positions and orientations, and which are obtained via a wavelet-type transform. This new representation allows us to detect orientation patterns in an intuitive and direct way, namely via cross-correlations. Additionally, we propose a generalized linear regression framework for the construction of suitable templates using smoothing splines. Here, it is important to recognize a curved geometry on the position-orientation domain, which we identify with the Lie group SE(2): the roto-translation group. Templates are then optimized in a B-spline basis, and smoothness is defined with respect to the curved geometry. We achieve state-of-the-art results on three different applications: detection of the optic nerve head in the retina (99.83% success rate on 1737 images), of the fovea in the retina (99.32% success rate on 1616 images), and of the pupil in regular camera images (95.86% on 1521 images). The high performance is due to inclusion of both intensity and orientation features with effective geometric priors in the template matching. Moreover, our method is fast due to a cross-correlation based matching approach.
Submitted 9 March, 2017; v1 submitted 10 March, 2016;
originally announced March 2016.
-
A PDE Approach to Data-driven Sub-Riemannian Geodesics in SE(2)
Authors:
Erik J. Bekkers,
Remco Duits,
Alexey Mashtakov,
Gonzalo R. Sanguinetti
Abstract:
We present a new flexible wavefront propagation algorithm for the boundary value problem for sub-Riemannian (SR) geodesics in the roto-translation group $SE(2) = \mathbb{R}^2 \rtimes S^1$ with a metric tensor depending on a smooth external cost $\mathcal{C}:SE(2) \to [δ,1]$, $δ>0$, computed from image data. The method consists of a first step where an SR-distance map is computed as a viscosity solution of a Hamilton-Jacobi-Bellman (HJB) system derived via Pontryagin's Maximum Principle (PMP). Subsequent backward integration, again relying on PMP, gives the SR-geodesics. For $\mathcal{C}=1$ we show that our method produces the global minimizers. Comparison with exact solutions shows a remarkable accuracy of the SR-spheres and the SR-geodesics. We present numerical computations of Maxwell points and cusp points, which we again verify for the uniform cost case $\mathcal{C}=1$. Regarding image analysis applications, tracking of elongated structures in retinal and synthetic images shows that our line tracking generically deals with crossings. We show the benefits of including the sub-Riemannian geometry.
Submitted 20 April, 2015; v1 submitted 4 March, 2015;
originally announced March 2015.