-
Multivector Neurons: Better and Faster O(n)-Equivariant Clifford Graph Neural Networks
Authors:
Cong Liu,
David Ruhe,
Patrick Forré
Abstract:
Most current deep learning models equivariant to $O(n)$ or $SO(n)$ either consider mostly scalar information such as distances and angles or have a very high computational complexity. In this work, we test a few novel message passing graph neural networks (GNNs) based on Clifford multivectors, structured similarly to other prevalent equivariant models in geometric deep learning. Our approach leverages efficient invariant scalar features while simultaneously performing expressive learning on multivector representations, particularly through the use of the equivariant geometric product operator. By integrating these elements, our methods outperform established efficient baseline models on an N-body simulation task and a protein denoising task while maintaining high efficiency. In particular, we push the state-of-the-art error on the N-body dataset to 0.0035 (averaged over 3 runs), an 8% improvement over recent methods. Our implementation is available on GitHub.
Submitted 10 July, 2024; v1 submitted 6 June, 2024;
originally announced June 2024.
-
Clifford-Steerable Convolutional Neural Networks
Authors:
Maksim Zhdanov,
David Ruhe,
Maurice Weiler,
Ana Lucic,
Johannes Brandstetter,
Patrick Forré
Abstract:
We present Clifford-Steerable Convolutional Neural Networks (CS-CNNs), a novel class of $\mathrm{E}(p, q)$-equivariant CNNs. CS-CNNs process multivector fields on pseudo-Euclidean spaces $\mathbb{R}^{p,q}$. They cover, for instance, $\mathrm{E}(3)$-equivariance on $\mathbb{R}^3$ and Poincaré-equivariance on Minkowski spacetime $\mathbb{R}^{1,3}$. Our approach is based on an implicit parametrization of $\mathrm{O}(p,q)$-steerable kernels via Clifford group equivariant neural networks. We significantly and consistently outperform baseline methods on fluid dynamics as well as relativistic electrodynamics forecasting tasks.
Submitted 6 July, 2024; v1 submitted 22 February, 2024;
originally announced February 2024.
-
Clifford Group Equivariant Simplicial Message Passing Networks
Authors:
Cong Liu,
David Ruhe,
Floor Eijkelboom,
Patrick Forré
Abstract:
We introduce Clifford Group Equivariant Simplicial Message Passing Networks, a method for steerable E(n)-equivariant message passing on simplicial complexes. Our method integrates the expressivity of Clifford group-equivariant layers with simplicial message passing, which is topologically more intricate than regular graph message passing. Clifford algebras include higher-order objects such as bivectors and trivectors, which express geometric features (e.g., areas, volumes) derived from vectors. Using this knowledge, we represent simplex features through geometric products of their vertices. To achieve efficient simplicial message passing, we share the parameters of the message network across different dimensions. Additionally, we restrict the final message to an aggregation of the incoming messages from different dimensions, leading to what we term shared simplicial message passing. Experimental results show that our method is able to outperform both equivariant and simplicial graph neural networks on a variety of geometric tasks.
Submitted 12 March, 2024; v1 submitted 15 February, 2024;
originally announced February 2024.
-
Rolling Diffusion Models
Authors:
David Ruhe,
Jonathan Heek,
Tim Salimans,
Emiel Hoogeboom
Abstract:
Diffusion models have recently been increasingly applied to temporal data such as video, fluid mechanics simulations, or climate data. These methods generally treat subsequent frames equally regarding the amount of noise in the diffusion process. This paper explores Rolling Diffusion: a new approach that uses a sliding window denoising process. It ensures that the diffusion process progressively corrupts through time by assigning more noise to frames that appear later in a sequence, reflecting greater uncertainty about the future as the generation process unfolds. Empirically, we show that when the temporal dynamics are complex, Rolling Diffusion is superior to standard diffusion. In particular, this result is demonstrated in a video prediction task using the Kinetics-600 video dataset and in a chaotic fluid dynamics forecasting experiment.
Submitted 6 June, 2024; v1 submitted 12 February, 2024;
originally announced February 2024.
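The sliding-window idea in the Rolling Diffusion abstract above can be illustrated with a small sketch. The linear per-frame time `(k + t) / window` and the cosine noise schedule are assumptions chosen for illustration and are not necessarily the parametrization used in the paper; `local_times` and `noise_window` are hypothetical names.

```python
import numpy as np

def local_times(t, window):
    """Per-frame diffusion times for a window of `window` frames.

    `t` in [0, 1] is the global position within one rolling step.
    Frame k gets time (k + t) / window, so later frames are noisier.
    (Assumed linear form, for illustration only.)
    """
    k = np.arange(window)
    return np.clip((k + t) / window, 0.0, 1.0)

def noise_window(frames, t, rng):
    """Corrupt frames of shape (window, ...) with frame-dependent noise."""
    tk = local_times(t, frames.shape[0])
    # Assumed variance-preserving cosine schedule: alpha^2 + sigma^2 = 1.
    alpha = np.cos(0.5 * np.pi * tk)
    sigma = np.sin(0.5 * np.pi * tk)
    # Broadcast the per-frame coefficients over the remaining axes.
    shape = (-1,) + (1,) * (frames.ndim - 1)
    eps = rng.standard_normal(frames.shape)
    return alpha.reshape(shape) * frames + sigma.reshape(shape) * eps

rng = np.random.default_rng(0)
frames = np.ones((4, 2, 2))          # toy "video": 4 frames of 2x2 pixels
noisy = noise_window(frames, t=0.0, rng=rng)
```

Under this assumed schedule, the times of frames 1, 2, ... at `t = 0` equal the times of frames 0, 1, ... at `t = 1`, so once the first frame is fully denoised the window can shift forward by one frame without any discontinuity in the noise levels.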
-
A candidate coherent radio flash following a neutron star merger
Authors:
A. Rowlinson,
I. de Ruiter,
R. L. C. Starling,
K. M. Rajwade,
A. Hennessy,
R. A. M. J. Wijers,
G. E. Anderson,
M. Mevius,
D. Ruhe,
K. Gourdji,
A. J. van der Horst,
S. ter Veen,
K. Wiersema
Abstract:
In this paper, we present rapid follow-up observations of the short GRB 201006A, consistent with being a compact binary merger, using the LOw Frequency ARray (LOFAR). We have detected a candidate 5.6$\sigma$, short, coherent radio flash at 144 MHz at 76.6 minutes post-GRB with a 3$\sigma$ duration of 38 seconds. This radio flash is 27 arcsec offset from the GRB location, which has a probability of occurring by chance of $\sim$0.05% (3.8$\sigma$) when accounting for measurement uncertainties. Despite the offset, we show that the probability of finding an unrelated transient within 40 arcsec of the GRB location is $<10^{-6}$ and conclude that this is a candidate radio counterpart to GRB 201006A. We performed image plane dedispersion and the radio flash is tentatively (2.4$\sigma$) shown to be highly dispersed, allowing a distance estimate, corresponding to a redshift of $0.58\pm0.06$. The corresponding luminosity of the event at this distance is $6.7^{+6.6}_{-4.4} \times 10^{32}$ erg s$^{-1}$ Hz$^{-1}$. If associated with GRB 201006A, this emission would indicate prolonged activity from the central engine that is consistent with being a newborn, supramassive, likely highly magnetised, millisecond spin neutron star (a magnetar).
Submitted 28 May, 2024; v1 submitted 7 December, 2023;
originally announced December 2023.
-
Transient study using LoTSS -- framework development and preliminary results
Authors:
Iris de Ruiter,
Zachary S. Meyers,
Antonia Rowlinson,
Timothy W. Shimwell,
David Ruhe,
Ralph A. M. J. Wijers
Abstract:
We present a search for transient radio sources on time-scales of seconds to hours at 144 MHz using the LOFAR Two-metre Sky Survey (LoTSS). This search is conducted by examining short time-scale images derived from the LoTSS data. To allow imaging of LoTSS on short time-scales, a novel imaging and filtering strategy is introduced. This includes sky model source subtraction, no cleaning or primary beam correction, a simple source finder, fast filtering schemes and source catalogue matching. This new strategy is first tested by injecting simulated transients, with a range of flux densities and durations, into the data. We find the limiting sensitivity to be 113 and 6 mJy for 8 second and 1 hour transients respectively. The new imaging and filtering strategies are applied to 58 fields of the LoTSS survey, corresponding to LoTSS-DR1 (2% of the survey). One transient source is identified in the 8 second and 2 minute snapshot images. The source shows a flare of one minute duration in the 8 hour observation. Our method puts the most sensitive constraints on/estimates of the transient surface density at low frequencies at time-scales of seconds to hours: $<4.0\cdot 10^{-4} \; \text{deg}^{-2}$ at 1 hour at a sensitivity of 6.3 mJy; $5.7\cdot 10^{-7} \; \text{deg}^{-2}$ at 2 minutes at a sensitivity of 30 mJy; and $3.6\cdot 10^{-8} \; \text{deg}^{-2}$ at 8 seconds at a sensitivity of 113 mJy. In the future, we plan to apply the strategies presented in this paper to all LoTSS data.
Submitted 14 November, 2023; v1 submitted 13 November, 2023;
originally announced November 2023.
-
On the Effectiveness of Hybrid Mutual Information Estimation
Authors:
Marco Federici,
David Ruhe,
Patrick Forré
Abstract:
Estimating the mutual information from samples from a joint distribution is a challenging problem in both science and engineering. In this work, we realize a variational bound that generalizes both discriminative and generative approaches. Using this bound, we propose a hybrid method to mitigate their respective shortcomings. Further, we propose Predictive Quantization (PQ): a simple generative method that can be easily combined with discriminative estimators for minimal computational overhead. Our propositions yield a tighter bound on the information thanks to the reduced variance of the estimator. We test our methods on a challenging task of correlated high-dimensional Gaussian distributions and a stochastic process involving a system of free particles subjected to a fixed energy landscape. Empirical results show that hybrid methods consistently improved mutual information estimates when compared to the corresponding discriminative counterpart.
Submitted 2 June, 2023; v1 submitted 1 June, 2023;
originally announced June 2023.
-
Clifford Group Equivariant Neural Networks
Authors:
David Ruhe,
Johannes Brandstetter,
Patrick Forré
Abstract:
We introduce Clifford Group Equivariant Neural Networks: a novel approach for constructing $\mathrm{O}(n)$- and $\mathrm{E}(n)$-equivariant models. We identify and study the $\textit{Clifford group}$, a subgroup inside the Clifford algebra tailored to achieve several favorable properties. Primarily, the group's action forms an orthogonal automorphism that extends beyond the typical vector space to the entire Clifford algebra while respecting the multivector grading. This leads to several non-equivalent subrepresentations corresponding to the multivector decomposition. Furthermore, we prove that the action respects not just the vector space structure of the Clifford algebra but also its multiplicative structure, i.e., the geometric product. These findings imply that every polynomial in multivectors, including their grade projections, constitutes an equivariant map with respect to the Clifford group, allowing us to parameterize equivariant neural network layers. An advantage worth mentioning is that we obtain expressive layers that can elegantly generalize to inner-product spaces of any dimension. We demonstrate, notably from a single core implementation, state-of-the-art performance on several distinct tasks, including a three-dimensional $n$-body experiment, a four-dimensional Lorentz-equivariant high-energy physics experiment, and a five-dimensional convex hull experiment.
Submitted 22 October, 2023; v1 submitted 18 May, 2023;
originally announced May 2023.
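The claim in the abstract above that the group action respects the geometric product can be checked numerically in a small case. The sketch below is not the authors' implementation: it assumes the three-dimensional Euclidean algebra Cl(3,0), represents it with the Pauli matrices (so the matrix product plays the role of the geometric product), and tests a single reflection, which is a generator of O(3). The names `embed` and `scalar_part` are hypothetical.

```python
import numpy as np

# The Pauli matrices give a faithful matrix representation of Cl(3,0):
# the matrix product of embedded vectors is their geometric product.
E1 = np.array([[0, 1], [1, 0]], dtype=complex)
E2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
E3 = np.array([[1, 0], [0, -1]], dtype=complex)
BASIS = (E1, E2, E3)

def embed(v):
    """Map a 3-vector to its Clifford-algebra element (a 2x2 matrix)."""
    return sum(c * e for c, e in zip(v, BASIS))

def scalar_part(m):
    """Grade-0 part of a multivector in this representation."""
    return 0.5 * np.trace(m).real

rng = np.random.default_rng(0)
a, b, n = rng.standard_normal((3, 3))
n /= np.linalg.norm(n)               # unit normal of the mirror plane

# Reflection in the plane orthogonal to n, acting on ordinary vectors.
H = np.eye(3) - 2.0 * np.outer(n, n)

# Transform the inputs first, then take the geometric product ...
lhs = embed(H @ a) @ embed(H @ b)
# ... versus taking the product first, then applying the induced action
# n (ab) n to the even-grade element ab.
N = embed(n)
rhs = N @ (embed(a) @ embed(b)) @ N
```

Both orders agree up to floating-point error, and the scalar part of the product, which equals the inner product of `a` and `b`, is unchanged by the reflection.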
-
Geometric Clifford Algebra Networks
Authors:
David Ruhe,
Jayesh K. Gupta,
Steven de Keninck,
Max Welling,
Johannes Brandstetter
Abstract:
We propose Geometric Clifford Algebra Networks (GCANs) for modeling dynamical systems. GCANs are based on symmetry group transformations using geometric (Clifford) algebras. We first review the quintessence of modern (plane-based) geometric algebra, which builds on isometries encoded as elements of the $\mathrm{Pin}(p,q,r)$ group. We then propose the concept of group action layers, which linearly combine object transformations using pre-specified group actions. Together with a new activation and normalization scheme, these layers serve as adjustable $\textit{geometric templates}$ that can be refined via gradient descent. Theoretical advantages are strongly reflected in the modeling of three-dimensional rigid body transformations as well as large-scale fluid dynamics simulations, showing significantly improved performance over traditional methods.
Submitted 29 May, 2023; v1 submitted 13 February, 2023;
originally announced February 2023.
-
Normalizing Flows for Hierarchical Bayesian Analysis: A Gravitational Wave Population Study
Authors:
David Ruhe,
Kaze Wong,
Miles Cranmer,
Patrick Forré
Abstract:
We propose parameterizing the population distribution of the gravitational wave population modeling framework (Hierarchical Bayesian Analysis) with a normalizing flow. We first demonstrate the merit of this method on illustrative experiments and then analyze four parameters of the latest LIGO/Virgo data release: primary mass, secondary mass, redshift, and effective spin. Our results show that despite the small and notoriously noisy dataset, the posterior predictive distributions (assuming a prior over the parameters of the flow) of the observed gravitational wave population recover structure that agrees with robust previous phenomenological modeling results while being less susceptible to biases introduced by less flexible models. Therefore, the method forms a promising flexible, reliable replacement for population inference distributions, even when data is highly noisy.
Submitted 29 December, 2022; v1 submitted 15 November, 2022;
originally announced November 2022.
-
Self-Supervised Inference in State-Space Models
Authors:
David Ruhe,
Patrick Forré
Abstract:
We perform approximate inference in state-space models with nonlinear state transitions. Without parameterizing a generative model, we apply Bayesian update formulas using a local linearity approximation parameterized by neural networks. This comes accompanied by a maximum likelihood objective that requires no supervision via uncorrupted observations or ground truth latent states. The optimization backpropagates through a recursion similar to the classical Kalman filter and smoother. Additionally, using an approximate conditional independence, we can perform smoothing without having to parameterize a separate model. In scientific applications, domain knowledge can give a linear approximation of the latent transition maps, which we can easily incorporate into our model. Usage of such domain knowledge is reflected in excellent results (despite our model's simplicity) on the chaotic Lorenz system compared to fully supervised and variational inference methods. Finally, we show competitive results on an audio denoising experiment.
Submitted 25 January, 2022; v1 submitted 28 July, 2021;
originally announced July 2021.
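The abstract above refers to a recursion similar to the classical Kalman filter. For reference, one predict/update step of the textbook linear-Gaussian Kalman filter is sketched below. This is the standard recursion only, not the paper's neural parameterization or training objective, and `kalman_step` is a hypothetical name.

```python
import numpy as np

def kalman_step(mean, cov, obs, F, Q, H, R):
    """One predict/update step of the standard Kalman filter.

    mean, cov : posterior over the latent state at the previous step
    obs       : current observation
    F, Q      : linear transition matrix and transition noise covariance
    H, R      : linear emission matrix and observation noise covariance
    """
    # Predict: push the previous posterior through the linear transition.
    mean_pred = F @ mean
    cov_pred = F @ cov @ F.T + Q
    # Update: condition the prediction on the new observation.
    S = H @ cov_pred @ H.T + R                # innovation covariance
    K = cov_pred @ H.T @ np.linalg.inv(S)     # Kalman gain
    mean_new = mean_pred + K @ (obs - H @ mean_pred)
    cov_new = (np.eye(len(mean)) - K @ H) @ cov_pred
    return mean_new, cov_new

# One-dimensional toy example: unit prior variance, unit observation noise.
I1 = np.eye(1)
m, P = kalman_step(np.zeros(1), I1, np.array([2.0]),
                   F=I1, Q=np.zeros((1, 1)), H=I1, R=I1)
```

In the toy example the prior and the observation are equally uncertain, so the updated mean lies halfway between them and the variance is halved.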
-
Detecting Dispersed Radio Transients in Real Time Using Convolutional Neural Networks
Authors:
David Ruhe,
Mark Kuiack,
Antonia Rowlinson,
Ralph Wijers,
Patrick Forré
Abstract:
We present a methodology for automated real-time analysis of a radio image data stream with the goal of finding transient sources. Contrary to previous works, the transients we are interested in occur on a time-scale where dispersion starts to play a role, so we must search a higher-dimensional data space and yet work fast enough to keep up with the data stream in real time. The approach consists of five main steps: quality control, source detection, association, flux measurement, and physical parameter inference. We present parallelized methods based on convolutions and filters that can be accelerated on a GPU, allowing the pipeline to run in real time. In the parameter inference step, we apply a convolutional neural network to dynamic spectra that were obtained from the preceding steps. It infers physical parameters, including the dispersion measure of the transient candidate. Based on critical values of these parameters, an alert can be sent out and data will be saved for further investigation. Experimentally, the pipeline is applied to simulated data and images from AARTFAAC (Amsterdam-ASTRON Radio Transients Facility And Analysis Centre), a transients facility based on the Low-Frequency Array (LOFAR). Results on simulated data show the efficacy of the pipeline, and on real data the pipeline discovered dispersed pulses. The current work targets transients on time scales that are longer than the fast transients of beam-formed search, but shorter than slow transients in which dispersion matters less. This fills a methodological gap that is relevant for the upcoming Square-Kilometer Array (SKA). Additionally, since real-time analysis can be performed, only data with promising detections can be saved to disk, providing a solution to the big-data problem that modern astronomy is dealing with.
Submitted 6 August, 2021; v1 submitted 29 March, 2021;
originally announced March 2021.
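The dispersion measure mentioned in the abstract above determines how much later a pulse arrives at low radio frequencies than at high ones. A minimal sketch of the standard cold-plasma delay formula follows; it is background illustration rather than part of the described pipeline, and `dispersion_delay` is a hypothetical name.

```python
# Dispersion constant in s GHz^2 cm^3 / pc (about 4.15 ms per unit DM at 1 GHz).
K_DM_S = 4.148808e-3

def dispersion_delay(dm, freq_ghz, ref_freq_ghz=float("inf")):
    """Arrival delay in seconds at `freq_ghz` relative to `ref_freq_ghz`.

    dm is the dispersion measure in pc cm^-3. The default reference is
    infinite frequency, where the dispersive delay vanishes.
    """
    return K_DM_S * dm * (freq_ghz ** -2 - ref_freq_ghz ** -2)

# Delay at 144 MHz for an assumed example DM of 100 pc cm^-3.
delay_144 = dispersion_delay(dm=100.0, freq_ghz=0.144)
```

Because the delay scales with the inverse square of frequency, it reaches tens of seconds at LOFAR frequencies for moderate dispersion measures, which is why dispersion matters on the time-scales targeted here.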
-
Bayesian Modelling in Practice: Using Uncertainty to Improve Trustworthiness in Medical Applications
Authors:
David Ruhe,
Giovanni Cinà,
Michele Tonutti,
Daan de Bruin,
Paul Elbers
Abstract:
The Intensive Care Unit (ICU) is a hospital department where machine learning has the potential to provide valuable assistance in clinical decision making. Classical machine learning models usually only provide point-estimates and no uncertainty of predictions. In practice, uncertain predictions should be presented to doctors with extra care in order to prevent potentially catastrophic treatment decisions. In this work we show how Bayesian modelling and the predictive uncertainty that it provides can be used to mitigate risk of misguided prediction and to detect out-of-domain examples in a medical setting. We derive analytically a bound on the prediction loss with respect to predictive uncertainty. The bound shows that uncertainty can mitigate loss. Furthermore, we apply a Bayesian Neural Network to the MIMIC-III dataset, predicting risk of mortality of ICU patients. Our empirical results show that uncertainty can indeed prevent potential errors and reliably identifies out-of-domain patients. These results suggest that Bayesian predictive uncertainty can greatly improve trustworthiness of machine learning models in high-risk settings such as the ICU.
Submitted 20 June, 2019;
originally announced June 2019.