Search | arXiv e-print repository

Scaling physics-informed hard constraints with mixture-of-experts

Authors: Nithin Chalapathi, Yiheng Du, Aditi Krishnapriyan

Abstract: Imposing known physical constraints, such as conservation laws, during neural network training introduces an inductive bias that can improve accuracy, reliability, convergence, and data efficiency for modeling physical dynamics. While such constraints can be softly imposed via loss function penalties, recent advancements in differentiable physics and optimization improve performance by incorporati… ▽ More Imposing known physical constraints, such as conservation laws, during neural network training introduces an inductive bias that can improve accuracy, reliability, convergence, and data efficiency for modeling physical dynamics. While such constraints can be softly imposed via loss function penalties, recent advancements in differentiable physics and optimization improve performance by incorporating PDE-constrained optimization as individual layers in neural networks. This enables a stricter adherence to physical constraints. However, imposing hard constraints significantly increases computational and memory costs, especially for complex dynamical systems. This is because it requires solving an optimization problem over a large number of points in a mesh, representing spatial and temporal discretizations, which greatly increases the complexity of the constraint. To address this challenge, we develop a scalable approach to enforce hard physical constraints using Mixture-of-Experts (MoE), which can be used with any neural network architecture. Our approach imposes the constraint over smaller decomposed domains, each of which is solved by an "expert" through differentiable optimization. During training, each expert independently performs a localized backpropagation step by leveraging the implicit function theorem; the independence of each expert allows for parallelization across multiple GPUs. Compared to standard differentiable optimization, our scalable approach achieves greater accuracy in the neural PDE solver setting for predicting the dynamics of challenging non-linear systems. We also improve training stability and require significantly less computation time during both training and inference stages. △ Less

Submitted 20 February, 2024; originally announced February 2024.

Comments: Accepted to the International Conference on Learning Representations (ICLR) 2024

arXiv:2312.05225 [pdf, other]

Neural Spectral Methods: Self-supervised learning in the spectral domain

Authors: Yiheng Du, Nithin Chalapathi, Aditi Krishnapriyan

Abstract: We present Neural Spectral Methods, a technique to solve parametric Partial Differential Equations (PDEs), grounded in classical spectral methods. Our method uses orthogonal bases to learn PDE solutions as map**s between spectral coefficients. In contrast to current machine learning approaches which enforce PDE constraints by minimizing the numerical quadrature of the residuals in the spatiotemp… ▽ More We present Neural Spectral Methods, a technique to solve parametric Partial Differential Equations (PDEs), grounded in classical spectral methods. Our method uses orthogonal bases to learn PDE solutions as map**s between spectral coefficients. In contrast to current machine learning approaches which enforce PDE constraints by minimizing the numerical quadrature of the residuals in the spatiotemporal domain, we leverage Parseval's identity and introduce a new training strategy through a \textit{spectral loss}. Our spectral loss enables more efficient differentiation through the neural network, and substantially reduces training complexity. At inference time, the computational cost of our method remains constant, regardless of the spatiotemporal resolution of the domain. Our experimental results demonstrate that our method significantly outperforms previous machine learning approaches in terms of speed and accuracy by one to two orders of magnitude on multiple different problems. When compared to numerical solvers of the same accuracy, our method demonstrates a $10\times$ increase in performance speed. △ Less

Submitted 18 January, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

Comments: Accepted to International Conference on Learning Representations (ICLR) 2024

arXiv:2011.03209 [pdf, other]

Mapper Interactive: A Scalable, Extendable, and Interactive Toolbox for the Visual Exploration of High-Dimensional Data

Authors: Youjia Zhou, Nithin Chalapathi, Archit Rathore, Yaodong Zhao, Bei Wang

Abstract: The mapper algorithm is a popular tool from topological data analysis for extracting topological summaries of high-dimensional datasets. In this paper, we present Mapper Interactive, a web-based framework for the interactive analysis and visualization of high-dimensional point cloud data. It implements the mapper algorithm in an interactive, scalable, and easily extendable way, thus supporting pra… ▽ More The mapper algorithm is a popular tool from topological data analysis for extracting topological summaries of high-dimensional datasets. In this paper, we present Mapper Interactive, a web-based framework for the interactive analysis and visualization of high-dimensional point cloud data. It implements the mapper algorithm in an interactive, scalable, and easily extendable way, thus supporting practical data analysis. In particular, its command-line API can compute mapper graphs for 1 million points of 256 dimensions in about 3 minutes (4 times faster than the vanilla implementation). Its visual interface allows on-the-fly computation and manipulation of the mapper graph based on user-specified parameters and supports the addition of new analysis modules with a few lines of code. Mapper Interactive makes the mapper algorithm accessible to nonspecialists and accelerates topological analytics workflows. △ Less

Submitted 27 April, 2021; v1 submitted 6 November, 2020; originally announced November 2020.

arXiv:1912.06332 [pdf, other]

TopoAct: Visually Exploring the Shape of Activations in Deep Learning

Authors: Archit Rathore, Nithin Chalapathi, Sourabh Palande, Bei Wang

Abstract: Deep neural networks such as GoogLeNet, ResNet, and BERT have achieved impressive performance in tasks such as image and text classification. To understand how such performance is achieved, we probe a trained deep neural network by studying neuron activations, i.e., combinations of neuron firings, at various layers of the network in response to a particular input. With a large number of inputs, we… ▽ More Deep neural networks such as GoogLeNet, ResNet, and BERT have achieved impressive performance in tasks such as image and text classification. To understand how such performance is achieved, we probe a trained deep neural network by studying neuron activations, i.e., combinations of neuron firings, at various layers of the network in response to a particular input. With a large number of inputs, we aim to obtain a global view of what neurons detect by studying their activations. In particular, we develop visualizations that show the shape of the activation space, the organizational principle behind neuron activations, and the relationships of these activations within a layer. Applying tools from topological data analysis, we present TopoAct, a visual exploration system to study topological summaries of activation vectors. We present exploration scenarios using TopoAct that provide valuable insights into learned representations of neural networks. We expect TopoAct to give a topological perspective that enriches the current toolbox of neural network analysis, and to provide a basis for network architecture diagnosis and data anomaly detection. △ Less

Submitted 12 April, 2021; v1 submitted 13 December, 2019; originally announced December 2019.

Showing 1–4 of 4 results for author: Chalapathi, N