Search | arXiv e-print repository

Reparameterization invariance in approximate Bayesian inference

Authors: Hrittik Roy, Marco Miani, Carl Henrik Ek, Philipp Hennig, Marvin Pförtner, Lukas Tatzel, Søren Hauberg

Abstract: Current approximate posteriors in Bayesian neural networks (BNNs) exhibit a crucial limitation: they fail to maintain invariance under reparameterization, i.e. BNNs assign different posterior densities to different parametrizations of identical functions. This creates a fundamental flaw in the application of Bayesian principles as it breaks the correspondence between uncertainty over the parameters with uncertainty over the parametrized function. In this paper, we investigate this issue in the context of the increasingly popular linearized Laplace approximation. Specifically, it has been observed that linearized predictives alleviate the common underfitting problems of the Laplace approximation. We develop a new geometric view of reparametrizations from which we explain the success of linearization. Moreover, we demonstrate that these reparameterization invariance properties can be extended to the original neural network predictive using a Riemannian diffusion process giving a straightforward algorithm for approximate posterior sampling, which empirically improves posterior fit.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2405.17277 [pdf, other]

Gradients of Functions of Large Matrices

Authors: Nicholas Krämer, Pablo Moreno-Muñoz, Hrittik Roy, Søren Hauberg

Abstract: Tuning scientific and probabilistic machine learning models -- for example, partial differential equations, Gaussian processes, or Bayesian neural networks -- often relies on evaluating functions of matrices whose size grows with the data set or the number of parameters. While the state-of-the-art for evaluating these quantities is almost always based on Lanczos and Arnoldi iterations, the present work is the first to explain how to differentiate these workhorses of numerical linear algebra efficiently. To get there, we derive previously unknown adjoint systems for Lanczos and Arnoldi iterations, implement them in JAX, and show that the resulting code can compete with Diffrax when it comes to differentiating PDEs, GPyTorch for selecting Gaussian process models and beats standard factorisation methods for calibrating Bayesian neural networks. All this is achieved without any problem-specific code optimisation. Find the code at https://github.com/pnkraemer/experiments-lanczos-adjoints and install the library with pip install matfree.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2310.18907 [pdf, other]

Topological, or Non-topological? A Deep Learning Based Prediction

Authors: Ashiqur Rasul, Md Shafayat Hossain, Ankan Ghosh Dastider, Himaddri Roy, M. Zahid Hasan, Quazi D. M. Khosru

Abstract: Prediction and discovery of new materials with desired properties are at the forefront of quantum science and technology research. A major bottleneck in this field is the computational resources and time complexity related to finding new materials from ab initio calculations. In this work, an effective and robust deep learning-based model is proposed by incorporating persistent homology and graph neural network which offers an accuracy of 91.4% and an F1 score of 88.5% in classifying topological vs. non-topological materials, outperforming the other state-of-the-art classifier models. The incorporation of the graph neural network encodes the underlying relation between the atoms into the model based on their own crystalline structures and thus proved to be an effective method to represent and process non-euclidean data like molecules with a relatively shallow network. The persistent homology pipeline in the suggested neural network is capable of integrating the atom-specific topological information into the deep learning model, increasing robustness, and gain in performance. It is believed that the presented work will be an efficacious tool for predicting the topological class and therefore enable the high-throughput search for novel materials in this field.

Submitted 29 October, 2023; originally announced October 2023.

Comments: 13 pages, 8 figures

arXiv:2307.04719 [pdf, other]

On the curvature of the loss landscape

Authors: Alison Pouplin, Hrittik Roy, Sidak Pal Singh, Georgios Arvanitidis

Abstract: One of the main challenges in modern deep learning is to understand why such over-parameterized models perform so well when trained on finite data. A way to analyze this generalization concept is through the properties of the associated loss landscape. In this work, we consider the loss landscape as an embedded Riemannian manifold and show that the differential geometric properties of the manifold can be used when analyzing the generalization abilities of a deep net. In particular, we focus on the scalar curvature, which can be computed analytically for our manifold, and show connections to several settings that potentially imply generalization.

Submitted 10 July, 2023; originally announced July 2023.

Comments: 12 pages, 5 figures, preliminary work

arXiv:2201.04593 [pdf]

doi 10.3390/mti6080067

Ability-Based Methods for Personalized Keyboard Generation

Authors: Claire L. Mitchell, Gabriel J. Cler, Susan K. Fager, Paola Contessa, Serge H. Roy, Gianluca De Luca, Joshua C. Kline, Jennifer M. Vojtech

Abstract: This study introduces an ability-based method for personalized keyboard generation, wherein an individual's own movement and human-computer interaction data are used to automatically compute a personalized virtual keyboard layout. Our approach integrates a multidirectional point-select task to characterize cursor control over time, distance, and direction. The characterization is automatically employed to develop a computationally efficient keyboard layout that prioritizes each user's movement abilities through capturing directional constraints and preferences. We evaluated our approach in a study involving 16 participants using inertial sensing and facial electromyography as an access method, resulting in significantly increased communication rates using the personalized keyboard (52.0 bits/min) when compared to a generically optimized keyboard (47.9 bits/min). Our results demonstrate the ability to effectively characterize an individual's movement abilities to design a personalized keyboard for improved communication. This work underscores the importance of integrating a user's motor abilities when designing virtual interfaces.

Submitted 3 August, 2022; v1 submitted 12 January, 2022; originally announced January 2022.

Comments: 20 pages, 7 figures

Journal ref: Multimodal Technol. Interact. 2022, 6, 67

arXiv:2012.01832 [pdf, other]

doi 10.1117/1.JEI.30.2.023016

Image inpainting using frequency domain priors

Authors: Hiya Roy, Subhajit Chaudhury, Toshihiko Yamasaki, Tatsuaki Hashimoto

Abstract: In this paper, we present a novel image inpainting technique using frequency domain information. Prior works on image inpainting predict the missing pixels by training neural networks using only the spatial domain information. However, these methods still struggle to reconstruct high-frequency details for real complex scenes, leading to a discrepancy in color, boundary artifacts, distorted patterns, and blurry textures. To alleviate these problems, we investigate if it is possible to obtain better performance by training the networks using frequency domain information (Discrete Fourier Transform) along with the spatial domain information. To this end, we propose a frequency-based deconvolution module that enables the network to learn the global context while selectively reconstructing the high-frequency components. We evaluate our proposed method on the publicly available datasets CelebA, Paris Streetview, and DTD texture dataset, and show that our method outperforms current state-of-the-art image inpainting techniques both qualitatively and quantitatively.

Submitted 3 December, 2020; originally announced December 2020.

arXiv:1905.05307 [pdf, other]

Current Mode Neuron for the Memristor based synapse

Authors: Harshit Roy, Mrigank Sharad

Abstract: Due to many limitations of Von Neumann architecture such as speed, memory bandwidth, efficiency of global interconnects and increase in the application of artificial neural network, researchers have been pushed to look into alternative architectures such as Neuromorphic computing system. Memristors (memristive crossbar memory RCM) are used as synapses due to its high packing density and energy efficiency and CMOS blocks as neurons. The increase in the terminal resistance of the RCM can degrade its energy efficiency and bandwidth. A more energy efficient current mode neuron has been proposed in this paper which can operate at lower voltages as compared to conventional voltage mode neuron circuit.

Submitted 13 May, 2019; originally announced May 2019.

arXiv:1904.06683 [pdf]

Lunar surface image restoration using U-net based deep neural networks

Authors: Hiya Roy, Subhajit Chaudhury, Toshihiko Yamasaki, Danielle DeLatte, Makiko Ohtake, Tatsuaki Hashimoto

Abstract: Image restoration is a technique that reconstructs a feasible estimate of the original image from the noisy observation. In this paper, we present a U-Net based deep neural network model to restore the missing pixels on the lunar surface image in a context-aware fashion, which is often known as image inpainting problem. We use the grayscale image of the lunar surface captured by Multiband Imager (MI) onboard Kaguya satellite for our experiments and the results show that our method can reconstruct the lunar surface image with good visual quality and improved PSNR values.

Submitted 14 April, 2019; originally announced April 2019.

arXiv:1611.04481 [pdf, other]

Can fully convolutional networks perform well for general image restoration problems?

Authors: Subhajit Chaudhury, Hiya Roy

Abstract: We present a fully convolutional network (FCN) based approach for color image restoration. FCNs have recently shown remarkable performance for high-level vision problem like semantic segmentation. In this paper, we investigate if FCN models can show promising performance for low-level problems like image restoration as well. We propose a fully convolutional model, that learns a direct end-to-end mapping between the corrupted images as input and the desired clean images as output. Our proposed method takes inspiration from domain transformation techniques but presents a data-driven task specific approach where filters for novel basis projection, task dependent coefficient alterations, and image reconstruction are represented as convolutional networks. Experimental results show that our FCN model outperforms traditional sparse coding based methods and demonstrates competitive performance compared to the state-of-the-art methods for image denoising. We further show that our proposed model can solve the difficult problem of blind image inpainting and can produce reconstructed images of impressive visual quality.

Submitted 13 April, 2017; v1 submitted 14 November, 2016; originally announced November 2016.

Comments: Accepted at IAPR MVA 2017

Showing 1–9 of 9 results for author: Roy, H