Search | arXiv e-print repository

Diagonalisation SGD: Fast & Convergent SGD for Non-Differentiable Models via Reparameterisation and Smoothing

Authors: Dominik Wagner, Basim Khajwal, C. -H. Luke Ong

Abstract: It is well-known that the reparameterisation gradient estimator, which exhibits low variance in practice, is biased for non-differentiable models. This may compromise correctness of gradient-based optimisation methods such as stochastic gradient descent (SGD). We introduce a simple syntactic framework to define non-differentiable functions piecewisely and present a systematic approach to obtain sm… ▽ More It is well-known that the reparameterisation gradient estimator, which exhibits low variance in practice, is biased for non-differentiable models. This may compromise correctness of gradient-based optimisation methods such as stochastic gradient descent (SGD). We introduce a simple syntactic framework to define non-differentiable functions piecewisely and present a systematic approach to obtain smoothings for which the reparameterisation gradient estimator is unbiased. Our main contribution is a novel variant of SGD, Diagonalisation Stochastic Gradient Descent, which progressively enhances the accuracy of the smoothed approximation during optimisation, and we prove convergence to stationary points of the unsmoothed (original) objective. Our empirical evaluation reveals benefits over the state of the art: our approach is simple, fast, stable and attains orders of magnitude reduction in work-normalised variance. △ Less

Submitted 19 February, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

arXiv:2301.03415 [pdf, other]

Fast and Correct Gradient-Based Optimisation for Probabilistic Programming via Smoothing

Authors: Basim Khajwal, C. -H. Luke Ong, Dominik Wagner

Abstract: We study the foundations of variational inference, which frames posterior inference as an optimisation problem, for probabilistic programming. The dominant approach for optimisation in practice is stochastic gradient descent. In particular, a variant using the so-called reparameterisation gradient estimator exhibits fast convergence in a traditional statistics setting. Unfortunately, discontinuiti… ▽ More We study the foundations of variational inference, which frames posterior inference as an optimisation problem, for probabilistic programming. The dominant approach for optimisation in practice is stochastic gradient descent. In particular, a variant using the so-called reparameterisation gradient estimator exhibits fast convergence in a traditional statistics setting. Unfortunately, discontinuities, which are readily expressible in programming languages, can compromise the correctness of this approach. We consider a simple (higher-order, probabilistic) programming language with conditionals, and we endow our language with both a measurable and a smoothed (approximate) value semantics. We present type systems which establish technical pre-conditions. Thus we can prove stochastic gradient descent with the reparameterisation gradient estimator to be correct when applied to the smoothed problem. Besides, we can solve the original problem up to any error tolerance by choosing an accuracy coefficient suitably. Empirically we demonstrate that our approach has a similar convergence as a key competitor, but is simpler, faster, and attains orders of magnitude reduction in work-normalised variance. △ Less

Submitted 9 January, 2023; originally announced January 2023.

arXiv:2208.03419 [pdf]

Multi-view deep learning for reliable post-disaster damage classification

Authors: Asim Bashir Khajwal, Chih-Shen Cheng, Arash Noshadravan

Abstract: This study aims to enable more reliable automated post-disaster building damage classification using artificial intelligence (AI) and multi-view imagery. The current practices and research efforts in adopting AI for post-disaster damage assessment are generally (a) qualitative, lacking refined classification of building damage levels based on standard damage scales, and (b) trained based on aerial… ▽ More This study aims to enable more reliable automated post-disaster building damage classification using artificial intelligence (AI) and multi-view imagery. The current practices and research efforts in adopting AI for post-disaster damage assessment are generally (a) qualitative, lacking refined classification of building damage levels based on standard damage scales, and (b) trained based on aerial or satellite imagery with limited views, which, although indicative, are not completely descriptive of the damage scale. To enable more accurate and reliable automated quantification of damage levels, the present study proposes the use of more comprehensive visual data in the form of multiple ground and aerial views of the buildings. To have such a spatially-aware damage prediction model, a Multi-view Convolution Neural Network (MV-CNN) architecture is used that combines the information from different views of a damaged building. This spatial 3D context damage information will result in more accurate identification of damages and reliable quantification of damage levels. The proposed model is trained and validated on reconnaissance visual dataset containing expert-labeled, geotagged images of the inspected buildings following hurricane Harvey. The developed model demonstrates reasonably good accuracy in predicting the damage levels and can be used to support more informed and reliable AI-assisted disaster management practices. △ Less

Submitted 5 August, 2022; originally announced August 2022.

Comments: 13th International Workshop on Structural Health Monitoring (IWSHM 2021)

Showing 1–3 of 3 results for author: Khajwal, B