-
Multicalibration for Confidence Scoring in LLMs
Authors:
Gianluca Detommaso,
Martin Bertran,
Riccardo Fogliato,
Aaron Roth
Abstract:
This paper proposes the use of "multicalibration" to yield interpretable and reliable confidence scores for outputs generated by large language models (LLMs). Multicalibration asks for calibration not just marginally, but simultaneously across various intersecting grou**s of the data. We show how to form grou**s for prompt/completion pairs that are correlated with the probability of correctnes…
▽ More
This paper proposes the use of "multicalibration" to yield interpretable and reliable confidence scores for outputs generated by large language models (LLMs). Multicalibration asks for calibration not just marginally, but simultaneously across various intersecting grou**s of the data. We show how to form grou**s for prompt/completion pairs that are correlated with the probability of correctness via two techniques: clustering within an embedding space, and "self-annotation" - querying the LLM by asking it various yes-or-no questions about the prompt. We also develop novel variants of multicalibration algorithms that offer performance improvements by reducing their tendency to overfit. Through systematic benchmarking across various question answering datasets and LLMs, we show how our techniques can yield confidence scores that provide substantial improvements in fine-grained measures of both calibration and accuracy compared to existing methods.
△ Less
Submitted 6 April, 2024;
originally announced April 2024.
-
Fortuna: A Library for Uncertainty Quantification in Deep Learning
Authors:
Gianluca Detommaso,
Alberto Gasparin,
Michele Donini,
Matthias Seeger,
Andrew Gordon Wilson,
Cedric Archambeau
Abstract:
We present Fortuna, an open-source library for uncertainty quantification in deep learning. Fortuna supports a range of calibration techniques, such as conformal prediction that can be applied to any trained neural network to generate reliable uncertainty estimates, and scalable Bayesian inference methods that can be applied to Flax-based deep neural networks trained from scratch for improved unce…
▽ More
We present Fortuna, an open-source library for uncertainty quantification in deep learning. Fortuna supports a range of calibration techniques, such as conformal prediction that can be applied to any trained neural network to generate reliable uncertainty estimates, and scalable Bayesian inference methods that can be applied to Flax-based deep neural networks trained from scratch for improved uncertainty quantification and accuracy. By providing a coherent framework for advanced uncertainty quantification methods, Fortuna simplifies the process of benchmarking and helps practitioners build robust AI systems.
△ Less
Submitted 8 February, 2023;
originally announced February 2023.
-
Uncertainty Calibration in Bayesian Neural Networks via Distance-Aware Priors
Authors:
Gianluca Detommaso,
Alberto Gasparin,
Andrew Wilson,
Cedric Archambeau
Abstract:
As we move away from the data, the predictive uncertainty should increase, since a great variety of explanations are consistent with the little available information. We introduce Distance-Aware Prior (DAP) calibration, a method to correct overconfidence of Bayesian deep learning models outside of the training domain. We define DAPs as prior distributions over the model parameters that depend on t…
▽ More
As we move away from the data, the predictive uncertainty should increase, since a great variety of explanations are consistent with the little available information. We introduce Distance-Aware Prior (DAP) calibration, a method to correct overconfidence of Bayesian deep learning models outside of the training domain. We define DAPs as prior distributions over the model parameters that depend on the inputs through a measure of their distance from the training set. DAP calibration is agnostic to the posterior inference method, and it can be performed as a post-processing step. We demonstrate its effectiveness against several baselines in a variety of classification and regression problems, including benchmarks designed to test the quality of predictive distributions away from the data.
△ Less
Submitted 17 July, 2022;
originally announced July 2022.
-
Causal Bias Quantification for Continuous Treatments
Authors:
Gianluca Detommaso,
Michael Brückner,
Philip Schulz,
Victor Chernozhukov
Abstract:
We extend the definition of the marginal causal effect to the continuous treatment setting and develop a novel characterization of causal bias in the framework of structural causal models. We prove that our derived bias expression is zero if, and only if, the causal effect is identifiable via covariate adjustment. We show that under some restrictions on the structural equations, the causal bias ca…
▽ More
We extend the definition of the marginal causal effect to the continuous treatment setting and develop a novel characterization of causal bias in the framework of structural causal models. We prove that our derived bias expression is zero if, and only if, the causal effect is identifiable via covariate adjustment. We show that under some restrictions on the structural equations, the causal bias can be estimated efficiently and allows for causal regularization of predictive probabilistic models. We demonstrate the effectiveness of our method for causal bias quantification in various settings where (not) controlling for certain covariates would introduce causal bias.
△ Less
Submitted 30 January, 2022; v1 submitted 17 June, 2021;
originally announced June 2021.
-
Multilevel Dimension-Independent Likelihood-Informed MCMC for Large-Scale Inverse Problems
Authors:
Tiangang Cui,
Gianluca Detommaso,
Robert Scheichl
Abstract:
We present a non-trivial integration of dimension-independent likelihood-informed (DILI) MCMC (Cui, Law, Marzouk, 2016) and the multilevel MCMC (Dodwell et al., 2015) to explore the hierarchy of posterior distributions. This integration offers several advantages: First, DILI-MCMC employs an intrinsic likelihood-informed subspace (LIS) (Cui et al., 2014) -- which involves a number of forward and ad…
▽ More
We present a non-trivial integration of dimension-independent likelihood-informed (DILI) MCMC (Cui, Law, Marzouk, 2016) and the multilevel MCMC (Dodwell et al., 2015) to explore the hierarchy of posterior distributions. This integration offers several advantages: First, DILI-MCMC employs an intrinsic likelihood-informed subspace (LIS) (Cui et al., 2014) -- which involves a number of forward and adjoint model simulations -- to design accelerated operator-weighted proposals. By exploiting the multilevel structure of the discretised parameters and discretised forward models, we design a Rayleigh-Ritz procedure to significantly reduce the computational effort in building the LIS and operating with DILI proposals. Second, the resulting DILI-MCMC can drastically improve the sampling efficiency of MCMC at each level, and hence reduce the integration error of the multilevel algorithm for fixed CPU time. Numerical results confirm the improved computational efficiency of the multilevel DILI approach.
△ Less
Submitted 29 November, 2023; v1 submitted 28 October, 2019;
originally announced October 2019.
-
HINT: Hierarchical Invertible Neural Transport for Density Estimation and Bayesian Inference
Authors:
Jakob Kruse,
Gianluca Detommaso,
Ullrich Köthe,
Robert Scheichl
Abstract:
Many recent invertible neural architectures are based on coupling block designs where variables are divided in two subsets which serve as inputs of an easily invertible (usually affine) triangular transformation. While such a transformation is invertible, its Jacobian is very sparse and thus may lack expressiveness. This work presents a simple remedy by noting that subdivision and (affine) couplin…
▽ More
Many recent invertible neural architectures are based on coupling block designs where variables are divided in two subsets which serve as inputs of an easily invertible (usually affine) triangular transformation. While such a transformation is invertible, its Jacobian is very sparse and thus may lack expressiveness. This work presents a simple remedy by noting that subdivision and (affine) coupling can be repeated recursively within the resulting subsets, leading to an efficiently invertible block with dense, triangular Jacobian. By formulating our recursive coupling scheme via a hierarchical architecture, HINT allows sampling from a joint distribution p(y,x) and the corresponding posterior p(x|y) using a single invertible network. We evaluate our method on some standard data sets and benchmark its full power for density estimation and Bayesian inference on a novel data set of 2D shapes in Fourier parameterization, which enables consistent visualization of samples for different dimensionalities.
△ Less
Submitted 25 May, 2021; v1 submitted 25 May, 2019;
originally announced May 2019.
-
Stein Variational Online Changepoint Detection with Applications to Hawkes Processes and Neural Networks
Authors:
Gianluca Detommaso,
Hanne Hoitzing,
Tiangang Cui,
Ardavan Alamir
Abstract:
Bayesian online changepoint detection (BOCPD) (Adams & MacKay, 2007) offers a rigorous and viable way to identify changepoints in complex systems. In this work, we introduce a Stein variational online changepoint detection (SVOCD) method to provide a computationally tractable generalization of BOCPD beyond the exponential family of probability distributions. We integrate the recently developed Ste…
▽ More
Bayesian online changepoint detection (BOCPD) (Adams & MacKay, 2007) offers a rigorous and viable way to identify changepoints in complex systems. In this work, we introduce a Stein variational online changepoint detection (SVOCD) method to provide a computationally tractable generalization of BOCPD beyond the exponential family of probability distributions. We integrate the recently developed Stein variational Newton (SVN) method (Detommaso et al., 2018) and BOCPD to offer a full online Bayesian treatment for a large number of situations with significant importance in practice. We apply the resulting method to two challenging and novel applications: Hawkes processes and long short-term memory (LSTM) neural networks. In both cases, we successfully demonstrate the efficacy of our method on real data.
△ Less
Submitted 25 May, 2019; v1 submitted 23 January, 2019;
originally announced January 2019.
-
A Stein variational Newton method
Authors:
Gianluca Detommaso,
Tiangang Cui,
Alessio Spantini,
Youssef Marzouk,
Robert Scheichl
Abstract:
Stein variational gradient descent (SVGD) was recently proposed as a general purpose nonparametric variational inference algorithm [Liu & Wang, NIPS 2016]: it minimizes the Kullback-Leibler divergence between the target distribution and its approximation by implementing a form of functional gradient descent on a reproducing kernel Hilbert space. In this paper, we accelerate and generalize the SVGD…
▽ More
Stein variational gradient descent (SVGD) was recently proposed as a general purpose nonparametric variational inference algorithm [Liu & Wang, NIPS 2016]: it minimizes the Kullback-Leibler divergence between the target distribution and its approximation by implementing a form of functional gradient descent on a reproducing kernel Hilbert space. In this paper, we accelerate and generalize the SVGD algorithm by including second-order information, thereby approximating a Newton-like iteration in function space. We also show how second-order information can lead to more effective choices of kernel. We observe significant computational gains over the original SVGD algorithm in multiple test cases.
△ Less
Submitted 29 October, 2018; v1 submitted 8 June, 2018;
originally announced June 2018.
-
Continuous Level Monte Carlo and Sample-Adaptive Model Hierarchies
Authors:
Gianluca Detommaso,
Tim Dodwell,
Rob Scheichl
Abstract:
In this paper, we present a generalisation of the Multilevel Monte Carlo (MLMC) method to a setting where the level parameter is a continuous variable. This Continuous Level Monte Carlo (CLMC) estimator provides a natural framework in PDE applications to adapt the model hierarchy to each sample. In addition, it can be made unbiased with respect to the expected value of the true quantity of interes…
▽ More
In this paper, we present a generalisation of the Multilevel Monte Carlo (MLMC) method to a setting where the level parameter is a continuous variable. This Continuous Level Monte Carlo (CLMC) estimator provides a natural framework in PDE applications to adapt the model hierarchy to each sample. In addition, it can be made unbiased with respect to the expected value of the true quantity of interest provided the quantity of interest converges sufficiently fast. The practical implementation of the CLMC estimator is based on interpolating actual evaluations of the quantity of interest at a finite number of resolutions. As our new level parameter, we use the logarithm of a goal-oriented finite element error estimator for the accuracy of the quantity of interest. We prove the unbiasedness, as well as a complexity theorem that shows the same rate of complexity for CLMC as for MLMC. Finally, we provide some numerical evidence to support our theoretical results, by successfully testing CLMC on a standard PDE test problem. The numerical experiments demonstrate clear gains for sample-wise adaptive refinement strategies over uniform refinements.
△ Less
Submitted 21 February, 2018;
originally announced February 2018.