Search | arXiv e-print repository

On Fixing the Right Problems in Predictive Analytics: AUC Is Not the Problem

Authors: Ryan S. Baker, Nigel Bosch, Stephen Hutt, Andres F. Zambrano, Alex J. Bowers

Abstract: Recently, ACM FAccT published an article by Kwegyir-Aggrey and colleagues (2023), critiquing the use of AUC ROC in predictive analytics in several domains. In this article, we offer a critique of that article. Specifically, we highlight technical inaccuracies in that paper's comparison of metrics, mis-specification of the interpretation and goals of AUC ROC, the article's use of the accuracy metric as a gold standard for comparison to AUC ROC, and the article's application of critiques solely to AUC ROC for concerns that would apply to the use of any metric. We conclude with a re-framing of the very valid concerns raised in that article, and discuss how the use of AUC ROC can remain a valid and appropriate practice in a well-informed predictive analytics approach taking those concerns into account. We conclude by discussing the combined use of multiple metrics, including machine learning bias metrics, and AUC ROC's place in such an approach. Like broccoli, AUC ROC is healthy, but also like broccoli, researchers and practitioners in our field shouldn't eat a diet of only AUC ROC.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2402.12231

Diffusion Tempering Improves Parameter Estimation with Probabilistic Integrators for Ordinary Differential Equations

Authors: Jonas Beck, Nathanael Bosch, Michael Deistler, Kyra L. Kadhim, Jakob H. Macke, Philipp Hennig, Philipp Berens

Abstract: Ordinary differential equations (ODEs) are widely used to describe dynamical systems in science, but identifying parameters that explain experimental measurements is challenging. In particular, although ODEs are differentiable and would allow for gradient-based parameter optimization, the nonlinear dynamics of ODEs often lead to many local minima and extreme sensitivity to initial conditions. We therefore propose diffusion tempering, a novel regularization technique for probabilistic numerical methods which improves convergence of gradient-based parameter optimization in ODEs. By iteratively reducing a noise parameter of the probabilistic integrator, the proposed method converges more reliably to the true parameters. We demonstrate that our method is effective for dynamical systems of different complexity and show that it obtains reliable parameter estimates for a Hodgkin-Huxley model with a practically relevant number of parameters.

Submitted 19 February, 2024; originally announced February 2024.

arXiv:2401.17731

Probabilistic ODE Solvers for Integration Error-Aware Numerical Optimal Control

Authors: Amon Lahr, Filip Tronarp, Nathanael Bosch, Jonathan Schmidt, Philipp Hennig, Melanie N. Zeilinger

Abstract: Appropriate time discretization is crucial for real-time applications of numerical optimal control, such as nonlinear model predictive control. However, if the discretization error strongly depends on the applied control input, meeting accuracy and sampling time requirements simultaneously can be challenging using classical discretization methods. In particular, neither fixed-grid nor adaptive-grid discretizations may be suitable, when they suffer from large integration error or exceed the prescribed sampling time, respectively. In this work, we take a first step at closing this gap by utilizing probabilistic numerical integrators to approximate the solution of the initial value problem, as well as the computational uncertainty associated with it, inside the optimal control problem (OCP). By taking the viewpoint of probabilistic numerics and propagating the numerical uncertainty in the cost, the OCP is reformulated such that the optimal input reduces the computational uncertainty insofar as it is beneficial for the control objective. The proposed approach is illustrated using a numerical example, and potential benefits and limitations are discussed.

Submitted 13 June, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

Comments: to be published in the 6th Annual Learning for Dynamics & Control Conference (L4DC 2024)

MSC Class: 49M25 ACM Class: G.1.7

arXiv:2310.01145

Parallel-in-Time Probabilistic Numerical ODE Solvers

Authors: Nathanael Bosch, Adrien Corenflos, Fatemeh Yaghoobi, Filip Tronarp, Philipp Hennig, Simo Särkkä

Abstract: Probabilistic numerical solvers for ordinary differential equations (ODEs) treat the numerical simulation of dynamical systems as problems of Bayesian state estimation. Aside from producing posterior distributions over ODE solutions and thereby quantifying the numerical approximation error of the method itself, one less-often noted advantage of this formalism is the algorithmic flexibility gained by formulating numerical simulation in the framework of Bayesian filtering and smoothing. In this paper, we leverage this flexibility and build on the time-parallel formulation of iterated extended Kalman smoothers to formulate a parallel-in-time probabilistic numerical ODE solver. Instead of simulating the dynamical system sequentially in time, as done by current probabilistic solvers, the proposed method processes all time steps in parallel and thereby reduces the span cost from linear to logarithmic in the number of time steps. We demonstrate the effectiveness of our approach on a variety of ODEs and compare it to a range of both classic and probabilistic numerical ODE solvers.

Submitted 2 October, 2023; originally announced October 2023.

arXiv:2305.14978

Probabilistic Exponential Integrators

Authors: Nathanael Bosch, Philipp Hennig, Filip Tronarp

Abstract: Probabilistic solvers provide a flexible and efficient framework for simulation, uncertainty quantification, and inference in dynamical systems. However, like standard solvers, they suffer performance penalties for certain stiff systems, where small steps are required not for reasons of numerical accuracy but for the sake of stability. This issue is greatly alleviated in semi-linear problems by the probabilistic exponential integrators developed in this paper. By including the fast, linear dynamics in the prior, we arrive at a class of probabilistic integrators with favorable properties. Namely, they are proven to be L-stable, and in a certain case reduce to a classic exponential integrator, with the added benefit of providing a probabilistic account of the numerical error. The method is also generalized to arbitrary non-linear systems by imposing piece-wise semi-linearity on the prior via Jacobians of the vector field at the previous estimates, resulting in probabilistic exponential Rosenbrock methods. We evaluate the proposed methods on multiple stiff differential equations and demonstrate their improved stability and efficiency over established probabilistic solvers. The present contribution thus expands the range of problems that can be effectively tackled within probabilistic numerics.

Submitted 19 December, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

arXiv:2202.01287

Fenrir: Physics-Enhanced Regression for Initial Value Problems

Authors: Filip Tronarp, Nathanael Bosch, Philipp Hennig

Abstract: We show how probabilistic numerics can be used to convert an initial value problem into a Gauss-Markov process parametrised by the dynamics of the initial value problem. Consequently, the often difficult problem of parameter estimation in ordinary differential equations is reduced to hyperparameter estimation in Gauss-Markov regression, which tends to be considerably easier. The method's relation and benefits in comparison to classical numerical integration and gradient matching approaches is elucidated. In particular, the method can, in contrast to gradient matching, handle partial observations, and has certain routes for escaping local optima not available to classical numerical integration. Experimental results demonstrate that the method is on par or moderately better than competing approaches.

Submitted 24 May, 2023; v1 submitted 2 February, 2022; originally announced February 2022.

arXiv:2112.02100

ProbNum: Probabilistic Numerics in Python

Authors: Jonathan Wenger, Nicholas Krämer, Marvin Pförtner, Jonathan Schmidt, Nathanael Bosch, Nina Effenberger, Johannes Zenn, Alexandra Gessner, Toni Karvonen, François-Xavier Briol, Maren Mahsereci, Philipp Hennig

Abstract: Probabilistic numerical methods (PNMs) solve numerical problems via probabilistic inference. They have been developed for linear algebra, optimization, integration and differential equation simulation. PNMs naturally incorporate prior information about a problem and quantify uncertainty due to finite computational resources as well as stochastic input. In this paper, we present ProbNum: a Python library providing state-of-the-art probabilistic numerical solvers. ProbNum enables custom composition of PNMs for specific problem classes via a modular design as well as wrappers for off-the-shelf use. Tutorials, documentation, developer guides and benchmarks are available online at www.probnum.org.

Submitted 3 December, 2021; originally announced December 2021.

arXiv:2110.11812

Probabilistic ODE Solutions in Millions of Dimensions

Authors: Nicholas Krämer, Nathanael Bosch, Jonathan Schmidt, Philipp Hennig

Abstract: Probabilistic solvers for ordinary differential equations (ODEs) have emerged as an efficient framework for uncertainty quantification and inference on dynamical systems. In this work, we explain the mathematical assumptions and detailed implementation schemes behind solving high-dimensional ODEs with a probabilistic numerical algorithm. This has not been possible before due to matrix-matrix operations in each solver step, but is crucial for scientifically relevant problems, most importantly the solution of discretised partial differential equations. In a nutshell, efficient high-dimensional probabilistic ODE solutions build either on independence assumptions or on Kronecker structure in the prior model. We evaluate the resulting efficiency on a range of problems, including the probabilistic numerical simulation of a differential equation with millions of dimensions.

Submitted 22 October, 2021; originally announced October 2021.

arXiv:2110.10770

Pick-and-Mix Information Operators for Probabilistic ODE Solvers

Authors: Nathanael Bosch, Filip Tronarp, Philipp Hennig

Abstract: Probabilistic numerical solvers for ordinary differential equations compute posterior distributions over the solution of an initial value problem via Bayesian inference. In this paper, we leverage their probabilistic formulation to seamlessly include additional information as general likelihood terms. We show that second-order differential equations should be directly provided to the solver, instead of transforming the problem to first order. Additionally, by including higher-order information or physical conservation laws in the model, solutions become more accurate and more physically meaningful. Lastly, we demonstrate the utility of flexible information operators by solving differential-algebraic equations. In conclusion, the probabilistic formulation of numerical solvers offers a flexible way to incorporate various types of information, thus improving the resulting solutions.

Submitted 20 October, 2021; originally announced October 2021.

Comments: 13 pages, 7 figures

arXiv:2012.08202

Calibrated Adaptive Probabilistic ODE Solvers

Authors: Nathanael Bosch, Philipp Hennig, Filip Tronarp

Abstract: Probabilistic solvers for ordinary differential equations assign a posterior measure to the solution of an initial value problem. The joint covariance of this distribution provides an estimate of the (global) approximation error. The contraction rate of this error estimate as a function of the solver's step size identifies it as a well-calibrated worst-case error, but its explicit numerical value for a certain step size is not automatically a good estimate of the explicit error. Addressing this issue, we introduce, discuss, and assess several probabilistically motivated ways to calibrate the uncertainty estimate. Numerical experiments demonstrate that these calibration methods interact efficiently with adaptive step-size selection, resulting in descriptive, and efficiently computable posteriors. We demonstrate the efficiency of the methodology by benchmarking against the classic, widely used Dormand-Prince 4/5 Runge-Kutta method.

Submitted 22 February, 2021; v1 submitted 15 December, 2020; originally announced December 2020.

Comments: 17 pages, 10 figures

arXiv:2005.03770

Planning from Images with Deep Latent Gaussian Process Dynamics

Authors: Nathanael Bosch, Jan Achterhold, Laura Leal-Taixé, Jörg Stückler

Abstract: Planning is a powerful approach to control problems with known environment dynamics. In unknown environments the agent needs to learn a model of the system dynamics to make planning applicable. This is particularly challenging when the underlying states are only indirectly observable through images. We propose to learn a deep latent Gaussian process dynamics (DLGPD) model that learns low-dimensional system dynamics from environment interactions with visual observations. The method infers latent state representations from observations using neural networks and models the system dynamics in the learned latent space with Gaussian processes. All parts of the model can be trained jointly by optimizing a lower bound on the likelihood of transitions in image space. We evaluate the proposed approach on the pendulum swing-up task while using the learned dynamics model for planning in latent space in order to solve the control problem. We also demonstrate that our method can quickly adapt a trained agent to changes in the system dynamics from just a few rollouts. We compare our approach to a state-of-the-art purely deep learning based method and demonstrate the advantages of combining Gaussian processes with deep learning for data efficiency and transfer learning.

Submitted 7 May, 2020; originally announced May 2020.

Comments: Accepted for publication at the 2nd Annual Conference on Learning for Dynamics and Control (L4DC) 2020, with supplementary material. First two authors contributed equally

arXiv:2001.10794

Software Logging for Machine Learning

Authors: Nathan Bosch, Jan Bosch

Abstract: System logs perform a critical function in software-intensive systems as logs record the state of the system and significant events in the system at important points in time. Unfortunately, log entries are typically created in an ad-hoc, unstructured and uncoordinated fashion, limiting their usefulness for analytics and machine learning. In this paper, we present the main challenges of contemporary approaches to generating and storing system logs data for large, complex, software-intensive systems based on an in-depth case study at a world-leading telecommunications company. Second, we present a systematic and structured approach for generating log data that does not suffer from the aforementioned challenges and is optimized for use in machine learning. Third, we provide validation of the approach based on expert interviews that confirms that the approach addresses the identified challenges and problems.

Submitted 24 January, 2020; originally announced January 2020.

arXiv:1909.13304

doi: 10.1016/j.compedu.2021.104190

Identifying supportive student factors for mindset interventions: A two-model machine learning approach

Authors: Nigel Bosch

Abstract: Growth mindset interventions foster students' beliefs that their abilities can grow through effort and appropriate strategies. However, not every student benefits from such interventions - yet research identifying which student factors support growth mindset interventions is sparse. In this study, we utilized machine learning methods to predict growth mindset effectiveness in a nationwide experiment in the U.S. with over 10,000 students. These methods enable analysis of arbitrarily-complex interactions between combinations of student-level predictor variables and intervention outcome, defined as the improvement in grade point average (GPA) during the transition to high school. We utilized two separate machine learning models: one to control for complex relationships between 51 student-level predictors and GPA, and one to predict the change in GPA due to the intervention. We analyzed the trained models to discover which features influenced model predictions most, finding that prior academic achievement, blocked navigations (attempting to navigate through the intervention software too quickly), self-reported reasons for learning, and race/ethnicity were the most important predictors in the model for predicting intervention effectiveness. As in previous research, we found that the intervention was most effective for students with prior low academic achievement. Unique to this study, we found that blocked navigations predicted an intervention effect as low as 0.185 GPA points (on a 0-4 scale) less than the mean. This was a notable negative prediction given that the mean intervention effect in our sample was just 0.026 GPA points, though few students (4.4%) experienced a substantial number of blocked navigation events. We also found that some minoritized students were predicted to benefit less (or even not at all) from the intervention.

Submitted 26 March, 2021; v1 submitted 29 September, 2019; originally announced September 2019.

Comments: 28 pages, 4 figures, 3 tables

Showing 1–13 of 13 results for author: Bosch, N