Integrating Uncertainty Awareness into Conformalized Quantile Regression
Authors:
Raphael Rossellini,
Rina Foygel Barber,
Rebecca Willett
Abstract:
Conformalized Quantile Regression (CQR) is a recently proposed method for constructing prediction intervals for a response $Y$ given covariates $X$, without making distributional assumptions. However, existing constructions of CQR can be ineffective for problems where the quantile regressors perform better in certain parts of the feature space than others. The reason is that the prediction intervals of CQR do not distinguish between two forms of uncertainty: first, the variability of the conditional distribution of $Y$ given $X$ (i.e., aleatoric uncertainty), and second, our uncertainty in estimating this conditional distribution (i.e., epistemic uncertainty). This can lead to intervals that are overly narrow in regions where epistemic uncertainty is high. To address this, we propose a new variant of the CQR methodology, Uncertainty-Aware CQR (UACQR), that explicitly separates these two sources of uncertainty to adjust quantile regressors differentially across the feature space. Compared to CQR, our methods enjoy the same distribution-free theoretical coverage guarantees, while demonstrating in our experiments stronger conditional coverage properties in simulated settings and real-world data sets alike.
Submitted 12 March, 2024; v1 submitted 14 June, 2023;
originally announced June 2023.
Beyond Ensemble Averages: Leveraging Climate Model Ensembles for Subseasonal Forecasting
Authors:
Elena Orlova,
Haokun Liu,
Raphael Rossellini,
Benjamin A. Cash,
Rebecca Willett
Abstract:
Producing high-quality forecasts of key climate variables, such as temperature and precipitation, on subseasonal time scales has long been a gap in operational forecasting. This study explores an application of machine learning (ML) models as post-processing tools for subseasonal forecasting. Lagged numerical ensemble forecasts (i.e., an ensemble where the members have different initialization dates) and observational data, including relative humidity, pressure at sea level, and geopotential height, are incorporated into various ML methods to predict monthly average precipitation and two-meter temperature two weeks in advance for the continental United States. For regression, quantile regression, and tercile classification tasks, we consider using linear models, random forests, convolutional neural networks, and stacked models (a multi-model approach based on the prediction of the individual ML models). Unlike previous ML approaches that often use ensemble mean alone, we leverage information embedded in the ensemble forecasts to enhance prediction accuracy. Additionally, we investigate extreme event predictions that are crucial for planning and mitigation efforts. Considering ensemble members as a collection of spatial forecasts, we explore different approaches to using spatial information. Trade-offs between different approaches may be mitigated with model stacking. Our proposed models outperform standard baselines such as climatological forecasts and ensemble means. In addition, we investigate feature importance, trade-offs between using the full ensemble or only the ensemble mean, and different modes of accounting for spatial variability.
Submitted 3 June, 2024; v1 submitted 28 November, 2022;
originally announced November 2022.