-
arXiv:2307.13124
Conformal prediction for frequency-severity modeling
Abstract: We present a nonparametric model-agnostic framework for building prediction intervals of insurance claims, with finite sample statistical guarantees, extending the technique of split conformal prediction to the domain of two-stage frequency-severity modeling. The effectiveness of the framework is showcased with simulated and real datasets. When the underlying severity model is a random forest, we…
Submitted 27 July, 2023; v1 submitted 24 July, 2023; originally announced July 2023.
-
On the universal distribution of the coverage in split conformal prediction
Abstract: Two additional universal properties are established in the split conformal prediction framework. In a regression setting with exchangeable data, we determine the exact distribution of the coverage of prediction sets for a finite horizon of future observables, as well as the exact distribution of its almost sure limit. The results hold for finite training and calibration samples, and both distribut…
Submitted 5 March, 2023; originally announced March 2023.
Comments: 13 pages, 2 figures
-
arXiv:2112.06101
Confidence intervals for the random forest generalization error
Abstract: We show that the byproducts of the standard training process of a random forest yield not only the well known and almost computationally free out-of-bag point estimate of the model generalization error, but also give a direct path to compute confidence intervals for the generalization error which avoids processes of data splitting and model retraining. Besides the low computational cost involved i…
Submitted 11 March, 2022; v1 submitted 11 December, 2021; originally announced December 2021.
Comments: 10 pages
-
Learning a latent pattern of heterogeneity in the innovation rates of a time series of counts
Abstract: We develop a Bayesian hierarchical semiparametric model for phenomena related to time series of counts. The main feature of the model is its capability to learn a latent pattern of heterogeneity in the distribution of the process innovation rates, which are softly clustered through time with the help of a Dirichlet process placed at the top of the model hierarchy. The probabilistic forecasting cap…
Submitted 6 July, 2019; originally announced July 2019.
-
arXiv:1312.2291
Predictive analysis of microarray data
Abstract: Microarray gene expression data are analyzed by means of a Bayesian nonparametric model, with emphasis on prediction of future observables, yielding a method for selection of differentially expressed genes and a classifier.
Submitted 10 June, 2014; v1 submitted 8 December, 2013; originally announced December 2013.
-
arXiv:1306.1170
On the computation of the marginal likelihood
Abstract: We describe briefly in this note a procedure for consistently estimating the marginal likelihood of a statistical model through a sample from the posterior distribution of the model parameters.
Submitted 10 June, 2014; v1 submitted 3 June, 2013; originally announced June 2013.
-
arXiv:1209.4947
Bayesian Analysis of Simple Random Densities
Abstract: A tractable nonparametric prior over densities is introduced which is closed under sampling and exhibits proper posterior asymptotics.
Submitted 10 June, 2014; v1 submitted 21 September, 2012; originally announced September 2012.
Comments: 19 pages; 6 figures