-
On a new class of tests for the Pareto distribution using Fourier methods
Authors:
L. Ndwandwe,
J. S. Allison,
M. Smuts,
I. J. H. Visagie
Abstract:
We propose new classes of tests for the Pareto type I distribution using the empirical characteristic function. These tests are $U$ and $V$ statistics based on a characterisation of the Pareto distribution involving the distribution of the sample minimum. In addition to deriving simple computational forms for the proposed test statistics, we prove consistency against a wide range of fixed alternatives. A Monte Carlo study is included in which the newly proposed tests are shown to produce high powers. These powers include results relating to fixed alternatives as well as local powers against mixture distributions. The use of the proposed tests is illustrated using an observed data set.
Submitted 23 January, 2023;
originally announced January 2023.
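The tests in the entry above are built on the empirical characteristic function. As a minimal sketch of that ingredient only (not the authors' test statistic, whose exact form is not given in the abstract), the function below evaluates the sample average of exp(i t X_j). The name `ecf` is hypothetical.

```python
import cmath


def ecf(sample, t):
    """Empirical characteristic function: the average of exp(i * t * x_j)."""
    return sum(cmath.exp(1j * t * x) for x in sample) / len(sample)


if __name__ == "__main__":
    data = [1.2, 3.4, 5.6]
    # At t = 0 every term equals 1, so the ECF equals 1.
    print(ecf(data, 0.0))
    # For any real t the modulus is at most 1.
    print(abs(ecf(data, 0.7)))
```

Tests of this kind typically compare such a function, or a transformation of it, with its counterpart under the null hypothesis, integrated against a weight function over t.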
-
Testing for the Pareto type I distribution: A comparative study
Authors:
L. Ndwandwe,
J. S. Allison,
L. Santana,
I. J. H. Visagie
Abstract:
Pareto distributions are widely used models in economics, finance and actuarial sciences. As a result, a number of goodness-of-fit tests have been proposed for these distributions in the literature. We provide an overview of the existing tests for the Pareto distribution, focussing specifically on the Pareto type I distribution. To date, only a single overview paper on goodness-of-fit testing for Pareto distributions has been published. However, that paper has a much wider scope than the current one, as it covers multiple types of Pareto distributions. The current paper differs in a number of respects. First, the narrower focus on the Pareto type I distribution allows a larger number of tests to be included. Second, the current paper is concerned with composite hypotheses rather than the simple hypotheses (specifying the parameters of the Pareto distribution in question) considered in the mentioned overview. Third, the sample sizes considered in the two papers differ substantially.
In addition, we consider two different methods of fitting the Pareto type I distribution: the method of maximum likelihood and a method closely related to moment matching. It is demonstrated that the method of estimation has a profound effect, not only on the powers achieved by the various tests, but also on the way in which numerical critical values are calculated. We show that, when using maximum likelihood, the resulting critical values are shape invariant and can be obtained using a Monte Carlo procedure. This is not the case when moment matching is employed.
The paper includes an extensive Monte Carlo power study. Based on the results obtained, we recommend the use of a test based on the phi divergence together with maximum likelihood estimation.
Submitted 18 November, 2022;
originally announced November 2022.
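The shape invariance mentioned in the entry above can be illustrated with a small sketch. It assumes the Pareto type I distribution function F(x) = 1 - x^(-beta) for x >= 1, for which the maximum likelihood estimate of beta is n / sum(log x_j). The Kolmogorov-Smirnov statistic is used purely as a stand-in; it is not the test recommended in the paper, and all function names are hypothetical.

```python
import math
import random


def pareto_mle(sample):
    """MLE of the shape parameter for F(x) = 1 - x**(-beta), x >= 1."""
    return len(sample) / sum(math.log(x) for x in sample)


def ks_statistic(sample):
    """Kolmogorov-Smirnov distance between the EDF and the fitted Pareto CDF."""
    n = len(sample)
    beta_hat = pareto_mle(sample)
    fitted = sorted(1.0 - x ** (-beta_hat) for x in sample)
    return max(max((j + 1) / n - u, u - j / n) for j, u in enumerate(fitted))


def mc_critical_value(n, level=0.05, reps=2000, beta=1.0, seed=1):
    """Monte Carlo estimate of the upper `level` quantile under the null."""
    rng = random.Random(seed)
    stats = sorted(
        ks_statistic([rng.paretovariate(beta) for _ in range(n)])
        for _ in range(reps)
    )
    return stats[math.ceil((1.0 - level) * reps) - 1]


if __name__ == "__main__":
    # With the same seed, the two values should agree up to rounding error.
    print(mc_critical_value(20, beta=1.0))
    print(mc_critical_value(20, beta=5.0))
```

Under this parameterisation the fitted values x_j^(-beta_hat) depend on the data only through the ratios log x_j / sum(log x_k), whose joint distribution does not involve beta. This is why simulating with any convenient shape value, such as beta = 1, suffices in this sketch.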
-
A proposed simulation technique for population stability testing in credit risk scorecards
Authors:
J. du Pisanie,
J. S. Allison,
I. J. H. Visagie
Abstract:
Credit risk scorecards are logistic regression models, fitted to large and complex data sets, employed by the financial industry to model the probability of default of a potential customer. In order to ensure that a scorecard remains a representative model of the population, one tests the hypothesis of population stability, which specifies that the distribution of clients' attributes remains constant over time. Simulating realistic data sets for this purpose is nontrivial as these data sets are multivariate and contain intricate dependencies. The simulation of these data sets is of practical interest to both practitioners and researchers; practitioners may wish to consider the effect that a specified change in the properties of the data has on the scorecard and its usefulness from a business perspective, while researchers may wish to test a newly developed technique in credit scoring.
We propose a simulation technique based on the specification of bad ratios, a concept explained below. Practitioners can generally not be expected to provide realistic parameter values for a scorecard; these models are simply too complex and contain too many parameters to make such a specification viable. However, practitioners can often confidently specify the bad ratio associated with two different levels of a specific attribute. That is, practitioners are often comfortable with making statements such as "on average a new customer is 1.5 times as likely to default as an existing customer with similar attributes". We propose a method which can be used to obtain parameter values for a scorecard based on specified bad ratios. The proposed technique is demonstrated using a realistic example and we show that the simulated data sets adhere closely to the specified bad ratios. The paper provides a link to a GitHub project containing the R code used to generate the results shown.
Submitted 22 June, 2022;
originally announced June 2022.
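To make the notion of a bad ratio in the entry above concrete, the sketch below treats the simplest possible case: a logistic model with a single binary attribute. It is an illustration under that assumption only, not the method proposed in the paper (which handles many attributes jointly), and the function names and the reference default probability of 0.04 are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def coefficients_from_bad_ratio(p_ref, bad_ratio):
    """Intercept and slope of a one-attribute logistic model.

    p_ref is the default probability at the reference level (attribute = 0);
    the other level (attribute = 1) has default probability bad_ratio * p_ref.
    """
    p_other = bad_ratio * p_ref
    if not (0.0 < p_ref < 1.0 and 0.0 < p_other < 1.0):
        raise ValueError("both default probabilities must lie in (0, 1)")
    intercept = logit(p_ref)
    slope = logit(p_other) - intercept
    return intercept, slope


def default_probability(intercept, slope, attribute):
    """Default probability implied by the logistic model."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * attribute)))


if __name__ == "__main__":
    # "A new customer is 1.5 times as likely to default as an existing one."
    a, b = coefficients_from_bad_ratio(p_ref=0.04, bad_ratio=1.5)
    ratio = default_probability(a, b, 1) / default_probability(a, b, 0)
    print(ratio)  # recovers the specified bad ratio up to rounding error
```

Note that the bad ratio is a ratio of probabilities, whereas the logistic coefficient is a difference of log-odds, so the mapping between them depends on the reference default probability.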
-
Logistic or not logistic?
Authors:
James S. Allison,
Bruno Ebner,
Marius Smuts
Abstract:
We propose a new class of goodness-of-fit tests for the logistic distribution based on a characterisation related to the density approach in the context of Stein's method. This characterisation-based test is the first of its kind for the logistic distribution. The asymptotic null distribution of the test statistic is derived and it is shown that the test is consistent against fixed alternatives. The finite sample power performance of the newly proposed class of tests is compared to various existing tests by means of a Monte Carlo study. It is found that this new class of tests is especially powerful when the alternative distributions are heavy-tailed, like Student's t and Cauchy, or for skew alternatives such as the log-normal, gamma and chi-square distributions.
Submitted 16 August, 2021;
originally announced August 2021.
-
Kaplan-Meier based tests for exponentiality in the presence of censoring
Authors:
E. Bothma,
J. S. Allison,
M. Cockeran,
I. J. H. Visagie
Abstract:
In this paper we test the composite hypothesis that lifetimes follow an exponential distribution based on observed randomly right censored data. Testing this hypothesis is complicated by the presence of this censoring, due to the fact that not all lifetimes are observed. To account for this complication, we propose modifications to tests based on the empirical characteristic function and Laplace transform. In the full sample case these empirical functions can be expressed as integrals with respect to the empirical distribution function of the lifetimes. We propose replacing this estimate of the distribution function by the Kaplan-Meier estimate. The resulting test statistics can be expressed in easily calculable forms in terms of summations of functionals of the observed data. Additionally, a general framework for goodness-of-fit testing, in the presence of random right censoring, is outlined. A Monte Carlo study is performed, the results of which indicate that the newly modified tests generally outperform the existing tests. A practical application, concerning initial remission times of leukemia patients, is discussed along with some concluding remarks and avenues for future research.
Submitted 9 November, 2020;
originally announced November 2020.
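The idea in the entry above of replacing the empirical distribution function with the Kaplan-Meier estimate can be sketched as follows. The code computes the jumps of the Kaplan-Meier estimate of the distribution function and uses them as weights in an empirical characteristic function. It is a minimal sketch, not the authors' test statistics; the names `kaplan_meier_jumps` and `km_ecf` are hypothetical, and ties are handled by the usual convention that events precede censored observations.

```python
import cmath


def kaplan_meier_jumps(times, events):
    """Return (time, jump) pairs of the Kaplan-Meier distribution estimate.

    events[j] is 1 if times[j] is an observed lifetime and 0 if censored.
    """
    n = len(times)
    # Sort by time; at tied times, observed events come before censored ones.
    order = sorted(range(n), key=lambda j: (times[j], -events[j]))
    survival = 1.0
    jumps = []
    for rank, j in enumerate(order):
        at_risk = n - rank
        jump = survival / at_risk if events[j] else 0.0
        survival -= jump
        jumps.append((times[j], jump))
    return jumps


def km_ecf(times, events, t):
    """Characteristic function of the Kaplan-Meier estimate at the point t."""
    return sum(w * cmath.exp(1j * t * x)
               for x, w in kaplan_meier_jumps(times, events))


if __name__ == "__main__":
    # Second observation censored: its mass is passed on to the later event.
    print(kaplan_meier_jumps([1.0, 2.0, 3.0], [1, 0, 1]))
```

Without censoring every jump equals 1/n, so `km_ecf` reduces to the ordinary empirical characteristic function. If the largest observation is censored, the jumps sum to less than one, which any practical implementation has to account for.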
-
New weighted $L^2$-type tests for the inverse Gaussian distribution
Authors:
J. S. Allison,
S. Betsch,
B. Ebner,
I. J. H. Visagie
Abstract:
We propose a new class of goodness-of-fit tests for the inverse Gaussian distribution. The proposed tests are weighted $L^2$-type tests depending on a tuning parameter. We develop the asymptotic theory under the null hypothesis and under a broad class of alternative distributions. These results are used to show that the parametric bootstrap procedure, which we employ to implement the test, is asymptotically valid and that the whole test procedure is consistent. A comparative simulation study for finite sample sizes shows that the new procedure is competitive with classical and recent tests, outperforming these other methods almost uniformly over a large set of alternative distributions. The use of the newly proposed test is illustrated with two observed data sets.
Submitted 30 October, 2019;
originally announced October 2019.