PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

A New Goodness-Of-Fit Statistical Test
Bruno Apolloni and Simone Bassis
Intelligent Decision Technologies Volume 1, pp. 1-14, 2007.


We introduce a nonparametric procedure for testing statistically whether a model fits a sample of data well. The statistic employed is the empirical cumulative distribution function (e.c.d.f.) of the measure of the blocks determined by the ordered sample. For any distribution law underlying the data, this statistic is distributed around a Beta cumulative distribution function (c.d.f.), so that the shift between the two curves is the statistic at the basis of the test. Its distribution is computed through a new bootstrap procedure from a population of free parameters of the model that are compatible with the sampled data according to the model. Closing the loop, if the model fits the data well we may expect the Beta c.d.f. to constitute a template for the block e.c.d.f.s that are compatible with the observed data. In the paper we show how to appreciate the template functionality in the case of good fit and also how to discriminate bad models. We show the potential of the test, in contrast with conventional tests, both in case studies and on a well-known benchmark for the semiparametric logistic model widely used in database analysis.
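
As a rough illustration of the statistic described in the abstract, the following Python sketch computes the e.c.d.f. of the block measures and its shift from a Beta c.d.f. It is not the authors' implementation and rests on several assumptions: the measure of a block is taken to be the probability mass that the candidate model assigns to the interval between consecutive order statistics (so m points give m + 1 blocks), the reference curve is taken to be Beta(1, m), which is the marginal law of each spacing of m uniform order statistics, and the shift is measured as the largest vertical gap between the two curves. The names block_measures and beta_shift are hypothetical, and the bootstrap procedure that the paper uses to obtain the distribution of the statistic is not reproduced here.

```python
import math
import random


def block_measures(sample, model_cdf):
    """Mass the model assigns to each block between consecutive order statistics."""
    u = sorted(model_cdf(x) for x in sample)
    edges = [0.0] + u + [1.0]
    return [b - a for a, b in zip(edges, edges[1:])]


def beta_shift(sample, model_cdf):
    """Largest vertical gap between the block e.c.d.f. and the Beta(1, m) c.d.f.

    Assumption: the reference law is Beta(1, m), whose c.d.f. is 1 - (1 - t)**m.
    """
    m = len(sample)
    blocks = sorted(block_measures(sample, model_cdf))
    n = len(blocks)  # m + 1 blocks
    shift = 0.0
    for i, t in enumerate(blocks):
        ref = 1.0 - (1.0 - t) ** m
        # Compare the reference with the e.c.d.f. just before and just after its jump at t.
        shift = max(shift, abs((i + 1) / n - ref), abs(i / n - ref))
    return shift


# Illustrative data: 500 draws from an exponential law with rate 1.
rng = random.Random(0)
sample = [rng.expovariate(1.0) for _ in range(500)]

# A well-specified model and a deliberately wrong one (rate 5 instead of 1).
shift_good = beta_shift(sample, lambda x: 1.0 - math.exp(-x))
shift_bad = beta_shift(sample, lambda x: 1.0 - math.exp(-5.0 * x))

print(f"shift under the correct model: {shift_good:.3f}")
print(f"shift under the wrong model:   {shift_bad:.3f}")
```

Under these assumptions the shift should stay small when the model matches the data and grow when it does not. Turning it into a test still requires the distribution of the shift, which the paper obtains through its bootstrap procedure.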

EPrint Type: Article
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Learning/Statistics & Optimisation
ID Code: 4461
Deposited By: Bruno Apolloni
Deposited On: 13 March 2009