PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Selecting Hidden Markov Model State Number with Cross-Validated Likelihood
Gilles Celeux and Jean-Baptiste Durand
Computational Statistics, Volume 23, pp. 541-554, 2008.

Abstract

The problem of estimating the number of hidden states in a hidden Markov chain model is considered. Emphasis is placed on cross-validated likelihood criteria. Using cross-validation to assess the number of hidden states makes it possible to circumvent the well-documented technical difficulties of the order identification problem in mixture models. Moreover, from a predictive perspective, it does not require that the sampling distribution belong to one of the competing models. However, computing cross-validated likelihood for hidden Markov chains is difficult because the data are not independent. Two approaches are proposed to compute cross-validated likelihood for a hidden Markov chain. The first uses a deterministic half-sampling procedure; the second adapts the EM algorithm for hidden Markov chains to take into account the randomly missing values induced by cross-validation. Numerical experiments on both simulated and real data sets compare different versions of the cross-validated likelihood criterion with penalised likelihood criteria, including BIC and a penalised marginal likelihood criterion. These numerical experiments highlight the promising behaviour of the deterministic half-sampling criterion.
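The abstract does not give the details of either procedure, so the following is only a minimal sketch of one ingredient both approaches plausibly share: evaluating an HMM likelihood on part of a sequence while leaving the hidden chain intact. It assumes discrete emissions, and the names `hmm_loglik` and `half_sample_masks` are hypothetical, not taken from the paper. Held-out positions are marked `None` and treated as missing, so the chain still advances through them; the odd/even split is one possible reading of "deterministic half-sampling".

```python
import math


def hmm_loglik(obs, init, trans, emit):
    """Log-likelihood of a discrete-emission HMM via the scaled forward pass.

    Entries of `obs` equal to None are treated as missing: their emission
    factor is 1, so the hidden chain still advances one step through them.
    """
    n_states = len(init)
    alpha = list(init)  # current distribution over hidden states
    loglik = 0.0
    for t, y in enumerate(obs):
        if t > 0:
            # Propagate the state distribution through the transition matrix.
            alpha = [sum(alpha[i] * trans[i][j] for i in range(n_states))
                     for j in range(n_states)]
        if y is not None:
            # Weight by the emission probability of the observed symbol,
            # accumulate the log normaliser, and renormalise.
            alpha = [alpha[j] * emit[j][y] for j in range(n_states)]
            norm = sum(alpha)
            loglik += math.log(norm)
            alpha = [a / norm for a in alpha]
    return loglik


def half_sample_masks(obs):
    """Split a sequence into even- and odd-position halves of equal length.

    Assumption: positions outside a half are masked with None rather than
    dropped, so the time index of each kept observation is preserved.
    """
    even = [y if t % 2 == 0 else None for t, y in enumerate(obs)]
    odd = [y if t % 2 == 1 else None for t, y in enumerate(obs)]
    return even, odd
```

Under these assumptions, a cross-validated score for a candidate number of states could be obtained by estimating the parameters on one half (for instance with an EM variant that handles the masked positions, not shown here) and calling `hmm_loglik` on the other half.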

EPrint Type: Article
Subjects: Learning/Statistics & Optimisation
ID Code: 4904
Deposited By: Gilles Celeux
Deposited On: 24 March 2009