Prequential Plug-In Codes that Achieve Optimal Redundancy Rates even if the Model is Wrong
We analyse prequential plug-in codes relative to one-parameter exponential families M. We show that if data are sampled i.i.d. from some distribution outside M, then the redundancy of any plug-in prequential code grows at a rate larger than (1/2) log n in the worst case. This means that plug-in codes, such as the Rissanen-Dawid ML code, may perform worse than other important universal codes such as the two-part MDL, Shtarkov and Bayes codes, for which the redundancy is always (1/2) log n + O(1). However, we also show that a slight modification of the ML plug-in code, ``almost'' in the model, does achieve the optimal redundancy even if the true distribution is outside M.
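
To make the notion of an ML plug-in code concrete, the following is a minimal sketch under assumptions that are not part of the abstract: M is taken to be the Gaussian location family with fixed variance 1, code lengths are measured in nats, and the first outcome is coded with an arbitrary initial estimate mu0. All function names are hypothetical. The sketch codes each outcome with the maximum likelihood estimate computed from the preceding outcomes and reports, for a single sample, the excess code length over a fixed reference element of M. It does not implement the modified code mentioned above.

```python
import math
import random


def gaussian_nll(x, mu, var=1.0):
    """Code length (negative log density, in nats) of x under N(mu, var)."""
    return 0.5 * math.log(2 * math.pi * var) + (x - mu) ** 2 / (2 * var)


def plugin_codelength(xs, mu0=0.0, var=1.0):
    """Cumulative code length of the ML plug-in code.

    Outcome i is coded with the mean of outcomes 0..i-1 (the ML estimate
    for this family); the first outcome uses the assumed start value mu0.
    """
    total, running_sum = 0.0, 0.0
    for i, x in enumerate(xs):
        mu_hat = mu0 if i == 0 else running_sum / i
        total += gaussian_nll(x, mu_hat, var)
        running_sum += x
    return total


def excess_codelength(xs, mu_star, mu0=0.0, var=1.0):
    """Plug-in code length minus that of the fixed element N(mu_star, var)."""
    reference = sum(gaussian_nll(x, mu_star, var) for x in xs)
    return plugin_codelength(xs, mu0, var) - reference


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10000
    # Data with standard deviation 2: outside the variance-1 model.
    xs = [rng.gauss(0.0, 2.0) for _ in range(n)]
    print("excess code length:", excess_codelength(xs, mu_star=0.0))
    print("(1/2) log n:       ", 0.5 * math.log(n))
```

A single run gives only one realisation of the excess code length; the redundancy discussed above is its expectation, so averaging over many independent samples would be needed for a meaningful comparison with (1/2) log n.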