PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Suboptimality of penalties proportional to the dimension for model selection in heteroscedastic regression
Sylvain Arlot
(2008) HAL.


We consider the problem of choosing between several models in least-squares regression with heteroscedastic data. We prove that any penalization procedure is suboptimal when the penalty is proportional to the dimension of the model, at least for some typical heteroscedastic model selection problems. In particular, Mallows' $C_p$ is suboptimal in this framework, as is any ``linear'' penalty, even one that depends on both the data and their true distribution. By contrast, optimal model selection is possible in this framework with data-driven penalties such as $V$-fold or resampling penalties (Arlot, 2008a,b). Therefore, estimating the ``shape'' of the penalty from the data is useful, even at the price of a higher computational cost.
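
To make the notion of a penalty proportional to the dimension concrete, here is a minimal Python sketch of a criterion of the form "empirical risk + c * D", with Mallows' $C_p$ as the special case c = 2 sigma^2 / n. Everything specific in it is an assumption made for illustration and is not taken from the paper: the regular histogram models on [0, 1), the sine regression function, the two-level noise, the oracle value of sigma^2, and the helper names `regressogram_risk`, `select_dimension` and `mallows_constant`. It shows only the shape of such a criterion; it does not reproduce the suboptimality result.

```python
import numpy as np

def regressogram_risk(x, y, d):
    """Empirical least-squares risk of the histogram estimator on d regular bins of [0, 1)."""
    bins = np.minimum((x * d).astype(int), d - 1)
    sums = np.bincount(bins, weights=y, minlength=d)
    counts = np.bincount(bins, minlength=d)
    # Bin means; empty bins keep the value 0 and are never used for prediction here.
    means = np.divide(sums, counts, out=np.zeros(d), where=counts > 0)
    return np.mean((y - means[bins]) ** 2)

def select_dimension(x, y, dims, c):
    """Return the dimension D minimizing empirical risk + c * D (a 'linear' penalty)."""
    crit = [regressogram_risk(x, y, d) + c * d for d in dims]
    return dims[int(np.argmin(crit))]

def mallows_constant(sigma2, n):
    """Slope of the Mallows' C_p penalty, pen(D) = 2 * sigma2 * D / n."""
    return 2.0 * sigma2 / n

# Hypothetical heteroscedastic data: the noise level differs on the two halves of [0, 1).
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0.0, 1.0, n)
noise_sd = np.where(x < 0.5, 1.0, 0.1)
y = np.sin(2 * np.pi * x) + noise_sd * rng.normal(size=n)

dims = list(range(1, 51))
sigma2 = np.mean(noise_sd ** 2)  # oracle average variance, unknown in practice
d_hat = select_dimension(x, y, dims, mallows_constant(sigma2, n))
print("dimension selected by Mallows' C_p:", d_hat)
```

Whatever the constant c, a criterion of this form charges every model of dimension D the same amount, regardless of where in the design space its parameters sit. The data-driven penalties mentioned in the abstract are meant to estimate that shape from the data instead of fixing it in advance.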

EPrint Type: Other
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics; Learning/Statistics & Optimisation
ID Code: 4548
Deposited By: Sylvain Arlot
Deposited On: 13 March 2009