A method for robust variable selection with significance assessment
Our goal is to propose an unbiased framework for gene expression data analysis based on variable selection combined with a significance assessment step. We start by discussing the need for such a framework, illustrating the dramatic effect of a biased approach, especially when the sample size is small. We then describe our analysis protocol, which is based on two main ingredients. The first is a variable selection core based on elastic net regularization, in which we explicitly take into account the tuning of the regularization parameters. The second is a general architecture for assessing the statistical significance of the model via cross-validation and permutation testing. Finally, we test the system in experiments on real data, studying its performance when the variable selection algorithm is changed and when the dataset has few samples.