PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Practical Feature Selection: from Correlation to Causality
Isabelle Guyon
In: Mining Massive Data Sets for Security (2008) IOS Press.

Abstract

Feature selection encompasses a wide variety of methods for selecting a restricted number of input variables or “features” that are “relevant” to a problem at hand. In this report, we guide practitioners through the maze of methods that have recently appeared in the literature, particularly for supervised feature selection. Starting from the simplest methods of feature ranking with correlation coefficients, we branch in various directions and explore topics including “conditional relevance”, “local relevance”, “multivariate selection”, and “causal relevance”. We make recommendations for assessment methods and stress the importance of matching the complexity of the method employed to the available amount of training data. Software and teaching material associated with this tutorial are available at http://clopinet.com/CLOP/.
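
As an illustration of the simplest method named in the abstract, feature ranking with correlation coefficients, here is a minimal sketch that assumes the Pearson coefficient as the ranking criterion. The function names `pearson` and `rank_features` and the toy data are hypothetical and are not taken from the chapter or from the CLOP software.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Assumption: a constant feature (or target) carries no information,
    # so its score is set to 0 instead of dividing by zero.
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def rank_features(X, y):
    """Return (feature index, |correlation|) pairs, most correlated first."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        scores.append((j, abs(pearson(column, y))))
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Toy data (made up): 4 samples, 3 features.
X = [
    [1.0, 5.0, 1.0],
    [2.0, 3.0, 1.0],
    [3.0, 6.0, 1.0],
    [4.0, 2.0, 1.0],
]
y = [1.1, 1.9, 3.2, 3.9]

ranking = rank_features(X, y)
for index, score in ranking:
    print(f"feature {index}: |r| = {score:.3f}")
```

In this toy example, feature 0 tracks the target almost linearly and ranks first, while the constant feature 2 ranks last. A univariate ranking of this kind scores each feature in isolation, which is why the abstract goes on to conditional, multivariate, and causal notions of relevance.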

EPrint Type: Book Section
Additional Information: We attach a longer technical memorandum.
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Theory & Algorithms
ID Code: 4038
Deposited By: Isabelle Guyon
Deposited On: 25 February 2008