Feature Selection for Dimensionality Reduction
Dimensionality reduction is a common step in machine learning, especially when the feature space is high-dimensional. The original feature space is mapped onto a new space of reduced dimensionality, usually by selecting a subset of the original dimensions, by constructing new dimensions, or both. This paper deals with feature subset selection for dimensionality reduction in machine learning. We give a brief overview of the feature subset selection techniques commonly used in machine learning and a detailed description of feature subset selection as commonly applied to text data. For illustration, we show the performance of several methods on document categorization of real-world data.