Visual approach to supervised variable selection by self-organizing map
Practical data analysis often encounters data sets that contain both relevant and irrelevant variables. Supervised variable selection is the task of selecting the relevant variables according to a predefined criterion. We propose a robust method for this task. The user manually selects a set of target variables and trains a Self-Organizing Map on these data. The trained map sets the criterion for variable selection and serves as an illustrative description of the user's problem, even when the target data are multivariate. The user also defines a second set of variables that are potentially related to the problem. Our method returns the subset of these variables that best corresponds to the description provided by the Self-Organizing Map and thus agrees with the user's understanding of the problem. The method is conceptually simple and, in our experiments, provides an accessible approach to supervised variable selection.
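The workflow described above can be illustrated with a minimal sketch. The abstract does not state how the correspondence between a candidate variable and the map is measured, so the scoring rule here is an assumption made for illustration only: each candidate variable is scored by the fraction of its variance explained by the map unit to which each sample is assigned. The function names (`train_som`, `best_matching_units`, `explained_fraction`, `rank_candidates`), the map size, the learning schedule, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def train_som(data, rows=4, cols=4, epochs=20, seed=0):
    """Train a small rectangular SOM with a sequential (online) update rule."""
    rng = np.random.default_rng(seed)
    # Grid coordinates of the map units, one row per unit.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise the prototype vectors from randomly chosen samples.
    weights = data[rng.choice(len(data), size=rows * cols, replace=True)].astype(float)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(len(data)):
            frac = step / n_steps
            lr = 0.5 * (1.0 - frac) + 0.01                       # decaying learning rate
            radius = max(rows, cols) / 2.0 * (1.0 - frac) + 0.5  # shrinking neighbourhood
            bmu = np.argmin(((weights - data[i]) ** 2).sum(axis=1))
            grid_dist2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-grid_dist2 / (2.0 * radius ** 2))        # Gaussian neighbourhood
            weights += lr * h[:, None] * (data[i] - weights)
            step += 1
    return weights

def best_matching_units(data, weights):
    """Return, for every sample, the index of its closest prototype."""
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def explained_fraction(x, units):
    """Assumed score: share of the variance of x explained by the unit assignment."""
    total = ((x - x.mean()) ** 2).sum()
    if total == 0:
        return 0.0
    between = sum(
        (units == u).sum() * (x[units == u].mean() - x.mean()) ** 2
        for u in np.unique(units)
    )
    return float(between / total)

def rank_candidates(targets, candidates, names, **som_kwargs):
    """Train the SOM on the targets only, then rank candidates by the assumed score."""
    weights = train_som(targets, **som_kwargs)
    units = best_matching_units(targets, weights)
    scores = {n: explained_fraction(candidates[:, j], units) for j, n in enumerate(names)}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Synthetic illustration: two candidates follow the targets, two are pure noise.
rng = np.random.default_rng(1)
n = 400
latent = rng.normal(size=(n, 2))
targets = latent + 0.1 * rng.normal(size=(n, 2))
candidates = np.column_stack([
    latent[:, 0] + 0.2 * rng.normal(size=n),  # related to the first target
    latent[:, 1] + 0.2 * rng.normal(size=n),  # related to the second target
    rng.normal(size=n),                       # unrelated noise
    rng.normal(size=n),                       # unrelated noise
])
names = ["rel_a", "rel_b", "noise_a", "noise_b"]
ranking = rank_candidates(targets, candidates, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this sketch the map is trained on the target variables alone, so the candidate variables never influence the description of the problem; they are only compared against it afterwards. Scoring variables one at a time is a simplification, since the method as described returns a subset and may evaluate variables jointly.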