PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Stepwise variable selection using nonparametric multiple comparisons - On optimal channel configurations for SMR-based brain-computer interfaces
Claudia Sannelli, Thorsten Dickhaus, Sebastian Halder, Eva Hammer, Klaus-Robert Müller and Benjamin Blankertz
In: German Statistical Society Meeting "Statistische Woche 2009", 05-08 Oct 2009, Wuppertal, Germany.


Brain-computer interfaces (BCIs) based on sensorimotor rhythms (SMRs) make use of brain activity modulated during motor imagery tasks, e.g., left-hand versus right-hand imagery, to control a device and thereby provide a communication tool. Usually, this activity is measured by multi-channel electroencephalography (EEG) and translated into commands by a computer program. For example, the output of a classifier operating on such EEG data can control an application such as a cursor moving on a screen. To make such a BCI system more convenient to use, small channel setups are preferable. However, this involves a tradeoff with information loss, because removing channels is likely to degrade classification accuracy. We tackle this problem from a statistical point of view and design a procedure that selects a sparse subset of EEG channels providing most of the essential discriminative information. Spectrally and temporally filtered EEG data form a real-valued matrix $X \in \mathbb{R}^{C \times T}$, where $C$ denotes the number of recorded channels and $T$ the number of sampled time points. In a preprocessing step, the dimension is reduced by spatial filtering. A useful method is Common Spatial Pattern (CSP) analysis. Here, a generalized eigenvalue problem is solved, leading to linear combinations of the $C$ original EEG channels (spatial filters) which maximize variance (i.e., band power of the bandpass-filtered signal) in one class while minimizing it in the other. Features for the classifier are then built as log band power in the spatially filtered channels, together with the corresponding class label in the case of training data. From prior experience, binary classification based on such features can be performed well using Linear Discriminant Analysis (LDA) in $\mathbb{R}^J$, where $J$ is the number of selected CSP filters.
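For orientation, one common way of writing the CSP step and the resulting features is sketched below. The notation is an assumption made for illustration and does not appear in the abstract: $\Sigma^{(c)}$ is the average spatial covariance over the set $\mathcal{I}_c$ of trials in class $c$, $w_j$ is a spatial filter with generalized eigenvalue $\lambda_j$, and $f_j$ is the corresponding log band power feature.

```latex
\begin{align*}
\Sigma^{(c)} &= \frac{1}{|\mathcal{I}_c|} \sum_{i \in \mathcal{I}_c} X_i X_i^\top,
  \qquad c \in \{1, 2\}, \\
\Sigma^{(1)} w_j &= \lambda_j \left( \Sigma^{(1)} + \Sigma^{(2)} \right) w_j, \\
f_j(X) &= \log\!\left( w_j^\top X X^\top w_j \right),
  \qquad j = 1, \dots, J.
\end{align*}
```

In this formulation the eigenvalues lie in $[0, 1]$: filters with $\lambda_j$ close to 1 capture high variance in class 1 and low variance in class 2, and those with $\lambda_j$ close to 0 the reverse, so the $J$ filters are typically taken from both ends of the spectrum. The quantity $w_j^\top X X^\top w_j$ is proportional to the variance of the spatially filtered signal, given that the bandpass-filtered data have approximately zero mean.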
The aim of our method is to select a proper subset of $k \ll C$ original EEG channels with the property that classification accuracy across all $N = 80$ participants of a large study with the Berlin BCI is not significantly worse with the $k$ selected channels than with the full setup of all $C$ channels. To this end, we propose an iterative method based on nonparametric multiple comparisons in the spirit of stepwise regression analysis. Starting with a configuration of $C_{\text{start}} \leq C$ channels, we perform the entire classification workflow (filtering, CSP analysis, LDA) for all $N$ subjects and calculate the classification error (ratio of incorrect classification decisions to the number of test data points) per subject. Then, in an 'internal cycle', we investigate all possible subsets consisting of $C_{\text{start}} - 1$ channels and compare them to the start configuration in terms of classification errors using multiple Wilcoxon signed rank tests. We iteratively remove those channels, if any, whose contribution to the median classification error is not significant. This results in a configuration with $C_{\text{intern}} \leq C_{\text{start}}$ channels on exit of the 'internal cycle'. Then, the procedure switches to an 'external cycle' and iteratively selects, from the remaining $C - C_{\text{intern}}$ channels, those (if any) that significantly improve the median classification error, using a stricter significance threshold to induce sparsity. Once no further channels can be selected, the procedure re-enters the 'internal cycle'. The algorithm terminates when no further channels can be removed from or added to the selected set. The results of this iterative procedure agree well with prior knowledge about typical loci of activation corresponding to motor tasks. A sparse setup of $k = 17$ out of $C = 119$ initial channels located over motor areas is selected using a start configuration consisting of $C_{\text{start}} = 32$ channels.
The median classification error is increased by only approximately 1\%.
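
The alternation between the 'internal' and 'external' cycles described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `errors_for` is a hypothetical callback standing in for the full workflow (filtering, CSP analysis, LDA) that returns one classification error per subject, the thresholds `alpha_remove` and `alpha_add` are placeholder values, the Wilcoxon signed rank test uses a plain normal approximation, and no multiplicity adjustment is applied, since the abstract does not specify one.

```python
import math
from typing import Callable, Dict, FrozenSet, Iterable, List, Sequence

# Hypothetical callback: channel subset -> per-subject classification errors.
ErrorFn = Callable[[FrozenSet[int]], Sequence[float]]


def wilcoxon_greater_p(x: Sequence[float], y: Sequence[float]) -> float:
    """One-sided p-value for H1 'x tends to exceed y' (paired samples).

    Normal approximation to the signed rank statistic, without tie or
    continuity correction; zero differences are discarded.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 1.0  # identical samples: no evidence of a difference
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:  # average ranks for tied absolute differences
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return 0.5 * math.erfc((w_plus - mean) / (sd * math.sqrt(2)))


def stepwise_select(
    all_channels: Iterable[int],
    start: Iterable[int],
    errors_for: ErrorFn,
    alpha_remove: float = 0.05,  # assumed threshold for the internal cycle
    alpha_add: float = 0.01,     # assumed stricter threshold (external cycle)
    max_rounds: int = 100,       # safety guard against cycling
) -> FrozenSet[int]:
    selected = frozenset(start)
    cache: Dict[FrozenSet[int], List[float]] = {}

    def errs(channels: FrozenSet[int]) -> List[float]:
        if channels not in cache:
            cache[channels] = list(errors_for(channels))
        return cache[channels]

    for _ in range(max_rounds):
        changed = False
        # Internal cycle: drop the channel whose removal is least harmful,
        # as long as the error increase is not significant.
        while len(selected) > 1:
            base = errs(selected)
            p_by_ch = {c: wilcoxon_greater_p(errs(selected - {c}), base)
                       for c in sorted(selected)}
            ch = max(p_by_ch, key=p_by_ch.get)
            if p_by_ch[ch] < alpha_remove:
                break  # every remaining channel contributes significantly
            selected -= {ch}
            changed = True
        # External cycle: add the channel with the most significant
        # error reduction, judged at the stricter threshold.
        while True:
            rest = sorted(set(all_channels) - selected)
            if not rest:
                break
            base = errs(selected)
            p_by_ch = {c: wilcoxon_greater_p(base, errs(selected | {c}))
                       for c in rest}
            ch = min(p_by_ch, key=p_by_ch.get)
            if p_by_ch[ch] >= alpha_add:
                break
            selected |= {ch}
            changed = True
        if not changed:
            break  # nothing removed or added in a full round
    return selected
```

Removing or adding one channel at a time, and choosing it by the largest or smallest p-value, is one possible reading of "iteratively"; other orderings are compatible with the description. Caching the per-subset errors avoids re-running the costly classification workflow for configurations that have already been evaluated.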

EPrint Type: Conference or Workshop Item (Talk)
Subjects: Brain Computer Interfaces
ID Code: 6810
Deposited By: Thorsten Dickhaus
Deposited On: 08 March 2010