PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Correcting Sample Selection Bias by Unlabeled Data
J. Huang, Alex Smola, Arthur Gretton, K. Borgwardt and Bernhard Schölkopf
In: NIPS 2006, 04-09 December 2006, Vancouver, Canada.


We consider the scenario where training and test data are drawn from different distributions, commonly referred to as sample selection bias. Most algorithms for this setting first try to recover the sampling distributions and then make appropriate corrections based on the distribution estimate. We present a nonparametric method which directly produces resampling weights without distribution estimation. Our method works by matching distributions between training and test sets in feature space. Experimental results demonstrate that our method works well in practice.
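As a rough illustration of the idea in the abstract, the sketch below computes one weight per training point so that the weighted mean of the training points in a kernel feature space moves toward the mean of the test points. It is a simplified sketch under stated assumptions, not the authors' implementation: the Gaussian kernel, the bandwidth `sigma`, the weight cap `upper`, the plain projected-gradient solver, and all function names (`rbf_kernel`, `discrepancy`, `matching_weights`) are choices made here for illustration. Only a box constraint on the weights is enforced; any further constraints a full formulation might impose are omitted.

```python
import numpy as np


def rbf_kernel(a, b, sigma):
    """Gaussian kernel matrix between the rows of a (n, d) and b (m, d)."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def discrepancy(beta, x_train, x_test, sigma):
    """Squared feature-space distance between the beta-weighted training mean
    and the unweighted test mean."""
    n_tr, n_te = len(x_train), len(x_test)
    k_tr = rbf_kernel(x_train, x_train, sigma)
    k_cross = rbf_kernel(x_train, x_test, sigma)
    k_te = rbf_kernel(x_test, x_test, sigma)
    return (beta @ k_tr @ beta / n_tr**2
            - 2.0 * beta @ k_cross.sum(axis=1) / (n_tr * n_te)
            + k_te.sum() / n_te**2)


def matching_weights(x_train, x_test, sigma=1.0, upper=10.0, steps=2000):
    """Approximate weights minimising 0.5 * b^T K b - kappa^T b over 0 <= b <= upper.

    Up to scaling and an additive constant this is the same objective as
    discrepancy(); it is minimised here by projected gradient descent
    (an assumption of this sketch, chosen for brevity over a QP solver).
    """
    n_tr, n_te = len(x_train), len(x_test)
    k_tr = rbf_kernel(x_train, x_train, sigma)
    kappa = (n_tr / n_te) * rbf_kernel(x_train, x_test, sigma).sum(axis=1)
    # Step size 1/L, where L is the largest eigenvalue of the Hessian k_tr.
    step = 1.0 / np.linalg.eigvalsh(k_tr)[-1]
    beta = np.ones(n_tr)  # start from uniform weights
    for _ in range(steps):
        grad = k_tr @ beta - kappa
        beta = np.clip(beta - step * grad, 0.0, upper)  # project onto the box
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_train = rng.normal(0.0, 1.0, size=(200, 1))  # synthetic training inputs
    x_test = rng.normal(1.0, 0.5, size=(100, 1))   # synthetic, shifted test inputs
    beta = matching_weights(x_train, x_test)
    print("discrepancy, uniform weights:", discrepancy(np.ones(200), x_train, x_test, 1.0))
    print("discrepancy, matched weights:", discrepancy(beta, x_train, x_test, 1.0))
```

The resulting weights could then be supplied as per-sample weights to any learner that accepts them. Note that no density is estimated at any point: only kernel evaluations on the unlabeled training and test inputs are used.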

EPrint Type: Conference or Workshop Item (Spotlight)
Project Keyword: UNSPECIFIED
Subjects: Theory & Algorithms
ID Code: 2390
Deposited By: Arthur Gretton
Deposited On: 22 November 2006