Sparse Domain Adaptation in Projection Spaces based on Good Similarity Functions
Emilie Morvant, Amaury Habrard and Stéphane Ayache
In: 11th IEEE International Conference on Data Mining (ICDM), 11-14 Dec 2011, Vancouver, Canada.
We address the problem of domain adaptation for binary classification which arises when the distributions generating the source learning data and target test data are somewhat different. We consider the challenging case where no target labeled data is available. From a theoretical standpoint, a classifier has better generalization guarantees when the two domain marginal distributions are close. We study a new direction based on a recent framework of Balcan et al. allowing to learn linear classifiers in an explicit projection space based on similarity functions that may be not symmetric and not positive semi-definite. We propose a general method for learning a good classifier on target data with generalization guarantees and we improve its efficiency thanks to an iterative procedure by reweighting the similarity function - compatible with Balcan et al. framework - to move closer the two distributions in a new projection space. Hyperparameters and reweighting quality are controlled by a reverse validation procedure. Our approach is based on a linear programming formulation and shows good adaptation performances with very sparse models. We evaluate it on a synthetic problem and on real image annotation task.