Design and Analysis of the WCCI 2010 Active Learning Challenge
Isabelle Guyon, Gavin Cawley, Gideon Dror and Vincent Lemaire
In: WCCI 2010 special session on Active and Autonomous Learning, 19-23 Jul 2010, Barcelona, Spain.
We organized a data mining challenge in “active learning” for IJCNN/WCCI 2010, addressing machine learning problems where labeling data is expensive, but large amounts of unlabeled data are available at low cost. Examples include handwriting and speech recognition, document classification, vision tasks, drug design using recombinant molecules and protein engineering. Such problems can be tackled from two angles: learning from unlabeled data, or active learning. In the former case, the algorithms must make do
with the limited amount of labeled data and capitalize on the unlabeled data with semi-supervised learning methods. Several challenges have addressed this problem in the past. In the latter case, the algorithms may place a limited number of queries to obtain new sample labels. The goal in that case is to optimize the queries, and the problem is referred to as active learning. While the problem of active learning is of great
importance, organizing a challenge in that area is non-trivial. This is the problem we have addressed, which we describe in this paper. The “active learning” challenge is part of the WCCI 2010 competition program (http://www.wcci2010.org/competition-program). The website of the challenge remains open for submission of new methods beyond the termination of the challenge as a resource for students and