Active Learning on Graphs via Spanning Trees
Nicolò Cesa-Bianchi, Claudio Gentile, Fabio Vitale and Giovanni Zappella
In: NIPS 2010 Workshop on Networks Across Disciplines: Theory and Applications, December 2010, Whistler, Canada.
Active learning algorithms for graph node classification select
a subset $L$ of nodes in a given graph. The goal is to minimize the mistakes
made on the remaining nodes by a standard node classifier using $L$ as the training set.
Guillory and Bilmes introduced a combinatorial quantity, $\Psi^*(L)$, and related it to
the performance of the mincut classifier run on any given training set $L$.
While no efficient algorithms for minimizing $\Psi^*$ are known,
we show that simple heuristics for (approximately) minimizing it do not work well in practice.
Building on previous theoretical results about active learning on trees, we show that exact
minimization of $\Psi^*$ on suitable spanning trees of the graph yields an efficient active
learner that compares well against standard baselines on real-world graphs.
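
As a rough illustration of the first step of such an approach, the sketch below extracts a spanning tree from a connected undirected graph. It is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not say which spanning trees are "suitable", so a breadth-first tree is used here purely as a stand-in, and the function name `bfs_spanning_tree`, the adjacency-dict representation, and the toy graph are all hypothetical. Computing or minimizing $\Psi^*$ on the resulting tree is not shown.

```python
from collections import deque


def bfs_spanning_tree(adj, root):
    """Return the edges of a breadth-first spanning tree.

    adj  : dict mapping each node to a list of its neighbours
           (undirected graph, assumed connected).
    root : node from which the breadth-first search starts.
    """
    parent = {root: None}      # also serves as the visited set
    queue = deque([root])
    tree_edges = []
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:        # first time v is reached
                parent[v] = u
                tree_edges.append((u, v))
                queue.append(v)
    return tree_edges


# Hypothetical toy graph: a 4-cycle 0-1-2-3-0 with the chord 1-3.
adj = {0: [1, 3], 1: [0, 2, 3], 2: [1, 3], 3: [0, 2, 1]}
tree = bfs_spanning_tree(adj, 0)
```

On a connected graph with $n$ nodes the returned list has $n-1$ edges and touches every node. An active learner of the kind described above would then choose the query set $L$ on this tree rather than on the original graph.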