## Abstract

We give an overview of well-known methods for graph-based transduction. In graph-based transduction we are given a fixed set of objects, some of which are labeled and some of which are unlabeled, and we wish to predict the labels of the unlabeled objects. A graph is defined over the objects in which an edge between two objects indicates that they are similar; if the graph is weighted, the weights indicate the degree of similarity. The methods we consider include the min-cut method of~\cite{blum01learning} and the harmonic energy minimization procedure of~\cite{Zoubin1} (see also~\cite{Belkin2}). We interpret these methods as specific instances of the minimization of a $p$-energy~\cite{pInterpolation}. When $p=2$ the graph can be viewed as an electrical network: the edges are resistors whose resistance is the reciprocal of the similarity. The fixed labels from $\{-1,1\}$ correspond to potential (voltage) constraints, and the labeling algorithm finds the set of consistent voltages that minimizes the energy dissipation and then predicts with the ``sign'' of the voltages. We extend the analogy to general $p$, which leads to natural analogues of Kirchhoff's laws, Ohm's law, the conservation of energy principle, and the ``rules'' of resistors in series and parallel. We exploit this network analogy in two ways. First, we show that choosing $p\in (1,2)$ leads to an algorithm with performance guarantees that are unobtainable in the special cases $p=1$ and $p=2$. Second, when $p=2$ we show how to treat the energy minimization procedure as a kernel method, and we find that the kernel matrix can be derived by a transformation of the matrix of effective resistances between pairs of vertices in the network. By approximating the network with a tree, we develop a fast method~\cite{NIPS2008_0824} to compute the kernel matrix, which allows the method to scale to large graphs.
We present experiments on two web-spam classification tasks, one of which includes a graph with 400,000 vertices and more than 10,000,000 edges. The results indicate that the accuracy of our technique is competitive with previous methods using the full graph information.
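
To make the $p=2$ case concrete, the following is a minimal sketch, not the implementation used in the experiments. It assumes the standard graph Laplacian formulation of harmonic energy minimization: the voltages on the labeled vertices are clamped to their labels, and the voltages on the unlabeled vertices are obtained by solving a linear system. The function names `harmonic_labels` and `effective_resistance`, the four-vertex path graph, and the unit edge weights are illustrative assumptions. The effective resistances are computed densely from the Laplacian pseudoinverse, which is a common textbook route and not the tree-based fast method mentioned above.

```python
import numpy as np


def harmonic_labels(W, labeled, y):
    """Minimum-energy voltages for p = 2 (hypothetical helper).

    W       : symmetric non-negative weight (conductance) matrix, shape (n, n)
    labeled : indices of the labeled vertices
    y       : labels in {-1, +1} for those vertices
    Assumes every connected component contains at least one labeled vertex,
    so that the linear system below has a unique solution.
    """
    n = W.shape[0]
    labeled = np.asarray(labeled)
    y = np.asarray(y, dtype=float)
    unlabeled = np.setdiff1d(np.arange(n), labeled)

    # Graph Laplacian L = D - W; the energy is u^T L u = sum_{i<j} W_ij (u_i - u_j)^2.
    L = np.diag(W.sum(axis=1)) - W

    u = np.zeros(n)
    u[labeled] = y  # voltage constraints on the labeled vertices

    # Setting the gradient to zero on the unlabeled vertices gives
    # L_UU u_U = -L_UL y_L.
    L_uu = L[np.ix_(unlabeled, unlabeled)]
    L_ul = L[np.ix_(unlabeled, labeled)]
    u[unlabeled] = np.linalg.solve(L_uu, -L_ul @ y)
    return u


def effective_resistance(W):
    """Matrix of pairwise effective resistances (hypothetical helper).

    Uses the identity r_ij = K_ii + K_jj - 2 K_ij, where K is assumed here
    to be the pseudoinverse of the graph Laplacian.
    """
    L = np.diag(W.sum(axis=1)) - W
    K = np.linalg.pinv(L)
    d = np.diag(K)
    return d[:, None] + d[None, :] - 2.0 * K


# Toy example: a path 0 - 1 - 2 - 3 with unit weights (unit resistors).
W = np.zeros((4, 4))
for i in range(3):
    W[i, i + 1] = W[i + 1, i] = 1.0

u = harmonic_labels(W, labeled=[0, 3], y=[+1, -1])
predictions = np.sign(u)       # predict with the sign of the voltages
R = effective_resistance(W)    # R[0, 3] is three unit resistors in series
```

On this path graph the voltages interpolate linearly between the two clamped endpoints, so vertices 1 and 2 receive the labels $+1$ and $-1$ respectively, and the effective resistance between the endpoints equals 3, in agreement with the rule for resistors in series.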