PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

DIFFRAC : a discriminative and flexible framework for clustering
Francis Bach and Zaid Harchaoui
Advances in Neural Information Processing Systems (NIPS) Volume 20, 2007.


We present a novel linear clustering framework (DIFFRAC) which relies on a linear discriminative cost function and a convex relaxation of a combinatorial optimization problem. The large convex optimization problem is solved through a sequence of lower dimensional singular value decompositions. This framework has several attractive properties: (1) although apparently similar to K-means, it exhibits superior clustering performance than K-means, in particular in terms of robustness to noise. (2) It can be readily extended to non linear clustering if the discriminative cost function is based on positive definite kernels, and can then be seen as an alternative to spectral clustering. (3) Prior information on the partition is easily incorporated, leading to state-of-the-art performance for semi-supervised learning, for clustering or classification. We present empirical evaluations of our algorithms on synthetic and real medium-scale datasets.

PDF - Requires Adobe Acrobat Reader or other PDF viewer.
EPrint Type:Article
Project Keyword:Project Keyword UNSPECIFIED
Subjects:Learning/Statistics & Optimisation
ID Code:3188
Deposited By:Francis Bach
Deposited On:12 January 2008