PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Generalization in Clustering with Unobserved Features
Eyal Krupka and Naftali Tishby
Advances in Neural Information Processing Systems (NIPS) Volume 19, 2005.


We argue that when objects are characterized by many attributes, clustering them on the basis of a relatively small random subset of these attributes can capture information on the unobserved attributes as well. Moreover, we show that under mild technical conditions, clustering the objects on the basis of such a random subset performs almost as well as clustering with the full attribute set. We prove finite sample generalization theorems for this novel learning scheme that extend analogous results from the supervised learning setting. The scheme is demonstrated for collaborative filtering of users with movie ratings as attributes.
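
The idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm or their evaluation measure: it assumes plain k-means as the clustering method, synthetic binary attributes drawn from a few hidden groups in place of movie ratings, and "fraction of variance explained by cluster means" as a stand-in for the information captured about an attribute. All function names and parameters (`kmeans`, `explained_fraction`, `make_data`, `run_demo`) are hypothetical.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on lists of floats; returns one label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each non-empty cluster's center becomes its mean.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def explained_fraction(data, labels, features):
    """Fraction of variance in `features` explained by cluster means.

    Assumed stand-in measure, not the one used in the paper.
    """
    total = within = 0.0
    for j in features:
        col = [row[j] for row in data]
        mean = sum(col) / len(col)
        total += sum((v - mean) ** 2 for v in col)
        for c in set(labels):
            sub = [v for v, l in zip(col, labels) if l == c]
            m = sum(sub) / len(sub)
            within += sum((v - m) ** 2 for v in sub)
    return 1.0 - within / total if total > 0 else 0.0


def make_data(n_objects=200, n_features=100, n_groups=3, seed=1):
    """Synthetic objects: each hidden group has its own per-attribute rate."""
    rng = random.Random(seed)
    profiles = [[rng.random() for _ in range(n_features)]
                for _ in range(n_groups)]
    data = []
    for _ in range(n_objects):
        g = rng.randrange(n_groups)
        data.append([1.0 if rng.random() < profiles[g][j] else 0.0
                     for j in range(n_features)])
    return data


def run_demo(n_observed=20, k=3, seed=2):
    """Cluster on a random attribute subset, then score both attribute sets."""
    data = make_data()
    n_features = len(data[0])
    rng = random.Random(seed)
    observed = rng.sample(range(n_features), n_observed)
    unobserved = [j for j in range(n_features) if j not in observed]
    labels = kmeans([[row[j] for j in observed] for row in data], k)
    return (explained_fraction(data, labels, observed),
            explained_fraction(data, labels, unobserved))


if __name__ == "__main__":
    obs, unobs = run_demo()
    print("explained on observed attributes:  ", round(obs, 3))
    print("explained on unobserved attributes:", round(unobs, 3))
```

In this toy setting, a nonzero score on the unobserved attributes means the clusters found from the random subset also say something about the attributes that were never used. The finite sample guarantees themselves are given in the paper and are not reproduced by this sketch.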

EPrint Type:Article
Additional Information:A new paradigm for generalization in learning.
Subjects:Computational, Information-Theoretic Learning with Statistics
Theory & Algorithms
ID Code:2006
Deposited By:Naftali Tishby
Deposited On:14 January 2006