Archetypal Analysis for Machine Learning
Morten Mørup and Lars Kai Hansen
Machine Learning for Signal Processing (MLSP), IEEE Workshop on
Archetypal analysis (AA), proposed by Cutler and Breiman, estimates the principal convex hull of a data set. As such, AA favors features that constitute representative 'corners' of the data, i.e., distinct aspects or archetypes. We show that AA enjoys the interpretability of clustering, without being limited to hard assignment, and the uniqueness of SVD, without being limited to orthogonal representations. To perform large-scale AA, we derive an efficient algorithm based on projected gradient, as well as an initialization procedure inspired by the FURTHESTFIRST approach widely used for K-means. We demonstrate that the AA model is relevant for feature extraction and dimensionality reduction for a large variety of machine learning problems taken from computer vision, neuroimaging, text mining and collaborative filtering.
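
To make the initialization idea concrete, the following is a minimal sketch of the generic furthest-first traversal often used to seed K-means. It is not the paper's own procedure, which the abstract describes only as inspired by this approach. The function name `furthest_first`, its parameters, and the toy data are assumptions made for illustration.

```python
import math
import random


def furthest_first(points, k, seed=0):
    """Return k seed indices (hypothetical helper, not from the paper).

    Start from a random point, then repeatedly add the point whose
    distance to its nearest already-chosen seed is largest.
    """
    rng = random.Random(seed)
    chosen = [rng.randrange(len(points))]
    # dist[i] = distance from point i to its nearest chosen seed so far
    dist = [math.dist(p, points[chosen[0]]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=dist.__getitem__)
        chosen.append(nxt)
        # Update nearest-seed distances with the newly added seed
        dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(dist, points)]
    return chosen


# Toy data (assumed): four corners of the unit square plus two interior points
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5), (0.4, 0.6)]
seeds = furthest_first(points, k=3)
```

On this toy set, every seed after the random first one lands on a corner of the square, which is why such a traversal is a natural starting point when the features sought are 'corners' of the data.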