PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Generative modeling for maximizing precision and recall in information visualization
Jaakko Peltonen and Samuel Kaski
In: AISTATS 2011, The Fourteenth International Conference on Artificial Intelligence and Statistics, 11-13 Apr 2011, Fort Lauderdale, USA.


Information visualization has recently been formulated as an information retrieval problem, where the goal is to find similar data points based on the visualized nonlinear projection, and the visualization is optimized to maximize a compromise between (smoothed) precision and recall. We turn the visualization into a generative modeling task where a simple user model parameterized by the data coordinates is optimized, neighborhood relations are the observed data, and straightforward maximum likelihood estimation corresponds to Stochastic Neighbor Embedding (SNE). While SNE maximizes pure recall, adding a mixture component that “explains away” misses allows our generative model to focus on maximizing precision as well. The resulting model is a generative solution to maximizing tradeoffs between precision and recall. The model outperforms earlier models in terms of precision and recall and in external validation by unsupervised classification.
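
As a rough illustration of the precision/recall tradeoff described above, the following minimal sketch scores an embedding using the two directions of Kullback-Leibler divergence between input-space and display-space neighbor distributions. It rests on assumptions that are not stated in the abstract: Gaussian neighborhoods with a fixed width `sigma`, KL(p||q) as the smoothed recall cost (the SNE objective, equivalent up to a constant to maximum likelihood of the observed neighbors), and KL(q||p) as the smoothed precision cost. The function names and the weight `lam` are hypothetical. This is the KL-based tradeoff, not the mixture-component generative model the paper proposes.

```python
import math


def neighbor_probs(points, sigma=1.0):
    """Row-normalized Gaussian neighbor probabilities.

    Entry [i][j] is the probability that point j is picked as a neighbor
    of point i; a point is never its own neighbor. The fixed sigma is an
    assumption made for this sketch.
    """
    n = len(points)
    probs = []
    for i in range(n):
        weights = []
        for j in range(n):
            if i == j:
                weights.append(0.0)
            else:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))
        total = sum(weights)
        probs.append([w / total for w in weights])
    return probs


def kl_rows(p, q, eps=1e-12):
    """Sum over points i of KL(p_i || q_i); eps guards against log(0)."""
    total = 0.0
    for p_row, q_row in zip(p, q):
        for pij, qij in zip(p_row, q_row):
            if pij > 0.0:
                total += pij * math.log(pij / max(qij, eps))
    return total


def tradeoff_cost(data, embedding, lam=0.5):
    """Weighted sum of a recall-type and a precision-type cost.

    lam = 1 leaves only KL(p || q), the SNE cost (penalizes misses);
    lam = 0 leaves only KL(q || p) (penalizes false neighbors).
    """
    p = neighbor_probs(data)       # neighborhoods in the input space
    q = neighbor_probs(embedding)  # neighborhoods on the display
    return lam * kl_rows(p, q) + (1.0 - lam) * kl_rows(q, p)


if __name__ == "__main__":
    data = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
    embedding = [[0.0], [1.0], [2.0], [3.0]]
    print(tradeoff_cost(data, embedding, lam=0.5))
```

An embedding that reproduces the input-space neighborhoods exactly has zero cost for any `lam`; moving `lam` between 0 and 1 shifts the penalty between false neighbors and misses.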

EPrint Type: Conference or Workshop Item (Paper)
Subjects: Computational, Information-Theoretic Learning with Statistics; Theory & Algorithms
ID Code: 9090
Deposited By: Jaakko Peltonen
Deposited On: 21 February 2012