Latent grouping models for user preference prediction
We tackle the problem of new users or documents in collaborative filtering. Generalization over users by grouping them into user groups is beneficial when a rating is to be predicted for a relatively new document having only few observed ratings. Analogously, generalization over documents improves predictions in the case of new users. We show that if either users and documents or both are new, two-way generalization becomes necessary. We demonstrate the benefits of grouping of users, grouping of documents, and two-way grouping, with artificial data and in two case studies with real data. We have introduced a probabilistic latent grouping model for predicting the relevance of a document to a user. The model assumes a latent group structure for both users and items. We compare the model against a state-of-the-art method, the User Rating Profile model, where only users have a latent group structure. We estimate both models by Gibbs sampling. The Two-Way Model predicts relevance more accurately when the target consists of both new documents and new users. The reason is that generalization over documents becomes beneficial for new documents and at the same time generalization over users is needed for new users.