PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Categorization in multiple category systems
Jean-Michel Renders, Eric Gaussier, Cyril Goutte, Gabriela Csurka and Francois Pacull
In: ICML 2006 Conference Proceedings Book (2006), Omnipress. ISBN 1-59593-383-2

Abstract

We explore the situation in which documents have to be categorized into more than one category system, a situation we refer to as multiple-view categorization. More specifically, we address the case where two different categorizers have already been built from training sets that are not necessarily identical, each one labeled using one category system. On top of these categorizers, which we treat as black boxes, we propose algorithms that exploit a third training set containing a few examples annotated in both category systems. Such a situation arises, for example, in large companies where incoming mail has to be routed to several departments, each one relying on its own category system. We focus here on exploiting possible dependencies between category systems in order to refine the categorization decisions made by categorizers trained independently on different category systems. After a description of the multiple categorization problem, we present several possible solutions, based on either a categorization or a reweighting approach, and compare them on real data. Lastly, we show how the multimedia categorization problem can be cast as a multiple categorization problem and assess our methods in this framework.
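The abstract does not spell out the reweighting algorithms, so the following is only an illustrative sketch of what a reweighting step on top of two black-box categorizers could look like, not the authors' method. It assumes each categorizer returns a posterior over its own category system, and that the small doubly annotated set is available as a list of (category in system A, category in system B) pairs. The function names, the add-alpha smoothing, and the compatibility ratio P(a, b) / (P(a) P(b)) are all assumptions made for illustration.

```python
from collections import Counter


def estimate_compatibility(pairs, cats_a, cats_b, alpha=1.0):
    """Estimate ratio[a][b] = P(a, b) / (P(a) * P(b)) from doubly labeled pairs.

    Add-alpha smoothing (an assumption of this sketch) keeps unseen
    combinations from getting a zero weight.
    """
    counts = Counter(pairs)
    total = len(pairs) + alpha * len(cats_a) * len(cats_b)
    joint = {a: {b: (counts[(a, b)] + alpha) / total for b in cats_b}
             for a in cats_a}
    marg_a = {a: sum(joint[a].values()) for a in cats_a}
    marg_b = {b: sum(joint[a][b] for a in cats_a) for b in cats_b}
    return {a: {b: joint[a][b] / (marg_a[a] * marg_b[b]) for b in cats_b}
            for a in cats_a}


def reweight(p_a, p_b, ratio):
    """Rescale system-A posteriors using system-B posteriors.

    score(a) = p_a[a] * sum_b p_b[b] * ratio[a][b], then renormalized.
    If the two systems are independent (all ratios equal 1), p_a is unchanged.
    """
    scores = {a: p_a[a] * sum(p_b[b] * ratio[a][b] for b in p_b) for a in p_a}
    z = sum(scores.values())
    return {a: s / z for a, s in scores.items()}


# Hypothetical toy data: two departments, each with its own category system.
cats_a = ["billing", "support"]
cats_b = ["finance", "tech"]
pairs = ([("billing", "finance")] * 8 + [("support", "tech")] * 8
         + [("billing", "tech")] + [("support", "finance")])

ratio = estimate_compatibility(pairs, cats_a, cats_b)
p_a = {"billing": 0.5, "support": 0.5}   # categorizer A is undecided
p_b = {"finance": 0.9, "tech": 0.1}      # categorizer B is confident
refined = reweight(p_a, p_b, ratio)
print(refined)
```

In this toy case categorizer A is undecided, and the dependency learned from the doubly annotated pairs shifts its decision toward "billing" because categorizer B favors "finance". Neither black-box categorizer is retrained; only their outputs are combined.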

EPrint Type: Book Section
Project Keyword: UNSPECIFIED
Subjects: Machine Vision; Natural Language Processing; Multimodal Integration
ID Code: 2253
Deposited By: Gabriela Csurka
Deposited On: 11 October 2006