PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Traveling among Clusters: a Way to Reconsider the Benefits of the Cluster Hypothesis
Sylvain Lamprier, Tassadit Amghar, Bernard Levrat and Frédéric Saubion
In: SAC 2010, 21-27 March 2010, Sierre, Switzerland.


Relying on the Cluster Hypothesis, which states that relevant documents tend to be more similar to each other than to non-relevant documents, most information retrieval systems that organize search results as a set of clusters seek to gather all relevant documents in the same cluster. We propose here to reconsider the benefits of the resulting concentration of relevant information. Contrary to what is commonly accepted, we believe that systems which aim to distribute the relevant documents across different clusters, being more likely to highlight different aspects of the subject, may be at least as useful to the user as systems that gather all relevant documents in a single group. Since existing evaluation measures tend to strongly favor the latter systems, we first investigate ways to assess more fairly the ability to reach the relevant information from the list of cluster descriptions. Finally, we show that systems distributing the relevant information across different clusters may actually provide better access to information than classical systems.

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Information Retrieval & Textual Information Access
ID Code: 6450
Deposited By: Sylvain Lamprier
Deposited On: 08 March 2010