PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Apprentissage Statistique pour la Constitution de Corpus d'évaluation (Statistical Learning for Building Evaluation Corpora)
Huyen-Trang Vu and Patrick Gallinari
In: CORIA 2006, 15-17 Mar 2006, Lyon, France.


Test collections play a crucial role in the evaluation of Information Retrieval systems. Forming the relevance assessment set has been recognized as the key bottleneck in building a test collection, especially for very large document collections. This paper addresses the problem of efficiently selecting the documents to include in the assessment set. Machine learning algorithms such as RankBoost can help with this selection, yielding smaller pools than traditional round-robin pooling and thus significantly reducing the manual assessment workload. Experimental results on TREC collections consistently demonstrate the effectiveness of our approach according to different evaluation criteria.

EPrint Type: Conference or Workshop Item (Oral)
Project Keyword: UNSPECIFIED
Subjects: Information Retrieval & Textual Information Access
ID Code: 2618
Deposited By: Huyen-Trang Vu
Deposited On: 22 November 2006