PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

A Boosting Algorithm for Learning Bipartite Ranking Functions with Partially Labeled Data
Massih Amini, Vinh Truong and Cyril Goutte
In: SIGIR 2008, 20-24 July 2008, Singapore.


This paper presents a boosting-based algorithm for learning a bipartite ranking function (BRF) with partially labeled data. Until now, different attempts have been made to build a BRF in a transductive setting, in which the test points are given to the methods in advance as unlabeled data. The proposed approach is a semi-supervised inductive ranking algorithm which, as opposed to transductive algorithms, is able to infer an ordering on new examples that were not used for its training. We evaluate our approach using the TREC-9 Ohsumed and the Reuters-21578 data collections, comparing against two semi-supervised classification algorithms for area under the ROC curve (AUC), uninterpolated average precision (AUP), mean precision@50 (TP) and Precision-Recall (PR) curves. In the most interesting cases, where irrelevant examples heavily outnumber relevant ones, we show that our method produces statistically significant improvements with respect to these ranking measures.
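
The ranking measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the 0/1 label encoding (1 = relevant, 0 = irrelevant) and the tie handling are assumptions made for illustration. Each function scores a single ranked list; the "mean" in mean precision@50 would come from averaging precision at k = 50 over several such lists.

```python
from typing import List, Sequence


def ranked_labels(scores: Sequence[float], labels: Sequence[int]) -> List[int]:
    """Return the labels ordered by decreasing score (hypothetical helper)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (relevant, irrelevant) pairs that the scores order correctly.
    Tied pairs count as half a correct pair (an assumption)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one relevant and one irrelevant example")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Uninterpolated average precision: the mean of the precision
    values measured at the rank of each relevant example."""
    hits = 0
    total = 0.0
    for rank, y in enumerate(ranked_labels(scores, labels), start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one relevant example")
    return total / hits


def precision_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Fraction of relevant examples among the k highest-scored ones."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(ranked_labels(scores, labels)[:k]) / k
```

The pairwise form of AUC is quadratic in the number of examples, which is fine for a toy illustration; a rank-based computation would be the usual choice on larger collections.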

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Information Retrieval & Textual Information Access
ID Code: 4129
Deposited By: Massih Amini
Deposited On: 17 May 2008