PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Working Notes for the InFile Campaign: Online Document Filtering Using 1 Nearest Neighbor
Vincent Bodinier, Ali Mustafa Qamar and Eric Gaussier
In: CLEF 2008 Workshop, 17-19 September, Aarhus, Denmark.

Abstract

This paper was written as part of the InFile (INFormation, FILtering, Evaluation) campaign, a cross-language adaptive filtering evaluation campaign sponsored by the French national research agency and run as a pilot track of the CLEF (Cross Language Evaluation Forum) 2008 campaigns. We propose an online algorithm that learns category-specific thresholds in a multiclass setting where a document can belong to more than one class. Our method uses the 1 Nearest Neighbor (1NN) algorithm for classification and relies on simulated user feedback to fine-tune the thresholds, and in turn the classification performance, over time. The experiments were run on an English-language corpus containing 100,000 documents. The best results reach a precision of 0.366 and a recall of 0.260.
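
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cosine similarity over raw term counts, the fixed threshold step, the update rule, and every name used (OnlineFilter, vectorize, feedback) are assumptions made purely for illustration. Each category keeps its own stored examples and its own threshold; a document is assigned to every category whose nearest stored example is at least as similar as that category's threshold, and feedback then moves the threshold.

```python
import math
from collections import Counter


def vectorize(text):
    """Hypothetical helper: bag-of-words term counts for a text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class OnlineFilter:
    """Illustrative 1NN filter with one threshold per category (assumed design)."""

    def __init__(self, seeds, threshold=0.3, step=0.05):
        # seeds: category name -> list of example vectors for that category
        self.examples = {c: list(v) for c, v in seeds.items()}
        self.thresholds = {c: threshold for c in seeds}
        self.step = step  # assumed fixed adjustment size

    def nearest_similarity(self, category, doc):
        # 1NN: similarity to the single closest stored example of the category
        return max((cosine(doc, e) for e in self.examples[category]), default=0.0)

    def classify(self, doc):
        # A document may be assigned to several categories at once
        return [c for c in self.examples
                if self.nearest_similarity(c, doc) >= self.thresholds[c]]

    def feedback(self, category, doc, relevant):
        # Assumed update rule: accept more after a hit, less after a miss
        if relevant:
            self.examples[category].append(doc)
            self.thresholds[category] = max(0.0, self.thresholds[category] - self.step)
        else:
            self.thresholds[category] = min(1.0, self.thresholds[category] + self.step)


flt = OnlineFilter({
    "sports": [vectorize("football match goal")],
    "finance": [vectorize("stock market shares")],
})
doc = vectorize("football goal stock market")
assigned = flt.classify(doc)          # both categories pass the initial threshold
for _ in range(6):                    # repeated negative feedback for "finance"
    flt.feedback("finance", doc, relevant=False)
assigned_after = flt.classify(doc)    # "finance" is now filtered out
```

In this sketch, repeated negative feedback raises the threshold of one category until the same document is no longer assigned to it, while the other category is unaffected, which is the per-category behavior the abstract describes.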

EPrint Type: Conference or Workshop Item (Paper)
Subjects: Information Retrieval & Textual Information Access
ID Code: 4387
Deposited By: Ali QAMAR
Deposited On: 31 March 2009