PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

MACHINE LEARNING ON SETS OF DOCUMENTS CONNECTED IN GRAPHS
Janez Brank
In: SIKDD 2004 at multiconference IS 2004, 12-15 Oct 2004, Ljubljana, Slovenia.

Abstract

This paper deals with the problem of machine learning on sets of documents connected into graphs. Our strategy is to represent each document by a diverse set of heterogeneous attributes, including traditional binary and categorical attributes, textual attributes, and attributes derived from the graphs. We present experiments on two datasets, showing the usefulness of graph-based attributes and the importance of weighting the different attributes suitably before learning. On the download estimation task, the approach presented here achieved the best results on the KDD Cup 2003 challenge.

EPrint Type: Conference or Workshop Item (Paper)
Subjects: Learning/Statistics & Optimisation; Information Retrieval & Textual Information Access
ID Code: 743
Deposited By: Blaz Fortuna
Deposited On: 30 December 2004