PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Machine Learning for Semi-Structured Multimedia Documents: Application to pornographic filtering and thematic categorization.
Ludovic Denoyer and Patrick Gallinari
In: Machine Learning Techniques for Multimedia Content (2007) Springer-Verlag.

Abstract

We propose a generative statistical model for the classification of semi-structured multimedia documents. Its main originality is its ability to take into account both the structural and the content information present in a semi-structured document, and to cope with different types of content (text, image, etc.). We then present the results obtained on two sets of experiments: the first concerns the filtering of pornographic Web pages; the second concerns the thematic classification of Wikipedia documents.

EPrint Type: Book Section
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics
Information Retrieval & Textual Information Access
ID Code: 3657
Deposited By: Ludovic Denoyer
Deposited On: 14 February 2008