PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Automatic Detection and Banning of Content Stealing Bots for E-commerce
Nicolas Poggi, Josep Lluis Berral, Toni Moreno, Ricard Gavaldà and Jordi Torres
In: NIPS 2007 Workshop on Machine Learning in Adversarial Environments for Computer Security, 7-8 December 2007, Whistler, British Columbia, Canada.


Content stealing on the web is becoming a serious concern for information and e-commerce websites. In the practice known as web fetching or web scraping, a stealer bot simulates a human web user to extract desired content from the victim's website; the content is then stripped of copyright information and displayed as original on the scraper's website. In this work we report initial results on applying machine learning techniques to detect and ban stealer bots from a website, extending our AUGURES system, which was previously used to separate buying from non-buying sessions on an e-commerce site.

EPrint Type: Conference or Workshop Item (Poster)
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics; User Modelling for Computer Human Interaction
ID Code: 3332
Deposited By: Ricard Gavaldà
Deposited On: 07 February 2008