|
Automatic Detection and Banning of Content Stealing Bots for E-commerce
Abstract: Content stealing on the web is becoming a serious concern for information and e-commerce websites. In the practice known as web fetching or web scraping, a stealer bot simulates a human web user to extract desired content from the victim’s website; the content is then stripped of copyright information and displayed as original on the scraper’s website. In this work we report initial results on applying machine learning techniques to detect and ban stealer bots from a website, extending our AUGURES system, which was previously used to separate buying from non-buying sessions on an e-commerce site.
|