PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Semi-Supervised Recognition of Sarcastic Sentences in Twitter and Amazon.
Dmitry Davidov, Oren Tsur and Ari Rappoport
CoNLL 2010.

Abstract

Sarcasm is a form of speech act in which the speakers convey their message in an implicit way. The inherently ambiguous nature of sarcasm sometimes makes it hard even for humans to decide whether an utterance is sarcastic or not. Recognition of sarcasm can benefit many sentiment analysis NLP applications, such as review summarization, dialogue systems and review ranking systems.

In this paper we experiment with semi-supervised sarcasm identification on two very different data sets: a collection of 5.9 million tweets collected from Twitter, and a collection of 66,000 product reviews from Amazon. Using the Mechanical Turk we created a gold standard sample in which each sentence was tagged by 3 annotators, obtaining F-scores of 0.78 on the product reviews dataset and 0.83 on the Twitter dataset. We discuss the differences between the datasets and how the algorithm uses them (e.g., for the Amazon dataset the algorithm makes use of structured information). We also discuss the utility of Twitter #sarcasm hashtags for the task.

EPrint Type: Article
Project Keyword: UNSPECIFIED
Subjects: User Modelling for Computer Human Interaction; Natural Language Processing; Information Retrieval & Textual Information Access
ID Code: 7069
Deposited By: Ari Rappoport
Deposited On: 27 February 2011