Combining topic models and social networks for chat data mining
Ville Tuulos and Henry Tirri
In: WI 2004, 20-24 Sep 2004, Beijing, China.
Informal chat-room conversations have intrinsically different properties from regular static document collections. Noise, concise expressions, and the dynamic, changing nature of discussions make chat data ill-suited for analysis with off-the-shelf text mining methods. On the other hand, human communication has some implicit features that may be used to enhance the results of such analysis.
In our research, we infer a social network from the chat data using a few basic heuristics. We then present preliminary results showing that the inferred social graph, when combined with a latent variable topic model, may be used to enhance topic identification of a chat room. We compare the effects of four different graph features on classification accuracy.
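As a rough illustration of what inferring a social graph from chat data with basic heuristics might look like, the following Python sketch builds a weighted, undirected graph from a chronological list of messages. The abstract does not state which heuristics were used, so both heuristics here (direct addressing by nickname and temporal adjacency of messages), the edge weights, and the function name `infer_social_graph` are assumptions made for illustration only, not the authors' method.

```python
from collections import Counter


def infer_social_graph(messages, window=1):
    """Infer a weighted, undirected social graph from chat messages.

    messages: list of (speaker, text) tuples in chronological order.
    window:   how many preceding messages count as "adjacent" (assumed).
    Returns a Counter mapping a sorted (nick_a, nick_b) pair to a weight.
    """
    speakers = {speaker for speaker, _ in messages}
    edges = Counter()
    for i, (speaker, text) in enumerate(messages):
        # Assumed heuristic 1: a message that starts with "nick:" or "nick,"
        # is treated as directly addressed to that participant.
        first = text.split(maxsplit=1)[0].rstrip(":,") if text.strip() else ""
        if first in speakers and first != speaker:
            edges[tuple(sorted((speaker, first)))] += 2
        # Assumed heuristic 2: consecutive messages by different speakers
        # are weak evidence that the two are talking to each other.
        for prev_speaker, _ in messages[max(0, i - window):i]:
            if prev_speaker != speaker:
                edges[tuple(sorted((speaker, prev_speaker)))] += 1
    return edges


log = [
    ("alice", "hi all"),
    ("bob", "alice: hello"),
    ("carol", "anyone know lda?"),
    ("alice", "carol, yes"),
]
graph = infer_social_graph(log)
```

Simple graph features, such as the weighted degree of each participant, could then be computed from `graph` and supplied alongside the output of a topic model; which four features the paper actually compares is not stated in the abstract.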