PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Data mining tools for analysis of linguistic variation
Jefrey Lijffijt
In: Lorentz Workshop on Mining Patterns and Subgroups, 16-19 Nov 2010, Leiden, The Netherlands.

Abstract

Over the past decades, linguists have compiled large electronic text corpora of various kinds, enabling the study of diverse aspects of language. The development of tools for analysing these corpora has received far less attention. In a joint effort by researchers in data mining, linguistics and information visualization, we develop advanced, interactive tools designed specifically for the analysis of natural language corpora. We use these tools to study differences in writing style across genres in modern texts, as well as the development of genres and language change from Early Modern English (ca. 1400) onwards.

EPrint Type: Conference or Workshop Item (Poster)
Subjects: Natural Language Processing
ID Code: 7765
Deposited By: Jefrey Lijffijt
Deposited On: 17 March 2011