Data mining tools for analysis of linguistic variation
Over the past decades, linguists have compiled large electronic text corpora of various kinds, enabling the study of diverse aspects of language. The development of tools for analyzing these corpora has received far less attention. In a joint effort among researchers in data mining, linguistics, and information visualization, we develop advanced, interactive tools designed specifically for the analysis of natural language corpora. We use these tools to study differences in writing style across genres in modern texts, as well as the development of genres and language change from Early Modern English (ca. 1400) onward.