Multi-Faceted Information Retrieval System for Large Scale Email Archives
Jukka Perkiö, Ville Tuulos, Wray Buntine and Henry Tirri
Proceedings of the IEEE/WIC/ACM Conference on Web Intelligence
We profile a system for search and analysis of large-scale email
archives. The system is built around four facets: a content-based
search engine, a statistical topic model, automatically inferred
social networks, and time-series analysis. The facets correspond
to the types of information available in email data.
The presented system allows chaining or combining the facets
flexibly. Results of one facet may be used as input to another,
yielding remarkable combinatorial power. In information retrieval
point of view, the system provides support for exploration,
approximate textual searches and data visualization. We present
some experimental results based on a large real-world email corpus.
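
As an illustration of facet chaining, the following minimal Python sketch treats each facet as a filter over a list of messages and composes several filters into one query. It is not the authors' implementation: the `Email` record, the facet functions, the `chain` helper and the toy archive are all hypothetical names and data introduced here, and the topic-model facet is left out for brevity.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List


@dataclass(frozen=True)
class Email:
    # Hypothetical, simplified message record (not the paper's data model).
    sender: str
    recipient: str
    sent: date
    body: str


# Assumption: a facet maps a list of messages to a narrowed list of messages.
Facet = Callable[[List[Email]], List[Email]]


def content_search(term: str) -> Facet:
    """Content facet: keep messages whose body contains the term (case-insensitive)."""
    def facet(emails: List[Email]) -> List[Email]:
        return [e for e in emails if term.lower() in e.body.lower()]
    return facet


def involves(person: str) -> Facet:
    """Social facet stand-in: keep messages sent or received by the given person."""
    def facet(emails: List[Email]) -> List[Email]:
        return [e for e in emails if person in (e.sender, e.recipient)]
    return facet


def time_window(start: date, end: date) -> Facet:
    """Time facet stand-in: keep messages sent within [start, end]."""
    def facet(emails: List[Email]) -> List[Email]:
        return [e for e in emails if start <= e.sent <= end]
    return facet


def chain(*facets: Facet) -> Facet:
    """Feed the output of each facet into the next one, left to right."""
    def run(emails: List[Email]) -> List[Email]:
        for facet in facets:
            emails = facet(emails)
        return emails
    return run


# Toy archive, invented purely for this example.
ARCHIVE = [
    Email("alice", "bob", date(2001, 3, 5), "Budget draft attached"),
    Email("carol", "bob", date(2001, 4, 1), "Budget review meeting"),
    Email("alice", "carol", date(2002, 1, 10), "Budget for next year"),
    Email("bob", "alice", date(2001, 6, 2), "Lunch on Friday?"),
]

query = chain(
    content_search("budget"),
    involves("alice"),
    time_window(date(2001, 1, 1), date(2001, 12, 31)),
)
hits = query(ARCHIVE)
```

Because every facet in this sketch shares one signature, any ordering or subset of facets can be composed, which is one simple way to picture the combinatorial flexibility described above.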