Semi-supervised Structured Prediction Models
PhD thesis, Humboldt-Universtität zu Berlin.
Learning mappings between arbitrary structured input and output variables is a fundamental problem in machine learning. It covers many natural learning tasks and challenges the standard model of learning a mapping from independently drawn instances to a small set of labels. Potential applications include classification with a class taxonomy, named entity recognition, and natural language parsing. In these structured domains, labeled training instances are generally expensive to obtain while unlabeled inputs are readily available and inexpensive.
This thesis deals with semi-supervised learning of discriminative models for structured output variables. The analytical techniques and algorithms of
classical semi-supervised learning are lifted to the structured setting. Several approaches based on different assumptions of the data are presented. Co-learning, for instance, maximizes the agreement among multiple hypotheses while transductive approaches rely on an implicit cluster assumption. Furthermore, in the framework of this dissertation, a case study on email batch detection in message streams is presented. The involved tasks exhibit an inherent cluster structure and the presented solution exploits the streaming nature of the data.
The different approaches are developed into semi-supervised structured
prediction models and efficient optimization strategies thereof are presented. The novel algorithms generalize state-of-the-art approaches in structural learning such as structural support vector machines. Empirical results show that the semi-supervised algorithms lead to significantly lower error rates than their fully supervised counterparts in many application areas, including multi-class classification, named entity recognition, and natural language parsing.