PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Semi-Supervised Feature Learning from Clinical Text
Zhuoran Wang, John Shawe-Taylor and Anoop Shah
In: IEEE International Conference on Bioinformatics and Biomedicine, 18-21 Dec 2010, Hong Kong.


This paper focuses on automatically identifying clinical free-text records that contain useful information about a given disease (e.g. symptoms, modifiers and diagnoses). We introduce a novel semi-supervised machine learning algorithm that addresses this problem by training the set covering machine in a bootstrapping procedure. The proposed technique has two advantages: it finds the documents of interest more accurately than a search based on diagnostic codes, and the features it learns can be used directly as a knowledge representation of the given topic, to assist either further machine learning algorithms or manual post-processing and analysis.
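The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: a bootstrapping (self-training) loop wrapped around a greedy learner that picks a small disjunction of words, loosely in the spirit of a set covering machine. It is not the authors' implementation. The function names, the scoring rule (positives covered minus a penalty for negatives matched), the parameters `max_terms`, `penalty` and `rounds`, and the toy records are all hypothetical.

```python
from typing import List, Set

Doc = Set[str]  # a record reduced to its set of words (assumed representation)


def learn_disjunction(pos: List[Doc], neg: List[Doc],
                      max_terms: int = 3, penalty: float = 1.0) -> List[str]:
    """Greedily pick words that cover many positive records and few negative ones."""
    remaining = list(pos)          # positive records not yet covered
    chosen: List[str] = []
    vocab = set().union(*pos)      # candidate words come from positive records
    while remaining and len(chosen) < max_terms:
        def score(word: str) -> float:
            covered = sum(word in d for d in remaining)
            errors = sum(word in d for d in neg)
            return covered - penalty * errors
        # sorted() makes tie-breaking deterministic (alphabetical order)
        best = max(sorted(vocab - set(chosen)), key=score, default=None)
        if best is None or score(best) <= 0:
            break
        chosen.append(best)
        remaining = [d for d in remaining if best not in d]
    return chosen


def predict(terms: List[str], doc: Doc) -> bool:
    """A record is flagged if it contains any of the learned words."""
    return any(t in doc for t in terms)


def bootstrap(seed_pos: List[Doc], seed_neg: List[Doc],
              unlabeled: List[Doc], rounds: int = 5) -> List[str]:
    """Retrain on the seed set plus records the current rule flags as positive."""
    pos, pool = list(seed_pos), list(unlabeled)
    terms: List[str] = []
    for _ in range(rounds):
        terms = learn_disjunction(pos, seed_neg)
        newly = [d for d in pool if predict(terms, d)]
        if not newly:              # nothing new was labeled, so stop
            break
        pos.extend(newly)
        pool = [d for d in pool if not predict(terms, d)]
    return terms


# Toy records, invented purely for illustration.
seed_pos = [{"angina", "chest", "pain"}, {"chest", "pain", "gtn"}]
seed_neg = [{"knee", "pain"}, {"cough", "fever"}, {"chest", "infection"}]
unlabeled = [{"gtn", "spray", "exertion"}, {"angina", "exertion"}, {"cough", "cold"}]

features = bootstrap(seed_pos, seed_neg, unlabeled)
print(features)
```

In this sketch the returned word list plays the role of the learned features: it is both the classifier and a small, human-readable description of the topic, which mirrors the second advantage claimed in the abstract.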

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Theory & Algorithms; Information Retrieval & Textual Information Access
ID Code: 7030
Deposited By: Zhuoran Wang
Deposited On: 05 December 2010