A distance for partially labeled trees
Trees are a powerful data structure for representing data for which hierarchical relations can be defined. It has been applied in a number of fields like image analysis, natural language processing, protein structure, or music retrieval, to name a few. Procedures for comparing trees are very relevant in many tasks where tree representations are involved. The computation of these measures is usually time consuming and different authors have proposed algorithms that are able to compute them in a reasonable time, by means of approximated versions of the similarity measure. Other methods require that the trees are fully labeled for the distance to be computed. The measure utilized in this paper is able to deal with trees labeled only at the leaves that runs in $O(|T_1|\times|T_2|)$ time. Experiments and comparative results are provided.