FUZZY CLUSTERING OF DOCUMENTS

Abstract

This paper presents a short overview of fuzzy clustering methods and states the desired properties of an optimal fuzzy document clustering algorithm. Based on these criteria we chose one of the most prominent fuzzy clustering methods, c-means, or more precisely probabilistic c-means. The algorithm is presented in more detail, along with some empirical results from clustering 2-dimensional points and documents. For document clustering we implemented fuzzy c-means in the TextGarden environment. We describe a few difficulties with the implementation and their possible solutions. In conclusion, we propose further work needed to fully exploit the power of fuzzy document clustering in TextGarden.