Image similarity: from syntax to weak semantics
Measuring image similarity is an important task for many multimedia applications. Similarity can be defined at two levels: the syntactic (lower, context-free) level and the semantic (higher, contextual) level. At the syntactic level, defining and measuring similarity is relatively straightforward, but at the semantic level the task becomes very difficult. We examine the use of simple, readily available syntactic image features, combined with other multimodal features, to derive a similarity measure that captures the weak semantics of an image. Weak semantics can be seen as an intermediate step between low-level image understanding and full semantic image understanding. We investigate each modality on its own and examine how combining modalities affects the similarity measure. We also test the measure on a multimedia retrieval task using data from a TV series, although our motivation lies in understanding how different modalities relate to each other.