## Abstract

We present a Bayesian framework for content-based image retrieval which models the distribution of color and texture features within sets of related images. Given a user-specified text query (e.g. “penguins”) the system first extracts a set of images, from a labelled corpus, corresponding to that query. The distribution over features of these images is used to compute a Bayesian score for each image in a large unlabelled corpus. Unlabelled images are then ranked using this score and the top images are returned. Although the Bayesian score is based on computing marginal likelihoods, which integrate over model parameters, in the case of sparse binary data the score reduces to a single matrix-vector multiplication and is therefore extremely efficient to compute. We show that our method works surprisingly well despite its simplicity and the fact that no relevance feedback is used. We compare different choices of features, and evaluate our results using human subjects.
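To illustrate why the score can collapse to a matrix-vector product, the following is a minimal sketch, not the authors' implementation. It assumes each binary feature is an independent Bernoulli variable with a conjugate Beta prior; the abstract does not name the model, so this choice, the hyperparameter values, and the names `bayesian_set_weights` and `rank` are all illustrative assumptions. Under that assumption the log ratio of the posterior predictive p(x | query set) to the prior marginal p(x) is linear in x, so scoring the whole corpus amounts to one product of the sparse binary data matrix with a weight vector `q`, plus a constant `c`.

```python
import math

def bayesian_set_weights(query_rows, n_features, alpha, beta):
    """Return (c, q) such that log score(x) = c + sum_j q[j] * x[j].

    query_rows: list of sets holding the active feature indices of each
                query image (a sparse binary representation).
    alpha, beta: per-feature Beta hyperparameters (assumed, not from the paper).
    """
    n = len(query_rows)
    counts = [0] * n_features
    for row in query_rows:
        for j in row:
            counts[j] += 1

    c = 0.0
    q = []
    for j in range(n_features):
        a_post = alpha[j] + counts[j]        # Beta posterior after the query set
        b_post = beta[j] + n - counts[j]
        c += (math.log(alpha[j] + beta[j]) - math.log(alpha[j] + beta[j] + n)
              + math.log(b_post) - math.log(beta[j]))
        q.append(math.log(a_post) - math.log(alpha[j])
                 - math.log(b_post) + math.log(beta[j]))
    return c, q

def rank(corpus_rows, c, q, top_k):
    """Score every corpus image and return the indices of the top_k images."""
    # Each score is one row of the sparse matrix-vector product X q, plus c.
    scores = [c + sum(q[j] for j in row) for row in corpus_rows]
    order = sorted(range(len(corpus_rows)), key=lambda i: scores[i], reverse=True)
    return order[:top_k], scores

# Toy data: 4 binary features, 3 query images, 3 unlabelled images.
n_features = 4
alpha = [1.0] * n_features   # hypothetical uniform prior
beta = [1.0] * n_features
query_rows = [{0, 1}, {0, 1}, {0}]
corpus_rows = [{2, 3}, {0, 1}, {0, 3}]

c, q = bayesian_set_weights(query_rows, n_features, alpha, beta)
top, scores = rank(corpus_rows, c, q, top_k=2)
print(top)
```

Because only the active features of each image contribute to the sum, the cost per image scales with the number of nonzero features rather than the full feature dimension, which is the efficiency property the abstract refers to.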