Push and Pull: Iterative grouping of media
Andrew Gilbert and Richard Bowden
British Machine Vision Conference
We present an approach to iteratively cluster images and video in an efficient and intuitive manner. While many techniques use the traditional approach of time-consuming
groundtruthing of large amounts of data [10, 16, 20, 23], this is increasingly infeasible as dataset size and complexity increase. Furthermore, it is not applicable to the home user,
who wants to intuitively group his/her own media without labelling the content. Instead, we propose a solution that allows the user to select media that semantically belong to
the same class and use machine learning to “pull” this and other related content together.
We introduce an “image signature” descriptor and use min-Hash and greedy clustering to
efficiently present the user with clusters of the dataset using multi-dimensional scaling.
The image signatures of the dataset are then adjusted by APriori data mining, which identifies the common elements between a small subset of image signatures. This is able to
both pull together true positive clusters and push apart false positive examples. The approach is tested on real videos harvested from the web using the state-of-the-art YouTube
dataset. The accuracy of the correct group label increases from 60.4% to 81.7% using 15 iterations of pulling and pushing the media around, while the process takes only 1 minute to compute the pairwise similarities of the image signatures and visualise the whole YouTube dataset.
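As a rough illustration of the min-Hash step named in the abstract, the sketch below estimates the set overlap between two image signatures. It is a minimal sketch under stated assumptions, not the authors' implementation: each signature is assumed to be a non-empty set of non-negative integer element ids smaller than the chosen prime, and the function names (`make_seeds`, `minhash_sketch`, `estimated_similarity`) and the linear hash family are hypothetical choices made for this example.

```python
import random

PRIME = 2_147_483_647  # assumed modulus; element ids must be smaller than this


def make_seeds(count, rng_seed=0):
    """Draw (a, b) coefficient pairs for `count` linear hash functions."""
    rng = random.Random(rng_seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(count)]


def minhash_sketch(signature, seeds):
    """Keep, for each hash function, the smallest hash value over the set."""
    return [min((a * x + b) % PRIME for x in signature) for a, b in seeds]


def estimated_similarity(sketch_a, sketch_b):
    """Fraction of hash functions whose minima agree; approximates Jaccard overlap."""
    matches = sum(1 for u, v in zip(sketch_a, sketch_b) if u == v)
    return matches / len(sketch_a)


def jaccard(set_a, set_b):
    """Exact set overlap, shown only for comparison with the estimate."""
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    seeds = make_seeds(200)
    sig_a = {1, 2, 3, 4, 5, 6}   # hypothetical image signature
    sig_b = {4, 5, 6, 7, 8, 9}   # hypothetical image signature
    estimate = estimated_similarity(minhash_sketch(sig_a, seeds),
                                    minhash_sketch(sig_b, seeds))
    print("estimated:", estimate, "exact:", jaccard(sig_a, sig_b))
```

Because each sketch has a fixed length, comparing two signatures costs the same regardless of how many elements they contain, which is one plausible reason pairwise similarities over a whole dataset can be computed quickly.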
|Deposited By:||Andrew Gilbert|
|Deposited On:||21 February 2012|