Today the most efficient engines for databases containing text and images are based on the so-called “bag-of-words” model. In typical data mining and image classification methods, an unsupervised clustering step precedes the construction of this model. We interpret documents as sets of words and images as sets of low-level features. A single word or feature has little significance and little semantic meaning on its own, but clustering lets us find those that behave similarly. Our assumption is that if two documents or images share many similar details, then at a higher level they also share a similar theme or image object.
I examined three methods from this “bag-of-words” model family. I compare k-means based vector quantization, PLSA (probabilistic latent semantic analysis), and the GMM (Gaussian mixture model) based Fisher vector method with the help of an existing image classification tool, which I extend with my own methods and programs.
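To make the first of these methods concrete, the following is a minimal sketch of k-means based vector quantization, not the implementation used in the classification tool. The function names (`kmeans`, `bag_of_words`), the two-dimensional toy features, and the farthest-first initialization are all assumptions made for illustration; real descriptors have many more dimensions, and other initialization schemes are common.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, point):
    """Index of the centroid (visual word) closest to the point."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(centroids[k], point))


def kmeans(points, k, iterations=20):
    """Plain Lloyd iterations with farthest-first initialization.

    The initialization is an illustrative assumption chosen to keep
    the sketch deterministic.
    """
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(
            squared_distance(c, p) for c in centroids))
        centroids.append(list(farthest))
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(centroids, p)].append(p)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids


def bag_of_words(features, centroids):
    """Normalized histogram of visual-word assignments for one image."""
    hist = [0] * len(centroids)
    for f in features:
        hist[nearest(centroids, f)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Hypothetical toy data: each image is a set of 2-D low-level features.
image_a = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1)]
image_b = [(0.2, 0.1), (5.1, 5.0), (4.9, 5.0)]
image_c = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0)]

# The codebook is learned without labels from all features together.
codebook = kmeans(image_a + image_b + image_c, k=2)
hist_a = bag_of_words(image_a, codebook)
hist_b = bag_of_words(image_b, codebook)
hist_c = bag_of_words(image_c, codebook)
```

In this toy setting `image_a` and `image_c` contain the same mix of features and therefore receive the same histogram, while `image_b` receives a different one. This mirrors the assumption stated above: images that share many low-level details end up with similar higher-level descriptions.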