Improving bag-of-visual-words image retrieval with predictive clustering trees
Journal
Information Sciences
Date Issued
2016-02-01
Author(s)
Kocev, Dragi
Djeroski, Sasho
Abstract
The recent overwhelming increase in the amount of available visual information, especially
digital images, has brought up a pressing need to develop efficient and accurate systems for
image retrieval. State-of-the-art systems for image retrieval use the bag-of-visual-words
representation of images. However, the computational bottleneck in all such systems is
the construction of the visual codebook, i.e., obtaining the visual words. This is typically
performed by clustering hundreds of thousands or millions of local descriptors, where
the resulting clusters correspond to visual words. Each image is then represented by a histogram of the distribution of its local descriptors across the codebook. The major issue in
retrieval systems is that, as image databases grow, the number of local
descriptors to be clustered increases rapidly. Thus, using conventional clustering techniques is infeasible. Considering this, we propose to construct the visual codebook by using
predictive clustering trees (PCTs), which can be constructed and executed efficiently and
have good predictive performance. Moreover, to increase the stability of the model, we
propose to use random forests of predictive clustering trees. We create a random forest
of PCTs that represents both the codebook and the indexing structure. We evaluate the proposed improvement of the bag-of-visual-words approach on three reference datasets and
two additional datasets of 100 K images and 1 M images, and compare it to two
state-of-the-art methods based on approximate k-means and extremely randomized tree
ensembles. The results reveal that the proposed method produces a visual codebook with
superior discriminative power and thus better retrieval performance while maintaining
excellent computational efficiency.
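
The conventional pipeline the abstract describes (cluster local descriptors into visual words, then represent each image as a histogram over the codebook) can be illustrated with a minimal sketch. This is not the authors' method: it uses plain k-means as a stand-in for the predictive clustering trees proposed in the paper, and the function names `build_codebook` and `bovw_histogram`, along with all parameters, are hypothetical choices made for illustration only.

```python
import numpy as np


def build_codebook(descriptors, k, iters=10, seed=0):
    """Cluster local descriptors into k visual words.

    Assumption: plain k-means (Lloyd's algorithm) stands in for the
    codebook construction step; the paper proposes PCTs instead.
    """
    rng = np.random.default_rng(seed)
    # Initialise the centers with k distinct descriptors chosen at random.
    start = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[start].astype(float)
    for _ in range(iters):
        # Squared Euclidean distance from every descriptor to every center.
        d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers


def bovw_histogram(image_descriptors, centers):
    """Represent one image as a normalised histogram over the codebook."""
    d2 = ((image_descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # nearest visual word for each descriptor
    hist = np.bincount(words, minlength=len(centers)).astype(float)
    return hist / hist.sum()
```

The distance computation in this sketch materialises an array of size (number of descriptors) x k x (descriptor dimension), so it is practical only for small inputs. That cost is a simple illustration of why the abstract calls codebook construction the computational bottleneck when millions of descriptors must be clustered.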
File(s)
Name
2015-DimitrovskiEtAl-INFSCI.pdf
Size
3.65 MB
Format
Adobe PDF
Checksum
(MD5):6155b4b0e1e87730d49f64b3455dc311
