Next: R-tree Up: Fast k nearest neighbour Previous: Abstract

Introduction

The R-tree [6] is one of the more popular spatial access methods, and a number of R-tree variants have been developed. The R-tree is a hierarchical tree in which each higher-level node holds an MBR (minimum bounding rectangle) that encloses a set of child MBRs or data objects at the level below.
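
To make the enclosing relationship concrete, the following is a minimal sketch of how an MBR could be computed for a leaf (from data points) and for an internal node (from child MBRs). The names `mbr_of_points` and `mbr_of_rects` and the (lower corner, upper corner) tuple representation are assumptions made for illustration; they are not taken from [6].

```python
from typing import Sequence, Tuple

# An MBR is represented here as (lower corner, upper corner); this layout is
# an illustrative assumption, not a structure prescribed by the R-tree paper.
Rect = Tuple[Tuple[float, ...], Tuple[float, ...]]


def mbr_of_points(points: Sequence[Sequence[float]]) -> Rect:
    """Smallest axis-aligned rectangle enclosing a set of data points."""
    dims = range(len(points[0]))
    lo = tuple(min(p[d] for p in points) for d in dims)
    hi = tuple(max(p[d] for p in points) for d in dims)
    return lo, hi


def mbr_of_rects(rects: Sequence[Rect]) -> Rect:
    """Smallest axis-aligned rectangle enclosing a set of child MBRs."""
    dims = range(len(rects[0][0]))
    lo = tuple(min(r[0][d] for r in rects) for d in dims)
    hi = tuple(max(r[1][d] for r in rects) for d in dims)
    return lo, hi


if __name__ == "__main__":
    leaf_a = mbr_of_points([(1, 5), (3, 2), (2, 4)])
    leaf_b = mbr_of_points([(6, 1), (7, 3)])
    parent = mbr_of_rects([leaf_a, leaf_b])
    print(leaf_a, leaf_b, parent)
```

A parent MBR built this way always contains its children, but sibling MBRs are free to overlap one another.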

Beckmann et al. [1] developed the R*-tree, which is more efficient in insertion and space utilization than the R-tree. They proposed a forced reinsertion technique to reduce the likelihood of splitting a full node. The R*-tree has been used for image applications such as QBIC [5] and for content-based retrieval and navigation in a hypermedia system [9].

Kamel et al. [7] developed an improved R-tree, the Hilbert R-tree, which uses a Hilbert space-filling curve to group data objects [2] [3]. They show that the Hilbert R-tree outperforms the R*-tree. The Hilbert curve has been shown to preserve spatial locality better than other space-filling curves [8].
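
As an illustration of how a Hilbert curve can order objects for grouping, the sketch below maps a cell of a two-dimensional integer grid to its position along the curve, using the widely known iterative bit-manipulation formulation. It is restricted to two dimensions and to a grid whose side `n` is a power of two; the function name `hilbert_index` is hypothetical, and this is not claimed to be the procedure used in [7].

```python
def hilbert_index(n: int, x: int, y: int) -> int:
    """Position of grid cell (x, y) along a Hilbert curve over an n x n grid.

    Assumes n is a power of two and 0 <= x, y < n (2-D illustration only).
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has a standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


if __name__ == "__main__":
    # Sorting by Hilbert index places spatially close cells near each other.
    cells = [(3, 0), (0, 0), (2, 2), (0, 3), (1, 1)]
    print(sorted(cells, key=lambda c: hilbert_index(4, c[0], c[1])))
```

Objects sorted by this value can then be packed into nodes in order, so that each node tends to hold spatially neighbouring objects.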

Although there has been much research on the R-tree and its variants, k nearest neighbour (knn) search for the R-tree family has received little attention. Roussopoulos et al. [10] developed the first knn search for the R-tree. Their method computes an optimistic (MINDIST) and a pessimistic (MINMAXDIST) distance between the query object and each multi-dimensional MBR, and uses these distances in several pruning strategies. The main application area motivating our work, however, is content-based retrieval from large image databases using high-dimensional image feature vectors. Since multiple instances of similar images occur, the data tend to be clustered in the feature space. If a clustered database is indexed by an R-tree, the MBRs of many nodes will overlap heavily with those of their siblings. This may cause Roussopoulos et al.'s knn method to search unnecessary rectangles.
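
For readers unfamiliar with the two metrics, the following sketch computes them for a query point and an MBR, following the usual squared-Euclidean definitions associated with [10]. The function names and the representation of an MBR as separate `lo` and `hi` corner sequences are assumptions made for illustration.

```python
from typing import Sequence


def mindist(p: Sequence[float], lo: Sequence[float], hi: Sequence[float]) -> float:
    """Squared distance from p to the nearest point of the MBR [lo, hi].

    Optimistic bound: no object inside the MBR can be closer than this.
    Returns 0 when p lies inside the MBR.
    """
    total = 0.0
    for x, s, t in zip(p, lo, hi):
        if x < s:
            total += (s - x) ** 2
        elif x > t:
            total += (x - t) ** 2
    return total


def minmaxdist(p: Sequence[float], lo: Sequence[float], hi: Sequence[float]) -> float:
    """Pessimistic bound: squared distance within which the MBR must hold an object.

    For each dimension k, take the nearer face in k and the farther face in
    every other dimension, then keep the minimum over k.
    """
    near = [(x - (s if x <= (s + t) / 2 else t)) ** 2 for x, s, t in zip(p, lo, hi)]
    far = [(x - (s if x >= (s + t) / 2 else t)) ** 2 for x, s, t in zip(p, lo, hi)]
    n = len(near)
    return min(
        near[k] + sum(far[i] for i in range(n) if i != k)
        for k in range(n)
    )


if __name__ == "__main__":
    query, lo, hi = (0, 0), (1, 1), (3, 2)
    print(mindist(query, lo, hi), minmaxdist(query, lo, hi))
```

When the query point lies inside several overlapping MBRs, MINDIST is zero for all of them, so it cannot rank those siblings against one another.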

This paper introduces a simplified version of knn search. The method is similar to Roussopoulos et al.'s knn method. All rectangles that overlap the query point are gathered and processed together at each level of the R-tree. The calculation of Roussopoulos et al.'s pessimistic distance between the MBRs and the query point is avoided, which speeds up the knn search. Section 2 shows the basic structure of the R-tree. Section 3 briefly explains Roussopoulos et al.'s knn search method. Section 4 describes the simplified knn search for the R-tree. Finally, section 5 presents comparative experiments between Roussopoulos et al.'s knn search and the simplified knn search on the Hilbert R-tree, in terms of the number of dimensions and the value of k. We used a clustered database for all the experiments (image-based feature vectors derived from sub-images).






Joseph Kuan
Wed Jun 3 13:57:27 BST 1998