The R-tree [6] is one of the more popular spatial access methods, and a number of R-tree variants have been developed. The R-tree is a hierarchical tree in which each higher-level node holds an MBR (minimum bounding rectangle) that encloses a set of child MBRs or data objects at the level below.
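
As a minimal illustration of the enclosing property, the sketch below computes the MBR of a set of child rectangles. The representation of a rectangle as a pair of corner tuples and the name `enclosing_mbr` are assumptions made for this example only, not part of any particular R-tree implementation.

```python
from typing import Sequence, Tuple

# A rectangle is assumed to be (lower corner, upper corner), one value per dimension.
Rect = Tuple[Tuple[float, ...], Tuple[float, ...]]


def enclosing_mbr(children: Sequence[Rect]) -> Rect:
    """Return the smallest axis-aligned rectangle containing all children."""
    if not children:
        raise ValueError("at least one child rectangle is required")
    # Per dimension: take the minimum of the lower corners
    # and the maximum of the upper corners.
    lows = tuple(min(vals) for vals in zip(*(c[0] for c in children)))
    highs = tuple(max(vals) for vals in zip(*(c[1] for c in children)))
    return lows, highs


parent = enclosing_mbr([((0.0, 0.0), (1.0, 1.0)), ((2.0, -1.0), (3.0, 0.5))])
```

A data point can be treated as a degenerate rectangle whose two corners coincide, so the same function covers leaf entries.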

Beckmann et al. [1] developed the R*-tree, which is more efficient in insertion and space utilization than the R-tree. They proposed a forced reinsertion technique that reduces the likelihood of splitting a full node. The R*-tree has been used in image applications such as QBIC [5] and in content-based retrieval and navigation in a hypermedia system [9].

Kamel et al. [7] developed an improved R-tree, the Hilbert R-tree, which uses a Hilbert space-filling curve to group data objects [2] [3]. They show that the Hilbert R-tree outperforms the R*-tree. The Hilbert space-filling curve has been shown to preserve spatial locality better than other space-filling curves [8].
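
To make the grouping idea concrete, the following sketch maps cells of a two-dimensional grid to their position along a Hilbert curve and sorts points by that position, so that neighbours in the sorted order are also close in space. It uses the standard bitwise construction for the 2-D curve; the function names, the restriction to two dimensions, and the integer grid are assumptions for illustration and do not reproduce the construction used in [7].

```python
from typing import List, Tuple


def hilbert_index(n: int, x: int, y: int) -> int:
    """Position of cell (x, y) along the Hilbert curve on an n x n grid.

    n is assumed to be a power of two, with 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def hilbert_sort(points: List[Tuple[int, int]], n: int) -> List[Tuple[int, int]]:
    """Order grid points by their Hilbert index."""
    return sorted(points, key=lambda p: hilbert_index(n, p[0], p[1]))


ordered = hilbert_sort([(3, 3), (0, 0), (1, 2), (0, 1)], n=4)
```

Consecutive runs of the sorted list could then be packed into the same leaf node, which is the sense in which a space-filling curve "groups" objects.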

Although there has been much research on the R-tree and its variants,
little work has addressed *k* nearest neighbour (*knn*) search for the
R-tree family.
Roussopoulos et al. [10] developed the first *knn* search
for the R-tree. It computes
an optimistic (MINDIST) and a pessimistic (MINMAXDIST) distance between the
query object and each
multi-dimensional MBR, and uses them in several pruning strategies.
However, the main application area motivating our work is content-based
retrieval from large image databases, using high-dimensional image feature
vectors. Since multiple instances of similar images occur, the data tend to
be clustered in the feature space.
If a clustered database is indexed by the R-tree, then the MBRs of
many nodes will overlap heavily with those of their siblings. This may
cause Roussopoulos et al.'s *knn* method to search rectangles unnecessarily.
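
The two distances named above can be sketched as follows, using the usual squared-Euclidean formulation: MINDIST is the distance from the query point to the nearest point of the rectangle, and MINMAXDIST is the smallest distance within which some object inside the rectangle is guaranteed to lie. The function names and the representation of an MBR as separate lower and upper corner sequences are assumptions for this example.

```python
from typing import Sequence


def mindist(p: Sequence[float], lo: Sequence[float], hi: Sequence[float]) -> float:
    """Squared distance from point p to the nearest point of the MBR [lo, hi]."""
    total = 0.0
    for x, s, t in zip(p, lo, hi):
        if x < s:
            total += (s - x) ** 2
        elif x > t:
            total += (x - t) ** 2
        # Otherwise p lies within the MBR's extent in this dimension.
    return total


def minmaxdist(p: Sequence[float], lo: Sequence[float], hi: Sequence[float]) -> float:
    """Squared upper bound on the distance to the nearest object in the MBR."""
    # Per dimension: squared distance to the nearer and to the farther edge.
    near = [min((x - s) ** 2, (x - t) ** 2) for x, s, t in zip(p, lo, hi)]
    far = [max((x - s) ** 2, (x - t) ** 2) for x, s, t in zip(p, lo, hi)]
    far_sum = sum(far)
    # For each dimension k, take the nearer edge in k and the farther
    # edge in every other dimension, then keep the smallest such value.
    return float(min(far_sum - f + n for n, f in zip(near, far)))


p, lo, hi = (0.0, 0.0), (1.0, 1.0), (3.0, 2.0)
optimistic = mindist(p, lo, hi)      # 2.0
pessimistic = minmaxdist(p, lo, hi)  # 5.0
```

When sibling MBRs overlap heavily, many of them have a MINDIST of zero or close to it for a query point inside a cluster, so a bound based on MINDIST alone excludes few of them.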

A simplified version of *knn* search
is introduced. The method is similar to Roussopoulos et al.'s *knn* method.
At each level of the R-tree, all rectangles that overlap the query point
are gathered and processed together.
The calculation of Roussopoulos et al.'s pessimistic distance between
the MBRs and the query point is avoided, which speeds up the *knn* search.
Section 2 shows the basic structure of the R-tree.
Section 3 briefly explains Roussopoulos et al.'s *knn* search method.
The simplified version of *knn* search for the R-tree is
described in Section 4. Finally, Section 5 presents comparative
experiments between Roussopoulos et al.'s *knn* search and the simplified *knn*
search on the Hilbert R-tree, varying the number of dimensions and the value of *k*.
All experiments use a clustered database of image feature vectors derived
from sub-images.

Wed Jun 3 13:57:27 BST 1998