
The k Nearest Neighbour Search

 

Figure 2: Example of MinDist and MinMaxDist

Roussopoulos et al.'s knn search uses the square of the Euclidean distance as the distance measure between objects. They derive two metrics, MINDIST and MINMAXDIST, which give optimistic and pessimistic bounds on the distance between a query point and an MBR. Figure 2 shows MINDIST and MINMAXDIST in a 2D case.

MINDIST is the minimum possible distance between a query point and an MBR: no child rectangle or data point inside the MBR can be closer to the query point than this value. The MINDIST between a query point P and an MBR R is calculated as:

\[ \mathrm{MINDIST}(P, R) = \sum_{i=1}^{n} |p_i - r_i|^2 \]

where n is the number of dimensions of P and R, $p_i$ is the coordinate of point P in the ith dimension, and

\[ r_i = \begin{cases} s_i & \text{if } p_i < s_i \\ t_i & \text{if } p_i > t_i \\ p_i & \text{otherwise} \end{cases} \]

where $s_i$ and $t_i$ are the minimum and maximum values of the ith dimension of rectangle R. If P is located inside the rectangle R or on the perimeter of R, then MINDIST = 0.
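The MINDIST calculation can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `mindist` and the argument names `p` (query point), `s` and `t` (minimum and maximum corners of R) are assumptions made for this example. The result is a squared distance, matching the measure used above.

```python
def mindist(p, s, t):
    """Squared MINDIST between point p and the MBR with corners s (min) and t (max).

    All three arguments are sequences of equal length n (hypothetical names).
    """
    total = 0.0
    for p_i, s_i, t_i in zip(p, s, t):
        if p_i < s_i:
            r_i = s_i      # P lies below the rectangle in this dimension
        elif p_i > t_i:
            r_i = t_i      # P lies above the rectangle in this dimension
        else:
            r_i = p_i      # P is within the rectangle's extent: no contribution
        total += (p_i - r_i) ** 2
    return total


# Query point outside the rectangle [1,3] x [2,4]: 1^2 + 2^2
print(mindist((0.0, 0.0), (1.0, 2.0), (3.0, 4.0)))  # 5.0
# Query point inside the rectangle
print(mindist((2.0, 3.0), (1.0, 2.0), (3.0, 4.0)))  # 0.0
```

Only the dimensions in which P falls outside the extent of R contribute to the sum, which is why a point inside R or on its perimeter yields 0.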

MINMAXDIST is the smallest distance within which at least one data object inside the MBR is guaranteed to be found. It is more computationally intensive than MINDIST, as shown by the calculations in [10]. For the definition of MINMAXDIST, we refer readers to the Roussopoulos et al. paper, as our paper concentrates on using MINDIST for knn search.
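For orientation only, the following Python sketch shows one way to compute MINMAXDIST. The formula is the one given by Roussopoulos et al., not one derived in this section, and the names `minmaxdist`, `p`, `s` and `t` (query point, minimum and maximum corners of the rectangle) are assumptions made for this example. For each dimension k, it takes the nearer face of the rectangle in that dimension and the farthest point on that face, then keeps the smallest such squared distance.

```python
def minmaxdist(p, s, t):
    """Squared MINMAXDIST between point p and the MBR with corners s (min) and t (max)."""
    n = len(p)
    # Nearer face coordinate in each dimension.
    near = [s_k if p_k <= (s_k + t_k) / 2 else t_k
            for p_k, s_k, t_k in zip(p, s, t)]
    # Farther face coordinate in each dimension.
    far = [s_i if p_i >= (s_i + t_i) / 2 else t_i
           for p_i, s_i, t_i in zip(p, s, t)]
    near_sq = [(p[i] - near[i]) ** 2 for i in range(n)]
    far_sq = [(p[i] - far[i]) ** 2 for i in range(n)]
    # For each k: nearer face in dimension k, farthest extent in all others.
    return min(
        near_sq[k] + sum(far_sq[i] for i in range(n) if i != k)
        for k in range(n)
    )


# Rectangle [1,3] x [2,4] seen from the origin.
print(minmaxdist((0.0, 0.0), (1.0, 2.0), (3.0, 4.0)))  # 13.0
```

The nested sum makes this quadratic in the number of dimensions as written, which illustrates why MINMAXDIST costs more to evaluate than MINDIST.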

Roussopoulos et al. proposed using a branchlist when visiting a node, to determine whether any of its children can lie within the current maximum distance of the knn. The branchlist holds the MINDIST and MINMAXDIST of each MBR to the query object, and is sorted in ascending order of MINDIST. Three pruning strategies were suggested for the branchlist and the knn distance list. They are used to reduce the search path for knn:

  1. Any rectangle M with a MINDIST that is greater than the MINMAXDIST value of another rectangle $M'$ can be pruned from the branchlist.
  2. The furthest data object in the knn list can be omitted if its Euclidean distance to the query point is greater than the MINMAXDIST of a rectangle.
  3. Any rectangle that has MINDIST greater than the furthest distance in the knn list can be discarded.
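The branchlist and two of the pruning strategies above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' code: the `Branch` tuple and all function names are hypothetical, MINDIST and MINMAXDIST values are taken as precomputed squared distances, and strategy 3 is applied only once the knn list already holds k entries. Strategy 2, which updates the knn list itself, is left out.

```python
from typing import NamedTuple


class Branch(NamedTuple):
    """One branchlist entry (hypothetical layout)."""
    name: str
    mindist: float
    minmaxdist: float


def build_branchlist(branches):
    """Sort entries in ascending order of MINDIST."""
    return sorted(branches, key=lambda b: b.mindist)


def prune_strategy1(branchlist):
    """Drop any entry whose MINDIST exceeds another entry's MINMAXDIST.

    Taking the minimum over all entries is safe because an entry's own
    MINDIST never exceeds its own MINMAXDIST.
    """
    if not branchlist:
        return []
    bound = min(b.minmaxdist for b in branchlist)
    return [b for b in branchlist if b.mindist <= bound]


def prune_strategy3(branchlist, knn_dists, k):
    """Drop entries whose MINDIST exceeds the furthest distance in the knn list.

    Assumption: nothing is pruned until the knn list holds k distances.
    """
    if len(knn_dists) < k:
        return list(branchlist)
    furthest = max(knn_dists)
    return [b for b in branchlist if b.mindist <= furthest]


branchlist = build_branchlist([
    Branch("A", 5.0, 13.0),
    Branch("B", 20.0, 30.0),
    Branch("C", 9.0, 12.0),
])
after1 = prune_strategy1(branchlist)                       # B is pruned: 20 > 12
after3 = prune_strategy3(after1, knn_dists=[4.0, 8.0], k=2)  # C is pruned: 9 > 8
print([b.name for b in after1], [b.name for b in after3])
```

Because the branchlist is sorted by MINDIST, a practical implementation could stop scanning at the first entry that fails the strategy 3 test; the list comprehension here favours clarity over that shortcut.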





Joseph Kuan
Wed Jun 3 13:57:27 BST 1998