Report ID
2000-24
Report Authors
Hakan Ferhatosmanoglu, Ertem Tuncel, Divyakant Agrawal, Amr ElAbbadi
Report Date
Abstract
In this paper, we develop a general framework for approximate nearest neighbor queries. We categorize the current approaches for nearestneighbor query processing based on either their ability to reduce the dataset that needs to be examined, or their ability to reduce the representation size of each data object. We first propose modifications towell-known techniques to support the progressive processing of approximatenearest neighbor queries. A user may therefore stop the retrieval processonce enough information has been returned. We then develop a new techniquebased on clustering that merges the benefits of the two general classes ofapproaches. Our cluster-based approach allows a user to progressivelyexplore the approximate results with increasing accuracy. We propose a newmetric for evaluation of approximate nearest neighbor searching techniques. Using both the proposed and the traditional metrics, weanalyze and compare several techniques with a detailed performance evaluation. We demonstrate the feasibility and efficiency of approximatenearest neighbor searching. We perform experiments on several real datasets and establish the superiority of the proposed cluster-based techniqueover the existing techniques for approximate nearest neighbor searching.
Document
2000-24.ps408.99 KB