Qin-Zhen Guo
Chinese Academy of Sciences, China
Title: Hashing and Vector Quantization Based Hierarchical Techniques for Approximate Nearest Neighbor Search
Biography
Biography: Qin-Zhen Guo
Abstract
Fast approximate nearest neighbor search techniques play an important role in large scale database search. Hashing based methods which convert the original data into binary codes have two advantages, high retrieval efficiency and low memory cost. But due to the thick boundary in Hamming space, the hashing based methods can not achieve ideal retrieval precision. Vector quantization, especially product quantization (PQ), based methods, which use a large codebook to quantize the data to reduce the cardinality of the original data space, are another class of approximate nearest neighbor search methods. There are also two advantages with PQ based methods, low memory cost and high retrieval precision. However, compared to hashing based methods, the retrieval efficiency of PQ based methods is lower. Considering the strengths and weaknesses of hashing and PQ methods, we have proposed a hierarchical method which combines hashing based methods and PQ based methods. Since the hashing methods have high retrieval efficiency, firstly, we use hashing methods to filter the obviously distant data. Then use PQ based methods to search the data retrieved by hashing methods since they have better retrieval precision. Experiments have shown that in large scale database, the hierarchical method can achieve better results than hashing based methods and higher retrieval efficiency than PQ based methods.