DOI: 10.1145/3448016.3457240
research-article
Open access

Point-to-Hyperplane Nearest Neighbor Search Beyond the Unit Hypersphere

Published: 18 June 2021

Abstract

Point-to-Hyperplane Nearest Neighbor Search (P2HNNS) is a fundamental yet challenging problem with applications in many fields. Existing hyperplane hashing schemes enjoy sub-linear query time and achieve excellent performance on applications such as large-scale active learning with Support Vector Machines (SVMs). However, they handle this problem only under a strong assumption: that all data objects are normalized to lie on the unit hypersphere. Without this assumption, those hyperplane hashing schemes may be arbitrarily bad. In this paper, we introduce a new asymmetric transformation and develop the first two provable hyperplane hashing schemes, Nearest Hyperplane hashing (NH) and Furthest Hyperplane hashing (FH), for high-dimensional P2HNNS beyond the unit hypersphere. With this asymmetric transformation, we demonstrate that the hash functions of NH and FH are locality-sensitive to hyperplane queries, and both enjoy quality guarantees on query results. Moreover, we propose a data-dependent multi-partition strategy to boost the search performance of FH. NH can perform hyperplane queries in sub-linear time, while FH enjoys better practical performance. We evaluate NH and FH on five real-life datasets and show that they are around 3~100x faster than the best competitor on four of the five datasets, especially for recalls in [20%, 80%]. Code is available at https://github.com/HuangQiang/P2HNNS.
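As context for the problem the abstract describes: given a query hyperplane with normal vector a and offset b, P2HNNS asks for the data point closest to that hyperplane. The sketch below is a minimal brute-force baseline using the standard point-to-hyperplane distance |a·x + b| / ||a||; it is not the paper's NH/FH hashing (which exists precisely to avoid this O(nd) linear scan), and all names here are illustrative.

```python
import numpy as np

def p2h_distances(X, a, b=0.0):
    """Distance from each row of X to the hyperplane {y : a.y + b = 0}."""
    return np.abs(X @ a + b) / np.linalg.norm(a)

def p2h_nn(X, a, b=0.0):
    """Index of the point nearest to the hyperplane (brute force, O(nd))."""
    return int(np.argmin(p2h_distances(X, a, b)))

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))   # points need NOT lie on the unit hypersphere
a = rng.normal(size=8)           # hyperplane normal of arbitrary norm
nearest = p2h_nn(X, a)
```

Note that nothing here requires ||x|| = 1 for the data points; handling exactly this unnormalized setting with sub-linear query time is what the proposed NH and FH schemes address.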

Supplementary Material

MP4 File (3448016.3457240.mp4)



Published In

SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data
June 2021
2969 pages
ISBN:9781450383431
DOI:10.1145/3448016
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. active learning
  2. asymmetric transformation
  3. furthest neighbor search
  4. locality-sensitive hashing
  5. nearest neighbor search


Funding Sources

  • National Research Foundation Singapore under its Strategic Capability Research Centres Funding Initiative

Conference

SIGMOD/PODS '21

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Article Metrics

  • Downloads (last 12 months): 248
  • Downloads (last 6 weeks): 44
Reflects downloads up to 25 Nov 2024


Cited By

  • (2024) From Zero to Hero: Detecting Leaked Data through Synthetic Data Injection and Model Querying. Proceedings of the VLDB Endowment, 17(8), 1898-1910. DOI: 10.14778/3659437.3659446
  • (2024) Cluster-Based Graph Collaborative Filtering. ACM Transactions on Information Systems, 42(6), 1-24. DOI: 10.1145/3687481
  • (2024) HJG: An Effective Hierarchical Joint Graph for ANNS in Multi-Metric Spaces. 2024 IEEE 40th International Conference on Data Engineering (ICDE), 4275-4287. DOI: 10.1109/ICDE60146.2024.00326
  • (2023) A New Sparse Data Clustering Method Based On Frequent Items. Proceedings of the ACM on Management of Data, 1(1), 1-28. DOI: 10.1145/3588685
  • (2023) Co-design Hardware and Algorithm for Vector Search. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 1-15. DOI: 10.1145/3581784.3607045
  • (2023) DB-LSH 2.0: Locality-Sensitive Hashing With Query-Based Dynamic Bucketing. IEEE Transactions on Knowledge and Data Engineering, 36(3), 1000-1015. DOI: 10.1109/TKDE.2023.3295831
  • (2023) Lightweight-Yet-Efficient: Revitalizing Ball-Tree for Point-to-Hyperplane Nearest Neighbor Search. 2023 IEEE 39th International Conference on Data Engineering (ICDE), 436-449. DOI: 10.1109/ICDE55515.2023.00040
  • (2022) MQH. Proceedings of the VLDB Endowment, 16(4), 864-876. DOI: 10.14778/3574245.3574269
  • (2022) ONe Index for All Kernels (ONIAK). Proceedings of the VLDB Endowment, 15(13), 3937-3949. DOI: 10.14778/3565838.3565847
  • (2022) DB-LSH: Locality-Sensitive Hashing with Query-based Dynamic Bucketing. 2022 IEEE 38th International Conference on Data Engineering (ICDE), 2250-2262. DOI: 10.1109/ICDE53745.2022.00214
