research-article

Fast neural ranking on bipartite graph indices

Authors:

Ping Li

Proceedings of the VLDB Endowment, Volume 15, Issue 4

Pages 794 - 803

https://doi.org/10.14778/3503585.3503589

Published: 01 December 2021

Abstract

Neural network based ranking has been widely adopted owing to its powerful capacity for modeling complex relationships (e.g., between users and items, or between questions and answers). Online neural network ranking, i.e., so-called fast neural ranking, is considered a challenging task because neural network measures are in general non-convex and asymmetric. Traditional approximate near neighbor (ANN) search, which typically focuses on metric ranking measures, is not applicable to these complex measures. To tackle this challenge, we propose to construct BipartitE Graph INdices (BEGIN) for fast neural ranking. BEGIN contains two types of nodes: base/searching objects and sampled queries. The edges connecting these two types of nodes are constructed via the neural network ranking measure. The proposed algorithm is a natural extension of traditional search-on-graph methods and is better suited to fast neural ranking. Experiments demonstrate the effectiveness and efficiency of the proposed method.
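The idea of a bipartite index can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how edges are selected or how the graph is traversed, so the top-k edge rule, the two-hop best-first search, the entry-point choice, and every name below (`toy_measure`, `build_index`, `search`, `k`, `ef`) are assumptions made for illustration. `toy_measure` is a hypothetical stand-in for a trained neural ranking model.

```python
import heapq
import random
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Measure = Callable[[Vector, Vector], float]  # f(query, item) -> relevance score
Index = Tuple[List[List[int]], Dict[int, List[int]]]


def toy_measure(query: Vector, item: Vector) -> float:
    """Hypothetical stand-in for a neural ranking measure (asymmetric, non-metric)."""
    match = sum(max(0.0, q * x) for q, x in zip(query, item))
    return match - 0.1 * sum(x * x for x in item)


def build_index(items: List[Vector], sampled_queries: List[Vector],
                measure: Measure, k: int) -> Index:
    """Assumed edge rule: link each sampled query to its k best-scoring base objects."""
    query_to_items: List[List[int]] = []
    item_to_queries: Dict[int, List[int]] = {i: [] for i in range(len(items))}
    for qi, q in enumerate(sampled_queries):
        top = heapq.nlargest(k, range(len(items)),
                             key=lambda i: measure(q, items[i]))
        query_to_items.append(top)
        for i in top:
            item_to_queries[i].append(qi)
    return query_to_items, item_to_queries


def search(query: Vector, items: List[Vector], index: Index,
           measure: Measure, top_n: int, ef: int = 32) -> List[int]:
    """Best-first search that hops item -> sampled query -> items."""
    query_to_items, item_to_queries = index
    visited = set()
    candidates: List[Tuple[float, int]] = []  # max-heap via negated scores
    results: List[Tuple[float, int]] = []     # min-heap holding the best ef items
    # Assumed entry points: the base objects attached to sampled query 0.
    for i in query_to_items[0]:
        visited.add(i)
        s = measure(query, items[i])
        heapq.heappush(candidates, (-s, i))
        heapq.heappush(results, (s, i))
    while len(results) > ef:
        heapq.heappop(results)
    while candidates:
        neg_s, i = heapq.heappop(candidates)
        if len(results) >= ef and -neg_s < results[0][0]:
            break  # the best open candidate cannot improve the result set
        for qi in item_to_queries[i]:
            for j in query_to_items[qi]:
                if j in visited:
                    continue
                visited.add(j)
                s = measure(query, items[j])
                if len(results) < ef or s > results[0][0]:
                    heapq.heappush(candidates, (-s, j))
                    heapq.heappush(results, (s, j))
                    if len(results) > ef:
                        heapq.heappop(results)
    return [i for _, i in heapq.nlargest(top_n, results)]


if __name__ == "__main__":
    rng = random.Random(0)
    items = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(500)]
    sampled = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(200)]
    index = build_index(items, sampled, toy_measure, k=10)
    print(search(sampled[3], items, index, toy_measure, top_n=5))
```

In this sketch the measure is only ever evaluated as f(query, item), never between two items, which is why sampled queries serve as the bridge between base objects. Base objects that no sampled query links to are unreachable here, so a practical index would need enough sampled queries, or an additional linking step, to cover the collection.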

Index Terms

Fast neural ranking on bipartite graph indices
1. Information systems
  1. Information retrieval

Index terms have been assigned to the content through auto-classification.

Information & Contributors

Information

Published In

Proceedings of the VLDB Endowment Volume 15, Issue 4

December 2021

246 pages

ISSN: 2150-8097

Editors: Juliana Freire (New York University), Xuemin Lin (University of New South Wales)

Publisher

VLDB Endowment

Bibliometrics & Citations

Bibliometrics

Total Citations: 3
Total Downloads: 74
Downloads (Last 12 months): 27
Downloads (Last 6 weeks): 2

Reflects downloads up to 26 Sep 2024

Citations

Cited By

Chen M, Zhang K, He Z, Jing Y, Wang X (2024). RoarGraph: A Projected Bipartite Graph for Efficient Cross-Modal Approximate Nearest Neighbor Search. Proceedings of the VLDB Endowment 17:11 (2735-2749). Online publication date: 1-Jul-2024.
https://dl.acm.org/doi/10.14778/3681954.3681959
Zhao W, Tan S, Li P (2024). GUITAR: Gradient Pruning toward Fast Neural Ranking. In: Hui Yang G, Wang H, Han S, Hauff C, Zuccon G, Zhang Y (eds.), Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (163-173). Online publication date: 10-Jul-2024.
https://dl.acm.org/doi/10.1145/3626772.3657728
Doan K, Tan S, Zhao W, Li P (2023). Asymmetric Hashing for Fast Ranking via Neural Network Measures. In: Chen H, Duh W, Huang H, Kato M, Mothe J, Poblete B (eds.), Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (697-707). Online publication date: 19-Jul-2023.
https://dl.acm.org/doi/10.1145/3539618.3591640
Yu T, Liu J, Yang Y, Li Y, Fei H, Li P (2022). EGM: Enhanced Graph-based Model for Large-scale Video Advertisement Search. In: Zhang A, Rangwala H (eds.), Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (4443-4451). Online publication date: 14-Aug-2022.
https://dl.acm.org/doi/10.1145/3534678.3539061
