Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Fast neural ranking on bipartite graph indices

Published: 01 December 2021 Publication History

Abstract

Neural network based ranking has been widely adopted owing to its powerful capacity in modeling complex relationships (e.g., users and items, questions and answers). Online neural network ranking, i.e., the so called fast neural ranking, is considered a challenging task because neural network measures are in general non-convex and asymmetric. Traditional approximate near neighbor (ANN) search which typically focuses on metric ranking measures, is not applicable to these complex measures. To tackle this challenge, in this paper, we propose to construct BipartitE Graph INdices (BEGIN) for fast neural ranking. BEGIN contains two types of nodes: base/searching objects and sampled queries. The edges connecting these types of nodes are constructed via the neural network ranking measure. The proposed algorithm is a natural extension from traditional search on graph methods and is more suitable for fast neural ranking. Experiments demonstrate the effectiveness and efficiency of the proposed method.

References

[1]
Lawrence Cayton. 2008. Fast nearest neighbor retrieval for bregman divergences. In Proceedings of the Twenty-Fifth International Conference on Machine learning (ICML). Helsinki, Finland, 112--119.
[2]
Wei-Cheng Chang, Felix X Yu, Yin-Wen Chang, Yiming Yang, and Sanjiv Kumar. 2020. Pre-training Tasks for Embedding-based Large-scale Retrieval. In Proceedings of the 8th International Conference on Learning Representations (ICLR). Addis Ababa.
[3]
Moses S. Charikar. 2002. Similarity estimation techniques from rounding algorithms. In Proceedings on 34th Annual ACM Symposium on Theory of Computing (STOC). Montreal, Canada, 380--388.
[4]
Danqi Chen, Adam Fisch, Jason Weston, and Antoine Bordes. 2017. Reading Wikipedia to Answer Open-Domain Questions. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL). Vancouver, Canada, 1870--1879.
[5]
Paul Covington, Jay Adams, and Emre Sargin. 2016. Deep Neural Networks for YouTube Recommendations. In Proceedings of the 10th ACM Conference on Recommender Systems (RecSys). Boston, MA, 191--198.
[6]
Ryan R Curtin and Parikshit Ram. 2014. Dual-tree fast exact max-kernel search. Statistical Analysis and Data Mining: The ASA Data Science Journal 7, 4 (2014), 229--253.
[7]
Ryan R Curtin, Parikshit Ram, and Alexander G Gray. 2013. Fast exact max-kernel search. In Proceedings of the 13th SIAM International Conference on Data Mining (SDM). Austin, TX, 1--9.
[8]
Mostafa Dehghani, Hamed Zamani, Aliaksei Severyn, Jaap Kamps, and W. Bruce Croft. 2017. Neural Ranking Models with Weak Supervision. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). Shinjuku, Tokyo, 65--74.
[9]
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT). Minneapolis, MN, 4171--4186.
[10]
Cong Fu, Chao Xiang, Changxu Wang, and Deng Cai. 2019. Fast approximate nearest neighbor search with the navigating spreading-out graph. Proceedings of the VLDB Endowment 12, 5 (2019), 461--474.
[11]
Tiezheng Ge, Kaiming He, Qifa Ke, and Jian Sun. 2013. Optimized Product Quantization for Approximate Nearest Neighbor Search. In Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Portland, OR, 2946--2953.
[12]
Aristides Gionis, Piotr Indyk, and Rajeev Motwani. 1999. Similarity Search in High Dimensions via Hashing. In Proceedings of 25th International Conference on Very Large Data Bases (VLDB). Edinburgh, Scotland, UK, 518--529.
[13]
Michel X. Goemans and David P. Williamson. 1995. Improved Approximation Algorithms for Maximum Cut and Satisfiability Problems Using Semidefinite Programming. J. ACM 42, 6 (1995), 1115--1145.
[14]
Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, and Xiuqiang He. 2017. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI). Melbourne, Australia, 1725--1731.
[15]
Jiafeng Guo, Yixing Fan, Qingyao Ai, and W. Bruce Croft. 2016. A Deep Relevance Matching Model for Ad-hoc Retrieval. In Proceedings of the 25th ACM International Conference on Information and Knowledge Management (CIKM). Indianapolis, IN, 55--64.
[16]
Jiafeng Guo, Yixing Fan, Liang Pang, Liu Yang, Qingyao Ai, Hamed Zamani, Chen Wu, W. Bruce Croft, and Xueqi Cheng. 2020. A Deep Look into neural ranking models for information retrieval. Inf. Process. Manag. 57, 6 (2020), 102067.
[17]
Kiana Hajebi, Yasin Abbasi-Yadkori, Hossein Shahbazi, and Hong Zhang. 2011. Fast Approximate Nearest-Neighbor Search with k-Nearest Neighbor Graph. In Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI). Barcelona, CA, 1312--1317.
[18]
Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. 2017. Neural collaborative filtering. In Proceedings of the 26th International Conference on World Wide Web (WWW). Perth, Australia, 173--182.
[19]
Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2015. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015).
[20]
Piotr Indyk and Rajeev Motwani. 1998. Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality. In Proceedings of the Thirtieth Annual ACM Symposium on the Theory of Computing (STOC). Dallas, TX, 604--613.
[21]
Sergey Ioffe. 2010. Improved Consistent Sampling, Weighted Minhash and L1 Sketching. In Proceedings of the 10th IEEE International Conference on Data Mining (ICDM). Sydney, Australia, 246--255.
[22]
Masajiro Iwasaki. 2016. Pruned Bi-directed K-nearest Neighbor Graph for Proximity Search. In Proceedings of the 9th International Conference on Similarity Search and Applications (SISAP), Vol. 9939. Tokyo, Japan, 20--33.
[23]
Masajiro Iwasaki and Daisuke Miyazaki. 2018. Optimization of indexing based on k-nearest neighbor graph for proximity search in high-dimensional data. arXiv preprint arXiv:1810.07355 (2018).
[24]
Hervé Jégou, Matthijs Douze, and Cordelia Schmid. 2008. Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search. In Proceedings of the 10th European Conference on Computer Vision (ECCV), Part I. Marseille, France, 304--317.
[25]
Hervé Jégou, Matthijs Douze, and Cordelia Schmid. 2011. Product Quantization for Nearest Neighbor Search. IEEE Trans. Pattern Anal. Mach. Intell. 33, 1 (2011), 117--128.
[26]
Jyun-Yu Jiang, Patrick H. Chen, Cho-Jui Hsieh, and Wei Wang. 2020. Clustering and Constructing User Coresets to Accelerate Large-scale Top-K Recommender Systems. In Proceedings of the Web Conference (WWW). Taipei, 2177--2187.
[27]
Jeff Johnson, Matthijs Douze, and Hervé Jégou. 2021. Billion-Scale Similarity Search with GPUs. IEEE Trans. Big Data 7, 3 (2021), 535--547.
[28]
Ping Li. 2017. Linearized GMM Kernels and Normalized Random Fourier Features. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). Halifax, Canada, 315--324.
[29]
Ping Li, Xiaoyun Li, Gennady Samorodnitsky, and Weijie Zhao. 2021. Consistent Sampling Through Extremal Process. In Proceedings of the Web Conference (WWW). Virtual Event / Ljubljana, Slovenia, 1317--1327.
[30]
Ping Li, Michael Mitzenmacher, and Anshumali Shrivastava. 2014. Coding for Random Projections. In Proceedings of the 31th International Conference on Machine Learning (ICML). Beijing, China, 676--684.
[31]
Xiaoyun Li and Ping Li. 2021. One-Sketch-for-All: Non-linear Random Features from Compressed Linear Measurements. In Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS). Virtual Event, 2647--2655.
[32]
Zhengdong Lu and Hang Li. 2013. A Deep Architecture for Matching ShortTexts. In Advances in Neural Information Processing Systems (NIPS). Lake Tahoe, NV, 1367--1375.
[33]
Yury A. Malkov and Dmitry A. Yashunin. 2020. Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs. IEEE Trans. Pattern Anal. Mach. Intell. 42, 4 (2020), 824--836.
[34]
Mark Manasse, Frank McSherry, and Kunal Talwar. 2010. Consistent Weighted Sampling. Technical Report MSR-TR-2010-73. Microsoft Research.
[35]
Bhaskar Mitra and Nick Craswell. 2018. An introduction to neural information retrieval. Foundations and Trends® in Information Retrieval (2018).
[36]
Stanislav Morozov and Artem Babenko. 2018. Non-metric Similarity Graphs for Maximum Inner Product Search. In Advances in Neural Information Processing Systems (NeurIPS). Montreal, Canada, 4726--4735.
[37]
Ali Rahimi and Benjamin Recht. 2007. Random Features for Large-Scale Kernel Machines. In Advances in Neural Information Processing Systems (NIPS). Vancouver, Canada, 1177--1184.
[38]
Anand Rajaraman and Jeffrey David Ullman. 2011. Mining of massive datasets. Cambridge University Press.
[39]
Aliaksei Severyn and Alessandro Moschitti. 2015. Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). Santiago, Chile, 373--382.
[40]
Anshumali Shrivastava and Ping Li. 2012. Fast Near Neighbor Search in High-Dimensional Binary Data. In Proceedings of European Conference on Machine Learning and Knowledge Discovery in Databases (ECML-PKDD). Bristol, UK, 474--489.
[41]
Shulong Tan, Zhaozhuo Xu, Weijie Zhao, Hongliang Fei, Zhixin Zhou, and Ping Li. 2021. Norm Adjusted Proximity Graph for Fast Inner Product Retrieval. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD). Virtual Event, Singapore, 1552--1560.
[42]
Shulong Tan, Zhixin Zhou, Zhaozhuo Xu, and Ping Li. 2020. Fast item ranking under neural network based measures. In Proceedings of the 13th International Conference on Web Search and Data Mining (WSDM). 591--599.
[43]
Jiaxi Tang and Ke Wang. 2018. Ranking Distillation: Learning Compact Ranking Models With High Performance for Recommender System. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD). London, UK, 2289--2298.
[44]
Yi Tay, Luu Anh Tuan, and Siu Cheung Hui. 2018. Latent Relational Metric Learning via Memory-based Attention for Collaborative Ranking. In Proceedings of the 2018 World Wide Web Conference on World Wide Web (WWW). Lyon, France, 729--739.
[45]
Xiang Wu, Ruiqi Guo, Ananda Theertha Suresh, Sanjiv Kumar, Daniel N. Holtmann-Rice, David Simcha, and Felix X. Yu. 2017. Multiscale Quantization for Fast Similarity Search. In Advances in Neural Information Processing Systems (NIPS). Long Beach, CA, 5745--5755.
[46]
Yubao Wu, Ruoming Jin, and Xiang Zhang. 2014. Fast and unified local search for random walk based k-nearest-neighbor query in large graphs. In Proceedings of the International Conference on Management of Data (SIGMOD). Snowbird, UT, 1139--1150.
[47]
Chenyan Xiong, Zhuyun Dai, Jamie Callan, Zhiyuan Liu, and Russell Power. 2017. End-to-end neural ad-hoc ranking with kernel pooling. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). Shinjuku, Tokyo, 55--64.
[48]
Ying Zhang, Tao Xiang, Timothy M. Hospedales, and Huchuan Lu. 2018. Deep Mutual Learning. In Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, UT, 4320--4328.
[49]
Weijie Zhao, Shulong Tan, and Ping Li. 2020. SONG: Approximate Nearest Neighbor Search on GPU. In Proceedings of the 36th IEEE International Conference on Data Engineering (ICDE). Dallas, TX, 1033--1044.
[50]
Zhixin Zhou, Shulong Tan, Zhaozhuo Xu, and Ping Li. 2019. Möbius Transformation for Fast Inner Product Search on Graph. In Advances in Neural Information Processing Systems (NeurIPS). Vancouver, Canada, 8216--8227.

Cited By

View all
  • (2024)RoarGraph: A Projected Bipartite Graph for Efficient Cross-Modal Approximate Nearest Neighbor SearchProceedings of the VLDB Endowment10.14778/3681954.368195917:11(2735-2749)Online publication date: 1-Jul-2024
  • (2024)GUITAR: Gradient Pruning toward Fast Neural RankingProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657728(163-173)Online publication date: 10-Jul-2024
  • (2023)Asymmetric Hashing for Fast Ranking via Neural Network MeasuresProceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3539618.3591640(697-707)Online publication date: 19-Jul-2023

Index Terms

  1. Fast neural ranking on bipartite graph indices
    Index terms have been assigned to the content through auto-classification.

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image Proceedings of the VLDB Endowment
    Proceedings of the VLDB Endowment  Volume 15, Issue 4
    December 2021
    246 pages
    ISSN:2150-8097
    Issue’s Table of Contents

    Publisher

    VLDB Endowment

    Publication History

    Published: 01 December 2021
    Published in PVLDB Volume 15, Issue 4

    Qualifiers

    • Research-article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)27
    • Downloads (Last 6 weeks)2
    Reflects downloads up to 26 Sep 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)RoarGraph: A Projected Bipartite Graph for Efficient Cross-Modal Approximate Nearest Neighbor SearchProceedings of the VLDB Endowment10.14778/3681954.368195917:11(2735-2749)Online publication date: 1-Jul-2024
    • (2024)GUITAR: Gradient Pruning toward Fast Neural RankingProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657728(163-173)Online publication date: 10-Jul-2024
    • (2023)Asymmetric Hashing for Fast Ranking via Neural Network MeasuresProceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3539618.3591640(697-707)Online publication date: 19-Jul-2023
    • (2022)EGM: Enhanced Graph-based Model for Large-scale Video Advertisement SearchProceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3534678.3539061(4443-4451)Online publication date: 14-Aug-2022

    View Options

    Get Access

    Login options

    Full Access

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media