DOI: 10.1145/3589334.3645605
Research article

Adaptive Neural Ranking Framework: Toward Maximized Business Goal for Cascade Ranking Systems

Published: 13 May 2024

Abstract

Cascade ranking is widely used for large-scale top-k selection problems in online advertising and recommendation systems, and learning-to-rank is an important way to optimize the models in cascade ranking. Previous works on learning-to-rank usually focus on letting the model learn the complete order or the top-k order, and adopt the corresponding rank metrics (e.g., OPA and NDCG@k) as optimization targets. However, these targets cannot adapt to the varying data complexities and model capabilities of different cascade ranking scenarios, and existing metric-driven methods such as the Lambda framework can only optimize a rough upper bound of a limited set of metrics, potentially resulting in sub-optimal solutions and performance misalignment. To address these issues, we propose a novel perspective on optimizing cascade ranking systems that highlights the adaptability of optimization targets to data complexities and model capabilities. Concretely, we employ multi-task learning to adaptively combine the optimization of relaxed and full targets, which correspond to the metrics Recall@m@k and OPA respectively. We also introduce permutation matrices to represent the rank metrics and employ differentiable sorting techniques to relax the hard permutation matrix with a controllable approximation error bound. This enables us to optimize both the relaxed and full targets directly and more appropriately. We name this method the Adaptive Neural Ranking Framework (abbreviated as ARF). Furthermore, we give a specific practice under ARF: we use NeuralSort to obtain the relaxed permutation matrix and draw on a variant of the uncertainty-weighting method from multi-task learning to optimize the proposed losses jointly. Experiments on four public and industrial benchmarks show the effectiveness and generalization of our method, and an online experiment shows that it has significant application value.
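The abstract names two generic building blocks: a NeuralSort-style relaxed permutation matrix (Grover et al., 2019) and uncertainty-based loss weighting (Kendall et al., 2018). The paper's actual losses are not reproduced in this abstract, so the following NumPy sketch only illustrates those two published primitives; the function names and the exact combination are chosen here for illustration, not taken from the paper.

```python
import numpy as np

def neural_sort(s, tau=1.0):
    """NeuralSort relaxation (Grover et al., 2019): returns a row-stochastic
    matrix whose i-th row approximates the one-hot indicator of the item
    ranked i-th when the scores s are sorted in descending order."""
    n = len(s)
    A = np.abs(s[:, None] - s[None, :])          # pairwise |s_i - s_j|
    B = A.sum(axis=1)                            # A @ 1 (row sums)
    scaling = n + 1 - 2 * np.arange(1, n + 1)    # n + 1 - 2i for i = 1..n
    logits = (scaling[:, None] * s[None, :] - B[None, :]) / tau
    e = np.exp(logits - logits.max(axis=1, keepdims=True))  # stable softmax
    return e / e.sum(axis=1, keepdims=True)

def uncertainty_weighted_sum(losses, log_vars):
    """Homoscedastic-uncertainty weighting (Kendall et al., 2018):
    sum_t exp(-s_t) * L_t + s_t, where s_t = log(sigma_t^2) is learnable."""
    return sum(np.exp(-s) * L + s for L, s in zip(losses, log_vars))
```

At a low temperature the relaxed matrix approaches the hard permutation, e.g. for `s = np.array([3., 1., 2.])`, `neural_sort(s, tau=0.1) @ s` is close to the descending sort `[3, 2, 1]`; the temperature `tau` governs the approximation error bound the abstract refers to.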

Supplemental Material

M4V file: supplemental video
MP4 file: video presentation



    Published In

    WWW '24: Proceedings of the ACM Web Conference 2024
    May 2024
    4826 pages
    ISBN:9798400701719
    DOI:10.1145/3589334

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. differentiable sorting
    2. learning to rank in cascade systems
    3. multi-task learning


    Conference

WWW '24: The ACM Web Conference 2024
May 13-17, 2024, Singapore

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

