Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1963405.1963460acmotherconferencesArticle/Chapter ViewAbstractPublication PagesthewebconfConference Proceedingsconference-collections
research-article

A stochastic learning-to-rank algorithm and its application to contextual advertising

Published: 28 March 2011 Publication History

Abstract

This paper is concerned with the problem of learning a model to rank objects (Web pages, ads and etc.). We propose a framework where the ranking model is both optimized and evaluated using the same information retrieval measures such as Normalized Discounted Cumulative Gain (NDCG) and Mean Average Precision (MAP). The main difficulty in direct optimization of NDCG and MAP is that these measures depend on the rank of objects and are not differentiable. Most learning-to-rank methods that attempt to optimize NDCG or MAP approximate such measures so that they can be differentiable. In this paper, we propose a simple yet effective stochastic optimization algorithm to directly minimize any loss function, which can be defined on NDCG or MAP for the learning-to-rank problem. The algorithm employs Simulated Annealing along with Simplex method for its parameter search and finds the global optimal parameters. Experiment results using NDCG-Annealing algorithm, an instance of the proposed algorithm, on LETOR benchmark data sets show that the proposed algorithm is both effective and stable when compared to the baselines provided in LETOR 3.0. In addition, we applied the algorithm for ranking ads in contextual advertising. Our method has shown to significantly improve relevance in offline evaluation and business metrics in online tests in a real large-scale advertising serving system. To scale our computations, we parallelize the algorithm in a MapReduce framework running on Hadoop.

References

[1]
H. M. D. Almeida, M. A. Concalves, M. Cristo, and P. Calado. A combined component approach for finding collection-adapted ranking functions based on genetic programming. ACM SIGIR, pages 399--406, 2007.
[2]
R. Baeza-Yates and B. Ribeiro-Neto. Modern information retrieval. Addison Wesley.
[3]
A. Bagherjeiran, A. Hatch, and A. Ratnaparkhi. Ranking for the conversion funnel. ACM SIGIR, pages 146--153, 2010.
[4]
S. Bandyopadhyay, S. K. Pal, and C. A. Murthy. Simulated annealing based pattern classification. Information Sciences: an International Journal, pages (109):165--184, 1998.
[5]
A. Z. Broder, M. Fontoura, V. Josifovski, and L. Riedel. A semantic approach to contextual advertising. SIGIR, pages 559--566, 2007.
[6]
C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. Hullender. Learning to rank using gradient descent. ICML, pages 89--96, 2005.
[7]
C. J. Burges, R. Ragno, and Q. V. Le. Learning to rank with nonsmooth cost functions. NIPS, pages 193--200, 2006.
[8]
Y. Cao, J. Xu, T.-Y. Liu, H. Li, Y. Huang, and H.-W. Hon. Adapting ranking svm to document retrieval. ACM SIGIR, pages 186--193, 2006.
[9]
Z. Cao, T. Qin, T.-Y. Liu, M.-F. Tsai, and H. Li. Learning to rank: From pairwise approach to listwise approach. ICML, pages 129--136, 2007.
[10]
D. Chakrabarti, D. Agrawal, and V. Josifovski. Contextual advertising by combining relevance with click feedback. WWW, pages 416--426, 2008.
[11]
O. Chapelle and M. Wu. Gradient descent optimization of smoothed information retrieval metrics. IR Journal, pages 13(3):201--215, 2010.
[12]
D. Cossock and T. Zhang. Subset ranking using regression. CoLT, pages 605--619, 2006.
[13]
J. Dean and S. Ghemawat. Mapreduce: simplified data processing on large clusters. Symposium on Operating System Design and Implementation, pages 137--150, 2004.
[14]
K. A. Dowsland. Simulated annealing. Modern Heuristic Techniques for Combinatorial Problems, 1993.
[15]
M. feng Tsai, T.-Y. Liu, T. Qin, H.-H. Chen, and W.-Y. Ma. Frank: A ranking method with fidelity loss. ACM SIGIR, pages 383--390, 2007.
[16]
A. Foundation. Apache hadoop project. http://hadoop.apache.org/.
[17]
Y. Freund, R. Iyer, R. E. Schapire, and Y. Singer. An efficient boosting algorithm for combining preferences. ICML, pages 170--178, 1998.
[18]
R. Herbrich, T. Graepel, and K. Obermayer. Support vector learning for ordinal regression. Artificial Neural Networks, pages 97--102, 1999.
[19]
R. Herbrich, T. Graepel, and K. Obermayer. Large margin rank boundaries for ordinal regression. Advances in Large Margin Classifiers, pages 115--132, 2000.
[20]
C. Huang and R. Nevatia. High performance object detection by collaborative learning of joint ranking of granules features. CVPR, pages 41--48, 2010.
[21]
K. Jarvelin and J. Kekalainen. Ir evaluation methods for retrieving highly relevant documents. ACM SIGIR, pages 41--48, 2000.
[22]
T. Joachims. Optimizing search engines using clickthrough data. ACM KDD, pages 133--142, 2002.
[23]
S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated annealing. Science, pages 671--680, 1983.
[24]
A. Lacerda, M. Cristo, M. Goncalves, W. Fan, N. Ziviani, and B. Ribeiro-Neto. Learning to advertise. ACM SIGIR, pages 549--556, 2006.
[25]
P. Li, C. J. Burges, and Q. Wu. Mcrank: Learning to rank using multiple classification and gradient boosting. NIPS, 2007.
[26]
T. Y. Liu. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, pages 3(3):225--331, 2009.
[27]
T.-Y. Liu, J. Xu, T. Qin, W. Xiong, and H. Li. Letor: Benchmark dataset for research on learning to rank for information retrieval. SIGIR-LTR, 2007.
[28]
A. V. Lukashin and R. Fuchs. Analysis of temporal gene expression profiles: Clustering by simulated annealing and determining the optimal number of clusters. Bioinformatics, pages 405--414, 2000.
[29]
N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller. Equation of state calculation by fast computing machines. J. Of Chem. Phys., pages 1087--1091, 1953.
[30]
V. Murdock, M. Ciaramita, and V. Plachouras. A noisy-channel approach to contextual advertising. ADKDD, pages 21--27, 2007.
[31]
R. Nallapati. Discriminative models for information retrieval. ACM SIGIR, pages 64--71, 2004.
[32]
J. A. Nelder and R. Mead. A simplex method for function minimization. Computer Journal, pages (7):308--313, 1965.
[33]
Z. Nie, Y. Zhang, J.-R. Wen, and W.-Y. Ma. Object-level ranking: Bringing order to web objects. WWW, pages 567--574, 2005.
[34]
J. Peng, C. Macdonald, and I. Ounis. Learning to select a ranking function. ECIR, pages 111--126, 2010.
[35]
J. Ponte and W. Croft. A language modeling approach to information retrieval. ACM SIGIR, pages 275--281, 1998.
[36]
T. Qin, T.-Y. Liu, M. feng Tsai, X.-D. Zhang, and H. Li. Learning to search web pages with query-level loss functions. Technical Report, 2006.
[37]
A. Ratnaparkhi. A hidden class page-ad probability model for contextual advertising. WWW Workshop, 2008.
[38]
M. Regelson and D. Fian. Predicting click-through rate using keyword clusters. ACM EC, 2006.
[39]
B. Ribeiro-Neto, M. Cristo, P. Golgher, and E. S. D. Moura. Impedance coupling in content-targeted advertising. ACM SIGIR, pages 496--503, 2005.
[40]
M. Richardson, E. Dominowska, and R. Rango. Predicting clicks: Estimating the click-through rate for new ads. WWW, pages 521--530, 2007.
[41]
S. Robertson and K. S. Jones. Relevance weighting of search terms. Journal of the American Society for Information Science, pages 27:129--146, 1976.
[42]
S. Robertson and S. Walker. Some simple effective approximations to the 2-poisson model for probabilistic weighted retrieval. ACM SIGIR, pages 232--241, 1994.
[43]
G. Salton and C. Buckley. Term-weighting approaches in automatic text retrieval. Information Processing and Management, pages 24(5):513--523, 1988.
[44]
M. Taylor, J. Guiver, S. Robertson, and T. Minka. Softrank: optimizing non-smooth rank metrics. ACM WSDM, pages 77--86, 2008.
[45]
H. Valizadegan, R. Jin, R. Zhang, and J. Mao. Learning to rank by optimizing ndcg measure. NIPS, 2009.
[46]
M. N. Volkovs and R. S. Zemel. Boltzrank: learning to maximize expected ranking gain. ICML, pages 1089--1096, 2009.
[47]
J. Xu and H. Li. Adarank: a boosting algorithm for information retrieval. ACM SIGIR, pages 391--398, 2007.
[48]
W. Yang, L. Rueda, and A. Ngom. A simulated annealing approach to find the optimal parameters for fuzzy clustering microarray data. IEEE International Conference on The Chilean Computer Science Society, 2005.
[49]
J.-Y. Yeh, J.-Y. Lin, H.-R. Ke, and W.-P. Yang. Learning to rank for information retrieval using genetic programming. SIGIR-LTR Workshop, 2007.
[50]
Y. Yue, T. Finley, F. Radlinski, and T. Joachims. A support vector method for optimizing average precision. ACM SIGIR, pages 271--278, 2007.

Cited By

View all
  • (2023)Interactive Evolutionary Multiobjective Optimization via Learning to RankIEEE Transactions on Evolutionary Computation10.1109/TEVC.2023.323426927:4(749-763)Online publication date: Aug-2023
  • (2020)RankViz: a Visualization Framework to Assist Interpretation of Learning to Rank AlgorithmsComputers & Graphics10.1016/j.cag.2020.09.017Online publication date: Oct-2020
  • (2019)Ranking in GenealogyProceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining10.1145/3292500.3330772(2754-2764)Online publication date: 25-Jul-2019
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Other conferences
WWW '11: Proceedings of the 20th international conference on World wide web
March 2011
840 pages
ISBN:9781450306324
DOI:10.1145/1963405
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 28 March 2011

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. contextual advertising
  2. ir measures
  3. learning to rank
  4. ndcg
  5. ndcg-annealing
  6. simplex algorithm
  7. simulated annealing

Qualifiers

  • Research-article

Conference

WWW '11
WWW '11: 20th International World Wide Web Conference
March 28 - April 1, 2011
Hyderabad, India

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)5
  • Downloads (Last 6 weeks)0
Reflects downloads up to 13 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2023)Interactive Evolutionary Multiobjective Optimization via Learning to RankIEEE Transactions on Evolutionary Computation10.1109/TEVC.2023.323426927:4(749-763)Online publication date: Aug-2023
  • (2020)RankViz: a Visualization Framework to Assist Interpretation of Learning to Rank AlgorithmsComputers & Graphics10.1016/j.cag.2020.09.017Online publication date: Oct-2020
  • (2019)Ranking in GenealogyProceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining10.1145/3292500.3330772(2754-2764)Online publication date: 25-Jul-2019
  • (2019)Evolutionary optimization for ranking how-to questions based on user-generated contentsExpert Systems with Applications: An International Journal10.1016/j.eswa.2013.06.01740:17(7060-7068)Online publication date: 25-Nov-2019
  • (2018)Computational AdvertisingFoundations and Trends in Information Retrieval10.1561/15000000458:4–5(263-418)Online publication date: 14-Dec-2018
  • (2018)Combined Regression and Tripletwise Learning for Conversion Rate Prediction in Real-Time Bidding AdvertisingThe 41st International ACM SIGIR Conference on Research & Development in Information Retrieval10.1145/3209978.3210062(115-123)Online publication date: 27-Jun-2018
  • (2018)A cross-benchmark comparison of 87 learning to rank methodsInformation Processing and Management: an International Journal10.1016/j.ipm.2015.07.00251:6(757-772)Online publication date: 29-Dec-2018
  • (2018)Search-based QoS ranking prediction for web services in cloud environmentsFuture Generation Computer Systems10.1016/j.future.2015.01.00850:C(111-126)Online publication date: 30-Dec-2018
  • (2017)Efficient Retrieval for Contextual Advertising Utilizing Past Click Logsコンテキスト広告におけるクリックログを用いた効率的な広告引き当て手法Transactions of the Japanese Society for Artificial Intelligence10.1527/tjsai.A-H5232:6(A-H52_1-10)Online publication date: 1-Nov-2017
  • (2017)SMART: Sponsored mobile app recommendation by balancing app downloads and appstore profit2017 IEEE International Conference on Big Data (Big Data)10.1109/BigData.2017.8258094(1600-1609)Online publication date: Dec-2017
  • Show More Cited By

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media