research-article

Ranking function adaptation with boosting trees

Authors:

Zhaohui ZhengAuthors Info & Claims

ACM Transactions on Information Systems (TOIS), Volume 29, Issue 4

Article No.: 18, Pages 1 - 31

https://doi.org/10.1145/2037661.2037663

Published: 08 December 2011 Publication History

Abstract

Machine-learned ranking functions have shown successes in Web search engines. With the increasing demands on developing effective ranking functions for different search domains, we have seen a big bottleneck, that is, the problem of insufficient labeled training data, which has significantly slowed the development and deployment of machine-learned ranking functions for different domains. There are two possible approaches to address this problem: (1) combining labeled training data from similar domains with the small target-domain labeled data for training or (2) using pairwise preference data extracted from user clickthrough log for the target domain for training. In this article, we propose a new approach called tree-based ranking function adaptation (Trada) to effectively utilize these data sources for training cross-domain ranking functions. Tree adaptation assumes that ranking functions are trained with the Stochastic Gradient Boosting Trees method—a gradient boosting method on regression trees. It takes such a ranking function from one domain and tunes its tree-based structure with a small amount of training data from the target domain. The unique features include (1) automatic identification of the part of the model that needs adjustment for the new domain and (2) appropriate weighing of training examples considering both local and global distributions. Based on a novel pairwise loss function that we developed for pairwise learning, the basic tree adaptation algorithm is also extended (Pairwise Trada) to utilize the pairwise preference data from the target domain to further improve the effectiveness of adaptation. Experiments are performed on real datasets to show that tree adaptation can provide better-quality ranking functions for a new domain than other methods.

References

[1]

Agichtein, E., Brill, E., and Dumais, S. 2006. Improving web search ranking by incorporating user behavior information. In Proceedings of the Annual ACM SIGIR Conference on Research and Development in Information Retrieval.

Digital Library

[2]

Ando, R. K. and Zhang, T. 2005. A framework for learning predictive structures from multiple tasks and unlabeled data. J. Mach. Learn. Res. 6, 1817--1853.

Digital Library

[3]

Argyriou, A., Evgeniou, T., and Pontil, M. 2007. Multi-task feature learning. In Proceedings of the Conference on Advances in Neural Information Processing Systems 19. MIT Press, Cambridge, MA.

[4]

Bacchiani, M. and Roark, B. 2003. Unsupervised language model adaptation. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).

[5]

Baeza-Yates, R. and Ribeiro-Neto, B. 1999. Modern Information Retrieval. Addison Wesley, NY.

Digital Library

[6]

Ben-David, S., Blitzer, J., Crammer, K., and Sokolova, P. M. 2007. Analysis of representations for domain adaptation. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS). MIT Press, Cambridge, MA.

[7]

Blitzer, J., Crammer, K., Kulesza, A., Pereira, O., and Wortman, J. 2008. Learning bounds for domain adaptation. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS). MIT Press, Cambridge, MA.

[8]

Blitzer, J., McDonald, R., and Pereira, F. 2006. Domain adaptation with structural correspondence learning. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Digital Library

[9]

Burges, C., Le, Q., and Ragno, R. 2006. Learning to rank with nonsmooth cost functions. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS).

[10]

Burges, C., Shaked, T., Renshaw, E., Lazier, A., Deeds, M., Hamilton, N., and Hullender, G. 2005. Learning to rank using gradient descent. In Proceedings of the International Conference on Machine Learning (ICML).

Digital Library

[11]

Burges, C. J. 2010. From ranknet to lambdarank to lambdamart: An overview. Micro. Res. Tech. rep. MSR-TR-2010-82.

[12]

Cao, Y., Xu, J., Liu, T.-Y., Huang, Y., and Hon, H.-W. 2006. Adapting ranking svm to document retrieval. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.

[13]

Cao, Z., Qin, T., Liu, T.-Y., Tsai, M.-F., and Li, H. 2007. Learning to rank: from pairwise approach to listwise approach. In Proceedings of the 24th International Conference on Machine Learning. ACM, New York, NY, 129--136.

Digital Library

[14]

Chapelle, O. and Zhang, Y. 2009. A dynamic Bayesian network click model for web search ranking. In Proceedings of the 18th International Conference on World Wide Web. ACM, New York, NY, 1--10.

Digital Library

[15]

Chen, K., Lu, R., Wong, C., Sun, G., Heck, L., and Tseng, B. 2008. Trada: Tree based ranking function adaptation. In Proceedings of the ACM Conference on Information and Knowledge Management.

Digital Library

[16]

Cossock, D. and Zhang, T. 2006. Subset ranking using regression. In Proceedings of the ACM Conference on Learning Theory. ACM, New York, NY, 605--619.

Digital Library

[17]

Cristianini, N. and Shawe-Taylor, J. 2000. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, UK.

Digital Library

[18]

Dai, W., Yang, Q., Xue, G.-R., and Yu, Y. 2007. Boosting for transfer learning. In Proceedings of the 24th International Conference on Machine Learning. ACM, New York, NY, 193--200.

Digital Library

[19]

Dai, W., Yang, Q., Xue, G.-R., and Yu, Y. 2008. Self-taught clustering. In Proceedings of the 25th International Conference on Machine Learning. ACM, New York, NY, 200--207.

Digital Library

[20]

Daumé III, H. and Marcu, D. 2006. Domain adaptation for statistical classifiers. J. Mach. Learn. Res. 16, 1817--1853.

[21]

Dong, A., Chang, Y., Ji, S., Liao, C., Li, X., and Zheng, Z. 2009. Empirical exploitation of click data for task specific ranking. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Morristown, NJ, 1086--1095.

Digital Library

[22]

Evgeniou, T. and Pontil, M. 2004. Regularized multitask learning. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York, NY, 109--117.

Digital Library

[23]

Fan, R.-E., Chang, K.-W., Hsieh, C.-J., Wang, X.-R., and Lin, C.-J. 2008. Liblinear: A library for large linear classification. J. Mach. Learn. Res. 9, 1871--1874.

Digital Library

[24]

Freund, Y., Iyer, R., Schapire, R. E., and Singer, Y. 2003. An efficient boosting algorithm for combining preferences. J. Mach. Learn. Res. 4, 933--969.

Digital Library

[25]

Freund, Y. and Schapire, R. E. 1999. A short introduction to boosting. In Proceedings of the 16th International Joint Conference on Artificial Intelligence. Morgan Kaufmann, 1401--1406.

[26]

Friedman, J. H. 2001. Greedy function approximation: A gradient boosting machine. Annal. Stat. 29, 5, 1189--1232.

[27]

Friedman, J. H. and Popescu, B. E. 2003. Importance sampled learning ensembles. J. Mach. Learn. Res. 9, 4305.

[28]

Gao, J., Wu, Q., Burges, C., Svore, K., Su, Y., Khan, N., Shah, S., and Zhou, H. 2009. Model adaptation via model interpolation and boosting for web search ranking. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Singapore, 505--513.

Digital Library

[29]

Geng, B., Yang, L., Xu, C., and Hua, X.-S. 2009. Ranking model adaptation for domain-specific search. In Proceedings of the 18th ACM Conference on Information and Knowledge Management. ACM, New York, NY, 197--206.

Digital Library

[30]

Geurts, P. 2010. Learning to rank with extremely randomized regression trees. In Proceedings of Yahoo! Learning to Rank Challenge Workshop in ICML.

[31]

Gulin, A. and Kuralenok, I. 2010. Yetirank: Everybody lies. In Yahoo! Learning to Rank Challenge Workshop in ICML.

[32]

Hastie, T., Tibshirani, R., and Friedman, J. 2001. The Elements of Statistical Learning. Springer-Verlag.

[33]

Herbrich, R., Graepel, T., and Obermayer, K. 2000. Large margin rank boundaries for ordinal regression. Adv. Large Marg. Class. 115--132.

[34]

Hwa, R. 1999. Supervised grammar induction using training data with limited constituent information. In Proceedings of the Conference of the Association for Computational Linguistics.

Digital Library

[35]

Jarvelin, K. and Kekalainen, J. 2000. IR evaluation methods for retrieving highly relevant documents. In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval.

Digital Library

[36]

Ji, S., Zhou, K., Liao, C., Zheng, Z., Xue, G.-R., Chapelle, O., Sun, G., and Zha, H. 2009. Global ranking by exploiting user clicks. In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, New York, NY, 35--42.

Digital Library

[37]

Jiang, J. and Zhai, C. 2007. Instance weighting for domain adaptation in NLP. In Proceedings of the Conference of the Association for Computational Linguistics.

[38]

Joachims, T. 2002. Optimizing search engines using clickthrough data. In Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining.

Digital Library

[39]

Joachims, T., Granka, L., Pan, B., and Gay, G. 2005. Accurately interpreting clickthrough data as implicit feedback. In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval.

Digital Library

[40]

Leggetter, C. and Woodland, P. 1995. Flexible speaker adaptation using maximum likelihood linear regression. In Proceedings of Eurospeech.

[41]

Lehmann, E. L. and Casella, G. 1998. Theory of Point Estimation. Springer-Verlag.

[42]

Li, P., Burges, C. J., and Wu, Q. 2007. Mcrank: Learning to rank using multiple classification and gradient boosting. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS).

[43]

Liao, X., Xue, Y., and Carin, L. 2005. Logistic regression with an auxiliary data source. In Proceedings of the 22nd International Conference on Machine Learning. ACM, New York, NY, 505--512.

Digital Library

[44]

Long, B., Lamkhede, S., Vadrevu, S., Zhang, Y., and Tseng, B. L. 2009. A risk minimization framework for domain adaptation. In Proceedings of the Conference on Information and Knowledge Management, ACM, New York, NY, 1347--1356.

Digital Library

[45]

Mansour, Y., Mohri, M., and Rostamizadeh, A. 2009. Domain adaptation: Learning bounds and algorithms. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS).

[46]

Mohan, A., Chen, Z., and Weinberger, K. 2010. Tree ensemble and transfer learning. In Proceedings of Yahoo! Learning to Rank Challenge Workshop in ICML.

[47]

Nallapati, R. 2004. Discriminative models for information retrieval. In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, New York, NY, 64--71.

Digital Library

[48]

Pan, S. J. and Yang, Q. 2010. A survey on transfer learning. IEEE Trans. Knowl. Data Engin. 22, 10, 1345--1359.

Digital Library

[49]

Pavlov, D. and Brunk, C. 2010. Bagboo: Bagging the gradient boosting. In Proceedings of Yahoo! Learning to Rank Challenge Workshop in ICML.

[50]

Raina, R., Battle, A., Lee, H., Packer, B., and Ng, A. Y. 2007. Self-taught learning: transfer learning from unlabeled data. In Proceedings of the 24th International Conference on Machine Learning. ACM, New York, NY, 759--766.

Digital Library

[51]

Schapire, R. E. 2003. The boosting approach to machine learning: An overview. Nonlinear Estimation and Classification.

[52]

Sorokina, D. 2010. Application of additive groves to the learning to rank challenge. In Proceedings of Yahoo! Learning to Rank Challenge Workshop in ICML.

[53]

Tsai, M.-F., Liu, T.-Y., Qin, T., Chen, H.-H., and Ma, W.-Y. 2007. Frank: a ranking method with fidelity loss. In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, New York, NY, 383--390.

Digital Library

[54]

Vapnik, V. N. 1999. The Nature of Statistical Learning Theory. Springer Science and Business Media, LLC, Berlin, Germany.

Digital Library

[55]

Wang, Z., Song, Y., and Zhang, C. 2008. Transferred dimensionality reduction. In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases. Springer-Verlag, Berlin, Germany, 550--565.

Digital Library

[56]

Wu, P. and Dietterich, T. G. 2004. Improving svm accuracy by training on auxiliary data sources. In Proceedings of International Conference on Machine Learning. 871--878.

Digital Library

[57]

Wu, Q., Burges, C. J., Svore, K., and Gao, J. 2008. Ranking, boosting, and model adaptation. Micro. Res. Tech. rep., Microsoft.

[58]

Wu, Q., Burges, C. J. C., Svore, K. M., and Gao, J. 2010. Adapting boosting for information retrieval measures. Inf. Retr. 13, 3, 254--270.

Digital Library

[59]

Xu, J. and Li, H. 2007. AdaRank: a boosting algorithm for information retrieval. In Proceedings of ACM SIGIR Conference on Research and Development in Information Retrieval.

Digital Library

[60]

Zheng, Z., Chen, K., Sun, G., and Zha, H. 2007a. A regression framework for learning ranking functions using relative relevance judgments. In Proceedings of ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, New York, NY, 287--294.

Digital Library

[61]

Zheng, Z., Zha, H., Zhang, T., Chapelle, O., Chen, K., and Sun, G. 2007b. A general boosting method and its application to learning ranking functions for web search. In Proceedings of Neural Information Processing Systems (NIPS).

Cited By

Silva RGomes GAlvim MGonçalves M(2022)How to build high quality L2R training data: Unsupervised compression-based selective sampling for learning to rankInformation Sciences10.1016/j.ins.2022.04.012601(90-113)Online publication date: Jul-2022
https://doi.org/10.1016/j.ins.2022.04.012
Arguello J(2018)Aggregated SearchFoundations and Trends in Information Retrieval10.1561/150000005210:5(365-502)Online publication date: 14-Dec-2018
https://dl.acm.org/doi/10.1561/1500000052
Desarkar MSarkar SMitra P(2018)Preference relations based unsupervised rank aggregation for metasearchExpert Systems with Applications: An International Journal10.1016/j.eswa.2015.12.00549:C(86-98)Online publication date: 29-Dec-2018
https://dl.acm.org/doi/10.1016/j.eswa.2015.12.005
Show More Cited By

Index Terms

Ranking function adaptation with boosting trees
1. Information systems
  1. Information retrieval

Recommendations

Ranking model adaptation for domain-specific search
CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management

Recently, various domain-specific search engines emerge, which are restricted to specific topicalities or document formats, and vertical to the broad-based search. Simply applying the ranking model trained for the broad-based search to the verticals ...
Trada: tree based ranking function adaptation
CIKM '08: Proceedings of the 17th ACM conference on Information and knowledge management

Machine Learned Ranking approaches have shown successes in web search engines. With the increasing demands on developing effective ranking functions for different search domains, we have seen a big bottleneck, i.e., the problem of insufficient training ...
Learning to rank only using training data from related domain
SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval

Like traditional supervised and semi-supervised algorithms, learning to rank for information retrieval requires document annotations provided by domain experts. It is costly to annotate training data for different search domains and tasks. We propose to ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Information Systems

ACM Transactions on Information Systems Volume 29, Issue 4

December 2011

172 pages

ISSN:1046-8188

EISSN:1558-2868

DOI:10.1145/2037661

Issue’s Table of Contents

Copyright © 2011 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 December 2011

Accepted: 01 July 2011

Revised: 01 January 2011

Received: 01 June 2010

Published in TOIS Volume 29, Issue 4

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article
Research
Refereed

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

8
Total Citations
View Citations
393
Total Downloads

Downloads (Last 12 months)2
Downloads (Last 6 weeks)0

Reflects downloads up to 24 Sep 2024

Other Metrics

View Author Metrics

Citations

Cited By

Silva RGomes GAlvim MGonçalves M(2022)How to build high quality L2R training data: Unsupervised compression-based selective sampling for learning to rankInformation Sciences10.1016/j.ins.2022.04.012601(90-113)Online publication date: Jul-2022
https://doi.org/10.1016/j.ins.2022.04.012
Arguello J(2018)Aggregated SearchFoundations and Trends in Information Retrieval10.1561/150000005210:5(365-502)Online publication date: 14-Dec-2018
https://dl.acm.org/doi/10.1561/1500000052
Desarkar MSarkar SMitra P(2018)Preference relations based unsupervised rank aggregation for metasearchExpert Systems with Applications: An International Journal10.1016/j.eswa.2015.12.00549:C(86-98)Online publication date: 29-Dec-2018
https://dl.acm.org/doi/10.1016/j.eswa.2015.12.005
Becker CChristoudias CFua P(2015)Domain Adaptation for Microscopy ImagingIEEE Transactions on Medical Imaging10.1109/TMI.2014.237687234:5(1125-1139)Online publication date: May-2015
https://doi.org/10.1109/TMI.2014.2376872
Chen LHolsapple C(2013)Evaluating Journal Quality: Beyond “Expert” Journal Assessments in the IS DisciplineJournal of Organizational Computing and Electronic Commerce10.1080/10919392.2013.84046723:4(392-412)Online publication date: Oct-2013
https://doi.org/10.1080/10919392.2013.840467
Liu JCheng WZhang T(2013)Principal Factor Analysis for Forecasting Diurnal Water-Demand Pattern Using Combined Rough-Set and Fuzzy-Clustering TechniqueJournal of Water Resources Planning and Management10.1061/(ASCE)WR.1943-5452.0000223139:1(23-33)Online publication date: Jan-2013
https://doi.org/10.1061/(ASCE)WR.1943-5452.0000223
Vahidov R(2012)Design-Type Research in Information SystemsDesign-Type Research in Information Systems10.4018/978-1-4666-0131-4.ch005(94-114)Online publication date: 2012
https://doi.org/10.4018/978-1-4666-0131-4.ch005
Vahidov R(2012)Research in Information SystemsDesign-Type Research in Information Systems10.4018/978-1-4666-0131-4.ch003(51-75)Online publication date: 2012
https://doi.org/10.4018/978-1-4666-0131-4.ch003

View Options

Get Access

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Issue’s Table of Contents