Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Ranking function adaptation with boosting trees

Published: 08 December 2011 Publication History

Abstract

Machine-learned ranking functions have shown successes in Web search engines. With the increasing demands on developing effective ranking functions for different search domains, we have seen a big bottleneck, that is, the problem of insufficient labeled training data, which has significantly slowed the development and deployment of machine-learned ranking functions for different domains. There are two possible approaches to address this problem: (1) combining labeled training data from similar domains with the small target-domain labeled data for training or (2) using pairwise preference data extracted from user clickthrough log for the target domain for training. In this article, we propose a new approach called tree-based ranking function adaptation (Trada) to effectively utilize these data sources for training cross-domain ranking functions. Tree adaptation assumes that ranking functions are trained with the Stochastic Gradient Boosting Trees method—a gradient boosting method on regression trees. It takes such a ranking function from one domain and tunes its tree-based structure with a small amount of training data from the target domain. The unique features include (1) automatic identification of the part of the model that needs adjustment for the new domain and (2) appropriate weighing of training examples considering both local and global distributions. Based on a novel pairwise loss function that we developed for pairwise learning, the basic tree adaptation algorithm is also extended (Pairwise Trada) to utilize the pairwise preference data from the target domain to further improve the effectiveness of adaptation. Experiments are performed on real datasets to show that tree adaptation can provide better-quality ranking functions for a new domain than other methods.

References

[1]
Agichtein, E., Brill, E., and Dumais, S. 2006. Improving web search ranking by incorporating user behavior information. In Proceedings of the Annual ACM SIGIR Conference on Research and Development in Information Retrieval.
[2]
Ando, R. K. and Zhang, T. 2005. A framework for learning predictive structures from multiple tasks and unlabeled data. J. Mach. Learn. Res. 6, 1817--1853.
[3]
Argyriou, A., Evgeniou, T., and Pontil, M. 2007. Multi-task feature learning. In Proceedings of the Conference on Advances in Neural Information Processing Systems 19. MIT Press, Cambridge, MA.
[4]
Bacchiani, M. and Roark, B. 2003. Unsupervised language model adaptation. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).
[5]
Baeza-Yates, R. and Ribeiro-Neto, B. 1999. Modern Information Retrieval. Addison Wesley, NY.
[6]
Ben-David, S., Blitzer, J., Crammer, K., and Sokolova, P. M. 2007. Analysis of representations for domain adaptation. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS). MIT Press, Cambridge, MA.
[7]
Blitzer, J., Crammer, K., Kulesza, A., Pereira, O., and Wortman, J. 2008. Learning bounds for domain adaptation. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS). MIT Press, Cambridge, MA.
[8]
Blitzer, J., McDonald, R., and Pereira, F. 2006. Domain adaptation with structural correspondence learning. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
[9]
Burges, C., Le, Q., and Ragno, R. 2006. Learning to rank with nonsmooth cost functions. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS).
[10]
Burges, C., Shaked, T., Renshaw, E., Lazier, A., Deeds, M., Hamilton, N., and Hullender, G. 2005. Learning to rank using gradient descent. In Proceedings of the International Conference on Machine Learning (ICML).
[11]
Burges, C. J. 2010. From ranknet to lambdarank to lambdamart: An overview. Micro. Res. Tech. rep. MSR-TR-2010-82.
[12]
Cao, Y., Xu, J., Liu, T.-Y., Huang, Y., and Hon, H.-W. 2006. Adapting ranking svm to document retrieval. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
[13]
Cao, Z., Qin, T., Liu, T.-Y., Tsai, M.-F., and Li, H. 2007. Learning to rank: from pairwise approach to listwise approach. In Proceedings of the 24th International Conference on Machine Learning. ACM, New York, NY, 129--136.
[14]
Chapelle, O. and Zhang, Y. 2009. A dynamic Bayesian network click model for web search ranking. In Proceedings of the 18th International Conference on World Wide Web. ACM, New York, NY, 1--10.
[15]
Chen, K., Lu, R., Wong, C., Sun, G., Heck, L., and Tseng, B. 2008. Trada: Tree based ranking function adaptation. In Proceedings of the ACM Conference on Information and Knowledge Management.
[16]
Cossock, D. and Zhang, T. 2006. Subset ranking using regression. In Proceedings of the ACM Conference on Learning Theory. ACM, New York, NY, 605--619.
[17]
Cristianini, N. and Shawe-Taylor, J. 2000. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, UK.
[18]
Dai, W., Yang, Q., Xue, G.-R., and Yu, Y. 2007. Boosting for transfer learning. In Proceedings of the 24th International Conference on Machine Learning. ACM, New York, NY, 193--200.
[19]
Dai, W., Yang, Q., Xue, G.-R., and Yu, Y. 2008. Self-taught clustering. In Proceedings of the 25th International Conference on Machine Learning. ACM, New York, NY, 200--207.
[20]
Daumé III, H. and Marcu, D. 2006. Domain adaptation for statistical classifiers. J. Mach. Learn. Res. 16, 1817--1853.
[21]
Dong, A., Chang, Y., Ji, S., Liao, C., Li, X., and Zheng, Z. 2009. Empirical exploitation of click data for task specific ranking. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Morristown, NJ, 1086--1095.
[22]
Evgeniou, T. and Pontil, M. 2004. Regularized multitask learning. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York, NY, 109--117.
[23]
Fan, R.-E., Chang, K.-W., Hsieh, C.-J., Wang, X.-R., and Lin, C.-J. 2008. Liblinear: A library for large linear classification. J. Mach. Learn. Res. 9, 1871--1874.
[24]
Freund, Y., Iyer, R., Schapire, R. E., and Singer, Y. 2003. An efficient boosting algorithm for combining preferences. J. Mach. Learn. Res. 4, 933--969.
[25]
Freund, Y. and Schapire, R. E. 1999. A short introduction to boosting. In Proceedings of the 16th International Joint Conference on Artificial Intelligence. Morgan Kaufmann, 1401--1406.
[26]
Friedman, J. H. 2001. Greedy function approximation: A gradient boosting machine. Annal. Stat. 29, 5, 1189--1232.
[27]
Friedman, J. H. and Popescu, B. E. 2003. Importance sampled learning ensembles. J. Mach. Learn. Res. 9, 4305.
[28]
Gao, J., Wu, Q., Burges, C., Svore, K., Su, Y., Khan, N., Shah, S., and Zhou, H. 2009. Model adaptation via model interpolation and boosting for web search ranking. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Singapore, 505--513.
[29]
Geng, B., Yang, L., Xu, C., and Hua, X.-S. 2009. Ranking model adaptation for domain-specific search. In Proceedings of the 18th ACM Conference on Information and Knowledge Management. ACM, New York, NY, 197--206.
[30]
Geurts, P. 2010. Learning to rank with extremely randomized regression trees. In Proceedings of Yahoo! Learning to Rank Challenge Workshop in ICML.
[31]
Gulin, A. and Kuralenok, I. 2010. Yetirank: Everybody lies. In Yahoo! Learning to Rank Challenge Workshop in ICML.
[32]
Hastie, T., Tibshirani, R., and Friedman, J. 2001. The Elements of Statistical Learning. Springer-Verlag.
[33]
Herbrich, R., Graepel, T., and Obermayer, K. 2000. Large margin rank boundaries for ordinal regression. Adv. Large Marg. Class. 115--132.
[34]
Hwa, R. 1999. Supervised grammar induction using training data with limited constituent information. In Proceedings of the Conference of the Association for Computational Linguistics.
[35]
Jarvelin, K. and Kekalainen, J. 2000. IR evaluation methods for retrieving highly relevant documents. In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval.
[36]
Ji, S., Zhou, K., Liao, C., Zheng, Z., Xue, G.-R., Chapelle, O., Sun, G., and Zha, H. 2009. Global ranking by exploiting user clicks. In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, New York, NY, 35--42.
[37]
Jiang, J. and Zhai, C. 2007. Instance weighting for domain adaptation in NLP. In Proceedings of the Conference of the Association for Computational Linguistics.
[38]
Joachims, T. 2002. Optimizing search engines using clickthrough data. In Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining.
[39]
Joachims, T., Granka, L., Pan, B., and Gay, G. 2005. Accurately interpreting clickthrough data as implicit feedback. In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval.
[40]
Leggetter, C. and Woodland, P. 1995. Flexible speaker adaptation using maximum likelihood linear regression. In Proceedings of Eurospeech.
[41]
Lehmann, E. L. and Casella, G. 1998. Theory of Point Estimation. Springer-Verlag.
[42]
Li, P., Burges, C. J., and Wu, Q. 2007. Mcrank: Learning to rank using multiple classification and gradient boosting. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS).
[43]
Liao, X., Xue, Y., and Carin, L. 2005. Logistic regression with an auxiliary data source. In Proceedings of the 22nd International Conference on Machine Learning. ACM, New York, NY, 505--512.
[44]
Long, B., Lamkhede, S., Vadrevu, S., Zhang, Y., and Tseng, B. L. 2009. A risk minimization framework for domain adaptation. In Proceedings of the Conference on Information and Knowledge Management, ACM, New York, NY, 1347--1356.
[45]
Mansour, Y., Mohri, M., and Rostamizadeh, A. 2009. Domain adaptation: Learning bounds and algorithms. In Proceedings of the Conference on Advances in Neural Information Processing Systems (NIPS).
[46]
Mohan, A., Chen, Z., and Weinberger, K. 2010. Tree ensemble and transfer learning. In Proceedings of Yahoo! Learning to Rank Challenge Workshop in ICML.
[47]
Nallapati, R. 2004. Discriminative models for information retrieval. In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, New York, NY, 64--71.
[48]
Pan, S. J. and Yang, Q. 2010. A survey on transfer learning. IEEE Trans. Knowl. Data Engin. 22, 10, 1345--1359.
[49]
Pavlov, D. and Brunk, C. 2010. Bagboo: Bagging the gradient boosting. In Proceedings of Yahoo! Learning to Rank Challenge Workshop in ICML.
[50]
Raina, R., Battle, A., Lee, H., Packer, B., and Ng, A. Y. 2007. Self-taught learning: transfer learning from unlabeled data. In Proceedings of the 24th International Conference on Machine Learning. ACM, New York, NY, 759--766.
[51]
Schapire, R. E. 2003. The boosting approach to machine learning: An overview. Nonlinear Estimation and Classification.
[52]
Sorokina, D. 2010. Application of additive groves to the learning to rank challenge. In Proceedings of Yahoo! Learning to Rank Challenge Workshop in ICML.
[53]
Tsai, M.-F., Liu, T.-Y., Qin, T., Chen, H.-H., and Ma, W.-Y. 2007. Frank: a ranking method with fidelity loss. In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, New York, NY, 383--390.
[54]
Vapnik, V. N. 1999. The Nature of Statistical Learning Theory. Springer Science and Business Media, LLC, Berlin, Germany.
[55]
Wang, Z., Song, Y., and Zhang, C. 2008. Transferred dimensionality reduction. In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases. Springer-Verlag, Berlin, Germany, 550--565.
[56]
Wu, P. and Dietterich, T. G. 2004. Improving svm accuracy by training on auxiliary data sources. In Proceedings of International Conference on Machine Learning. 871--878.
[57]
Wu, Q., Burges, C. J., Svore, K., and Gao, J. 2008. Ranking, boosting, and model adaptation. Micro. Res. Tech. rep., Microsoft.
[58]
Wu, Q., Burges, C. J. C., Svore, K. M., and Gao, J. 2010. Adapting boosting for information retrieval measures. Inf. Retr. 13, 3, 254--270.
[59]
Xu, J. and Li, H. 2007. AdaRank: a boosting algorithm for information retrieval. In Proceedings of ACM SIGIR Conference on Research and Development in Information Retrieval.
[60]
Zheng, Z., Chen, K., Sun, G., and Zha, H. 2007a. A regression framework for learning ranking functions using relative relevance judgments. In Proceedings of ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, New York, NY, 287--294.
[61]
Zheng, Z., Zha, H., Zhang, T., Chapelle, O., Chen, K., and Sun, G. 2007b. A general boosting method and its application to learning ranking functions for web search. In Proceedings of Neural Information Processing Systems (NIPS).

Cited By

View all
  • (2022)How to build high quality L2R training data: Unsupervised compression-based selective sampling for learning to rankInformation Sciences10.1016/j.ins.2022.04.012601(90-113)Online publication date: Jul-2022
  • (2018)Aggregated SearchFoundations and Trends in Information Retrieval10.1561/150000005210:5(365-502)Online publication date: 14-Dec-2018
  • (2018)Preference relations based unsupervised rank aggregation for metasearchExpert Systems with Applications: An International Journal10.1016/j.eswa.2015.12.00549:C(86-98)Online publication date: 29-Dec-2018
  • Show More Cited By

Index Terms

  1. Ranking function adaptation with boosting trees

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Transactions on Information Systems
    ACM Transactions on Information Systems  Volume 29, Issue 4
    December 2011
    172 pages
    ISSN:1046-8188
    EISSN:1558-2868
    DOI:10.1145/2037661
    Issue’s Table of Contents
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 08 December 2011
    Accepted: 01 July 2011
    Revised: 01 January 2011
    Received: 01 June 2010
    Published in TOIS Volume 29, Issue 4

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. Web search ranking
    2. boosting regression trees
    3. domain adaptation
    4. learning to rank
    5. user feedback

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)4
    • Downloads (Last 6 weeks)2
    Reflects downloads up to 25 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2022)How to build high quality L2R training data: Unsupervised compression-based selective sampling for learning to rankInformation Sciences10.1016/j.ins.2022.04.012601(90-113)Online publication date: Jul-2022
    • (2018)Aggregated SearchFoundations and Trends in Information Retrieval10.1561/150000005210:5(365-502)Online publication date: 14-Dec-2018
    • (2018)Preference relations based unsupervised rank aggregation for metasearchExpert Systems with Applications: An International Journal10.1016/j.eswa.2015.12.00549:C(86-98)Online publication date: 29-Dec-2018
    • (2015)Domain Adaptation for Microscopy ImagingIEEE Transactions on Medical Imaging10.1109/TMI.2014.237687234:5(1125-1139)Online publication date: May-2015
    • (2013)Evaluating Journal Quality: Beyond “Expert” Journal Assessments in the IS DisciplineJournal of Organizational Computing and Electronic Commerce10.1080/10919392.2013.84046723:4(392-412)Online publication date: Oct-2013
    • (2013)Principal Factor Analysis for Forecasting Diurnal Water-Demand Pattern Using Combined Rough-Set and Fuzzy-Clustering TechniqueJournal of Water Resources Planning and Management10.1061/(ASCE)WR.1943-5452.0000223139:1(23-33)Online publication date: Jan-2013
    • (2012)Design-Type Research in Information SystemsDesign-Type Research in Information Systems10.4018/978-1-4666-0131-4.ch005(94-114)Online publication date: 2012
    • (2012)Research in Information SystemsDesign-Type Research in Information Systems10.4018/978-1-4666-0131-4.ch003(51-75)Online publication date: 2012

    View Options

    Login options

    Full Access

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media