Abstract
We recently built MSearch, an experimental cross-media meta-search system based on the database of the WikipediaMM task of ImageCLEF 2008. The central problem for a meta-search engine is how to merge the results from multiple member search engines into a more effective ranked list. This paper presents a novel fusion model based on supervised learning, which uses ranking SVM to train the fusion weight of each member search engine. We treat each member search engine as one feature of a result document returned by the meta-search engine, so that the weight learned for that feature serves as the engine's fusion weight. For each returned document, we first build a feature vector in which the value of each feature is the score assigned to the document by the corresponding member search engine. We then construct a training set from the documents returned by the meta-search engine and learn the fusion parameters. Finally, we use a linear fusion model based on the overlap set to merge the result sets. Experimental results show that our approach significantly improves the performance of the cross-media meta-search system (MSearch) and outperforms many existing fusion methods.
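The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the engine names, toy scores, relevance labels, and all function names are hypothetical, a score of 0.0 is assumed for documents that an engine does not return, and a plain subgradient-descent trainer stands in for a full ranking SVM solver. The overlap-set detail of the fusion model is not reproduced here.

```python
import random

def build_feature_vectors(engine_results, engines):
    """One feature per member engine: the score that engine gave the document.
    Assumption: a document an engine did not return gets 0.0 for that engine."""
    docs = set()
    for scores in engine_results.values():
        docs.update(scores)
    return {d: [engine_results[e].get(d, 0.0) for e in engines] for d in sorted(docs)}

def pairwise_differences(features, relevance):
    """Pairwise reduction used by ranking SVM: for every pair where document a
    is more relevant than document b, emit the difference vector x_a - x_b."""
    pairs = []
    for a in features:
        for b in features:
            if relevance[a] > relevance[b]:
                pairs.append([xa - xb for xa, xb in zip(features[a], features[b])])
    return pairs

def train_fusion_weights(pairs, n_features, c=1.0, epochs=200, lr=0.05, seed=0):
    """Minimise 0.5*||w||^2 + c * sum(max(0, 1 - w.d)) over difference vectors d
    by stochastic subgradient descent (a stand-in for a real SVM solver)."""
    rng = random.Random(seed)
    pairs = list(pairs)
    w = [0.0] * n_features
    for _ in range(epochs):
        rng.shuffle(pairs)
        for d in pairs:
            margin = sum(wi * di for wi, di in zip(w, d))
            for i in range(n_features):
                grad = w[i] / len(pairs)      # share of the regulariser
                if margin < 1.0:              # pair violates the margin
                    grad -= c * d[i]
                w[i] -= lr * grad
    return w

def fuse(features, w):
    """Linear fusion: rank documents by the weighted sum of member-engine scores."""
    fused = {d: sum(wi * xi for wi, xi in zip(w, x)) for d, x in features.items()}
    return sorted(fused, key=lambda d: (-fused[d], d))

# Hypothetical toy data: a text engine and a visual engine scoring three documents.
engines = ["text", "visual"]
engine_results = {
    "text": {"d1": 0.9, "d2": 0.4, "d3": 0.3},
    "visual": {"d1": 0.2, "d3": 0.8},         # d2 was not returned by this engine
}
relevance = {"d1": 2, "d2": 1, "d3": 0}        # hypothetical graded judgments

features = build_feature_vectors(engine_results, engines)
pairs = pairwise_differences(features, relevance)
weights = train_fusion_weights(pairs, n_features=len(engines))
ranking = fuse(features, weights)
print(weights, ranking)
```

In this toy setting the visual engine's scores disagree with the relevance labels, so the learned weight for that feature ends up lower than the weight for the text engine, and the fused ranking follows the labels.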
Additional information
Project supported by the National Natural Science Foundation of China (No. 60605020) and the National High-Tech R & D Program (863) of China (Nos. 2006AA01Z320 and 2006AA010105)
About this article
Cite this article
Cao, Yl., Huang, Tj. & Tian, Yh. A ranking SVM based fusion model for cross-media meta-search engine. J. Zhejiang Univ. - Sci. C 11, 903–910 (2010). https://doi.org/10.1631/jzus.C1001009
DOI: https://doi.org/10.1631/jzus.C1001009