Abstract
One of the useful tools offered by existing web search engines is query suggestion (QS), which assists users in formulating keyword queries by suggesting keywords that are unfamiliar to users, offering alternative queries that deviate from the original ones, and even correcting spelling errors. The design goal of QS is to enrich the web search experience of users and to spare them the frustrating process of choosing controlled keywords to specify their particular information needs, which relieves them of the burden of creating web queries. Unfortunately, the algorithms and design methodologies of the QS module developed by Google, the most popular web search engine these days, are not publicly available, which means that software developers cannot duplicate them to build such a tool for specifically designed software systems for enterprise search, desktop search, or vertical search, to name a few. Keywords suggested by Yahoo! and Bing, two other well-known web search engines, are mostly popular, currently searched words, which might not meet the specific information needs of the users. These problems can be solved by WebQS, our proposed web QS approach, which provides the same mechanism offered by Google, Yahoo!, and Bing to support users in formulating keyword queries that improve the precision and recall of search results. WebQS relies on frequency of occurrence, keyword similarity measures, and modification patterns of queries in user query logs, which capture information on millions of searches conducted by millions of users, to suggest useful queries and query keywords during the query construction process and thereby achieve the design goal of QS. Experimental results show that WebQS performs as well as Yahoo! and Bing in terms of effectiveness and efficiency and is comparable to Google in terms of query suggestion time.
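To make the idea of log-based suggestion concrete, the following is a minimal sketch of how frequency of occurrence and a keyword similarity measure might be combined to rank candidate suggestions. It is not the WebQS algorithm: the function names `jaccard` and `suggest`, the weight `alpha`, the prefix-matching rule for candidates, and the use of Jaccard word overlap as the similarity measure are all assumptions chosen for illustration, and query modification patterns are not modeled here.

```python
from collections import Counter


def jaccard(a, b):
    """Word-overlap similarity between two queries (a stand-in measure, assumed)."""
    sa, sb = set(a.split()), set(b.split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def suggest(partial, query_log, k=3, alpha=0.5):
    """Rank logged queries that extend `partial` by frequency and similarity.

    `alpha` (hypothetical) weights normalized log frequency against similarity.
    """
    partial = partial.strip().lower()
    counts = Counter(q.strip().lower() for q in query_log)
    # Candidates: logged queries that start with what the user has typed so far.
    candidates = [q for q in counts if q.startswith(partial) and q != partial]
    if not candidates:
        return []
    max_count = max(counts[q] for q in candidates)

    def score(q):
        freq = counts[q] / max_count  # normalized frequency of occurrence
        return alpha * freq + (1 - alpha) * jaccard(partial, q)

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(candidates, key=lambda q: (-score(q), q))[:k]


# Toy query log, invented for this example.
log = [
    "cheap flights", "cheap flights to paris", "cheap flights to paris",
    "cheap hotels", "cheap flights to rome", "weather paris",
]
suggestions = suggest("cheap flights", log)
print(suggestions)
```

On this toy log, the query that was logged twice ranks ahead of the one logged once, since both candidates have the same word overlap with the typed text. A real system would draw on far larger logs and a richer similarity measure than this sketch assumes.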
Cite this article
Qumsiyeh, R., Ng, YK. Assisting web search using query suggestion based on word similarity measure and query modification patterns. World Wide Web 17, 1141–1160 (2014). https://doi.org/10.1007/s11280-013-0235-3
DOI: https://doi.org/10.1007/s11280-013-0235-3