Enriching web taxonomies through subject categorization of query terms from search engine logs

Published: 01 April 2003

Abstract

In this paper, we propose a query-categorization approach to facilitate the engineering process of constructing Web taxonomies. A primary step in taxonomy construction is acquiring domain-specific terminology and the mapping between subjects and those terms. We introduce a technique for categorizing Web query terms, drawn from the logs of online search services, into a predefined subject taxonomy based on their presumed popular search interests. The experimental results show that the technique is effective in reducing the workload of human indexers who construct Web taxonomies, and that it is useful in various Web information retrieval applications.
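The abstract does not describe the categorization procedure itself, so the following is only an illustrative sketch of what assigning a query term to a predefined subject taxonomy can look like. It assumes each subject is represented by the context words of a few seed terms and uses a simple nearest-centroid rule with cosine similarity; this is a stand-in technique, not the authors' method. The category names, seed contexts, and function names are all hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical seed data: for each subject in the taxonomy, the context words
# observed around a few already-categorized terms. Not taken from the paper.
SEED_CONTEXTS = {
    "Computers": [["software", "download", "windows"],
                  ["hardware", "cpu", "download"]],
    "Travel": [["hotel", "flight", "booking"],
               ["tour", "flight", "visa"]],
}


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_centroids(seed_contexts):
    """Sum the context-word counts of each subject's seed terms."""
    centroids = {}
    for category, contexts in seed_contexts.items():
        total = Counter()
        for words in contexts:
            total.update(words)
        centroids[category] = total
    return centroids


def categorize(term_context, centroids, top_k=1):
    """Return the top_k (category, score) pairs for a query term's context."""
    vec = Counter(term_context)
    scored = sorted(
        ((cosine(vec, centroid), category)
         for category, centroid in centroids.items()),
        reverse=True,
    )
    return [(category, score) for score, category in scored[:top_k]]


centroids = build_centroids(SEED_CONTEXTS)
# Context words for an uncategorized query term (again hypothetical).
ranking = categorize(["cheap", "flight", "hotel"], centroids, top_k=2)
```

Returning a ranked list rather than a single label fits the workload-reduction goal stated in the abstract: a human indexer could review a few suggested subjects instead of categorizing each term from scratch.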

Published In

Decision Support Systems, Volume 35, Issue 1
Web retrieval and mining
01 April 2003
181 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Author Tags

  1. information retrieval
  2. log analysis
  3. query categorization

Qualifiers

  • Article
