Analysis of a very large web search engine query log

Published: 01 September 1999

Abstract

In this paper we present an analysis of an AltaVista Search Engine query log consisting of approximately 1 billion entries for search requests over a period of six weeks. This represents almost 285 million user sessions, each an attempt to fill a single information need. We present an analysis of individual queries, query duplication, and query sessions. We also present results of a correlation analysis of the log entries, studying the interaction of terms within queries. Our data supports the conjecture that web users differ significantly from the user assumed in the standard information retrieval literature. Specifically, we show that web users type in short queries, mostly look at the first 10 results only, and seldom modify the query. This suggests that traditional information retrieval techniques may not work well for answering web search requests. The correlation analysis showed that the most highly correlated items are constituents of phrases. This result indicates it may be useful for search engines to consider search terms as parts of phrases even if the user did not explicitly specify them as such.
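The abstract does not say which correlation measure the authors used, so the following is only a minimal sketch of one way term correlation within queries could be computed. It assumes the phi coefficient (the Pearson correlation of two binary "term appears in query" indicators); the function name `term_correlations` and the six-query toy log are hypothetical illustrations, not data or code from the paper.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def term_correlations(queries):
    """Return {(term_a, term_b): phi} for every pair of terms that co-occur.

    phi is the Pearson correlation between the binary indicators
    "term_a appears in the query" and "term_b appears in the query".
    This measure is an assumption; the paper's abstract does not name one.
    """
    n = len(queries)
    term_counts = Counter()
    pair_counts = Counter()
    for query in queries:
        # Treat each query as a set of lowercase, whitespace-separated terms.
        terms = sorted(set(query.lower().split()))
        term_counts.update(terms)
        pair_counts.update(combinations(terms, 2))

    correlations = {}
    for (a, b), n_ab in pair_counts.items():
        n_a, n_b = term_counts[a], term_counts[b]
        denom = sqrt(n_a * (n - n_a) * n_b * (n - n_b))
        if denom == 0:
            continue  # a term present in every query has no variance
        correlations[(a, b)] = (n * n_ab - n_a * n_b) / denom
    return correlations


# Made-up toy log, for illustration only.
sample_log = [
    "new york hotels",
    "new york weather",
    "cheap hotels",
    "weather forecast",
    "new york",
    "cheap flights",
]
scores = term_correlations(sample_log)
top_pair = max(scores, key=scores.get)
```

On this toy log the highest-scoring pair is `("new", "york")`, the two constituents of a phrase, which illustrates the kind of pattern the abstract describes.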

Published In

ACM SIGIR Forum, Volume 33, Issue 1
Fall 1999
45 pages
ISSN: 0163-5840
DOI: 10.1145/331403

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 September 1999
Published in SIGIR Volume 33, Issue 1
