Exploiting time-based synonyms in searching document archives

DOI:10.1145/1816123.1816135
Published: 21 June 2010

Abstract

Expanding queries that contain named entities with synonyms of those entities can increase retrieval effectiveness. A peculiarity of named entities compared to other vocabulary terms is that they are highly dynamic in appearance, and the synonym relationships between terms change over time. In this paper, we present an approach to extracting synonyms of named entities over time from the whole history of Wikipedia. In addition, we use the temporal patterns of the extracted synonyms as a feature for ranking them and for classifying them into two types: time-independent and time-dependent. Time-independent synonyms are invariant to time, while time-dependent synonyms are relevant only to a particular time period, i.e., their synonym relationships change over time. Further, we describe how both types of synonyms can be used to increase retrieval effectiveness: query expansion with time-independent synonyms for an ordinary search, and query expansion with time-dependent synonyms for a search with respect to temporal criteria. Finally, through an evaluation based on TREC collections, we demonstrate how the retrieval performance of queries consisting of named entities can be improved using our approach.
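
To make the distinction between the two synonym types concrete, the following is a minimal sketch of query expansion over a toy synonym table. It is not the method of the paper: the entity, the aliases, the observation window, the `Synonym` class, and the rule that a synonym is time-independent when it spans the whole window are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical observation window (years covered by the toy synonym history).
WINDOW_START, WINDOW_END = 2001, 2010


@dataclass(frozen=True)
class Synonym:
    term: str
    start: int  # first year the synonym relationship is observed
    end: int    # last year the synonym relationship is observed

    def is_time_independent(self) -> bool:
        # Toy rule (an assumption): a synonym counts as time-independent
        # when it holds over the entire observation window.
        return self.start <= WINDOW_START and self.end >= WINDOW_END

    def valid_at(self, year: int) -> bool:
        return self.start <= year <= self.end


# Made-up entity and aliases, for illustration only.
SYNONYMS = {
    "entity a": [
        Synonym("alias stable", 2001, 2010),
        Synonym("alias early", 2001, 2004),
        Synonym("alias late", 2005, 2010),
    ],
}


def expand_query(entity: str, year: Optional[int] = None) -> List[str]:
    """Return the entity followed by the synonyms chosen for expansion."""
    candidates = SYNONYMS.get(entity.lower(), [])
    if year is None:
        # Ordinary search: expand with time-independent synonyms only.
        extra = [s.term for s in candidates if s.is_time_independent()]
    else:
        # Temporal search: expand with every synonym valid in that year.
        extra = [s.term for s in candidates if s.valid_at(year)]
    return [entity] + extra
```

In this sketch a temporal query also picks up the time-independent synonyms, since they are valid in every year of the window; whether to include them in a temporal search is a design choice of the sketch, not something the abstract specifies.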

    Published In

    JCDL '10: Proceedings of the 10th annual joint conference on Digital libraries
    June 2010
    424 pages
    ISBN:9781450300858
    DOI:10.1145/1816123

    Sponsors

    In-Cooperation

    • IEEE CS

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. query expansion
    2. synonym detection
    3. temporal search

    Qualifiers

    • Research-article

    Conference

    JCDL10
    JCDL10: Joint Conference on Digital Libraries
    June 21 - 25, 2010
    Gold Coast, Queensland, Australia

    Acceptance Rates

    Overall Acceptance Rate 415 of 1,482 submissions, 28%

    Cited By

    • (2019) Across-Time Comparative Summarization of News Articles. Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, pp. 735-743. DOI:10.1145/3289600.3291008. Online publication date: 30-Jan-2019
    • (2019) Mapping Entity Sets in News Archives Across Time. Data Science and Engineering. DOI:10.1007/s41019-019-00102-3. Online publication date: 9-Sep-2019
    • (2019) Typicality-Based Across-Time Mapping of Entity Sets in Document Archives. Database Systems for Advanced Applications, pp. 350-366. DOI:10.1007/978-3-030-18576-3_21. Online publication date: 24-Apr-2019
    • (2017) Temporal Analog Retrieval using Transformation over Dual Hierarchical Structures. Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pp. 717-726. DOI:10.1145/3132847.3132917. Online publication date: 6-Nov-2017
    • (2017) Is Tofu the Cheese of Asia? Proceedings of the 26th International Conference on World Wide Web Companion, pp. 1033-1042. DOI:10.1145/3041021.3055132. Online publication date: 3-Apr-2017
    • (2016) Causal Relationship Detection in Archival Collections of Product Reviews for Understanding Technology Evolution. ACM Transactions on Information Systems 35:1, pp. 1-41. DOI:10.1145/2937752. Online publication date: 11-Aug-2016
    • (2016) Accounting for Language Changes Over Time in Document Similarity Search. ACM Transactions on Information Systems 35:1, pp. 1-26. DOI:10.1145/2934671. Online publication date: 3-Sep-2016
    • (2016) Detecting Evolution of Concepts based on Cause-Effect Relationships in Online Reviews. Proceedings of the 25th International Conference on World Wide Web, pp. 649-660. DOI:10.1145/2872427.2883013. Online publication date: 11-Apr-2016
    • (2016) The Past is Not a Foreign Country. IEEE Transactions on Knowledge and Data Engineering 28:10, pp. 2793-2807. DOI:10.1109/TKDE.2016.2591008. Online publication date: 1-Oct-2016
    • (2016) How to Search the Internet Archive Without Indexing It. Research and Advanced Technology for Digital Libraries, pp. 147-160. DOI:10.1007/978-3-319-43997-6_12. Online publication date: 10-Aug-2016