Location normalization for information extraction

Published: 24 August 2002

Abstract

Ambiguity is very high for location names. For example, there are 23 cities named 'Buffalo' in the U.S. Country names such as 'Canada', 'Brazil' and 'China' are also city names in the USA. Almost every city has a Main Street or Broadway. Such ambiguity needs to be handled before we can refer to location names for visualization of related extracted events. This paper presents a hybrid approach for location normalization which combines (i) lexical grammar driven by local context constraints, (ii) graph search for maximum spanning tree and (iii) integration of semi-automatically derived default senses. The focus is on resolving ambiguities for the following types of location names: island, town, city, province, and country. The results are promising with 93.8% accuracy on our test collections.
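The abstract names a graph search for a maximum spanning tree as one of the three components but does not say how the graph is built. The following is only a minimal sketch of that one ingredient, assuming Kruskal's algorithm run from the heaviest edge to the lightest. The node names, the edge weights, and the function name `maximum_spanning_tree` are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, List, Tuple

Edge = Tuple[float, str, str]  # (weight, node_a, node_b)


def maximum_spanning_tree(nodes: List[str], edges: List[Edge]) -> List[Edge]:
    """Kruskal's algorithm, visiting edges from heaviest to lightest."""
    # Union-find forest: every node starts as its own component.
    parent: Dict[str, str] = {n: n for n in nodes}

    def find(n: str) -> str:
        # Walk up to the component root, compressing the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree: List[Edge] = []
    for weight, a, b in sorted(edges, key=lambda e: e[0], reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two components, so no cycle
            parent[root_a] = root_b
            tree.append((weight, a, b))
    return tree


# Hypothetical toy graph: nodes are location mentions in one document and
# weights are made-up association strengths (assumption, not from the paper).
nodes = ["Buffalo", "Albany", "Rochester", "Toronto"]
edges = [
    (3.0, "Buffalo", "Rochester"),
    (2.5, "Buffalo", "Toronto"),
    (2.0, "Buffalo", "Albany"),
    (1.0, "Albany", "Rochester"),
    (0.5, "Albany", "Toronto"),
]

for weight, a, b in maximum_spanning_tree(nodes, edges):
    print(f"{a} - {b}: {weight}")
```

In this toy graph the two lightest edges are skipped because each would close a cycle. How the tree is then used to pick one sense per location name, and how it interacts with the lexical grammar and the default senses, is described in the paper itself rather than in the abstract.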

    Published In

    COLING '02: Proceedings of the 19th international conference on Computational linguistics - Volume 1
    August 2002
    1184 pages

    Publisher

    Association for Computational Linguistics

    United States

    Acceptance Rates

    Overall Acceptance Rate 1,537 of 1,537 submissions, 100%

    Cited By

    • (2016) A survey on the geographic scope of textual documents. Computers & Geosciences 96:C (23-34). DOI: 10.1016/j.cageo.2016.07.017. Online publication date: 1-Nov-2016
    • (2015) When Location Meets Social Multimedia. ACM Transactions on Intelligent Systems and Technology 6:1 (1-18). DOI: 10.1145/2597181. Online publication date: 26-Mar-2015
    • (2014) Automatic Identification of Locative Expressions from Social Media Text. Proceedings of the 4th International Workshop on Location and the Web (9-16). DOI: 10.1145/2663713.2664426. Online publication date: 3-Nov-2014
    • (2014) A POI Categorization by Composition of Onomastic and Contextual Information. Proceedings of the 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) - Volume 02 (38-45). DOI: 10.1109/WI-IAT.2014.78. Online publication date: 11-Aug-2014
    • (2013) GEO-NASS. Proceedings of the 17th East European Conference on Advances in Databases and Information Systems - Volume 8133 (56-69). DOI: 10.1007/978-3-642-40683-6_5. Online publication date: 1-Sep-2013
    • (2012) Joint inference of named entity recognition and normalization for tweets. Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1 (526-535). DOI: 10.5555/2390524.2390598. Online publication date: 8-Jul-2012
    • (2012) Event-centric search and exploration in document collections. Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries (223-232). DOI: 10.1145/2232817.2232859. Online publication date: 10-Jun-2012
    • (2010) An efficient location extraction algorithm by leveraging web contextual information. Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems (53-60). DOI: 10.1145/1869790.1869801. Online publication date: 2-Nov-2010
    • (2010) Extraction and exploration of spatio-temporal information in documents. Proceedings of the 6th Workshop on Geographic Information Retrieval (1-8). DOI: 10.1145/1722080.1722101. Online publication date: 18-Feb-2010
    • (2010) TWinner. Proceedings of the 6th Workshop on Geographic Information Retrieval (1-8). DOI: 10.1145/1722080.1722093. Online publication date: 18-Feb-2010
