DOI: 10.1145/2666310.2666386

Geocoding for texts with fine-grain toponyms: an experiment on a geoparsed hiking descriptions corpus

Published: 04 November 2014

Abstract

Geoparsing and geocoding are two essential middleware services that support end-user applications such as location-aware search and other location-based services. This work proposes a processing chain for geoparsing and geocoding text documents that describe events strongly linked to space and that make frequent use of fine-grain toponyms. The geoparsing part is a Natural Language Processing approach that combines part-of-speech information with syntactico-semantic patterns, implemented as a cascade of transducers. The real novelty of this work, however, lies in the geocoding method. The geocoding algorithm is unsupervised and uses clustering techniques both to disambiguate the toponyms found in gazetteers and to estimate the spatial footprint of fine-grain toponyms not found in gazetteers. The feasibility of the proposal has been tested on a corpus of hiking descriptions in French, Spanish and Italian.
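
The clustering idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract only says "clustering techniques", so the DBSCAN-style density clustering used here is an assumption, as are the function names, the `eps_km` and `min_pts` parameters, and the toponyms and coordinates, which are invented for illustration. Every gazetteer candidate for every toponym is clustered together, the cluster covering the most distinct toponyms is taken as the area the text is about, and toponyms without a candidate in that cluster inherit its bounding box as a rough footprint.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def dbscan(points, eps_km, min_pts):
    """Plain DBSCAN: one label per point, -1 meaning noise."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j, q in enumerate(points)
                if haversine_km(points[i], q) <= eps_km]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


def geocode(candidates, eps_km=10.0, min_pts=2):
    """candidates: toponym -> list of (lat, lon); empty list = not in gazetteer.

    Returns (resolved, estimated): the chosen point per disambiguated toponym,
    and a (min_lat, min_lon, max_lat, max_lon) box for every other toponym.
    """
    flat = [(name, pt) for name, pts in candidates.items() for pt in pts]
    labels = dbscan([pt for _, pt in flat], eps_km, min_pts)

    # Score each cluster by how many distinct toponyms it covers.
    coverage = {}
    for (name, _), label in zip(flat, labels):
        if label != -1:
            coverage.setdefault(label, set()).add(name)
    if not coverage:
        return {}, {}
    best = max(coverage, key=lambda label: len(coverage[label]))

    resolved = {}
    for (name, pt), label in zip(flat, labels):
        if label == best and name not in resolved:
            resolved[name] = pt

    lats = [p[0] for p in resolved.values()]
    lons = [p[1] for p in resolved.values()]
    box = (min(lats), min(lons), max(lats), max(lons))
    estimated = {name: box for name in candidates if name not in resolved}
    return resolved, estimated


# Hypothetical hiking text: invented names and coordinates.
example = {
    "Saint-Martin": [(45.00, 6.00), (48.50, 2.30), (43.60, 1.44)],
    "Lac Blanc": [(45.02, 6.03), (48.10, 7.10)],
    "Col du Loup": [(45.05, 6.01)],
    "Cabane des Bergers": [],  # fine-grain toponym missing from the gazetteer
}
resolved, estimated = geocode(example)
```

In this toy input the three candidates near 45°N, 6°E lie within a few kilometres of each other and form the only cluster, so the ambiguous names resolve to those points and the unlisted hut receives their bounding box. A real footprint estimate would presumably also use the spatial relations expressed in the text, which this sketch ignores.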



Published In

SIGSPATIAL '14: Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
November 2014
651 pages
ISBN: 9781450331319
DOI: 10.1145/2666310

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. geocoding
  2. geoparsing
  3. location based services
  4. spatio-textual searching
  5. toponym disambiguation

Qualifiers

  • Research-article

Conference

SIGSPATIAL '14
Sponsor:
  • University of North Texas
  • Microsoft
  • ORACLE
  • Facebook
  • SIGSPATIAL

Acceptance Rates

SIGSPATIAL '14 Paper Acceptance Rate 39 of 184 submissions, 21%;
Overall Acceptance Rate 257 of 1,238 submissions, 21%

Cited By

  • (2024) Mapping cognitive place associations within the United Kingdom through online discussion on Reddit. Transactions of the Institute of British Geographers 49:3. DOI: 10.1111/tran.12669. Online publication date: 8-Jan-2024
  • (2024) Geographical and linguistic perspectives on developing geoparsers with generic resources. International Journal of Geographical Information Science 38:10 (2039-2060). DOI: 10.1080/13658816.2024.2369539. Online publication date: 30-Jun-2024
  • (2023) A Spatially-Aware Data-Driven Approach to Automatically Geocoding Non-Gazetteer Place Names. ACM Transactions on Spatial Algorithms and Systems 10:1 (1-34). DOI: 10.1145/3627987. Online publication date: 11-Dec-2023
  • (2023) Location Reference Recognition from Texts: A Survey and Comparison. ACM Computing Surveys 56:5 (1-37). DOI: 10.1145/3625819. Online publication date: 27-Nov-2023
  • (2022) Transformer based named entity recognition for place name extraction from unstructured text. International Journal of Geographical Information Science 37:4 (747-766). DOI: 10.1080/13658816.2022.2133125. Online publication date: 17-Oct-2022
  • (2022) Exploring Descriptions of Movement Through Geovisual Analytics (Erforschung von Bewegungsbeschreibungen durch geovisuelle Analytik). KN - Journal of Cartography and Geographic Information 72:1 (5-27). DOI: 10.1007/s42489-022-00098-3. Online publication date: 24-Feb-2022
  • (2021) Deep Learning for Toponym Resolution: Geocoding Based on Pairs of Toponyms. ISPRS International Journal of Geo-Information 10:12 (818). DOI: 10.3390/ijgi10120818. Online publication date: 2-Dec-2021
  • (2020) Adaptive Geoparsing Method for Toponym Recognition and Resolution in Unstructured Text. Remote Sensing 12:18 (3041). DOI: 10.3390/rs12183041. Online publication date: 17-Sep-2020
  • (2020) Normalisation of 16th and 17th century texts in French and geographical named entity recognition. Proceedings of the 4th ACM SIGSPATIAL Workshop on Geospatial Humanities (28-34). DOI: 10.1145/3423337.3429437. Online publication date: 3-Nov-2020
  • (2020) Classification des entités nommées dans l’Encyclopédie ou dictionnaire raisonné des sciences des arts et des métiers par une société de gens de lettres (1751-1772). SHS Web of Conferences 78 (11008). DOI: 10.1051/shsconf/20207811008. Online publication date: 4-Sep-2020
