Kernel Density Estimation for Text-Based Geolocation

Authors

  • Mans Hulden, University of Colorado Boulder
  • Miikka Silfverberg, University of Helsinki
  • Jerid Francom, Wake Forest University

DOI:

https://doi.org/10.1609/aaai.v29i1.9149

Keywords:

document geolocation, twitter, kernel classifier

Abstract

Text-based geolocation classifiers often operate with a grid-based view of the world. Predicting a document's location of origin from its text content on a geodesic grid is computationally attractive, since many standard methods for supervised document classification carry over unchanged to geolocation in the form of predicting the most probable grid cell for a document. However, the grid-based approach suffers from sparse-data problems if one wants to improve classification accuracy by moving to smaller cell sizes. In this paper we investigate an enhancement of common methods for determining the geographic point of origin of a text document by kernel density estimation. For geolocation of tweets, we obtain improvements over non-kernel methods on datasets of U.S. and global Twitter content.
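
The idea of combining a grid classifier with kernel density estimation can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Naive Bayes classifier over a regular latitude/longitude grid, a Gaussian kernel, and plain Euclidean distance in degrees instead of great-circle distance. All function names, the toy documents, and the parameters `sigma` and `alpha` are hypothetical. Each training document spreads fractional counts over nearby cells rather than incrementing only the cell it falls in, which is one way to soften the sparse-data problem at small cell sizes.

```python
import math
from collections import defaultdict


def cell_centers(lat_min, lat_max, lon_min, lon_max, step):
    """Centers of a regular lat/lon grid (a stand-in for a geodesic grid)."""
    n_lat = int(round((lat_max - lat_min) / step))
    n_lon = int(round((lon_max - lon_min) / step))
    return [(lat_min + (i + 0.5) * step, lon_min + (j + 0.5) * step)
            for i in range(n_lat) for j in range(n_lon)]


def kernel_weights(point, centers, sigma):
    """Gaussian kernel weight of one observation at every cell center.

    Assumption: Euclidean distance in degrees, chosen only for brevity.
    """
    return [math.exp(-((point[0] - c[0]) ** 2 + (point[1] - c[1]) ** 2)
                     / (2 * sigma ** 2))
            for c in centers]


def train(docs, centers, sigma):
    """Accumulate kernel-smoothed document and word mass per grid cell."""
    n = len(centers)
    word_mass = defaultdict(lambda: [0.0] * n)  # word -> smoothed count per cell
    cell_mass = [0.0] * n                       # smoothed document count per cell
    token_mass = [0.0] * n                      # smoothed token count per cell
    for point, words in docs:
        weights = kernel_weights(point, centers, sigma)
        for i, w in enumerate(weights):
            cell_mass[i] += w
            token_mass[i] += w * len(words)
        for word in words:
            row = word_mass[word]
            for i, w in enumerate(weights):
                row[i] += w
    return dict(word_mass), cell_mass, token_mass


def predict(words, model, centers, alpha=1e-3):
    """Return the center of the most probable cell under Naive Bayes."""
    word_mass, cell_mass, token_mass = model
    vocab_size = len(word_mass)
    total = sum(cell_mass)
    n = len(centers)
    best_cell, best_score = None, -math.inf
    for i in range(n):
        # Log prior of the cell, with additive smoothing.
        score = math.log((cell_mass[i] + alpha) / (total + alpha * n))
        for word in words:
            row = word_mass.get(word)
            if row is None:  # skip words never seen in training
                continue
            score += math.log((row[i] + alpha)
                              / (token_mass[i] + alpha * vocab_size))
        if score > best_score:
            best_cell, best_score = centers[i], score
    return best_cell


# Toy data: two documents on the U.S. west coast, two on the east coast.
docs = [
    ((38.0, -122.0), ["surf", "beach"]),
    ((36.0, -121.0), ["surf"]),
    ((41.0, -74.0), ["subway"]),
    ((40.0, -75.0), ["subway", "bagel"]),
]
centers = cell_centers(30, 50, -130, -70, 10)
model = train(docs, centers, sigma=5.0)
print(predict(["surf"], model, centers))
print(predict(["subway"], model, centers))
```

In this sketch the bandwidth `sigma` controls how far each observation's influence spreads. As `sigma` becomes very small, nearly all of a document's mass falls on its nearest cell center, which approximates plain per-cell counting.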

Published

2015-02-09

How to Cite

Hulden, M., Silfverberg, M., & Francom, J. (2015). Kernel Density Estimation for Text-Based Geolocation. Proceedings of the AAAI Conference on Artificial Intelligence, 29(1). https://doi.org/10.1609/aaai.v29i1.9149