Abstract
The context for geographic research has shifted from a data-scarce to a data-rich environment. The most fundamental changes are not just in the volume of data but in the variety of georeferenced data and the velocity at which we can capture it, trends often associated with the concept of Big Data. A data-driven geography may be emerging in response to the wealth of georeferenced data flowing from sensors and people in the environment. Although this may seem revolutionary, it may be better described as evolutionary. Some of the issues raised by data-driven geography are longstanding in geographic research, namely large data volumes, dealing with populations and messy data, and tensions between idiographic and nomothetic knowledge. The belief that spatial context matters is a major theme in geographic thought and a major motivation behind approaches such as time geography, disaggregate spatial statistics and GIScience. There is potential to use Big Data to inform both geographic knowledge-discovery and spatial modeling. However, there are challenges, such as how to formalize geographic knowledge in order to clean data and ignore spurious patterns, and how to build data-driven models that are both true and understandable.
Cite this article
Miller, H.J., Goodchild, M.F. Data-driven geography. GeoJournal 80, 449–461 (2015). https://doi.org/10.1007/s10708-014-9602-6
DOI: https://doi.org/10.1007/s10708-014-9602-6