DOI: 10.1145/3297280.3297342
research-article

A transfer learning paradigm for spatial networks

Published: 08 April 2019

Abstract

Both machine learning methods and the availability of spatial data have improved remarkably in recent times. This parallel growth has encouraged the application of traditional data mining techniques to knowledge discovery on spatial data. However, these techniques assume that the data are independent and identically distributed, whereas spatial data are inherently dependent and heterogeneous. This contradiction strongly suggests that a naive application of conventional data mining techniques to spatial data would be suboptimal. In this paper, we evaluate the relatedness of street networks using a transfer learning methodology within the formal context of spatial data. Adopting a statistical multi-measure, we analyze street networks from eight cities to ascertain their similarities. We predict street types using random forests and evaluate the accuracies as a function of transfer polarity: transfer is positive when the transferred model performs better than the parent model and negative when it performs worse. With an overall average accuracy of 85%, our results show that it is possible to generalize machine learning models to different domains and still produce excellent results. We also demonstrate that the gain or loss in model accuracy can be explained by the degree of statistical similarity between the domains. This observation confirms that a measure of inter-domain similarity based solely on geo-political boundaries would be erroneous. The techniques we describe provide a statistically sound foundation for analyzing similarities in the spatial context. They can be adopted to understand the extent of model generalization for spatial networks.
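
The notion of transfer polarity in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the components of its statistical multi-measure, so the two-sample Kolmogorov-Smirnov statistic below is only an assumed stand-in for one such component. The function names, the `tolerance` parameter, and the segment-length feature are hypothetical.

```python
from bisect import bisect_right


def transfer_polarity(parent_accuracy, transferred_accuracy, tolerance=0.0):
    """Label a transfer as positive, negative, or neutral.

    parent_accuracy: accuracy of the model on the city it was trained on.
    transferred_accuracy: accuracy of the same model applied to another city.
    tolerance: hypothetical dead band; smaller changes count as neutral.
    """
    delta = transferred_accuracy - parent_accuracy
    if delta > tolerance:
        return "positive"
    if delta < -tolerance:
        return "negative"
    return "neutral"


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic (assumed similarity component).

    Returns the largest gap between the two empirical CDFs:
    0.0 for identical samples, 1.0 for samples that do not overlap.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    # The largest gap between two step functions occurs at a sample point.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Illustrative numbers only, not results from the paper.
source_lengths = [50.0, 80.0, 120.0, 200.0]   # hypothetical street segment lengths, city A
target_lengths = [55.0, 85.0, 400.0, 900.0]   # hypothetical street segment lengths, city B
similarity = 1.0 - ks_statistic(source_lengths, target_lengths)
polarity = transfer_polarity(parent_accuracy=0.85, transferred_accuracy=0.80)
print(polarity, similarity)
```

In a fuller experiment along the lines the abstract describes, the two accuracies would come from a random forest trained on one city and evaluated on another, and the similarity score would be compared against the observed polarity across city pairs.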



Published In

SAC '19: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing
April 2019
2682 pages
ISBN: 9781450359337
DOI: 10.1145/3297280
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. OpenStreetMap
  2. geographical data analysis
  3. heterogeneity
  4. spatial data mining
  5. spatial networks
  6. transfer learning

Conference

SAC '19

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

Article Metrics

  • Downloads (Last 12 months): 11
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 24 Nov 2024

Cited By

  • (2022) City indicators for geographical transfer learning: an application to crash prediction. Geoinformatica 26:4 (581-612). DOI: 10.1007/s10707-022-00464-3. Online publication date: 1-Oct-2022
  • (2022) Multi-agent Systems for Distributed Data Mining Techniques: An Overview. Big Data Intelligence for Smart Applications (57-92). DOI: 10.1007/978-3-030-87954-9_3. Online publication date: 18-Jan-2022
  • (2021) The Regional and Local Scale Evolution of the Spatial Structure of High-Speed Railway Networks—A Case Study Focused on Beijing-Tianjin-Hebei Urban Agglomeration. ISPRS International Journal of Geo-Information 10:8 (543). DOI: 10.3390/ijgi10080543. Online publication date: 12-Aug-2021
  • (2021) Leveraging Road Characteristics and Contributor Behaviour for Assessing Road Type Quality in OSM. ISPRS International Journal of Geo-Information 10:7 (436). DOI: 10.3390/ijgi10070436. Online publication date: 25-Jun-2021
  • (2021) Towards Robust Representations of Spatial Networks Using Graph Neural Networks. Applied Sciences 11:15 (6918). DOI: 10.3390/app11156918. Online publication date: 27-Jul-2021
  • (2021) Examining the impact of cross-domain learning on crime prediction. Journal of Big Data 8:1. DOI: 10.1186/s40537-021-00489-9. Online publication date: 3-Jul-2021
  • (2021) Transferable Graph Neural Networks for Inferring Road Type Attributes in Street Networks. IEEE Access 9 (158331-158339). DOI: 10.1109/ACCESS.2021.3128839. Online publication date: 2021
  • (2020) Improved Graph Neural Networks for Spatial Networks Using Structure-Aware Sampling. ISPRS International Journal of Geo-Information 9:11 (674). DOI: 10.3390/ijgi9110674. Online publication date: 13-Nov-2020
  • (2020) OSMWatchman: Learning How to Detect Vandalized Contributions in OSM Using a Random Forest Classifier. ISPRS International Journal of Geo-Information 9:9 (504). DOI: 10.3390/ijgi9090504. Online publication date: 22-Aug-2020
  • (2020) Exploring Budgeted Learning for Data-Driven Semantic Inference via Urban Functions. IEEE Access 8 (32258-32269). DOI: 10.1109/ACCESS.2020.2973885. Online publication date: 2020
