Comprehensive analysis of distance and similarity measures for Wi-Fi fingerprinting indoor positioning systems
Expert Systems with Applications, 2015 • Elsevier
Abstract
Recent advances in indoor positioning systems have led to business interest in applications and services where precise localization is crucial. Wi-Fi fingerprinting methods based on machine learning and expert systems are commonly used in the literature. They compare a current fingerprint to a database of fingerprints and then return the most similar one or ones according to: 1) a distance function, 2) a data representation method for received signal strength (RSS) values, and 3) a thresholding strategy. However, most previous works simply use the Euclidean distance with the raw, unprocessed data. No previous work studies which distance function is best, which way of representing the data is best, and what effect thresholding has. In this paper, we present a comprehensive study using 51 distance metrics, 4 alternatives for representing the raw data (2 of them proposed by us), a thresholding based on the RSS values, and the public UJIIndoorLoc database. The results show that researchers and developers should take the conclusions of this work into account in order to improve the accuracy of their systems. Indoor positioning systems (IPSs) based on k-NN are improved by just selecting the appropriate configuration (mainly the distance function and the data representation). In the best case, 13-NN with the Sørensen distance and the powed data representation, the error in determining the place (building and floor) is reduced by more than 50% and the positioning accuracy is improved by 1.7 m with respect to the 1-NN with Euclidean distance and raw data commonly used in the literature. Moreover, our experiments also demonstrate that thresholding should not be applied in multi-building and multi-floor environments.
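To make the configuration named in the abstract concrete, the following is a minimal sketch of k-NN place estimation (building and floor) with the Sørensen distance over a transformed RSS representation. It is not the authors' implementation. The abstract does not define the powed representation, so the formula used here (a shifted, non-negative RSS raised to an exponent and scaled to [0, 1]) is an assumption, as are the sentinel value `NOT_DETECTED`, the constant `MIN_RSS`, the exponent `beta`, the helper names, and the toy fingerprints.

```python
from collections import Counter
from math import e

NOT_DETECTED = 100   # assumed sentinel meaning "access point not heard"
MIN_RSS = -104       # assumed weakest RSS value in the database (dBm)
FLOOR = MIN_RSS - 1  # shift so every detected value becomes strictly positive


def positive(rss):
    """Map a raw RSS value to a non-negative number (0 for undetected APs)."""
    return 0.0 if rss == NOT_DETECTED else float(rss - FLOOR)


def powed(rss, beta=e):
    """Assumed 'powed' form: positive value raised to beta, scaled to [0, 1]."""
    return positive(rss) ** beta / (-FLOOR) ** beta


def sorensen(p, q):
    """Sorensen distance: sum of |p_i - q_i| divided by sum of (p_i + q_i)."""
    num = sum(abs(a - b) for a, b in zip(p, q))
    den = sum(a + b for a, b in zip(p, q))
    return num / den if den else 0.0


def knn_place(query, fingerprints, labels, k=1):
    """Majority vote over the labels of the k fingerprints closest to the query."""
    q = [powed(v) for v in query]
    ranked = sorted(
        range(len(fingerprints)),
        key=lambda i: sorensen(q, [powed(v) for v in fingerprints[i]]),
    )
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy database: three fingerprints over three access points (hypothetical data).
database = [
    [-50, -70, NOT_DETECTED],
    [-52, -68, NOT_DETECTED],
    [NOT_DETECTED, -90, -45],
]
places = [("B0", 1), ("B0", 1), ("B1", 2)]  # (building, floor) labels

estimate = knn_place([-51, -69, NOT_DETECTED], database, places, k=3)
print(estimate)
```

Swapping `sorensen` for a Euclidean distance and `powed` for the identity function would give the 1-NN baseline style that the abstract compares against, which is why the distance and the representation are kept as separate functions here.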