Background

In recent urban health studies, increasing attention has been paid to evaluating the accessibility of urban resources, either for individuals or for populations living in residential areas. Although the concept of "accessibility" is multidimensional (accessibility may be defined in terms of affordability, acceptability, availability and spatial accessibility [1]), evaluating geographical accessibility in residential areas offers critical information for public policy in planning and service provision, as it allows for the identification of areas with lower (or higher) access to urban resources and the assessment of spatial and social inequalities in access [2, 3].

Geographical accessibility refers to the ease with which residents of a given area can reach services and facilities [2]. The most common approaches for defining geographical accessibility are based on distance or travel time to a resource (for a review, please refer to [4]). These measures assume that every member of the population is a potential user of the service; the pattern of spatial accessibility will depend on the relative location of the population and services [5, 6]. Table 1 synthesises approaches for conceptualizing and measuring different dimensions of geographical accessibility.

Table 1 Approaches for conceptualizing and measuring the geographical accessibility of services and facilities for residential areas

Several studies have measured the geographical accessibility in residential areas of services and facilities that have the potential to contribute to the population's well-being and health such as health care services [2, 7–18], recreational facilities [2, 16, 18, 19], and food supermarkets [16, 18, 20–23]. Accessibility to these types of resources is especially important for populations with limited mobility and revenue since more direct and easier access confers opportunities by reducing the time and financial costs of access, and by potentially influencing life choices [24]. Other studies have measured geographical accessibility of resources potentially associated with more negative health outcomes, such as waste facilities, fast food restaurants, and pollution from large motorways [25–27].

Over the past two decades, the operationalization of geographical accessibility measures in urban and health studies has become easier, largely due to developments in GIS transportation software and modules. These measures require the specification of a set of four parameters, namely 1) a spatial unit of reference for the population, i.e. a definition of residential areas; 2) an aggregation method, i.e. to account for the distribution of population in the residential area; 3) a measure of accessibility; and 4) a type of distance for computing the accessibility measures selected. The choice of these parameters is likely to generate different results, potentially leading to significant measurement errors [2, 28, 29].

In this paper, we investigate differences in results when geographical accessibility of residential areas (census tracts) to health care services is computed using different distance types and different aggregation methods. In the next section, we describe methods for defining the four parameters for computing accessibility measures. With these methods in mind, we then provide an overview of methodological issues in measuring accessibility.

Evaluating geographical accessibility of services and facilities in residential areas: specifying a set of parameters

Spatial unit of reference and aggregation methods

Selecting the appropriate spatial unit of analysis, i.e. the operational definition for residential areas, is critical for minimizing aggregation errors [2, 30]. Aggregation error arises from the distribution of individuals around the centroid of spatial units [2]. As spatial units vary in size from smaller areas, e.g. census blocks, to larger ones, e.g. census tracts, accessibility measured for smaller units is less subject to aggregation error than that measured for larger spatial units [2].

To evaluate the geographical accessibility of a service for a population living in a residential area, e.g. a census tract, three methods can be used [2]; they are illustrated in Figure 1. The first method consists of computing the distance between the centroid of the census tract and the service (Figure 1.a). This method is problematic because it ignores the spatial distribution of the population inside the census tract [2].

Figure 1. Choosing the spatial unit of reference for calculating distances and error aggregation.

The second method consists of calculating the population-weighted mean centre of the census tracts (Equation 1) and then evaluating the distance between this new location and the service. Toward this end, smaller spatial units entirely contained by the census tracts can be used, such as dissemination areas, census blocks, or postal codes. This method accounts for the spatial distribution of the population inside the census tract in order to minimize aggregation error.

$$(\bar{x}_i, \bar{y}_i) = \left( \frac{\sum_{b \in i} w_b x_b}{\sum_{b \in i} w_b},\ \frac{\sum_{b \in i} w_b y_b}{\sum_{b \in i} w_b} \right)$$
(1)

Where:

$w_b$ = total population of spatial unit b completely within census tract i (i.e. dissemination area or census block or postal code).

$x_b$ and $y_b$ = X and Y coordinates of spatial unit b.

Finally, the third method consists of computing the distance between the services and each centroid of spatial units completely within census tracts, and then calculating the average of these distances weighted by the total population of each unit (Figures 1.b and 1.c). In comparison with the previous methods, this one is more accurate because it more exactly accounts for the distribution of the population inside the census tract.
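The three aggregation methods can be contrasted on a toy example. The sketch below is illustrative only: the coordinates, populations and service locations are hypothetical, Euclidean distance stands in for whichever distance type is chosen, and the helper names (`euclid`, `closest`) are not from the original study.

```python
import math


def euclid(p, q):
    """Straight-line distance between two projected points (metres)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def closest(point, services):
    """Distance from a point to its closest service."""
    return min(euclid(point, s) for s in services)


# Hypothetical census tract: blocks as (population, x, y), all in metres.
blocks = [(100, 0.0, 0.0), (300, 1000.0, 0.0), (600, 1000.0, 1000.0)]
tract_centroid = (500.0, 500.0)          # geometric centroid (assumed)
services = [(0.0, 0.0), (2000.0, 1000.0)]

total_pop = sum(w for w, _, _ in blocks)

# Method 1: distance from the geometric centroid of the tract.
d_centroid = closest(tract_centroid, services)

# Method 2: distance from the population-weighted mean centre (Equation 1).
mean_centre = (sum(w * x for w, x, _ in blocks) / total_pop,
               sum(w * y for w, _, y in blocks) / total_pop)
d_mean_centre = closest(mean_centre, services)

# Method 3: population-weighted mean of the block-level distances.
d_weighted = sum(w * closest((x, y), services) for w, x, y in blocks) / total_pop
```

With these made-up values the three methods return different distances for the same tract, which is the aggregation effect the paper sets out to quantify.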

Accessibility measures

The five most commonly used measures of accessibility are: 1) the distance to the closest service, 2) the number of services within n metres or minutes, 3) the mean distance to all services, 4) the mean distance to n closest services, and 5) the gravity model. If the most accurate aggregation method detailed previously is selected, these accessibility measures can be written as:

$$Z_i^a = \frac{\sum_{b \in i} w_b \left( \min |d_{bs}| \right)}{\sum_{b \in i} w_b},$$
(2)

Where:

$Z_i^a$ = mean distance between census tract i and closest service.

$w_b$ = total population of spatial unit b completely within census tract i.

$d_{bs}$ = distance between spatial unit b and service s.

$$Z_i^b = \frac{\sum_{b \in i} w_b \sum_{j \in S} S_j}{\sum_{b \in i} w_b},$$
(3)

Where:

$Z_i^b$ = mean number of services within n metres or minutes of census tract i.

$w_b$ = total population of spatial unit b completely within census tract i.

$S$ = all services.

$S_j$ = number of services within n metres or minutes of spatial unit centroid b, with $S_j = 1$ where $d_{bs} \le n$ and $S_j = 0$ where $d_{bs} > n$.

$$Z_i^c = \frac{\sum_{b \in i} w_b d_{bs}}{\sum_{b \in i} w_b},$$
(4)

Where:

$Z_i^c$ = mean distance between census tract i population and all services.

$w_b$ = total population of spatial unit b completely within census tract i.

$d_{bs}$ = distance between spatial unit centroid b and service s.

$$Z_i^d = \frac{\sum_{b \in i} w_b \sum_{s} \frac{d_{bs}}{n}}{\sum_{b \in i} w_b},$$
(5)

Where:

$Z_i^d$ = mean distance between census tract i and n closest services.

$w_b$ = total population of spatial unit b completely within census tract i.

$d_{bs}$ = distance between spatial unit centroid b and service s; $d_{bs}$ is sorted in ascending order.

$n$ = number of closest services to be included in measure.

$$Z_i^e = \frac{\sum_{b \in i} w_b \sum_{S} S_{ws} d_{bs}^{-\alpha}}{\sum_{b \in i} w_b},$$
(6)

Where:

$Z_i^e$ = mean value of potential gravity.

$w_b$ = total population of spatial unit b completely within census tract i.

$S$ = number of services in study area.

$d_{bs}$ = distance between spatial unit centroid b and service s.

$\alpha$ = friction parameter (usually 1, 1.5 or 2).

$S_{ws}$ = weight given to the service s such as its size (for example, number of beds for a hospital).
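To make Equations 2 to 6 concrete, the following minimal sketch computes the five measures for a single census tract from block-level data. It is an illustration under stated assumptions rather than the procedure used in the study: the function names, the sample blocks and hospitals are hypothetical, distances are Euclidean for brevity, and Equation 4 is read as the mean over all services.

```python
import math


def weighted_mean(values, weights):
    """Population-weighted mean of block-level values for one census tract."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def accessibility(blocks, services, radius, n, alpha=1.0):
    """Return the five measures (Equations 2 to 6) for one census tract.

    blocks:   list of (population, x, y), units completely within the tract
    services: list of (x, y, size); `size` plays the role of the weight S_ws
    radius:   threshold for the count measure; n: number of closest services
    """
    pops = [w for w, _, _ in blocks]
    per_block = {"closest": [], "count": [], "mean_all": [],
                 "mean_n": [], "gravity": []}
    for _, bx, by in blocks:
        dist = [math.hypot(bx - sx, by - sy) for sx, sy, _ in services]
        ranked = sorted(dist)
        per_block["closest"].append(ranked[0])                          # Eq. 2
        per_block["count"].append(sum(1 for d in dist if d <= radius))  # Eq. 3
        per_block["mean_all"].append(sum(dist) / len(dist))             # Eq. 4 (as read here)
        per_block["mean_n"].append(sum(ranked[:n]) / n)                 # Eq. 5
        # Eq. 6: a zero distance would make d ** -alpha undefined;
        # the hypothetical data below avoid that case.
        per_block["gravity"].append(
            sum(size * d ** -alpha for d, (_, _, size) in zip(dist, services)))
    return {name: weighted_mean(vals, pops) for name, vals in per_block.items()}


# Hypothetical tract (metres) and two hospitals with bed counts as sizes.
blocks = [(100, 0.0, 0.0), (300, 1000.0, 0.0), (600, 1000.0, 1000.0)]
hospitals = [(0.0, 1000.0, 50), (2000.0, 1000.0, 200)]
measures = accessibility(blocks, hospitals, radius=1500.0, n=1)
```

Two sanity checks follow from the definitions: with n = 1 the mean distance to the n closest services equals the distance to the closest service, and with n equal to the number of services it equals the mean distance to all services.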

Types of distance

Four types of distance are typically used for calculating accessibility measures: Euclidean distance (straight-line), Manhattan distance (distance along two sides of a right-angled triangle opposed to the hypotenuse), shortest network distance and shortest network time (Figure 2) [28, 31]. Euclidean and Manhattan distances can easily be computed using geographic coordinates:

Figure 2. Several types of distance.

$$d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2},$$
(7)

$$d_{ij} = |x_i - x_j| + |y_i - y_j|,$$

Where:

$x_i$ and $y_i$ = X and Y coordinates of point i with a plane projection.
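As a small worked illustration of the two coordinate-based distances, the sketch below evaluates both for one hypothetical pair of points. The function names and coordinates are assumptions; the coordinates are taken to be in a projected (planar) system in metres, as the definitions above require.

```python
import math


def euclidean(p, q):
    """Straight-line distance (Equation 7)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def manhattan(p, q):
    """Sum of the absolute differences along the X and Y axes."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


# Hypothetical spatial unit centroid and service location (metres).
unit = (0.0, 0.0)
service = (3000.0, 4000.0)

d_euclidean = euclidean(unit, service)   # 5000.0
d_manhattan = manhattan(unit, service)   # 7000.0
```

For any pair of points the Manhattan distance is at least the Euclidean distance and at most about 1.41 times as large, so the two differ most when the displacement runs diagonally to the axes.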

On the other hand, calculation of network distance and network time distance – which represent respectively the shortest and fastest paths between two points using a street network – is more complex. Indeed, the computation of these two distances necessitates geometric network files – with directions, speed limits, turning restrictions, and delays available for each street segment – integrated into GIS, and a GIS module or GIS software specialized in transportation analysis (ESRI Network Analyst Extension or TransCad software, for example).

Shortest network and network time distances represent two different objectives. Shortest network distance is useful for evaluating the path between two points as if taken on foot; consequently, it is frequently used in studies on the accessibility of "proximal" services and facilities [4, 18, 20, 32–34]. Shortest time distance is more accurate for evaluating distances by car or public transportation.
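The difference between the two network objectives can be sketched with a generic shortest-path search. The example below uses Dijkstra's algorithm on a tiny hypothetical street graph; it is not the routine used by the GIS packages named above, and it ignores turning restrictions and delays. The node names, segment lengths and speeds are invented for illustration.

```python
import heapq


def dijkstra(graph, source, target, cost):
    """Least total cost from source to target; `cost(length_m, speed_kmh)` prices one segment."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, length_m, speed_kmh in graph.get(node, []):
            new = dist + cost(length_m, speed_kmh)
            if new < best.get(neighbour, float("inf")):
                best[neighbour] = new
                heapq.heappush(queue, (new, neighbour))
    return float("inf")


def length_cost(length_m, speed_kmh):
    """Cost of a segment in metres (shortest network distance)."""
    return length_m


def time_cost(length_m, speed_kmh):
    """Cost of a segment in minutes (shortest network time)."""
    return length_m / (speed_kmh * 1000.0 / 60.0)


# Hypothetical directed network: node -> [(neighbour, length in m, speed in km/h)].
streets = {
    "home": [("a", 1000.0, 30.0), ("b", 1500.0, 90.0)],
    "a": [("clinic", 1000.0, 30.0)],
    "b": [("clinic", 1500.0, 90.0)],
}

shortest_m = dijkstra(streets, "home", "clinic", length_cost)
fastest_min = dijkstra(streets, "home", "clinic", time_cost)
```

In this toy network the shortest route (2,000 m through local streets) takes 4 minutes, whereas the fastest route is 3,000 m long but takes 2 minutes, which is why the two measures suit different travel modes.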

Methodological issues and accuracy in measuring geographical accessibility

When evaluating geographical accessibility, the choice of these parameters is likely to generate different results, potentially leading to significant measurement errors [2, 28, 29].

Most studies examining the geographical accessibility of health care and health-related services have been concerned with measuring the accessibility of the closest facility using Euclidean distance [2, 13], shortest network distance [8, 18, 32], shortest network time distance [9, 11, 16, 17, 35], or a combination of distance types [10, 12, 15]. Others have also examined proximity to diversity by measuring the average shortest network distance to selected services [32], and the offer provided by the immediate surroundings, i.e. the number of services within a given distance [7, 14, 18, 32]. Few studies have conceptualized different dimensions of geographical accessibility within one investigation (for exceptions, see [14, 15, 32]), although this would be useful in order to describe the complexity of geographical accessibility of a given service [32]. Furthermore, within a given set of data, the choice of the accessibility measure is fundamental since accessibility varies depending on the indicator used [3, 4, 36].

Some studies have compared discrepancies in results when geographical accessibility was measured using different types of distances. In a study exploring trade-offs between various types of distance, Apparicio and colleagues [28] compared distance matrices based on simple (Euclidean, Manhattan) and network (road, time) measures of distance between all census tracts in Canada's eight largest metropolitan areas. They examined whether the Euclidean and Manhattan approximations are correlated with a more accurate measure of distance, i.e. travel time along the road network, at the metropolitan and census tract levels. The authors concluded that, at the metropolitan level, the use of Euclidean or Manhattan distances to estimate shortest network time does not introduce major errors. However, in sub-metropolitan areas, or areas located away from the central business district, the use of Euclidean or Manhattan approximations of shortest network time may lead to substantial errors. In measuring accessibility in these areas, network-based distance/time matrices may provide more appropriate results. Similar results were also observed [10, 12, 15].

With respect to the operational definition of residential areas, a wide variety of area units has been used, ranging from smaller units such as census meshblocks [9, 10, 16, 18], enumeration districts [11, 13] and census output areas [12] to larger units such as census tracts [7, 17, 32, 35], communities and city-defined neighbourhoods [2, 8], and wards [13–15]. Some studies have controlled for the location of the population within the spatial unit by calculating the population-weighted mean centre of the spatial unit [12, 14, 16, 17, 35] or by calculating the population-weighted mean accessibility of smaller spatial units located within the spatial unit of interest [2, 13, 32]. Nonetheless, a considerable number of studies do not employ methods for minimizing aggregation errors, i.e. they compute accessibility for the geometric centroid of the spatial unit.

For public policy and planning, measuring geographical accessibility of urban resources and facilities is of interest as it conveys information on the presence of enabling resources [37] or opportunity structures [38] in the residential environment. Geographical accessibility measures are however prone to a variety of methodological problems, among which is error induced by using different distance types [28] and aggregation methods [2, 29].

Study objectives

In this paper, we investigate differences in results when geographical accessibility of residential areas (census tracts) to selected health care services is computed using different distance types and different aggregation methods. Specific objectives are to: 1) Compare measures of distances; and 2) Estimate aggregation errors for several accessibility measures.

Data and methods

Study area and health services

This study focuses on the Montréal census metropolitan area (CMA) which has a population of about 3.4 million inhabitants. The territory of the Montréal CMA is divided into 852 census tracts, 5,829 dissemination areas and 25,767 blocks with respective average population sizes of 4,022, 588 and 133 inhabitants, as defined by Statistics Canada. A total of 642 health services grouped into eight categories were integrated into a geographic information system (ArcGIS) by geocoding addresses (Figure 3). These health services were inventoried from the website of the Ministère de la santé et des services sociaux du Québec (Quebec Ministry of Health and Social Services). Of the 642 services, 65 are located in a 10 km buffer zone around the boundaries of the Montréal CMA. These 65 services were included in order to prevent underestimation of accessibility in the bordering zones of the Montréal CMA.

Figure 3. Categories of health services for the Montréal CMA, 2006.

Comparing distance types

To explore variations in results according to distance type, we calculate the four distance types – Euclidean, Manhattan, shortest network path and shortest network time distances – between the 642 health services and the centroids of census tracts, dissemination areas and blocks. In total, more than 83 million distances are computed (Table 2), with SAS software for Euclidean and Manhattan distances, and with the Network Analyst Extension of ArcView 3.3 [39] by using CanMap Streetfiles from DMTI [40] for shortest network and shortest network time distances.

Table 2 Distances calculated between health services and spatial units

Once these four distance types are computed, correlation analyses are performed globally and locally across entire census tract, dissemination area and block matrices. First, the global analysis, which yields one value for the CMA as a whole, allows us to assess the degree of correlation between the four distance types. Then, we examine correlations between the four distances for each spatial unit centroid and the 642 health services. This local analysis stage yields one mappable value for each census tract, dissemination area and block and allows us to identify spatial variation in the degree of correlation between the four distance types.
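The distinction between the global and local analyses can be sketched as follows, assuming each distance type is stored as a matrix with one row per spatial unit and one column per service. This layout, the variable names and the numbers are assumptions made for illustration; the study itself used SAS and GIS software.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical distance matrices (km): rows = spatial units, columns = services.
euclid = [[1.0, 2.0, 3.0], [2.0, 4.0, 5.0]]
network = [[1.3, 2.6, 3.9], [2.5, 6.0, 5.5]]

# Global analysis: one coefficient over every unit-service pair.
flat_e = [d for row in euclid for d in row]
flat_n = [d for row in network for d in row]
global_r = pearson(flat_e, flat_n)

# Local analysis: one coefficient per spatial unit, across its services.
local_r = [pearson(e_row, n_row) for e_row, n_row in zip(euclid, network)]
```

In this made-up case the first unit's network distances are an exact multiple of its Euclidean distances, so its local coefficient is 1, while the second unit's coefficient is lower; mapping such per-unit values is what reveals spatial variation that a single global coefficient hides.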

Evaluating aggregation errors when measuring geographical accessibility

The same approach, i.e. global and local analyses, was used to evaluate aggregation errors for several accessibility measures at the census tract level. The global analysis involves calculating correlations between three aggregation methods: 1) the census tract centroid; 2) the population-weighted mean of the accessibility measure for dissemination areas within census tracts; and 3) the population-weighted mean of the accessibility measure for blocks within census tracts, the most accurate method. Although accessibility was computed for each of the eight categories of health services, for purposes of conciseness, results are reported only for the accessibility of general and specialized care, i.e. hospitals (n = 56), for census tracts. It is worth noting that similar patterns of correlation were observed for other health services.

Results

Correlations between the four types of distances

Global correlations

Table 3 presents results for global correlation coefficients between the four types of distances computed for the entire sample of health services (n = 642). Three observations can be made. First, at the metropolitan scale, independently of the type of distance used, results are globally similar as indicated by high correlation coefficient values (greater than 0.95). Second, in comparison with Manhattan distance, Euclidean distance is most strongly correlated with the more accurate network path and time distances. Thus, if it is impossible to compute network distance in a study focussing on geographical accessibility in the Montréal CMA, Euclidean distance seems preferable to Manhattan distance. These results are in line with those of Apparicio et al. [28] for eight Canadian metropolitan areas (Toronto, Montréal, Vancouver, Ottawa-Hull, Calgary, Edmonton, Québec and Winnipeg), and with those of Fone et al. [12] for Caerphilly county borough in Wales. Finally, as expected, correlations between both network distances are almost perfect (0.992). This means that if directions and speed limits are unknown for computing the shortest network time, the shortest network distance is a very reliable alternative.

Table 3 Global Pearson correlations between alternative types of distance

Local correlations

Although global correlations are high, they are not perfect (values differ from one). For this reason, local variations at the intra-metropolitan scale must exist and should be examined in detail. Figure 4 presents local Pearson coefficients between Euclidean distance and shortest network time, and between Euclidean and Manhattan distances for the geographical accessibility of the 642 health services computed from the centroids of census tracts, dissemination areas, and blocks.

Figure 4. Comparing alternative types of distance between spatial units and health services using local Pearson correlations.

Results show similar spatial patterns for the three spatial scales (census tract, dissemination area and block): with increasing distance from the central business district, correlations are reduced between Euclidean distance and shortest network time, and between Euclidean and Manhattan distances. For all spatial units in the centre of the Island of Montréal and those located on the south shore, correlations are higher. For those located on the periphery of the CMA, notably on the north shore, characterized by suburban areas, correlations are weaker, often lower than 0.9.

These results illustrate that for the Island of Montréal, integrating Euclidean distances at the census tract, dissemination area and block levels into statistical analysis, e.g. in regression or multilevel analysis, would yield similar results to those obtained if network distances were computed. However, if the focus is on the CMA as a whole, or on specific parts of the CMA, namely, those located in the northern suburbs, then results are likely to vary as a function of the distance type used to compute geographical accessibility.

Aggregation errors

Global errors

The global analysis of aggregation errors is performed by means of Spearman correlations between the three methods of aggregation used to calculate 20 accessibility measures at the census tract level using the more accurate distances (network distances). Results are shown in Table 4 for hospitals only, although similar patterns of correlation were observed for other health services.
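A Spearman correlation between two aggregation methods can be sketched as the Pearson correlation of ranks, with tied values given their average rank. The code below is a minimal illustration with hypothetical tract-level values for the number of hospitals within a threshold distance; the function names and data are assumptions, not the study's outputs.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman(xs, ys):
    """Spearman rank correlation (Pearson correlation of average ranks)."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical counts of hospitals within a threshold, one value per census tract.
centroid_count = [0, 1, 0, 2, 0]             # census tract centroid method
block_count = [0.2, 0.8, 0.0, 1.5, 0.4]      # population-weighted mean over blocks
rho = spearman(centroid_count, block_count)
```

Count measures taken at a single centroid produce many tied values, whereas the block-based means are nearly continuous, so handling ties explicitly matters for this kind of comparison.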

Table 4 Spearman rank correlations between measures of the accessibility of hospitals by aggregation method

Correlations between the three aggregation methods are high (>0.9) for all measures of accessibility except for the number of services within 500, 1000 and 2000 metres. For example, correlation between the least exact aggregation method (census tract centroid) and the most exact based on blocks within census tracts is 0.588 for the number of hospitals within 500 metres, 0.776 for those within 1000 metres, and 0.898 for those within 2000 metres. This means that if we want to assess service provision in a close-proximity area around a census tract, it is preferable to use an aggregation method that precisely accounts for the distribution of population within it; if not, the risk of error might be considerable.

Local errors

A second stage of comparison of aggregation methods consists in assessing the absolute difference between the geographical accessibility results obtained from the methods based on census tract and dissemination area centroids in relation to the most accurate method based on blocks within census tracts. The descriptive statistics for local errors are reported in Table 5 for hospitals.
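The local error calculation amounts to an absolute difference per census tract followed by descriptive statistics. The sketch below illustrates this with hypothetical distances to the closest hospital; the values are invented and do not reproduce Table 5.

```python
import statistics

# Hypothetical distance to the closest hospital (m), one value per census tract.
centroid_based = [1200.0, 800.0, 3000.0, 450.0, 2100.0]   # census tract centroid
block_based = [1350.0, 700.0, 2900.0, 600.0, 2500.0]      # block-based reference

# Absolute aggregation error of the centroid method relative to the reference.
abs_errors = [abs(c - b) for c, b in zip(centroid_based, block_based)]

mean_error = statistics.mean(abs_errors)
median_error = statistics.median(abs_errors)
quartiles = statistics.quantiles(abs_errors, n=4)   # three cut points
worst_tract = max(range(len(abs_errors)), key=lambda k: abs_errors[k])
```

Reporting upper percentiles as well as the mean is useful here because, as in the results that follow, a small share of tracts can carry errors much larger than the average.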

Table 5 Aggregation errors in measures of the accessibility of hospitals at the census tract level

Not surprisingly, the local errors are on the whole quite small, though not insignificant: for example, compared with the most accurate method, the census tract centroid method misestimates the distance to the closest hospital by an average of 365 m, and the dissemination area method by an average of 134 m. Up to the third quartile (75%), the local errors are still quite small: for 75% of census tracts, the error associated with the census tract centroid approach is less than 365 m. However, in 10% of cases, the error is greater than 948 m, and in 5% of census tracts the error is greater than 1.5 km (Table 5). Despite the high correlations, significant errors in the measurement of geographical accessibility can occur in a small number of cases.

Absolute differences between aggregation methods for the closest hospital computed using shortest network distance and shortest network time are further mapped in Figure 5. Again, stronger absolute aggregation errors are observed in suburban areas on the south and north shores of the CMA; errors remain smaller in central areas of the Island of Montréal.

Figure 5. Evaluating local aggregation errors for hospitals.

For the purposes of statistical studies at a general level, the least precise aggregation method – based on census tract centroids – is adequate: it enables the broad identification of areas in Montréal that have the least access to health services. However, if one wishes to reach more precise conclusions for specific neighbourhoods, major errors appear for 5% to 10% of census tracts.

Conclusion

Over the past two decades, an increasing number of health studies have integrated the geographical accessibility of services and facilities as an important dimension of the built urban environment. The development of GIS with transportation modules (ESRI Network Analyst Extension, for example) has largely fostered this increase. However, the results reported in this paper show that measures of geographical accessibility of urban health services may vary according to the distance type and aggregation method selected.

Although Euclidean and Manhattan distances are strongly correlated with network distances, local variations are nonetheless observed, notably in suburban areas. The choice of the aggregation method is also important: accessibility measures computed from census tract centroids, though not inaccurate, yield important measurement errors for 5% to 10% of census tracts. This is especially so in census tracts with lower population density and in those where the land use is largely non-residential. Because the accessibility of health services may be more problematic in suburban areas than in more central urban areas, geographical accessibility studies should be based on the most accurate measures. Thus, using the smallest area unit possible included in the spatial unit of interest appears to be a relevant alternative for minimizing aggregation errors.

Results obtained for Montréal (comparison of the four types of distances and evaluation of aggregation errors) may be generalisable to other North American cities where urban forms are similar. For example, in a study comparing types of distances, Apparicio and colleagues observed similar results for the eight largest Canadian metropolitan areas [28]. Moreover, results can also be extended to other services and amenities (not only health services), although the magnitude of aggregation errors is likely to vary [2].

Computing accessibility measures in GIS using network distances and more precise aggregation methods is no longer a daunting task. Nowadays, software and modules dedicated to network analysis are effective and user-friendly, notably the Network Analyst Extension of ArcGIS or the TransCad software. Moreover, street network data are easily accessible (for example, Statistics Canada or DMTI data). Because the computing speed of computers has increased exponentially, the time required for the computation of numerous network distances is no longer a limitation. Consequently, although errors associated with the choice of distance types are important for only about 10% of census tracts, we should not avoid using the best estimation method possible for evaluating geographical accessibility. This is especially so if accessibility measures to health services or health-related resources are to be included as a dimension of the built environment in studies investigating residential area effects on health outcomes. Imprecision of accessibility measures could lead to errors or lack of precision in the estimation of area effects on health.

Future studies should investigate the extent of aggregation errors occurring when measuring accessibility to other types of services and amenities, in other cities where urban form may differ from that of North American cities, and in rural contexts where geographical accessibility is an important issue.