Abstract
Among the many tools suited to detecting local clusters in group-level data, Kulldorff–Nagarwalla’s spatial scan statistic has gained wide popularity (Kulldorff and Nagarwalla in Stat Med 14(8):799–810, 1995). The assumptions needed to make statistical inference feasible are quite strong: counts in spatial units are assumed to be independent Poisson-distributed random variables. In practice, outcomes in spatial units are often not independent of each other, and risk estimates for areas that are close to each other tend to be positively correlated because they share a number of spatially varying characteristics. We therefore introduce a Bayesian model-based algorithm for cluster detection in the presence of spatially autocorrelated relative risks. Our approach has been made possible by the recent development of numerical methods based on integrated nested Laplace approximation, which let us directly compute very accurate approximations of posterior marginals in a short computational time (Rue et al. in JRSS B 71(2):319–392, 2009). Simulated data and a case study show that the performance of our method is at least comparable to that of Kulldorff–Nagarwalla’s statistic.
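For readers unfamiliar with the classical statistic that the abstract takes as its benchmark, the following is a minimal sketch of the Poisson log-likelihood ratio that the scan statistic maximises over candidate zones. It illustrates the independent-Poisson baseline only, not the Bayesian INLA procedure proposed in the paper. The function names, the toy counts and the hand-picked list of candidate zones are all assumptions made for illustration; expected counts are assumed to be scaled so that they sum to the total number of observed cases.

```python
import math


def poisson_scan_llr(cases_in, expected_in, total_cases):
    """Poisson log-likelihood ratio for one candidate zone.

    Assumes expected counts are scaled so that they sum to total_cases
    over the whole study region (illustrative convention).
    """
    c, e, C = cases_in, expected_in, total_cases
    if not 0 < e < C:
        raise ValueError("expected_in must lie strictly between 0 and total_cases")
    # Only zones whose inside risk exceeds the outside risk are of interest.
    if c / e <= (C - c) / (C - e):
        return 0.0

    def xlogy(x, y):
        # Convention 0 * log(0) = 0, needed when all cases fall inside the zone.
        return 0.0 if x == 0 else x * math.log(y)

    return xlogy(c, c / e) + xlogy(C - c, (C - c) / (C - e))


def most_likely_cluster(cases, expected, zones):
    """Return (zone, llr) for the candidate zone with the largest LLR.

    `zones` is a hypothetical list of tuples of area indices; a real scan
    would generate them, for example as circular windows around centroids.
    """
    total = sum(cases)
    best_zone, best_llr = None, 0.0
    for zone in zones:
        c = sum(cases[i] for i in zone)
        e = sum(expected[i] for i in zone)
        llr = poisson_scan_llr(c, e, total)
        if llr > best_llr:
            best_zone, best_llr = zone, llr
    return best_zone, best_llr


# Toy data: five areas, with an excess of cases in areas 2 and 3.
cases = [2, 3, 12, 10, 3]
expected = [6.0, 6.0, 6.0, 6.0, 6.0]
zones = [(0,), (1,), (2,), (3,), (4,), (0, 1), (2, 3), (3, 4)]
zone, llr = most_likely_cluster(cases, expected, zones)
print(zone, round(llr, 3))
```

In the classical method the significance of the maximising zone is assessed by Monte Carlo replication under the null hypothesis, a step omitted here. The sketch treats areas as independent, which is exactly the assumption the abstract calls into question.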
References
Aamodt G, Samuelsen SO, Skrondal A (2006) A simulation study of three methods for detecting disease clusters. Int J Health Geogr 5:15
Azzalini A, Capitanio A (1999) Statistical applications of the multivariate skew normal distribution. JRSS B 61(3):579–602
Barbuti S, Quarto M, Germinario C, Prato R, Lopalco P, Coviello E, Caputi G, Martinelli D, Tafuri S, Balducci MT, Lamarina L, Fortunato F, Berardino R, Arbore AM, Palma MD (2006) Atlante delle Cause di Morte della Regione Puglia: Anni 2000–2005. Regione Puglia—Osservatorio Epidemiologico della Regione Puglia
Besag J, York J, Mollié A (1991) Bayesian image restoration, with two applications in spatial statistics. Ann Inst Stat Math 43(1):1–20
Bilancia M, Fedespina A (2009) Geographical clustering of lung cancer in the province of Lecce, Italy: 1992–2001. Int J Health Geogr 8(1):40
Congdon P (2006) Bayesian statistical modelling, 2nd edn. Wiley, New York
Dellaportas P, Forster J, Ntzoufras I (2002) On Bayesian model and variable selection using MCMC. Stat Comput 12(1):27–36
Duczmal L, Kulldorff M, Huang L (2006) Evaluation of spatial scan statistics for irregularly shaped clusters. J Comput Graph Stat 15(2):428–442
Eberly L, Carlin BP (2000) Identifiability and convergence issues for Markov chain Monte Carlo fitting of spatial models. Stat Med 19:2279–2294
Fong Y, Rue H, Wakefield J (2010) Bayesian inference for generalized linear mixed models. Biostatistics 11(3):397–412
Gelfand AE, Sahu SK (1999) Identifiability, improper priors, and Gibbs sampling for generalized linear models. JASA 94(445):247–253
Gómez-Rubio V, Ferrándiz-Ferragud J, López-Quílez A (2005) Detecting clusters of disease with R. J Geogr Syst 7(2):189–206
Held L, Schrödle B, Rue H (2010) Posterior and cross-validatory predictive checks: a comparison of MCMC and INLA. In: Kneib T, Tutz G (eds) Statistical modelling and regression structures. Festschrift in Honour of Ludwig Fahrmeir, pp 91–110. Physica-Verlag
Kleinman K, Lazarus R, Platt R (2004) A generalized linear mixed model approach for detecting incident clusters for disease in small areas, with application to biological terrorism. Am J Epidemiol 159(3):217–224
Kulldorff M (1997) A spatial scan statistic. Commun Stat Theory Methods 26(6):1481–1496
Kulldorff M, Nagarwalla N (1995) Spatial disease clusters: detection and inference. Stat Med 14(8):799–810
Lawson A, Biggeri A, Boehning D, Lesaffre E, Viel J, Clark A, Schlattmann P, Divino F (2000) Disease mapping models: an empirical evaluation. Disease mapping collaborative group. Stat Med 19(17–18):2217–2241
Loh JM, Zhu H (2007) Accounting for spatial correlation in the spatial scan statistics. Ann Appl Stat 1(2):560–584
Martino S, Rue H (2010a) Case studies in Bayesian computation using INLA. In: Mantovan P, Secchi P (eds) Complex data modeling and computationally intensive statistical methods, contributions to statistics. Springer, Milan, pp 99–114
Martino S, Rue H (2010b) Implementing approximate Bayesian inference using integrated nested Laplace approximation: a manual for the inla program. http://www.math.ntnu.no/~hrue/GMRFsim/manual.pdf
Ntzoufras I (2009) Bayesian modeling using WinBUGS. Wiley, New York (ISBN 978-0-47014-1144)
Patil GP, Taillie C (2004) Upper level set scan statistic for detecting arbitrarily shaped hotspots. Environ Ecol Stat 11:183–197
Plummer M (2008) Penalized loss functions for Bayesian model comparison. Biostatistics 9(3):523–539
Richardson S, Thomson A, Best N, Elliott P (2004) Interpreting posterior relative risk estimates in disease-mapping studies. Environ Health Perspect 112(9):1016–1025
R Development Core Team (2008) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria (ISBN 3-900051-07-0)
Rue H, Held L (2005) Gaussian Markov random fields: theory and applications. Chapman & Hall/CRC, London
Rue H, Martino S, Chopin N (2009) Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. JRSS B 71(2):319–392
Schrödle B, Held L (2011) A primer on disease mapping and ecological regression using INLA. Comput Stat 26(2):241–258
Song JJ, De Oliveira V (2012) Bayesian model selection in spatial lattice models. Stat Methodol 9(1–2):228–238
Spiegelhalter DJ, Best NG, Carlin BP, van der Linde A (2002) Bayesian measures of model complexity and fit. JRSS B 64(4):583–639
Tierney L, Kadane JB (1986) Accurate approximations for posterior moments and marginal densities. JASA 81(393):82–86
Wakefield J (2007) Disease mapping and spatial regression with count data. Biostatistics 8(2):158–183
Yiannakoulias N, Rosychuk RJ, Hodgson J (2007) Adaptations for finding irregularly shaped disease clusters. Int J Health Geogr 6(1):28
Zhang Z, Assunção R, Kulldorff M (2010) Spatial scan statistics adjusted for multiple clusters. J Prob Stat 11. doi:10.1155/2010/642379 (Article ID 642379)
Acknowledgments
Massimo Bilancia conceived the study and wrote Sects. 1–5. Giacomo Demarinis wrote Sect. 7 and the software for data analysis. Sections 6 and 8 were written jointly. Both authors read and approved the final manuscript. We wish to thank Claudia Monte, PhD, Department of Physics, University of Bari Aldo Moro, and Maria Rosa Debellis, Department of Basic Medical Sciences, Neuroscience and Sense Organs, University of Bari Aldo Moro, for their support. We are also grateful to the two anonymous referees for their valuable reviews and contributions. For better visualisation, the cartographic map of the province of Foggia shown in Fig. 5 is a simplified version of the original \(\hbox {ESRI}^{\mathrm{TM}}\) shapefile provided by Istat. Some polygons whose geographical boundaries lie entirely within the boundaries of another municipality (enclaves) have been deleted, without modifying the spatial contiguity structure. All registered trademarks and trademarks appearing in this paper, identified by the symbols \({\circledR }\) and \(^\mathrm{TM}\) respectively, are the property of their respective owners. SaTScan\(^\mathrm{TM}\) is a trademark of Martin Kulldorff. The SaTScan\(^\mathrm{TM}\) software was developed under the joint auspices of (i) Martin Kulldorff, (ii) the National Cancer Institute, and (iii) Farzad Mostashari of the New York City Department of Health and Mental Hygiene. A free suite of R functions has been developed for computing the Bayesian cluster detection procedure described in this paper. The code is available upon request under the MIT license.
About this article
Cite this article
Bilancia, M., Demarinis, G. Bayesian scanning of spatial disease rates with integrated nested Laplace approximation (INLA). Stat Methods Appl 23, 71–94 (2014). https://doi.org/10.1007/s10260-013-0241-8
DOI: https://doi.org/10.1007/s10260-013-0241-8