Abstract
Mixture models in reliability offer a useful compromise between parametric and nonparametric models when several failure modes are suspected. Classical estimation methods for mixture models rarely handle the additional difficulty that lifetime data are often censored, either deterministically or at random. In this paper we present several iterative methods, based on the EM and Stochastic EM methodologies, for estimating parametric or semiparametric mixture models from randomly right-censored lifetime data, provided the models are identifiable. We consider different levels of completion for the (incomplete) observed data, and provide genuine or EM-like algorithms for several situations. In particular, we show that simulating the missing data coming from the mixture allows us to plug a standard R package for survival data analysis into the M-step of an EM algorithm. Moreover, in censored semiparametric situations, a stochastic step is the only practical solution allowing computation of nonparametric estimates of the unknown survival function. The effectiveness of the newly proposed algorithms is demonstrated in simulation studies and on an actual dataset from the aeronautic industry.
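To make the general idea concrete, the following is a minimal sketch of one stochastic EM variant for randomly right-censored data. It is not the authors' algorithm. It assumes a two-component exponential mixture and simulates only the missing component labels, whereas the paper considers several levels of data completion. The function name `stochastic_em_exp_mixture`, its parameters, the starting values, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def stochastic_em_exp_mixture(x, d, n_iter=300, burn_in=100, seed=0):
    """Illustrative stochastic EM for a two-component exponential mixture.

    x : observed times min(T, C), all positive
    d : 1 if the failure was observed, 0 if right censored
        (at least one observed failure is assumed)
    Returns the averages of the post-burn-in weights and rates.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)

    # Crude start (an assumption): equal weights, rates on either side
    # of the pooled censored-data exponential estimate.
    pooled = d.sum() / x.sum()
    lam = np.array([0.5, 0.5])
    rate = np.array([0.5 * pooled, 2.0 * pooled])
    lam_trace, rate_trace = [], []

    for _ in range(n_iter):
        # E-step: posterior component probabilities. An observed failure
        # contributes the density, a censored time the survival function.
        log_contrib = d[:, None] * np.log(rate)[None, :] - x[:, None] * rate[None, :]
        log_post = np.log(lam)[None, :] + log_contrib
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)

        # S-step: simulate a component label for every observation.
        z = (rng.random(x.size) > post[:, 0]).astype(int)

        # M-step: censored-data exponential MLE within each simulated
        # component. The update is skipped if a component is empty or
        # contains no observed failure.
        counts = np.array([(z == j).sum() for j in (0, 1)])
        events = np.array([d[z == j].sum() for j in (0, 1)])
        if counts.min() > 0 and events.min() > 0:
            exposure = np.array([x[z == j].sum() for j in (0, 1)])
            lam = counts / x.size
            rate = events / exposure

        lam_trace.append(lam.copy())
        rate_trace.append(rate.copy())

    # Stochastic EM yields a Markov chain, so the draws are averaged.
    return (np.mean(lam_trace[burn_in:], axis=0),
            np.mean(rate_trace[burn_in:], axis=0))


# Usage on simulated data (all settings are arbitrary illustrations).
rng = np.random.default_rng(1)
n = 500
from_first = rng.random(n) < 0.4
t = np.where(from_first, rng.exponential(5.0, n), rng.exponential(0.5, n))
c = rng.exponential(10.0, n)      # independent random censoring times
x = np.minimum(t, c)              # observed times
d = (t <= c).astype(int)          # 1 = failure observed, 0 = censored
lam_hat, rate_hat = stochastic_em_exp_mixture(x, d)
print(lam_hat, rate_hat)
```

Because the labels are simulated rather than averaged over, the M-step reduces to fitting an ordinary censored-data model separately within each component. This is the kind of step that could be delegated to standard survival analysis software.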
Notes
Centre de Calcul Scientifique en région Centre, http://cascimodot.fdpoisson.fr/?q=ccsc.
The authors thank the Turbomeca Company http://www.turbomeca.com that allowed us to use these data.
Cite this article
Bordes, L., Chauveau, D. Stochastic EM algorithms for parametric and semiparametric mixture models for right-censored lifetime data. Comput Stat 31, 1513–1538 (2016). https://doi.org/10.1007/s00180-016-0661-7
DOI: https://doi.org/10.1007/s00180-016-0661-7