Abstract
We propose a method for variable selection in discriminant analysis with mixed continuous and binary variables. The method is based on a criterion that reduces the variable selection problem to that of estimating a suitable permutation and a dimensionality. We then propose estimators for these parameters and show that the resulting variable selection method is consistent. A simulation study examines several properties of the proposed approach and compares it with an existing method, and an example on a real data set is provided.
Acknowledgements
We are very grateful to two anonymous referees for their helpful and constructive comments, which led to a much improved manuscript. Research by Alban Mbina Mbina was supported in part by the Agence Universitaire de la Francophonie (AUF).
Cite this article
Mbina Mbina, A., Nkiet, G.M. & Eyi Obiang, F. Variable selection in discriminant analysis for mixed continuous-binary variables and several groups. Adv Data Anal Classif 13, 773–795 (2019). https://doi.org/10.1007/s11634-018-0343-0