Feature selection for regression problems based on the Morisita estimator of intrinsic dimension

Published: 01 October 2017

Highlights

• A new supervised filter for regression problems is proposed.
• The filter uses the newly introduced Morisita estimator of intrinsic dimension.
• The filter distinguishes between relevant, irrelevant and redundant features.
• The filter is comprehensively validated using real and simulated datasets.
• A generic methodology for validating and comparing filters is suggested.

Abstract

Data acquisition, storage and management have improved considerably, while the key factors driving many phenomena remain poorly understood. Consequently, irrelevant and redundant features artificially inflate the size of datasets, which complicates learning tasks such as regression. Feature selection methods have been proposed to address this problem. This paper introduces a new supervised filter based on the Morisita estimator of intrinsic dimension. It can identify relevant features and distinguish between redundant and irrelevant information. It also offers a clear graphical representation of the results and can easily be implemented in different programming languages. Comprehensive numerical experiments are conducted using simulated datasets characterized by different levels of complexity, sample size and noise. The suggested algorithm is also successfully tested on a selection of real-world applications and compared with RReliefF using extreme learning machine. In addition, a new measure of feature relevance is presented and discussed.
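
To make the approach concrete, below is a minimal sketch in Python of the two ingredients the abstract describes: an m = 2 Morisita estimate of intrinsic dimension (ID), and a forward search that keeps the feature whose addition most closes the ID gap between the candidate features with and without the output. This is not the authors' reference implementation (that is the IDmining R package); the function names, the grid scales and the stopping tolerance are illustrative assumptions.

```python
# Minimal sketch, assuming m = 2 and a unit-hypercube rescaling.
# Not the paper's exact algorithm; scales and tolerance are illustrative.
import numpy as np

def morisita_id(X, scales=(2, 4, 8, 16, 32)):
    """Morisita estimate of the intrinsic dimension (ID) of X (N x E).

    For l grid divisions per axis (cell size delta = 1/l, Q = l**E cells,
    n_i points in cell i), the m = 2 Morisita index is
        I_2(delta) = Q * sum_i n_i*(n_i - 1) / (N*(N - 1)),
    and I_2(delta) ~ delta**(D - E) for data of ID D, so
        D_hat = E + slope of log I_2(delta) versus log delta.
    """
    X = np.asarray(X, dtype=float)
    N, E = X.shape
    span = np.ptp(X, axis=0)
    span[span == 0] = 1.0                    # guard against constant columns
    X = (X - X.min(axis=0)) / span           # rescale to the unit hypercube
    log_delta, log_I = [], []
    for l in scales:
        # Grid-cell index of each point along every axis.
        cells = np.minimum((X * l).astype(int), l - 1)
        _, counts = np.unique(cells, axis=0, return_counts=True)
        I2 = (l ** E) * np.sum(counts * (counts - 1)) / (N * (N - 1))
        if I2 > 0:                           # skip scales where all cells hold <= 1 point
            log_delta.append(np.log(1.0 / l))
            log_I.append(np.log(I2))
    slope = np.polyfit(log_delta, log_I, 1)[0]
    return E + slope

def morisita_filter(X, y, tol=0.05):
    """Forward selection driven by the dissimilarity
    diss(F) = ID(F with y appended) - ID(F):
    near 0 when y is functionally tied to F, near 1 for pure noise."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    remaining, selected, history = list(range(X.shape[1])), [], []
    best = 1.0                               # dissimilarity of the empty set (pure noise)
    while remaining:
        scores = {}
        for j in remaining:
            F = X[:, selected + [j]]
            scores[j] = morisita_id(np.hstack([F, y])) - morisita_id(F)
        j_star = min(scores, key=scores.get)
        if selected and best - scores[j_star] < tol:
            break                            # no meaningful drop left: stop adding features
        selected.append(j_star)
        remaining.remove(j_star)
        best = scores[j_star]
        history.append((j_star, best))
    return selected, history
```

The returned history (the dissimilarity after each step) can be plotted against the selection step, giving a graphical summary in the spirit of the representation mentioned in the abstract; the per-step drop in dissimilarity is one way to read off a rough relevance score for each selected feature.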

Cited By

• (2021) Unsupervised Learning of High Dimensional Environmental Data Using Local Fractality Concept. In: Pattern Recognition. ICPR International Workshops and Challenges, pp. 130-138. https://doi.org/10.1007/978-3-030-68780-9_13. Online publication date: 10-Jan-2021.
• (2019) Ensemble feature selection for high-dimensional data: a stability analysis across multiple domains. Neural Computing and Applications, 32(10), pp. 5951-5973. https://doi.org/10.1007/s00521-019-04082-3. Online publication date: 25-Feb-2019.


    Published In

Pattern Recognition, Volume 70, Issue C
    October 2017
    152 pages

Publisher

Elsevier Science Inc., United States

    Author Tags

    1. Data mining
    2. Feature selection
    3. Intrinsic dimension
    4. Measure of relevance
    5. Morisita index

    Qualifiers

    • Research-article
