Abstract
As dataset sizes continue to grow, feature (attribute) selection has become a necessary step in tackling the resulting intractability. Indeed, as the number of dimensions increases, the number of data instances required to generate accurate models increases exponentially. Fuzzy-rough set-based feature selection techniques offer great flexibility when dealing with real-valued and noisy data; however, most current approaches focus on the supervised domain, where the data object labels are known. Very little work has been carried out using fuzzy-rough sets in unsupervised or semi-supervised learning. This paper proposes a novel approach for semi-supervised fuzzy-rough feature selection, where the object labels in the data may be only partially present. The approach also has the appealing property that any generated subset is also a valid (super)reduct when the whole dataset is labelled. The experimental evaluation demonstrates that the proposed approach can generate stable and valid subsets even when up to 90% of the data object labels are missing.
Notes
1. When \(B = \{a\}\), i.e., \(B\) is a singleton, \(R_a\) is written rather than \(R_{\{a\}}\).
2. A t-norm \(\mathcal{T}\) is an increasing, commutative, associative \([0,1]^2 \rightarrow [0,1]\) mapping satisfying \(\mathcal{T}(x,1) = x\) for \(x \in [0,1]\).
3. An implicator \(\mathcal{I}\) is a \([0,1]^2 \rightarrow [0,1]\) mapping that is decreasing in its first and increasing in its second argument, satisfying \(\mathcal{I}(0,0)=\mathcal{I}(0,1)=\mathcal{I}(1,1)=1\) and \(\mathcal{I}(1,0)=0\).
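The t-norm and implicator definitions in these notes can be checked numerically. The sketch below is only an illustration: the operators it uses (minimum and Łukasiewicz t-norms, Łukasiewicz and Kleene-Dienes implicators) are standard textbook examples chosen here as assumptions, not operators the notes themselves name, and the helper names `is_t_norm` and `is_implicator` are hypothetical. The checks run on a small grid of exact fractions, so they test the axioms on sample points rather than prove them.

```python
from fractions import Fraction
from itertools import product

# Small grid of exact values in [0, 1]; Fractions avoid floating-point noise.
GRID = [Fraction(k, 4) for k in range(5)]

# Example operators (assumed for illustration, not taken from the paper).
def t_min(x, y):
    return min(x, y)

def t_lukasiewicz(x, y):
    return max(Fraction(0), x + y - 1)

def i_lukasiewicz(x, y):
    return min(Fraction(1), 1 - x + y)

def i_kleene_dienes(x, y):
    return max(1 - x, y)

def is_t_norm(t, grid=GRID):
    """Check the t-norm axioms on every grid point."""
    for x, y, z in product(grid, repeat=3):
        if t(x, y) != t(y, x):                 # commutative
            return False
        if t(t(x, y), z) != t(x, t(y, z)):     # associative
            return False
        if x <= y and t(x, z) > t(y, z):       # increasing
            return False
    return all(t(x, Fraction(1)) == x for x in grid)  # T(x, 1) = x

def is_implicator(i, grid=GRID):
    """Check monotonicity and boundary conditions on every grid point."""
    for x, y, z in product(grid, repeat=3):
        if x <= y and i(x, z) < i(y, z):       # decreasing in first argument
            return False
        if x <= y and i(z, x) > i(z, y):       # increasing in second argument
            return False
    zero, one = Fraction(0), Fraction(1)
    return (i(zero, zero) == i(zero, one) == i(one, one) == 1
            and i(one, zero) == 0)

if __name__ == "__main__":
    print(is_t_norm(t_min), is_t_norm(t_lukasiewicz))
    print(is_implicator(i_lukasiewicz), is_implicator(i_kleene_dienes))
```

As a negative example, the probabilistic sum \(x + y - xy\) fails the neutral-element condition \(\mathcal{T}(x,1) = x\), so `is_t_norm` rejects it.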
Acknowledgment
Neil Mac Parthaláin would like to acknowledge the financial support for this research through NISCHR (National Institute for Social Care and Health Research) Wales, Grant reference: RFS-12-37. Sarah Vluymans is supported by the Special Research Fund (BOF) of Ghent University. Chris Cornelis was partially supported by the Spanish Ministry of Science and Technology under the project TIN2011-28488 and the Andalusian Research Plans P11-TIC-7765, P10-TIC-6858 and P12-TIC-2958.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Jensen, R., Vluymans, S., Parthaláin, N.M., Cornelis, C., Saeys, Y. (2015). Semi-Supervised Fuzzy-Rough Feature Selection. In: Yao, Y., Hu, Q., Yu, H., Grzymala-Busse, J.W. (eds) Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing. Lecture Notes in Computer Science, vol 9437. Springer, Cham. https://doi.org/10.1007/978-3-319-25783-9_17
DOI: https://doi.org/10.1007/978-3-319-25783-9_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25782-2
Online ISBN: 978-3-319-25783-9
eBook Packages: Computer Science (R0)