Abstract
Feature selection is an important task in the areas of pattern recognition and data mining. Various approaches to feature selection have been developed. In particular, this paper focuses on the algorithms for computing irreducible testors, which have been used to solve feature selection problems. The calculation of irreducible testors is an expensive computational process; the complexity of the algorithms to calculate the complete set of irreducible testors exponentially depends on the number of characteristics that describe the objects in the problem. To improve the execution time of these algorithms, different alternatives have been developed, such as parallel implementations, hardware-software implementation, rearrangement of the data, as well as heuristics to generate just an irreducible testor or a subset of the entire set of irreducible testors, among other strategies. This paper presents a review of the literature on irreducible testors, with the aim of providing a guide for researchers working in the areas of pattern recognition and data mining, interested in feature selection, using heterogeneous data and possibly missing data.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Aguila L, Ruiz J (1984) MB algorithm for k-valued information elaboration on pattern recognition problems. Ciencias Mat J V(3):89–101
Alba-Cabrera E, Godoy-Calderon S, Ibarra-Fiallo J (2016) Generating synthetic test matrices as a benchmark for the computational behavior of typical testor-finding algorithms. Pattern Recogn Lett 80:46–51
Alba-Cabrera E, Godoy-Calderon S, Lazo-Cortés MS, Martínez-Trinidad JF, Carrasco-Ochoa JA (2019) On the relation between the concepts of irreducible testor and minimal transversal. IEEE Access 7:82809–82816
Alba-Cabrera E, Ibarra-Fiallo J, Godoy-Calderon S (2013) A theoretical and practical framework for assessing the computational behavior of typical testor-finding algorithms. In: 18th Iberoamerican congress CIARP. LNCS, vol. 8258. Springer, New York, pp 351–358
Alba-Cabrera E, Ibarra-Fiallo J, Godoy-Calderon S, Cervantes-Alonso F (2014) YYC: a fast performance incremental algorithm for finding typical testors. In: 19th Iberoamerican congress CIARP. LNCS, vol. 8827. Springer, New York, pp 416–423
Alba E, Santana R, Ochoa A, Lazo M (2000) Finding typical testors by using an evolutionary strategy. In: Proceedings of V Iberoamerican workshop on pattern recognition, pp 267–278
Armstrong DB (1966) On finding a nearly minimal set of fault detection tests for combinatorial logic nets. IEEE Trans Electron Comput 15:66–73
Asaithambi A, Valev V (2004) Construction of all non-reducible descriptors. Pattern Recogn 37(9):1817–1823
Aslanyan L, Ryazanov V, Sahakyan H (2015) Testor and logic separation in pattern recognition. Math Problems Comput Sci 44:33–41
Ayaquica-Martinez I, Jimenez-Jacinto V (1997) A new algorithm of outer scale for the generation of typical testors. In: Proceedings of the Iberoamerican workshop on pattern recognition (TIARP 97), pp 141–148
Borja-Cazales D, Diaz-Garcia M (2014) Differential evolution and particle swarm algorithms for calculate typical testors. BSc thesis, National Polytechnic Institute, Mexico
Bravo A (1983) Algorithm CT for compute of typical test of a k-valued matrix. Ciencias Mat J IV(2):123–144
Carrasco-Ochoa J, Martinez-Trinidad J (2004) Feature selection for natural disaster texts classification using testors. In: Proc. of fifth int. conf. on intelligent data engineering and automated learning. LCNS, vol. 3177. Springer, New York, pp 424–429
Chegis IA, Yablonskii SV (1955) On tests for electric circuits. Uspieji Matematicheskij Nauk 4(66):182–184
Chikalov I, Lozin V, Lozina I, Moshkov M, Nguyen HS, Skowron A, Zielosko B (2012) Three approaches to data analysis: test theory, rough sets and logical analysis of data, vol 41. Springer Science & Business Media, New York, pp 3–38
Cumplido R, Carrasco-Ochoa A, Feregrino C (2006) On the design and implementation of a high performance configurable architecture for testor identification. In: Proc. 10th Iberoamerican congress CIARP. LNCS, vol 4225. Springer, New York, pp 665–673
Dmitriev A, Zhuravlev I, Krendeliev F (1966) About mathematical principles and phenomena classification. Diskretni Analiz 7:3–15
Eldred RD (1959) Test routines based on symbolic logic statements. J ACM 6(1):33–36
Freeman C, Kulic D, Basir O (2015) An evaluation of classifier-specific filter measure performance for feature selection. Pattern Recogn 48(5):1812–1826
Gallegos-Acosta A (2018) Identification of risk factors in medical pathologies by means of feature selection characteristics. MsC. Thesis, Autonomous University of Aguascalientes, Mexico
Gallegos A, Torres D, Alvarez F, Torres A (2016) Feature subset selection and typical testors applied to breast cancer cells. Res Comput Sci 121:151–163
Gonzalez-Guevara V, Godoy-Calderon S, Alba-Cabrera E, Ibarra-Fiallo J (2015) A mixed learning strategy for finding typical testors in large datasets. In: 20th Iberoamerican congress CIARP. LNCS, vol 9423. Springer, New York, pp 716–723
Hall MA (2000) Correlation-based feature selection for discrete and numeric class machine learning, In: Proc. 17th int. conf. machine learning, pp 359–366
Jimenez-Jacinto V (1995) Feature selection with the algorithm REC. BSc. thesis on applied mathematics and computation, UNAM, Mexico
Journal Impact Factor (2020) Journal citation reports science edition. Clarivate analytics
Kohavi R, Jhon G (1997) Wrappers for feature subset selection. Artif Intell 97:273–324
Lazo-Cortes M, Ruiz-Shulcloper J, Alba-Cabrera E (2001) An overview of the evolution of the concept of testor. Pattern Recogn 34(4):753–762
Lazo-Cortés MS, Martínez-Trinidad JF, Carrasco-Ochoa JA, Sanchez-Diaz G (2015) On the relation between rough set reducts and typical testors. Inf Sci 294:152–163
Li F, Zhu Q (2011) Dcoument clustering in research literature based on NMF and testor theory. J Softw 6(1):78–82
Lias-Rodriguez A, Pons-Porrata A (2009) BR: a new method for computing all typical testors. In: 14th Iberoamerican congress CIARP. LNCS, vol. 5856. Springer, New York, pp 433–440
Lias-Rodriguez A, Sanchez-Diaz G (2013) An algorithm for computing typical testors based on gaps and reduction of columns. Int J Pattern Recognit Artif Intell 27(8):1–18
Lopez-Perez S, Lazo-Cortes M, Estrada-Garcia H (1997) Medical electrodiagnostic using pattern recognition tools. In: Proceedings of the Iberoamerican workshop on pattern recognition (TIARP 97), pp 237–244
McMahon A, Lewis E, Buniello A, Cerezo M, Hall P, Sollis E, Parkinson H, Hindorff L, Harris L, MacArthur J (2021) Sequencing-based genome-wide association studies reporting standards. Cell Genom 1(1):1–29
Mierswa I, Michael W (2006) Information preserving multiobjective feature selection for unsupervised learning. In: Proceedings of the genetic and evolutionary computation conference. ACM Press, pp 1545–1552
Morales-Escobar J, Roblero-Aguilar S, Guevara-Cruz E, Orozco-Aguirre H (2017) The use of typical testors for determinate the impact of contents in subjects of vocational training. Oper Res J 3:299–304
Ochoa J, Valdes M, Moctezuma I, Ayala C (2008) Dimension reduction in image databases using the logical combinatorial approach. In: Innovations and advances techniques in systems, computing sciences and software engineering. Springer, New York, pp 260–265
Ortiz-Posadas M, Martinez-Trinidad J, Ruiz-Shulcloper J (2001) A new approach to differential diagnosis of diseases. Int J Biomed Comput 40(3):179–185
Pawlak Z (1991) Rough sets: theoretical aspects of reasoning about data. Kluwer Academic Publishers, Dordrecht, MA
Piza-Davila I, Sanchez-Diaz G, Aguirre-Salado C, Lazo-Cortes M (2015) A parallel hill-climbing algorithm to generate a subset of irreducible testors. Appl Intell 42:622–641
Piza-Davila I, Sanchez-Diaz G, Lazo-Cortes M, Rizo-Dominguez L (2017) A CUDA-based hill-climbing algorithm to find irreducible testors from a training matrix. Pattern Recogn Lett 95:22–28
Piza-Davila I, Sanchez-Diaz G, Lazo-Cortes M, Noyola-Medrano C (2018) Enhancing the performance of YYC algorithm useful to generate irreducible testors. Int J Pattern Recognit Artif Intell 32(1):1–18
Piza-Davila I, Sanchez-Diaz G, Lazo-Cortes M, Villalon-Turrubiates I (2020) An algorithm for computing minimum-length irreducible testors. IEEE Access 8:56312–56320
Pons-Porrata A, Ruiz-Shulcloper J, Berlanga-Llavori R (2003) A method for the automatic summarization of topic-based clusters of documents. In: Proceedings of VIII Iberoamerican conference on pattern recognition. LNCS, vol 2905. Springer, New York, pp 596–603
Pons-Porrata A, Gil-Garcia R, Berlanga-Llavori R (2007) Using typical testors for feature selection in text categorization. In: Proceedings of XII Iberoamerican conference on pattern recognition. LNCS, vol 4756. Springer, New York, pp 643–652
Rissanen R (1978) Modeling by shortest data description. Automatica 14(5):465–471
Rodriguez-Diez V, Martinez-Trinidad F, Carrasco-Ochoa J, Lazo-Cortes M, Feregrino-Uribe C, Cumplido R (2015) A fast hardware software platform for computing irreducible testors. Expert Syst Appl 42:9612–9619
Rojas A, Cumplido R, Carrasco-Ochoa A, Feregrino C, Martínez-Trinidad J (2012) Hardware-software platform for computing irreducible testors. Expert Syst Appl 39:2203–2210
Rojas-Delgado J (2016) Feature selection on pap-smear data using heuristic information. Cuban J Inf Sci 10(2):73–88
Roth JP (1966) Diagnosis of automata failures: a calculus and a method. IBM J Res Dev 10:278–291
Ruiz-Shulcloper J, Abidi M (2002) Logical combinatorial pattern recognition: a review. In: Recent research developments in pattern recognition. Transword Research Networks, India, pp 133–176
Ruiz-Shulcloper J, Bravo-Martinez A, Aguila-Feros L (1985) Algorithms BT and TB for compute all typical testors. Ciencias Mat J VI:11–18
Ruiz-Shulcloper J, Lazo-Cortes M (1999) Mathematical algorithms for the supervised classification based on fuzzy partial precedence. Math Comput Model 29:111–119
Saeys Y, Degroeve S, Van de Peer Y (2004) Digging into acceptor splice site prediction: an iterative feature selection approach. In: Proceedings of principles and practice of knowledge discovery in databases, pp 386–397
Sanchez-Diaz G (1997) Develoment and programming of efficient algorithms (sequential and parallel) for generated typical testors of a basic matrix. MsC. thesis, Autonomous University of Puebla, Mexico
Sanchez-Diaz G, Diaz-Sanchez G, Mora-Gonzalez M, Piza-Davila I, Aguirre-Salado C, Huerta-Cuellar G, Reyes-Cardenas O, Cardenas-Tristan A (2014) An evolutionary algorithm with acceleration operator to generate a subset of typical testors. Pattern Recogn Lett 41:34–42
Sanchez-Diaz G, Lazo-Cortes M (2007) CT-EXT: an algorithm for computing typical testor set. In: 11th Iberoamerican congress CIARP. LNCS, vol. 4756. Springer, New York, pp 506–514
Sanchez-Diaz G, Lazo-Cortes M (2002) Modifications to algorithm BT for improve their execution time. Ciencias Mat J 20(2):129–136
Sanchez-Diaz G, Lazo-Cortes M, Fuentes-Chavez O (1999) Genetic algorithm to calculate typical testors of minimal cost. In: IV Iberoamerican symposium on pattern recognition, pp 207–212
Sanchez-Diaz G, Lazo-Cortes M, Piza-Davila I (2012) A fast implementation for the typical testor property identification based on an accumulative binary tuple. Int J Comput Intell Syst 5(6):1025–1039
Santiesteban-Alganza Y, Pons-Porrata A (2003) LEX: a new algorithm to generate typical testors. Ciencias Mat J 21(1):85–95
Santos J, Carrasco A, Martinez J (2004) Feature selection using typical testors applied to estimation to stellar parameters. Comput Sistemas J 8(1):15–23
Tonkin E, Tourte G (2016) Working with text: tools, techniques and approaches for text mining. Chandos Publishing, Cambridge, MA
Torres D, Torres A, Cuellar F, Torres M, Ponce-de-Leon E, Pinales F (2014) Evolutionary computation in the identification of risk factors, case of TRALI. Expert Syst Appl 41(3):831–840
Torres D, Torres A, Ponce-de-Leon E (2006) Genetic algorithm and typical testors in feature subset selection problem. In: Proceedings of sixth Iberoamerican conference on systemics, cybernetics and informatics, pp 1–5
Valev V, Asaithambi A (2003) On computational complexity of non-reducible descriptors. In: Proceedings of the IEEE international conference on information reuse and integration, pp 208–211
Valev V, Radeva P (1993) On the determining of non-irreducible descriptors for multidimensional pattern recognition problems. Pattern Recogn Image Anal 3(3):258–265
Valev V, Sankur B (2004) Generalized non-reducible descriptors. Pattern Recogn 37(9):1809–1815
Valev V, Zhuravlev Y (1991) Integer-valued problems of transforming the training tables in k-valued code in pattern recognition problems. Pattern Recogn 24(4):283–288
Vazquez R, Godoy-Calderon S (2007) Using testor theory to reduce the dimension of neural network models. Res Comput Sci 28:93–103
Webb A (2002) Statistical pattern recognition. Wiley, New York, pp 305–360
Witten DM, Tibshirani R (2010) A framework for feature selection in clustering. J Am Stat Assoc 105(490):713–726
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
All authors declare no conflict of interest in this study.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Sanchez-Diaz, G., Lazo-Cortes, M.S., Aguirre-Salado, C.A. et al. A review of algorithms to computing irreducible testors applied to feature selection. Artif Intell Rev 55, 6607–6628 (2022). https://doi.org/10.1007/s10462-022-10162-z
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10462-022-10162-z