Genetic Programming-based Construction of Features for Machine Learning and Knowledge Discovery Tasks

Krzysztof Krawiec¹

772 Accesses
130 Citations
3 Altmetric
Explore all metrics

Abstract

In this paper we use genetic programming for changing the representation of the input data for machine learners. In particular, the topic of interest here is feature construction in the learning-from-examples paradigm, where new features are built based on the original set of attributes. The paper first introduces the general framework for GP-based feature construction. Then, an extended approach is proposed where the useful components of representation (features) are preserved during an evolutionary run, as opposed to the standard approach where valuable features are often lost during search. Finally, we present and discuss the results of an extensive computational experiment carried out on several reference data sets. The outcomes show that classifiers induced using the representation enriched by the GP-constructed features provide better accuracy of classification on the test set. In particular, the extended approach proposed in the paper proved to be able to outperform the standard approach on some benchmark problems on a statistically significant level.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Learning feature spaces for regression with genetic programming

Article 11 March 2020

On the Transfer Learning of Genetic Programming Classification Algorithms

Can Genetic Programming Do Manifold Learning Too?

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

References

H. N. Bensusan, Automatic Bias Learning: An Inquiry into the Inductive Basis of Induction. Ph.D. Dissertation, School of Computing and Cognitive Sciences, University of Sussex, Sussex, 1998.
Google Scholar
H. N. Bensusan and I. Kuscu, “Constructive induction using genetic programming,” in Proc. Int. Conf. Machine Learning, Evolutionary Computing and Machine Learning Workshop, T. Fogarty and G. Venturini (eds.), Bari, Italy, 1996.
B. Bhanu and K. Krawiec, “Coevolutionary construction of features for transformation of representation in machine learning,” in Proceedings of the Bird of a Feather Workshops, Genetic and Evolutionary Computation Conference (GECCO 2002), A. M. Barry (ed.), AAAI Press: New York, 2002, pp. 249–254.
Google Scholar
C. L. Blake and C. J. Merz, “UCI Repository of machine learning databases,” [http://www.ics.uci.edu/~mlearn/MLRepository.html], University of California: Irvine, CA, 1998.
Google Scholar
M. Brameier and W. Banzhaf, “Evolving teams of predictors with linear genetic programming,” Genetic Programming and Evolvable Machines vol. 2, pp. 381–407, 2001.
Google Scholar
C. E. Brodley and A. Pohoreckyj Danyluk (eds.), Proc. Int. Conf. Machine Learning, Morgan Kaufmann: San Francisco 2001.
Google Scholar
M. Dash and H. Liu, “Feature selection for classification,” Intelligent Data Analysis vol. 1, no. 3, pp. 131–156, 1997.
Google Scholar
K. A. De Jong, An Analysis of the Behavior of a Class of Genetic Adaptive Systems. Doctoral dissertation, University of Michigan, Ann Arbor, 1975.
Google Scholar
K. A. De Jong, W. M. Spears, and D. F. Gordon, “Using genetic algorithms for concept learning,” Machine Learning vol. 13, pp. 161–188, 1993.
Google Scholar
D. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley: Reading, 1989.
J. J. Grefenstette, “Lamarckian learning in multi-agent environments,” in Proc. Fourth Intl. Conf. of Genetic Algorithms, Morgan Kaufmann, San Mateo, CA, 1991, pp. 303–310.
Google Scholar
J. H. Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press: Ann Arbor, 1975.
Google Scholar
I. Imam and H. Vafaie, “An empirical comparison between global and greedy-like search for feature selection,” in Proceedings of the Florida AI Research Symposium (FLAIRS-94), Pensacola Beach, FL, 1994, pp. 66–70.
J. K. Kishore, L. M. Patnaik, V. Mani, and V. K. Agrawal, “Application of genetic programming for multicategory pattern classification,” IEEE Trans. Evolutionary Comp. vol. 4, no. 3, pp. 242–258, 2000.
Google Scholar
M. Komosinski and K. Krawiec, “Evolutionary weighting of image features for diagnosing of CNS tumors,” Artif. Intell. in Medicine vol. 19, no. 1, pp. 25–38, 2000.
Google Scholar
R. Kohavi and G. H. John, “Wrappers for feature subset selection,” Artif. Intell. Journal vol. 1–2, pp. 273–324, 1997.
Google Scholar
R. Kohavi, D. Sommerfeld, and J. Dougherty, “Data mining using MLC++: Amachine learning library in C++,” in Proc. of the Eight International Conference on Tools with Artificial Intelligence (ICTAI'96), IEEE Computer Society, 1996, pp. 234–245.
J. R. Koza, Genetic Programming—2, MIT Press: Cambridge, 1994.
Google Scholar
K. Krawiec, “Pairwise comparison of hypotheses in evolutionary learning,” in Proc. Int. Conf. Machine Learning, C. E. Brodley and A. Pohoreckyj Danyluk (eds.), Morgan Kaufmann: San Francisco, 2001, pp. 266–273.
Google Scholar
K. Krawiec, “Genetic programming with local improvement for visual learning from examples,” in Computer Analysis of Images and Patterns (LNCS 2124), W. Skarbek, (ed.), Springer: Berlin, 2001, pp. 209–216.
K. Krawiec, “On the use of pairwise comparison of hypotheses in evolutionary learning applied to learning from visual examples,” in Machine Learning and Data Mining in Pattern Recognition (LNAI 2123), P. Perner (ed.), Springer: Berlin, 2001, pp. 307–321.
P. Langley, Elements of Machine Learning, Morgan Kaufmann: San Francisco, 1996.
Google Scholar
T.-S. Lim, W.-Y. Loh, and Y.-S. Shih, “Acomparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms,” Machine Learning, vol. 40, pp. 203–228, 2000.
Google Scholar
S. Luke, “ECJ 7: An EC and GP system in Java,” http://www.cs.umd.edu/projects/plus/ec/ecj/, 2001.
C. J. Matheus and L. A. Rendell, “Constructive induction on decision trees,” in Proc. of the Eleventh International Joint Conference on Artificial Intelligence, N. S. Sridharan (ed.), Detroit, MI, USA, August 1989, Morgan Kaufmann 1989, pp. 645–650.
C. J. Matheus, “Aconstructive induction framework,” in Proc. of the Sixth International Workshop on Machine Learning, Ithaca: New York, 1989, pp. 474–475.
G. Mayraz and G. Hinton, “Recognizing hand-written digits using hierarchical products of experts,” Adv. NIPS 13, MIT Press: Cambridge, MA, 2001, pp. 953–959.
Google Scholar
R. S. Michalski, “Atheory and methodology of inductive learning,” Artificial Intelligence 20, pp. 111–161, 1983.
Google Scholar
T. M. Mitchell, An Introduction to Genetic Algorithms, MIT Press: Cambridge, MA, 1996.
Google Scholar
T. M. Mitchell, Machine Learning, McGraw-Hill: New York, 1997.
Google Scholar
J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann: San Mateo, CA, 1992.
Google Scholar
M. L. Raymer, W. F. Punch, E. D. Goodman, L. A. Kuhn, and A. K. Jain, “Dimensionality reduction using genetic algorithm,” IEEE Trans. on Evolutionary Computation, vol. 4, no. 2, pp. 164–171, 2000.
Google Scholar
M. L. Raymer, W. F. Punch, E. D. Goodman, and L. A. Kuhn, “Genetic programming for improved data mining—application to the biochemistry of protein interactions,” in Genetic Programming 1996: Proceedings, J. R. Koza, D. E. Goldberg, D. B. Fogel, and R. L. Riolo, (eds.), MIT Press: Cambridge, MA, 1996, pp. 275–381.
Google Scholar
S. Thrun, J. Bala, E. Bloedorn, I. Bratko, B. Cestnik, J. Cheng, K. De Jong, S. Dzeroski, R. Hamann, K. Kaufman, S. Keller, I. Kononenko, J. Kreuziger, R. S. Michalski, T. Mitchell, P. Pachowicz, B. Roger, H. Vafaie, W. Van de Velde, W. Wenzel, J. Wnek, and J. Zhang, “The MONK's problems: Aperformance comparison of different learning algorithms,” Technical Report CMU-CS-91-197, Carnegie Mellon University: Pittsburgh, PA, 1991.
Google Scholar
I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann: San Francisco, CA, 1999.
Google Scholar
J. Yang and V. Honavar, “Feature subset selection using a genetic algorithm,” IEEE Intelligent Systems (Special Issue on Feature Transformation and Subset Selection), vol. 13, pp. 44–49, 1998.
Google Scholar

Download references

Author information

Authors and Affiliations

Institute of Computing Science, Poznań University of Technology, Piotrowo 3A, 60965, Poznań, Poland
Krzysztof Krawiec

Authors

Krzysztof Krawiec
View author publications
You can also search for this author in PubMed Google Scholar

Rights and permissions

Reprints and permissions

About this article

Cite this article

Krawiec, K. Genetic Programming-based Construction of Features for Machine Learning and Knowledge Discovery Tasks. Genetic Programming and Evolvable Machines 3, 329–343 (2002). https://doi.org/10.1023/A:1020984725014

Download citation

Issue Date: December 2002
DOI: https://doi.org/10.1023/A:1020984725014

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Learning feature spaces for regression with genetic programming

On the Transfer Learning of Genetic Programming Classification Algorithms

Can Genetic Programming Do Manifold Learning Too?

References

Author information

Authors and Affiliations

Rights and permissions

About this article

Cite this article

Subscribe and save

Buy Now

Navigation

Genetic Programming-based Construction of Features for Machine Learning and Knowledge Discovery Tasks

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Learning feature spaces for regression with genetic programming

On the Transfer Learning of Genetic Programming Classification Algorithms

Can Genetic Programming Do Manifold Learning Too?

Explore related subjects

References

Author information

Authors and Affiliations

Rights and permissions

About this article

Cite this article

Share this article

Subscribe and save

Buy Now

Search

Navigation