Cost-sensitive classification: empirical evaluation of a hybrid genetic decision tree induction algorithm

Published: 01 April 1995

Abstract

This paper introduces ICET, a new algorithm for cost-sensitive classification. ICET uses a genetic algorithm to evolve a population of biases for a decision tree induction algorithm. The fitness function of the genetic algorithm is the average cost of classification when using the decision tree, including both the costs of tests (features, measurements) and the costs of classification errors. ICET is compared here with three other algorithms for cost-sensitive classification -- EG2, CS-ID3, and IDX -- and also with C4.5, which classifies without regard to cost. The five algorithms are evaluated empirically on five real-world medical datasets. Three sets of experiments are performed. The first set examines the baseline performance of the five algorithms on the five datasets and establishes that ICET performs significantly better than its competitors. The second set tests the robustness of ICET under a variety of conditions and shows that ICET maintains its advantage. The third set looks at ICET's search in bias space and discovers a way to improve the search.
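
The cost measure described above can be made concrete with a small example. The following sketch is not from the paper; the data, cost values, and the hand-written stand-in "tree" are illustrative assumptions. It shows the quantity ICET's genetic algorithm minimizes: the average cost of classifying a set of cases, counting both the tests the tree performs and any misclassification penalty.

    # Minimal, self-contained sketch of the cost measure in the abstract:
    # the fitness of a candidate decision tree is the average cost of
    # classification, including both the costs of tests and the costs of
    # classification errors. All values and the toy "tree" are assumptions.

    TEST_COSTS = {"blood_pressure": 1.0, "ecg": 15.0}       # cost of performing each test
    ERROR_COSTS = {("sick", "healthy"): 50.0,                # (true label, prediction) -> penalty
                   ("healthy", "sick"): 10.0}

    def toy_tree(case):
        """Stand-in for an induced decision tree: returns the tests it
        performed and its prediction for one case."""
        tests = ["blood_pressure"]
        if case["blood_pressure"] > 140:
            tests.append("ecg")
            prediction = "sick" if case["ecg"] == "abnormal" else "healthy"
        else:
            prediction = "healthy"
        return tests, prediction

    def average_cost(cases, tree):
        """Average cost of classification: the quantity a cost-sensitive
        learner like ICET tries to minimize (lower cost = fitter)."""
        total = 0.0
        for case in cases:
            tests, prediction = tree(case)
            total += sum(TEST_COSTS[t] for t in tests)                   # cost of tests performed
            total += ERROR_COSTS.get((case["label"], prediction), 0.0)   # cost of any error
        return total / len(cases)

    cases = [
        {"blood_pressure": 150, "ecg": "abnormal", "label": "sick"},
        {"blood_pressure": 120, "ecg": "normal", "label": "healthy"},
        {"blood_pressure": 150, "ecg": "normal", "label": "sick"},
    ]
    print(average_cost(cases, toy_tree))  # average cost per case on this toy data

In ICET itself, the genetic algorithm evolves the biases under which the decision tree is induced and scores each candidate by this kind of average classification cost.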

References

[1]
Ackley, D., & Littman, M. (1991). Interactions between learning and evolution. In Proceedings of the Second Conference on Artificial Life, C. Langton, C. Taylor, D. Farmer, and S. Rasmussen, editors. California: Addison-Wesley.
[2]
Aha, D.W., Kibler, D., & Albert, M.K. (1991). Instance-based learning algorithms, Machine Learning, 6, 37-66.
[3]
Aha, D.W., & Bankert, R.L. (1994). Feature selection for case-based classification of cloud types: An empirical comparison. Case-Based Reasoning: Papers from the 1994 Workshop, edited by D.W. Aha, Technical Report WS-94-07, pp. 106-112. Menlo Park, CA: AAAI Press.
[4]
Anderson, R.W. (in press). Learning and evolution: A quantitative genetics approach. Journal of Theoretical Biology.
[5]
Baldwin, J.M. (1896). A new factor in evolution. American Naturalist, 30, 441-451.
[6]
Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and regression trees. California: Wadsworth.
[7]
De Jong, K.A., Spears, W.M., & Gordon, D.F. (1993). Using genetic algorithms for concept learning. Machine Learning, 13, 161-188.
[8]
Fogarty, T.C. (1992). Technical note: First nearest neighbor classification on Frey and Slate's letter recognition problem. Machine Learning, 9, 387-388.
[9]
Frey, P.W., & Slate, D.J. (1991). Letter recognition using Holland-style adaptive classifiers. Machine Learning, 6, 161-182.
[10]
Friedman, J.H., & Stuetzle, W. (1981). Projection pursuit regression. Journal of the American Statistical Association, 76, 817-823.
[11]
Gordon, D.F., & Perlis, D. (1989). Explicitly biased generalization. Computational Intelligence, 5, 67-81.
[12]
Grefenstette, J.J. (1986). Optimization of control parameters for genetic algorithms. IEEE Transactions on Systems, Man, and Cybernetics, 16, 122-128.
[13]
Grefenstette, J.J., Ramsey, C.L., & Schultz, A.C. (1990). Learning sequential decision rules using simulation models and competition. Machine Learning, 5, 355-381.
[14]
Hermans, J., Habbema, J.D.F., & Van der Burght, A.T. (1974). Cases of doubt in allocation problems, k populations. Bulletin of the International Statistical Institute, 45, 523-529.
[15]
Hinton, G.E., & Nowlan, S.J. (1987). How learning can guide evolution. Complex Systems, 1, 495-502.
[16]
Karakoulas, G. (in preparation). A Q-learning approach to cost-effective classification. Technical Report, Knowledge Systems Laboratory, National Research Council Canada. Also submitted to the Twelfth International Conference on Machine Learning, ML-95.
[17]
Knoll, U., Nakhaeizadeh, G., & Tausend, B. (1994). Cost-sensitive pruning of decision trees. Proceedings of the Eighth European Conference on Machine Learning, ECML-94, pp. 383-386. Berlin, Germany: Springer-Verlag.
[18]
Koza, J.R. (1992). Genetic Programming: On the programming of computers by means of natural selection. Cambridge, MA: MIT Press.
[19]
Lirov, Y., & Yue, O.-C. (1991). Automated network troubleshooting knowledge acquisition. Journal of Applied Intelligence, 1, 121-132.
[20]
Maynard Smith, J. (1987). When learning guides evolution. Nature, 329, 761-762.
[21]
Morgan, C.L. (1896). On modification and variation. Science, 4, 733-740.
[22]
Murphy, P.M., & Aha, D.W. (1994). UCI Repository of Machine Learning Databases. University of California at Irvine, Department of Information and Computer Science.
[23]
Norton, S.W. (1989). Generating better decision trees. Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, IJCAI-89, pp. 800-805. Detroit, Michigan.
[24]
Núñez, M. (1988). Economic induction: A case study. Proceedings of the Third European Working Session on Learning, EWSL-88, pp. 139-145. California: Morgan Kaufmann.
[25]
Núñez, M. (1991). The use of background knowledge in decision tree induction. Machine Learning, 6, 231-250.
[26]
Ontario Ministry of Health (1992). Schedule of benefits: Physician services under the health insurance act, October 1, 1992. Ontario: Ministry of Health.
[27]
Pazzani, M., Merz, C., Murphy, P., Ali, K., Hume, T., & Brunk, C. (1994). Reducing misclassification costs: Knowledge-intensive approaches to learning from noisy data. Proceedings of the Eleventh International Conference on Machine Learning, ML-94, pp. 217-225. New Brunswick, New Jersey.
[28]
Pearl, J. (1984). Heuristics: Intelligent search strategies for computer problem solving. Massachusetts: Addison-Wesley.
[29]
Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. California: Morgan Kaufmann.
[30]
Pipitone, F., De Jong, K.A., & Spears, W.M. (1991). An artificial intelligence approach to analog systems diagnosis. In Testing and Diagnosis of Analog Circuits and Systems, Ruey-wen Liu, editor. New York: Van Nostrand-Reinhold.
[31]
Provost, F.J. (1994). Goal-directed inductive learning: Trading off accuracy for reduced error cost. AAAI Spring Symposium on Goal-Driven Learning.
[32]
Provost, F.J., & Buchanan, B.G. (in press). Inductive policy: The pragmatics of bias selection. Machine Learning.
[33]
Quinlan, J.R. (1992). C4.5: Programs for machine learning. California: Morgan Kaufmann.
[34]
Ragavan, H., & Rendell, L. (1993). Lookahead feature construction for learning hard concepts. Proceedings of the Tenth International Conference on Machine Learning, ML-93, pp. 252-259. California: Morgan Kaufmann.
[35]
Rymon, R. (1993). An SE-tree based characterization of the induction problem. Proceedings of the Tenth International Conference on Machine Learning, ML-93, pp. 268-275. California: Morgan Kaufmann.
[36]
Schaffer, C. (1993). Selecting a classification method by cross-validation. Machine Learning, 13, 135-143.
[37]
Schaffer, J.D., Whitley, D., & Eshelman, L.J. (1992). Combinations of genetic algorithms and neural networks: A survey of the state of the art. In Combinations of Genetic Algorithms and Neural Networks, D. Whitley and J.D. Schaffer, editors. California: IEEE Computer Society Press.
[38]
Seshu, R. (1989). Solving the parity problem. Proceedings of the Fourth European Working Session on Learning, EWSL-89, pp. 263-271. California: Morgan Kaufmann.
[39]
Spears, W.M. (1992). Crossover or mutation? Foundations of Genetic Algorithms 2, FOGA-92, edited by D. Whitley. California: Morgan Kaufmann.
[40]
Sutton, R.S. (1992). Introduction: The challenge of reinforcement learning. Machine Learning, 8, 225-227.
[41]
Tan, M., & Schlimmer, J. (1989). Cost-sensitive concept learning of sensor use in approach and recognition. Proceedings of the Sixth International Workshop on Machine Learning, ML-89, pp. 392-395. Ithaca, New York.
[42]
Tan, M., & Schlimmer, J. (1990). CSL: A cost-sensitive learning system for sensing and grasping objects. IEEE International Conference on Robotics and Automation. Cincinnati, Ohio.
[43]
Tan, M. (1993). Cost-sensitive learning of classification knowledge and its applications in robotics. Machine Learning, 13, 7-33.
[44]
Tcheng, D., Lambert, B., Lu, S., & Rendell, L. (1989). Building robust learning systems by combining induction and optimization. Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, IJCAI-89, pp. 806-812. Detroit, Michigan.
[45]
Turney, P.D. (in press). Technical note: Bias and the quantification of stability. Machine Learning.
[46]
Verdenius, F. (1991). A method for inductive cost optimization. Proceedings of the Fifth European Working Session on Learning, EWSL-91, pp. 179-191. New York: Springer-Verlag.
[47]
Waddington, C.H. (1942). Canalization of development and the inheritance of acquired characters. Nature, 150, 563-565.
[48]
Whitley, D., Dominic, S., Das, R., & Anderson, C.W. (1993). Genetic reinforcement learning for neurocontrol problems. Machine Learning, 13, 259-284.
[49]
Whitley, D., & Gruau, F. (1993). Adding learning to the cellular development of neural networks: Evolution and the Baldwin effect. Evolutionary Computation, 1, 213-233.
[50]
Whitley, D., Gordon, S., & Mathias, K. (1994). Lamarckian evolution, the Baldwin effect and function optimization. Parallel Problem Solving from Nature - PPSN III. Y. Davidor, H.-P. Schwefel, and R. Männer, editors, pp. 6-15. Berlin: Springer-Verlag.
[51]
Wilson, S.W. (1987). Classifier systems and the animat problem. Machine Learning, 2, 199-228.

Published In

Journal of Artificial Intelligence Research, Volume 2, Issue 1
August 1994
603 pages

Publisher

AI Access Foundation
El Segundo, CA, United States

Publication History

Published: 01 April 1995
Received: 01 November 1994
Published in JAIR Volume 2, Issue 1


