Training Cost-Sensitive Neural Networks with Methods Addressing the Class Imbalance Problem

Published: 01 January 2006

Abstract

This paper empirically studies the effect of sampling and threshold-moving in training cost-sensitive neural networks. Both oversampling and undersampling are considered; these techniques modify the distribution of the training data so that the costs of the examples are conveyed explicitly by how often the examples appear. Threshold-moving tries to move the output threshold toward inexpensive classes so that examples with higher costs become harder to misclassify. Moreover, hard-ensemble and soft-ensemble, i.e., combinations of the above techniques via hard or soft voting schemes, are also tested. Twenty-one UCI data sets with three types of cost matrices and a real-world cost-sensitive data set are used in the empirical study. The results suggest that cost-sensitive learning with multiclass tasks is more difficult than with two-class tasks, and that a higher degree of class imbalance may increase the difficulty. They also reveal that almost all the techniques are effective on two-class tasks, while most are ineffective, and may even have a negative effect, on multiclass tasks. Overall, threshold-moving and soft-ensemble are relatively good choices for training cost-sensitive neural networks. The empirical study also suggests that some methods believed to be effective in addressing the class imbalance problem may, in fact, only be effective on learning with imbalanced two-class data sets.
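The abstract only names the decision rules; as a minimal sketch of the two techniques it recommends (the cost vector, the normalization, and the averaging scheme below are illustrative assumptions, not the paper's exact formulation), threshold-moving can be read as rescaling a network's class probabilities by per-class misclassification costs before taking the argmax, and soft-ensemble as averaging such cost-rescaled outputs across several learners:

```python
import numpy as np

def threshold_moving(probs, cost):
    """Pick classes after rescaling outputs by misclassification cost.

    probs : (n_samples, n_classes) predicted class probabilities
    cost  : (n_classes,) cost of misclassifying an example of class i;
            expensive classes get a larger share of the output.
    """
    scaled = probs * cost                        # favor costly classes
    scaled = scaled / scaled.sum(axis=1, keepdims=True)
    return scaled.argmax(axis=1)

def soft_ensemble(prob_list, cost):
    """Average the cost-rescaled outputs of several learners (soft voting)."""
    avg = np.mean([p * cost for p in prob_list], axis=0)
    return avg.argmax(axis=1)

# A cheap class 0 vs. an expensive class 1: raw argmax would pick class 0,
# but with costs [1, 2] threshold-moving flips the decision to class 1
# (0.6 * 1 = 0.6 < 0.4 * 2 = 0.8).
probs = np.array([[0.6, 0.4]])
print(threshold_moving(probs, np.array([1.0, 2.0])))  # -> [1]
```

This captures why threshold-moving needs no retraining: the network is trained as usual, and the cost information enters only at decision time.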




        Published In

        IEEE Transactions on Knowledge and Data Engineering, Volume 18, Issue 1, January 2006, 142 pages

        Publisher

        IEEE Educational Activities Department, United States


        Author Tags

        1. Machine learning
        2. class imbalance learning
        3. cost-sensitive learning
        4. data mining
        5. ensemble learning
        6. neural networks
        7. sampling
        8. threshold-moving

        Qualifiers

        • Research-article


        Cited By

        • (2024) Predictive Modeling of Pulmonary Arterial Hypertension Based on Phonocardiogram Signals. Proceedings of the 2024 16th International Conference on Computer Modeling and Simulation, 10.1145/3686812.3686816 (1-0). Online publication date: 21-Jun-2024.
        • (2024) Mastering Long-Tail Complexity on Graphs: Characterization, Learning, and Generalization. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 10.1145/3637528.3671880 (3045-3056). Online publication date: 25-Aug-2024.
        • (2024) Cost-Sensitive Trees for Interpretable Reinforcement Learning. Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD), 10.1145/3632410.3632443 (91-99). Online publication date: 4-Jan-2024.
        • (2024) ChatPRCS: A Personalized Support System for English Reading Comprehension Based on ChatGPT. IEEE Transactions on Learning Technologies, 10.1109/TLT.2024.3405747, 17 (1762-1776). Online publication date: 27-May-2024.
        • (2024) Revisiting the Effective Number Theory for Imbalanced Learning. IEEE Transactions on Knowledge and Data Engineering, 10.1109/TKDE.2024.3367949, 36:8 (4192-4206). Online publication date: 1-Aug-2024.
        • (2024) LT-SEI: Long-Tailed Specific Emitter Identification Based on Decoupled Representation Learning in Low-Resource Scenarios. IEEE Transactions on Intelligent Transportation Systems, 10.1109/TITS.2023.3308716, 25:1 (929-943). Online publication date: 1-Jan-2024.
        • (2024) Relabeling & raking algorithm for imbalanced classification. Expert Systems with Applications: An International Journal, 10.1016/j.eswa.2024.123274, 247:C. Online publication date: 1-Aug-2024.
        • (2024) A membership-based resampling and cleaning algorithm for multi-class imbalanced overlapping data. Expert Systems with Applications: An International Journal, 10.1016/j.eswa.2023.122565, 240:C. Online publication date: 15-Apr-2024.
        • (2024) ECC++. Expert Systems with Applications: An International Journal, 10.1016/j.eswa.2023.121366, 236:C. Online publication date: 1-Feb-2024.
        • (2024) Hybrid resampling and weighted majority voting for multi-class anomaly detection on imbalanced malware and network traffic data. Engineering Applications of Artificial Intelligence, 10.1016/j.engappai.2023.107568, 128:C. Online publication date: 14-Mar-2024.
