DOI: 10.1145/3220267.3220286
research-article

Using SMOTE and Heterogeneous Stacking in Ensemble learning for Software Defect Prediction

Published: 02 May 2018

Abstract

Many classification models are used for prediction tasks in software engineering, such as effort estimation and defect prediction. One of these is ensemble learning, which improves performance by combining multiple models in different ways to obtain a more powerful model.
One problem facing prediction models is the misclassification of minority samples. This problem appears mainly in defect prediction, where defects are the minority samples during the training phase, and our aim is to classify them correctly. Classification can be improved by applying the Synthetic Minority Over-Sampling Technique (SMOTE) before training the ensemble model, so that the minority class instances are over-sampled.
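The over-sampling step can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `smote`, its parameters, and the toy metric values are assumptions made for illustration, and the base sample is picked at random rather than by iterating over every minority sample. The idea it shows is the core of SMOTE: each synthetic point is placed on the line segment between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of `base`, excluding `base` itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data: two hypothetical code metrics per module (values are made up).
defective = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
clean = [[5.0 + 0.1 * i, 4.0 + 0.05 * i] for i in range(20)]

# Over-sample the minority class until both classes have the same size.
new_points = smote(defective, n_synthetic=len(clean) - len(defective), k=3)
balanced_minority = defective + new_points
```

In practice the over-sampling would be applied to the training split only, so that synthetic points never leak into the data used for evaluation.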
In this paper, we propose a new ensemble model that combines the SMOTE technique with a heterogeneous stacking ensemble, to get the most benefit and performance when training on a dataset where the focus is on the minority subset, as in software defect prediction. Our proposed model shows better performance than the other techniques applied to the minority samples in defect prediction.
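A heterogeneous stacking ensemble can be sketched as follows. The abstract does not name the base learners or the meta-learner, so everything here is an assumption chosen for brevity: three different base learners (nearest centroid, 1-nearest-neighbour, a decision stump), a lookup-table meta-learner, and toy data that is assumed to have already been balanced (for example by SMOTE). The sketch shows the two levels of stacking: out-of-fold predictions of the base learners become the inputs on which the meta-learner is trained.

```python
import math
from collections import Counter


# Three deliberately different (heterogeneous) base learners.
# Each fit_* function returns a predict(x) -> label function.
def fit_centroid(X, y):
    cents = {}
    for c in set(y):
        rows = [x for x, t in zip(X, y) if t == c]
        cents[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return lambda x: min(cents, key=lambda c: math.dist(x, cents[c]))


def fit_1nn(X, y):
    return lambda x: min(zip(X, y), key=lambda p: math.dist(x, p[0]))[1]


def fit_stump(X, y):
    # Best single-feature threshold rule, judged by training accuracy.
    best = None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for hi in (0, 1):  # label predicted when x[f] > thr
                acc = sum((hi if x[f] > thr else 1 - hi) == t
                          for x, t in zip(X, y))
                if best is None or acc > best[0]:
                    best = (acc, f, thr, hi)
    _, f, thr, hi = best
    return lambda x: hi if x[f] > thr else 1 - hi


BASE = [fit_centroid, fit_1nn, fit_stump]


def fit_stacking(X, y, folds=3):
    # Level 0: out-of-fold base predictions become the meta-features.
    meta_X = [None] * len(X)
    for k in range(folds):
        train = [i for i in range(len(X)) if i % folds != k]
        held = [i for i in range(len(X)) if i % folds == k]
        models = [fit([X[i] for i in train], [y[i] for i in train])
                  for fit in BASE]
        for i in held:
            meta_X[i] = tuple(m(X[i]) for m in models)
    # Level 1: the meta-learner stores the majority label seen for each
    # pattern of base-learner outputs.
    table = {}
    for pattern, t in zip(meta_X, y):
        table.setdefault(pattern, Counter())[t] += 1
    final = [fit(X, y) for fit in BASE]  # refit base learners on all data

    def predict(x):
        pattern = tuple(m(x) for m in final)
        if pattern in table:
            return table[pattern].most_common(1)[0][0]
        return Counter(pattern).most_common(1)[0][0]  # unseen: plain vote

    return predict


# Toy balanced training set: label 1 = defective, 0 = clean (made-up values).
X = [[1.0, 1.0], [5.0, 4.0], [1.2, 0.9], [5.2, 4.1], [0.8, 1.1], [4.8, 3.9],
     [1.1, 1.3], [5.1, 4.2], [0.9, 0.8], [4.9, 3.8], [1.3, 1.1], [5.3, 4.3]]
y = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

model = fit_stacking(X, y)
pred_defective = model([1.0, 1.2])
pred_clean = model([5.0, 4.1])
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for data the base learners have already seen, is what keeps it from simply trusting whichever base learner memorises the training set best.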




    Published In

    ICSIE '18: Proceedings of the 7th International Conference on Software and Information Engineering
    May 2018
    147 pages
    ISBN:9781450364690
    DOI:10.1145/3220267
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Classification
    2. Defect Prediction
    3. Ensemble
    4. Heterogeneous
    5. Machine Learning
    6. SMOTE
    7. Software Engineering
    8. Stacking

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICSIE '18

    Cited By

    • (2024) Software Defect Prediction Approach Based on a Diversity Ensemble Combined With Neural Network. IEEE Transactions on Reliability 73:3 (1487-1501). DOI: 10.1109/TR.2024.3356515. Online publication date: Sep-2024
    • (2024) Hyperparameter Optimization for Software Bug Prediction Using Ensemble Learning. IEEE Access 12 (51869-51878). DOI: 10.1109/ACCESS.2024.3380024. Online publication date: 2024
    • (2023) Software Defect Prediction with Bayesian Approaches. Mathematics 11:11 (2524). DOI: 10.3390/math11112524. Online publication date: 31-May-2023
    • (2023) Model for Predicting Lateral Shifts and Driving Style of Motorized Two-Wheelers. Transportation Research Record: Journal of the Transportation Research Board 2678:6 (972-988). DOI: 10.1177/03611981231203220. Online publication date: 31-Oct-2023
    • (2023) Predictive Analytics and Software Defect Severity: A Systematic Review and Future Directions. Scientific Programming 2023 (1-18). DOI: 10.1155/2023/6221388. Online publication date: 2-Feb-2023
    • (2023) Automatically Tagging the “AAA” Pattern in Unit Test Cases Using Machine Learning Models. IEEE Transactions on Software Engineering 49:5 (3305-3324). DOI: 10.1109/TSE.2023.3252442. Online publication date: 1-May-2023
    • (2023) Ensemble Classifiers in Software Defect Prediction: A Systematic Literature Review. 2023 11th International Conference in Software Engineering Research and Innovation (CONISOFT) (1-8). DOI: 10.1109/CONISOFT58849.2023.00011. Online publication date: 6-Nov-2023
    • (2022) African buffalo optimized multinomial softmax regression based convolutional deep neural network for software fault prediction. Materials Today: Proceedings 61 (619-626). DOI: 10.1016/j.matpr.2021.08.097. Online publication date: 2022
    • (2022) Data quality issues in software fault prediction: a systematic literature review. Artificial Intelligence Review 56:8 (7839-7908). DOI: 10.1007/s10462-022-10371-6. Online publication date: 21-Dec-2022
    • (2022) Empirical Analysis of Data Sampling-Based Ensemble Methods in Software Defect Prediction. Computational Science and Its Applications – ICCSA 2022 Workshops (363-379). DOI: 10.1007/978-3-031-10548-7_27. Online publication date: 26-Jul-2022
