research-article

Data Mining Techniques for Software Effort Estimation: A Comparative Study

Published: 01 March 2012

Abstract

A predictive model is required to be accurate and comprehensible in order to inspire confidence in a business setting. Both aspects have been assessed in a software effort estimation setting by previous studies. However, no univocal conclusion as to which technique is the most suited has been reached. This study addresses this issue by reporting on the results of a large-scale benchmarking study. Different types of techniques are under consideration, including techniques inducing tree/rule-based models like M5 and CART, linear models such as various types of linear regression, nonlinear models (MARS, multilayered perceptron neural networks, radial basis function networks, and least squares support vector machines), and estimation techniques that do not explicitly induce a model (e.g., a case-based reasoning approach). Furthermore, the aspect of feature subset selection by using a generic backward input selection wrapper is investigated. The results are subjected to rigorous statistical testing and indicate that ordinary least squares regression in combination with a logarithmic transformation performs best. Another key finding is that by selecting a subset of highly predictive attributes, such as project size and development- and environment-related attributes, a significant increase in estimation accuracy can typically be obtained.
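To make the best-performing technique concrete, the following is a minimal sketch of ordinary least squares regression applied after a logarithmic transformation, using project size as the only predictor. The abstract does not specify the study's preprocessing or attribute set, so the function names (`fit_log_linear`, `predict_effort`), the single-predictor setup, and the synthetic data are all assumptions made for illustration, not the authors' implementation.

```python
import math


def fit_log_linear(sizes, efforts):
    """Fit ln(effort) = intercept + slope * ln(size) by ordinary least squares.

    Hypothetical helper: a single-predictor illustration, not the study's setup.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form OLS estimates for simple linear regression.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_effort(intercept, slope, size):
    """Predict effort on the original scale by undoing the log transformation."""
    return math.exp(intercept + slope * math.log(size))


# Synthetic, noise-free data (assumed for illustration): effort = 3 * size^1.1
sizes = [10.0, 25.0, 50.0, 120.0, 300.0]
efforts = [3.0 * s ** 1.1 for s in sizes]

intercept, slope = fit_log_linear(sizes, efforts)
estimate = predict_effort(intercept, slope, 80.0)
```

On the log scale a multiplicative relation between size and effort becomes linear, which is one plausible reason a plain linear model can fit such data well; the fitted `slope` plays the role of the exponent and `exp(intercept)` the multiplier.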


Published In

IEEE Transactions on Software Engineering, Volume 38, Issue 2
March 2012
256 pages

Publisher

IEEE Press

Author Tags

  1. Data mining
  2. regression
  3. software effort estimation

Qualifiers

  • Research-article
