Active Learning and Effort Estimation: Finding the Essential Content of Software Effort Estimation Data

Published: 01 August 2013

Abstract

Background: Do we always need complex methods for software effort estimation (SEE)?

Aim: To characterize the essential content of SEE data, i.e., the least number of features and instances required to capture the information within SEE data. If the essential content is very small, then 1) the contained information must be very brief and 2) the value added of complex learning schemes must be minimal.

Method: Our QUICK method computes the Euclidean distance between rows (instances) and columns (features) of SEE data, then prunes synonyms (similar features) and outliers (distant instances), then assesses the reduced data by comparing predictions from 1) a simple learner using the reduced data and 2) a state-of-the-art learner (CART) using all data. Performance is measured using hold-out experiments and expressed in terms of mean and median MRE, MAR, PRED(25), MBRE, MIBRE, or MMER.

Results: For 18 datasets, QUICK pruned 69 to 96 percent of the training data (median = 89 percent). Nearest neighbor predictions with $K = 1$ (in the reduced data) performed as well as CART's predictions (using all data).

Conclusion: The essential content of some SEE datasets is very small. Complex estimation methods may be overelaborate for such datasets and can be simplified. We offer QUICK as an example of such a simpler SEE method.
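The abstract names a K = 1 nearest neighbor learner and several error measures without defining them. The following is a minimal illustrative sketch, not the authors' QUICK implementation: it omits the synonym and outlier pruning steps, and the function names (`one_nn_estimate`, `error_measures`) and the toy project data are hypothetical. The measure definitions follow their common usage in the effort estimation literature (MRE relative to the actual effort, MER relative to the estimate, MBRE and MIBRE relative to the smaller and larger of the two, MAR as the mean absolute residual); the paper itself may define or report them slightly differently.

```python
import math
from statistics import mean, median


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def one_nn_estimate(train_rows, train_efforts, query):
    """Return the effort of the training project closest to `query` (K = 1)."""
    nearest = min(range(len(train_rows)),
                  key=lambda i: euclidean(train_rows[i], query))
    return train_efforts[nearest]


def error_measures(actuals, predictions):
    """Summarize prediction error; assumes all efforts are strictly positive."""
    abs_res = [abs(a - p) for a, p in zip(actuals, predictions)]
    mre = [r / a for r, a in zip(abs_res, actuals)]       # relative to actual
    mer = [r / p for r, p in zip(abs_res, predictions)]   # relative to estimate
    bre = [r / min(a, p) for r, a, p in zip(abs_res, actuals, predictions)]
    ibre = [r / max(a, p) for r, a, p in zip(abs_res, actuals, predictions)]
    return {
        "MMRE": mean(mre),
        "MdMRE": median(mre),
        "MAR": mean(abs_res),
        # Percentage of estimates whose MRE is at most 0.25.
        "PRED(25)": 100.0 * sum(m <= 0.25 for m in mre) / len(mre),
        "MBRE": mean(bre),
        "MIBRE": mean(ibre),
        "MMER": mean(mer),
    }


# Hypothetical toy data: (size, team size) mapped to effort in person-months.
train_rows = [(10.0, 2.0), (50.0, 5.0), (120.0, 9.0)]
train_efforts = [20.0, 100.0, 300.0]
test_rows = [(12.0, 2.0), (100.0, 8.0)]
test_efforts = [25.0, 240.0]

predictions = [one_nn_estimate(train_rows, train_efforts, q) for q in test_rows]
scores = error_measures(test_efforts, predictions)
print(predictions)
print(scores)
```

In practice the features would usually be normalized before computing distances so that no single column dominates; that step is left out here to keep the sketch short.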

Published In

IEEE Transactions on Software Engineering, Volume 39, Issue 8
August 2013
147 pages

Publisher

IEEE Press

Author Tags

  1. Complexity theory
  2. Estimation
  3. Euclidean distance
  4. Frequency selective surfaces
  5. Indexes
  6. Labeling
  7. Principal component analysis
  8. Software cost estimation
  9. active learning
  10. analogy
  11. k-NN

Qualifiers

  • Article

Cited By

  • (2024) A random forest model for early-stage software effort estimation for the SEERA dataset. Information and Software Technology 169:C. DOI: 10.1016/j.infsof.2024.107413. Online publication date: 1-May-2024
  • (2024) An improved analogy-rule based software effort estimation using HTRR-RNN in software project management. Expert Systems with Applications: An International Journal 251:C. DOI: 10.1016/j.eswa.2024.124107. Online publication date: 24-Jul-2024
  • (2024) Software effort estimation using convolutional neural network and fuzzy clustering. Neural Computing and Applications 36:23 (14449-14464). DOI: 10.1007/s00521-024-09855-z. Online publication date: 1-Aug-2024
  • (2023) Learning to Predict Code Review Completion Time In Modern Code Review. Empirical Software Engineering 28:4. DOI: 10.1007/s10664-023-10300-3. Online publication date: 20-May-2023
  • (2022) On the value of project productivity for early effort estimation. Science of Computer Programming 219:C. DOI: 10.1016/j.scico.2022.102819. Online publication date: 1-Jul-2022
  • (2022) Locally weighted regression with different kernel smoothers for software effort estimation. Science of Computer Programming 214:C. DOI: 10.1016/j.scico.2021.102744. Online publication date: 1-Feb-2022
  • (2022) An extended study on applicability and performance of homogeneous cross-project defect prediction approaches under homogeneous cross-company effort estimation situation. Empirical Software Engineering 27:2. DOI: 10.1007/s10664-021-10103-4. Online publication date: 27-Jan-2022
  • (2022) DRE: density-based data selection with entropy for adversarial-robust deep learning models. Neural Computing and Applications 35:5 (4009-4026). DOI: 10.1007/s00521-022-07812-2. Online publication date: 19-Oct-2022
  • (2022) Genetic algorithm-based oversampling approach to prune the class imbalance issue in software defect prediction. Soft Computing - A Fusion of Foundations, Methodologies and Applications 26:23 (12915-12931). DOI: 10.1007/s00500-021-06112-6. Online publication date: 1-Dec-2022
  • (2021) FRUGAL. Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering (394-406). DOI: 10.1109/ASE51524.2021.9678617. Online publication date: 15-Nov-2021
