DOI: 10.1145/2884781.2884830
research-article
Open access

Multi-objective software effort estimation

Published: 14 May 2016

Abstract

We introduce a bi-objective effort estimation algorithm that combines Confidence Interval Analysis and assessment of Mean Absolute Error. We evaluate our proposed algorithm against three alternative formulations, baseline comparators, and current state-of-the-art effort estimators, applied to five real-world datasets from the PROMISE repository involving 724 different software projects in total. The results reveal that our algorithm outperforms the baseline, the state of the art, and all three alternative formulations, statistically significantly (p < 0.001) and with large effect size (Â₁₂ ≥ 0.9) over all five datasets. We also provide evidence that our algorithm sets a new state of the art, which lies within currently claimed industrial human-expert-based thresholds, thereby demonstrating that our findings have actionable conclusions for practicing software engineers.
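
To make the quantities named in the abstract concrete, the following is a minimal Python sketch of a mean absolute error, a confidence interval half-width around it, and the Vargha-Delaney Â₁₂ effect size. It is an illustration only, not the authors' implementation: the function names are hypothetical, and the interval uses a normal approximation, which is an assumption made here rather than something the abstract states.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def absolute_errors(actual, predicted):
    """Per-project absolute error |actual effort - predicted effort|."""
    return [abs(a - p) for a, p in zip(actual, predicted)]


def mean_absolute_error(actual, predicted):
    """Mean of the per-project absolute errors (lower is better)."""
    return mean(absolute_errors(actual, predicted))


def ci_half_width(actual, predicted, level=0.95):
    """Half-width of a confidence interval around the mean absolute error.

    Assumption: a normal (z-quantile) approximation is used for simplicity;
    the paper may construct its interval differently.
    """
    errors = absolute_errors(actual, predicted)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * stdev(errors) / sqrt(len(errors))


def a12(xs, ys):
    """Vargha-Delaney A12: chance that a value from xs exceeds one from ys.

    Ties count as half a win. 0.5 means no difference between the samples.
    """
    wins = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in xs
        for y in ys
    )
    return wins / (len(xs) * len(ys))


if __name__ == "__main__":
    # Hypothetical effort values (e.g. person-hours), for illustration only.
    actual = [10, 20, 30, 40]
    predicted = [12, 18, 33, 36]
    print("MAE:", mean_absolute_error(actual, predicted))
    print("CI half-width:", ci_half_width(actual, predicted))

    # Hypothetical absolute errors from repeated runs of two estimators.
    competitor_errors = [4, 5, 6]
    proposed_errors = [1, 2, 3]
    print("A12:", a12(competitor_errors, proposed_errors))
```

Because lower error is better, passing the competitor's errors first means a value close to 1 favours the second estimator. A bi-objective search would treat quantities such as the error and the interval width as two objectives to minimise together, rather than folding them into a single score.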

Published In

ICSE '16: Proceedings of the 38th International Conference on Software Engineering
May 2016
1235 pages
ISBN: 9781450339001
DOI: 10.1145/2884781
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. confidence interval
  2. estimates uncertainty
  3. multi-objective evolutionary algorithm
  4. software effort estimation

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%
