research-article

Software effort estimation as a multiobjective learning problem

Published: 22 October 2013

Abstract

Ensembles of learning machines are promising for software effort estimation (SEE), but they need to be tailored to this task for their potential to be exploited. A key issue when creating ensembles is producing base models that are both diverse and accurate. If different performance measures behave sufficiently differently for SEE, they could serve as a natural way of creating SEE ensembles. We propose to view SEE model creation as a multiobjective learning problem. A multiobjective evolutionary algorithm (MOEA) is used to better understand the tradeoff among different performance measures by creating SEE models through the simultaneous optimisation of these measures. We show that the performance measures behave very differently, sometimes even presenting opposite trends. They are then used as a source of diversity for creating SEE ensembles. A good tradeoff among different measures can be obtained by using an ensemble of MOEA solutions. This ensemble performs similarly to or better than a model that does not consider these measures explicitly. In addition, the MOEA is flexible, allowing a particular measure to be emphasised if desired. In conclusion, the MOEA can be used to better understand the relationship among performance measures and has been shown to be very effective in creating SEE models.
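The idea of treating several performance measures as simultaneous objectives can be illustrated with a minimal sketch. The abstract does not name the measures or the base learner, so everything concrete here is an assumption made for illustration: MMRE and PRED(25) stand in for the performance measures, the three candidate models are hand-written toy functions rather than MOEA-evolved learners, and the data are invented. The sketch shows only the non-dominance filter and the averaging of the surviving models, not an evolutionary search.

```python
from typing import Callable, List, Sequence, Tuple

Model = Callable[[float], float]  # maps project size to estimated effort


def mmre(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean magnitude of relative error (lower is better)."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def pred25(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Fraction of estimates within 25% of the actual effort (higher is better)."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) / a <= 0.25)
    return hits / len(actual)


def objectives(model: Model, sizes: Sequence[float],
               efforts: Sequence[float]) -> Tuple[float, float]:
    """Score a model; PRED(25) is negated so both objectives are minimised."""
    preds = [model(s) for s in sizes]
    return (mmre(efforts, preds), -pred25(efforts, preds))


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """True if u is no worse than v on every objective and better on one."""
    return all(x <= y for x, y in zip(u, v)) and any(x < y for x, y in zip(u, v))


def pareto_front(scores: List[Tuple[float, float]]) -> List[int]:
    """Indices of score vectors that no other vector dominates."""
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]


def ensemble_predict(models: Sequence[Model], size: float) -> float:
    """Average the estimates of the given models."""
    return sum(m(size) for m in models) / len(models)


# Invented toy data: project sizes and actual efforts (assumption).
sizes = [10.0, 20.0, 40.0]
efforts = [100.0, 200.0, 400.0]

# Hypothetical candidates standing in for models produced by a search.
candidates: List[Model] = [
    lambda s: 12.0 * s,                           # 20% off on every project
    lambda s: 10.0 * s if s < 30 else 5.0 * s,    # exact twice, 50% off once
    lambda s: 13.0 * s,                           # 30% off on every project
]

scores = [objectives(m, sizes, efforts) for m in candidates]
front = pareto_front(scores)
front_models = [candidates[i] for i in front]

# Score the ensemble of non-dominated models on the same data.
ensemble_preds = [ensemble_predict(front_models, s) for s in sizes]
ensemble_mmre = mmre(efforts, ensemble_preds)
ensemble_pred25 = pred25(efforts, ensemble_preds)

print("non-dominated candidates:", front)
print("ensemble MMRE:", ensemble_mmre, "ensemble PRED(25):", ensemble_pred25)
```

In this toy setting the first two candidates trade off against each other (one is better on PRED(25), the other on MMRE), so neither dominates, while the third is dominated by both. Averaging the two survivors is one simple way to combine them; the paper's actual ensemble construction may differ.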

Published In

ACM Transactions on Software Engineering and Methodology  Volume 22, Issue 4
Testing, debugging, and error handling, formal methods, lifecycle concerns, evolution and maintenance
October 2013
387 pages
ISSN:1049-331X
EISSN:1557-7392
DOI:10.1145/2522920
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 October 2013
Accepted: 01 December 2012
Revised: 01 August 2012
Received: 01 November 2011
Published in TOSEM Volume 22, Issue 4

Author Tags

  1. Software effort estimation
  2. ensembles of learning machines
  3. multi-objective evolutionary algorithms

Qualifiers

  • Research-article
  • Research
  • Refereed

Article Metrics

  • Downloads (Last 12 months): 29
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 21 Nov 2024

Cited By

  • (2024) Multi-objective Feature Attribution Explanation For Explainable Machine Learning. ACM Transactions on Evolutionary Learning and Optimization 4:1 (1-32). DOI: 10.1145/3617380. Online publication date: 23-Feb-2024
  • (2024) Computational intelligence for estimating software development effort: a systematic mapping study. Iran Journal of Computer Science 7:3 (607-630). DOI: 10.1007/s42044-024-00178-9. Online publication date: 9-Apr-2024
  • (2024) Cost Adjustment for Software Crowdsourcing Tasks Using Ensemble Effort Estimation and Topic Modeling. Arabian Journal for Science and Engineering 49:9 (12693-12728). DOI: 10.1007/s13369-024-08746-8. Online publication date: 27-Feb-2024
  • (2023) Mitigating Unfairness via Evolutionary Multiobjective Ensemble Learning. IEEE Transactions on Evolutionary Computation 27:4 (848-862). DOI: 10.1109/TEVC.2022.3209544. Online publication date: 1-Aug-2023
  • (2023) A Two-Stage Algorithm Based on 12 Priority Rules for the Stochastic Distributed Resource-Constrained Multi-Project Scheduling Problem With Multi-Skilled Staff. IEEE Access 11 (29554-29565). DOI: 10.1109/ACCESS.2023.3261139. Online publication date: 2023
  • (2023) Heterogeneous Ensemble Model to Optimize Software Effort Estimation Accuracy. IEEE Access 11 (27759-27792). DOI: 10.1109/ACCESS.2023.3256533. Online publication date: 2023
  • (2023) Evaluating ensemble imputation in software effort estimation. Empirical Software Engineering 28:2. DOI: 10.1007/s10664-022-10260-0. Online publication date: 15-Mar-2023
  • (2023) Artificial Intelligence in Software Project Management. Optimising the Software Development Process with Artificial Intelligence (19-65). DOI: 10.1007/978-981-19-9948-2_2. Online publication date: 20-Jul-2023
  • (2022) Software Vulnerability Prediction Using Grey Wolf-Optimized Random Forest on the Unbalanced Data Sets. International Journal of Applied Metaheuristic Computing 13:1 (1-15). DOI: 10.4018/IJAMC.292508. Online publication date: 1-Jan-2022
  • (2022) Intelligent Adaptive Optimisation Method for Enhancement of Information Security in IoT-Enabled Environments. Sustainability 14:20 (13635). DOI: 10.3390/su142013635. Online publication date: 21-Oct-2022
