Software effort estimation as a multiobjective learning problem

Authors:

Leandro L. Minku,

Xin Yao

ACM Transactions on Software Engineering and Methodology (TOSEM), Volume 22, Issue 4

Article No.: 35, Pages 1 - 32

https://doi.org/10.1145/2522920.2522928

Published: 22 October 2013

Abstract

Ensembles of learning machines are promising for software effort estimation (SEE), but they need to be tailored to this task for their potential to be exploited. A key issue when creating ensembles is producing base models that are both diverse and accurate. If different performance measures behave sufficiently differently for SEE, they could serve as a natural way of creating SEE ensembles. We propose to view SEE model creation as a multiobjective learning problem. A multiobjective evolutionary algorithm (MOEA) is used to better understand the tradeoff among different performance measures by creating SEE models through the simultaneous optimisation of these measures. We show that the performance measures behave very differently, sometimes even presenting opposite trends. They are then used as a source of diversity for creating SEE ensembles. A good tradeoff among different measures can be obtained by using an ensemble of MOEA solutions. This ensemble performs similarly to or better than a model that does not consider these measures explicitly. In addition, the MOEA is flexible, allowing a particular measure to be emphasised if desired. In conclusion, the MOEA can be used to better understand the relationship among performance measures and has been shown to be very effective in creating SEE models.
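
To make the idea of treating performance measures as separate objectives more concrete, the following is a minimal sketch and not the authors' implementation. It does not run an evolutionary algorithm. It only shows two steps that the abstract implies: filtering candidate models down to those that are nondominated under several measures, and combining the survivors into an ensemble. Everything specific here is an assumption made for illustration: the abstract does not name the measures, so MMRE and PRED(25) are used as two common SEE measures; the three candidate models are represented only by made-up prediction lists; and the ensemble is formed by simple averaging.

```python
from typing import List, Sequence, Tuple


def mmre(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean magnitude of relative error (lower is better)."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def pred25(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Fraction of estimates within 25% of the actual effort (higher is better)."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) / a <= 0.25)
    return hits / len(actual)


def objectives(actual: Sequence[float], predicted: Sequence[float]) -> Tuple[float, float]:
    """Objective vector in which every entry is minimised, so PRED(25) is negated."""
    return (mmre(actual, predicted), -pred25(actual, predicted))


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """True if u is no worse than v on every objective and strictly better on one."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def pareto_front(scores: Sequence[Sequence[float]]) -> List[int]:
    """Indices of the score vectors that no other score vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def ensemble_predict(predictions: Sequence[Sequence[float]]) -> List[float]:
    """Average the members' estimates project by project (an assumed combination rule)."""
    n_projects = len(predictions[0])
    return [sum(p[k] for p in predictions) / len(predictions) for k in range(n_projects)]


# Hypothetical data: actual efforts for four projects and three candidate models.
actual = [100.0, 200.0, 400.0, 800.0]
model_a = [120.0, 240.0, 480.0, 960.0]    # every estimate 20% too high
model_b = [100.0, 200.0, 400.0, 1200.0]   # three exact estimates, one 50% too high
model_c = [150.0, 300.0, 600.0, 1200.0]   # every estimate 50% too high
candidates = [model_a, model_b, model_c]

scores = [objectives(actual, m) for m in candidates]
front = pareto_front(scores)
ensemble = ensemble_predict([candidates[i] for i in front])

if __name__ == "__main__":
    print("nondominated models:", front)
    print("ensemble estimates:", ensemble)
    print("ensemble objectives:", objectives(actual, ensemble))
```

In this toy data, model_a is better on PRED(25) while model_b is better on MMRE, so neither dominates the other and both survive the filter, whereas model_c is worse on both measures and is discarded. This is the kind of disagreement between measures that the abstract proposes to use as a source of ensemble diversity.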

Published In

ACM Transactions on Software Engineering and Methodology Volume 22, Issue 4

Testing, debugging, and error handling, formal methods, lifecycle concerns, evolution and maintenance

October 2013

387 pages

ISSN:1049-331X

EISSN:1557-7392

DOI:10.1145/2522920

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 October 2013

Accepted: 01 December 2012

Revised: 01 August 2012

Received: 01 November 2011

Qualifiers

Research-article
Research
Refereed

Funding Sources

Engineering and Physical Sciences Research Council
