DOI: 10.1109/PROMISE.2007.3

Column Pruning Beats Stratification in Effort Estimation

Published: 20 May 2007

Abstract

Local calibration combined with stratification, also known as row pruning, is a common technique used by cost estimation professionals to improve model performance. The results presented in this paper raise several serious questions about the benefits of row pruning for effort estimation and indicate a need to rethink standard practice. First, the mean improvement from row pruning appears to be small relative to the standard deviations seen in effort estimation data. Second, the advantages of row pruning, especially for deleting spurious outliers, can be achieved much more effectively with column pruning. Hence, we advise against row pruning and advocate column pruning instead.
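
To make the two operations in the abstract concrete, the following is a minimal sketch and not the paper's method. Row pruning keeps a subset of projects (rows), while column pruning keeps every project but discards attributes (columns). Everything here is an assumption made for illustration: the toy `projects` table, the attribute names `kind`, `size`, and `noise`, the 1-nearest-neighbour estimator, and the greedy backward elimination that stands in for whatever calibration and column selector the paper actually uses.

```python
# Illustrative sketch only. The data, attribute names, estimator, and
# selection strategy are hypothetical stand-ins, not taken from the paper.

def row_prune(rows, keep):
    """Row pruning (stratification): keep only the projects that satisfy `keep`."""
    return [r for r in rows if keep(r)]

def loo_mmre(rows, columns):
    """Leave-one-out mean magnitude of relative error of a 1-nearest-neighbour
    effort estimate that measures distance over `columns` only."""
    total = 0.0
    for i, target in enumerate(rows):
        others = rows[:i] + rows[i + 1:]
        nearest = min(
            others,
            key=lambda r: sum((r[c] - target[c]) ** 2 for c in columns),
        )
        total += abs(nearest["effort"] - target["effort"]) / target["effort"]
    return total / len(rows)

def greedy_column_pruning(rows, columns):
    """Column pruning by backward elimination: drop one column at a time
    for as long as doing so lowers the leave-one-out error."""
    selected = list(columns)
    best = loo_mmre(rows, selected)
    while len(selected) > 1:
        trials = [
            (loo_mmre(rows, [c for c in selected if c != drop]), drop)
            for drop in selected
        ]
        err, drop = min(trials)
        if err >= best:
            break
        selected.remove(drop)
        best = err
    return selected

# Hypothetical projects: effort tracks `size`; `noise` is a misleading attribute.
projects = [
    {"kind": "flight", "size": 10, "noise": 50, "effort": 20},
    {"kind": "flight", "size": 20, "noise": 10, "effort": 40},
    {"kind": "ground", "size": 11, "noise": 12, "effort": 22},
    {"kind": "ground", "size": 21, "noise": 48, "effort": 42},
]

flight_only = row_prune(projects, lambda r: r["kind"] == "flight")
kept_columns = greedy_column_pruning(projects, ["size", "noise"])
```

On this toy table, `flight_only` retains two of the four projects, while `kept_columns` retains all four projects and drops the misleading `noise` attribute. The example shows only the mechanics of each operation; it is not evidence for or against the paper's conclusion.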



Published In

PROMISE '07: Proceedings of the Third International Workshop on Predictor Models in Software Engineering
May 2007
104 pages
ISBN: 0769529542

Publisher

IEEE Computer Society, United States

Acceptance Rates

Overall Acceptance Rate 98 of 213 submissions, 46%
