DOI: 10.1145/3611643.3616328
research-article
Open access

Dynamic Prediction of Delays in Software Projects using Delay Patterns and Bayesian Modeling

Published: 30 November 2023

Abstract

Modern agile software projects are subject to constant change, making it essential to reassess overall delay risk throughout the project life cycle. Existing effort estimation models are static and cannot incorporate changes that occur during project execution. In this paper, we propose a dynamic model for continuously predicting overall delay using delay patterns and Bayesian modeling. The model incorporates the context of the project phase and learns from changes in team performance over time. We apply the approach to real-world data from 4,040 epics and 270 teams at ING. An empirical evaluation of our approach and a comparison with the state of the art demonstrate significant improvements in predictive accuracy. The dynamic model consistently outperforms static approaches and the state of the art, even during early project phases.
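
To make the idea of continuously revising a delay estimate concrete, here is a minimal sketch of sequential Bayesian updating. It is not the model proposed in the paper: it assumes a simple conjugate Beta-Binomial belief over the probability that a sprint is delayed, and the names `DelayBelief` and `track` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelayBelief:
    """Beta(alpha, beta) belief over the probability that a sprint is delayed.

    Illustrative assumption only; the paper's model is not reproduced here.
    """
    alpha: float = 1.0  # prior pseudo-count of delayed sprints
    beta: float = 1.0   # prior pseudo-count of on-time sprints

    def update(self, delayed: bool) -> "DelayBelief":
        # Conjugate update: each observed sprint adds one pseudo-count.
        if delayed:
            return DelayBelief(self.alpha + 1, self.beta)
        return DelayBelief(self.alpha, self.beta + 1)

    @property
    def mean(self) -> float:
        # Posterior mean of the delay probability.
        return self.alpha / (self.alpha + self.beta)


def track(observations):
    """Return the posterior mean after each observation, starting from the prior."""
    belief = DelayBelief()
    history = [belief.mean]
    for delayed in observations:
        belief = belief.update(delayed)
        history.append(belief.mean)
    return history


if __name__ == "__main__":
    # Two delayed sprints followed by one on-time sprint.
    print(track([True, True, False]))
```

Each new observation shifts the estimate, which is the sense in which a dynamic model can react to changes during execution where a static, one-off estimate cannot.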

Supplementary Material

Video (fse23main-p838-p-video.mp4)

    Published In

    ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
    November 2023
    2215 pages
    ISBN:9798400703270
    DOI:10.1145/3611643
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. agile methods
    2. bayesian modeling
    3. delay patterns
    4. delay prediction

    Qualifiers

    • Research-article

    Conference

    ESEC/FSE '23
    Acceptance Rates

    Overall Acceptance Rate 112 of 543 submissions, 21%
