Abstract
Natural language (NL) is arguably the most prevalent medium for expressing systems and software requirements. Detecting incompleteness in NL requirements is a major challenge. One approach to identify incompleteness is to compare requirements with external sources. Given the rise of large language models (LLMs), an interesting question arises: Are LLMs useful external sources of knowledge for detecting potential incompleteness in NL requirements? This article explores this question by utilizing BERT. Specifically, we employ BERT’s masked language model to generate contextualized predictions for filling masked slots in requirements. To simulate incompleteness, we withhold content from the requirements and assess BERT’s ability to predict terminology that is present in the withheld content but absent in the disclosed content. BERT can produce multiple predictions per mask. Our first contribution is determining the optimal number of predictions per mask, striking a balance between effectively identifying omissions in requirements and mitigating noise present in the predictions. Our second contribution involves designing a machine learning-based filter to post-process BERT’s predictions and further reduce noise. We conduct an empirical evaluation using 40 requirements specifications from the PURE dataset. Our findings indicate that: (1) BERT’s predictions effectively highlight terminology that is missing from requirements, (2) BERT outperforms simpler baselines in identifying relevant yet missing terminology, and (3) our filter reduces noise in the predictions, enhancing BERT’s effectiveness for completeness checking of requirements.
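The evaluation idea in the abstract (withhold part of a specification, then check whether the predictions surface terms that occur only in the withheld part) can be sketched in a few lines. This is a minimal illustration, not the article's implementation: the function names, the crude tokenizer, the two metric names, and the hard-coded prediction lists are all assumptions made for the example. In practice the ranked predictions would come from a masked language model such as BERT.

```python
import re
from typing import Dict, List, Set


def terms(text: str) -> Set[str]:
    """Crude tokenizer (assumption): lowercase alphabetic words only."""
    return set(re.findall(r"[a-z]+", text.lower()))


def evaluate_top_k(disclosed: str, withheld: str,
                   ranked_predictions: List[List[str]], k: int) -> Dict[str, float]:
    """Score the top-k predictions per mask against the withheld content.

    ranked_predictions holds one ranked list of candidate words per masked slot.
    """
    disclosed_terms = terms(disclosed)
    # Terms worth finding: present in the withheld part, absent from the disclosed part.
    target = terms(withheld) - disclosed_terms

    # Keep the k best predictions per mask; ignore words the reader already sees.
    novel: Set[str] = set()
    for ranked in ranked_predictions:
        for word in ranked[:k]:
            word = word.lower()
            if word not in disclosed_terms:
                novel.add(word)

    hits = novel & target
    return {
        # Share of novel predictions that really occur in the withheld content.
        "precision": len(hits) / len(novel) if novel else 0.0,
        # Share of withheld-only terms that the predictions recovered.
        "coverage": len(hits) / len(target) if target else 0.0,
    }


# Hypothetical data: two masked slots with three ranked predictions each.
disclosed = "The system shall log each user login."
withheld = "The administrator shall review the audit trail weekly."
predictions = [
    ["user", "administrator", "password"],
    ["event", "audit", "system"],
]

for k in (1, 2, 3):
    print(k, evaluate_top_k(disclosed, withheld, predictions, k))
```

Raising k from 2 to 3 in this toy example adds a prediction that matches nothing in the withheld text, so the share of useful predictions drops while coverage stays the same. That is the kind of trade-off the choice of predictions per mask has to balance.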
Acknowledgements
This work was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC) under the Discovery and Discovery Accelerator programs.
About this article
Cite this article
Luitel, D., Hassani, S. & Sabetzadeh, M. Improving requirements completeness: automated assistance through large language models. Requirements Eng 29, 73–95 (2024). https://doi.org/10.1007/s00766-024-00416-3
DOI: https://doi.org/10.1007/s00766-024-00416-3