Improving requirements completeness: automated assistance through large language models


Abstract

Natural language (NL) is arguably the most prevalent medium for expressing systems and software requirements. Detecting incompleteness in NL requirements is a major challenge. One approach to identify incompleteness is to compare requirements with external sources. Given the rise of large language models (LLMs), an interesting question arises: Are LLMs useful external sources of knowledge for detecting potential incompleteness in NL requirements? This article explores this question by utilizing BERT. Specifically, we employ BERT’s masked language model to generate contextualized predictions for filling masked slots in requirements. To simulate incompleteness, we withhold content from the requirements and assess BERT’s ability to predict terminology that is present in the withheld content but absent in the disclosed content. BERT can produce multiple predictions per mask. Our first contribution is determining the optimal number of predictions per mask, striking a balance between effectively identifying omissions in requirements and mitigating noise present in the predictions. Our second contribution involves designing a machine learning-based filter to post-process BERT’s predictions and further reduce noise. We conduct an empirical evaluation using 40 requirements specifications from the PURE dataset. Our findings indicate that: (1) BERT’s predictions effectively highlight terminology that is missing from requirements, (2) BERT outperforms simpler baselines in identifying relevant yet missing terminology, and (3) our filter reduces noise in the predictions, enhancing BERT’s effectiveness for completeness checking of requirements.
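The evaluation idea described in the abstract, checking whether predicted terms occur in the withheld content but not in the disclosed content, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the regex tokenizer, the example requirements, and the hard-coded prediction list (standing in for BERT's masked-language-model output) are all assumptions made for illustration.

```python
import re

def vocabulary(text):
    """Lowercased set of alphabetic tokens; a crude stand-in for real NLP preprocessing."""
    return set(re.findall(r"[a-z]+", text.lower()))

def score_predictions(disclosed, withheld, predictions):
    """Split predicted terms into useful hits and noise.

    A hit is a predicted term that occurs in the withheld text but not in the
    disclosed text, i.e. terminology a reader of the disclosed part never saw.
    """
    seen = vocabulary(disclosed)
    hidden_only = vocabulary(withheld) - seen
    # Keep only predictions that introduce terms absent from the disclosed text.
    novel = {p.lower() for p in predictions} - seen
    hits = novel & hidden_only
    noise = novel - hidden_only
    return hits, noise

# Hypothetical requirements: one half disclosed, the other half withheld.
disclosed = "The system shall log every user login."
withheld = "The system shall encrypt each password before storage."

# Hypothetical model output for masked slots in the disclosed text.
predictions = ["password", "user", "encrypt", "banana"]

hits, noise = score_predictions(disclosed, withheld, predictions)
print(sorted(hits), sorted(noise))
```

In this toy setup, raising the number of predictions per mask would tend to enlarge both sets, which is the trade-off between finding omissions and limiting noise that the abstract refers to.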



Acknowledgements

This work was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC) under the Discovery and Discovery Accelerator programs.

Author information

Authors and Affiliations

University of Ottawa, 800 King Edward Avenue, Ottawa, ON, K1N 6N5, Canada
Dipeeka Luitel, Shabnam Hassani & Mehrdad Sabetzadeh

Authors

Dipeeka Luitel
Shabnam Hassani
Mehrdad Sabetzadeh

Corresponding author

Correspondence to Dipeeka Luitel.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix

A pseudo-code for baselines

See Figs. 10, 11 and 12.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Luitel, D., Hassani, S. & Sabetzadeh, M. Improving requirements completeness: automated assistance through large language models. Requirements Eng 29, 73–95 (2024). https://doi.org/10.1007/s00766-024-00416-3


Received: 06 July 2023
Accepted: 12 February 2024
Published: 25 March 2024
Issue Date: March 2024
DOI: https://doi.org/10.1007/s00766-024-00416-3