Can AI serve as a substitute for human subjects in software engineering research?

Published in: Automated Software Engineering

Abstract

Research within sociotechnical domains, such as software engineering, fundamentally requires the human perspective. Nevertheless, traditional qualitative data collection methods are hampered by difficulties in participant recruitment, limited scalability, and labor intensity. This vision paper proposes a novel approach to qualitative data collection in software engineering research by harnessing the capabilities of artificial intelligence (AI), especially large language models (LLMs) such as ChatGPT and multimodal foundation models. We explore the potential of AI-generated synthetic text as an alternative source of qualitative data, discussing how LLMs can replicate human responses and behaviors in research settings, and we examine AI applications in emulating humans in interviews, focus groups, surveys, observational studies, and user evaluations. We then outline open problems and research opportunities for realizing this vision. In the future, an integrated approach in which AI-generated and human-generated data coexist will likely yield the most effective outcomes.
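The persona-based emulation of survey respondents described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the `build_persona_prompt` helper, the persona fields, and the `ask_llm` placeholder are all hypothetical names introduced here; any chat-completion API could play the role of `ask_llm`.

```python
# Sketch: conditioning an LLM on a persona so it can stand in for a
# survey respondent. Persona fields and the question are illustrative.

def build_persona_prompt(persona: dict, question: str) -> str:
    """Compose a prompt asking the model to answer as the given persona."""
    traits = ", ".join(f"{k}: {v}" for k, v in persona.items())
    return (
        f"You are a survey respondent with this profile: {traits}. "
        f"Answer in first person, staying in character.\n\n"
        f"Question: {question}"
    )

# A small, hypothetical pool of developer personas.
personas = [
    {"role": "open source maintainer", "experience_years": 12, "region": "Brazil"},
    {"role": "junior backend developer", "experience_years": 2, "region": "Germany"},
]

question = "What motivates you to contribute to open source projects?"
prompts = [build_persona_prompt(p, question) for p in personas]

# In an actual study, each prompt would be sent to an LLM, e.g.:
#   responses = [ask_llm(prompt) for prompt in prompts]  # ask_llm is a placeholder
for prompt in prompts:
    print(prompt)
```

In practice, the sampled persona attributes would be chosen to mirror the demographics of the target developer population, which is what lets the synthetic sample approximate a human one.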



Acknowledgements

This work was partially supported by NSF Grants 2236198, 2235601, 2247929, 2303043, and 2303042. ChatGPT v4 was used to copy-edit this article.

Author information


Contributions

MG wrote the main manuscript text. IS and AS ideated about the paper and helped with copy editing. BT participated in writing the paper and creating the reference list.

Corresponding author

Correspondence to Marco Gerosa.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Gerosa, M., Trinkenreich, B., Steinmacher, I. et al. Can AI serve as a substitute for human subjects in software engineering research?. Autom Softw Eng 31, 13 (2024). https://doi.org/10.1007/s10515-023-00409-6
