DOI: 10.1145/3581754.3584136 · IUI Conference Proceedings · Poster

Supporting Qualitative Analysis with Large Language Models: Combining Codebook with GPT-3 for Deductive Coding

Published: 27 March 2023

Abstract

Qualitative analysis of textual content unpacks rich and valuable information by assigning labels to the data. However, this process is often labor-intensive, particularly when working with large datasets. While recent AI-based tools have demonstrated utility, researchers may not have AI resources and expertise readily available, and task-specific models offer limited generalizability. In this study, we explored the use of large language models (LLMs) to support deductive coding, a major category of qualitative analysis in which researchers apply a pre-determined codebook to label data with a fixed set of codes. Instead of training task-specific models, a pre-trained LLM can be applied directly to various tasks through prompt learning, without fine-tuning. Using a curiosity-driven question coding task as a case study, we found that combining GPT-3 with expert-drafted codebooks achieved fair to substantial agreement with expert-coded results. We lay out challenges and opportunities in using LLMs to support qualitative coding and beyond.
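The workflow the abstract describes can be sketched in two steps: embed the codebook's codes and definitions in a prompt alongside the item to be labeled, then compare the model's labels against expert labels with a chance-corrected agreement statistic such as Cohen's kappa. A minimal sketch follows; the example codebook, the prompt template, and the helper names are illustrative assumptions, not the paper's actual materials or prompts.

```python
# Sketch: codebook-guided deductive coding with an LLM prompt, plus
# Cohen's kappa for agreement with expert labels. The codebook entries
# below are hypothetical placeholders, not the paper's codebook.

CODEBOOK = {
    "factual": "Question asks for a verifiable fact.",
    "conceptual": "Question probes relationships between ideas.",
    "procedural": "Question asks how to do something.",
}

def build_prompt(question: str) -> str:
    """Place the codebook definitions and the item to label in one prompt."""
    lines = ["Assign exactly one code to the question below.", "Codes:"]
    for code, definition in CODEBOOK.items():
        lines.append(f"- {code}: {definition}")
    lines.append(f"Question: {question}")
    lines.append("Code:")
    return "\n".join(lines)

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two coders over the same items."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    codes = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in codes
    )
    return (observed - expected) / (1 - expected)
```

The prompt string would be sent to the LLM (the API call is omitted here), and kappa is then computed between the model's codes and the experts' codes; the abstract's "fair to substantial" phrasing follows conventional kappa interpretation bands (cf. McHugh [9]).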

References

[1]
Rania Abdelghani, Pierre-Yves Oudeyer, Edith Law, Catherine de Vulpillieres, and Hélène Sauzéon. 2022. Conversational agents for fostering curiosity-driven learning in children. arXiv preprint arXiv:2204.03546 (2022).
[2]
Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language Models are Few-Shot Learners. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (Eds.). Vol. 33. Curran Associates, Inc., 1877–1901. https://proceedings.neurips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf
[3]
Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, et al. 2022. PaLM: Scaling language modeling with pathways. arXiv preprint arXiv:2204.02311 (2022).
[4]
Tianyu Gao, Adam Fisch, and Danqi Chen. 2021. Making Pre-trained Language Models Better Few-shot Learners. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, Online, 3816–3830. https://doi.org/10.18653/v1/2021.acl-long.295
[5]
Hsiu-Fang Hsieh and Sarah E Shannon. 2005. Three approaches to qualitative content analysis. Qualitative health research 15, 9 (2005), 1277–1288.
[6]
Diane M Korngiebel and Sean D Mooney. 2021. Considering the possibilities and pitfalls of Generative Pre-trained Transformer 3 (GPT-3) in healthcare delivery. NPJ Digital Medicine 4, 1 (2021), 1–3.
[7]
Jasy Suet Yan Liew, Nancy McCracken, Shichun Zhou, and Kevin Crowston. 2014. Optimizing features in active machine learning for complex qualitative content analysis. In Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science. 44–48.
[8]
Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, and Graham Neubig. 2021. Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. arXiv preprint arXiv:2107.13586 (2021).
[9]
Mary L McHugh. 2012. Interrater reliability: the kappa statistic. Biochemia medica 22, 3 (2012), 276–282.
[10]
Michael Muller, Shion Guha, Eric PS Baumer, David Mimno, and N Sadat Shami. 2016. Machine learning and grounded theory method: convergence, divergence, and combination. In Proceedings of the 19th international conference on supporting group work. 3–8.
[11]
Pablo Paredes, Ana Rufino Ferreira, Cory Schillaci, Gene Yoo, Pierre Karashchuk, Dennis Xing, Coye Cheshire, and John Canny. 2017. Inquire: Large-scale early insight discovery for qualitative research. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. 1562–1575.
[12]
Tim Rietz and Alexander Maedche. 2021. Cody: An AI-Based System to Semi-Automate Coding for Qualitative Research. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 394, 14 pages. https://doi.org/10.1145/3411764.3445591
[13]
William W Wilen. 1991. Questioning skills, for teachers. What research says to the teacher. (1991).
[14]
Ann Yuan, Andy Coenen, Emily Reif, and Daphne Ippolito. 2022. Wordcraft: Story Writing With Large Language Models. In 27th International Conference on Intelligent User Interfaces. 841–852.
[15]
Xingdi Yuan, Tong Wang, Yen-Hsiang Wang, Emery Fine, Rania Abdelghani, Pauline Lucas, Hélène Sauzéon, and Pierre-Yves Oudeyer. 2022. Selecting Better Samples from Pre-trained LLMs: A Case Study on Question Generation. arXiv preprint arXiv:2209.11000 (2022).
[16]
Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, Mona Diab, Xian Li, Xi Victoria Lin, et al. 2022. OPT: Open pre-trained transformer language models. arXiv preprint arXiv:2205.01068 (2022).




    Published In

    IUI '23 Companion: Companion Proceedings of the 28th International Conference on Intelligent User Interfaces
    March 2023
    266 pages
    ISBN:9798400701078
    DOI:10.1145/3581754
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 27 March 2023


    Author Tags

    1. Deductive Coding
    2. GPT-3
    3. Large Language Model
    4. Qualitative Analysis

    Qualifiers

    • Poster
    • Research
    • Refereed limited

    Conference

    IUI '23

    Acceptance Rates

    Overall Acceptance Rate 746 of 2,811 submissions, 27%



    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)1,260
    • Downloads (Last 6 weeks)187
    Reflects downloads up to 15 Jan 2025


    Citations

    Cited By

    • (2025)Differences in User Perception of Artificial Intelligence-Driven Chatbots and Traditional Tools in Qualitative Data AnalysisApplied Sciences10.3390/app1502063115:2(631)Online publication date: 10-Jan-2025
    • (2025)Towards an understanding of large language models in software engineering tasksEmpirical Software Engineering10.1007/s10664-024-10602-030:2Online publication date: 1-Mar-2025
    • (2024)Classifying Unstructured Text in Electronic Health Records for Mental Health Prediction Models using a Large Language Model (Preprint)JMIR Medical Informatics10.2196/65454Online publication date: 15-Aug-2024
    • (2024)Evaluating the Influence of Role-Playing Prompts on ChatGPT’s Misinformation Detection Accuracy: Quantitative StudyJMIR Infodemiology10.2196/606784(e60678)Online publication date: 26-Sep-2024
    • (2024)Harnessing ChatGPT for Thematic Analysis: Are We Ready?Journal of Medical Internet Research10.2196/5497426(e54974)Online publication date: 31-May-2024
    • (2024)Comparing the Efficacy and Efficiency of Human and Generative AI: Qualitative Thematic AnalysesJMIR AI10.2196/544823(e54482)Online publication date: 2-Aug-2024
    • (2024)GPT-4 as an X data annotator: Unraveling its performance on a stance classification taskPLOS ONE10.1371/journal.pone.030774119:8(e0307741)Online publication date: 15-Aug-2024
    • (2024)Multimedia design for learner interest and achievement: a visual guide to pharmacologyBMC Medical Education10.1186/s12909-024-05077-y24:1Online publication date: 5-Feb-2024
    • (2024)Using Generative Text Models to Create Qualitative Codebooks for Student Evaluations of TeachingInternational Journal of Qualitative Methods10.1177/1609406924129328323Online publication date: 14-Nov-2024
    • (2024)An Examination of the Use of Large Language Models to Aid Analysis of Textual DataInternational Journal of Qualitative Methods10.1177/1609406924123116823Online publication date: 13-Feb-2024
