DOI: 10.1145/3173574.3173922
research-article

Semi-Automated Coding for Qualitative Research: A User-Centered Inquiry and Initial Prototypes

Published: 21 April 2018

Abstract

Qualitative researchers perform an important and painstaking data annotation process known as coding. However, much of the process can be tedious and repetitive, becoming prohibitive for large datasets. Could coding be partially automated, and should it be? To answer this question, we interviewed researchers and observed them code interview transcripts. We found that across disciplines, researchers follow several coding practices well-suited to automation. Further, researchers desire automation after having developed a codebook and coded a subset of data, particularly in extending their coding to unseen data. Researchers also require any assistive tool to be transparent about its recommendations. Based on our findings, we built prototypes to partially automate coding using simple natural language processing techniques. Our top-performing system generates coding that matches human coders on inter-rater reliability measures. We discuss implications for interface and algorithm design, meta-issues around automating qualitative research, and suggestions for future work.
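
The abstract reports agreement with human coders on inter-rater reliability measures without naming a specific one. As a minimal sketch, assuming Cohen's kappa (one common measure for two coders) and hypothetical code labels, the comparison between a human coder and an automated coder could be computed as follows. The function name and the sample labels are illustrative and do not come from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who each assigned one code per item."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)

    # Observed agreement: fraction of items where both coders chose the same code.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each code, the product of the two coders' marginal rates.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    p_chance = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)

    if p_chance == 1.0:
        # Both coders used a single identical code everywhere; treat as full agreement.
        return 1.0
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical example: codes assigned to four transcript excerpts.
human = ["barrier", "barrier", "benefit", "benefit"]
machine = ["barrier", "barrier", "benefit", "barrier"]
print(cohen_kappa(human, machine))
```

Kappa corrects raw percent agreement for the agreement expected by chance, so a value near 1 indicates strong agreement and a value near 0 indicates agreement no better than chance.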




    Information & Contributors

    Information

    Published In

    CHI '18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems
    April 2018
    8489 pages
    ISBN: 9781450356206
    DOI: 10.1145/3173574
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Badges

    • Best Paper

    Author Tags

    1. natural language processing
    2. qualitative coding
    3. qualitative data analysis
    4. qualitative research
    5. user-centered design

    Qualifiers

    • Research-article

    Conference

    CHI '18

    Acceptance Rates

    CHI '18 Paper Acceptance Rate 666 of 2,590 submissions, 26%;
    Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 315
    • Downloads (Last 6 weeks): 26
    Reflects downloads up to 01 Oct 2024

    Citations

    Cited By

    • (2024) Exploring Sustainability and Efficiency Improvements in Healthcare: A Qualitative Study. Sustainability 16:19 (8306). DOI: 10.3390/su16198306. Online publication date: 24-Sep-2024.
    • (2024) Understanding Human-AI Workflows for Generating Personas. Proceedings of the 2024 ACM Designing Interactive Systems Conference (757-781). DOI: 10.1145/3643834.3660729. Online publication date: 1-Jul-2024.
    • (2024) SenseMate: An Accessible and Beginner-Friendly Human-AI Platform for Qualitative Data Analysis. Proceedings of the 29th International Conference on Intelligent User Interfaces (922-939). DOI: 10.1145/3640543.3645194. Online publication date: 18-Mar-2024.
    • (2024) Annota: Peer-based AI Hints Towards Learning Qualitative Coding at Scale. Proceedings of the 29th International Conference on Intelligent User Interfaces (455-470). DOI: 10.1145/3640543.3645168. Online publication date: 18-Mar-2024.
    • (2024) Prompt-based and Fine-tuned GPT Models for Context-Dependent and -Independent Deductive Coding in Social Annotation. Proceedings of the 14th Learning Analytics and Knowledge Conference (518-528). DOI: 10.1145/3636555.3636910. Online publication date: 18-Mar-2024.
    • (2024) Human-AI Collaboration in Thematic Analysis using ChatGPT: A User Study and Design Recommendations. Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems (1-7). DOI: 10.1145/3613905.3650732. Online publication date: 11-May-2024.
    • (2024) Bridging the Integrity Gap: Towards AI-assisted Design Research. Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems (1-5). DOI: 10.1145/3613905.3647962. Online publication date: 11-May-2024.
    • (2024) CollabCoder: A Lower-barrier, Rigorous Workflow for Inductive Collaborative Qualitative Analysis with Large Language Models. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-29). DOI: 10.1145/3613904.3642002. Online publication date: 11-May-2024.
    • (2024) When and How to Use AI in the Design Process? Implications for Human-AI Design Collaboration. International Journal of Human–Computer Interaction (1-16). DOI: 10.1080/10447318.2024.2353451. Online publication date: 22-May-2024.
    • (2024) The Use of Digital Tools and Emerging Technologies in Qualitative Research—A Systematic Review of Literature. Computer Supported Qualitative Research (257-269). DOI: 10.1007/978-3-031-65735-1_16. Online publication date: 2-Oct-2024.
