Towards Efficient Annotations for a Human-AI Collaborative, Clinical Decision Support System: A Case Study on Physical Stroke Rehabilitation Assessment

Authors:

Daniel P. Siewiorek,

Asim Smailagic,

Alexandre Bernardino,

Sergi Bermúdez i Badia

IUI '22: Proceedings of the 27th International Conference on Intelligent User Interfaces

Pages 4 - 14

https://doi.org/10.1145/3490099.3511112

Published: 22 March 2022

Abstract

Artificial intelligence (AI) and machine learning (ML) algorithms are increasingly being explored to support decision-making tasks in health (e.g., rehabilitation assessment). However, developing such AI/ML-based decision support systems is challenging because collecting an annotated dataset is expensive. In this paper, we describe the development process of a human-AI collaborative, clinical decision support system that augments an ML model with a rule-based (RB) model from domain experts. We evaluated the system empirically in the context of assessing physical stroke rehabilitation, using a dataset of three exercises from 15 post-stroke survivors and therapists. Our results bring new insights into the efficient development and annotation of a decision support system: when an annotated dataset is not initially available, the RB model can be used to assess a post-stroke survivor's quality of motion and to identify samples with low confidence scores, which supports efficient annotation for training an ML model. Specifically, our system requires only 22-33% of the annotations from therapists to train an ML model that performs as well as an ML model trained on all annotations from a therapist. Our work discusses the value of a human-AI collaborative approach for effectively collecting an annotated dataset and supporting a complex decision-making task.
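The annotation strategy summarized in the abstract (score every sample with the rule-based model, then ask therapists to label only the samples it is least confident about) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the single normalized motion feature, the 0.5 threshold, and the distance-from-threshold confidence formula are all hypothetical assumptions chosen for illustration; only the general idea of querying low-confidence samples within a 22-33% annotation budget comes from the abstract.

```python
from typing import Callable, List, Sequence


def rule_confidence(feature: float, threshold: float = 0.5) -> float:
    """Hypothetical rule-based confidence for one sample.

    Assumes a single motion feature normalized to [0, 1] and a rule of the
    form "feature >= threshold means correct motion". Confidence grows with
    the distance from the threshold (an assumption, not the paper's formula).
    """
    return min(1.0, abs(feature - threshold) * 2.0)


def select_for_annotation(
    samples: Sequence[float],
    confidence_fn: Callable[[float], float],
    budget_fraction: float,
) -> List[int]:
    """Return indices of the lowest-confidence samples within the budget."""
    if not 0.0 <= budget_fraction <= 1.0:
        raise ValueError("budget_fraction must be between 0 and 1")
    n_query = round(len(samples) * budget_fraction)
    # Rank sample indices from least to most confident under the rule.
    ranked = sorted(range(len(samples)), key=lambda i: confidence_fn(samples[i]))
    return sorted(ranked[:n_query])


# Toy, made-up feature values for nine exercise trials.
toy_samples = [0.05, 0.48, 0.90, 0.55, 0.30, 0.52, 0.95, 0.10, 0.70]

# With a 33% budget, three of the nine trials are sent to a therapist.
to_annotate = select_for_annotation(toy_samples, rule_confidence, 0.33)
print(to_annotate)
```

In this sketch the trials closest to the rule's threshold are selected first, and the remaining trials keep the rule-based assessment until enough labels exist to train an ML model.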

Index Terms

1. Information systems

Published In

IUI '22: Proceedings of the 27th International Conference on Intelligent User Interfaces

March 2022

888 pages

ISBN:9781450391443

DOI:10.1145/3490099

Copyright © 2022 Owner/Author.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

LISBOA 2020 and the FCT
Singapore Ministry of Education (MOE)
National Science Foundation
FCT

Conference

IUI '22

IUI '22: 27th International Conference on Intelligent User Interfaces

March 22 - 25, 2022

Helsinki, Finland

Acceptance Rates

Overall Acceptance Rate 746 of 2,811 submissions, 27%

Bibliometrics & Citations

Total citations: 5
Total downloads: 1,122
Downloads (last 12 months): 434
Downloads (last 6 weeks): 53

Reflects downloads up to 13 Nov 2024

Cited By

Anderson A, Guevara J, Moussaoui F, Li T, Vorvoreanu M, Burnett M. (2024). Measuring User Experience Inclusivity in Human-AI Interaction via Five User Problem-Solving Styles. ACM Transactions on Interactive Intelligent Systems 14:3 (1-90). Online publication date: 24-Sep-2024.
https://dl.acm.org/doi/10.1145/3663740
Belli O, de Castro H. (2024). A new metric for reliable diagnosis of rotating machines applied to a multi-fault rotor using Bayesian neural networks. Journal of the Brazilian Society of Mechanical Sciences and Engineering 46:11. Online publication date: 14-Oct-2024.
https://doi.org/10.1007/s40430-024-05222-0
Lee M, Chew C. (2023). Understanding the Effect of Counterfactual Explanations on Trust and Reliance on AI for Human-AI Collaborative Clinical Decision Making. Proceedings of the ACM on Human-Computer Interaction 7:CSCW2 (1-22). Online publication date: 4-Oct-2023.
https://dl.acm.org/doi/10.1145/3610218
Capel T, Brereton M. (2023). What is Human-Centered about Human-Centered AI? A Map of the Research Landscape. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (1-23). Online publication date: 19-Apr-2023.
https://dl.acm.org/doi/10.1145/3544548.3580959
Uymaz P, Uymaz A, Akgül Y. (2023). Assessing the Behavioral Intention of Individuals to Use an AI Doctor at the Primary, Secondary, and Tertiary Care Levels. International Journal of Human–Computer Interaction 40:18 (5229-5246). Online publication date: 17-Jul-2023.
https://doi.org/10.1080/10447318.2023.2233126
