DOI: 10.1145/3368555.3384460
ACM CHIL conference proceedings · Research article · Open access

Explaining an increase in predicted risk for clinical alerts

Published: 02 April 2020

Abstract

Much work aims to explain a model's prediction on a static input. We consider explanations in a temporal setting where a stateful dynamical model produces a sequence of risk estimates given an input at each time step. When the estimated risk increases, the goal of the explanation is to attribute the increase to a few relevant inputs from the past.
While our formal setup and techniques are general, we carry out an in-depth case study in a clinical setting. The goal here is to alert a clinician when a patient's risk of deterioration rises. The clinician then has to decide whether to intervene and adjust the treatment. Given a potentially long sequence of new events since she last saw the patient, a concise explanation helps her to quickly triage the alert.
We develop methods to lift static attribution techniques to the dynamical setting, where we identify and address challenges specific to dynamics. We then experimentally assess the utility of different explanations of clinical alerts through expert evaluation.
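The lifting idea described above can be illustrated with a minimal occlusion-style sketch. Everything below (the event names, weights, and toy sigmoid model) is hypothetical and stands in for a learned stateful risk model; this is not the paper's actual method, only an illustration of attributing a risk increase to a few recent inputs: each new event since the last visit is scored by how much the current risk drops when that single event is removed.

```python
import math

# Hypothetical event weights standing in for a learned risk model.
WEIGHTS = {"creatinine_up": 0.9, "hr_up": 0.4, "note_entered": 0.05}

def risk(events):
    """Toy risk score in [0, 1]: a weighted sum of events through a sigmoid."""
    score = sum(WEIGHTS.get(e, 0.0) for e in events)
    return 1.0 / (1.0 + math.exp(-(score - 1.0)))

def attribute_increase(past_events, new_events):
    """Occlusion-style temporal attribution: score each new event by how
    much the current risk drops when that event alone is removed.
    Returns the total risk increase and a per-event contribution map."""
    baseline = risk(past_events)
    current = risk(past_events + new_events)
    contributions = {}
    for i, e in enumerate(new_events):
        ablated = new_events[:i] + new_events[i + 1:]  # drop one event
        contributions[e] = current - risk(past_events + ablated)
    return current - baseline, contributions

# A patient's risk rises after two new events arrive since the last visit.
increase, contrib = attribute_increase(
    past_events=["hr_up"],
    new_events=["creatinine_up", "note_entered"],
)
# Ranking new events by contribution yields the concise explanation.
ranked = sorted(contrib, key=contrib.get, reverse=True)
```

Events with the largest contributions form the candidate explanation shown to the clinician. The paper's actual techniques lift gradient-based static attribution methods and must cope with the model's hidden state, which this stateless toy ignores.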





    Published In

    CHIL '20: Proceedings of the ACM Conference on Health, Inference, and Learning
    April 2020, 265 pages
    ISBN: 9781450370462
    DOI: 10.1145/3368555

    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. dynamics
    2. interpretability
    3. machine learning


    Conference

    ACM CHIL '20

    Acceptance Rates

    Overall acceptance rate: 27 of 110 submissions (25%)


    Article Metrics

    • Downloads (last 12 months): 163
    • Downloads (last 6 weeks): 7

    Reflects downloads up to 01 Dec 2024.

    Cited By

    • (2024) Graph and Structured Data Algorithms in Electronic Health Records: A Scoping Review. Metadata and Semantic Research, pp. 61–73. DOI: 10.1007/978-3-031-65990-4_6. Online publication date: 31 Jul 2024.
    • (2022) Generative time series forecasting with diffusion, denoise, and disentanglement. Proceedings of the 36th International Conference on Neural Information Processing Systems, pp. 23009–23022. DOI: 10.5555/3600270.3601942. Online publication date: 28 Nov 2022.
    • (2022) Interpretability and fairness evaluation of deep learning models on MIMIC-IV dataset. Scientific Reports, 12(1). DOI: 10.1038/s41598-022-11012-2. Online publication date: 3 May 2022.
    • (2021) Prediction of In-Hospital Cardiac Arrest Using Shallow and Deep Learning. Diagnostics, 11(7):1255. DOI: 10.3390/diagnostics11071255. Online publication date: 13 Jul 2021.
    • (2021) Concurrent imputation and prediction on EHR data using bi-directional GANs. Proceedings of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, pp. 1–9. DOI: 10.1145/3459930.3469512. Online publication date: 1 Aug 2021.
    • (2020) Benchmarking deep learning interpretability in time series predictions. Proceedings of the 34th International Conference on Neural Information Processing Systems, pp. 6441–6452. DOI: 10.5555/3495724.3496264. Online publication date: 6 Dec 2020.