DOI: 10.1145/3442381.3449855
research-article

Distilling Knowledge from Publicly Available Online EMR Data to Emerging Epidemic for Prognosis

Published: 03 June 2021

Abstract

Due to the characteristics of COVID-19, the epidemic develops rapidly and overwhelms health service systems worldwide. Many patients suffer from life-threatening systemic problems and need to be carefully monitored in ICUs. Intelligent prognosis can help physicians intervene early, prevent adverse outcomes, and optimize medical resource allocation, all of which are urgently needed in the ongoing global pandemic. However, in the early stage of an epidemic outbreak, the data available for analysis is limited due to the lack of effective diagnostic mechanisms, the rarity of cases, and privacy concerns. In this paper, we propose a distilled transfer learning framework that leverages existing publicly available online Electronic Medical Records (EMR) to enhance prognosis for inpatients with emerging infectious diseases. It learns to embed COVID-19-related medical features from massive existing EMR data. The transferred parameters are then trained by distillation to imitate the representation of the teacher model, which embeds health status more comprehensively on the source dataset. We conduct Length-of-Stay prediction experiments for patients in ICUs on real-world COVID-19 datasets. The experimental results indicate that our proposed model consistently outperforms competitive baseline methods. To further verify that the proposed framework scales to different clinical tasks on different EMR datasets, we conduct an additional mortality prediction experiment on End-Stage Renal Disease datasets. The extensive experiments demonstrate that the framework can benefit prognosis for emerging pandemics and other diseases with limited EMR data.
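
As a rough illustration of the kind of training objective the abstract describes, the sketch below combines a representation-imitation term (student embedding versus teacher embedding) with a Length-of-Stay regression term. The function names, the use of mean squared error for both terms, and the weighting parameter `alpha` are all assumptions made for this example; the abstract does not specify the actual loss.

```python
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distilled_transfer_loss(
    student_repr: Sequence[float],
    teacher_repr: Sequence[float],
    predicted_los: Sequence[float],
    true_los: Sequence[float],
    alpha: float = 0.5,  # hypothetical weight between the two terms
) -> float:
    """Weighted sum of an imitation term and a task term.

    imitation: how far the student's health-status embedding is from
               the teacher's embedding (assumed MSE).
    task:      Length-of-Stay regression error on the target data
               (assumed MSE).
    """
    imitation = mse(student_repr, teacher_repr)
    task = mse(predicted_los, true_los)
    return alpha * imitation + (1.0 - alpha) * task


if __name__ == "__main__":
    # Toy values only; not data from the paper.
    loss = distilled_transfer_loss(
        student_repr=[1.0, 2.0],
        teacher_repr=[1.0, 4.0],
        predicted_los=[3.0],
        true_los=[5.0],
        alpha=0.5,
    )
    print(loss)
```

With `alpha` close to 1 the student mostly copies the teacher's representation; with `alpha` close to 0 it mostly fits the target-task labels. How the two are actually balanced in the paper is not stated in the abstract.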

    Published In

    WWW '21: Proceedings of the Web Conference 2021
    April 2021
    4054 pages
    ISBN:9781450383127
    DOI:10.1145/3442381
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Electronic Medical Record
    2. Healthcare Informatics
    3. Prognosis
    4. Transfer Learning

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    WWW '21: The Web Conference 2021
    April 19 - 23, 2021
    Ljubljana, Slovenia

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Article Metrics

    • Downloads (Last 12 months): 80
    • Downloads (Last 6 weeks): 7
    Reflects downloads up to 21 Nov 2024

    Cited By

    • (2024) Safety and Performance, Why Not Both? Bi-Objective Optimized Model Compression Against Heterogeneous Attacks Toward AI Software Deployment. IEEE Transactions on Software Engineering 50:3 (376-390). DOI: 10.1109/TSE.2023.3348515. Online publication date: Mar-2024
    • (2023) Advances in the Development of Representation Learning and Its Innovations against COVID-19. COVID 3:9 (1389-1415). DOI: 10.3390/covid3090096. Online publication date: 13-Sep-2023
    • (2023) VecoCare. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (4921-4929). DOI: 10.24963/ijcai.2023/547. Online publication date: 19-Aug-2023
    • (2023) Quality-Guaranteed and Cost-Effective Population Health Profiling: A Deep Active Learning Approach. ACM Transactions on Computing for Healthcare 4:4 (1-19). DOI: 10.1145/3617179. Online publication date: 13-Oct-2023
    • (2023) Vertical Federated Knowledge Transfer via Representation Distillation for Healthcare Collaboration Networks. Proceedings of the ACM Web Conference 2023 (4188-4199). DOI: 10.1145/3543507.3583874. Online publication date: 30-Apr-2023
    • (2023) SeqCare: Sequential Training with External Medical Knowledge Graph for Diagnosis Prediction in Healthcare Data. Proceedings of the ACM Web Conference 2023 (2819-2830). DOI: 10.1145/3543507.3583543. Online publication date: 30-Apr-2023
    • (2023) Cross-Hospital Sepsis Early Detection via Semi-Supervised Optimal Transport With Self-Paced Ensemble. IEEE Journal of Biomedical and Health Informatics 27:6 (3049-3060). DOI: 10.1109/JBHI.2023.3253208. Online publication date: Jun-2023
    • (2023) A Survey on Application of Knowledge Distillation in Healthcare Domain. 2023 7th International Conference on Intelligent Computing and Control Systems (ICICCS) (762-768). DOI: 10.1109/ICICCS56967.2023.10142871. Online publication date: 17-May-2023
    • (2023) Attention-based multimodal fusion with contrast for robust clinical prediction in the face of missing modalities. Journal of Biomedical Informatics 145:C. DOI: 10.1016/j.jbi.2023.104466. Online publication date: 1-Sep-2023
    • (2023) Pre-training in Medical Data: A Survey. Machine Intelligence Research 20:2 (147-179). DOI: 10.1007/s11633-022-1382-8. Online publication date: 21-Feb-2023
