Abstract
Predicting the risk of in-hospital mortality from electronic health records (EHRs) has received considerable attention. Such predictions can give healthcare professionals early warning of a patient’s health condition so that timely interventions can be taken. The task is challenging because EHR data are intrinsically irregular, with not only many missing values but also varying time intervals between medical records. Existing approaches focus on exploiting the variable correlations in patient medical records to impute missing values and on establishing time-decay mechanisms to deal with such irregularity. This paper presents a novel contrastive learning-based imputation-prediction network for predicting in-hospital mortality risks using EHR data. Our approach introduces graph analysis-based patient stratification modeling into the imputation process to group similar patients. This allows missing values to be imputed using only information from similar patients, in addition to the patient’s own contextual information. Moreover, our approach integrates contrastive learning into the proposed network architecture to enhance patient representation learning and predictive performance on the classification task. Experiments on two real-world EHR datasets show that our approach outperforms state-of-the-art approaches in both imputation and prediction tasks.
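To make the idea of imputing from similar patients only more concrete, the following is a minimal sketch and not the authors' network. It assumes a single static snapshot per patient (the paper works with irregular time series), cosine similarity on mean-filled features as the similarity measure, and a fixed threshold to define graph edges. The function name `impute_from_similar_patients`, the `threshold` parameter, and the toy `records` array are hypothetical and chosen only for illustration.

```python
import numpy as np


def impute_from_similar_patients(x, threshold=0.9):
    """Fill NaNs in x (patients x features) using neighbours in a similarity graph.

    Assumption: every feature is observed for at least one patient.
    """
    observed = ~np.isnan(x)

    # Temporary mean fill, used only to compare patients with each other.
    col_mean = np.nanmean(x, axis=0)
    filled = np.where(observed, x, col_mean)

    # Cosine similarity between every pair of patients.
    norms = np.linalg.norm(filled, axis=1, keepdims=True)
    unit = filled / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T

    # Patient graph: connect patients whose similarity reaches the threshold.
    adj = (sim >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)  # a patient is not its own neighbour

    # Average each feature over neighbours that actually observed it.
    neighbour_sum = adj @ np.where(observed, x, 0.0)
    neighbour_cnt = adj @ observed.astype(float)

    # Fall back to the global feature mean when no neighbour has the value.
    fallback = np.tile(col_mean, (x.shape[0], 1))
    neighbour_mean = np.divide(
        neighbour_sum, neighbour_cnt, out=fallback, where=neighbour_cnt > 0
    )

    # Keep observed entries, fill only the missing ones.
    return np.where(observed, x, neighbour_mean)


# Toy data: patients 0 and 1 form one group, patients 2 and 3 another.
records = np.array(
    [
        [10.0, 1.0, 2.0],
        [10.0, 1.0, np.nan],
        [1.0, 10.0, 8.0],
        [1.0, 10.0, np.nan],
    ]
)
imputed = impute_from_similar_patients(records, threshold=0.9)
print(imputed)
```

In this toy example the missing value of patient 1 is taken from patient 0 and that of patient 3 from patient 2, whereas a global mean would assign both the same value. Hard thresholding is only one possible way to form groups; the paper's graph analysis-based stratification is learned inside the network.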
Notes
- 1. The implementation code is available at https://github.com/liulab1356/CL-ImpPreNet.
Acknowledgement
This research is partially funded by the ARC Centre of Excellence for Automated Decision-Making and Society (CE200100005), which is funded by the Australian Government through the Australian Research Council.
Ethics declarations
Ethical Statement
The experimental datasets used for this work are obtained from the publicly available Medical Information Mart for Intensive Care (MIMIC-III) dataset and the eICU Collaborative Research dataset. These data were used under license. The authors declare that they have no conflicts of interest. This article does not contain any studies involving human participants performed by any of the authors.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, Y., Zhang, Z., Qin, S., Salim, F.D., Yepes, A.J. (2023). Contrastive Learning-Based Imputation-Prediction Networks for In-hospital Mortality Risk Modeling Using EHRs. In: De Francisci Morales, G., Perlich, C., Ruchansky, N., Kourtellis, N., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track. ECML PKDD 2023. Lecture Notes in Computer Science, vol 14174. Springer, Cham. https://doi.org/10.1007/978-3-031-43427-3_26
DOI: https://doi.org/10.1007/978-3-031-43427-3_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43426-6
Online ISBN: 978-3-031-43427-3
eBook Packages: Computer Science, Computer Science (R0)