Deep representation learning of patient data from Electronic Health Records (EHR): A systematic review

Published: 01 March 2021

Highlights

A systematic review of the current works pertinent to patient representation learning.
A growing trend in building deep learning based patient representations from EHRs.
The learned representations attempt to gain a cohesive picture of a patient’s data.
Capabilities of deep learning models can largely address the challenges of EHR data.
Future work: advanced learning methods to obtain robust and precise representations.

Abstract

Objectives

Patient representation learning refers to learning a dense mathematical representation of a patient that encodes meaningful information from Electronic Health Records (EHRs). This is generally performed using advanced deep learning methods. This study presents a systematic review of this field and provides both qualitative and quantitative analyses from a methodological perspective.

Methods

We searched MEDLINE, EMBASE, Scopus, the Association for Computing Machinery (ACM) Digital Library, and the Institute of Electrical and Electronics Engineers (IEEE) Xplore Digital Library for studies that develop patient representations from EHRs with deep learning methods. Of the 363 articles screened, 49 papers were included for comprehensive data collection.

Results

Publications developing patient representations almost doubled each year from 2015 until 2019. We noticed a typical workflow starting with feeding raw data, applying deep learning models, and ending with clinical outcome predictions as evaluations of the learned representations. Specifically, learning representations from structured EHR data was dominant (37 out of 49 studies). Recurrent Neural Networks were widely applied as the deep learning architecture (Long short-term memory: 13 studies, Gated recurrent unit: 11 studies). Learning was mainly performed in a supervised manner (30 studies) optimized with cross-entropy loss. Disease prediction was the most common application and evaluation (31 studies). Benchmark datasets were mostly unavailable (28 studies) due to privacy concerns of EHR data, and code availability was assured in 20 studies.
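The typical workflow described above (raw visit data in, a recurrent model in the middle, a clinical outcome prediction scored with cross-entropy at the end) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from any reviewed study: the class name `GRUPatientEncoder`, the helper `multi_hot`, the toy vocabulary of six codes, the hidden size, and the random untrained weights. It shows only the forward pass, not training.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def multi_hot(codes, n_codes):
    """Encode one visit (a set of structured codes) as a 0/1 vector."""
    v = [0.0] * n_codes
    for c in codes:
        v[c] = 1.0
    return v


class GRUPatientEncoder:
    """Hypothetical toy GRU: sequence of visits -> one dense patient vector."""

    def __init__(self, n_codes, hidden, n_classes, seed=0):
        rng = random.Random(seed)
        self.hidden = hidden
        # Input and recurrent weights for update (z), reset (r), candidate (h).
        self.W = {g: rand_matrix(hidden, n_codes, rng) for g in "zrh"}
        self.U = {g: rand_matrix(hidden, hidden, rng) for g in "zrh"}
        # Linear prediction head on top of the learned representation.
        self.V = rand_matrix(n_classes, hidden, rng)

    def encode(self, visits):
        h = [0.0] * self.hidden
        for x in visits:
            z = [sigmoid(a + b) for a, b in
                 zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
            r = [sigmoid(a + b) for a, b in
                 zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
            rh = [ri * hi for ri, hi in zip(r, h)]
            cand = [math.tanh(a + b) for a, b in
                    zip(matvec(self.W["h"], x), matvec(self.U["h"], rh))]
            h = [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]
        return h  # final hidden state serves as the patient representation

    def predict_proba(self, visits):
        logits = matvec(self.V, self.encode(visits))
        m = max(logits)  # subtract the max for a numerically stable softmax
        exps = [math.exp(l - m) for l in logits]
        total = sum(exps)
        return [e / total for e in exps]


def cross_entropy(probs, label):
    """Supervised loss for a single patient with a known outcome label."""
    return -math.log(probs[label])


# Toy patient: three visits over a made-up vocabulary of six codes.
N_CODES, HIDDEN, N_CLASSES = 6, 4, 2
patient = [multi_hot(v, N_CODES) for v in ([0, 2], [2, 3, 5], [1])]

model = GRUPatientEncoder(N_CODES, HIDDEN, N_CLASSES)
representation = model.encode(patient)
probs = model.predict_proba(patient)
loss = cross_entropy(probs, label=1)
```

In a supervised setting, the loss would be minimized over many labelled patients so that the final hidden state becomes useful for the prediction task; the reviewed studies would use a deep learning framework rather than hand-written loops like these.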

Discussion & Conclusion

Existing predictive models mainly focus on predicting single diseases rather than considering the complex mechanisms of patients from a holistic view. Through a systematic review, we show the importance and feasibility of learning comprehensive representations of patient EHR data. Advances in patient representation learning techniques will be essential for powering patient-level EHR analyses. Future work will continue to be devoted to leveraging the richness and potential of available EHR data. Reproducibility and transparency of reported results will hopefully improve. Knowledge distillation and other advanced learning techniques will be exploited to further improve patient representation learning.

References

[1]
J. Wu, J. Roy, W.F. Stewart, Prediction modeling using EHR data: Challenges, strategies, and a comparison of machine learning approaches, Med. Care 48 (2010) S106–S113,.
[2]
Y. Bengio, A. Courville, P. Vincent, Representation learning: A review and new perspectives, IEEE Trans. Pattern Anal. Mach. Intell. 35 (2013) 1798–1828,.
[3]
Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature 521 (2015) 436–444,.
[4]
D. Svozil, V. Kvasnicka, J. Pospichal, Introduction to multi-layer feed-forward neural networks, Chemomet. Intell. Lab. Syst. 39 (1997) 43–62,.
[5]
Z. Che, D. Kale, W. Li, M.T. Bahadori, Y. Liu, Deep computational phenotyping, in: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2015, pp. 507–516.
[6]
A. Rajkomar, E. Oren, K. Chen, A.M. Dai, N. Hajaj, M. Hardt, P.J. Liu, X. Liu, J. Marcus, M. Sun, P. Sundberg, H. Yee, K. Zhang, Y. Zhang, G. Flores, G.E. Duggan, J. Irvine, Q. Le, K. Litsch, A. Mossin, J. Tansuwan, D. Wang, J. Wexler, J. Wilson, D. Ludwig, S.L. Volchenboum, K. Chou, M. Pearson, S. Madabushi, N.H. Shah, A.J. Butte, M.D. Howell, C. Cui, G.S. Corrado, J. Dean, Scalable and accurate deep learning with electronic health records, Npj Digit. Med. 1 (2018),.
[7]
P. Vincent, H. Larochelle, Y. Bengio, P.-A. Manzagol, Extracting and composing robust features with denoising autoencoders, in: Proceedings of the 25th International Conference on Machine Learning - ICML’08, ACM Press, Helsinki, Finland, 2008, pp. 1096–1103. https://doi.org/10.1145/1390156.1390294.
[8]
P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, P.-A. Manzagol, Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion, J. Mach. Learn. Res. 11 (2010) 3371–3408. http://jmlr.org/papers/v11/vincent10a.html.
[9]
C. Doersch, Tutorial on variational autoencoders, ArXiv Preprint ArXiv:1606.05908 (2016).
[10]
S. Rifai, P. Vincent, X. Muller, X. Glorot, Y. Bengio, Contractive auto-encoders: explicit invariance during feature extraction, in: Proceedings of the 28th International Conference on International Conference on Machine Learning, 2011, pp. 833–840.
[11]
R. Miotto, L. Li, B.A. Kidd, J.T. Dudley, Deep patient: An unsupervised representation to predict the future of patients from the electronic health records, Sci. Rep. 6 (2016) 26094,.
[12]
Y. LeCun, B.E. Boser, J.S. Denker, D. Henderson, R.E. Howard, W.E. Hubbard, L.D. Jackel, Handwritten digit recognition with a back-propagation network, Adv. Neural Inf. Process. Syst. (1990) 396–404.
[13]
Y. Kim, Convolutional Neural Networks for Sentence Classification, in: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014, pp. 1746–1751.
[14]
Y. Xu, S. Biswal, S.R. Deshpande, K.O. Maher, J. Sun, RAIM: Recurrent attentive and intensive model of multimodal patient monitoring data, in: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, ACM, New York, NY, USA, 2018, pp. 2565–2573. https://doi.org/10.1145/3219819.3220051.
[15]
Y. Cheng, F. Wang, P. Zhang, J. Hu, Risk prediction with electronic health records: A deep learning approach, in: Proceedings of the 2016 SIAM International Conference on Data Mining, SIAM, 2016, pp. 432–440.
[16]
K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, in: International Conference on Learning Representations, 2015.
[17]
K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[18]
T. Mikolov, I. Sutskever, K. Chen, G.S. Corrado, J. Dean, Distributed representations of words and phrases and their compositionality, Adv. Neural Inf. Process. Syst. (2013) 3111–3119.
[19]
T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient estimation of word representations in vector space, ArXiv Preprint ArXiv:1301.3781 (2013).
[20]
Y. Choi, C.Y.-I. Chiu, D. Sontag, Learning low-dimensional representations of medical concepts, AMIA Summits on Translational Science Proceedings, 2016, 2016, p. 41.
[21]
L. Taslaman, B. Nilsson, A framework for regularized non-negative matrix factorization, with application to the analysis of gene expression data, PLoS ONE 7 (2012) e46331,.
[22]
G.L. Stein-O’Brien, R. Arora, A.C. Culhane, A.V. Favorov, L.X. Garmire, C.S. Greene, L.A. Goff, Y. Li, A. Ngom, M.F. Ochs, Y. Xu, E.J. Fertig, Enter the Matrix: Factorization Uncovers Knowledge from Omics, Trends Genet. 34 (2018) 790–805,.
[23]
F. Wang, N. Lee, J. Hu, J. Sun, S. Ebadollahi, A.F. Laine, A framework for mining signatures from event sequences and its applications in healthcare data, IEEE Trans. Pattern Anal. Mach. Intell. 35 (2012) 272–285.
[24]
F. Wang, N. Lee, J. Hu, J. Sun, S. Ebadollahi, Towards heterogeneous temporal clinical event pattern discovery: a convolutional approach, in: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2012, pp. 453–461.
[25]
J. Zhou, F. Wang, J. Hu, J. Ye, From micro to macro: data driven phenotyping by densification of longitudinal electronic medical records, in: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014, pp. 135–144.
[26]
C. Liu, F. Wang, J. Hu, H. Xiong, Temporal phenotyping from longitudinal electronic health records: A graph based framework, in: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2015, pp. 705–714.
[27]
E. Choi, M.T. Bahadori, L. Song, W.F. Stewart, J. Sun, GRAM: Graph-based Attention Model for Healthcare Representation Learning, in: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’17, ACM Press, Halifax, NS, Canada, 2017, pp. 787–795. https://doi.org/10.1145/3097983.3098126.
[28]
M. Niepert, M. Ahmed, K. Kutzkov, Learning convolutional neural networks for graphs, in: International Conference on Machine Learning, 2016, pp. 2014–2023.
[29]
P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Lio, Y. Bengio, Graph attention networks, ArXiv Preprint ArXiv:1710.10903 (2017).
[30]
A. Grover, J. Leskovec, node2vec: Scalable feature learning for networks, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 855–864.
[31]
F. Ma, Q. You, H. Xiao, R. Chitta, J. Zhou, J. Gao, KAME: Knowledge-based Attention Model for Diagnosis Prediction in Healthcare, in: Proceedings of the 27th ACM International Conference on Information and Knowledge Management, ACM, New York, NY, USA, 2018, pp. 743–752. https://doi.org/10.1145/3269206.3271701.
[32]
S. Wang, P. Ren, Z. Chen, Z. Ren, J. Ma, M. de Rijke, Order-free Medicine Combination Prediction with Graph Convolutional Reinforcement Learning, in: Proceedings of the 28th ACM International Conference on Information and Knowledge Management - CIKM ’19, ACM Press, Beijing, China, 2019, pp. 1623–1632. https://doi.org/10.1145/3357384.3357965.
[33]
J. Zhang, J. Gong, L. Barnes, HCNN: Heterogeneous Convolutional Neural Networks for Comorbid Risk Prediction with Electronic Health Records, in: Proceedings of the Second IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies, IEEE Press, Piscataway, NJ, USA, 2017, pp. 214–221. https://doi.org/10.1109/CHASE.2017.80.
[34]
D. Bahdanau, K. Cho, Y. Bengio, Neural machine translation by jointly learning to align and translate, ArXiv Preprint ArXiv:1409.0473 (2014).
[35]
I. Sutskever, O. Vinyals, Q.V. Le, Sequence to sequence learning with neural networks, Adv. Neural Inf. Process. Syst. (2014) 3104–3112.
[36]
K. Cho, B. Van Merriënboer, D. Bahdanau, Y. Bengio, On the properties of neural machine translation: Encoder-decoder approaches, ArXiv Preprint ArXiv:1409.1259 (2014).
[37]
S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Comput. 9 (1997) 1735–1780.
[38]
E. Choi, M.T. Bahadori, A. Schuetz, W.F. Stewart, J. Sun, Doctor AI: Predicting Clinical Events via Recurrent Neural Networks, ArXiv:1511.05942 [Cs]. (2015). http://arxiv.org/abs/1511.05942 (accessed April 10, 2019).
[39]
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is all you need, Adv. Neural Inf. Process. Syst. (2017) 5998–6008.
[40]
A. Radford, K. Narasimhan, T. Salimans, I. Sutskever, Improving language understanding by generative pre-training, Technical Report OpenAI, 2018.
[41]
J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 2019, pp. 4171–4186.
[42]
E. Choi, Z. Xu, Y. Li, M.W. Dusenberry, G. Flores, E. Xue, A.M. Dai, Learning the Graphical Structure of Electronic Health Records with Graph Convolutional Transformer, Proceedings of the AAAI Conference on Artificial Intelligence. (2020).
[43]
H. Song, D. Rajan, J.J. Thiagarajan, A. Spanias, Attend and diagnose: Clinical time series analysis using attention models, in: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018, 2018, pp. 4091–4098.
[44]
Y. Li, S. Rao, J.R.A. Solares, A. Hassaine, R. Ramakrishnan, D. Canoy, Y. Zhu, K. Rahimi, G. Salimi-Khorshidi, BEHRT: Transformer for electronic health records, Sci. Rep. 10 (2020) 7155,.
[45]
L. Rasmy, Y. Xiang, Z. Xie, C. Tao, D. Zhi, Med-BERT: pre-trained contextualized embeddings on large-scale structured electronic health records for disease prediction, ArXiv Preprint ArXiv:2005.12833 (2020).
[46]
T.G. Kolda, B.W. Bader, Tensor decompositions and applications, SIAM Rev. 51 (2009) 455–500,.
[47]
E.C. Chi, T.G. Kolda, On tensors, sparsity, and nonnegative factorizations, SIAM J. Matrix Anal. Appl. 33 (2012) 1272–1299,.
[48]
K. Yang, X. Li, H. Liu, J. Mei, G. Xie, J. Zhao, B. Xie, F. Wang, TaGiTeD: Predictive task guided tensor decomposition for representation learning from electronic health records, in: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, AAAI Press, 2017, pp. 2824–2830.
[49]
J.C. Ho, J. Ghosh, J. Sun, Marble: High-throughput phenotyping from electronic health records via sparse nonnegative tensor factorization, in: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’14, ACM Press, New York, New York, USA, 2014, pp. 115–124. https://doi.org/10.1145/2623330.2623658.
[50]
H. He, J. Henderson, J.C. Ho, Distributed Tensor Decomposition for Large Scale Health Analytics, in: The World Wide Web Conference on - WWW ’19, ACM Press, San Francisco, CA, USA, 2019, pp. 659–669. https://doi.org/10.1145/3308558.3313548.
[51]
M. Ouzzani, H. Hammady, Z. Fedorowicz, A. Elmagarmid, Rayyan—a web and mobile app for systematic reviews, Syst Rev. 5 (2016) 210,.
[52]
K.M. Ahmed, B. Al Dhubaib, Zotero: A bibliographic assistant to researcher, J. Pharmacol. Pharmacotherap. 2 (2011) 303.
[53]
D. Moher, A. Liberati, J. Tetzlaff, D.G. Altman, Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement, Int. J. Surg. 8 (2010) 336–341,.
[54]
E. Zhang, R. Robinson, B. Pfahringer, Deep Holistic Representation Learning from EHR, in: 2018. https://doi.org/10.1109/ISMICT.2018.8573698.
[55]
H. Suresh, N. Hunt, A. Johnson, L.A. Celi, P. Szolovits, M. Ghassemi, Clinical intervention prediction and understanding with deep neural networks, Mach. Learn. Healthc. Conf. (2017) 322–337.
[56]
T. Bai, A.K. Ch, B.L. Egleston, S. Vucetic, EHR phenotyping via jointly embedding medical concepts and words into a unified vector space, BMC Med. Inf. Decis. Mak. 18 (2018) 123. http://www.embase.com/search/results?subaction=viewrecord&from=export&id=L625525781.
[57]
J. Kemp, A. Rajkomar, A.M. Dai, Improved Hierarchical Patient Classification with Language Model Pretraining over Clinical Notes, ArXiv:1909.03039 [Cs, Stat]. (2019). http://arxiv.org/abs/1909.03039 (accessed November 22, 2019).
[58]
K. Yin, D. Qian, W.K. Cheung, B.C.M. Fung, J. Poon, Learning phenotypes and dynamic patient representations via RNN regularized collective non-negative tensor factorization, AAAI 33 (2019) 1246–1253,.
[59]
Z. Li, K. Roberts, X. Jiang, Q. Long, Distributed learning from multiple EHR databases: Contextual embedding models for medical events, J. Biomed. Inform. 92 (2019) 103138,.
[60]
T. Tran, T.D. Nguyen, D. Phung, S. Venkatesh, Learning vector representation of medical objects via EMR-driven nonnegative restricted Boltzmann machines (eNRBM), J. Biomed. Inform. 54 (2015) 96–105,.
[61]
H. Suresh, J.J. Gong, J. Guttag, Learning tasks for multitask learning: heterogenous patient populations in the ICU, in: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining - KDD ’18. (2018) 802–810. https://doi.org/10.1145/3219819.3219930.
[62]
D. Dligach, T. Miller, Learning Patient Representations from Text, in: Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics, 2018, pp. 119–123.
[63]
D. Dligach, M. Afshar, T. Miller, Toward a clinical text encoder: pretraining for clinical natural language processing with applications to substance misuse, J. Am. Med. Inform. Assoc. (2019) ocz072,.
[64]
T. Bai, B.L. Egleston, S. Zhang, S. Vucetic, Interpretable representation learning for healthcare via capturing disease progression through time, in: 2018, pp. 43–51. https://doi.org/10.1145/3219819.3219904.
[65]
L. Liu, J. Shen, M. Zhang, Z. Wang, J. Tang, Learning the joint representation of heterogeneous temporal events for clinical endpoint prediction, in: 2018, pp. 109–116. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85060476955&partnerID=40&md5=e3cfd1382f1464164edc3f0dd4ab7baa.
[66]
X.S. Zhang, F. Tang, H.H. Dodge, J. Zhou, F. Wang, MetaPred: Meta-Learning for Clinical Risk Prediction with Limited Patient Electronic Health Records, in: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, ACM, New York, NY, USA, 2019, pp. 2487–2495. https://doi.org/10.1145/3292500.3330779.
[67]
I.M. Baytas, C. Xiao, X. Zhang, F. Wang, A.K. Jain, J. Zhou, Patient subtyping via time-aware LSTM networks, in: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2017, pp. 65–74.
[68]
M. Rafiq, G. Keel, P. Mazzocato, J. Spaak, C. Savage, C. Guttmann, Deep learning architectures for vector representations of patients and exploring predictors of 30-day hospital readmissions in patients with multiple chronic conditions, 2019. https://doi.org/10.1007/978-3-030-12738-1_17.
[69]
J. Liu, Z. Zhang, N. Razavian, Deep EHR: Chronic Disease Prediction Using Medical Notes, ArXiv:1808.04928 [Cs, Stat]. (2018). http://arxiv.org/abs/1808.04928 (accessed April 9, 2019).
[70]
Q. Suo, F. Ma, Y. Yuan, M. Huai, W. Zhong, J. Gao, A. Zhang, Deep patient similarity learning for personalized healthcare, IEEE Trans. Nanobiosci. 17 (2018) 219–227,.
[71]
Z. Che, Y. Cheng, Z. Sun, Y. Liu, Exploiting Convolutional Neural Network for Risk Prediction with Medical Feature Embedding, ArXiv:1701.07474 [Cs, Stat]. (2017). http://arxiv.org/abs/1701.07474 (accessed April 10, 2019).
[72]
T. Ma, C. Xiao, F. Wang, Health-ATM: A deep architecture for multifaceted patient health record representation and risk prediction, in: 2018, pp. 261–269. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85045143398&partnerID=40&md5=b63d90606c942e23cb2d49ae4fed27fd.
[73]
Y. Si, K. Roberts, Deep patient representation of clinical notes via multi-task learning for mortality prediction, AMIA Jt Summits Transl Sci Proc, 2019, 2019, pp. 779–788.
[74]
Y. Zhang, H. Zhou, J. Li, W. Sun, Y. Chen, A Time-Sensitive Hybrid Learning Model for Patient Subgrouping, in: 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, Rio de Janeiro, 2018, pp. 1–8. https://doi.org/10.1109/IJCNN.2018.8488991.
[75]
L. Lei, Y. Zhou, J. Zhai, L. Zhang, Z. Fang, P. He, J. Gao, An Effective Patient Representation Learning for Time-series Prediction Tasks Based on EHRs, in: 2019, pp. 885–892. https://doi.org/10.1109/BIBM.2018.8621542.
[76]
F. Ma, R. Chitta, J. Zhou, Q. You, T. Sun, J. Gao, Dipole: Diagnosis Prediction in Healthcare via Attention-based Bidirectional Recurrent Neural Networks, Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’17. (2017) 1903–1911. https://doi.org/10.1145/3097983.3098088.
[77]
L. Liu, H. Li, Z. Hu, H. Shi, Z. Wang, J. Tang, M. Zhang, Learning Hierarchical Representations of Electronic Health Records for Clinical Outcome Prediction, ArXiv Preprint ArXiv:1903.08652. (2019).
[78]
J. Zhang, K. Kowsari, J.H. Harrison, J.M. Lobo, L.E. Barnes, Patient2Vec: A Personalized Interpretable Deep Representation of the Longitudinal Electronic Health Record, IEEE Access 6 (2018) 65333–65346,.
[79]
C. Xiao, T. Ma, A.B. Dieng, D.M. Blei, F. Wang, Readmission prediction via deep contextual embedding of clinical concepts, PLoS ONE 13 (2018) e0195024,.
[80]
E. Choi, M.T. Bahadori, J. Sun, J. Kulas, A. Schuetz, W. Stewart, RETAIN: An interpretable predictive model for healthcare using reverse time attention mechanism, Adv. Neural Inf. Process. Syst. (2016) 3504–3512.
[81]
E. Choi, C. Xiao, W. Stewart, J. Sun, Mime: Multilevel medical embedding of electronic health records for predictive healthcare, Adv. Neural Inf. Process. Syst. (2018) 4547–4557.
[82]
C. Zhou, Y. Jia, M. Motani, J. Chew, Learning Deep Representations from Heterogeneous Patient Data for Predictive Diagnosis, in: Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology,and Health Informatics - ACM-BCB ’17, ACM Press, Boston, Massachusetts, USA, 2017, pp. 115–123. https://doi.org/10.1145/3107411.3107433.
[83]
C. Zhou, Y. Jia, M. Motani, Optimizing Autoencoders for Learning Deep Representations from Health Data, IEEE J. Biomed. Health. Inf. 23 (2019) 103–111,.
[84]
M. Sushil, S. Šuster, K. Luyckx, W. Daelemans, Patient representation learning and interpretable evaluation using clinical notes, J. Biomed. Inform. 84 (2018) 103–113,.
[85]
J. Stojanovic, D. Gligorijevic, V. Radosavljevic, N. Djuric, M. Grbovic, Z. Obradovic, Modeling Healthcare Quality via Compact Representations of Electronic Health Records, IEEE/ACM Trans. Comput. Biol. Bioinformatics, 14, 2017, pp. 545–554,.
[86]
E. Choi, M.T. Bahadori, E. Searles, C. Coffey, M. Thompson, J. Bost, J. Tejedor-Sojo, J. Sun, Multi-layer Representation Learning for Medical Concepts, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’16, ACM Press, San Francisco, California, USA, 2016, pp. 1495–1504. https://doi.org/10.1145/2939672.2939823.
[87]
L. Cui, X. Xie, Z. Shen, Prediction task guided representation learning of medical codes in EHR, J. Biomed. Inform. 84 (2018) 1–10,.
[88]
S. Barbieri, J. Kemp, O. Perez-Concha, S. Kotwal, M. Gallagher, A. Ritchie, L. Jorm, Benchmarking Deep Learning Architectures for Predicting Readmission to the ICU and Describing Patients-at-Risk, Sci Rep. 10 (2020) 1111,.
[89]
D.Y. Ding, C. Simpson, S. Pfohl, D.C. Kale, K. Jung, N.H. Shah, The Effectiveness of Multitask Learning for Phenotyping with Electronic Health Records Data, Pac Symp Biocomput, 24, 2019, pp. 18–29.
[90]
D. Liu, D. Dligach, T. Miller, Two-stage Federated Phenotyping and Patient Representation Learning, ArXiv:1908.05596 [Cs]. (2019). http://arxiv.org/abs/1908.05596 (accessed September 20, 2019).
[91]
A. Hosseini, T. Chen, W. Wu, Y. Sun, M. Sarrafzadeh, HeteroMed: Heterogeneous Information Network for Medical Diagnosis, in: Proceedings of the 27th ACM International Conference on Information and Knowledge Management, Association for Computing Machinery, New York, NY, USA, 2018, pp. 763–772. https://doi.org/10.1145/3269206.3271805.
[92]
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, L. Fei-Fei, Imagenet: A large-scale hierarchical image database, in: 2009 IEEE Conference on Computer Vision and Pattern Recognition, Ieee, 2009, pp. 248–255.
[93]
A.E. Johnson, T.J. Pollard, L. Shen, H.L. Li-wei, M. Feng, M. Ghassemi, B. Moody, P. Szolovits, L.A. Celi, R.G. Mark, MIMIC-III, a freely accessible critical care database, Sci. Data 3 (2016) 160035.
[94]
K. Marek, D. Jennings, S. Lasch, A. Siderowf, C. Tanner, T. Simuni, C. Coffey, K. Kieburtz, E. Flagg, S. Chowdhury, The Parkinson progression marker initiative (PPMI), Prog. Neurobiol. 95 (2011) 629–635.
[95]
S.G. Mueller, M.W. Weiner, L.J. Thal, R.C. Petersen, C. Jack, W. Jagust, J.Q. Trojanowski, A.W. Toga, L. Beckett, The Alzheimer’s disease neuroimaging initiative, Neuroimag. Clin. 15 (2005) 869–877.
[96]
Ö. Uzuner, Recognizing obesity and comorbidities in sparse data, J. Am. Med. Inform. Assoc. 16 (2009) 561–570.
[97]
T.J. Pollard, A.E.W. Johnson, J.D. Raffa, L.A. Celi, R.G. Mark, O. Badawi, The eICU Collaborative Research Database, a freely available multi-center database for critical care research, Sci Data. 5 (2018) 180178,.
[98]
M. Sushil, S. Šuster, K. Luyckx, W. Daelemans, Unsupervised patient representations from clinical notes with interpretable classification decisions, ArXiv:1711.05198 [Cs]. (2017). http://arxiv.org/abs/1711.05198 (accessed April 10, 2019).
[99]
W. Wang, C. Guo, J. Xu, A. Liu, Bi-Dimensional Representation of Patients for Diagnosis Prediction, in: 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC), IEEE, Milwaukee, WI, USA, 2019, pp. 374–379. https://doi.org/10.1109/COMPSAC.2019.10235.
[100]
X. Zhang, J. Chou, J. Liang, C. Xiao, Y. Zhao, H. Sarva, C. Henchcliffe, F. Wang, Data-Driven Subtyping of Parkinson’s Disease Using Longitudinal Clinical Records: A Cohort Study, Sci. Rep. 9 (2019) 797,.
[101]
G.E. Hinton, R.R. Salakhutdinov, Reducing the dimensionality of data with neural networks, Science 313 (2006) 504–507.
[102]
F. Wang, R. Kaushal, D. Khullar, Should Health Care Demand Interpretable Artificial Intelligence or Accept “Black Box” Medicine?, Ann. Intern. Med. 172 (2020) 59,.
[103]
L. van der Maaten, G. Hinton, Visualizing data using t-SNE, J. Mach. Learn. Res. 9 (2008) 2579–2605.
[104]
M.A. Cox, T.F. Cox, Multidimensional scaling, in: Handbook of Data Visualization, Springer, 2008, pp. 315–347.
[105]
M. Ringnér, What is principal component analysis?, Nat. Biotechnol. 26 (2008) 303–304.
[106]
L. McInnes, J. Healy, J. Melville, UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction, ArXiv:1802.03426 [Cs, Stat]. (2018). http://arxiv.org/abs/1802.03426 (accessed April 21, 2020).
[107]
M.N. Sadat, M.M. Al Aziz, N. Mohammed, F. Chen, X. Jiang, S. Wang, SAFETY: secure gwAs in federated environment through a hYbrid solution, IEEE/ACM Trans. Comput. Biol. Bioinf., 16, 2018, pp. 93–102.
[108]
H. Yu, X. Jiang, J. Vaidya, Privacy-preserving SVM using nonlinear kernels on horizontally partitioned data, in: Proceedings of the 2006 ACM Symposium on Applied Computing, 2006, pp. 603–610.
[109]
Y. Kim, J. Sun, H. Yu, X. Jiang, Federated tensor factorization for computational phenotyping, in: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017, pp. 887–895.
[110]
W. Dai, S. Wang, H. Xiong, X. Jiang, Privacy preserving federated big data analysis, in: Guide to Big Data Applications, Springer, 2018, pp. 49–82.
[111]
J. Lee, J. Sun, F. Wang, S. Wang, C.-H. Jun, X. Jiang, Privacy-preserving patient similarity learning in a federated environment: development and analysis, JMIR Med. Inf. 6 (2018) e20.
[112]
Y. Si, K. Roberts, Patient Representation Transfer Learning from Clinical Notes based on Hierarchical Attention Network, in: AMIA Joint Summits on Translational Science Proceedings. AMIA Joint Summits on Translational Science, 2020.
[113]
F. Wang, L.P. Casalino, D. Khullar, Deep Learning in Medicine—Promise, Progress, and Challenges, JAMA, Intern Med. 179 (2019) 293,.
[114]
C. Yun, S. Bhojanapalli, A.S. Rawat, S.J. Reddi, S. Kumar, Are Transformers universal approximators of sequence-to-sequence functions?, ArXiv:1912.10077 [Cs, Stat]. (2020). http://arxiv.org/abs/1912.10077 (accessed May 5, 2020).
[115]
E. Steinberg, K. Jung, J.A. Fries, C.K. Corbin, S.R. Pfohl, N.H. Shah, Language models are an effective representation learning technique for electronic health record data, J. Biomed. Inf. 113 (2020) 103637.
[116]
Y. Si, E.V. Bernstam, K. Roberts, Generalized and Transferable Patient Language Representation for Phenotyping with Limited Data, arXiv (2021).
[117]
C. Finn, P. Abbeel, S. Levine, Model-agnostic Meta-learning for Fast Adaptation of Deep Networks, in: Proceedings of the 34th International Conference on Machine Learning - Volume 70, JMLR.org, 2017, pp. 1126–1135. http://dl.acm.org/citation.cfm?id=3305381.3305498.
[118]
J. Wiens, S. Saria, M. Sendak, M. Ghassemi, V.X. Liu, F. Doshi-Velez, K. Jung, K. Heller, D. Kale, M. Saeed, P.N. Ossorio, S. Thadaney-Israni, A. Goldenberg, Do no harm: a roadmap for responsible machine learning for health care, Nat. Med. 25 (2019) 1337–1340,.
[119]
D. Lee, X. Jiang, H. Yu, Harmonized representation learning on dynamic EHR graphs, J. Biomed. Inform. 106 (2020) 103426,.
[120]
S. Shilo, H. Rossman, E. Segal, Axes of a revolution: challenges and promises of big data in healthcare, Nat. Med. 26 (2020) 29–38,.
[121]
A. Cheerla, O. Gevaert, Deep learning with multimodal representation for pancancer prognosis prediction, Bioinformatics 35 (2019) i446–i454.
[122]
X. Zhu, J. Yao, G. Xiao, Y. Xie, J. Rodriguez-Canales, E.R. Parra, C. Behrens, I.I. Wistuba, J. Huang, Imaging-genetic data mapping for clinical outcome prediction via supervised conditional gaussian graphical model, in: 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), IEEE, 2016, pp. 455–459.
[123]
G. Jia, Y. Li, H. Zhang, I. Chattopadhyay, A.B. Jensen, D.R. Blair, L. Davis, P.N. Robinson, T. Dahlén, S. Brunak, others, Estimating heritability and genetic correlations from large health datasets in the absence of genetic data, Nat. Commun. 10 (2019) 1–11.
[124]
E. Laparra, S. Bethard, T.A. Miller, Rethinking domain adaptation for machine learning over clinical language, JAMIA Open (2020) ooaa010,.
[125]
J. Konečný, B. McMahan, D. Ramage, Federated Optimization:Distributed Optimization Beyond the Datacenter, ArXiv:1511.03575 [Cs, Math]. (2015). http://arxiv.org/abs/1511.03575 (accessed May 7, 2020).
[126]
F. Zerka, S. Barakat, S. Walsh, M. Bogowicz, R.T.H. Leijenaar, A. Jochems, B. Miraglio, D. Townend, P. Lambin, Systematic Review of Privacy-Preserving Distributed Machine Learning From Federated Databases in Health Care, JCO Clin. Cancer Inf. (2020) 184–200,.
[127]
J. Xu, F. Wang, Federated Learning for Healthcare Informatics, ArXiv:1911.06270 [Cs]. (2019). http://arxiv.org/abs/1911.06270 (accessed May 7, 2020).
[128]
N. Rieke, J. Hancox, W. Li, F. Milletari, H. Roth, S. Albarqouni, S. Bakas, M.N. Galtier, B. Landman, K. Maier-Hein, S. Ourselin, M. Sheller, R.M. Summers, A. Trask, D. Xu, M. Baust, M.J. Cardoso, The Future of Digital Health with Federated Learning, ArXiv:2003.08119 [Cs]. (2020). http://arxiv.org/abs/2003.08119 (accessed May 7, 2020).
[129]
P. McClure, C.Y. Zheng, J. Kaczmarzyk, J. Rogers-Lee, S. Ghosh, D. Nielson, P.A. Bandettini, F. Pereira, Distributed Weight Consolidation: A Brain Segmentation Case Study, in: S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, R. Garnett (Eds.), Advances in Neural Information Processing Systems 31, Curran Associates, Inc., 2018, pp. 4093–4103. http://papers.nips.cc/paper/7664-distributed-weight-consolidation-a-brain-segmentation-case-study.pdf.
[130]
K. Chang, N. Balachandar, C. Lam, D. Yi, J. Brown, A. Beers, B. Rosen, D.L. Rubin, J. Kalpathy-Cramer, Distributed deep learning networks among institutions for medical imaging, J. Am. Med. Inform. Assoc. 25 (2018) 945–954,.
[131]
T.M. Deist, A. Jochems, J. van Soest, G. Nalbantov, C. Oberije, S. Walsh, M. Eble, P. Bulens, P. Coucke, W. Dries, A. Dekker, P. Lambin, Infrastructure and distributed learning methodology for privacy-preserving multi-centric rapid learning health care: euroCAT, Clin. Transl. Radiat. Oncol. 4 (2017) 24–31,.
[132]
G. Price, M. van Herk, C. Faivre-Finn, Data Mining in Oncology: The ukCAT Project and the Practicalities of Working with Routine Patient Data, Clin. Oncol. 29 (2017) 814–817,.
[133]
S. Darabi, M. Kachuee, S. Fazeli, M. Sarrafzadeh, TAPER: Time-Aware Patient EHR Representation, ArXiv:1908.03971 [Cs, Stat]. (2019). http://arxiv.org/abs/1908.03971 (accessed September 20, 2019).
[134]
B. Hettige, Y.-F. Li, W. Wang, S. Le, W. Buntine, MedGraph: Structural and Temporal Representation Learning of Electronic Medical Records, ArXiv:1912.03703 [Cs, Stat]. (2020). http://arxiv.org/abs/1912.03703 (accessed May 7, 2020).
[135]
S. Darabi, M. Kachuee, M. Sarrafzadeh, Unsupervised Representation for EHR Signals and Codes as Patient Status Vector, ArXiv:1910.01803 [Cs, Stat]. (2019). http://arxiv.org/abs/1910.01803 (accessed May 7, 2020).
[136]
S. Dubois, N. Romano, D.C. Kale, N. Shah, K. Jung, Effective Representations of Clinical Notes, ArXiv:1705.07025 [Cs, Stat]. (2017). http://arxiv.org/abs/1705.07025 (accessed April 15, 2019).
[137]
J.R. Ayala Solares, F.E. Diletta Raimondi, Y. Zhu, F. Rahimian, D. Canoy, J. Tran, A.C. Pinho Gomes, A.H. Payberah, M. Zottoli, M. Nazarzadeh, N. Conrad, K. Rahimi, G. Salimi-Khorshidi, Deep learning for electronic health records: A comparative review of multiple deep neural architectures, J. Biomed. Inf. 101 (2020) 103337.
[138]
N. Sadati, M.Z. Nezhad, R.B. Chinnam, D. Zhu, Representation Learning with Autoencoders for Electronic Health Records: A Comparative Study, ArXiv:1801.02961 [Cs, Stat]. (2018). http://arxiv.org/abs/1801.02961 (accessed October 21, 2019).
[139]
X. Min, B. Yu, F. Wang, Predictive Modeling of the Hospital Readmission Risk from Patients’ Claims Data Using Machine Learning: A Case Study on COPD, Sci. Rep. 9 (2019) 2362.


Published In

Journal of Biomedical Informatics, Volume 115, Issue C
Mar 2021
228 pages

Publisher

Elsevier Science, San Diego, CA, United States

Publication History

Published: 01 March 2021

Author Tags

1. Systematic review
2. Electronic health records
3. Patient representation
4. Deep learning

Qualifiers

• Review-article
