GRAM: Graph-based Attention Model for Healthcare Representation Learning

Authors:

Mohammad Taha Bahadori,

Walter F. Stewart,

Jimeng Sun

KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Pages 787 - 795

https://doi.org/10.1145/3097983.3098126

Published: 04 August 2017

Abstract

Deep learning methods exhibit promising performance for predictive modeling in healthcare, but two important challenges remain:

Data insufficiency: Often in healthcare predictive modeling, the sample size is insufficient for deep learning methods to achieve satisfactory results.

Interpretation: The representations learned by deep learning methods should align with medical knowledge.

To address these challenges, we propose the GRaph-based Attention Model (GRAM), which supplements electronic health records (EHR) with the hierarchical information inherent in medical ontologies. Based on the data volume and the ontology structure, GRAM represents a medical concept as a combination of its ancestors in the ontology via an attention mechanism.
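The idea of representing a concept as an attention-weighted combination of its ancestors can be illustrated with a minimal sketch. The abstract does not specify the scoring function, so the sketch assumes a small one-hidden-layer network that scores each (concept, ancestor) pair, followed by a softmax; it also assumes the concept itself is included alongside its ancestors. The toy ontology, node names, dimensions, and random parameters are all hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ancestors_with_self(code, parent):
    """Return [code, parent, grandparent, ...] up to the root of the ontology."""
    chain = [code]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def attention_representation(code, parent, emb, w, b, u):
    """Combine the basic embeddings of `code` and its ancestors.

    Assumed scoring function: score(i, j) = u . tanh(W [e_i ; e_j] + b).
    The softmax of the scores gives weights that sum to 1, so the result
    is a convex combination of the embeddings along the ancestor chain.
    """
    chain = ancestors_with_self(code, parent)
    e_i = emb[code]
    scores = []
    for node in chain:
        x = e_i + emb[node]  # list concatenation: [e_i ; e_j]
        hidden = [
            math.tanh(sum(wk * xk for wk, xk in zip(row, x)) + bk)
            for row, bk in zip(w, b)
        ]
        scores.append(sum(uk * hk for uk, hk in zip(u, hidden)))
    alphas = softmax(scores)
    rep = [
        sum(a * emb[node][d] for a, node in zip(alphas, chain))
        for d in range(len(e_i))
    ]
    return rep, dict(zip(chain, alphas))


# Hypothetical toy ontology: child -> parent (the root has no entry).
PARENT = {
    "hf_unspecified": "heart_failure",
    "heart_failure": "circulatory",
    "circulatory": "root",
}

# Randomly initialised parameters stand in for values that would be learned.
rng = random.Random(0)
DIM, HIDDEN = 4, 3
EMB = {
    node: [rng.gauss(0, 1) for _ in range(DIM)]
    for node in ["root", "circulatory", "heart_failure", "hf_unspecified"]
}
W = [[rng.gauss(0, 1) for _ in range(2 * DIM)] for _ in range(HIDDEN)]
B = [0.0] * HIDDEN
U = [rng.gauss(0, 1) for _ in range(HIDDEN)]

if __name__ == "__main__":
    rep, alphas = attention_representation("hf_unspecified", PARENT, EMB, W, B, U)
    print("attention weights:", alphas)
    print("final representation:", rep)
```

In a trained model the embeddings and attention parameters would be learned jointly with the prediction task, so the weights could shift toward higher-level ancestors for concepts that are rarely observed; with the random parameters above, the weights carry no clinical meaning.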

We compared the predictive performance (i.e., accuracy, data needs, and interpretability) of GRAM with that of various methods, including the recurrent neural network (RNN), on two sequential diagnoses prediction tasks and one heart failure prediction task. Compared to the basic RNN, GRAM achieved 10% higher accuracy for predicting diseases rarely observed in the training data and a 3% improvement in area under the ROC curve for predicting heart failure while using an order of magnitude less training data. Additionally, unlike other methods, the medical concept representations learned by GRAM are well aligned with the medical ontology. Finally, GRAM exhibits intuitive attention behaviors by adaptively generalizing to higher-level concepts when facing data insufficiency at the lower-level concepts.

Index Terms

GRAM: Graph-based Attention Model for Healthcare Representation Learning
1. Applied computing
  1. Life and medical sciences
    1. Health informatics
2. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
    2. Machine learning approaches
      1. Neural networks

Published In

KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 2017

2240 pages

ISBN:9781450348874

DOI:10.1145/3097983

General Chairs:
Stan Matwin
Dalhousie University
,
Shipeng Yu
LinkedIn
,
Faisal Farooq
IBM

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Institutes of Health
Nvidia
UCB
Intel
Office of Naval Research
Children's Healthcare of Atlanta
Google Faculty Award
National Science Foundation

Conference

KDD '17

KDD '17: The 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 13 - 17, 2017

Halifax, NS, Canada

Acceptance Rates

KDD '17 Paper Acceptance Rate 64 of 748 submissions, 9%;

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
