GRAM: Graph-based Attention Model for Healthcare Representation Learning

Published: 04 August 2017

Abstract

Deep learning methods exhibit promising performance for predictive modeling in healthcare, but two important challenges remain:
  • Data insufficiency: Often in healthcare predictive modeling, the sample size is insufficient for deep learning methods to achieve satisfactory results.
  • Interpretation: The representations learned by deep learning methods should align with medical knowledge.
To address these challenges, we propose the GRaph-based Attention Model (GRAM), which supplements electronic health records (EHR) with hierarchical information inherent to medical ontologies. Based on the data volume and the ontology structure, GRAM represents a medical concept as a combination of its ancestors in the ontology via an attention mechanism.
We compared the predictive performance (i.e., accuracy, data needs, interpretability) of GRAM to various methods, including the recurrent neural network (RNN), in two sequential diagnoses prediction tasks and one heart failure prediction task. Compared to the basic RNN, GRAM achieved 10% higher accuracy for predicting diseases rarely observed in the training data and a 3% higher area under the ROC curve for predicting heart failure while using an order of magnitude less training data. Additionally, unlike other methods, the medical concept representations learned by GRAM are well aligned with the medical ontology. Finally, GRAM exhibits intuitive attention behaviors by adaptively generalizing to higher-level concepts when facing data insufficiency at the lower-level concepts.
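The idea of representing a concept as an attention-weighted combination of its ontology ancestors can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the scoring function, so the small tanh network below (parameters `W`, `b`, `u`), the toy ontology, the random embeddings, and the helper names `ancestors_of` and `attend_over_ancestors` are all assumptions made for the example.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()


def ancestors_of(concept, parent):
    """Return [concept, parent, grandparent, ...] up to the ontology root."""
    chain = [concept]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def attend_over_ancestors(concept, parent, emb, W, b, u):
    """Combine the embeddings of a concept and its ancestors via attention.

    The compatibility score u . tanh(W [e_i; e_j] + b) is an assumed form,
    chosen for illustration; the abstract does not state it.
    """
    chain = ancestors_of(concept, parent)
    e_i = emb[concept]
    scores = np.array(
        [u @ np.tanh(W @ np.concatenate([e_i, emb[c]]) + b) for c in chain]
    )
    alpha = softmax(scores)                          # weights sum to 1
    stacked = np.stack([emb[c] for c in chain])      # (len(chain), dim)
    representation = alpha @ stacked                 # weighted sum of embeddings
    return representation, dict(zip(chain, alpha))


# Hypothetical three-level ontology (child -> parent) and random parameters.
rng = np.random.default_rng(0)
parent = {"acute_bronchitis": "bronchitis", "bronchitis": "respiratory_disease"}
dim, hidden = 4, 3
emb = {
    name: rng.normal(size=dim)
    for name in ("acute_bronchitis", "bronchitis", "respiratory_disease")
}
W = rng.normal(size=(hidden, 2 * dim))
b = np.zeros(hidden)
u = rng.normal(size=hidden)

rep, weights = attend_over_ancestors("acute_bronchitis", parent, emb, W, b, u)
root_rep, root_weights = attend_over_ancestors(
    "respiratory_disease", parent, emb, W, b, u
)
```

In a trained model the embeddings and attention parameters would be learned jointly with the prediction task, so a rarely observed leaf concept could place more weight on its better-estimated ancestors. In this sketch a root concept has no ancestors, so its representation is simply its own embedding.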



Published In

KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2017
2240 pages
ISBN:9781450348874
DOI:10.1145/3097983
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. attention model
  2. electronic health records
  3. graph
  4. predictive healthcare

Qualifiers

  • Research-article

Acceptance Rates

KDD '17 Paper Acceptance Rate 64 of 748 submissions, 9%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 1,014
  • Downloads (last 6 weeks): 116

Reflects downloads up to 18 Nov 2024

Cited By

  • (2024) Uncertainty Quantification and Interpretability for Clinical Trial Approval Prediction. Health Data Science 4. DOI: 10.34133/hds.0126. Online publication date: 15-Apr-2024
  • (2024) DRR: Global Context-Aware Neural Network Using Disease Relationship Reasoning and Attention-Based Feature Fusion. Mathematics 12:3 (488). DOI: 10.3390/math12030488. Online publication date: 2-Feb-2024
  • (2024) Multimodal data integration for oncology in the era of deep neural networks: a review. Frontiers in Artificial Intelligence 7. DOI: 10.3389/frai.2024.1408843. Online publication date: 25-Jul-2024
  • (2024) DeepOnto: A Python package for ontology engineering with deep learning. Semantic Web 15:5 (1991-2004). DOI: 10.3233/SW-243568. Online publication date: 9-Oct-2024
  • (2024) DAPNet: multi-view graph contrastive network incorporating disease clinical and molecular associations for disease progression prediction. BMC Medical Informatics and Decision Making 24:1. DOI: 10.1186/s12911-024-02756-0. Online publication date: 19-Nov-2024
  • (2024) Med-MGF: multi-level graph-based framework for handling medical data imbalance and representation. BMC Medical Informatics and Decision Making 24:1. DOI: 10.1186/s12911-024-02649-2. Online publication date: 2-Sep-2024
  • (2024) TACCO: Task-guided Co-clustering of Clinical Concepts and Patient Visits for Disease Subtyping based on EHR Data. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (6324-6334). DOI: 10.1145/3637528.3671594. Online publication date: 25-Aug-2024
  • (2024) EMERGE: Enhancing Multimodal Electronic Health Records Predictive Modeling with Retrieval-Augmented Generation. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (3549-3559). DOI: 10.1145/3627673.3679582. Online publication date: 21-Oct-2024
  • (2024) OEHR: An Orthopedic Electronic Health Record Dataset. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (1126-1135). DOI: 10.1145/3626772.3657885. Online publication date: 10-Jul-2024
  • (2024) Rethinking Human-AI Collaboration in Complex Medical Decision Making: A Case Study in Sepsis Diagnosis. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-18). DOI: 10.1145/3613904.3642343. Online publication date: 11-May-2024
