DOI: 10.1145/3107411.3107423
research-article

Tensor-Factorization-Based Phenotyping using Group Information: Case Study on the Efficacy of Statins

Published: 20 August 2017

Abstract

Several machine-learning applications have been proposed to automatically extract medical concepts from raw electronic health records (EHRs). Among these techniques, tensor factorization methods have attracted considerable attention because tensor representations can capture interactions among high-dimensional EHRs. Most existing tensor factorization methods for computational phenotyping are designed only to derive individual phenotypes that approximate the original data. However, deriving grouped phenotypes is desirable because patients form natural groups of interest (e.g., by efficacy of treatment or by disease category). In this paper, we propose Supervised Non-negative Tensor Factorization with Multinomial Logistic Regression (SNTFL) to derive grouped phenotypes that are discriminative. We define a discriminative constraint to derive grouped phenotypes and jointly optimize a multinomial logistic regression during the tensor factorization process. Our case study on a hyperlipidemia dataset demonstrates that the proposed method discriminates patient groups better than the baselines and successfully discovers meaningful patient subgroups.
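
The abstract does not state the SNTFL objective or its update rules, so the following is only a minimal sketch of the general idea of fitting a non-negative tensor factorization jointly with a multinomial logistic regression. Everything specific in it is an assumption made for illustration: a rank-R CP model of a three-way tensor, a squared-error reconstruction term, a softmax classifier applied to the patient-mode factor matrix `A`, a trade-off weight `lam`, and projected gradient descent with backtracking. The function and variable names are hypothetical and are not taken from the paper.

```python
import numpy as np

def log_softmax(z):
    z = z - z.max(axis=1, keepdims=True)          # stabilise the exponentials
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def objective(X, Y, A, B, C, W, lam):
    """Reconstruction error plus lam times the mean cross-entropy (assumed form)."""
    X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)   # rank-R CP reconstruction
    recon = 0.5 * np.sum((X - X_hat) ** 2)
    xent = -np.sum(Y * log_softmax(A @ W)) / X.shape[0]
    return recon + lam * xent

def gradients(X, Y, A, B, C, W, lam):
    R = np.einsum("ir,jr,kr->ijk", A, B, C) - X   # residual tensor
    G = (np.exp(log_softmax(A @ W)) - Y) / X.shape[0]  # d(xent)/d(logits)
    gA = np.einsum("ijk,jr,kr->ir", R, B, C) + lam * G @ W.T
    gB = np.einsum("ijk,ir,kr->jr", R, A, C)
    gC = np.einsum("ijk,ir,jr->kr", R, A, B)
    gW = lam * A.T @ G
    return gA, gB, gC, gW

def fit_supervised_ntf(X, labels, rank, n_classes, lam=1.0, n_iter=200, seed=0):
    """Hypothetical joint fit: factors are clipped at zero, classifier weights are free."""
    rng = np.random.default_rng(seed)
    A, B, C = (rng.random((n, rank)) for n in X.shape)
    W = np.zeros((rank, n_classes))
    Y = np.eye(n_classes)[labels]                 # one-hot group labels
    step = 1e-2
    history = [objective(X, Y, A, B, C, W, lam)]
    for _ in range(n_iter):
        gA, gB, gC, gW = gradients(X, Y, A, B, C, W, lam)
        while step > 1e-12:
            # Projected gradient step: project the factor matrices onto >= 0.
            cand = (np.maximum(A - step * gA, 0.0),
                    np.maximum(B - step * gB, 0.0),
                    np.maximum(C - step * gC, 0.0),
                    W - step * gW)
            value = objective(X, Y, *cand, lam)
            if value <= history[-1]:
                break                             # accept: objective did not increase
            step *= 0.5                           # backtrack
        else:
            break                                 # no acceptable step size was found
        A, B, C, W = cand
        history.append(value)
        step *= 2.0                               # let the step size grow again
    return A, B, C, W, history

# Synthetic stand-in for a patient x feature x feature tensor with two patient groups.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 10)
A_true = rng.random((20, 3))
A_true[labels == 1, 0] += 2.0                     # group 1 loads more on component 0
X = np.einsum("ir,jr,kr->ijk", A_true, rng.random((6, 3)), rng.random((5, 3)))
A, B, C, W, history = fit_supervised_ntf(X, labels, rank=3, n_classes=2)
print(A.shape, W.shape, len(history))
```

In this sketch the classifier gradient flows into `A` through the `lam * G @ W.T` term, which is what pushes the patient-mode factors toward separating the groups. Setting `lam=0` leaves `W` at zero and reduces the fit to a plain non-negative CP factorization.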



Published In

ACM-BCB '17: Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
August 2017
800 pages
ISBN: 9781450347228
DOI: 10.1145/3107411
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. computational phenotyping
  2. joint learning
  3. representation learning

Qualifiers

  • Research-article

Funding Sources

  • Ministry of Health & Welfare, Republic of Korea
  • Ministry of Science ICT and Future Planning, Republic of Korea

Conference

BCB '17

Acceptance Rates

ACM-BCB '17 Paper Acceptance Rate: 42 of 132 submissions, 32%
Overall Acceptance Rate: 254 of 885 submissions, 29%

Cited By

  • (2019) Scalable Multimodal Factorization for Learning from Big Data. Multimodal Analytics for Next-Generation Big Data Technologies and Applications, 245-268. DOI: 10.1007/978-3-319-97598-6_10. Online publication date: 19-Jul-2019.
  • (2018) Phenotyping of Korean patients with better-than-expected efficacy of moderate-intensity statins using tensor factorization. PLOS ONE 13:6, e0197518. DOI: 10.1371/journal.pone.0197518. Online publication date: 13-Jun-2018.
  • (2018) Fully Supervised Non-Negative Matrix Factorization for Feature Extraction. IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium, 5772-5775. DOI: 10.1109/IGARSS.2018.8518592. Online publication date: Jul-2018.
  • (2018) Auxiliary treatment of thyroid disease tensor combined with active learning method for multiple tasks. 2018 International Conference on Cloud Computing, Big Data and Blockchain (ICCBB), 1-7. DOI: 10.1109/ICCBB.2018.8756399. Online publication date: Nov-2018.
