DOI: 10.1145/3477314.3507007
research-article

Configuring a federated network of real-world patient health data for multimodal deep learning prediction of health outcomes

Published: 06 May 2022

Abstract

Vast quantities of electronic patient medical data are currently being collated and processed in large federated data repositories. For instance, TriNetX, Inc., a global health research network, has access to more than 300 million patients, sourced from healthcare organizations, biopharmaceutical companies, and contract research organizations. As such, pipelines that are able to algorithmically extract huge quantities of patient data from multiple modalities present opportunities to leverage machine learning and deep learning approaches with the possibility of generating actionable insight. In this work, we present a modular, semi-automated end-to-end machine and deep learning pipeline designed to interface with a federated network of structured patient data. This proof-of-concept pipeline is disease-agnostic, scalable, and requires little domain expertise and manual feature engineering in order to quickly produce results for the case of a user-defined binary outcome event. We demonstrate the pipeline's efficacy with three different disease workflows, with high discriminatory power achieved in all cases.
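
The abstract reports "high discriminatory power" for a user-defined binary outcome but does not name the metric. A common choice for this kind of claim is the area under the ROC curve (AUC), and the following is a minimal sketch under that assumption. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations, not part of the authors' pipeline.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    Returns the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case; ties count as 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = outcome event occurred, 0 = it did not.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls. The pairwise form is quadratic in the number of patients, so it suits illustration rather than large cohorts.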

Cited By

  • (2024) Novel Architecture Integrating XAI with Blockchain and IoT Devices for Healthcare. 2024 Second International Conference on Emerging Trends in Information Technology and Engineering (ICETITE), 1-5. DOI: 10.1109/ic-ETITE58242.2024.10493418. Online publication date: 22-Feb-2024
  • (2022) Artificial Intelligence, Bioinformatics, and Pathology. Advances in Molecular Pathology 5:1, e25-e52. DOI: 10.1016/j.yamp.2023.01.002. Online publication date: Nov-2022

    Published In

    SAC '22: Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing
    April 2022
    2099 pages
    ISBN:9781450387132
    DOI:10.1145/3477314

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. comorbidity
    2. deep learning
    3. electronic health records
    4. federated data network
    5. lab measurements

    Qualifiers

    • Research-article

    Conference

    SAC '22

    Acceptance Rates

    Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
