Establishment of machine learning-based tool for early detection of pulmonary embolism

Published: 12 April 2024

Highlights

Screening features that passed hypothesis testing were selected with reference to the 2019 ESC guidelines for the diagnosis and management of acute pulmonary embolism. The study data set was built by cleaning, sorting and screening a large volume of data obtained from hospitals.
Five machine learning models for pulmonary embolism were developed (SVM, logistic regression, random forest, XGBoost, and BP neural network). The XGBoost model proved optimal among the five: its sensitivity, specificity and missed diagnosis rate were all superior to those of the comparison model, reaching a standard suitable for assisting doctors in clinical practice.
The features that contribute most to the XGBoost decision were identified. The 2019 ESC guidelines show that these features are also clinically important, suggesting that our model has learned relevant information for pulmonary embolism screening.
The model is used before pulmonary angiography and requires only routine laboratory and test results as input to assess a patient's risk of pulmonary embolism, providing a reference for doctors when choosing the next examination.

Abstract

Background and objectives

Pulmonary embolism (PE) is a complex disease with high mortality and morbidity, placing a growing burden on society. Despite its complex pathology, current diagnosis is based solely on symptoms and laboratory data, which easily leads to misdiagnosis and missed diagnosis by inexperienced doctors. In particular, CT pulmonary angiography, the gold-standard method, is not widely available. In this study, we aim to establish a rapid and accurate screening model for pulmonary embolism using machine learning. Importantly, the data required for prediction are easy to obtain: routine laboratory data and patients' medical record information.

Methods

We extracted features from patients' routine laboratory results and medical records, including routine blood tests, biochemistry panels, routine coagulation tests and other test results, as well as symptoms and medical history. Samples with a feature missing rate greater than 0.8 were deleted from the original database. Data from 4723 cases were retained, 231 of which were positive for pulmonary embolism. Statistical hypothesis testing between positive and negative cases retained 50 features, which were used to build the predictive model. To prevent the class imbalance from causing samples to be classified as the majority class, we used the Synthetic Minority Oversampling Technique (SMOTE) to increase the amount of information on minority samples. Five typical machine learning algorithms were used to model pulmonary embolism screening: Support Vector Machine, Logistic Regression, Random Forest, XGBoost, and Back Propagation Neural Network. Sensitivity, specificity and the area under the ROC curve (AUC) were the main evaluation indicators. Furthermore, a baseline model built from the features in the pulmonary embolism guidelines served as a comparison model.
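
As a rough illustration of two of the preprocessing steps described above, the following sketch filters samples by missing rate and generates synthetic minority samples by SMOTE-style interpolation. It is a minimal standard-library version written under stated assumptions, not the authors' implementation: missing values are assumed to be encoded as None, and the function names, the toy records and the choice of k are all hypothetical.

```python
import math
import random


def missing_rate(sample):
    """Fraction of a sample's features that are missing (encoded here as None)."""
    return sum(value is None for value in sample) / len(sample)


def drop_sparse_samples(samples, max_missing=0.8):
    """Delete samples whose missing rate is greater than max_missing."""
    return [s for s in samples if missing_rate(s) <= max_missing]


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


# Toy data, purely illustrative (not taken from the study).
records = [
    [None, None, None, None, None],  # missing rate 1.0: deleted
    [1.2, None, None, None, None],   # missing rate 0.8: kept (not greater than 0.8)
    [1.0, 2.0, 3.0, 4.0, 5.0],       # complete: kept
]
kept = drop_sparse_samples(records)

positives = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [2.5, 2.5]]
new_points = smote(positives, n_new=6, k=2)

# Class balance reported in the abstract: 231 positives among 4723 cases.
positive_share = 231 / 4723
```

With the counts given in the abstract, positives make up roughly 5% of the retained cases, which is the imbalance the oversampling step is meant to offset. In practice, oversampling is normally applied to the training split only, so that synthetic points do not leak into the evaluation data; the abstract does not say how this was handled.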

Results

XGBoost outperformed the other models, with the highest sensitivity and specificity (0.99 and 0.99, respectively). It also showed a significant improvement over the baseline model (sensitivity and specificity of 0.76 and 0.76, respectively). More importantly, our model showed a low missed diagnosis rate (0.46) and a high AUC (0.992). Finally, our model takes only about 0.05 s to compute the probability of pulmonary embolism.
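
For readers less familiar with these indicators, the sketch below shows how sensitivity, specificity, missed diagnosis rate and AUC are conventionally computed. It assumes the usual definitions (missed diagnosis rate taken as the false-negative rate, AUC as the probability that a positive case scores higher than a negative one); the abstract does not state which definition or unit it uses for the missed diagnosis rate. The confusion-matrix counts and scores are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def screening_metrics(tp, fn, tn, fp):
    """Compute screening indicators from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
        "missed_rate": fn / (tp + fn),  # false-negative rate, 1 - sensitivity
    }


def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This pairwise form is quadratic in
    the number of samples and is meant for illustration only.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts (not from the study): 100 positives, 1000 negatives.
metrics = screening_metrics(tp=99, fn=1, tn=990, fp=10)

# Hypothetical model scores for two positive and two negative cases.
example_auc = auc([0.9, 0.8], [0.1, 0.85])
```

Under these definitions the missed diagnosis rate and sensitivity always sum to one, so the two figures carry the same information about positive cases.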

Conclusions

In this study, five machine learning classification models were established to assess the likelihood that a patient has pulmonary embolism; the XGBoost model gave the largest improvement in precision, sensitivity and AUC for pulmonary embolism screening. Collectively, we have established an AI-based model to accurately predict pulmonary embolism at an early stage.

Information & Contributors

Information

Published In

Computer Methods and Programs in Biomedicine  Volume 244, Issue C
Feb 2024
678 pages

Publisher

Elsevier North-Holland, Inc.

United States

Author Tags

  1. Pulmonary embolism
  2. Data processing
  3. Machine learning
  4. Xgboost

Qualifiers

  • Research-article
