
Automated Detection of Radiology Reports that Require Follow-up Imaging Using Natural Language Processing Feature Engineering and Machine Learning Classification

  • Original Paper
Journal of Digital Imaging

Abstract

While radiologists regularly issue follow-up recommendations, our preliminary research has shown that 35 to 50% of patients who receive follow-up recommendations for findings of possible cancer on abdominopelvic imaging do not return for follow-up. These patients remain at risk for adverse outcomes related to missed or delayed cancer diagnosis. In this study, we develop an algorithm that automatically detects free-text radiology reports containing a follow-up recommendation, using natural language processing (NLP) techniques and machine learning models. The data set consists of 6000 free-text reports from the authors’ institution. NLP techniques are used to engineer 1500 features, which comprise the most informative unigrams, bigrams, and trigrams in the training corpus after tokenization and Porter stemming. On this data set, we train naive Bayes, decision tree, and maximum entropy models. The decision tree model, with an F1 score of 0.458 and accuracy of 0.862, outperforms both the naive Bayes (F1 score of 0.381) and maximum entropy (F1 score of 0.387) models. We analyzed the models to determine predictive features; term frequencies of n-grams such as “renal neoplasm” and “evalu with enhanc” were most predictive of a follow-up recommendation. Key to maximizing performance were feature engineering that extracts predictive information and selection of machine learning algorithms appropriate to the feature set.
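The feature engineering described in the abstract (tokenization, stemming, then term frequencies of selected unigrams, bigrams, and trigrams) might look roughly like the Python sketch below. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the tokenizer is a plain regular expression, the stemmer defaults to the identity function, and the ranking score (difference in document frequency between the two classes) is a simple stand-in for whatever informativeness criterion the paper actually used.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic runs only (a simplification)."""
    return re.findall(r"[a-z]+", text.lower())


def ngram_counts(text, stem=lambda tok: tok, max_n=3):
    """Count every 1..max_n-gram of the stemmed tokens in one report."""
    tokens = [stem(tok) for tok in tokenize(text)]
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def select_vocabulary(reports, labels, k, stem=lambda tok: tok):
    """Pick the k n-grams whose document frequency differs most by class.

    Assumption: this score is an illustrative stand-in, not the paper's
    criterion. Labels are booleans (True = follow-up recommended).
    """
    doc_freq = {True: Counter(), False: Counter()}
    n_docs = Counter(labels)
    for text, label in zip(reports, labels):
        # set(...) so each n-gram counts once per report
        doc_freq[label].update(set(ngram_counts(text, stem)))
    vocab = set(doc_freq[True]) | set(doc_freq[False])

    def score(gram):
        pos = doc_freq[True][gram] / max(n_docs[True], 1)
        neg = doc_freq[False][gram] / max(n_docs[False], 1)
        return abs(pos - neg)

    # Sort by score (descending), then alphabetically for a stable order.
    return sorted(vocab, key=lambda g: (-score(g), g))[:k]


def vectorize(text, vocabulary, stem=lambda tok: tok):
    """Turn one report into a term-frequency vector over the chosen n-grams."""
    counts = ngram_counts(text, stem)
    return [counts[gram] for gram in vocabulary]
```

To reproduce stemmed n-grams such as “evalu with enhanc”, a Porter stemmer could be passed as `stem`, for example `nltk.stem.PorterStemmer().stem` if NLTK is installed. The resulting vectors could then be fed to any of the classifier families the abstract names.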

Corresponding author

Correspondence to Robert Lou.

About this article

Cite this article

Lou, R., Lalevic, D., Chambers, C. et al. Automated Detection of Radiology Reports that Require Follow-up Imaging Using Natural Language Processing Feature Engineering and Machine Learning Classification. J Digit Imaging 33, 131–136 (2020). https://doi.org/10.1007/s10278-019-00271-7

  • DOI: https://doi.org/10.1007/s10278-019-00271-7
