Automated Detection of Radiology Reports that Require Follow-up Imaging Using Natural Language Processing Feature Engineering and Machine Learning Classification

Robert Lou¹ (ORCID: orcid.org/0000-0003-4723-5416), Darco Lalevic², Charles Chambers², Hanna M. Zafar² & Tessa S. Cook²

Abstract

Radiologists regularly issue follow-up recommendations, yet our preliminary research has shown that 35 to 50% of patients who receive a follow-up recommendation for findings of possible cancer on abdominopelvic imaging do not return for follow-up. These patients remain at risk for adverse outcomes related to missed or delayed cancer diagnosis. In this study, we develop an algorithm that automatically detects free-text radiology reports containing a follow-up recommendation, using natural language processing (NLP) techniques and machine learning models. The data set consists of 6000 free-text reports from the authors’ institution. After tokenization and Porter stemming, NLP techniques are used to engineer 1500 features: the most informative unigrams, bigrams, and trigrams in the training corpus. On this data set, we train naive Bayes, decision tree, and maximum entropy models. The decision tree model, with an F1 score of 0.458 and an accuracy of 0.862, outperforms both the naive Bayes model (F1 score of 0.381) and the maximum entropy model (F1 score of 0.387). We analyzed the models to determine their predictive features; the term frequencies of stemmed n-grams such as “renal neoplasm” and “evalu with enhanc” were most predictive of a follow-up recommendation. Two factors were key to maximizing performance: feature engineering that extracts predictive information, and selection of machine learning algorithms appropriate to the feature set.
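
As a rough illustration of the feature pipeline described in the abstract, the sketch below builds unigram, bigram, and trigram term-frequency features from tokens that are assumed to be already tokenized and stemmed, and then computes F1 and accuracy for a set of predictions. All function names are hypothetical, the toy reports and labels are invented for illustration, and raw corpus frequency is used as a stand-in for the "most informative" selection criterion, which the abstract does not specify. It is not the authors' implementation.

```python
from collections import Counter


def ngrams(tokens, n_max=3):
    """Return every 1- to n_max-gram of a token list as a space-joined string."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def top_features(docs, k):
    """Keep the k most frequent n-grams in the corpus.

    Assumption: raw corpus frequency stands in for the "most informative"
    criterion, which the abstract does not define.
    """
    counts = Counter()
    for tokens in docs:
        counts.update(ngrams(tokens))
    return [gram for gram, _ in counts.most_common(k)]


def featurize(tokens, vocabulary):
    """Term-frequency vector of one report over the selected n-grams."""
    counts = Counter(ngrams(tokens))
    return [counts[gram] for gram in vocabulary]


def f1_and_accuracy(y_true, y_pred):
    """F1 score for the positive class (label 1) and overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return f1, correct / len(pairs)


# Toy, already-stemmed report (invented for illustration only).
report = ["recommend", "evalu", "with", "enhanc", "mri"]
print(featurize(report, ["evalu with enhanc", "renal neoplasm"]))  # [1, 0]

# Toy labels: 2 of 10 reports carry a follow-up recommendation.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(f1_and_accuracy(y_true, y_pred))  # (0.5, 0.8)
```

The toy labels show how accuracy can sit well above F1 when the positive class is rare: the many correctly classified negative reports raise accuracy, while F1 depends only on true positives, false positives, and false negatives. This is a general property of the two metrics and is not a statement about the class balance of the study's data set.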

Author information

Authors and Affiliations

Perelman School of Medicine at the University of Pennsylvania, 801 S 24th St #3, Philadelphia, PA, 19146, USA
Robert Lou
Hospital of the University of Pennsylvania, Philadelphia, PA, USA
Darco Lalevic, Charles Chambers, Hanna M. Zafar & Tessa S. Cook

Corresponding author

Correspondence to Robert Lou.

About this article

Cite this article

Lou, R., Lalevic, D., Chambers, C. et al. Automated Detection of Radiology Reports that Require Follow-up Imaging Using Natural Language Processing Feature Engineering and Machine Learning Classification. J Digit Imaging 33, 131–136 (2020). https://doi.org/10.1007/s10278-019-00271-7

Published: 03 September 2019
Issue Date: February 2020
DOI: https://doi.org/10.1007/s10278-019-00271-7
