Abstract
As the demand for more descriptive machine learning models grows within medical imaging, bottlenecks caused by data paucity will worsen. Collecting enough large-scale data will therefore require automated tools that harvest data/label pairs from messy, real-world datasets, such as hospital picture archiving and communication systems (PACSs). This is the focus of our work, where we present a principled data curation tool to extract multi-phase computed tomography (CT) liver studies and identify each scan’s phase from a real-world and heterogeneous hospital PACS dataset. Emulating a typical deployment scenario, we first obtain a set of noisy labels from our institutional partners that are text mined using simple rules from DICOM tags. We then train a deep learning system, using a customized and streamlined 3D squeeze and excitation (SE) architecture, to identify non-contrast, arterial, venous, and delay phase dynamic CT liver scans, filtering out anything else, including other types of liver contrast studies. To exploit as much training data as possible, we also introduce an aggregated cross entropy loss that can learn from scans only identified as “contrast”. Extensive experiments on a dataset of 43K scans from 7680 patient imaging studies demonstrate that our 3DSE architecture, armed with our aggregated loss, can achieve a mean F1 of 0.977 and can correctly harvest up to 92.7% of studies, significantly outperforming the text-mined and standard-loss approaches as well as other, more complex model architectures.
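The idea of a loss that can learn from scans labelled only as “contrast” can be illustrated with a minimal sketch. It is not the authors' implementation: the class ordering, the choice of which phases make up the “contrast” group, and the helper names `label_mask` and `aggregated_cross_entropy` are all assumptions made for illustration. The sketch sums the predicted probabilities over every class a weak label allows and penalises the negative log of that sum, so a fully labelled scan reduces to ordinary cross entropy.

```python
import numpy as np

# Hypothetical class ordering; the abstract names the phases but not an index layout.
CLASSES = ["non_contrast", "arterial", "venous", "delay", "other"]
# Assumption: a scan labelled only "contrast" may be any of the three enhanced phases.
CONTRAST_GROUP = [1, 2, 3]


def softmax(logits):
    """Row-wise softmax with max subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def label_mask(label):
    """Return a 0/1 vector marking the classes that a (possibly weak) label allows."""
    mask = np.zeros(len(CLASSES))
    if label == "contrast":
        mask[CONTRAST_GROUP] = 1.0
    else:
        mask[CLASSES.index(label)] = 1.0
    return mask


def aggregated_cross_entropy(logits, allowed):
    """Mean of -log(total probability assigned to each sample's allowed classes).

    logits:  (n_samples, n_classes) raw scores.
    allowed: (n_samples, n_classes) 0/1 mask; a single 1 for a fully labelled
             scan, several 1s for a weakly labelled scan.
    """
    probs = softmax(np.asarray(logits, dtype=float))
    mass = (probs * np.asarray(allowed, dtype=float)).sum(axis=1)
    return float(-np.log(mass).mean())


if __name__ == "__main__":
    logits = np.array([[0.2, 2.0, 1.0, 0.5, -1.0]])
    print(aggregated_cross_entropy(logits, [label_mask("arterial")]))
    print(aggregated_cross_entropy(logits, [label_mask("contrast")]))
```

Under these assumptions, the weak label never penalises the model for choosing among arterial, venous, and delay; it only pushes probability mass away from the classes outside the group.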
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhou, B. et al. (2019). CT Data Curation for Liver Patients: Phase Recognition in Dynamic Contrast-Enhanced CT. In: Wang, Q., et al. (eds.) Domain Adaptation and Representation Transfer and Medical Image Learning with Less Labels and Imperfect Data. DART 2019, MIL3ID 2019. Lecture Notes in Computer Science, vol 11795. Springer, Cham. https://doi.org/10.1007/978-3-030-33391-1_16
DOI: https://doi.org/10.1007/978-3-030-33391-1_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33390-4
Online ISBN: 978-3-030-33391-1
eBook Packages: Computer Science, Computer Science (R0)