DOI: 10.1145/3591569.3591602
research-article

Vision Transformer for Pneumonia Classification in X-ray Images

Published: 13 July 2023

Abstract

Pneumonia is a common medical condition, usually caused by a lung infection, in which the lung tissue becomes inflamed and lung function is impaired. It ranges in severity from mild to life-threatening, and identifying the responsible pathogen can be difficult. Diagnosis is often based on symptoms and physical examination, including chest X-rays. However, interpreting chest X-rays is challenging and prone to subjective variability. In this study, we investigate an image classification method for chest X-ray images that indicate pneumonia. The proposed method uses the Vision Transformer architecture to extract features and classify the input image as pneumonia or not. We compare two popular families of deep learning architectures, Vision Transformers and Convolutional Neural Networks, by evaluating ViT-B/16 against the CNN models MobileNetV2, VGG16, and ResNet-50. In our experiments, the Vision Transformer gives good classification results, with an accuracy of approximately 94%.
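The abstract does not describe the input pipeline, so the following is only a minimal sketch of how a ViT-B/16-style model would turn an X-ray into a token sequence (the "/16" denotes 16x16-pixel patches). The 224x224 resolution, the single grayscale channel, and the helper names `split_into_patches` and `vit_sequence_length` are assumptions made for illustration and are not taken from the paper.

```python
# Illustrative sketch only: patch tokenization in a ViT-B/16-style model.
# Assumed (not from the paper): 224x224 input, one grayscale channel, helper names.

def split_into_patches(image, patch_size):
    """Split a 2D image (list of rows) into flattened square patches, row-major."""
    height = len(image)
    width = len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be multiples of patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size block into a single vector.
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


def vit_sequence_length(image_size, patch_size):
    """Tokens seen by the encoder: one per patch plus one class token."""
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side + 1


# Hypothetical 224x224 single-channel X-ray filled with zeros.
xray = [[0.0] * 224 for _ in range(224)]
patches = split_into_patches(xray, 16)

num_patches = len(patches)               # 14 * 14 patches
patch_dim = len(patches[0])              # 16 * 16 values per grayscale patch
seq_len = vit_sequence_length(224, 16)   # patches plus the class token
```

In a full model, each flattened patch would then be linearly projected to the embedding dimension and combined with a position embedding. The encoder output for the class token would feed a two-way head (pneumonia or not). How the authors configured these stages is not stated in the abstract.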

Cited By

  • (2024) Classification and detection of Covid-19 based on X-Ray and CT images using deep learning and machine learning techniques: A bibliometric analysis. AIMS Electronics and Electrical Engineering 8:1 (71-103). https://doi.org/10.3934/electreng.2024004. Online publication date: 2024.

    Published In

    ICIIT '23: Proceedings of the 2023 8th International Conference on Intelligent Information Technology
    February 2023
    310 pages
    ISBN:9781450399616
    DOI:10.1145/3591569
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Classification
    2. Convolution Neural Network
    3. Pneumonia
    4. Residual Neural Network
    5. Vision Transformer

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICIIT 2023
