research-article

Vision Transformer for Pneumonia Classification in X-ray Images

Authors:

Antoine Doucet,

Giang Son Tran

ICIIT '23: Proceedings of the 2023 8th International Conference on Intelligent Information Technology

Pages 185 - 192

https://doi.org/10.1145/3591569.3591602

Published: 13 July 2023

Abstract

Pneumonia is a common medical condition, usually caused by a lung infection, in which the lung tissue becomes inflamed and lung function is impaired. Its severity ranges from mild to life-threatening, and identifying the responsible pathogen can be difficult. Diagnosis is often based on symptoms and physical examination, supported by chest X-rays. However, reading chest X-rays is a challenging task and is prone to subjective variability. In this study, we investigate a new image classification approach for identifying chest X-ray images that indicate pneumonia. The proposed method uses the Vision Transformer architecture to extract image features and classify each input image as pneumonia or not. We compare two popular families of deep learning architectures, Vision Transformers and Convolutional Neural Networks, evaluating ViT-B/16 against Convolutional Neural Network models such as MobileNetV2, VGG16, and ResNet-50. In our experiments, the Vision Transformer achieves a classification accuracy of approximately 94%.
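To make the "/16" in ViT-B/16 concrete, the following is a minimal sketch of the patch-splitting step that a Vision Transformer applies before its encoder: the image is cut into non-overlapping 16 x 16 patches, and each patch is flattened into one token. This is an illustration only, not the authors' implementation. The function names `split_into_patches` and `num_tokens` are hypothetical, and the 224 x 224 input size used in the usage note is an assumption, since the abstract does not state the image resolution.

```python
from typing import List

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def split_into_patches(image: Image, patch_size: int = 16) -> List[List[float]]:
    """Split an H x W image into non-overlapping, flattened square patches."""
    height = len(image)
    width = len(image[0]) if height else 0
    if height == 0 or height % patch_size or width % patch_size:
        raise ValueError("image sides must be non-zero multiples of patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch in row-major order.
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


def num_tokens(height: int, width: int, patch_size: int = 16) -> int:
    """Number of patch tokens plus one class token (hypothetical helper)."""
    return (height // patch_size) * (width // patch_size) + 1
```

Under the assumed 224 x 224 input, `num_tokens(224, 224)` gives 14 x 14 patch tokens plus one class token. In a full model each flattened patch would then be linearly projected and combined with a position embedding, which this sketch leaves out.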

Index Terms

Vision Transformer for Pneumonia Classification in X-ray Images
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems

Published In

ICIIT '23: Proceedings of the 2023 8th International Conference on Intelligent Information Technology

February 2023

310 pages

ISBN:9781450399616

DOI:10.1145/3591569

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICIIT 2023

ICIIT 2023: 2023 8th International Conference on Intelligent Information Technology

February 24 - 26, 2023

Da Nang, Vietnam
