Multi-class Cancer Classification of Whole Slide Images Through Transformer and Multiple Instance Learning

Haijing Luan^11,12 (ORCID: 0000-0002-7290-2236),
Taiyuan Hu^11,12,
Jifang Hu^11,12 (ORCID: 0009-0002-6254-1251),
Ruilin Li^11 (ORCID: 0000-0001-9593-1235),
Detao Ji^11,12,
Jiayin He^11,
Xiaohong Duan^13,
Chunyan Yang^13,
Yajun Gao^13,
Fan Chen^13 &
…
Beifang Niu^11,12 (ORCID: 0000-0002-7448-7793)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 14248)

Included in the following conference series:

International Symposium on Bioinformatics Research and Applications

Abstract

Whole slide images (WSIs) are high-resolution and typically lack localized annotations, so when only slide-level labels are available their classification can be treated as a multiple instance learning (MIL) problem. We introduce an approach for WSI classification that combines MIL with a Transformer, eliminating the need for localized annotations. Our method consists of three key components. First, we use ResNet50 pre-trained on ImageNet as an instance feature extractor. Second, we present a Transformer-based MIL aggregator that captures contextual information within individual regions and correlation information among different regions of the WSI. Third, we introduce a global average pooling (GAP) layer to strengthen the mapping between WSI features and category features. To evaluate our model, we conducted experiments on The Cancer Imaging Archive (TCIA) Clinical Proteomic Tumor Analysis Consortium (CPTAC) dataset. Our proposed method achieves a top-1 accuracy of 94.8% and an area under the curve (AUC) exceeding 0.996, which is state-of-the-art performance for WSI classification without localized annotations. The results show that our approach outperforms previous MIL-based methods.
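
The three components can be illustrated with a toy sketch. It is not the authors' implementation: random vectors stand in for ResNet50 patch features, a single untrained self-attention head stands in for the Transformer-based aggregator, and pooling over instances is only one plausible reading of the GAP step. All function names, sizes, and weights below are hypothetical and chosen for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(feats, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention across the instances of one bag."""
    q, k, v = matmul(feats, w_q), matmul(feats, w_k), matmul(feats, w_v)
    scale = math.sqrt(len(q[0]))
    # scores[i][j]: how strongly instance i attends to instance j
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    weights = [softmax(r) for r in scores]
    return matmul(weights, v)


def global_average_pool(feats):
    """Average the instance features into one slide-level vector."""
    n = len(feats)
    return [sum(col) / n for col in zip(*feats)]


def classify_bag(feats, params):
    """Return class probabilities for one bag (slide) of instance features."""
    attended = self_attention(feats, params["w_q"], params["w_k"], params["w_v"])
    slide_vec = global_average_pool(attended)
    logits = matmul([slide_vec], params["w_cls"])[0]
    return softmax(logits)


def random_matrix(rows, cols, rng):
    """Small random matrix; stands in for learned weights or extracted features."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim, num_classes, num_patches = 8, 3, 5  # toy sizes (assumed, not from the paper)
params = {
    "w_q": random_matrix(dim, dim, rng),
    "w_k": random_matrix(dim, dim, rng),
    "w_v": random_matrix(dim, dim, rng),
    "w_cls": random_matrix(dim, num_classes, rng),
}
bag = random_matrix(num_patches, dim, rng)  # stand-in for per-patch features
probs = classify_bag(bag, params)           # one probability per class
```

Only a slide-level label would be needed to train such a model, since the prediction is made for the whole bag rather than for individual patches. Because this sketch uses no positional encoding, its output does not depend on the order of the patches; the published model may differ in this respect.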

H. Luan and T. Hu—Equal contribution.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (grant number 92259101) and the Strategic Priority Research Program of the Chinese Academy of Sciences (grant number XDB38040100).

Author information

Authors and Affiliations

Computer Network Information Center, Chinese Academy of Sciences, 100190, Beijing, China
Haijing Luan, Taiyuan Hu, Jifang Hu, Ruilin Li, Detao Ji, Jiayin He & Beifang Niu
University of Chinese Academy of Sciences, 100190, Beijing, China
Haijing Luan, Taiyuan Hu, Jifang Hu, Detao Ji & Beifang Niu
ChosenMed Technology (Beijing) Co., Ltd., 100176, Beijing, China
Xiaohong Duan, Chunyan Yang, Yajun Gao & Fan Chen

Corresponding author

Correspondence to Beifang Niu.

Editor information

Editors and Affiliations

University of North Texas, Denton, TX, USA
Xuan Guo
University of Southern California, Los Angeles, CA, USA
Serghei Mangul
Georgia State University, Atlanta, GA, USA
Murray Patterson
Georgia State University, Atlanta, GA, USA
Alexander Zelikovsky

Ethics declarations

Availability

The pathology slides and corresponding labels for WSIs are available from the CPTAC Pathology Portal. All source code used in our study was implemented in Python using the PyTorch library and is available at https://github.com/Luan-zb/TMG.

About this paper

Cite this paper

Luan, H. et al. (2023). Multi-class Cancer Classification of Whole Slide Images Through Transformer and Multiple Instance Learning. In: Guo, X., Mangul, S., Patterson, M., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2023. Lecture Notes in Computer Science, vol 14248. Springer, Singapore. https://doi.org/10.1007/978-981-99-7074-2_12

DOI: https://doi.org/10.1007/978-981-99-7074-2_12
Published: 08 October 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7073-5
Online ISBN: 978-981-99-7074-2
eBook Packages: Computer Science, Computer Science (R0)
