Abstract
Spatial transcriptomics (ST) quantitatively interprets human diseases by providing the gene expression of each fine-grained spot (i.e., window) on a tissue slide. This paper focuses on predicting gene expression for specific windows of a tissue slide image. However, the image features relevant to gene expression typically span diverse spatial scales. To model these features spatially, we propose the Coarse and Fine Attention Network (CFANet). At the coarse level, we adopt a coarse-to-fine strategy to acquire adaptive global features: coarse-grained area guidance realizes sparse external-window attention by filtering out the most irrelevant feature areas. At the fine level, dynamic convolutions realize internal-window attention to obtain dynamic local features. By iterating our CFAN block, we construct features for different gene types within the slide-image windows to predict gene expression. Notably, without any pre-training, our CFANet achieves an impressive PCC@S of 81.6% for gene expression prediction on 10X Genomics breast cancer data, surpassing the current SOTA model by 5.6%. This demonstrates the model's potential as a useful network for gene prediction. Code is available at https://github.com/biyecc/CFANet.
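The two attention mechanisms named in the abstract can be illustrated schematically. The sketch below is not the authors' implementation (their code is in the linked repository); it is a minimal NumPy illustration under stated assumptions: `sparse_area_attention` shows the coarse-level idea of scoring candidate areas against a window query and attending only over the top-k most relevant ones, and `dynamic_local_conv` shows the fine-level idea of a convolution whose kernel is predicted from the local input itself. All function and parameter names are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sparse_area_attention(query, areas, k=2):
    """Coarse-level sketch: score each coarse-grained area against the
    current window's query, keep only the top-k most relevant areas
    (filtering out the rest), then attend over the kept tokens."""
    # query: (d,) pooled feature of the current window
    # areas: list of (n_i, d) token matrices, one per coarse area
    area_scores = np.array([query @ a.mean(axis=0) for a in areas])
    keep = np.argsort(area_scores)[-k:]        # indices of the kept areas
    tokens = np.concatenate([areas[i] for i in keep], axis=0)
    attn = softmax(tokens @ query)             # token-level attention weights
    return attn @ tokens, keep

def dynamic_local_conv(tokens, ksize=3):
    """Fine-level sketch: a 1-D dynamic convolution over a token sequence
    where the kernel weights are generated from each local neighbourhood,
    so the local aggregation adapts to the input."""
    n, d = tokens.shape
    pad = ksize // 2
    padded = np.pad(tokens, ((pad, pad), (0, 0)))
    out = np.empty_like(tokens)
    for i in range(n):
        window = padded[i:i + ksize]           # (ksize, d) neighbourhood
        w = softmax(window @ tokens[i])        # input-dependent kernel
        out[i] = w @ window
    return out
```

In this toy form, filtering areas before attending reduces the attention cost from all tokens on the slide to only the tokens of the k retained areas, which mirrors the abstract's claim of sparsity via irrelevance filtering.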
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Chen, C., Zhang, Z., Mounir, A., Liu, X., Huang, B. (2024). Spatial Gene Expression Prediction Using Coarse and Fine Attention Network. In: Liu, F., Sadanandan, A.A., Pham, D.N., Mursanto, P., Lukose, D. (eds) PRICAI 2023: Trends in Artificial Intelligence. PRICAI 2023. Lecture Notes in Computer Science, vol 14327. Springer, Singapore. https://doi.org/10.1007/978-981-99-7025-4_34
Print ISBN: 978-981-99-7024-7
Online ISBN: 978-981-99-7025-4