Abstract
Generating accurate structural segmentations of 3D two-photon excitation microscopy (TPEM) images affords insights into how cellular-scale networks in living animal models respond to disease. Manual creation of dense segmentation masks is, however, very time-consuming. Masked image modeling (MIM) has recently emerged as a highly effective self-supervised learning (SSL) formulation for feature extraction in natural images, reducing reliance on human-created labels. Here, we extend MIM to 3D TPEM datasets and show that a model pretrained with MIM achieves better downstream segmentation performance than one trained from random initialization. We assess our pipeline on multi-channel TPEM data for two common segmentation tasks, neuronal and vascular segmentation. We also introduce intensity-based and channel-separated masking strategies that exploit, respectively, the intra-channel correlation between intensity and foreground structures, and the inter-channel correlations specific to microscopy images. We show that these strategies yield effective representations of TPEM images, and we identify new insights into how MIM can be adapted to produce more salient image representations for microscopy. Our method reaches statistically similar performance to a fully supervised model trained on the entire dataset while requiring only 25% of the labeled data for both neuronal and vascular segmentation. To the best of our knowledge, this is the first investigation applying MIM to microscopy, and we hope the presented SSL pipeline will both reduce the necessary labeling effort and improve downstream analysis of TPEM images for neuroscience investigations. To this end, we plan to make the SSL pipeline, pretrained models, and training code available under the following GitHub organization: https://github.com/AICONSlab.
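The intensity-based masking strategy described above can be illustrated with a minimal sketch: instead of masking 3D patches uniformly at random, patches are sampled for masking with probability weighted by their mean intensity, so bright foreground structures (neurons, vessels) are hidden more often. The function name, patch partitioning, and exact sampling scheme below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def intensity_weighted_mask(volume, patch=8, mask_ratio=0.6, rng=None):
    """Select 3D patches to mask with probability proportional to their
    mean intensity, biasing the mask toward foreground structures.

    volume: 3D numpy array whose dimensions are divisible by `patch`.
    Returns a boolean array over the patch grid (True = masked).
    """
    rng = np.random.default_rng(rng)
    d, h, w = (s // patch for s in volume.shape)
    # Mean intensity of each non-overlapping patch x patch x patch block.
    means = volume.reshape(d, patch, h, patch, w, patch).mean(axis=(1, 3, 5))
    probs = means.flatten() + 1e-8          # keep every patch selectable
    probs /= probs.sum()
    n_mask = int(mask_ratio * probs.size)
    idx = rng.choice(probs.size, size=n_mask, replace=False, p=probs)
    mask = np.zeros(probs.size, dtype=bool)
    mask[idx] = True
    return mask.reshape(d, h, w)
```

A channel-separated variant would apply such a mask to one imaging channel while leaving the other visible, encouraging the model to exploit inter-channel correlations during reconstruction.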
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Xu, T. et al. (2023). Masked Image Modeling for Label-Efficient Segmentation in Two-Photon Excitation Microscopy. In: Xue, Z., et al. Medical Image Learning with Limited and Noisy Data. MILLanD 2023. Lecture Notes in Computer Science, vol 14307. Springer, Cham. https://doi.org/10.1007/978-3-031-44917-8_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47196-4
Online ISBN: 978-3-031-44917-8