Abstract
Learning with multiple modalities is crucial for automated brain tumor segmentation from magnetic resonance imaging data. Explicitly optimizing the common information shared among all modalities (e.g., by maximizing the total correlation) has been shown to achieve better feature representations and thus enhance the segmentation performance. However, existing approaches are oblivious to partial common information shared by subsets of the modalities. In this paper, we show that identifying such partial common information can significantly boost the discriminative power of image segmentation models. In particular, we introduce a novel concept of partial common information mask (PCI-mask) to provide a fine-grained characterization of what partial common information is shared by which subsets of the modalities. By solving a masked correlation maximization problem and simultaneously learning an optimal PCI-mask, we identify the latent microstructure of partial common information and leverage it in a self-attention module to selectively weight different feature representations in multi-modal data. We implement our proposed framework on the standard U-Net. Our experimental results on the Multi-modal Brain Tumor Segmentation Challenge (BraTS) datasets outperform those of state-of-the-art segmentation baselines, with validation Dice similarity coefficients of 0.920, 0.897, and 0.837 for the whole tumor, tumor core, and enhancing tumor, respectively, on BraTS-2020.
References
Bauer, S., Wiest, R., Nolte, L.P., Reyes, M.: A survey of MRI-based medical image analysis for brain tumor studies. Phys. Med. Biol. 58(13), R97 (2013)
Bian, W., Chen, Y., Ye, X., Zhang, Q.: An optimization-based meta-learning model for MRI reconstruction with diverse dataset. J. Imaging 7(11), 231 (2021)
Bian, W., Zhang, Q., Ye, X., Chen, Y.: A learnable variational model for joint multimodal MRI reconstruction and synthesis. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds.) International Conference on Medical Image Computing and Computer-Assisted Intervention, vol. 13436, pp. 354–364. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-16446-0_34
Chen, H., Qi, X., Yu, L., Heng, P.A.: DCAN: deep contour-aware networks for accurate gland segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2487–2496 (2016)
Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., Brox, T., Ronneberger, O.: 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: Ourselin, S., Joskowicz, L., Sabuncu, M.R., Unal, G., Wells, W. (eds.) MICCAI 2016. LNCS, vol. 9901, pp. 424–432. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46723-8_49
Cui, S., Mao, L., Jiang, J., Liu, C., Xiong, S.: Automatic semantic segmentation of brain gliomas from MRI images using a deep cascaded neural network. J. Healthcare Eng. 2018 (2018)
DeAngelis, L.M.: Brain tumors. N. Engl. J. Med. 344(2), 114–123 (2001)
Feizi, S., Makhdoumi, A., Duffy, K., Kellis, M., Medard, M.: Network maximal correlation. IEEE Trans. Netw. Sci. Eng. 4(4), 229–247 (2017)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)
Huang, S.L., Makur, A., Zheng, L., Wornell, G.W.: An information-theoretic approach to universal feature selection in high-dimensional inference. In: 2017 IEEE International Symposium on Information Theory (ISIT), pp. 1336–1340. IEEE (2017)
Huang, S.L., Xu, X., Zheng, L.: An information-theoretic approach to unsupervised feature selection for high-dimensional data. IEEE J. Sel. Areas Inf. Theory 1(1), 157–166 (2020)
Huang, S.L., Xu, X., Zheng, L., Wornell, G.W.: An information theoretic interpretation to deep neural networks. In: 2019 IEEE International Symposium on Information Theory (ISIT), pp. 1984–1988. IEEE (2019)
Isensee, F., Kickingereder, P., Wick, W., Bendszus, M., Maier-Hein, K.H.: Brain tumor segmentation and radiomics survival prediction: contribution to the BRATS 2017 challenge. In: Crimi, A., Bakas, S., Kuijf, H., Menze, B., Reyes, M. (eds.) BrainLes 2017. LNCS, vol. 10670, pp. 287–297. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-75238-9_25
Isensee, F., et al.: Abstract: nnU-Net: self-adapting framework for U-Net-based medical image segmentation. In: Bildverarbeitung für die Medizin 2019. I, pp. 22–22. Springer, Wiesbaden (2019). https://doi.org/10.1007/978-3-658-25326-4_7
Jia, H., Cai, W., Huang, H., Xia, Y.: H\(^2\)NF-net for brain tumor segmentation using multimodal MR imaging: 2nd place solution to BraTS challenge 2020 segmentation task. In: Crimi, A., Bakas, S. (eds.) BrainLes 2020. LNCS, vol. 12659, pp. 58–68. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-72087-2_6
Jiang, Z., Ding, C., Liu, M., Tao, D.: Two-stage cascaded U-Net: 1st place solution to BraTS challenge 2019 segmentation task. In: Crimi, A., Bakas, S. (eds.) BrainLes 2019. LNCS, vol. 11992, pp. 231–241. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-46640-4_22
Kaganami, H.G., Beiji, Z.: Region-based segmentation versus edge detection. In: 2009 Fifth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, pp. 1217–1221. IEEE (2009)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images (2009)
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
Louis, D.N., et al.: The 2016 world health organization classification of tumors of the central nervous system: a summary. Acta Neuropathol. 131(6), 803–820 (2016)
Ma, F., Zhang, W., Li, Y., Huang, S.L., Zhang, L.: An end-to-end learning approach for multimodal emotion recognition: extracting common and private information. In: 2019 IEEE International Conference on Multimedia and Expo (ICME), pp. 1144–1149. IEEE (2019)
McKinley, R., Meier, R., Wiest, R.: Ensembles of densely-connected CNNs with label-uncertainty for brain tumor segmentation. In: Crimi, A., Bakas, S., Kuijf, H., Keyvan, F., Reyes, M., van Walsum, T. (eds.) BrainLes 2018. LNCS, vol. 11384, pp. 456–465. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11726-9_40
Mei, Y., Lan, T., Imani, M., Subramaniam, S.: A Bayesian optimization framework for finding local optima in expensive multi-modal functions. arXiv preprint arXiv:2210.06635 (2022)
Menze, B.H., et al.: The multimodal brain tumor image segmentation benchmark (BraTS). IEEE Trans. Med. Imaging 34(10), 1993–2024 (2014)
Milletari, F., Navab, N., Ahmadi, S.A.: V-Net: fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision (3DV), pp. 565–571. IEEE (2016)
Muthukrishnan, R., Radha, M.: Edge detection techniques for image segmentation. Int. J. Comput. Sci. Inf. Technol. 3(6), 259 (2011)
Myronenko, A.: 3D MRI brain tumor segmentation using autoencoder regularization. In: Crimi, A., Bakas, S., Kuijf, H., Keyvan, F., Reyes, M., van Walsum, T. (eds.) BrainLes 2018. LNCS, vol. 11384, pp. 311–320. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11726-9_28
Oktay, O., et al.: Attention U-Net: learning where to look for the pancreas. arXiv preprint arXiv:1804.03999 (2018)
Pearson, K.: VII. Note on regression and inheritance in the case of two parents. Proc. Roy. Soc. London 58(347–352), 240–242 (1895)
Petit, O., Thome, N., Rambour, C., Themyr, L., Collins, T., Soler, L.: U-Net transformer: self and cross attention for medical image segmentation. In: Lian, C., Cao, X., Rekik, I., Xu, X., Yan, P. (eds.) MLMI 2021. LNCS, vol. 12966, pp. 267–276. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-87589-3_28
Rényi, A.: On measures of dependence. Acta Mathematica Academiae Scientiarum Hungarica 10(3–4), 441–451 (1959)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998–6008 (2017)
Wang, G., Li, W., Ourselin, S., Vercauteren, T.: Automatic brain tumor segmentation using cascaded anisotropic convolutional neural networks. In: Crimi, A., Bakas, S., Kuijf, H., Menze, B., Reyes, M. (eds.) BrainLes 2017. LNCS, vol. 10670, pp. 178–190. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-75238-9_16
Wang, L., et al.: An efficient approach to informative feature extraction from multimodal data. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 5281–5288 (2019)
Wang, Y., et al.: Modality-pairing learning for brain tumor segmentation. In: Crimi, A., Bakas, S. (eds.) BrainLes 2020. LNCS, vol. 12658, pp. 230–240. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-72084-1_21
Wu, X., Hu, Z., Pei, J., Huang, H.: Serverless federated AUPRC optimization for multi-party collaborative imbalanced data mining. In: SIGKDD Conference on Knowledge Discovery and Data Mining (KDD). ACM (2023)
Xu, F., Ma, H., Sun, J., Wu, R., Liu, X., Kong, Y.: LSTM multi-modal UNet for brain tumor segmentation. In: 2019 IEEE 4th International Conference on Image, Vision and Computing (ICIVC), pp. 236–240. IEEE (2019)
Xu, X., Huang, S.L.: Maximal correlation regression. IEEE Access 8, 26591–26601 (2020)
Zhang, D., Zhou, F., Jiang, Y., Fu, Z.: MM-BSN: self-supervised image denoising for real-world with multi-mask based on blind-spot network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4188–4197 (2023)
Zhang, J., Jiang, Z., Dong, J., Hou, Y., Liu, B.: Attention gate ResU-Net for automatic MRI brain tumor segmentation. IEEE Access 8, 58533–58545 (2020)
Zhang, W., Gu, W., Ma, F., Ni, S., Zhang, L., Huang, S.L.: Multimodal emotion recognition by extracting common and modality-specific information. In: Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems, pp. 396–397 (2018)
Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., Liang, J.: UNet++: a nested U-Net architecture for medical image segmentation. In: Stoyanov, D., et al. (eds.) DLMIA/ML-CDS -2018. LNCS, vol. 11045, pp. 3–11. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00889-5_1
Appendices
A Proof of Theorem 1
To begin with, we rewrite the covariances \(\mathbf {\Sigma }_{\boldsymbol{f}_i(X_i)}\) and \(\mathbf {\Sigma }_{\boldsymbol{f}_j(X_j)}\) in terms of expectations of the feature representations, which yields the following unbiased estimators of the covariance matrices:
Based on optimization problem (3), we apply the selective mask vector \(\boldsymbol{s}\) to the input feature representations via the element-wise product. By the property that the element-wise product of two vectors equals the matrix multiplication of one vector by the diagonal matrix formed from the other, we have:
where \(D_{\boldsymbol{s}}\) represents the diagonal matrix with the same diagonal elements as the vector \(\boldsymbol{s}\).
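For concreteness, here is a minimal NumPy check of this identity; the names s, f, and D_s below are illustrative and not taken from the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 5
s = rng.random(m)            # selective mask vector s
f = rng.standard_normal(m)   # one feature representation vector

D_s = np.diag(s)             # diagonal matrix built from the entries of s

# Element-wise product equals multiplication by the diagonal matrix: s ⊙ f = D_s f.
assert np.allclose(s * f, D_s @ f)
```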
The transpose of a diagonal matrix equals itself. Therefore, the function \(\bar{L}\) in (3) is now given by:
Considering that the input in Eq. (8a) is zero-mean, i.e., \(\mathbb {E}[\boldsymbol{f}_i(X_i)]={\textbf {0}}\) for \(i=1,2,\dots ,k\), the term (8b) becomes:
Thus, (8b) can be omitted since it equals 0. Using the properties of the matrix trace, the third term (8c) can be rewritten as:
where the product of the two diagonal matrices \(D_{\boldsymbol{s}}\) is again a diagonal matrix of dimension \(m\times m\). Therefore, we define \(\mathbf {\Lambda }\) as a diagonal matrix satisfying:
The constraints on the vector \(\boldsymbol{s}\) remain applicable to \(\mathbf {\Lambda }\). Using \(\mathbf {\Lambda }\) to replace the matrix products in terms (8a) and (8c), we obtain the equivalent objective given in (9a):
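As a sanity check on the two facts used in this step (that \(D_{\boldsymbol{s}} D_{\boldsymbol{s}}\) is itself diagonal, and that the trace is invariant under cyclic permutations), a small NumPy sketch with illustrative names:

```python
import numpy as np

rng = np.random.default_rng(1)
m = 4
s = rng.random(m)
D_s = np.diag(s)

# The product of two diagonal matrices is diagonal: Lambda = D_s D_s.
Lam = D_s @ D_s
assert np.allclose(Lam, np.diag(s * s))

# Cyclic property of the trace, of the kind used to rewrite term (8c).
A = rng.standard_normal((m, m))
B = rng.standard_normal((m, m))
assert np.isclose(np.trace(D_s @ A @ D_s @ B), np.trace(A @ D_s @ B @ D_s))
```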
B Proof of Lemma 1
Given a function \(f\) of a matrix \(X\), we can connect the matrix derivative with the total differential \(\mathop {}\!{d}f\) by:
\(\mathop {}\!{d}f = \mathrm {tr}\big ((\partial f/\partial X)^{T}\, \mathop {}\!{d}X\big )\).  (10)
Note that Eq. (10) still holds when the matrix X reduces to a vector \(\boldsymbol{x}\).
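A quick numeric illustration of Eq. (10), using our own simple example \(f(X)=\mathrm {tr}(A^T X)\), whose matrix derivative is \(A\); this is a generic check, not the paper's derivation:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
X = rng.standard_normal((3, 3))

def f(X):
    return np.trace(A.T @ X)  # example function; its matrix derivative is A

dX = 1e-6 * rng.standard_normal((3, 3))    # small perturbation of X
df = f(X + dX) - f(X)                      # total differential (exact here, f is linear)
assert np.isclose(df, np.trace(A.T @ dX))  # matches tr((∂f/∂X)^T dX) per Eq. (10)
```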
The gradient computation in Lemma 1 is equivalent to computing the partial derivative with respect to \(\mathbf {\Lambda }_{ij}\) in Eq. (9a). To start, we compute the total differential of the first term (9a) as follows:
Leveraging Eq. (10), we can derive the partial derivative of term (9a) from Eq. (11b) as:
Similarly, we repeat the same procedure to compute the total differential of the second term (9b), which is given by:
and then calculate the partial derivative with respect to \(\mathbf {\Lambda }_{ij}\) using Eqs. (10) and (13b) as:
Therefore, adding Eqs. (12) and (14) shows that the derivative of the function \(\tilde{L}\) is exactly the gradient stated in Lemma 1.
C Algorithms
C.1 Masked Maximal Correlation Loss
As the masked maximal correlation loss is the negative of \(\tilde{L}\) in Eq. (4b), we have:
Based on Eq. (15), we detail the calculation of the masked maximal correlation loss in Algorithm 2.
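Since Algorithm 2 is not reproduced here, the following PyTorch sketch shows one plausible implementation of a masked, Soft-HGR-style correlation loss for a pair of modalities, assuming centered features and a PCI-mask applied as a diagonal weighting; the name `masked_corr_loss` and the normalization details are our assumptions and may differ from the paper's Eq. (15):

```python
import torch

def masked_corr_loss(f_i: torch.Tensor, f_j: torch.Tensor,
                     mask: torch.Tensor) -> torch.Tensor:
    """f_i, f_j: (n, m) feature batches; mask: (m,) PCI-mask entries in [0, 1]."""
    n = f_i.shape[0]
    # Center the features (the derivation assumes zero-mean representations).
    f_i = f_i - f_i.mean(dim=0)
    f_j = f_j - f_j.mean(dim=0)
    # Apply the PCI-mask as an element-wise (diagonal-matrix) weighting.
    g_i, g_j = f_i * mask, f_j * mask
    # Correlation term and covariance estimates of the masked features.
    corr = (g_i * g_j).sum() / (n - 1)
    cov_i = g_i.T @ g_i / (n - 1)
    cov_j = g_j.T @ g_j / (n - 1)
    # Soft-HGR-style trade-off; the loss is the negative of the objective.
    return -(corr - 0.5 * torch.trace(cov_i @ cov_j))
```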
C.2 Routine: Truncation Function
We use the truncation function to satisfy the range constraint in Theorem 1 by projecting the element values of the PCI-mask onto [0, 1]. The truncation routine is given in Algorithm 3.
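A minimal sketch of such a truncation routine, assuming it is a plain element-wise projection onto the box [0, 1]; the paper's Algorithm 3 may include additional steps (e.g., for the sum-threshold constraint mentioned in Sect. D.1):

```python
import torch

def truncate(pci_mask: torch.Tensor) -> torch.Tensor:
    # Element-wise projection of every PCI-mask entry onto [0, 1].
    return pci_mask.clamp(min=0.0, max=1.0)
```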
D Supplementary Experiments
D.1 Implementation Details and Hyperparameters
This section describes the implementation details and hyper-parameters used in our experiments. All experiments are implemented in PyTorch and trained on an NVIDIA 2080Ti GPU with fixed hyper-parameter settings. We adopt five-fold cross-validation when training models on the training dataset. The learning rate is set to 0.0001 and the batch size to 32. The PCI-masks are randomly initialized. When optimizing the PCI-mask, the step size \(\alpha \) is set to 2, and the tolerable error e is set to 0.01 of the sum threshold. We use the Adam optimizer to train the model for at most 200 epochs. All other hyper-parameters, obtained via grid search or Bayesian optimization [25], are kept fixed during learning.
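A sketch of the training loop implied by these settings (Adam, learning rate 1e-4, up to 200 epochs); `model`, `loader`, and `loss_fn` are placeholders for the segmentation network, a DataLoader with batch size 32, and the combined segmentation-plus-correlation loss, none of which are reproduced here:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader

def train(model: nn.Module, loader: DataLoader, loss_fn, epochs: int = 200) -> None:
    # Adam optimizer with the learning rate reported above.
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    for _ in range(epochs):
        for images, labels in loader:  # batch size 32 in the paper
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()
            # The PCI-mask is updated separately: a gradient step of size
            # alpha = 2 followed by truncation to [0, 1], iterated until the
            # error falls within the tolerance of the sum threshold (Sect. D.1).
```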
D.2 Experimental Results on BraTS-2015 Dataset
We provide supplementary results on an older dataset version, BraTS-2015, to validate the effectiveness of our proposed approach.
BraTS-2015 Dataset: The BraTS-2015 training dataset comprises 220 HGG scans and 54 LGG scans, with the same four modalities (FLAIR, T1, T1c, and T2) as BraTS-2020. The BraTS-2015 MRI images carry four labels: NCR (label 1), ED (label 2), NET (label 3, merged with label 1 in BraTS-2020), and ET (label 4). We apply the same data preprocessing procedure to BraTS-2015.
Evaluation Metrics: Besides DSC, Sensitivity, Specificity, and PPV, we add Intersection over Union (IoU), also known as the Jaccard similarity coefficient, as an additional evaluation metric. IoU measures the overlap between the ground-truth and predicted regions and is positively correlated with DSC. IoU ranges from 0 to 1, with 1 indicating perfect agreement between prediction and ground truth.
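For reference, a minimal NumPy implementation of IoU for binary masks; this is the standard formulation, not necessarily the exact evaluation script used for Table 4:

```python
import numpy as np

def iou(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8) -> float:
    """pred, target: boolean arrays of the same shape (binary segmentation masks)."""
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return float(intersection / (union + eps))  # eps guards against empty masks
```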
Segmentation Results: We present the segmentation results of our method on the BraTS-2015 dataset in Table 4, where our method achieves the best results. Specifically, we report the IoU of each label independently, along with DSC, Sensitivity, Specificity, and PPV for the complete tumor (NCR, ED, NET, and ET together). The baselines include the vanilla U-Net [34], LSTM U-Net [40], CI-Autoencoder [23], and U-Net Transformer [32]. Our method outperforms the second-best DSC score by 3.9%, demonstrating the superior performance of our design.