Abstract
Learning with multiple modalities is crucial for automated brain tumor segmentation from magnetic resonance imaging data. Explicitly optimizing the common information shared among all modalities (e.g., by maximizing the total correlation) has been shown to achieve better feature representations and thus enhance the segmentation performance. However, existing approaches are oblivious to partial common information shared by subsets of the modalities. In this paper, we show that identifying such partial common information can significantly boost the discriminative power of image segmentation models. In particular, we introduce a novel concept of partial common information mask (PCI-mask) to provide a fine-grained characterization of what partial common information is shared by which subsets of the modalities. By solving a masked correlation maximization problem and simultaneously learning an optimal PCI-mask, we identify the latent microstructure of partial common information and leverage it in a self-attention module to selectively weight different feature representations in multi-modal data. We implement our proposed framework on the standard U-Net. Our experimental results on the Multi-modal Brain Tumor Segmentation Challenge (BraTS) datasets outperform those of state-of-the-art segmentation baselines, with validation Dice similarity coefficients of 0.920, 0.897, and 0.837 for the whole tumor, tumor core, and enhancing tumor, respectively, on BraTS-2020.
References
Bauer, S., Wiest, R., Nolte, L.P., Reyes, M.: A survey of MRI-based medical image analysis for brain tumor studies. Phys. Med. Biol. 58(13), R97 (2013)
Bian, W., Chen, Y., Ye, X., Zhang, Q.: An optimization-based meta-learning model for MRI reconstruction with diverse dataset. J. Imaging 7(11), 231 (2021)
Bian, W., Zhang, Q., Ye, X., Chen, Y.: A learnable variational model for joint multimodal MRI reconstruction and synthesis. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds.) International Conference on Medical Image Computing and Computer-Assisted Intervention, vol. 13436, pp. 354–364. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-16446-0_34
Chen, H., Qi, X., Yu, L., Heng, P.A.: DCAN: deep contour-aware networks for accurate gland segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2487–2496 (2016)
Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., Brox, T., Ronneberger, O.: 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: Ourselin, S., Joskowicz, L., Sabuncu, M.R., Unal, G., Wells, W. (eds.) MICCAI 2016. LNCS, vol. 9901, pp. 424–432. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46723-8_49
Cui, S., Mao, L., Jiang, J., Liu, C., Xiong, S.: Automatic semantic segmentation of brain gliomas from MRI images using a deep cascaded neural network. J. Healthcare Eng. 2018 (2018)
DeAngelis, L.M.: Brain tumors. N. Engl. J. Med. 344(2), 114–123 (2001)
Feizi, S., Makhdoumi, A., Duffy, K., Kellis, M., Medard, M.: Network maximal correlation. IEEE Trans. Netw. Sci. Eng. 4(4), 229–247 (2017)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)
Huang, S.L., Makur, A., Zheng, L., Wornell, G.W.: An information-theoretic approach to universal feature selection in high-dimensional inference. In: 2017 IEEE International Symposium on Information Theory (ISIT), pp. 1336–1340. IEEE (2017)
Huang, S.L., Xu, X., Zheng, L.: An information-theoretic approach to unsupervised feature selection for high-dimensional data. IEEE J. Sel. Areas Inf. Theory 1(1), 157–166 (2020)
Huang, S.L., Xu, X., Zheng, L., Wornell, G.W.: An information theoretic interpretation to deep neural networks. In: 2019 IEEE International Symposium on Information Theory (ISIT), pp. 1984–1988. IEEE (2019)
Isensee, F., Kickingereder, P., Wick, W., Bendszus, M., Maier-Hein, K.H.: Brain tumor segmentation and radiomics survival prediction: contribution to the BRATS 2017 challenge. In: Crimi, A., Bakas, S., Kuijf, H., Menze, B., Reyes, M. (eds.) BrainLes 2017. LNCS, vol. 10670, pp. 287–297. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-75238-9_25
Isensee, F., et al.: Abstract: nnU-Net: self-adapting framework for U-Net-based medical image segmentation. In: Bildverarbeitung für die Medizin 2019. I, pp. 22–22. Springer, Wiesbaden (2019). https://doi.org/10.1007/978-3-658-25326-4_7
Jia, H., Cai, W., Huang, H., Xia, Y.: H\(^2\)NF-net for brain tumor segmentation using multimodal MR imaging: 2nd place solution to BraTS challenge 2020 segmentation task. In: Crimi, A., Bakas, S. (eds.) BrainLes 2020. LNCS, vol. 12659, pp. 58–68. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-72087-2_6
Jiang, Z., Ding, C., Liu, M., Tao, D.: Two-stage cascaded U-Net: 1st place solution to BraTS challenge 2019 segmentation task. In: Crimi, A., Bakas, S. (eds.) BrainLes 2019. LNCS, vol. 11992, pp. 231–241. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-46640-4_22
Kaganami, H.G., Beiji, Z.: Region-based segmentation versus edge detection. In: 2009 Fifth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, pp. 1217–1221. IEEE (2009)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images (2009)
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
Louis, D.N., et al.: The 2016 world health organization classification of tumors of the central nervous system: a summary. Acta Neuropathol. 131(6), 803–820 (2016)
Ma, F., Zhang, W., Li, Y., Huang, S.L., Zhang, L.: An end-to-end learning approach for multimodal emotion recognition: extracting common and private information. In: 2019 IEEE International Conference on Multimedia and Expo (ICME), pp. 1144–1149. IEEE (2019)
McKinley, R., Meier, R., Wiest, R.: Ensembles of densely-connected CNNs with label-uncertainty for brain tumor segmentation. In: Crimi, A., Bakas, S., Kuijf, H., Keyvan, F., Reyes, M., van Walsum, T. (eds.) BrainLes 2018. LNCS, vol. 11384, pp. 456–465. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11726-9_40
Mei, Y., Lan, T., Imani, M., Subramaniam, S.: A Bayesian optimization framework for finding local optima in expensive multi-modal functions. arXiv preprint arXiv:2210.06635 (2022)
Menze, B.H., et al.: The multimodal brain tumor image segmentation benchmark (BraTS). IEEE Trans. Med. Imaging 34(10), 1993–2024 (2014)
Milletari, F., Navab, N., Ahmadi, S.A.: V-Net: fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision (3DV), pp. 565–571. IEEE (2016)
Muthukrishnan, R., Radha, M.: Edge detection techniques for image segmentation. Int. J. Comput. Sci. Inf. Technol. 3(6), 259 (2011)
Myronenko, A.: 3D MRI brain tumor segmentation using autoencoder regularization. In: Crimi, A., Bakas, S., Kuijf, H., Keyvan, F., Reyes, M., van Walsum, T. (eds.) BrainLes 2018. LNCS, vol. 11384, pp. 311–320. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11726-9_28
Oktay, O., et al.: Attention U-Net: learning where to look for the pancreas. arXiv preprint arXiv:1804.03999 (2018)
Pearson, K.: VII. Note on regression and inheritance in the case of two parents. Proc. Roy. Soc. London 58(347–352), 240–242 (1895)
Petit, O., Thome, N., Rambour, C., Themyr, L., Collins, T., Soler, L.: U-Net transformer: self and cross attention for medical image segmentation. In: Lian, C., Cao, X., Rekik, I., Xu, X., Yan, P. (eds.) MLMI 2021. LNCS, vol. 12966, pp. 267–276. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-87589-3_28
Rényi, A.: On measures of dependence. Acta Mathematica Academiae Scientiarum Hungarica 10(3–4), 441–451 (1959)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998–6008 (2017)
Wang, G., Li, W., Ourselin, S., Vercauteren, T.: Automatic brain tumor segmentation using cascaded anisotropic convolutional neural networks. In: Crimi, A., Bakas, S., Kuijf, H., Menze, B., Reyes, M. (eds.) BrainLes 2017. LNCS, vol. 10670, pp. 178–190. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-75238-9_16
Wang, L., et al.: An efficient approach to informative feature extraction from multimodal data. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 5281–5288 (2019)
Wang, Y., et al.: Modality-pairing learning for brain tumor segmentation. In: Crimi, A., Bakas, S. (eds.) BrainLes 2020. LNCS, vol. 12658, pp. 230–240. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-72084-1_21
Wu, X., Hu, Z., Pei, J., Huang, H.: Serverless federated AUPRC optimization for multi-party collaborative imbalanced data mining. In: SIGKDD Conference on Knowledge Discovery and Data Mining (KDD). ACM (2023)
Xu, F., Ma, H., Sun, J., Wu, R., Liu, X., Kong, Y.: LSTM multi-modal UNet for brain tumor segmentation. In: 2019 IEEE 4th International Conference on Image, Vision and Computing (ICIVC), pp. 236–240. IEEE (2019)
Xu, X., Huang, S.L.: Maximal correlation regression. IEEE Access 8, 26591–26601 (2020)
Zhang, D., Zhou, F., Jiang, Y., Fu, Z.: MM-BSN: self-supervised image denoising for real-world with multi-mask based on blind-spot network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4188–4197 (2023)
Zhang, J., Jiang, Z., Dong, J., Hou, Y., Liu, B.: Attention gate ResU-Net for automatic MRI brain tumor segmentation. IEEE Access 8, 58533–58545 (2020)
Zhang, W., Gu, W., Ma, F., Ni, S., Zhang, L., Huang, S.L.: Multimodal emotion recognition by extracting common and modality-specific information. In: Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems, pp. 396–397 (2018)
Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., Liang, J.: UNet++: a nested U-Net architecture for medical image segmentation. In: Stoyanov, D., et al. (eds.) DLMIA/ML-CDS -2018. LNCS, vol. 11045, pp. 3–11. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00889-5_1
Appendices
A Proof of Theorem 1
To begin with, we rewrite the covariances \(\mathbf {\Sigma }_{\boldsymbol{f}_i(X_i)}\) and \(\mathbf {\Sigma }_{\boldsymbol{f}_j(X_j)}\) in terms of expectations of the feature representations, which yields the following unbiased estimators of the covariance matrices:
Based on optimization problem (3), we apply the selective mask vector \(\boldsymbol{s}\) to the input feature representations via the element-wise product. By the property that the element-wise product of two vectors equals the matrix multiplication of one vector by the diagonal matrix formed from the other, we have:
where \(D_{\boldsymbol{s}}\) represents the diagonal matrix with the same diagonal elements as the vector \(\boldsymbol{s}\).
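For concreteness, here is a minimal NumPy check of this identity; the names s, f, and D_s below are illustrative and not taken from the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 5
s = rng.random(m)            # selective mask vector s
f = rng.standard_normal(m)   # one feature representation vector

D_s = np.diag(s)             # diagonal matrix built from the entries of s

# Element-wise product equals multiplication by the diagonal matrix: s ⊙ f = D_s f.
assert np.allclose(s * f, D_s @ f)
```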
The transpose of a diagonal matrix equals itself. Therefore, the function \(\bar{L}\) in (3) is now given by:
Considering that the input in Eq. (8a) is zero-mean, i.e., \(\mathbb {E}[\boldsymbol{f}_i(X_i)]={\textbf {0}}\) for \(i=1,2,\dots ,k\), the term (8b) becomes:
Thus, (8b) can be omitted since it equals 0. Using the properties of the matrix trace, the third term (8c) can be rewritten as:
where the product of the two diagonal matrices \(D_{\boldsymbol{s}}\) is again a diagonal matrix of dimension \(m\times m\). Therefore, we define \(\mathbf {\Lambda }\) as a diagonal matrix satisfying:
The constraints on the vector \(\boldsymbol{s}\) remain applicable to \(\mathbf {\Lambda }\). Using \(\mathbf {\Lambda }\) to replace the matrix products in terms (8a) and (8c), we obtain the equivalent objective given in (9a):
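As a sanity check on the two facts used in this step (that \(D_{\boldsymbol{s}} D_{\boldsymbol{s}}\) is itself diagonal, and that the trace is invariant under cyclic permutations), a small NumPy sketch with illustrative names:

```python
import numpy as np

rng = np.random.default_rng(1)
m = 4
s = rng.random(m)
D_s = np.diag(s)

# The product of two diagonal matrices is diagonal: Lambda = D_s D_s.
Lam = D_s @ D_s
assert np.allclose(Lam, np.diag(s * s))

# Cyclic property of the trace, of the kind used to rewrite term (8c).
A = rng.standard_normal((m, m))
B = rng.standard_normal((m, m))
assert np.isclose(np.trace(D_s @ A @ D_s @ B), np.trace(A @ D_s @ B @ D_s))
```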
B Proof of Lemma 1
Given a function \(f\) of a matrix \(X\), we can connect the matrix derivative with the total differential \(\mathop {}\!{d}f\) by:
\(\mathop {}\!{d}f = \mathrm {tr}\big ((\partial f/\partial X)^{T}\, \mathop {}\!{d}X\big )\).  (10)
Note that Eq. (10) still holds when the matrix X reduces to a vector \(\boldsymbol{x}\).
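A quick numeric illustration of Eq. (10), using our own simple example \(f(X)=\mathrm {tr}(A^T X)\), whose matrix derivative is \(A\); this is a generic check, not the paper's derivation:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
X = rng.standard_normal((3, 3))

def f(X):
    return np.trace(A.T @ X)  # example function; its matrix derivative is A

dX = 1e-6 * rng.standard_normal((3, 3))    # small perturbation of X
df = f(X + dX) - f(X)                      # total differential (exact here, f is linear)
assert np.isclose(df, np.trace(A.T @ dX))  # matches tr((∂f/∂X)^T dX) per Eq. (10)
```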
The gradient computation in Lemma 1 is equivalent to computing the partial derivative with respect to \(\mathbf {\Lambda }_{ij}\) in Eq. (9a). To start, we compute the total differential of the first term (9a) as follows:
Leveraging Eq. (10), we can derive the partial derivative of term (9a) from Eq. (11b) as:
Similarly, we repeat the same procedure to compute the total differential of the second term (9b), which is given by:
and then calculate the partial derivative with respect to \(\mathbf {\Lambda }_{ij}\) using Eqs. (10) and (13b) as:
Therefore, adding Eqs. (12) and (14) shows that the derivative of the function \(\tilde{L}\) is exactly the gradient stated in Lemma 1.
C Algorithms
C.1 Masked Maximal Correlation Loss
As the masked maximal correlation loss is the negative of \(\tilde{L}\) in Eq. (4b), we have:
Based on Eq. (15), we detail the calculation of the masked maximal correlation loss in Algorithm 2.
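Since Algorithm 2 is not reproduced here, the following PyTorch sketch shows one plausible implementation of a masked, Soft-HGR-style correlation loss for a pair of modalities, assuming centered features and a PCI-mask applied as a diagonal weighting; the name `masked_corr_loss` and the normalization details are our assumptions and may differ from the paper's Eq. (15):

```python
import torch

def masked_corr_loss(f_i: torch.Tensor, f_j: torch.Tensor,
                     mask: torch.Tensor) -> torch.Tensor:
    """f_i, f_j: (n, m) feature batches; mask: (m,) PCI-mask entries in [0, 1]."""
    n = f_i.shape[0]
    # Center the features (the derivation assumes zero-mean representations).
    f_i = f_i - f_i.mean(dim=0)
    f_j = f_j - f_j.mean(dim=0)
    # Apply the PCI-mask as an element-wise (diagonal-matrix) weighting.
    g_i, g_j = f_i * mask, f_j * mask
    # Correlation term and covariance estimates of the masked features.
    corr = (g_i * g_j).sum() / (n - 1)
    cov_i = g_i.T @ g_i / (n - 1)
    cov_j = g_j.T @ g_j / (n - 1)
    # Soft-HGR-style trade-off; the loss is the negative of the objective.
    return -(corr - 0.5 * torch.trace(cov_i @ cov_j))
```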
C.2 Routine: Truncation Function
We use the truncation function to satisfy the range constraint in Theorem 1 by projecting the element values of the PCI-mask onto [0, 1]. The truncation routine is given in Algorithm 3.
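A minimal sketch of such a truncation routine, assuming it is a plain element-wise projection onto the box [0, 1]; the paper's Algorithm 3 may include additional steps (e.g., for the sum-threshold constraint mentioned in Sect. D.1):

```python
import torch

def truncate(pci_mask: torch.Tensor) -> torch.Tensor:
    # Element-wise projection of every PCI-mask entry onto [0, 1].
    return pci_mask.clamp(min=0.0, max=1.0)
```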
D Supplementary Experiments
D.1 Implementation Details and Hyperparameters
This section describes the implementation details and hyper-parameters used in our experiments. All experiments are implemented in PyTorch and trained on an NVIDIA 2080Ti GPU with fixed hyper-parameter settings. We adopt five-fold cross-validation when training models on the training dataset. The learning rate is set to 0.0001 and the batch size to 32. The PCI-masks are randomly initialized. When optimizing the PCI-mask, the step size \(\alpha \) is set to 2, and the tolerable error e is set to 0.01 of the sum threshold. We use the Adam optimizer to train the model for at most 200 epochs. All other hyper-parameters, obtained via grid search or Bayesian optimization [25], are kept fixed during learning.
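A sketch of the training loop implied by these settings (Adam, learning rate 1e-4, up to 200 epochs); `model`, `loader`, and `loss_fn` are placeholders for the segmentation network, a DataLoader with batch size 32, and the combined segmentation-plus-correlation loss, none of which are reproduced here:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader

def train(model: nn.Module, loader: DataLoader, loss_fn, epochs: int = 200) -> None:
    # Adam optimizer with the learning rate reported above.
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    for _ in range(epochs):
        for images, labels in loader:  # batch size 32 in the paper
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()
            # The PCI-mask is updated separately: a gradient step of size
            # alpha = 2 followed by truncation to [0, 1], iterated until the
            # error falls within the tolerance of the sum threshold (Sect. D.1).
```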
D.2 Experimental Results on BraTS-2015 Dataset
We provide supplementary results on an older dataset version, BraTS-2015, to validate the effectiveness of our proposed approach.
BraTS-2015 Dataset: The BraTS-2015 training dataset comprises 220 HGG scans and 54 LGG scans, with the same four modalities (FLAIR, T1, T1c, and T2) as BraTS-2020. The BraTS-2015 MRI images carry four labels: NCR (label 1), ED (label 2), NET (label 3, merged with label 1 in BraTS-2020), and ET (label 4). We apply the same data preprocessing procedure to BraTS-2015.
Evaluation Metrics: Besides DSC, Sensitivity, Specificity, and PPV, we add Intersection over Union (IoU), also known as the Jaccard similarity coefficient, as an additional evaluation metric. IoU measures the overlap between the ground-truth and predicted regions and is positively correlated with DSC. IoU ranges from 0 to 1, with 1 indicating perfect agreement between prediction and ground truth.
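For reference, a minimal NumPy implementation of IoU for binary masks; this is the standard formulation, not necessarily the exact evaluation script used for Table 4:

```python
import numpy as np

def iou(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8) -> float:
    """pred, target: boolean arrays of the same shape (binary segmentation masks)."""
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return float(intersection / (union + eps))  # eps guards against empty masks
```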
Segmentation Results: We present the segmentation results of our method on the BraTS-2015 dataset in Table 4, where our method achieves the best results. Specifically, we report the IoU of each label independently, along with DSC, Sensitivity, Specificity, and PPV for the complete tumor (NCR, ED, NET, and ET together). The baselines include the vanilla U-Net [34], LSTM U-Net [40], CI-Autoencoder [23], and U-Net Transformer [32]. Our method outperforms the second-best DSC score by 3.9%, demonstrating the superior performance of our design.