Abstract
Arterial spin labeling (ASL) perfusion MRI and blood-oxygen-level-dependent (BOLD) fMRI provide complementary information for assessing brain function. ASL is quantitative and insensitive to low-frequency drift, but it has a lower signal-to-noise ratio (SNR) and lower temporal resolution than BOLD. However, there is still no established way to fuse the benefits of both. When only one modality is available, it is also desirable to have a technique that can extract the other modality from the one being acquired. The purpose of this study was to develop such a technique that combines the advantages of BOLD fMRI and ASL MRI, i.e., quantifying cerebral blood flow (CBF) like ASL MRI but with the high SNR and temporal resolution of BOLD fMRI. We pursued this goal using a new deep learning-based algorithm to extract CBF directly from BOLD fMRI. Using a relatively large dataset containing dual-echo ASL and BOLD images, we built a wide residual learning-based convolutional neural network to predict CBF from BOLD fMRI. We dubbed this technique BOA-Net (BOLD-to-ASL network). Our testing results demonstrated that ASL CBF can be reliably predicted from BOLD fMRI with comparable image quality and higher SNR. We also evaluated BOA-Net with different deep learning networks.
1 Introduction
Human brain function can be assessed non-invasively using two MR techniques with whole-brain coverage and relatively high spatial resolution. One is blood-oxygen-level-dependent (BOLD) fMRI [10]; the other is arterial spin labeling (ASL) perfusion MRI [2]. BOLD fMRI is more widely used, offering high temporal and spatial resolution, but it only provides relative values. It is sensitive to low-frequency drift and suffers from susceptibility gradient-induced artifacts. By contrast, ASL MRI measures cerebral blood flow (CBF) in a physical unit of ml/100 g/min. The quantitative nature of ASL MRI makes it insensitive to low-frequency drift. Because it measures signal from the capillary bed, ASL MRI is potentially more accurate for localizing functional activation than BOLD fMRI, whose signal mainly reflects oxygenation changes in venous vessels rather than at the activation site. However, ASL MRI has a lower signal-to-noise ratio (SNR) and lower temporal resolution, and has only gained wider visibility in recent years. An important but still open question is how to fuse the benefits provided by these two complementary functional imaging modalities. Since many completed and ongoing large-scale fMRI projects acquired or acquire only BOLD fMRI, a related question is whether we can reliably extract the CBF signal from BOLD fMRI. Solving both questions requires understanding the elusive relationship between these two imaging modalities.
Theoretically, ASL MRI works by magnetically labeling arterial blood water as an endogenous tracer using radio-frequency (RF) pulses [2]. The perfusion-weighted MR image is acquired after the labeled spins reach the imaging plane. To remove the background signal, a control image is also acquired using the same ASL imaging sequence but with modulations that avoid labeling the arterial blood, so the background signal under the influence of the RF pulses is the same as in the spin-labeling condition. The perfusion-weighted signal is subsequently extracted from the difference between the label (L) image and the control (C) image and converted into a quantitative CBF measure using an appropriate compartment model [1]. Due to the limited longitudinal relaxation time (T1) of blood water and the post-labeling transit process, only a small portion of tissue water can be labeled, resulting in a low SNR [15]. Thus, ASL often acquires many pairs of L/C images to improve the SNR of the mean perfusion map. Practically, 10–50 L/C pairs fit in a typical 3–6 min scan, which provides only minor to moderate SNR improvement by averaging across the limited number of measurements. The interleaved labeling and non-labeling procedure halves the temporal resolution of ASL MRI compared to regular dynamic MR imaging, and the relatively long labeling and post-labeling delay time before data acquisition reduces it further. These drawbacks could be avoided entirely if CBF could be extracted from BOLD fMRI. From a technical point of view, ASL MRI can be acquired with many different imaging sequences; indeed, the gradient-echo-weighted BOLD imaging sequence is still widely used to acquire ASL MRI data. It is therefore theoretically reasonable to hypothesize that CBF can be extracted from BOLD fMRI. The challenge is then to find an appropriate model for the unknown BOLD-CBF relationship.
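As an illustration of this quantification step, the single-compartment model recommended in [1] converts the L/C difference into CBF. The sketch below uses typical literature values for blood T1, labeling efficiency, and the blood-brain partition coefficient; these defaults and the function interface are our assumptions, not the exact pipeline used in this study (which used ASLtbx, see Sect. 2.3).

```python
import numpy as np

def quantify_cbf(control, label, m0, pld=1.5, tau=1.4,
                 t1_blood=1.65, alpha=0.85, lam=0.9):
    """Single-compartment (p)CASL quantification in ml/100 g/min, per [1].

    Assumed typical parameter values: blood T1 (s), labeling efficiency
    alpha, and blood-brain partition coefficient lambda (ml/g); pld and
    tau match the delay and labeling times used in this study (Sect. 2.3).
    """
    dm = control - label  # perfusion-weighted difference signal
    return (6000.0 * lam * dm * np.exp(pld / t1_blood)) / (
        2.0 * alpha * t1_blood * m0 * (1.0 - np.exp(-tau / t1_blood)))
```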
A canonical BOLD-CBF model has been proposed in [4], but it requires data acquired under a gas challenge. The underlying assumption that the gas challenge does not change the cerebral metabolic rate of oxygen may also be inaccurate. Without extra experiments, there is no analytic way to extract quantitative CBF from BOLD fMRI. Alternatively, a learning-based approach might solve this problem. Over the years, machine learning, especially deep machine learning, has been increasingly used to achieve astonishing success in modeling highly complex data relationships [7].
Deep learning (DL) is motivated by hierarchical learning in the visual system [3]. The most widely used deep neural networks consist of multiple layers of receptive-field-constrained local filters that are trained layer by layer through error backpropagation [6]; they are often called convolutional neural networks (CNNs). The local feature extraction, hierarchical abstraction, and step-wise backpropagation of CNNs, together with training strategies such as weight dropout, batch normalization, skip connections, and residual learning, make CNNs highly flexible and capable of modeling nonlinear functions buried in large datasets. Because medical image processing is often hindered by unknown nonlinear processes or transforms, DL may provide a versatile tool for medical image processing, as increasingly demonstrated in a variety of applications including image segmentation [11] and image reconstruction [13]. Specific to ASL MRI, DL has been adopted to improve the SNR of ASL CBF maps [5, 16]. Most related to this study, Xie et al. [17] piloted pairwise label-to-control image prediction using a CNN. Since the ASL MRI used in their so-called super-ASL network was acquired with a gradient-echo-weighted BOLD fMRI sequence, this suggests the feasibility of directly extracting CBF from BOLD fMRI.
The purpose of this study was to build and validate a DL-based BOLD-ASL relationship learning model to predict the CBF signal directly from BOLD fMRI. We dubbed the network BOA-Net. Different from the super-ASL work, we used concurrent ASL MRI and BOLD fMRI acquired with a dual-echo ASL MRI sequence [12], so the network does not need to account for physiological or signal drift-induced differences between the BOLD fMRI and the ASL CBF. Another contribution is that we introduced a new CNN architecture based on dilated convolution [19] and wide activation residual blocks [20].
2 Methods
2.1 Problem Formulation
Denote the CBF image generated by the i-th L/C pair by \(y_i\) and the BOLD image (the 2nd echo) acquired after the i-th C image by \(x_i\). Given the same brain structure and the short interval between acquisitions, we want to build a parametric regression model \(f_{\varTheta }\) that learns the mapping \(f_{\varTheta }(x_i)\rightarrow y_i\), where \(i = 1,2,...,N\) and N is the total number of one subject's CBF maps. \(\varTheta \) are the parameters of the model and are adjusted through the training process. The model, typically a CNN, can be learned by minimizing the loss function \(\sum _i L(f_{\varTheta }(x_i), y_i)\), where L can be either the mean squared error or the mean absolute error between the prediction and the reference.
As we do not have gold-standard CBF maps to serve as training references, using the low-SNR ASL CBF images as references could, in principle, result in an inaccurate BOA-Net. Interestingly, a recent study [8] showed that training with noisy references does not necessarily yield an inaccurate model. Inspired by that work, we propose a noisy-reference-based BOA-Net. Instead of the L2 norm, we chose the L1 norm as the loss function to reduce sensitivity to outliers, which are common in ASL MRI [9].
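As a minimal sketch, the L1 objective is just the mean absolute error between the predicted CBF slice and the (noisy) reference slice; in Keras this is equivalent to the built-in 'mae' loss.

```python
import tensorflow as tf

def l1_loss(y_true, y_pred):
    # Mean absolute error (L1 norm): less sensitive to the outliers
    # common in single-pair ASL CBF references than the L2 norm.
    return tf.reduce_mean(tf.abs(y_true - y_pred))
```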
2.2 Network Architecture
Figure 1 shows the architecture of the dilated wide activation network (DWAN) used in BOA-Net and of its wide activation residual blocks. The two-path DilatedNet [5] was used to extract both local and global contextual features. Wide activation residual blocks were adopted to expand data features and pass more information through the network [20]. In DWAN, each pathway contains 4 wide activation residual blocks. Inside each block, the first convolution layer expands the number of input feature maps by a factor of 4; after a ReLU layer, the following convolution layer shrinks the number of feature maps back to the input size. The difference between the local and global pathways is that the first convolution layer of the 4 wide activation residual blocks in the global pathway uses dilation rates of 2, 4, 8, and 16, respectively. The convolution kernel size was \(3\times 3\). A \(3\times 3\) convolution link [20] from the input layer to the output layer implements the residual learning of DWAN.
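A minimal Keras sketch of DWAN under the description above follows. The 32-filter base width and the merging of the two pathways by concatenation are assumptions; the text does not specify either.

```python
import tensorflow as tf
from tensorflow.keras import layers

def wide_activation_block(x, filters, dilation_rate=1):
    # Expand the feature maps 4x, apply ReLU, shrink back, add the skip path.
    y = layers.Conv2D(filters * 4, 3, padding='same',
                      dilation_rate=dilation_rate)(x)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding='same')(y)
    return layers.Add()([x, y])

def dwan(input_shape=(64, 64, 1), filters=32):
    inp = layers.Input(input_shape)
    x = layers.Conv2D(filters, 3, padding='same')(inp)
    local = x
    for _ in range(4):                        # local pathway: no dilation
        local = wide_activation_block(local, filters)
    glob = x
    for rate in (2, 4, 8, 16):                # global pathway: growing dilation
        glob = wide_activation_block(glob, filters, dilation_rate=rate)
    merged = layers.Concatenate()([local, glob])  # assumed merge by concatenation
    out = layers.Conv2D(1, 3, padding='same')(merged)
    skip = layers.Conv2D(1, 3, padding='same')(inp)  # 3x3 input-to-output link
    return tf.keras.Model(inp, layers.Add()([out, skip]))  # residual learning
```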
2.3 Data Preparation and Model Training
ASL and BOLD fMRI data were acquired with the dual-echo ASL sequence [12] from 50 young healthy subjects at Hangzhou Normal University, each of whom provided signed written informed consent. The experiment and consent form were approved by the local IRB. Imaging parameters were: labeling time/delay time/TR/TE1/TE2 = 1.4 s/1.5 s/4.5 s/11.7 ms/68 ms, 90 acquisitions (90 BOLD images and 45 C/L image pairs), FOV = 22 cm, matrix = \(64\times 64\), 16 slices with a thickness of 5 mm plus a 1 mm gap. We used ASLtbx [14] to preprocess the ASL images following the procedures in [9].
BOA-Net was trained with data from 23 subjects (BOLD images as inputs and the corresponding CBF maps as references). Four other subjects were used for validation, and the remaining 23 subjects were used as test samples. For each subject, we extracted slices 7 to 11 from the 3D ASL CBF maps. The total number of 2D CBF maps extracted for training and validation was \(27 \times 5 \times 45 = 6075\). Each 2D CBF map was 64 \(\times \) 64 pixels. U-Net [18] and DilatedNet [5], two popular CNN structures widely used in medical imaging, were implemented for comparison with our DWAN-based BOA-Net.
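A sketch of the slice extraction, assuming each subject's data are stored as arrays of shape (45, 16, 64, 64) (pairs, slices, height, width); the storage layout and zero-based slice indexing are assumptions, not the paper's actual format.

```python
import numpy as np

def extract_slices(bold_series, cbf_series, lo=7, hi=12):
    """Pair BOLD inputs with noisy CBF references for slices 7-11.

    bold_series, cbf_series: arrays of shape (45, 16, 64, 64).
    Returns input/reference arrays of shape (45 * 5, 64, 64, 1).
    """
    x = bold_series[:, lo:hi].reshape(-1, 64, 64, 1)  # network inputs
    y = cbf_series[:, lo:hi].reshape(-1, 64, 64, 1)   # training references
    return x, y

# Across the 27 training/validation subjects: 27 x 5 x 45 = 6075 samples,
# matching the count reported above.
```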
We also compared the effects of training with smoothed versus non-smoothed CBF maps. CBF maps generated from the L/C pairs with Gaussian smoothing are called smoothed CBF maps; those generated without Gaussian smoothing are called non-smoothed CBF maps. The suffixes 'sm' and 'nsm' were appended to each model's name to indicate whether it was trained using the smoothed or non-smoothed CBF maps, respectively. We used peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) to quantitatively compare the performance of DWAN with U-Net and DilatedNet. When calculating PSNR and SSIM, all predicted results were compared with the genuine mean CBF maps from smoothed ASL data.
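A sketch of the two metrics using scikit-image; treating the reference map's intensity range as the data_range is our assumption.

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def psnr_ssim(pred_mean_cbf, ref_mean_cbf):
    # Compare a predicted mean CBF map with the genuine mean CBF map
    # from smoothed ASL data, as done for Table 1.
    drange = float(ref_mean_cbf.max() - ref_mean_cbf.min())
    psnr = peak_signal_noise_ratio(ref_mean_cbf, pred_mean_cbf, data_range=drange)
    ssim = structural_similarity(ref_mean_cbf, pred_mean_cbf, data_range=drange)
    return psnr, ssim
```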
All networks were implemented using the Keras and TensorFlow platforms. The networks were trained using the adaptive moment estimation (ADAM) algorithm with a base learning rate of 0.001 and a batch size of 64. All experiments were performed on a PC with an Intel(R) Core(TM) i7-5820k CPU @3.30 GHz and an Nvidia GeForce Titan Xp GPU.
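Putting the pieces together, a hedged sketch of the training call, continuing the dwan() and l1_loss sketches above; the epoch count is an assumption, as it is not reported in the text.

```python
# x_train, y_train, x_val, y_val: slice pairs from extract_slices() above.
model = dwan()
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),  # base LR 0.001
              loss=l1_loss)                                            # L1 objective
model.fit(x_train, y_train,
          batch_size=64,                    # 64 samples per batch, as stated
          epochs=100,                       # assumed; not reported in the text
          validation_data=(x_val, y_val))
```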
We used SNR to measure the image quality of the ASL CBF maps. SNR was calculated as the mean signal of a grey matter (GM) region-of-interest (ROI) divided by the standard deviation of a white matter (WM) ROI in slice 9. The similarity of the mean CBF from the outputs of BOA-Net to the genuine mean CBF maps from ASL data was evaluated by the correlation coefficient across the CBF values of all testing subjects (n = 23). This was performed at each voxel for BOA-Net_sm and BOA-Net_nsm separately. The correlation coefficient maps were thresholded at r > 0.3 for comparison and display.
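A numpy sketch of these two evaluations; the ROI masks and array layouts are illustrative.

```python
import numpy as np

def roi_snr(mean_cbf_slice9, gm_mask, wm_mask):
    # SNR = mean GM ROI signal / standard deviation of the WM ROI (slice 9).
    return mean_cbf_slice9[gm_mask].mean() / mean_cbf_slice9[wm_mask].std()

def voxelwise_correlation(pred, ref, threshold=0.3):
    """Pearson correlation across subjects at each voxel.

    pred, ref: arrays of shape (n_subjects, ...), here n_subjects = 23.
    Values at or below the threshold are zeroed for display, as above.
    """
    p = pred - pred.mean(axis=0)
    r = ref - ref.mean(axis=0)
    corr = (p * r).sum(axis=0) / np.sqrt(
        (p ** 2).sum(axis=0) * (r ** 2).sum(axis=0) + 1e-12)
    return np.where(corr > threshold, corr, 0.0)
```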
3 Results
Figure 2 shows the results of BOLD-based CBF prediction for one representative subject. Compared to the genuine mean CBF map from the acquired ASL MRI, the CBF map produced by BOA-Net showed substantially improved quality in terms of suppressed noise and better perfusion contrast between tissues. Moreover, BOA-Net recovered CBF signal near the air-brain boundaries. The signal loss in the genuine mean CBF map in the prefrontal region was caused by signal loss in the BOLD images.
Figure 3 shows box plots of the SNR and spatial correlation for BOA-Net_sm and BOA-Net_nsm. The average SNR of the genuine mean CBF maps from non-smoothed and smoothed ASL data was 6.96 and 12.64, respectively. The average SNR of the mean CBF maps from the outputs of BOA-Net_nsm and BOA-Net_sm was 12.26 and 15.11, respectively. BOA-Net_sm improved SNR by 19.54% compared with the mean CBF maps of smoothed ASL, while BOA-Net_nsm achieved a 76.15% SNR improvement compared with the mean CBF maps of non-smoothed ASL. The correlation coefficient at each voxel was calculated between the genuine mean CBF map and the network output. As shown in Fig. 3, the outputs of BOA-Net_sm and BOA-Net_nsm correlated strongly with the genuine mean CBF, indicating that both networks can correctly predict individual subjects' CBF patterns.
Table 1 shows the PSNR and SSIM of the mean CBF maps predicted by the different models. DWAN achieved the highest PSNR and SSIM in both the BOA-Net_sm and BOA-Net_nsm categories. Figure 4 shows a visual comparison of the mean CBF maps predicted from BOLD fMRI using the different CNN architectures. DWAN suppressed more noise than DilatedNet while recovering more detail than U-Net. Moreover, DWAN_nsm showed better perfusion contrast than DWAN_sm, while DWAN_sm recovered more signal near the air-brain boundaries.
4 Discussion and Conclusion
To our knowledge, this study represents the first effort to extract quantitative CBF from BOLD fMRI. Compared with the genuine mean CBF from ASL data, BOA-Net can provide CBF measurements with higher SNR and higher temporal resolution, both inherited from BOLD fMRI (the higher SNR is also partly contributed by DL-based denoising). For existing datasets without ASL MRI, this provides a unique opportunity to generate a new functional imaging modality. For future studies, it offers an opportunity to skip the ASL MRI scan, though this will need more evaluation, especially in diseased populations. Even if an ASL MRI scan is still needed, its scan time can be substantially shortened and the reduced SNR can be compensated by the CBF estimated by BOA-Net. Because this study was only tested on a dual-echo MRI sequence, future work will also aim at extending our approach to different datasets.
References
Alsop, D.C., et al.: Recommended implementation of arterial spin-labeled perfusion MRI for clinical applications. Magn. Reson. Med. 73(1), 102–116 (2015)
Detre, J.A., et al.: Perfusion imaging. Magn. Reson. Med. 23(1), 37–45 (1992)
Fukushima, K., et al.: Neocognitron: a neural network model for a mechanism of visual pattern recognition. IEEE Trans. Syst. Man Cybern. SMC-13(5), 826–834 (1983)
Hoge, R.D., et al.: Linear coupling between cerebral blood flow and oxygen consumption in activated human cortex. Proc. Natl. Acad. Sci. 96(16), 9403–9408 (1999)
Kim, K.H., et al.: Improving arterial spin labeling by using deep learning. Radiology 287(2), 658–666 (2018)
Krizhevsky, A., et al.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436 (2015)
Lehtinen, J., et al.: Noise2Noise: learning image restoration without clean data. arXiv preprint arXiv:1803.04189 (2018)
Li, Y., et al.: Priors-guided slice-wise adaptive outlier cleaning for arterial spin labeling perfusion MRI. J. Neurosci. Methods 307, 248–253 (2018)
Ogawa, S., et al.: Brain magnetic resonance imaging with contrast dependent on blood oxygenation. Proc. Natl. Acad. Sci. 87(24), 9868–9872 (1990)
Shen, D., Wu, G., Suk, H.I.: Deep learning in medical image analysis. Annu. Rev. Biomed. Eng. 19, 221–248 (2017)
Shin, D.D., et al.: Pseudocontinuous arterial spin labeling with optimized tagging efficiency. Magn. Reson. Med. 68(4), 1135–1144 (2012)
Wang, S., et al.: Accelerating magnetic resonance imaging via deep learning. In: 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI), pp. 514–517. IEEE (2016)
Wang, Z., et al.: Empirical optimization of ASL data analysis using an ASL data processing toolbox: ASLtbx. Magn. Reson. Imaging 26(2), 261–269 (2008)
Wong, E.: Potential and pitfalls of arterial spin labeling based perfusion imaging techniques for MRI. In: Moonen, C.T.W., Bandettini, P.A. (eds.) Functional MRI, pp. 63–69. Springer, Heidelberg (1999)
Xie, D., et al.: Denoising arterial spin labeling cerebral blood flow images using deep learning. arXiv preprint arXiv:1801.09672 (2018)
Xie, D., et al.: Super-ASL: Improving SNR and temporal resolution of ASL MRI using deep learning. In: 2018 ISMRM Workshop on Machine Learning (2018)
Xu, J., et al.: 200x low-dose PET reconstruction using deep learning. arXiv preprint arXiv:1712.04119 (2017)
Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122 (2015)
Yu, J., et al.: Wide activation for efficient and accurate image super-resolution. arXiv preprint arXiv:1808.08718 (2018)
Acknowledgements
This work was supported by NIH/NIA grant: R01AG060054-01A1.