Multi-Evidence and Multi-Modal Fusion Network for Ground-Based Cloud Recognition
"> Figure 1
<p>Some examples of ground-based cloud images. Each row indicates the cloud images from the same class.</p> "> Figure 2
<p>The architecture of the proposed multi-evidence and multi-modal fusion network (MMFN).</p> "> Figure 3
<p>The overall framework of ResNet-50. Herein, the residual building blocks are expressed in green boxes and their number is displayed above the boxes.</p> "> Figure 4
<p>(<b>a</b>) represents the original ground-based cloud images and (<b>b</b>) represents the visualization features of convolutional neural networks (CNNs).</p> "> Figure 5
<p>The process of obtaining an attentive map. (<b>a</b>) is the convolutional activation map with the highlighted top <math display="inline"><semantics> <mrow> <mi>n</mi> <mo>×</mo> <mi>n</mi> </mrow> </semantics></math> responses, and (<b>b</b>) is the corresponding attentive map, where <span class="html-italic">n</span> is set to 5. Note that, the deeper the color, the larger the response.</p> "> Figure 6
<p>The sketches of variant1 ∼ variant6.</p> "> Figure 7
<p>Some samples from each category in multi-modal ground-based cloud dataset (MGCD), where the multi-modal information is embedded in the corresponding ground-based cloud image.</p> "> Figure 8
<p>The recognition accuracy of MMFN with different <span class="html-italic">n</span>.</p> "> Figure 9
<p>The recognition accuracies (%) of MMFN with the training and test samples under different ratios.</p> ">
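Figure 5 above summarizes how an attentive map is obtained: the top n × n responses of a convolutional activation map are highlighted and the rest are suppressed. The sketch below illustrates that selection step under our assumptions (suppressed responses are simply zeroed out; the function and variable names are ours, not the paper's):

```python
import numpy as np

def attentive_map(activation: np.ndarray, n: int = 5) -> np.ndarray:
    """Keep the top n*n responses of an activation map and zero the rest.

    A minimal sketch of the selection step in Figure 5; the paper's
    exact masking rule may differ.
    """
    flat = activation.ravel()
    k = n * n
    top_idx = np.argpartition(flat, -k)[-k:]  # indices of the k largest responses
    mask = np.zeros_like(flat)
    mask[top_idx] = 1.0
    return (flat * mask).reshape(activation.shape)

# Example: a hypothetical 14 x 14 activation map from one CNN channel.
act = np.random.rand(14, 14).astype(np.float32)
att = attentive_map(act, n=5)
print(np.count_nonzero(att))  # 25 responses survive
```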
Abstract
1. Introduction
2. Methods
2.1. Main Network
2.2. Attentive Network
2.3. Multi-Modal Network
2.4. Heterogeneous Feature Fusion
2.5. Comparison Methods
2.5.1. Variants of MMFN
2.5.2. Hand-Crafted and Learning-Based Methods
2.6. Implementation Details
3. Data
4. Results
4.1. Comparison with Variants of MMFN
4.2. Comparison with Other Methods
4.3. Parameter Analysis
5. Discussion
5.1. Overall Discussion
5.2. Potential Applications and Future Work
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
Abbreviations
Abbreviation | Meaning
---|---
MMFN | Multi-evidence and multi-modal fusion network
MGCD | Multi-modal ground-based cloud dataset
TSI | Total-sky imager
CNN | Convolutional neural network
Leaky ReLU | Leaky rectified linear unit
SGD | Stochastic gradient descent
BoVW | Bag-of-visual-words
SIFT | Scale-invariant feature transform
LBP | Local binary pattern
CLBP | Completed LBP
DMF | Deep multimodal fusion
JFCNN | Joint fusion convolutional neural network
References
- Ceppi, P.; Hartmann, D.L. Clouds and the atmospheric circulation response to warming. J. Clim. 2016, 29, 783–799. [Google Scholar] [CrossRef]
- Zhou, C.; Zelinka, M.D.; Klein, S.A. Impact of decadal cloud variations on the Earth’s energy budget. Nat. Geosci. 2016, 9, 871. [Google Scholar] [CrossRef]
- McNeill, V.F. Atmospheric aerosols: Clouds, chemistry, and climate. Annu. Rev. Chem. Biomol. Eng. 2017, 8, 427–444. [Google Scholar] [CrossRef] [PubMed]
- Huang, W.; Wang, Y.; Chen, X. Cloud detection for high-resolution remote-sensing images of urban areas using colour and edge features based on dual-colour models. Int. J. Remote Sens. 2018, 39, 6657–6675. [Google Scholar] [CrossRef]
- Liu, Y.; Tang, Y.; Hua, S.; Luo, R.; Zhu, Q. Features of the cloud base height and determining the threshold of relative humidity over southeast China. Remote Sens. 2019, 11, 2900. [Google Scholar] [CrossRef] [Green Version]
- Calbo, J.; Sabburg, J. Feature extraction from whole-sky ground-based images for cloud-type recognition. J. Atmos. Ocean. Technol. 2008, 25, 3–14. [Google Scholar] [CrossRef] [Green Version]
- Wang, Y.; Wang, C.; Shi, C.; Xiao, B. A selection criterion for the optimal resolution of ground-based remote sensing cloud images for cloud classification. IEEE Trans. Geosci. Remote Sens. 2019, 57, 1358–1367. [Google Scholar] [CrossRef]
- Shi, W.; Caballero, J.; Huszár, F.; Totz, J.; Aitken, A.P.; Bishop, R.; Rueckert, D.; Wang, Z. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1874–1883. [Google Scholar]
- Ryu, A.; Ito, M.; Ishii, H.; Hayashi, Y. Preliminary analysis of short-term solar irradiance forecasting by using total-sky imager and convolutional neural network. In Proceedings of the IEEE PES GTD Grand International Conference and Exposition Asia, Bangkok, Thailand, 21–23 March 2019; pp. 627–631. [Google Scholar]
- Nouri, B.; Wilbert, S.; Segura, L.; Kuhn, P.; Hanrieder, N.; Kazantzidis, A.; Schmidt, T.; Zarzalejo, L.; Blanc, P.; Pitz-Paal, R. Determination of cloud transmittance for all sky imager based solar nowcasting. Sol. Energy 2019, 181, 251–263. [Google Scholar] [CrossRef] [Green Version]
- Nouri, B.; Kuhn, P.; Wilbert, S.; Hanrieder, N.; Prahl, C.; Zarzalejo, L.; Kazantzidis, A.; Blanc, P.; Pitz-Paal, R. Cloud height and tracking accuracy of three all sky imager systems for individual clouds. Sol. Energy 2019, 177, 213–228. [Google Scholar] [CrossRef] [Green Version]
- Liu, S.; Wang, C.; Xiao, B.; Zhang, Z.; Shao, Y. Salient local binary pattern for ground-based cloud classification. Acta Meteorol. Sin. 2013, 27, 211–220. [Google Scholar] [CrossRef]
- Cheng, H.Y.; Yu, C.C. Multi-model solar irradiance prediction based on automatic cloud classification. Energy 2015, 91, 579–587. [Google Scholar] [CrossRef]
- Kliangsuwan, T.; Heednacram, A. Feature extraction techniques for ground-based cloud type classification. Expert Syst. Appl. 2015, 42, 8294–8303. [Google Scholar] [CrossRef]
- Cheng, H.Y.; Yu, C.C. Block-based cloud classification with statistical features and distribution of local texture features. Atmos. Meas. Tech. 2015, 8, 1173–1182. [Google Scholar] [CrossRef] [Green Version]
- Gan, J.; Lu, W.; Li, Q.; Zhang, Z.; Yang, J.; Ma, Y.; Yao, W. Cloud type classification of total-sky images using duplex norm-bounded sparse coding. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 3360–3372. [Google Scholar] [CrossRef]
- Kliangsuwan, T.; Heednacram, A. FFT features and hierarchical classification algorithms for cloud images. Eng. Appl. Artif. Intell. 2018, 76, 40–54. [Google Scholar] [CrossRef]
- Oikonomou, S.; Kazantzidis, A.; Economou, G.; Fotopoulos, S. A local binary pattern classification approach for cloud types derived from all-sky imagers. Int. J. Remote Sens. 2019, 40, 2667–2682. [Google Scholar] [CrossRef]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 24–27 June 2014; pp. 580–587. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Liu, C.; Chen, L.C.; Schroff, F.; Adam, H.; Hua, W.; Yuille, A.L.; Fei-Fei, L. Auto-deeplab: Hierarchical neural architecture search for semantic image segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 82–92. [Google Scholar]
- Choi, J.; Kwon, J.; Lee, K.W. Deep meta learning for real-time target-aware visual tracking. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 911–920. [Google Scholar]
- DeLancey, E.R.; Simms, J.F.; Mahdianpari, M.; Brisco, B.; Mahoney, C.; Kariyeva, J. Comparing deep learning and shallow learning for large-scale wetland classification in Alberta, Canada. Remote Sens. 2020, 12, 2. [Google Scholar] [CrossRef] [Green Version]
- Wang, Y.; Chen, C.; Ding, M.; Li, J. Real-time dense semantic labeling with dual-Path framework for high-resolution remote sensing image. Remote Sens. 2019, 11, 3020. [Google Scholar] [CrossRef] [Green Version]
- Shi, C.; Wang, C.; Wang, Y.; Xiao, B. Deep convolutional activations-based features for ground-based cloud classification. IEEE Geosci. Remote Sens. Lett. 2017, 14, 816–820. [Google Scholar] [CrossRef]
- Zhang, J.; Liu, P.; Zhang, F.; Song, Q. CloudNet: Ground-based cloud classification with deep convolutional neural network. Geophys. Res. Lett. 2018, 45, 8665–8672. [Google Scholar] [CrossRef]
- Li, M.; Liu, S.; Zhang, Z. Dual guided loss for ground-based cloud classification in weather station networks. IEEE Access 2019, 7, 63081–63088. [Google Scholar] [CrossRef]
- Ye, L.; Cao, Z.; Xiao, Y. DeepCloud: Ground-based cloud image categorization using deep convolutional features. IEEE Trans. Geosci. Remote Sens. 2017, 55, 5729–5740. [Google Scholar] [CrossRef]
- Baker, M.B.; Peter, T. Small-scale cloud processes and climate. Nature 2008, 451, 299. [Google Scholar] [CrossRef] [PubMed]
- Farmer, D.K.; Cappa, C.D.; Kreidenweis, S.M. Atmospheric processes and their controlling influence on cloud condensation nuclei activity. Chem. Rev. 2015, 115, 4199–4217. [Google Scholar] [CrossRef] [Green Version]
- Liu, S.; Li, M. Deep multimodal fusion for ground-based cloud classification in weather station networks. EURASIP J. Wirel. Commun. Netw. 2018, 2018, 48. [Google Scholar] [CrossRef] [Green Version]
- Liu, S.; Li, M.; Zhang, Z.; Xiao, B.; Cao, X. Multimodal ground-based cloud classification using joint fusion convolutional neural network. Remote Sens. 2018, 10, 822. [Google Scholar] [CrossRef] [Green Version]
- Li, Q.; Zhang, Z.; Lu, W.; Yang, J.; Ma, Y.; Yao, W. From pixels to patches: A cloud classification method based on a bag of micro-structures. Atmos. Meas. Tech. 2016, 9, 753–764. [Google Scholar] [CrossRef] [Green Version]
- Dev, S.; Lee, Y.H.; Winkler, S. Categorization of cloud image patches using an improved texton-based approach. In Proceedings of the IEEE International Conference on Image Processing, Quebec, QC, Canada, 27–30 September 2015; pp. 422–426. [Google Scholar]
- Walther, D.; Rutishauser, U.; Koch, C.; Perona, P. On the usefulness of attention for object recognition. In Proceedings of the European Conference on Computer Vision Workshop on Attention and Performance in Computational Vision, Prague, Czech Republic, 15 May 2004; pp. 96–103. [Google Scholar]
- Chang, X.; Qian, Y.; Yu, D. Monaural multi-talker speech recognition with attention mechanism and gated convolutional networks. In Proceedings of the Interspeech, Hyderabad, India, 2–6 September 2018; pp. 1586–1590. [Google Scholar]
- Chorowski, J.K.; Bahdanau, D.; Serdyuk, D.; Cho, K.; Bengio, Y. Attention-based models for speech recognition. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; pp. 577–585. [Google Scholar]
- Zhu, Y.; Zhao, C.; Guo, H.; Wang, J.; Zhao, X.; Lu, H. Attention couplenet: Fully convolutional attention coupling network for object detection. IEEE Trans. Image Process. 2019, 28, 113–126. [Google Scholar] [CrossRef]
- Chen, L.; Zhang, H.; Xiao, J.; Nie, L.; Shao, J.; Liu, W.; Chua, T.S. SCA-CNN: Spatial and channel-wise attention in convolutional networks for image captioning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 5659–5667. [Google Scholar]
- Fu, J.; Zheng, H.; Mei, T. Look closer to see better: Recurrent attention convolutional neural network for fine-grained image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4438–4446. [Google Scholar]
- Wang, F.; Jiang, M.; Qian, C.; Yang, S.; Li, C.; Zhang, H.; Wang, X.; Tang, X. Residual attention network for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3156–3164. [Google Scholar]
- Peng, Y.; He, X.; Zhao, J. Object-part attention model for fine-grained image classification. IEEE Trans. Image Process. 2018, 27, 1487–1500. [Google Scholar] [CrossRef] [Green Version]
- Fu, J.; Liu, J.; Tian, H.; Li, Y.; Bao, Y.; Fang, Z.; Lu, H. Dual attention network for scene segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 3146–3154. [Google Scholar]
- Liu, Y.; Liu, Y.; Ding, L. Scene classification based on two-stage deep feature fusion. IEEE Geosci. Remote Sens. Lett. 2018, 15, 183–186. [Google Scholar] [CrossRef]
- Chen, J.; Chen, Z.; Chi, Z.; Fu, H. Facial expression recognition in video with multiple feature fusion. IEEE Trans. Affect. Comput. 2018, 9, 38–50. [Google Scholar] [CrossRef]
- Uddin, M.A.; Lee, Y. Feature fusion of deep spatial features and handcrafted spatiotemporal features for human action recognition. Sensors 2019, 19, 1599. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Chaib, S.; Liu, H.; Gu, Y.; Yao, H. Deep feature fusion for VHR remote sensing scene classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 4775–4784. [Google Scholar] [CrossRef]
- Song, W.; Li, S.; Fang, L.; Lu, T. Hyperspectral image classification with deep feature fusion network. IEEE Trans. Geosci. Remote Sens. 2018, 56, 3173–3184. [Google Scholar] [CrossRef]
- Tang, P.; Wang, H.; Kwong, S. G-MS2F: GoogLeNet based multi-stage feature fusion of deep CNN for scene recognition. Neurocomputing 2017, 225, 188–197. [Google Scholar] [CrossRef]
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9. [Google Scholar]
- Bodla, N.; Zheng, J.; Xu, H.; Chen, J.C.; Castillo, C.; Chellappa, R. Deep heterogeneous feature fusion for template-based face recognition. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision, Santa Rosa, CA, USA, 27–29 March 2017; pp. 586–595. [Google Scholar]
- Chen, Y.; Li, C.; Ghamisi, P.; Jia, X.; Gu, Y. Deep fusion of remote sensing data for accurate classification. IEEE Geosci. Remote Sens. Lett. 2017, 14, 1253–1257. [Google Scholar] [CrossRef]
- Guo, J.; Song, B.; Zhang, P.; Ma, M.; Luo, W. Affective video content analysis based on multimodal data fusion in heterogeneous networks. Inform. Fusion 2019, 51, 224–232. [Google Scholar] [CrossRef]
- Ojala, T.; Pietikainen, M.; Maenpaa, T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 971–987. [Google Scholar] [CrossRef]
- Guo, Z.; Zhang, L.; Zhang, D. A completed modeling of local binary pattern operator for texture classification. IEEE Trans. Image Process. 2010, 19, 1657–1663. [Google Scholar]
- Csurka, G.; Dance, C.; Fan, L.; Willamowski, J.; Bray, C. Visual categorization with bags of keypoints. In Proceedings of the European Conference on Computer Vision Workshop on Statistical Learning in Computer Vision, Prague, Czech Republic, 11–14 May 2004; pp. 1–16. [Google Scholar]
- Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110. [Google Scholar] [CrossRef]
- Lazebnik, S.; Schmid, C.; Ponce, J. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, New York, NY, USA, 17–22 June 2006; pp. 2169–2178. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1026–1034. [Google Scholar]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–6 December 2012; pp. 1097–1105. [Google Scholar]
- Chang, C.C.; Lin, C.J. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. 2011, 2, 27:1–27:27. [Google Scholar] [CrossRef]
- Li, M.; Liu, S.; Zhang, Z. Deep tensor fusion network for multimodal ground-based cloud classification in weather station networks. Ad Hoc Netw. 2020, 96, 101991. [Google Scholar] [CrossRef]
- Liu, S.; Duan, L.; Zhang, Z.; Cao, X. Hierarchical multimodal fusion for ground-based cloud classification in weather station networks. IEEE Access 2019, 7, 85688–85695. [Google Scholar] [CrossRef]
- Huo, J.; Bi, Y.; Lü, D.; Duan, S. Cloud classification and distribution of cloud types in Beijing using Ka-band radar data. Adv. Atmos. Sci. 2019, 36, 793–803. [Google Scholar] [CrossRef]
- Xiao, Y.; Cao, Z.; Zhuo, W.; Ye, L.; Zhu, L. mCLOUD: A multiview visual feature extraction mechanism for ground-based cloud image categorization. J. Atmos. Ocean. Technol. 2016, 33, 789–801. [Google Scholar] [CrossRef]
Table 1. Recognition accuracy (%) of MMFN and its variants on the MGCD, where "+ MI" denotes combining the variant with the multi-modal information.

Methods | Accuracy (%)
---|---
variant1 | 83.15
variant1 + MI | 84.48
variant2 | 82.23
variant2 + MI | 83.70
variant3 | 86.25
variant3 + MI | 87.10
variant4 | 85.90
variant5 | 83.70
variant6 | 87.38
variant7 | 87.60
MMFN | 88.63
Table 2. Recognition accuracy (%) of different methods on the MGCD, where "+ MI" denotes combining the method with the multi-modal information.

Methods | Accuracy (%) | Methods | Accuracy (%)
---|---|---|---
BoVW | 66.15 | BoVW + MI | 67.20
PBoVW | 66.13 | PBoVW + MI | 67.15
LBP (8, 1) | 45.38 | LBP (8, 1) + MI | 45.25
LBP (16, 2) | 49.00 | LBP (16, 2) + MI | 47.25
LBP (24, 3) | 50.20 | LBP (24, 3) + MI | 50.53
CLBP (8, 1) | 65.10 | CLBP (8, 1) + MI | 65.40
CLBP (16, 2) | 68.20 | CLBP (16, 2) + MI | 68.48
CLBP (24, 3) | 69.18 | CLBP (24, 3) + MI | 69.68
VGG-16 | 77.95 | DMF [31] | 79.05
DCAFs [25] | 82.67 | DCAFs + MI | 82.97
CloudNet [26] | 79.92 | CloudNet + MI | 80.37
JFCNN [32] | 84.13 | |
DTFN [62] | 86.48 | |
HMF [63] | 87.90 | |
MMFN | 88.63 | |
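In Table 2, the (P, R) pairs attached to LBP and CLBP denote the number of sampling points and the radius of the circular neighborhood in the multiresolution convention of Ojala et al. [54]. As a reading aid, the sketch below computes the plain 256-bin LBP histogram for the (8, 1) configuration; this is an illustrative implementation under our assumptions (the compared methods presumably use the rotation-invariant uniform variant, and the function name is ours):

```python
import numpy as np

def lbp_histogram(img: np.ndarray) -> np.ndarray:
    """Plain LBP with P = 8 neighbors at radius R = 1, as a 256-bin histogram.

    Illustrative only: the basic operator, not the rotation-invariant
    uniform variant typically used for classification.
    """
    h, w = img.shape
    center = img[1:h - 1, 1:w - 1]
    # Eight neighbors at radius 1, visited in a fixed circular order.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        code |= (neighbor >= center).astype(np.uint8) << np.uint8(bit)
    hist, _ = np.histogram(code, bins=256, range=(0, 256))
    return hist / hist.sum()

# Example on a random grayscale patch standing in for a cloud image.
patch = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
print(lbp_histogram(patch).shape)  # (256,)
```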
Table 3. Recognition accuracy (%) of MMFN under different parameter pairs.

Parameter pair | Accuracy (%)
---|---
(0.2, 0.8) | 75.02
(0.3, 0.7) | 88.63
(0.4, 0.6) | 88.33
(0.5, 0.5) | 88.53
(0.6, 0.4) | 88.30
(0.7, 0.3) | 88.10
(0.8, 0.2) | 87.85
Table 4. Recognition accuracy (%) of MMFN under different parameter pairs.

Parameter pair | Accuracy (%)
---|---
(0.2, 0.8) | 87.90
(0.3, 0.7) | 88.63
(0.4, 0.6) | 87.75
(0.5, 0.5) | 87.85
(0.6, 0.4) | 87.90
(0.7, 0.3) | 87.85
(0.8, 0.2) | 87.80
Table 5. Recognition accuracy (%) of MMFN under different parameter pairs.

Parameter pair | Accuracy (%)
---|---
(0.6, 0.4) | 87.15
(0.7, 0.3) | 87.30
(0.8, 0.2) | 87.85
(1, 1) | 88.63
(1, 1.5) | 87.93
(1, 2) | 87.80
(1.5, 1) | 87.38
(2, 1) | 87.00
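Across the three sweeps, accuracy peaks at (0.3, 0.7) in Tables 3 and 4 and at (1, 1) in Table 5, degrading only mildly away from the optimum. The sketch below shows how such a sweep could be run when a pair is treated as convex weights fusing two streams' classification scores; treating the pairs this way is our assumption, as are the dummy scores, the seven-class setup, and all names:

```python
import numpy as np

rng = np.random.default_rng(0)
num_samples, num_classes = 200, 7  # e.g., seven sky categories

# Stand-in score matrices for two streams (say, visual and multi-modal);
# in practice these would come from the trained sub-networks.
scores_a = rng.random((num_samples, num_classes))
scores_b = rng.random((num_samples, num_classes))
labels = rng.integers(0, num_classes, num_samples)

def accuracy(scores: np.ndarray, y: np.ndarray) -> float:
    return float((scores.argmax(axis=1) == y).mean())

# Sweep convex weight pairs as in Tables 3 and 4 above.
for w1 in (0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8):
    w2 = round(1.0 - w1, 1)
    acc = accuracy(w1 * scores_a + w2 * scores_b, labels)
    print(f"({w1:.1f}, {w2:.1f}) -> {100 * acc:.2f}%")
```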