DenseNet-Based Depth-Width Double Reinforced Deep Learning Neural Network for High-Resolution Remote Sensing Image Per-Pixel Classification
"> Figure 1
<p>Illustration of convolution.</p> "> Figure 2
<p>Flowchart of a traditional convolutional neural network (CNN).</p> "> Figure 3
<p>Flowchart of the proposed “network in network” method.</p> "> Figure 4
<p>Operation of densely connected convolution.</p> "> Figure 5
<p>Dilated convolution.</p> "> Figure 6
<p>Structure of a subnetwork in the proposed network method.</p> "> Figure 7
<p>The ten experimental images: (<b>a</b>,<b>c</b>,<b>f</b>,<b>j</b>–<b>l</b>) are from the BJ02 satellite; (<b>b</b>,<b>d</b>,<b>e</b>,<b>g</b>,<b>m</b>–<b>o</b>) are from the GF02 satellite; (<b>h</b>,<b>i</b>) are from the geoeye and quickbird satellites, respectively.</p> "> Figure 7 Cont.
<p>The ten experimental images: (<b>a</b>,<b>c</b>,<b>f</b>,<b>j</b>–<b>l</b>) are from the BJ02 satellite; (<b>b</b>,<b>d</b>,<b>e</b>,<b>g</b>,<b>m</b>–<b>o</b>) are from the GF02 satellite; (<b>h</b>,<b>i</b>) are from the geoeye and quickbird satellites, respectively.</p> "> Figure 8
<p>(<b>a</b>,<b>b</b>) display the overall accuracy acquired from the BJ02 and GF02 images, respectively, based on different network structures.</p> "> Figure 9
<p>BJ02 classification results from (<b>a</b>) manually labeled reference data, (<b>b</b>) proposed method, (<b>c</b>) internal classifier-removed, (<b>d</b>) method using 1 <math display="inline"><semantics> <mo>×</mo> </semantics></math> 1 kernel, (<b>e</b>) method using 3 <math display="inline"><semantics> <mo>×</mo> </semantics></math> 3 kernel, (<b>f</b>) method using 5 <math display="inline"><semantics> <mo>×</mo> </semantics></math> 5 kernel, (<b>g</b>) ResNet-based method using contextual deep CNN, (<b>h</b>) method without densely connected convolution, (<b>i</b>) improved URDNN, (<b>j</b>) SCAE + SVM, (<b>k</b>) two-stream network, (<b>l</b>) deconvolution, (<b>m</b>) parallelepiped, (<b>n</b>) minimum distance, (<b>o</b>) Mahalanobis distance and (<b>p</b>) maximum likelihood.</p> "> Figure 10
<p>GF02 classification results from (<b>a</b>) manually labeled reference data, (<b>b</b>) proposed method, (<b>c</b>) internal classifier-removed, (<b>d</b>) method using 1 <math display="inline"><semantics> <mo>×</mo> </semantics></math> 1 kernel, (<b>e</b>) method using 3 <math display="inline"><semantics> <mo>×</mo> </semantics></math> 3 kernel, (<b>f</b>) method using 5 <math display="inline"><semantics> <mo>×</mo> </semantics></math> 5 kernel, (<b>g</b>) ResNet-based method using contextual deep CNN, (<b>h</b>) method without densely connected convolution, (<b>i</b>) improved URDNN, (<b>j</b>) SCAE + SVM, (<b>k</b>) two-stream network, (<b>l</b>) deconvolution, (<b>m</b>) parallelepiped, (<b>n</b>) minimum distance, (<b>o</b>) Mahalanobis distance and (<b>p</b>) maximum likelihood.</p> "> Figure 10 Cont.
<p>GF02 classification results from (<b>a</b>) manually labeled reference data, (<b>b</b>) proposed method, (<b>c</b>) internal classifier-removed, (<b>d</b>) method using 1 <math display="inline"><semantics> <mo>×</mo> </semantics></math> 1 kernel, (<b>e</b>) method using 3 <math display="inline"><semantics> <mo>×</mo> </semantics></math> 3 kernel, (<b>f</b>) method using 5 <math display="inline"><semantics> <mo>×</mo> </semantics></math> 5 kernel, (<b>g</b>) ResNet-based method using contextual deep CNN, (<b>h</b>) method without densely connected convolution, (<b>i</b>) improved URDNN, (<b>j</b>) SCAE + SVM, (<b>k</b>) two-stream network, (<b>l</b>) deconvolution, (<b>m</b>) parallelepiped, (<b>n</b>) minimum distance, (<b>o</b>) Mahalanobis distance and (<b>p</b>) maximum likelihood.</p> "> Figure 11
<p>(<b>a</b>,<b>b</b>) are changes of OA values for the BJ02 experiments and GF02 experiments, respectively, with the amount of training data decreasing.</p> "> Figure 12
<p>Classification images for extra eight images.</p> "> Figure 12 Cont.
<p>Classification images for extra eight images.</p> "> Figure 12 Cont.
<p>Classification images for extra eight images.</p> ">
Abstract
1. Introduction
- A novel depth-width double reinforced neural network is proposed for per-pixel VHRRS classification. DenseNet and internal classifiers are used to design a deeper network in which the negative effects of vanishing gradients and overfitting are reduced and the transparency of the hidden layers is increased. Multi-scale filters widen the network and increase the diversity of the extracted features by acquiring joint spatio-spectral information and diverse local spatial structures.
- DenseNet, which has seldom been utilized in VHRRS per-pixel classification, is introduced. Feature reuse, shorter connections between the input and output layers, and supervision over all layers strengthen the gradients, reduce overfitting, and curb parameter redundancy, making it possible to fully exploit the expressive ability of deep networks.
- The “network in network” concept is applied to smoothly fuse the deepening and widening strategies. Stacking subnets increases the network depth and enhances the network’s ability to acquire diverse information, which improves its expressive power and enables it to handle more complicated situations (a minimal sketch of one such subnet follows this list).
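To make the depth-width idea concrete, the sketch below assembles one subnet: a densely connected block whose layers apply 1 × 1, 3 × 3 and 5 × 5 convolutions in parallel (the kernel sizes compared in the experiments), followed by an internal per-pixel classifier head. This is a minimal PyTorch sketch; the original implementation was built on Caffe (see the reference list), so the framework, the growth rate and the number of layers here are illustrative assumptions rather than the authors' exact configuration.

```python
# Hedged PyTorch sketch of one depth-width reinforced subnet.
# Growth rate, layer count and channel sizes are illustrative assumptions.
import torch
import torch.nn as nn


class MultiScaleDenseBlock(nn.Module):
    """Densely connected block whose layers extract features with 1x1, 3x3 and
    5x5 kernels in parallel and concatenate them with all earlier feature maps."""

    def __init__(self, in_channels, growth=16, num_layers=3):
        super().__init__()
        self.layers = nn.ModuleList()
        channels = in_channels
        for _ in range(num_layers):
            self.layers.append(nn.ModuleDict({
                "k1": nn.Conv2d(channels, growth, kernel_size=1),
                "k3": nn.Conv2d(channels, growth, kernel_size=3, padding=1),
                "k5": nn.Conv2d(channels, growth, kernel_size=5, padding=2),
                "act": nn.ReLU(inplace=True),
            }))
            channels += 3 * growth  # dense connectivity: every output is stacked onto the input
        self.out_channels = channels

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            inp = torch.cat(features, dim=1)          # reuse all earlier feature maps
            out = torch.cat([layer["k1"](inp), layer["k3"](inp), layer["k5"](inp)], dim=1)
            features.append(layer["act"](out))
        return torch.cat(features, dim=1)


class Subnet(nn.Module):
    """One subnet of the 'network in network' stack: a multi-scale dense block plus
    an internal (auxiliary) per-pixel classifier that supervises the hidden layers."""

    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.block = MultiScaleDenseBlock(in_channels)
        self.internal_classifier = nn.Conv2d(self.block.out_channels, num_classes, kernel_size=1)

    def forward(self, x):
        feats = self.block(x)
        # features feed the next subnet; logits feed an internal objective function
        return feats, self.internal_classifier(feats)
```

Stacking several such subnets deepens the network, while the parallel multi-scale kernels inside each block widen it; the auxiliary logits are what the internal classifiers in Algorithm 1 supervise.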
2. Background Knowledge
3. Proposed Method
3.1. Making a Deeper Network
3.1.1. Improved DenseNet
3.1.2. Internal Classifier Supervision
3.2. Making a Wider Network
3.3. Architecture of the Proposed Network
Algorithm 1. Optimization approach for the proposed method. | |
1 | Inputs: |
2 | Input data: X and corresponding ground truth Y. |
3 | Iterations: I; Number of categories: C |
4 | Number of layers: L; Number of objective functions: K |
5 | Linear weights: w_1, …, w_K |
6 | Network parameters: θ |
7 | Learning rate: η |
8 | Algorithm: |
9 | for i ← 1 to I |
10 |  input a mini-batch (X_i, Y_i) |
11 |  for j ← 1 to K |
12 |   do loss ← loss + w_j · loss_j |
13 |  end |
14 |  do J ← loss (total weighted objective) |
15 |  Δθ ← −η · ∂J/∂θ; θ ← θ + Δθ |
16 |  for n ← 1 to L |
17 |   for m ← 1 to K |
18 |    do g_n ← g_n + w_m · ∂loss_m/∂θ_n |
19 |   end |
20 |   Δθ_n ← −η · g_n; θ_n ← θ_n + Δθ_n |
21 |  end |
end |
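Algorithm 1 amounts to minimizing, by gradient descent, a weighted sum of the objective functions of the internal classifiers and the final classifier, so that every layer receives supervision. The following hedged sketch replays that loop in PyTorch (the paper itself used Caffe); `subnets`, `final_classifier` and the weight list `w` are hypothetical placeholders for the quantities named in the algorithm, and the choice of SGD is an assumption.

```python
# Hedged PyTorch sketch of the optimization loop in Algorithm 1.
# `subnets` is a list of modules returning (features, internal_logits);
# `final_classifier` maps the last features to per-pixel class logits.
import itertools

import torch
import torch.nn.functional as F


def train(subnets, final_classifier, loader, iterations, w, lr):
    params = [p for net in subnets for p in net.parameters()]
    params += list(final_classifier.parameters())
    optimizer = torch.optim.SGD(params, lr=lr)

    for _, (x, y) in zip(range(iterations), itertools.cycle(loader)):  # for i <- 1 to I
        losses = []
        feats = x
        for net in subnets:                                    # stacked subnets ("network in network")
            feats, logits = net(feats)
            losses.append(F.cross_entropy(logits, y))          # objective of each internal classifier
        losses.append(F.cross_entropy(final_classifier(feats), y))  # final per-pixel classifier

        total = sum(wj * lj for wj, lj in zip(w, losses))      # loss <- sum_j w_j * loss_j
        optimizer.zero_grad()
        total.backward()                                       # gradients reach every layer (deep supervision)
        optimizer.step()                                       # theta <- theta - lr * gradient
```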
4. Experiments and Results
4.1. Experimental Data
4.2. Experimental Strategies
4.3. Experimental Results and Analysis
4.3.1. Influence of Network Structure on Network Performance
4.3.2. Influence of Network Components on Network Performance
4.3.3. Contrast Experiments with Other Networks
4.3.4. Influence of Training Data Size on Network Performance
4.3.5. More Experiments and Verifications
5. Conclusions
Author Contributions
Acknowledgments
Conflicts of Interest
References
- Huang, Z.; Cheng, G.; Wang, H.; Li, H.; Shi, L.; Pan, C. Building extraction from multi-source remote sensing images via deep deconvolution neural networks. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China, 10–15 July 2016; pp. 1835–1838. [Google Scholar]
- Wei, Y.; Wang, Z.; Xu, M. Road structure refined CNN for road extraction in aerial image. IEEE Geosci. Remote Sens. Lett. 2017, 14, 709–713. [Google Scholar] [CrossRef]
- Hwang, J.-J.; Liu, T.-L. Pixel-wise deep learning for contour detection. arXiv, 2015; arXiv:1504.01989. [Google Scholar]
- Zhao, W.; Du, S. Spectral–spatial feature extraction for hyperspectral image classification: A dimension reduction and deep learning approach. IEEE Trans. Geosci. Remote Sens. 2016, 54, 4544–4554. [Google Scholar] [CrossRef]
- Reis, S.; Tasdemir, K. Identification of hazelnut fields using spectral and Gabor textural features. ISPRS J. Photogramm. Remote Sens. 2011, 66, 652–661. [Google Scholar] [CrossRef]
- Wang, T.; Zhang, H.; Lin, H.; Fang, C. Textural–spectral feature-based species classification of mangroves in Mai Po Nature Reserve from Worldview-3 imagery. Remote Sens. 2016, 8, 24. [Google Scholar] [CrossRef]
- Yu, H.; Yang, W.; Xia, G.-S.; Liu, G. A color-texture-structure descriptor for high-resolution satellite image classification. Remote Sens. 2016, 8, 259. [Google Scholar] [CrossRef]
- Huang, L.; Chen, C.; Li, W.; Du, Q. Remote sensing image scene classification using multi-scale completed local binary patterns and Fisher vectors. Remote Sens. 2016, 8, 483. [Google Scholar] [CrossRef]
- Cheng, G.; Han, J.; Guo, L.; Liu, Z.; Bu, S.; Ren, J. Effective and efficient midlevel visual elements-oriented land-use classification using VHR remote sensing images. IEEE Trans. Geosci. Remote Sens. 2015, 53, 4238–4249. [Google Scholar] [CrossRef]
- Chaib, S.; Liu, H.; Gu, Y.; Yao, H. Deep feature fusion for VHR remote sensing scene classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 4775–4784. [Google Scholar] [CrossRef]
- Zhang, F.; Du, B.; Zhang, L. Scene classification via a gradient boosting random convolutional network framework. IEEE Trans. Geosci. Remote Sens. 2016, 54, 1793–1802. [Google Scholar] [CrossRef]
- Li, E.; Xia, J.; Du, P.; Lin, C.; Samat, A. Integrating multilayer features of convolutional neural networks for remote sensing scene classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 5653–5665. [Google Scholar] [CrossRef]
- Bazi, Y.; Melgani, F. Convolutional SVM Networks for Object Detection in UAV Imagery. IEEE Trans. Geosci. Remote Sens. 2018, 1–12. [Google Scholar] [CrossRef]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778. [Google Scholar]
- Pohlen, T.; Hermans, A.; Mathias, M.; Leibe, B. Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes. arXiv, 2016; arXiv:1611.08323. [Google Scholar]
- Mou, L.; Ghamisi, P.; Zhu, X.X. Unsupervised spectral-spatial feature learning via deep residual conv-deconv network for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2018, 56, 391–406. [Google Scholar] [CrossRef]
- Lee, H.; Kwon, H. Going deeper with contextual CNN for hyperspectral image classification. IEEE Trans. Image Process. 2017, 26, 4843–4855. [Google Scholar] [CrossRef] [PubMed]
- Huang, G.; Liu, Z.; Weinberger, K.Q.; van der Maaten, L. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; Volume 1, p. 3. [Google Scholar]
- Huang, G.; Sun, Y.; Liu, Z.; Sedra, D.; Weinberger, K.Q. Deep networks with stochastic depth. In Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 8–16 October 2016; pp. 646–661. [Google Scholar]
- Springenberg, J.T.; Dosovitskiy, A.; Brox, T.; Riedmiller, M. Striving for simplicity: The all convolutional net. arXiv, 2014; arXiv:1412.6806. [Google Scholar]
- Lee, C.Y.; Xie, S.; Gallagher, P.; Zhang, Z.; Tu, Z. Deeply-supervised nets. In Proceedings of the Artificial Intelligence and Statistics, San Diego, CA, USA, 9–12 May 2015; pp. 562–570. [Google Scholar]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems (NIPS), Lake Tahoe, NV, USA, 3–8 December 2012; pp. 1097–1105. [Google Scholar]
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv, 2014; arXiv:1409.1556. [Google Scholar]
- Lin, M.; Chen, Q.; Yan, S. Network in network. arXiv, 2013; arXiv:1312.4400. [Google Scholar]
- Soriano, A.; Vergara, L.; Ahmed, B.; Salazar, A. Fusion of scores in a detection context based on Alpha integration. Neural Comput. 2015, 27, 1983–2010. [Google Scholar] [CrossRef] [PubMed]
- Tao, Y.; Xu, M.; Zhong, Y.; Cheng, Y. GAN-Assisted Two-Stream Neural Network for High-Resolution Remote Sensing Image Classification. Remote Sens. 2017, 9, 1328. [Google Scholar] [CrossRef]
- Hao, S.; Wang, W.; Ye, Y.; Nie, T.; Bruzzone, L. Two-Stream Deep Architecture for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2017, 56, 2349–2361. [Google Scholar] [CrossRef]
- Xu, X.; Li, W.; Ran, Q.; Du, Q.; Gao, L.; Zhang, B. Multisource remote sensing data classification based on convolutional neural network. IEEE Trans. Geosci. Remote Sens. 2018, 56, 937–949. [Google Scholar] [CrossRef]
- Hu, J.; Mou, L.; Schmitt, A.; Zhu, X.X. FusioNet: A two-stream convolutional neural network for urban scene classification using PolSAR and hyperspectral data. In Proceedings of the Joint Urban Remote Sensing Event (JURSE), Dubai, UAE, 6–8 March 2017; pp. 1–4. [Google Scholar]
- Han, X.; Zhong, Y.; Cao, L.; Zhang, L. Pre-Trained AlexNet Architecture with Pyramid Pooling and Supervision for High Spatial Resolution Remote Sensing Image Scene Classification. Remote Sens. 2017, 9, 848. [Google Scholar] [CrossRef]
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9. [Google Scholar]
- Jia, Y.; Shelhamer, E.; Donahue, J.; Karayev, S.; Long, J.; Girshick, R.; Guadarrama, S.; Darrell, T. Caffe: Convolutional architecture for fast feature embedding. In Proceedings of the 22nd ACM International Conference on Multimedia (MM), Orlando, FL, USA, 3–7 November 2014; pp. 675–678. [Google Scholar]
- Cheng, G.; Zhou, P.; Han, J. Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images. IEEE Trans. Geosci. Remote Sens. 2016, 54, 7405–7415. [Google Scholar] [CrossRef]
- Liang, H.; Lin, X.; Zhang, Q.; Kang, X. Recognition of spoofed voice using convolutional neural networks. In Proceedings of the IEEE Global Conference on Signal and Information Processing (GlobalSIP), Montreal, QC, Canada, 14–16 November 2017; pp. 293–297. [Google Scholar]
- Salehinejad, H.; Barfett, J.; Aarabi, P.; Valaee, S.; Colak, E.; Gray, B.; Dowdell, T. A Convolutional Neural Network for Search Term Detection. arXiv, 2017; arXiv:1708.02238. [Google Scholar]
- Romero, A.; Gatta, C.; Camps-Valls, G. Unsupervised deep feature extraction for remote sensing image classification. IEEE Trans. Geosci. Remote Sens. 2016, 54, 1349–1362. [Google Scholar] [CrossRef]
- Chen, Y.; Jiang, H.; Li, C.; Jia, X.; Ghamisi, P. Deep feature extraction and classification of hyperspectral images based on convolutional neural networks. IEEE Trans. Geosci. Remote Sens. 2016, 54, 6232–6251. [Google Scholar] [CrossRef]
- Long, Y.; Gong, Y.; Xiao, Z.; Liu, Q. Accurate object localization in remote sensing images based on convolutional neural networks. IEEE Trans. Geosci. Remote Sens. 2017, 55, 2486–2498. [Google Scholar] [CrossRef]
- Liu, Y.; Liu, Y.; Ding, L. Scene Classification Based on Two-Stage Deep Feature Fusion. IEEE Geosci. Remote Sens. Lett. 2018, 15, 183–186. [Google Scholar] [CrossRef]
- Yu, Y.; Gong, Z.; Wang, C.; Zhong, P. An Unsupervised Convolutional Feature Fusion Network for Deep Representation of Remote Sensing Images. IEEE Geosci. Remote Sens. Lett. 2018, 15, 23–27. [Google Scholar] [CrossRef]
- Song, W.; Li, S.; Fang, L.; Lu, T. Hyperspectral Image Classification with Deep Feature Fusion Network. IEEE Trans. Geosci. Remote Sens. 2018, 1–12. [Google Scholar] [CrossRef]
- Maggiori, E.; Tarabalka, Y.; Charpiat, G.; Alliez, P. Convolutional neural networks for large-scale remote-sensing image classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 645–657. [Google Scholar] [CrossRef]
- Kampffmeyer, M.; Salberg, A.-B.; Jenssen, R. Semantic segmentation of small objects and modeling of uncertainty in urban remote sensing images using deep convolutional neural networks. In Proceedings of the IEEE Computer Vision and Pattern Recognition Workshops (CVPRW), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 680–688. [Google Scholar]
- Yu, F.; Koltun, V. Multi-scale context aggregation by dilated convolutions. arXiv, 2015; arXiv:1511.07122. [Google Scholar]
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826. [Google Scholar]
- Glorot, X.; Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. J. Mach. Learn. Res. 2010, 9, 249–256. [Google Scholar]
- Volpi, M.; Tuia, D. Dense semantic labeling of subdecimeter resolution images with convolutional neural networks. IEEE Trans. Geosci. Remote Sens. 2017, 55, 881–893. [Google Scholar] [CrossRef]
- Tao, Y.; Xu, M.; Zhang, F.; Du, B.; Zhang, L. Unsupervised-Restricted Deconvolutional Neural Network for Very High Resolution Remote-Sensing Image Classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 6805–6823. [Google Scholar] [CrossRef]
- Chen, Y.; Lin, Z.; Zhao, X.; Wang, G.; Gu, Y. Deep learning-based classification of hyperspectral data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 2094–2107. [Google Scholar] [CrossRef]
| No. | Category | Mark Color | Number of Pixels (BJ02, 800 × 800) | Number of Pixels (GF02, 570 × 570) |
|---|---|---|---|---|
| 1 | Water | Light Blue | 35522 | 65444 |
| 2 | Tree | Blue | 226305 | 86927 |
| 4 | Bare Land | Red | 70549 | 46296 |
| 5 | Building | Green | 115512 | 37549 |
| 6 | Road | Purple | 71464 | 26875 |
Overall accuracy (OA) on the BJ02 image for different numbers of layers and subnets.

| Layers | 1 Subnet | 2 Subnets | 3 Subnets |
|---|---|---|---|
| 14 | 0.96799 | 0.973863 | 0.977498 |
| 18 | 0.969218 | 0.97481 | 0.97878 |
| 22 | 0.971377 | 0.97796 | 0.98119 |
| 26 | 0.97239 | 0.978845 | 0.982915 |
| 30 | 0.973252 | 0.979723 | 0.983866 |

Kappa coefficient on the BJ02 image for different numbers of layers and subnets.

| Layers | 1 Subnet | 2 Subnets | 3 Subnets |
|---|---|---|---|
| 14 | 0.946285 | 0.956165 | 0.962246 |
| 18 | 0.948313 | 0.95764 | 0.964387 |
| 22 | 0.951957 | 0.96299 | 0.9684 |
| 26 | 0.95358 | 0.964458 | 0.9713 |
| 30 | 0.955042 | 0.965945 | 0.972904 |

Overall accuracy (OA) on the GF02 image for different numbers of layers and subnets.

| Layers | 1 Subnet | 2 Subnets | 3 Subnets |
|---|---|---|---|
| 14 | 0.973990 | 0.982858 | 0.984737 |
| 18 | 0.974033 | 0.983738 | 0.986798 |
| 22 | 0.975775 | 0.984148 | 0.9877175 |
| 26 | 0.978170 | 0.984558 | 0.98822 |
| 30 | 0.979980 | 0.985323 | 0.988855 |

Kappa coefficient on the GF02 image for different numbers of layers and subnets.

| Layers | 1 Subnet | 2 Subnets | 3 Subnets |
|---|---|---|---|
| 14 | 0.965193 | 0.977048 | 0.979563 |
| 18 | 0.965233 | 0.978228 | 0.982318 |
| 22 | 0.967575 | 0.978772 | 0.9835525 |
| 26 | 0.970774 | 0.979333 | 0.98422 |
| 30 | 0.973195 | 0.980348 | 0.9850775 |
Method | OA | KAPPA | WATER | TREE | BARE LAND | BUILDING | ROAD |
---|---|---|---|---|---|---|---|
proposed method | 0.984 ± 0.002 | 0.973 ± 0.002 | 0.998/0.960 | 0.989/0.999 | 0.978/0.920 | 0.963/0.984 | 0.985/0.968 |
Non-internal classifier | 0.977 ± 0.002 | 0.962 ± 0.003 | 0.995/0.973 | 0.982/0.999 | 0.971/0.890 | 0.951/0.975 | 0.987/0.947 |
1 × 1 kernel | 0.945 ± 0.004 | 0.907 ± 0.005 | 0.955/0.905 | 0.988/0.997 | 0.884/0.811 | 0.834/0.920 | 0.916/0.843
3 × 3 kernel | 0.977 ± 0.004 | 0.962 ± 0.005 | 0.997/0.952 | 0.985/0.999 | 0.965/0.892 | 0.947/0.974 | 0.984/0.954
5 × 5 kernel | 0.981 ± 0.002 | 0.968 ± 0.002 | 0.996/0.967 | 0.989/0.999 | 0.969/0.906 | 0.954/0.978 | 0.981/0.958
contextual deep CNN | 0.977 ± 0.002 | 0.961 ± 0.003 | 0.995/0.956 | 0.990/0.999 | 0.950/0.895 | 0.938/0.977 | 0.980/0.945 |
without DenseNet | 0.974 ± 0.002 | 0.957 ± 0.004 | 0.996/0.944 | 0.983/0.998 | 0.976/0.870 | 0.935/0.975 | 0.977/0.958 |
Improved URDNN | 0.977 ± 0.001 | 0.962 ± 0.002 | 0.997/0.927 | 0.982/0.999 | 0.963/0.934 | 0.961/0.953 | 0.982/0.954 |
SCAE + SVM | 0.892 ± 0.001 | 0.821 ± 0.001 | 0.868/0.774 | 0.964/0.995 | 0.865/0.708 | 0.726/0.776 | 0.764/0.752 |
Two-stream network | 0.976 | 0.959 | 0.998/0.921 | 0.983/0.999 | 0.951/0.897 | 0.955/0.968 | 0.978/0.959 |
deconvolution | 0.965 | 0.942 | 0.994/0.936 | 0.988/0.999 | 0.923/0.856 | 0.897/0.965 | 0.967/0.901 |
parallelepiped | 0.606 | 0.515 | 1.000/0.444 | 1.000/0.983 | 0.000/0.000 | 0.547/0.609 | 0.410/0.996 |
minimum distance | 0.746 | 0.683 | 0.912/0.873 | 1.000/0.992 | 0.550/0.645 | 0.626/0.421 | 0.683/0.753 |
Mahalanobis distance | 0.820 | 0.700 | 0.659/0.942 | 0.984/0.973 | 0.479/0.697 | 0.619/0.259 | 0.628/0.831 |
maximum likelihood | 0.8275 | 0.720 | 0.756/0.910 | 0.996/0.924 | 0.519/0.744 | 0.542/0.486 | 0.763/0.821
Method | OA | KAPPA | WATER | TREE | BARE LAND | BUILDING | ROAD |
---|---|---|---|---|---|---|---|
proposed method | 0.989 ± 0.001 | 0.985 ± 0.001 | 1.000/1.000 | 0.990/0.999 | 0.973/0.974 | 0.973/0.956 | 0.992/0.978 |
Non-internal classifier | 0.984 ± 0.003 | 0.978 ± 0.003 | 1.000/1.000 | 0.985/0.999 | 0.963/0.950 | 0.956/0.948 | 0.989/0.970 |
1 × 1 kernel | 0.956 ± 0.003 | 0.942 ± 0.004 | 0.999/1.000 | 0.987/0.998 | 0.866/0.902 | 0.862/0.856 | 0.948/0.878
3 × 3 kernel | 0.983 ± 0.002 | 0.978 ± 0.002 | 0.999/0.999 | 0.987/0.998 | 0.967/0.939 | 0.957/0.948 | 0.976/0.986
5 × 5 kernel | 0.985 ± 0.001 | 0.980 ± 0.002 | 0.999/0.998 | 0.979/0.997 | 0.973/0.958 | 0.971/0.958 | 0.993/0.970
contextual deep CNN | 0.982 ± 0.001 | 0.976 ± 0.001 | 1.000/1.000 | 0.988/0.998 | 0.960/0.954 | 0.945/0.951 | 0.981/0.950 |
without DenseNet | 0.979 ± 0.001 | 0.972 ± 0.002 | 0.999/0.997 | 0.976/0.998 | 0.954/0.942 | 0.967/0.933 | 0.976/0.962 |
Improved URDNN | 0.981 ± 0.001 | 0.975 ± 0.001 | 0.998/0.998 | 0.978/0.998 | 0.967/0.957 | 0.961/0.937 | 0.986/0.961 |
SCAE + SVM | 0.913 ± 0.001 | 0.883 ± 0.001 | 0.994/0.998 | 0.984/0.997 | 0.816/0.783 | 0.535/0.799 | 0.948/0.727 |
Two-stream network | 0.981 | 0.975 | 1.000/1.000 | 0.987/0.999 | 0.942/0.960 | 0.959/0.929 | 0.985/0.951 |
deconvolution | 0.969 | 0.958 | 0.995/1.000 | 0.969/0.998 | 0.945/0.892 | 0.939/0.890 | 0.956/0.975 |
parallelepiped | 0.333 | 0.208 | 0.997/0.086 | 1.000/0.549 | 0.000/0.000 | 0.159/1.000 | 0.000/0.000
minimum distance | 0.839 | 0.787 | 0.952/0.997 | 0.998/0.930 | 0.700/0.675 | 0.465/0.349 | 0.624/0.880 |
Mahalanobis distance | 0.859 | 0.813 | 0.980/0.998 | 1.000/0.946 | 0.691/0.699 | 0.644/0.377 | 0.615/0.945 |
maximum likelihood | 0.773 | 0.708 | 0.993/0.997 | 1.000/0.709 | 0.693/0.680 | 0.270/0.367 | 0.615/0.976 |
| No. | Method | OA | KAPPA | WATER | TREE | BARE LAND | BUILDING | ROAD | GRASS |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Ours | 0.988 | 0.983 | 1.000/0.999 | 0.989/0.989 | 0.963/0.967 | 0.994/0.993 | 0.999/0.981 | / |
| | URDNN | 0.980 | 0.971 | 0.996/0.998 | 0.976/0.993 | 0.957/0.935 | 0.992/0.976 | 0.992/0.880 | / |
| | ResNet | 0.985 | 0.978 | 1.000/0.999 | 0.983/0.992 | 0.967/0.948 | 0.991/0.992 | 0.976/0.908 | / |
| 2 | Ours | 0.974 | 0.961 | 0.998/0.974 | 0.968/0.998 | 0.985/0.953 | 0.965/0.954 | 0.992/0.965 | / |
| | URDNN | 0.966 | 0.949 | 0.998/0.972 | 0.963/0.995 | 0.978/0.934 | 0.945/0.949 | 0.989/0.959 | / |
| | ResNet | 0.968 | 0.952 | 0.997/0.967 | 0.969/0.996 | 0.977/0.936 | 0.940/0.964 | 0.994/0.922 | / |
| 3 | Ours | 0.992 | 0.987 | 1.000/0.997 | 0.995/0.998 | 0.989/0.962 | 0.982/0.995 | 1.000/0.987 | / |
| | URDNN | 0.985 | 0.976 | 1.000/0.996 | 0.983/0.999 | 0.989/0.919 | 0.981/0.992 | 1.000/0.990 | / |
| | ResNet | 0.987 | 0.979 | 0.999/0.997 | 0.992/0.999 | 0.993/0.924 | 0.963/0.996 | 1.000/0.991 | / |
| 4 | Ours | 0.982 | 0.971 | 0.999/0.999 | 0.978/0.997 | 0.985/0.962 | 0.987/0.934 | 0.987/0.989 | / |
| | URDNN | 0.975 | 0.960 | 1.000/0.999 | 0.964/0.998 | 0.998/0.915 | 0.985/0.928 | 0.988/0.986 | / |
| | ResNet | 0.976 | 0.961 | 1.000/1.000 | 0.970/0.998 | 0.987/0.934 | 0.992/0.914 | 0.970/0.993 | / |
| 5 | Ours | 0.976 | 0.964 | 0.997/0.975 | 0.980/0.994 | 0.959/0.978 | 0.990/0.885 | 0.986/0.959 | / |
| | URDNN | 0.970 | 0.954 | 0.995/0.975 | 0.983/0.991 | 0.924/0.986 | 0.992/0.849 | 0.998/0.919 | / |
| | ResNet | 0.966 | 0.948 | 0.993/0.963 | 0.978/0.996 | 0.918/0.980 | 0.995/0.773 | 0.999/0.962 | / |
| 6 | Ours | 0.988 | 0.983 | 1.000/0.997 | 0.994/0.993 | 0.969/0.974 | 0.984/0.985 | 0.995/0.990 | 0.996/0.983 |
| | URDNN | 0.985 | 0.979 | 0.999/0.991 | 0.989/0.996 | 0.980/0.956 | 0.973/0.989 | 0.994/0.983 | 0.998/0.944 |
| | ResNet | 0.978 | 0.971 | 0.999/0.998 | 0.972/0.996 | 0.976/0.965 | 0.978/0.989 | 0.989/0.880 | 0.993/0.918 |
| 7 | Ours | 0.982 | 0.976 | 0.998/0.998 | 0.940/0.936 | 0.977/0.982 | 0.984/0.989 | 0.978/0.980 | 0.995/0.974 |
| | URDNN | 0.978 | 0.971 | 0.998/0.998 | 0.936/0.912 | 0.974/0.973 | 0.979/0.989 | 0.978/0.975 | 0.984/0.970 |
| | ResNet | 0.980 | 0.973 | 1.000/0.996 | 0.912/0.957 | 0.960/0.959 | 0.988/0.985 | 0.974/0.978 | 0.996/0.975 |
| 8 | Ours | 0.991 | 0.987 | 0.997/0.988 | 0.990/0.999 | 0.992/0.966 | 0.995/0.973 | 0.996/0.990 | 0.970/0.987 |
| | URDNN | 0.988 | 0.982 | 0.995/0.969 | 0.985/0.998 | 0.983/0.962 | 0.999/0.971 | 0.991/0.996 | 0.970/0.983 |
| | ResNet | 0.990 | 0.985 | 0.997/0.980 | 0.990/0.999 | 0.982/0.967 | 0.996/0.973 | 0.991/0.994 | 0.965/0.979 |
| 9 | Ours | 0.984 | 0.979 | 0.999/0.979 | 0.983/0.997 | 0.984/0.988 | 0.977/0.981 | 0.991/0.959 | / |
| | URDNN | 0.961 | 0.948 | 1.000/0.981 | 0.960/0.992 | 0.958/0.996 | 0.943/0.913 | 0.970/0.909 | / |
| | ResNet | 0.970 | 0.960 | 0.999/0.989 | 0.965/0.996 | 0.981/0.990 | 0.954/0.930 | 0.978/0.936 | / |
| 10 | Ours | 0.992 | 0.986 | 0.999/1.000 | 0.991/0.999 | 0.987/0.983 | 0.997/0.977 | 0.997/0.955 | / |
| | URDNN | 0.979 | 0.963 | 0.999/1.000 | 0.972/0.996 | 0.993/0.982 | 0.994/0.911 | 0.962/0.954 | / |
| | ResNet | 0.973 | 0.954 | 0.998/1.000 | 0.964/0.998 | 0.984/0.984 | 0.993/0.902 | 0.977/0.874 | / |
| 11 | Ours | 0.983 | 0.977 | 0.993/0.984 | 0.975/0.994 | 0.975/0.971 | 0.986/0.991 | 0.995/0.913 | / |
| | URDNN | 0.964 | 0.953 | 0.988/0.995 | 0.961/0.994 | 0.942/0.908 | 0.960/0.984 | 0.979/0.787 | / |
| | ResNet | 0.971 | 0.961 | 0.986/0.980 | 0.956/0.991 | 0.977/0.935 | 0.975/0.991 | 0.976/0.837 | / |
| 12 | Ours | 0.989 | 0.984 | 1.000/0.965 | 0.981/0.999 | 0.994/0.980 | 0.994/0.983 | 0.998/0.986 | / |
| | URDNN | 0.970 | 0.956 | 0.998/0.978 | 0.958/0.998 | 0.977/0.991 | 0.972/0.848 | 0.984/0.919 | / |
| | ResNet | 0.977 | 0.966 | 1.000/0.895 | 0.964/0.996 | 0.983/0.966 | 0.988/0.967 | 0.995/0.978 | / |
| 13 | Ours | 0.981 | 0.974 | 1.000/1.000 | 0.983/0.996 | 0.971/0.973 | 0.991/0.973 | 0.992/0.996 | 0.977/0.950 |
| | URDNN | 0.966 | 0.955 | 0.995/0.998 | 0.962/0.995 | 0.975/0.925 | 0.948/0.978 | 0.996/0.858 | 0.991/0.924 |
| | ResNet | 0.971 | 0.961 | 0.995/0.991 | 0.972/0.994 | 0.968/0.948 | 0.957/0.992 | 0.989/0.847 | 0.994/0.893 |
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).