Image Enhancement Driven by Object Characteristics and Dense Feature Reuse Network for Ship Target Detection in Remote Sensing Imagery
"> Figure 1
<p>Framework of the proposed algorithm for ship target detection in remote sensing imagery (RSI).</p> "> Figure 2
<p>Architectures of the image enhancement network.</p> "> Figure 3
<p>Description of dense feature reuse (DFR) module.</p> "> Figure 4
<p>Description of the specific implementation of the DFR module.</p> "> Figure 5
<p>The atrous spatial pyramid pooling (ASPP) model used in receptive field expansion (RFE) module.</p> "> Figure 6
<p>Schematic diagram of the RFE module.</p> "> Figure 7
<p>Examples of different scenarios in the ChangGuang dataset. (<b>a</b>) Ocean wave scene, (<b>b</b>) cloud and fog scene, (<b>c</b>) nearshore scene, (<b>d</b>) scenes with different target scales.</p> "> Figure 8
<p>Some samples from SSDD. (<b>a</b>) Simple background, (<b>b</b>) complex background, (<b>c</b>) nearshore targets.</p> "> Figure 9
<p>Some output results of the OCIE module. (<b>a</b>) is the dark target selected in the Changguang dataset, (<b>c</b>) is the cloudy scene, (<b>b</b>,<b>d</b>) respectively correspond to the output of these two samples through the OCIE module, in which the target is effectively enhanced.</p> "> Figure 10
<p>Some of the detection results obtained by the proposed overall framework on Mask RCNN.</p> "> Figure 11
<p>Average precision–intersection over union (AP–IoU) curves for the two datasets. Plotted results show the detection performance of our detection framework and some representative detection methods.</p> "> Figure 12
<p>Precision–recall curve for the datasets. Plotted results show the detection performance of our detection framework.</p> ">
Abstract
1. Introduction
1.1. Related Work
1.2. Problem Description and Motivations
- (1) Under different sea conditions and backgrounds, the color, texture, and noise distribution of images vary greatly. Common deep learning networks are not robust to such variation, so small pixel-level differences between images can cause drastic changes in the detection results. Moreover, in SAR images the gray value of a pixel (intensity or amplitude) depends on the radar cross-section of the ground object, which is determined by the radar irradiation angle, object geometry, material, and other factors. In practice, the radar reflectivity of a target is easily corrupted by a complex background, so the target is easily submerged in background noise.
- (2) Because the background texture of RSIs is complex and various environmental factors affect the feature expression of ships, the ability of a CNN to extract the geometric shape and texture of ships is weakened, and it becomes difficult to distinguish ships from onshore false-alarm targets. CNNs that achieve outstanding results usually require heavy computation and a large memory footprint, and their high-level output feature vectors lose much of the spatial location information required by detection tasks after repeated pooling or downsampling.
- (3) Ship targets at sea are unbalanced in scale: large warships and small fishing boats may appear in the same scene. A typical single-scale network handles such multi-scale targets poorly, and its detection accuracy for small targets is often very low.
1.3. Contributions and Structure
- (1) Inspired by the GAN [30] and the DSLR photo enhancement dataset (DPED) [31], a generator subnetwork and a discriminator subnetwork were combined to form the object characteristic-driven image enhancement (OCIE) module. This module automatically generates visually pleasing satellite images that retain enough target information to be useful for the detection task, while also augmenting the training set. It optimizes the texture, color, smoothness, and semantic information of the training images and greatly improves the contrast between target and background, which benefits the foreground/background classification in the region proposal network (RPN). (A minimal sketch of such an adversarial enhancement setup is given after this list.)
- (2) The dense feature reuse (DFR) module contains multi-level residual networks with dense connections that exploit spatial location features without adding extra parameters, avoiding the vanishing-gradient problem. Inspired by the original dense block, it uses 1 × 1 convolutions to suppress channel growth and to merge low-level position information with features of different resolutions. It retains identity mapping and strengthens the flow of information through the network. (A dense-block sketch follows this list.)
- (3)
- In order to further improve the ability to obtain spatial scale information, multi-scale atrous convolution kernels with different sparsity and sizes were combined in a manner similar to spatial pyramid pooling (SPP). The generated ASPP (atrous spatial pyramid pooling) structure was integrated with the FPN to form the receptive field expansion (RFE) module, which enhances the receptive field and strengthens the network’s ability to obtain information of different scales and better process global information.
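The adversarial setup behind an OCIE-style enhancement module can be summarized in a few dozen lines. The following PyTorch sketch pairs a residual generator with a patch discriminator; the module names (`EnhanceGenerator`, `PatchDiscriminator`), layer counts, and kernel sizes are illustrative assumptions, not the paper's exact OCIE architecture.

```python
# Minimal sketch of a GAN-based image enhancement setup, assuming a
# residual generator and a patch discriminator. All hyperparameters
# here are illustrative, not the authors' exact OCIE implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return x + self.body(x)

class EnhanceGenerator(nn.Module):
    """Maps a low-quality RSI patch to an enhanced patch of the same size."""
    def __init__(self, n_blocks=4):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(3, 64, 9, padding=4),
                                  nn.ReLU(inplace=True))
        self.blocks = nn.Sequential(*[ResidualBlock() for _ in range(n_blocks)])
        self.tail = nn.Conv2d(64, 3, 9, padding=4)

    def forward(self, x):
        return torch.tanh(self.tail(self.blocks(self.head(x))))

class PatchDiscriminator(nn.Module):
    """Scores local patches as real (reference quality) or generated."""
    def __init__(self):
        super().__init__()
        layers, c = [], 3
        for out_c in (64, 128, 256):
            layers += [nn.Conv2d(c, out_c, 4, stride=2, padding=1),
                       nn.LeakyReLU(0.2, inplace=True)]
            c = out_c
        layers.append(nn.Conv2d(c, 1, 4, padding=1))  # one logit per patch
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

# One generator step: try to fool the discriminator on enhanced outputs.
g, d = EnhanceGenerator(), PatchDiscriminator()
low = torch.randn(2, 3, 128, 128)  # degraded input patches
logits = d(g(low))
adv_loss = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
```

In the paper's setting, additional color, texture, smoothness, and content losses would be combined with this adversarial term; only the adversarial skeleton is shown here.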
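The core of the DFR idea, dense concatenation followed by 1 × 1 channel compression and an identity shortcut, can be sketched as follows. The growth rate and depth are assumptions for illustration, not the exact DFR configuration.

```python
# Minimal sketch of a dense feature-reuse block. Each layer receives the
# concatenation of all preceding feature maps; a final 1x1 convolution
# compresses the channels back to the input width so the identity
# shortcut can be added. Growth rate and depth are assumptions.
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    def __init__(self, in_channels, growth_rate=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, growth_rate, 3, padding=1, bias=False),
        )

    def forward(self, x):
        # Dense connectivity: append this layer's new features to the
        # running concatenation of all earlier outputs.
        return torch.cat([x, self.conv(x)], dim=1)

class DFRBlock(nn.Module):
    def __init__(self, in_channels, n_layers=4, growth_rate=32):
        super().__init__()
        layers, c = [], in_channels
        for _ in range(n_layers):
            layers.append(DenseLayer(c, growth_rate))
            c += growth_rate
        self.dense = nn.Sequential(*layers)
        # 1x1 convolution suppresses channel growth and fuses the
        # accumulated multi-resolution features back to the input width.
        self.compress = nn.Conv2d(c, in_channels, kernel_size=1, bias=False)

    def forward(self, x):
        # Identity shortcut preserves the original mapping and eases
        # gradient flow through the block.
        return x + self.compress(self.dense(x))

# A 256-channel feature map passes through with its shape unchanged.
feat = torch.randn(1, 256, 64, 64)
print(DFRBlock(256)(feat).shape)  # torch.Size([1, 256, 64, 64])
```

Because the block's input and output shapes match, it can be dropped into an existing backbone stage without changing the surrounding layers.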
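Likewise, the ASPP structure inside the RFE module amounts to parallel atrous convolutions with different dilation rates whose outputs are fused by a 1 × 1 projection. The rates (1, 6, 12, 18) below follow the common DeepLab convention and are an assumption; the paper's exact rates may differ.

```python
# Minimal sketch of an ASPP head of the kind used in the RFE module:
# parallel atrous (dilated) convolutions enlarge the receptive field
# without further downsampling, and a 1x1 projection fuses the branches.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    def __init__(self, in_channels, out_channels=256, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList()
        for r in rates:
            # rate 1 degenerates to a plain 1x1 convolution
            k, p = (1, 0) if r == 1 else (3, r)
            self.branches.append(nn.Sequential(
                nn.Conv2d(in_channels, out_channels, k, padding=p,
                          dilation=r, bias=False),
                nn.BatchNorm2d(out_channels),
                nn.ReLU(inplace=True),
            ))
        # Image-level branch captures global context.
        self.global_pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_channels, out_channels, 1, bias=False),
            nn.ReLU(inplace=True),
        )
        self.project = nn.Conv2d(out_channels * (len(rates) + 1),
                                 out_channels, 1)

    def forward(self, x):
        h, w = x.shape[2:]
        feats = [branch(x) for branch in self.branches]
        g = F.interpolate(self.global_pool(x), size=(h, w),
                          mode="bilinear", align_corners=False)
        # Concatenate all scales and fuse with a 1x1 convolution.
        return self.project(torch.cat(feats + [g], dim=1))

# Spatial size is preserved while the receptive field grows.
x = torch.randn(1, 256, 32, 32)
print(ASPP(256)(x).shape)  # torch.Size([1, 256, 32, 32])
```

In the RFE module this head would be applied to FPN levels, so each pyramid level sees context from several effective receptive fields at once.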
2. Preliminaries
2.1. Image Enhancement Network Based on GAN
2.2. Densely Connected Convolutional Networks
3. Proposed Method
3.1. The Overall Framework of Ship Detection Methods
3.2. Image Enhancement Method Considering Ship Target Characteristics
3.3. Dense Feature Reuse Module
3.4. Improved Receptive Field Expansion Module Based on Multi-Scale
4. Experiments and Analysis
4.1. Experimental Data
4.2. Experimental Results and Analysis
4.2.1. Experiment Evaluation of OCIE and DFR Module for RSIs
4.2.2. Comparison of Performance between the Proposed Overall Framework and the State of the Art
4.2.3. AP versus IoU Curve
4.2.4. Precision versus Recall
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Colomina, I.; Molina, P. Unmanned aerial systems for photogrammetry and remote sensing: A review. ISPRS J. Photogramm. Remote Sens. 2014, 92, 79–97.
- Fromm, M.; Schubert, M.; Castilla, G.; Linke, J.; McDermid, G. Automated Detection of Conifer Seedlings in Drone Imagery Using Convolutional Neural Networks. Remote Sens. 2019, 11, 2585.
- Cheng, G.; Han, J. A survey on object detection in optical remote sensing images. ISPRS J. Photogramm. Remote Sens. 2016, 117, 11–28.
- Svatonova, H. Analysis of Visual Interpretation of Satellite Data. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2016, 41.
- Chang, Y.L.; Anagaw, A.; Chang, L.; Wang, Y.C.; Hsiao, C.Y.; Lee, W.H. Ship detection based on YOLOv2 for SAR imagery. Remote Sens. 2019, 11, 786.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
- Tayara, H.; Chong, K.T. Object detection in very high-resolution aerial images using one-stage densely connected feature pyramid network. Sensors 2018, 18, 3341.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Cham, Switzerland, 2016; pp. 21–37.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
- Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271.
- Redmon, J.; Farhadi, A. YOLOv3: An incremental improvement. arXiv 2018, arXiv:1804.02767.
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988.
- Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149.
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125.
- Liu, S.; Huang, D. Receptive field block net for accurate and fast object detection. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 385–400.
- Li, Z.; Zhou, F. FSSD: Feature fusion single shot multibox detector. arXiv 2017, arXiv:1712.00960.
- Zhu, R.; Zhang, S.; Wang, X.; Wen, L.; Shi, H.; Bo, L.; Mei, T. ScratchDet: Training single-shot object detectors from scratch. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2268–2277.
- Zhang, S.; Wen, L.; Bian, X.; Lei, Z.; Li, S.Z. Single-shot refinement neural network for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4203–4212.
- Yang, X.; Sun, H.; Fu, K.; Yang, J.; Sun, X.; Yan, M.; Guo, Z. Automatic ship detection in remote sensing images from Google Earth of complex scenes based on multiscale rotation dense feature pyramid networks. Remote Sens. 2018, 10, 132.
- Chen, Z.; Zhang, T.; Ouyang, C. End-to-end airplane detection using transfer learning in remote sensing images. Remote Sens. 2018, 10, 139.
- Ji, H.; Gao, Z.; Mei, T.; Ramesh, B. Vehicle detection in remote sensing images leveraging on simultaneous super-resolution. IEEE Geosci. Remote Sens. Lett. 2019, 17, 676–680.
- Li, Q.; Mou, L.; Xu, Q.; Zhang, Y.; Zhu, X.X. R3-Net: A deep network for multi-oriented vehicle detection in aerial images and videos. arXiv 2018, arXiv:1808.05560.
- Ammour, N.; Alhichri, H.; Bazi, Y.; Benjdira, B.; Alajlan, N.; Zuair, M. Deep learning approach for car detection in UAV imagery. Remote Sens. 2017, 9, 312.
- Tang, T.; Zhou, S.; Deng, Z.; Zou, H.; Lei, L. Vehicle detection in aerial images based on region convolutional neural networks and hard negative example mining. Sensors 2017, 17, 336.
- Ren, Y.; Zhu, C.; Xiao, S. Small object detection in optical remote sensing images via modified Faster R-CNN. Appl. Sci. 2018, 8, 813.
- Li, W.; Fu, H.; Yu, L.; Cracknell, A. Deep learning based oil palm tree detection and counting for high-resolution remote sensing images. Remote Sens. 2017, 9, 22.
- Zhao, Z.Q.; Zheng, P.; Xu, S.T.; Wu, X. Object detection with deep learning: A review. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 3212–3232.
- Li, K.; Wan, G.; Cheng, G.; Meng, L.; Han, J. Object detection in optical remote sensing images: A survey and a new benchmark. ISPRS J. Photogramm. Remote Sens. 2020, 159, 296–307.
- Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. arXiv 2014, arXiv:1406.2661.
- Ignatov, A.; Kobyshev, N.; Timofte, R.; Vanhoey, K.; Van Gool, L. DSLR-quality photos on mobile devices with deep convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 3277–3285.
- Pizer, S.M.; Amburn, E.P.; Austin, J.D.; Cromartie, R.; Geselowitz, A.; Greer, T.; ter Haar Romeny, B.; Zimmerman, J.B.; Zuiderveld, K. Adaptive histogram equalization and its variations. Comput. Vis. Graph. Image Process. 1987, 39, 355–368.
- Farid, H. Blind inverse gamma correction. IEEE Trans. Image Process. 2001, 10, 1428–1433.
- Ignatov, A.; Kobyshev, N.; Timofte, R.; Vanhoey, K.; Van Gool, L. WESPE: Weakly supervised photo enhancer for digital cameras. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 691–700.
- Chen, Y.S.; Wang, Y.C.; Kao, M.H.; Chuang, Y.Y. Deep photo enhancer: Unpaired learning for image enhancement from photographs with GANs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6306–6314.
- Sharma, V.; Diba, A.; Neven, D.; Brown, M.S.; Van Gool, L.; Stiefelhagen, R. Classification-driven dynamic image enhancement. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4033–4041.
- Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2223–2232.
- Kim, T.; Cha, M.; Kim, H.; Lee, J.K.; Kim, J. Learning to discover cross-domain relations with generative adversarial networks. In Proceedings of the International Conference on Machine Learning (PMLR 2017), Sydney, Australia, 6–11 August 2017; pp. 1857–1865.
- Yi, Z.; Zhang, H.; Tan, P.; Gong, M. DualGAN: Unsupervised dual learning for image-to-image translation. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2849–2857.
- Liu, M.Y.; Tuzel, O. Coupled generative adversarial networks. arXiv 2016, arXiv:1606.07536.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Huang, G.; Sun, Y.; Liu, Z.; Sedra, D.; Weinberger, K.Q. Deep networks with stochastic depth. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 646–661.
- Srivastava, R.K.; Greff, K.; Schmidhuber, J. Highway networks. arXiv 2015, arXiv:1505.00387.
- Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the International Conference on Machine Learning, PMLR, Lille, France, 7–9 July 2015; pp. 448–456.
- Glorot, X.; Bordes, A.; Bengio, Y. Deep sparse rectifier neural networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, JMLR Workshop and Conference Proceedings, Fort Lauderdale, FL, USA, 11–13 April 2011; pp. 315–323.
- Li, J.; Qu, C.; Shao, J. Ship detection in SAR images based on an improved Faster R-CNN. In Proceedings of the 2017 SAR in Big Data Era: Models, Methods and Applications (BIGSARDATA), Beijing, China, 13–14 November 2017; pp. 1–6.
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969.
Detection performance with and without the OCIE module:

| Algorithm | ChangGuang AP (%) | ChangGuang AR (%) | SSDD AP (%) | SSDD AR (%) |
| --- | --- | --- | --- | --- |
| YOLOv3 | 74.07 | 76.34 | 67.01 | 66.40 |
| YOLOv3 (OCIE) | 75.11 | 77.43 | 68.52 | 68.15 |
| Mask R-CNN | 81.18 | 77.27 | 87.46 | 79.18 |
| Mask R-CNN (OCIE) | 84.29 | 78.04 | 90.13 | 80.44 |
Effect of adding the DFR module on top of OCIE:

| Algorithm | ChangGuang AP (%) | ChangGuang AR (%) | SSDD AP (%) | SSDD AR (%) |
| --- | --- | --- | --- | --- |
| YOLOv3 (OCIE) | 75.11 | 77.43 | 68.52 | 68.15 |
| YOLOv3 (OCIE-DFR) | 76.29 | 77.95 | 69.44 | 68.83 |
| Mask R-CNN (OCIE) | 84.29 | 78.04 | 90.13 | 80.44 |
| Mask R-CNN (OCIE-DFR) | 86.58 | 79.80 | 91.35 | 82.08 |
Model complexity of the baselines and the proposed framework:

| Algorithm | #Params | GFLOPs |
| --- | --- | --- |
| YOLOv3 | 41.95 M | 195.55 |
| YOLOv3 (Our Framework) | 38.17 M | 181.69 |
| Mask R-CNN | 44.17 M | 253.37 |
| Mask R-CNN (Our Framework) | 40.03 M | 197.45 |
Overall comparison with representative detectors on both datasets:

| Algorithm | ChangGuang AP (%) | ChangGuang AR (%) | SSDD AP (%) | SSDD AR (%) |
| --- | --- | --- | --- | --- |
| CornerNet | 63.61 | 70.25 | 74.31 | 66.70 |
| FCOS | 74.10 | 79.93 | 84.74 | 76.30 |
| Faster R-CNN | 79.27 | 77.34 | 85.94 | 78.36 |
| Cascade R-CNN | 79.11 | 79.95 | 87.10 | 78.93 |
| YOLOv3-tiny | 70.46 | 71.85 | 64.04 | 66.23 |
| YOLOv3 | 74.07 | 76.34 | 67.01 | 66.40 |
| YOLOv3 (OCIE) | 75.11 | 77.43 | 68.52 | 68.15 |
| YOLOv3 (OCIE-DFR) | 76.29 | 77.95 | 69.44 | 68.83 |
| YOLOv3 (OCIE-DFR-RFE) | 79.32 | 78.86 | 69.84 | 69.71 |
| Mask R-CNN | 81.18 | 77.27 | 87.46 | 79.18 |
| Mask R-CNN (OCIE) | 84.29 | 78.04 | 90.13 | 80.44 |
| Mask R-CNN (OCIE-DFR) | 86.58 | 79.80 | 91.35 | 82.08 |
| Mask R-CNN (OCIE-DFR-RFE) | 87.39 | 80.56 | 92.09 | 82.25 |