Multi-Modal Multi-Stage Underwater Side-Scan Sonar Target Recognition Based on Synthetic Images
Figure 1. Image encoder–decoder reconstruction model structure diagram.
Figure 2. Illustration of image encoder–decoder style transfer.
Figure 3. Illustration of WST multi-feature layer sequence image style transfer: (a) VGG19 network structure diagram; (b) WST process details and multi-feature layer addition.
Figure 4. Basic model of TL-based object recognition.
Figure 5. Structure diagram of the multi-modal multi-stage transfer network for object recognition.
Figure 6. Side-scan sonar dataset samples: (a) three classes of side-scan image targets; (b) sample distribution diagram.
Figure 7. Image transformation results: (a) image sample; (b) center crop; (c) bottom-left crop; (d) top-left crop; (e) bottom-right crop; (f) top-right crop; (g) equal-height stretch; (h) equal-width stretch; (i) contrast transformation (gamma = 0.87); (j) contrast transformation (gamma = 1.07); (k) rotation by 45°; (l) rotation by 90°; (m) rotation by 135°; (n) rotation by 180°; (o) rotation by 225°; (p) rotation by 270°; (q) rotation by 315°; (r) left–right flip.
Figure 8. The datasets used in the experiments: (a) grayscale optical image samples; (b) synthetic image samples; (c) SAR image samples; (d) SSS image samples.
Figure 9. Results comparing the proposed method with the traditional ones.
Figure 10. Confusion matrix comparison between the proposed method and DenseNet.
Figure 11. Forward-looking sonar dataset samples.
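Figure 7's caption enumerates the augmentations applied to each sample: five-position crops, equal-height and equal-width stretches, gamma contrast changes at 0.87 and 1.07, rotations in 45° steps, and a left–right flip. The following is a minimal illustrative sketch of such a pipeline using torchvision; the 224 × 224 crop size and the factor-of-two stretches are assumptions for illustration, not values from the paper.

```python
# Minimal sketch of the Figure 7 augmentation set; crop size (224 x 224)
# and stretch factors are illustrative assumptions, not the paper's values.
import torchvision.transforms.functional as TF
from PIL import Image


def augment(img: Image.Image, crop=224):
    """Yield the 17 variants (b)-(r) of Figure 7 for one sample.
    Assumes the input is larger than the crop size."""
    w, h = img.size
    # (b)-(f): center crop plus the four corner crops
    yield TF.center_crop(img, [crop, crop])
    yield img.crop((0, h - crop, crop, h))        # bottom-left
    yield img.crop((0, 0, crop, crop))            # top-left
    yield img.crop((w - crop, h - crop, w, h))    # bottom-right
    yield img.crop((w - crop, 0, w, crop))        # top-right
    # (g), (h): stretch width at fixed height, and height at fixed width
    yield img.resize((2 * w, h))                  # equal-height stretch
    yield img.resize((w, 2 * h))                  # equal-width stretch
    # (i), (j): gamma contrast transforms with the captioned values
    yield TF.adjust_gamma(img, gamma=0.87)
    yield TF.adjust_gamma(img, gamma=1.07)
    # (k)-(q): rotations in 45-degree increments
    for angle in (45, 90, 135, 180, 225, 270, 315):
        yield TF.rotate(img, angle, expand=True)
    # (r): left-right (horizontal) flip
    yield TF.hflip(img)
```

Each source image thus yields 17 additional variants, matching panels (b) through (r) of Figure 7; this is how a small SSS dataset can be expanded before transfer learning.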
Abstract
1. Introduction
2. Related Work
3. Materials and Methods
3.1. SSS Image Style Transfer for Optical Images
3.1.1. Image Encoding–Decoding Reconstruction Model
3.1.2. Image Content Information Extraction
3.1.3. Image Style Information Transfer
3.1.4. WST Multi-Feature Layer Sequential Image Style Transfer
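No implementation details survive in this excerpt, so the following is a hedged sketch only: per Figures 1–3, the WST step transfers SSS style statistics onto optical-image content features extracted at several VGG19 layers before decoding. A whitening-and-coloring feature transform, as used in encoder–decoder style transfer work, matches that description; the function below is an illustrative assumption about WST's form, not the authors' code.

```python
# Illustrative whitening-and-coloring feature transform; an assumption
# about the form of the WST step, not the authors' implementation.
import torch


def whiten_color(content_feat: torch.Tensor, style_feat: torch.Tensor,
                 eps: float = 1e-5) -> torch.Tensor:
    """Map (C, H, W) content features to have the style features'
    channel-wise mean and covariance while keeping content structure."""
    C, H, W = content_feat.shape
    fc = content_feat.reshape(C, -1)
    fs = style_feat.reshape(C, -1)
    fc = fc - fc.mean(dim=1, keepdim=True)
    mu_s = fs.mean(dim=1, keepdim=True)
    fs = fs - mu_s

    # Whitening: remove the content covariance via its eigendecomposition.
    cov_c = fc @ fc.t() / (fc.shape[1] - 1) + eps * torch.eye(C)
    ec, vc = torch.linalg.eigh(cov_c)
    whitened = vc @ torch.diag(ec.clamp_min(eps).rsqrt()) @ vc.t() @ fc

    # Coloring: impose the style covariance, then restore the style mean.
    cov_s = fs @ fs.t() / (fs.shape[1] - 1) + eps * torch.eye(C)
    es, vs = torch.linalg.eigh(cov_s)
    colored = vs @ torch.diag(es.clamp_min(eps).sqrt()) @ vs.t() @ whitened
    return (colored + mu_s).reshape(C, H, W)
```

In a multi-feature-layer pipeline, a transform of this kind would be applied sequentially at each chosen VGG19 layer, decoding and re-encoding between applications, as Figure 3 suggests.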
3.2. Multi-Modal Multi-Stage Transfer Network for Object Recognition
4. Experiments
4.1. Experimental Settings
4.1.1. Application Dataset
4.1.2. Experimental Dataset Preprocessing
4.2. Evaluation Metrics
- TP: if a sample belongs to a given class and is predicted as such, the outcome is a true positive.
- TN: if a sample does not belong to a class and is predicted not to belong, the outcome is a true negative.
- FP: if a sample does not belong to a class but is predicted to belong, the outcome is a false positive.
- FN: if a sample belongs to a class but is predicted not to, the outcome is a false negative (a computation sketch follows this list).
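As a minimal sketch (not the authors' code), these four counts give per-class precision, recall, and accuracy in a one-vs-rest fashion; the three class names are taken from the results table below, and the index mapping is assumed.

```python
# Minimal sketch: per-class metrics from predicted/true labels.
import numpy as np


def per_class_metrics(y_true, y_pred, cls):
    """Precision, recall, and accuracy with `cls` as the positive
    class (one-vs-rest), per the TP/TN/FP/FN definitions above."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == cls) & (y_pred == cls))
    tn = np.sum((y_true != cls) & (y_pred != cls))
    fp = np.sum((y_true != cls) & (y_pred == cls))
    fn = np.sum((y_true == cls) & (y_pred != cls))
    precision = tp / (tp + fp) if tp + fp else 0.0   # TP / (TP + FP)
    recall = tp / (tp + fn) if tp + fn else 0.0      # TP / (TP + FN)
    accuracy = (tp + tn) / len(y_true)               # (TP + TN) / all
    return precision, recall, accuracy


# Class indices 0/1/2 for plane/ship/others are an assumed mapping.
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 0]
for c, name in enumerate(["plane", "ship", "others"]):
    p, r, a = per_class_metrics(y_true, y_pred, c)
    print(f"{name}: precision={p:.4f} recall={r:.4f} accuracy={a:.4f}")
```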
4.3. Performance Analysis
- (1) Comparison of our method with traditional DL models;
- (2) Comparison of methods for the classification of SSS images;
- (3) Comparison of different backbones for the classification of SSS images;
- (4) Comparison of different TL strategies;
- (5) Comparison of various backbones for classifying noisy SSS images;
- (6) Application of FLS target recognition.
5. Discussion
5.1. Method Importance
5.2. Algorithm Limitations
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Bhanu, B. Automatic target recognition: State of the art survey. IEEE Trans. Aerosp. Electron. Syst. 1986, AES-22, 364–379.
2. Chaillan, F.; Fraschini, C.; Courmontagne, P. Speckle noise reduction in SAS imagery. Signal Process. 2007, 87, 762–781.
3. Kazimierski, W.; Zaniewicz, G. Determination of process noise for underwater target tracking with forward looking sonar. Remote Sens. 2021, 13, 1014.
4. Wang, H.; Wang, B.; Li, Y. IAFNet: Few-shot learning for modulation recognition in underwater impulsive noise. IEEE Commun. Lett. 2022, 26, 1047–1051.
5. Zhang, X.; Ying, W.; Yang, P.; Sun, M. Parameter estimation of underwater impulsive noise with the Class B model. IET Radar Sonar Navig. 2020, 14, 1055–1060.
6. Li, H.; Dong, Y.; Gong, C.; Zhang, Z.; Wang, X.; Dai, X. A non-Gaussianity-aware receiver for impulsive noise mitigation in underwater communications. IEEE Trans. Veh. Technol. 2021, 70, 6018–6028.
7. Topple, J.M.; Fawcett, J.A. MiNet: Efficient deep learning automatic target recognition for small autonomous vehicles. IEEE Geosci. Remote Sens. Lett. 2020, 18, 1014–1018.
8. Lin, Z.; Ji, K.; Leng, X.; Kuang, G. Squeeze and excitation rank faster R-CNN for ship detection in SAR images. IEEE Geosci. Remote Sens. Lett. 2018, 16, 751–755.
9. Zhang, T.; Zhang, X.; Shi, J.; Wei, S. High-speed ship detection in SAR images by improved YOLOv3. In Proceedings of the 2019 16th International Computer Conference on Wavelet Active Media Technology and Information Processing, Chengdu, China, 13–15 December 2019; pp. 149–152.
10. Dobeck, G.J.; Hyland, J.C. Automated detection and classification of sea mines in sonar imagery. Proc. SPIE 1997, 3079, 90–110.
11. Wan, S.; Yeh, M.L.; Ma, H.L. An innovative intelligent system with integrated CNN and SVM: Considering various crops through hyperspectral image data. ISPRS Int. J. Geo-Inf. 2021, 10, 242.
12. Çelebi, A.T.; Güllü, M.K.; Ertürk, S. Mine detection in side scan sonar images using Markov Random Fields with brightness compensation. In Proceedings of the 2011 IEEE 19th Signal Processing and Communications Applications Conference (SIU), Antalya, Turkey, 20–22 April 2011; pp. 916–919.
13. Ye, X.; Li, C.; Zhang, S.; Yang, P.; Li, X. Research on side-scan sonar image target classification method based on transfer learning. In Proceedings of the OCEANS 2018 Conference, Charleston, SC, USA, 22–25 October 2018; pp. 1–6.
14. Huo, G.; Wu, Z.; Li, J. Underwater object classification in sidescan sonar images using deep transfer learning and semisynthetic training data. IEEE Access 2020, 8, 47407–47418.
15. Luo, X.; Qin, X.; Wu, Z.; Yang, F.; Wang, M.; Shang, J. Sediment classification of small-size seabed acoustic images using convolutional neural networks. IEEE Access 2019, 7, 98331–98339.
16. Qin, X.; Luo, X.; Wu, Z.; Shang, J. Optimizing the sediment classification of small side-scan sonar images based on deep learning. IEEE Access 2021, 9, 29416–29428.
17. Gerg, I.D.; Monga, V. Structural prior driven regularized deep learning for sonar image classification. IEEE Trans. Geosci. Remote Sens. 2021, 60, 4200416.
18. Zhang, P.; Tang, J.; Zhong, H.; Ning, M.; Liu, D.; Wu, K. Self-trained target detection of radar and sonar images using automatic deep learning. IEEE Trans. Geosci. Remote Sens. 2021, 60, 4701914.
19. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
20. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
21. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778.
22. Xu, S.; Qiu, X.; Wang, C.; Zhong, L.; Yuan, X. DesNet: Deep residual networks for descalloping of ScanSAR images. In Proceedings of the IGARSS 2018—2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 23–27 July 2018; pp. 8929–8932.
23. Wu, Z.; Shen, C.; Van Den Hengel, A. Wider or deeper: Revisiting the ResNet model for visual recognition. Pattern Recognit. 2019, 90, 119–133.
24. Qiu, C.; Zhou, W. A survey of recent advances in CNN-based fine-grained visual categorization. In Proceedings of the 2020 IEEE 20th International Conference on Communication Technology (ICCT), Nanning, China, 28–31 October 2020; pp. 1377–1384.
25. Fukushima, K.; Miyake, S.; Ito, T. Neocognitron: A neural network model for a mechanism of visual pattern recognition. IEEE Trans. Syst. Man Cybern. 1983, 13, 826–834.
26. LeCun, Y. Generalization and network design strategies. Connect. Perspect. 1989, 19, 18.
27. Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255.
28. He, K.; Girshick, R.; Dollár, P. Rethinking ImageNet pre-training. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 4918–4927.
29. Zhao, L.; Li, S. Object detection algorithm based on improved YOLOv3. Electronics 2020, 9, 537.
30. Xu, X.; Zhao, M.; Shi, P.; Ren, R.; He, X.; Wei, X.; Yang, H. Crack detection and comparison study based on Faster R-CNN and Mask R-CNN. Sensors 2022, 22, 1215.
31. Yulin, T.; Jin, S.; Bian, G.; Zhang, Y. Shipwreck target recognition in side-scan sonar images by improved YOLOv3 model based on transfer learning. IEEE Access 2020, 8, 173450–173460.
32. Ji-yang, Y.; Dan, H.; Lu-yuan, W.; Xin, L.; Wen-juan, L. On-board ship targets detection method based on multi-scale salience enhancement for remote sensing image. In Proceedings of the 2016 IEEE 13th International Conference on Signal Processing (ICSP), Chengdu, China, 6–10 November 2016; pp. 217–221.
33. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS’15), Montreal, QC, Canada, 7–12 December 2015; Volume 28.
34. Chandrashekar, G.; Raaza, A.; Rajendran, V.; Ravikumar, D. Side scan sonar image augmentation for sediment classification using deep learning based transfer learning approach. Mater. Today Proc. 2021.
35. Ge, Q.; Ruan, F.; Qiao, B.; Zhang, Q.; Zuo, X.; Dang, L. Side-scan sonar image classification based on style transfer and pre-trained convolutional neural networks. Electronics 2021, 10, 1823.
36. Li, Y.; Fang, C.; Yang, J.; Wang, Z.; Lu, X.; Yang, M.H. Diversified texture synthesis with feed-forward networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3920–3928.
37. Gatys, L.A.; Ecker, A.S.; Bethge, M. Image style transfer using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 2414–2423.
38. Gatys, L.; Ecker, A.S.; Bethge, M. Texture synthesis using convolutional neural networks. Adv. Neural Inf. Process. Syst. 2015, 28.
| Usage Step/Dataset | Subnetwork_1 | Subnetwork_2 | Subnetwork_3 | Subnetwork_4 |
|---|---|---|---|---|
| Step 1/ImageNet | Train | Train | Train | Train |
| Step 2/Synthetic data | Freeze | Train | Train | Train |
| Step 3/SAR | Freeze | Freeze | Train | Train |
| Step 4/SSS | Freeze | Freeze | Freeze | Train |
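The schedule in this table can be realized by toggling `requires_grad` per subnetwork between training stages. Below is a minimal PyTorch sketch under assumed names (`MultiStageNet`, `subnets`, and the layer sizes are illustrative, not the paper's architecture; per the comparison table, the paper's backbone has 152 layers).

```python
# Minimal PyTorch sketch of the freeze/train schedule; module names and
# layer sizes are illustrative, not the paper's backbone.
import torch
import torch.nn as nn


class MultiStageNet(nn.Module):
    """Four cascaded subnetworks feeding a small classification head."""
    def __init__(self, num_classes=3):
        super().__init__()
        self.subnets = nn.ModuleList(
            nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())
            for cin, cout in [(3, 32), (32, 64), (64, 128), (128, 128)]
        )
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(128, num_classes))

    def forward(self, x):
        for subnet in self.subnets:
            x = subnet(x)
        return self.head(x)


def set_stage(model, stage):
    """Stage 1 trains all subnetworks; each later stage freezes one more
    from the front, matching the table row for that stage's dataset."""
    for i, subnet in enumerate(model.subnets):
        for p in subnet.parameters():
            p.requires_grad = i >= stage - 1  # head stays trainable


model = MultiStageNet()
for stage, dataset in enumerate(["ImageNet", "synthetic", "SAR", "SSS"], 1):
    set_stage(model, stage)
    optimizer = torch.optim.SGD(
        (p for p in model.parameters() if p.requires_grad), lr=1e-3)
    # ... run the usual training loop on `dataset` for this stage ...
```

Rebuilding the optimizer at each stage over only the still-trainable parameters keeps frozen subnetworks out of the update entirely, which is the usual way to implement this kind of staged transfer.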
| Actual \ Prediction | 1 | 0 | Total |
|---|---|---|---|
| 1 | True positive (TP) | False negative (FN) | Actual positive (TP + FN) |
| 0 | False positive (FP) | True negative (TN) | Actual negative (FP + TN) |
| Total | Predicted positive (TP + FP) | Predicted negative (FN + TN) | TP + FN + FP + TN |
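For reference, the metrics reported in the following tables derive from this matrix by the standard definitions:

```latex
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{Recall}    = \frac{TP}{TP + FN}, \qquad
\mathrm{Accuracy}  = \frac{TP + TN}{TP + TN + FP + FN}
```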
| Method (Class) | Precision | Recall | Accuracy |
|---|---|---|---|
| DenseNet (plane) | 0.5386 | 0.5345 | 0.8672 |
| DenseNet (ship) | 0.8316 | 0.8190 | 0.8523 |
| DenseNet (others) | 0.9764 | 0.9597 | 0.9775 |
| Our Method | 1 | 1 | 1 |
Methods | Layer Number | Accuracy (%) |
---|---|---|
Shallow CNN [15] | 11 | 83.19 |
GoogleNet [16] | 22 | 91.86 |
VGG11 fine-tuning + semi-synthetic data [13] | 11 | 92.51 |
VGG19 fine-tuning [14] | 19 | 94.67 |
VGG19 fine-tuning + semi-synthetic data | 19 | 97.76 |
SPDRDL [17] | 46 | 97.38 |
FL-DARTS [18] | 50 | 99.07 |
Ours | 152 | 100 |
Backbone Networks | Accuracy (%) |
---|---|
AlexNet | 94.14 |
GoogleNet | 94.46 |
VGG16 | 94.5 |
VGG19 | 94.67 |
ResNet18 | 91.86 |
ResNet50 | 93.5 |
DenseNet | 94.14 |
Dataset Training Order | Accuracy (%) |
---|---|
SAR | 97.72 |
Optical | 97.12 |
SAR + Optical | 98.34 |
Optical + Synthetic Dataset + SAR + SSS (Our Method) | 100 |
Backbone Networks | Accuracy (%) |
---|---|
VGG | 95.5 |
ResNet | 92.68 |
DenseNet | 91.63 |
Ours | 95.92 |
Methods | Optimal OA (%) |
---|---|
DenseNet201 | 89.07 |
DenseNet121 | 88.87 |
DenseNet169 | 89.91 |
ResNet50 | 89.49 |
ResNet101 | 88.14 |
ResNet152 | 88.03 |
VGGNet16 | 90.63 |
VGGNet19 | 85.22 |
Proposed | 100 |