A Marine Organism Detection Framework Based on the Joint Optimization of Image Enhancement and Object Detection
Figure 1. The overall structure of the proposed framework.
Figure 2. The diagram of the blind denoising sub-module network.
Figure 3. Overall structure of the color correction sub-module based on generative adversarial networks.
Figure 4. The diagram of the multiscale-based deblurring network.
Figure 5. Residual feature extraction network with a combination of top-down and bottom-up paths.
Figure 6. Color cast correction results of different methods: (a) original image, (b) histogram equalization, (c) ground truth, and (d) proposed module.
Figure 7. Test results for the holothurian, echinus, scallop, and starfish.
Figure 8. Ablation experiment results: (a–d) show how the detection precision of each organism category in the data set changes with the IoU threshold across the six ablation groups.
Figure 9. Comparison of detection results from different models: (a) SSD, (b) Yolo_v3, (c) Faster RCNN, (d) Cascade RCNN, and (e) proposed framework.
Abstract
1. Introduction
- (1) An end-to-end underwater object detection framework is proposed that jointly optimizes the enhancement module and the detection module, reducing the large information loss of existing two-stage models that enhance first and then detect. Within the enhancement module, three sub-modules (denoising, color correction, and deblurring) simultaneously alleviate the three main factors that significantly degrade underwater imaging quality.
- (2) A feature pyramid network is introduced to address the detection difficulty caused by the uneven distribution of sizes and positions of different types of marine organisms in underwater images. Combining deep semantic information with shallow detail information at different levels yields high-quality feature extraction for marine organisms.
- (3) Dynamic label allocation, dynamic smoothing of the L1 loss, and dynamic adjustment of the IoU threshold are introduced because the clustering of marine organisms makes it difficult for the network to generate enough positive samples. These measures increase the contribution of positive samples during training and accelerate model training.
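The dynamic training ideas in contribution (3) can be illustrated with a minimal sketch. The text names the three mechanisms but does not give their update rules, so everything below is an assumption for illustration: the function names, the `top_k` parameter, and the choice of ranking statistics are hypothetical and are not taken from the authors' implementation.

```python
def smooth_l1(error, beta):
    """Smooth L1 loss for one regression error.

    beta is the switch point between the quadratic and linear regions;
    a smaller beta gives small (high-quality) errors a larger gradient.
    """
    abs_err = abs(error)
    if abs_err < beta:
        return 0.5 * abs_err ** 2 / beta
    return abs_err - 0.5 * beta


def update_iou_threshold(proposal_ious, top_k, floor=0.5):
    """Assumed rule: move the positive-sample IoU threshold to the
    top_k-th largest proposal IoU, never dropping below `floor`."""
    ranked = sorted(proposal_ious, reverse=True)
    if not ranked:
        return floor
    kth = ranked[min(top_k, len(ranked)) - 1]
    return max(floor, kth)


def update_beta(regression_errors, top_k, ceiling=1.0):
    """Assumed rule: shrink beta to the top_k-th smallest absolute
    regression error, never rising above `ceiling`."""
    ranked = sorted(abs(e) for e in regression_errors)
    if not ranked:
        return ceiling
    kth = ranked[min(top_k, len(ranked)) - 1]
    return min(ceiling, kth)


if __name__ == "__main__":
    ious = [0.42, 0.55, 0.61, 0.73, 0.80]   # hypothetical proposal IoUs
    errors = [0.9, -0.2, 0.05, 0.4]         # hypothetical regression errors
    print(update_iou_threshold(ious, top_k=2))
    print(update_beta(errors, top_k=2))
```

As proposals improve during training, a rule of this kind raises the IoU threshold and lowers `beta`, so that the growing pool of high-quality positive samples contributes more to the loss.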
2. Detection Framework
2.1. Overall Structure
2.2. Underwater Image Enhancement Module
2.2.1. Denoising Sub-Module
2.2.2. Color Correction Sub-Module
2.2.3. Deblurring Sub-Module
2.3. Feature Extraction Network
2.4. Detecting Networks
3. Training Processes
3.1. Pre-Training of the UIEM
3.1.1. Pre-Training of the Denoising Sub-Module
3.1.2. Pre-Training of the Color Correction Sub-Module
3.1.3. Pre-Training of the Deblurring Sub-Module
3.2. Detection Network
4. Experiment Details
4.1. Evaluation Indicator
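- Recall: TP / (TP + FN), the fraction of ground-truth objects that are correctly detected, where TP is the number of true positives and FN is the number of false negatives.
- Precision: TP / (TP + FP), the fraction of detections that are correct, where FP is the number of false positives.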
- Average precision (AP): the average of the precisions of a given category over IoU thresholds from 0.5 to 0.95, taken at intervals of 0.05.
- Mean average precision (mAP): the average of the APs of all detection categories at a given IoU threshold.
- mAP@[0.5:0.05:0.95]: the average of the mAP values over the IoU thresholds from 0.5 to 0.95 in steps of 0.05.
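The indicators above can be sketched in a few lines of Python. This is a minimal illustration that uses the standard definitions of IoU, precision, and recall; the function names and the `(x1, y1, x2, y2)` box format are assumptions, not the evaluation code used in the experiments.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def iou_thresholds(start=0.5, stop=0.95, step=0.05):
    """IoU thresholds 0.50, 0.55, ..., 0.95."""
    count = round((stop - start) / step) + 1
    return [round(start + i * step, 2) for i in range(count)]


def mean_over_thresholds(map_by_threshold):
    """mAP@[0.5:0.05:0.95]: average of the per-threshold mAP values."""
    return sum(map_by_threshold.values()) / len(map_by_threshold)


if __name__ == "__main__":
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))   # overlap 1, union 7
    print(precision_recall(tp=8, fp=2, fn=4))
    print(iou_thresholds())
```

A detection counts as a true positive at a given threshold when its IoU with a matching ground-truth box is at least that threshold, which is why precision, and therefore mAP, falls as the threshold rises.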
4.2. Data Sets
5. Experimental Results
5.1. Experimental Results Obtained with the Underwater Data Set
5.2. Ablation Experiment
5.3. Comparative Test
6. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Class | Ground Truths | Detections | Recall | AP |
---|---|---|---|---|
Holothurian | 626 | 1849 | 0.759 | 0.650 |
Echinus | 3228 | 9997 | 0.900 | 0.777 |
Scallop | 4206 | 8576 | 0.735 | 0.594 |
Starfish | 1834 | 3367 | 0.827 | 0.755 |
Class | IoU = 0.50 | IoU = 0.55 | IoU = 0.60 | IoU = 0.65 | IoU = 0.70 | IoU = 0.75 | IoU = 0.80 | mAP@[0.5:0.05:0.95] |
---|---|---|---|---|---|---|---|---|
Holothurian | 0.650 | 0.635 | 0.622 | 0.566 | 0.514 | 0.429 | 0.296 | |
Echinus | 0.777 | 0.768 | 0.752 | 0.711 | 0.610 | 0.435 | 0.287 | |
Scallop | 0.594 | 0.585 | 0.537 | 0.514 | 0.446 | 0.346 | 0.239 | |
Starfish | 0.755 | 0.744 | 0.687 | 0.666 | 0.580 | 0.469 | 0.355 | |
mAP | 0.694 | 0.683 | 0.650 | 0.614 | 0.538 | 0.420 | 0.295 | 0.408 |
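As a worked check of how the mAP row relates to the class rows, the sketch below averages the per-class values reported in the table above at IoU thresholds 0.50 and 0.55. The variable and function names are illustrative only.

```python
# Per-class AP values copied from the table above.
ap_at_050 = {"Holothurian": 0.650, "Echinus": 0.777,
             "Scallop": 0.594, "Starfish": 0.755}
ap_at_055 = {"Holothurian": 0.635, "Echinus": 0.768,
             "Scallop": 0.585, "Starfish": 0.744}


def mean_ap(ap_by_class):
    """mAP at one IoU threshold: the mean of the per-class APs."""
    return sum(ap_by_class.values()) / len(ap_by_class)


if __name__ == "__main__":
    print(f"{mean_ap(ap_at_050):.3f}")  # 0.694
    print(f"{mean_ap(ap_at_055):.3f}")  # 0.683
```

Both results agree with the reported mAP row, which confirms that each mAP entry is the unweighted mean of the four class values at that threshold.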
Group | Denoising | Deblurring | Color Correction | mAP@0.5 | mAP@0.7 |
---|---|---|---|---|---|
① | | | | 0.634 | 0.477 |
② | √ | 0.656 | 0.506 | ||
③ | √ | 0.655 | 0.501 | ||
④ | √ | 0.638 | 0.495 | ||
⑤ | √ | √ | 0.662 | 0.515 | |
⑥ | √ | √ | 0.657 | 0.507 | |
⑦ | √ | √ | √ | 0.694 | 0.538 |
Model | Holothurian (IoU = 0.5) | Echinus (IoU = 0.5) | Scallop (IoU = 0.5) | Starfish (IoU = 0.5) | mAP (IoU = 0.5) | Holothurian (IoU = 0.7) | Echinus (IoU = 0.7) | Scallop (IoU = 0.7) | Starfish (IoU = 0.7) | mAP (IoU = 0.7) | mAP@[0.5:0.05:0.95] |
---|---|---|---|---|---|---|---|---|---|---|---|
SSD | 0.487 | 0.682 | 0.421 | 0.676 | 0.567 | 0.352 | 0.442 | 0.253 | 0.462 | 0.377 | 0.309 |
Yolo_v3 | 0.577 | 0.691 | 0.442 | 0.599 | 0.577 | 0.368 | 0.442 | 0.297 | 0.392 | 0.375 | 0.305 |
Faster RCNN | 0.587 | 0.767 | 0.489 | 0.688 | 0.633 | 0.433 | 0.543 | 0.358 | 0.526 | 0.465 | 0.358 |
Cascade RCNN | 0.589 | 0.767 | 0.495 | 0.685 | 0.634 | 0.437 | 0.549 | 0.366 | 0.556 | 0.477 | 0.373 |
Proposed framework | 0.650 | 0.777 | 0.595 | 0.755 | 0.694 | 0.514 | 0.610 | 0.446 | 0.580 | 0.538 | 0.408 |
Model | Holothurian (IoU = 0.5) | Echinus (IoU = 0.5) | Scallop (IoU = 0.5) | Starfish (IoU = 0.5) | Holothurian (IoU = 0.7) | Echinus (IoU = 0.7) | Scallop (IoU = 0.7) | Starfish (IoU = 0.7) |
---|---|---|---|---|---|---|---|---|
SSD | 0.740 | 0.889 | 0.702 | 0.808 | 0.490 | 0.614 | 0.401 | 0.568 |
Yolo_v3 | 0.577 | 0.691 | 0.442 | 0.599 | 0.368 | 0.442 | 0.297 | 0.392 |
Faster RCNN | 0.716 | 0.859 | 0.621 | 0.774 | 0.564 | 0.681 | 0.483 | 0.619 |
Cascade RCNN | 0.728 | 0.847 | 0.633 | 0.785 | 0.567 | 0.681 | 0.499 | 0.637 |
Proposed framework | 0.759 | 0.900 | 0.735 | 0.827 | 0.604 | 0.722 | 0.588 | 0.690 |
Model | SSD | Yolo_v3 | Faster RCNN | Cascade RCNN | Proposed Framework |
---|---|---|---|---|---|
FPS | 57 | 70 | 43 | 38 | 41 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, X.; Fang, X.; Pan, M.; Yuan, L.; Zhang, Y.; Yuan, M.; Lv, S.; Yu, H. A Marine Organism Detection Framework Based on the Joint Optimization of Image Enhancement and Object Detection. Sensors 2021, 21, 7205. https://doi.org/10.3390/s21217205