Detection of River Plastic Using UAV Sensor Data and Deep Learning
"> Figure 1
<p>Location of study sites (Background map: OpenStreetMap, 2021).</p> "> Figure 2
<p>Study area showing Houay Mak Hiao River, Vientiane, Laos. (Background map: OpenStreetMap, 2021).</p> "> Figure 3
<p>Study area showing Khlong Nueng, Talad Thai, Pathum Thani, Thailand (Background map: OpenStreetMap, 2021).</p> "> Figure 4
<p>Methodological framework for assessment of performance of deep learning architectures for plastic detection.</p> "> Figure 5
<p>Sample images from datasets used for training deep learning models for plastic detection in rivers. (<b>a</b>) HMH in Laos with co-ordinates (887,503.069 m, 1,995,416.74 m); (887,501.986 m, 1,995,416.537 m); and (887,501.418 m, 1,995,417.692 m) (<b>b</b>) TT in Thailand with co-ordinates 674,902.457 m, 1,557,870.257 m); (674,903.403 m, 1,557,860.135 m); and (674,925.317 m, 1,557,850.965 m) under WGS_1984_UTM_Zone_47N.</p> "> Figure 6
<p>Experiment VII results. Smallest and largest plastics detected. (<b>a</b>) HMH. (<b>b</b>) TT. (<b>c</b>) Transfer from TT to HMH. (<b>d</b>) Transfer from HMH to TT. For reference, the actual dimensions of a 600 mL bottle of water are 23 × 5 cm = 75 cm<sup>2</sup>.</p> "> Figure 7
<p>The HMH model fine-tuned on TT performs well in some cases. (<b>a</b>) TT model result on TT. (<b>b</b>) HMH model results on TT with fine-tuning. (Note: bar-like objects are galvanized stainless steel roof sheets).</p> "> Figure 8
<p>Fine-tuning the HMH model on TT is weak in some cases. (<b>a</b>) TT model result on TT. (<b>b</b>) HMH model results on TT with fine-tuning. Transfer learning confidence scores are lower. (Note: bar-like objects are galvanized stainless steel roof sheets).</p> "> Figure 9
<p>Both the TT model and the HMH model transferred to TT fail in some cases. Neither model detected any plastic in these images from TT.</p> "> Figure 10
<p>The TT model fine-tuned on HMH performs well in some cases. (<b>a</b>) HMH model result on HMH. (<b>b</b>) TT model results on HMH with fine-tuning.</p> "> Figure 11
<p>Fine-tuning the TT model on HMH is weak or fails in some cases. (<b>a</b>) HMH model result on HMH. (<b>b</b>) TT model results on HMH with fine-tuning.</p> "> Figure 12
<p>Both the HMH model and the TT model with transfer learning fail in some cases. Neither model detected any plastic in these images.</p> ">
Abstract
1. Introduction
- higher baseline performance;
- less time to develop the model;
- better final performance.
- We examine the performance of object detection models in the You Only Look Once (YOLO) family for plastic detection in ortho imagery acquired by low-altitude UAVs.
- We examine the transferability of the knowledge encapsulated in a detection model from one location to another.
- We contribute a new, publicly available dataset of annotated images for developing and evaluating riverine plastic monitoring systems.
2. Materials and Methods
2.1. Study Area
2.2. Materials
2.3. Methodology
2.3.1. Deep Learning Models for Object Detection
2.3.2. Selection of Object Detection Models
2.3.3. Transfer Learning
2.3.4. Performance Assessment of Transfer Learning
- Data preparation: Prepare the dataset in the appropriate format (e.g., DarkNet format for YOLOv4-tiny and PyTorch format for YOLOv5s) and then split it into training and validation sets (a label-format sketch follows this list).
- Input: Prepare images and label files for the training and validation datasets, along with the pre-trained weights and the configuration file for training.
- Output: Save the trained model to a file containing the optimized weights.
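For concreteness, the DarkNet-style label format expected by the YOLO trainers stores one object per line as `class x_center y_center width height`, with all coordinates normalized to [0, 1]. The sketch below illustrates the conversion; the image size and box coordinates are illustrative only.

```python
# Minimal sketch: convert a pixel-space bounding box into the normalized
# DarkNet/YOLO label format "class x_center y_center width height".
def to_yolo_label(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# Example: a plastic item (class 0) spanning pixels (120, 340)-(180, 410)
# in a 1920 x 1080 image.
print(to_yolo_label(0, 120, 340, 180, 410, 1920, 1080))
# -> 0 0.078125 0.347222 0.031250 0.064815
```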
- (A)
- Training models from pre-trained networks (S1):
- Load pre-trained weights (optimized for the COCO dataset) into the model.
- Freeze the initial N1 layers and unfreeze the last N2 layers of the model (a fine-tuning sketch follows this list).
- Select a hyperparameter configuration from Table 1.
- Train the model and stop training when average loss stops decreasing.
- Record final average loss.
- Repeat steps iii–v for all combinations of hyperparameters.
- Select the model with hyperparameters that achieve the lowest average loss.
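A minimal PyTorch-style sketch of steps i and ii of (A) is given below. It is illustrative only: the torch.hub entry point and the choice of ten unfrozen parameter tensors are assumptions, and in practice the equivalent behavior is obtained through each framework's own options (the YOLOv5 repository, for instance, provides a `--freeze` argument in train.py).

```python
import torch
import torch.nn as nn

def freeze_all_but_last(model: nn.Module, n_unfrozen: int) -> None:
    """Freeze every parameter tensor except the last n_unfrozen (the 'head')."""
    params = list(model.parameters())
    for p in params[:-n_unfrozen]:
        p.requires_grad = False   # keep COCO-learned features fixed
    for p in params[-n_unfrozen:]:
        p.requires_grad = True    # fine-tune only the last layers

# Load a COCO-pre-trained YOLOv5s checkpoint via torch.hub (network access
# and the ultralytics/yolov5 repository are assumed to be available).
model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
freeze_all_but_last(model, n_unfrozen=10)

# Optimize only the trainable (unfrozen) parameters.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.01, momentum=0.9
)
```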
- (B)
- Training from scratch (S2):
- Load the pre-trained weights (trained on COCO dataset).
- Unfreeze all layers and initialize the weights to random values drawn from Gaussian distributions with mean zero and standard deviation √(2/n), where n is the unit’s fan-in (number of input units). This initialization keeps the scale of the initial layer outputs under control and empirically improves convergence [63] (an initialization sketch follows this list).
- Select a subset of hyperparameters from Table 1.
- Train the model and stop training when average loss stops decreasing.
- Record average loss.
- Repeat steps iii–v for all combinations of hyperparameters.
- Select the model with hyperparameters that achieve the lowest average loss.
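Step ii of (B), the re-initialization of weights from a zero-mean Gaussian with standard deviation √(2/n), is the He initialization of [63]. A minimal PyTorch sketch, applied to generic convolutional and fully connected layers, is:

```python
import torch.nn as nn

def reinitialize_weights(model: nn.Module) -> None:
    """Re-initialize conv/linear weights from N(0, 2 / fan_in), biases to zero."""
    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.Linear)):
            # kaiming_normal_ with mode="fan_in" and nonlinearity="relu"
            # draws weights from a Gaussian with std = sqrt(2 / fan_in).
            nn.init.kaiming_normal_(m.weight, mode="fan_in", nonlinearity="relu")
            if m.bias is not None:
                nn.init.zeros_(m.bias)
```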
- (C)
- Transfer learning:
- Collect best weights for each model and each type of training at one location.
- Load the best weights for one location and one model.
- Freeze initial N1 layers and fine-tune the last N2 layers.
- Select a subset of hyperparameters from Table 1.
- Train the model in a new location and stop training when average loss stops decreasing.
- Calculate average loss.
- Repeat steps iv–vi for all combinations of hyperparameters, for all models (the shared hyperparameter sweep is sketched below).
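The hyperparameter sweep shared by procedures (A)–(C) is, in effect, a grid search with early stopping on the average training loss. The sketch below is schematic: the grid values follow Table 1, and `train_fn` is a stand-in for whichever training routine (DarkNet- or PyTorch-based) is being tuned.

```python
import itertools

# Grid values taken from Table 1; treat them as the search space.
BATCH_SIZES = [16, 32, 64, 128]
LEARNING_RATES = [0.01, 0.001]

def grid_search(train_fn):
    """Train once per hyperparameter combination and keep the lowest loss.

    train_fn(batch_size, lr) is assumed to train until the average loss
    stops decreasing and to return that final average loss.
    """
    best_config, best_loss = None, float("inf")
    for batch_size, lr in itertools.product(BATCH_SIZES, LEARNING_RATES):
        loss = train_fn(batch_size, lr)
        if loss < best_loss:
            best_config, best_loss = (batch_size, lr), loss
    return best_config, best_loss
```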
2.3.5. Performance Indicators
- (A)
- Mean Average Precision (mAP): the mean, over object classes, of the average precision computed at an IoU threshold of 0.5 (see the definitions below).
- (B)
- F1-Score: the harmonic mean of precision and recall (see the definitions below).
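For reference, the standard definitions behind these indicators are given below, with true positives counted when a detection overlaps a ground-truth box at IoU ≥ 0.5, as in the result tables:

```latex
\mathrm{IoU} = \frac{\text{area of overlap}}{\text{area of union}}, \qquad
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{Recall} = \frac{TP}{TP + FN}

F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}, \qquad
AP = \int_{0}^{1} p(r)\, dr, \qquad
mAP = \frac{1}{N} \sum_{i=1}^{N} AP_i
```

Here p(r) is the precision as a function of recall and N is the number of object classes; with a single plastic class, mAP coincides with AP.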
3. Results
3.1. Dataset Preparation
3.2. Experimental Parameter Sets
3.3. Experiments I, II, III, and IV: Plastic Detection in UAV Imagery
3.4. Experiments V and VI: Transfer Learning from One Location to Another
3.5. Experiment VII: Estimation of Plastic Volume in Different Detection Cases
4. Discussion
4.1. Analysis of Sample Plastic Detection Cases with/without Transfer Learning from HMH to TT
4.2. Analysis of Sample Plastic Detection Cases with/without Transfer Learning from TT to HMH
4.3. Analysis of Performance of YOLO Models for Detection
4.4. Challenges in Plastic Detection and Future Opportunities for Improvement
5. Conclusions
- Our experiments provide insight into the UAV imaging spatial resolution and the computational capacity required to train YOLO deep learning models for precise plastic detection.
- Transfer learning from one location to another with fine-tuning improves performance.
- Detection ability depends on several attributes of the imaged objects, including the type of plastic as well as its brightness, shape, size, and color.
- The datasets used in this research can also serve as references for plastic detection in other regions.
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
AP | Average Precision |
AUVs | Autonomous Underwater Vehicles |
CNNs | Convolutional Neural Networks |
COCO | Microsoft Common Objects in Context |
CSM | Class-specific Semantic enhancement Module |
CSP | Cross Stage Partial |
DETR | Detection Transformer |
DL | Deep Learning |
FDI | Floating Debris Index |
FN | False Negative |
FP | False Positive |
FPS | Frames Per Second |
GFLOPs | Giga Floating-point Operations (billions of floating-point operations) |
GNSS | Global Navigation Satellite System |
GPS | Global Positioning System |
GPU | Graphics Processing Unit |
GSD | Ground Sampling Distance |
HMH | Houay Mak Hiao |
ILSVRC2013 | ImageNet Large Scale Visual Recognition Challenge 2013 |
IoU | Intersection over Union |
J-EDI | JAMSTEC E-Library of Deep-sea Images |
mAP | Mean Average Precision |
NIR | Near Infrared |
PANet | Path Aggregation Network |
R-CNN | Region-Based Convolutional Neural Networks |
RNN | Recurrent Neural Network |
R2 IPoints | Rotation-Insensitive Points |
ROVs | Remotely Operated Vehicles |
SAM | Sample Angle Mapping |
SPP | Spatial Pyramid Pooling |
SRM | Stacked rotation convolution module |
SSD | Single Shot Detector |
SWIR | Short-wave Infrared |
TP | True Positive |
TT | Talad Thai |
TACO | Trash Annotations in Context Dataset |
UAVs | Unmanned Aerial Vehicles |
UNEP | United Nations Environment Programme |
VGG-16 | Visual Geometry Group-16 |
YOLO | You Only Look Once |
References
- Kershaw, P. Marine Plastic Debris and Microplastics–Global Lessons and Research to Inspire Action and Guide Policy Change; United Nations Environment Programme: Nairobi, Kenya, 2016. [Google Scholar]
- Lebreton, L.C.M.; van der Zwet, J.; Damsteeg, J.W.; Slat, B.; Andrady, A.; Reisser, J. River plastic emissions to the world’s oceans. Nat. Commun. 2017, 8, 15611. [Google Scholar] [CrossRef] [PubMed]
- Jambeck, J.R.; Geyer, R.; Wilcox, C.; Siegler, T.R.; Perryman, M.; Andrady, A.; Narayan, R.; Law, K.L. Plastic waste inputs from land into the ocean. Science 2015, 347, 768–771. [Google Scholar] [CrossRef] [PubMed]
- Blettler, M.C.M.; Abrial, E.; Khan, F.R.; Sivri, N.; Espinola, L.A. Freshwater plastic pollution: Recognizing research biases and identifying knowledge gaps. Water Res. 2018, 143, 416–424. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Moore, C.J.; Lattin, G.L.; Zellers, A.F. Quantity and type of plastic debris flowing from two urban rivers to coastal waters and beaches of Southern California. J. Integr. Coast. Zone Manag. 2011, 11, 65–73. [Google Scholar]
- Gasperi, J.; Dris, R.; Bonin, T.; Rocher, V.; Tassin, B. Assessment of floating plastic debris in surface water along the seine river. Environ. Pollut. 2014, 195, 163–166. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Yao, X.; Wang, N.; Liu, Y.; Cheng, T.; Tian, Y.; Chen, Q.; Zhu, Y. Estimation of wheat LAI at middle to high levels using unmanned aerial vehicle narrowband multispectral imagery. Remote Sens. 2017, 9, 1304. [Google Scholar] [CrossRef] [Green Version]
- Papakonstantinou, A.; Kavroudakis, D.; Kourtzellis, Y.; Chtenellis, M.; Kopsachilis, V.; Topouzelis, K.; Vaitis, M. Mapping cultural heritage in coastal areas with UAS: The case study of Lesvos Island. Heritage 2019, 2, 1404–1422. [Google Scholar] [CrossRef] [Green Version]
- Watts, A.C.; Ambrosia, V.G.; Hinkley, E.A. Unmanned aircraft systems in remote sensing and scientific research: Classification and considerations of use. Remote Sens. 2012, 4, 1671–1692. [Google Scholar] [CrossRef] [Green Version]
- Shakhatreh, H.; Sawalmeh, A.; Al-Fuqaha, A.; Dou, Z.; Almaita, E.; Khalil, I.; Othman, N.S.; Khreishah, A.; Guizani, M. Unmanned aerial vehicles: A survey on civil applications and key research challenges. IEEE Access 2018, 7, 48572–48634. [Google Scholar] [CrossRef]
- Reynaud, L.; Rasheed, T. Deployable aerial communication networks: Challenges for futuristic applications. In Proceedings of the 9th ACM Symposium on Performance Evaluation of Wireless Ad Hoc, Sensor, and Ubiquitous Networks, Paphos, Cyprus, 24–25 October 2012. [Google Scholar]
- Colomina, I.; Molina, P. Unmanned aerial systems for photogrammetry and remote sensing: A review. ISPRS J. Photogramm. Remote Sens. 2014, 92, 79–97. [Google Scholar] [CrossRef] [Green Version]
- Mugnai, F.; Longinotti, P.; Vezzosi, F.; Tucci, G. Performing low-altitude photogrammetric surveys, a comparative analysis of user-grade unmanned aircraft systems. Appl. Geomat. 2022, 14, 211–223. [Google Scholar] [CrossRef]
- Martin, C.; Zhang, Q.; Zhai, D.; Zhang, X.; Duarte, C.M. Enabling a large-scale assessment of litter along Saudi Arabian Red Sea shores by combining drones and machine learning. Environ. Pollut. 2021, 277, 116730. [Google Scholar] [CrossRef]
- Merlino, S.; Paterni, M.; Berton, A.; Massetti, L. Unmanned aerial vehicles for debris survey in coastal areas: Long-term monitoring programme to study spatial and temporal accumulation of the dynamics of beached marine litter. Remote Sens. 2020, 12, 1260. [Google Scholar] [CrossRef] [Green Version]
- Andriolo, U.; Gonçalves, G.; Rangel-Buitrago, N.; Paterni, M.; Bessa, F.; Gonçalves, L.M.S.; Sobral, P.; Bini, M.; Duarte, D.; Fontán-Bouzas, Á.; et al. Drones for litter mapping: An inter-operator concordance test in marking beached items on aerial images. Mar. Pollut. Bull. 2021, 169, 112542. [Google Scholar] [CrossRef] [PubMed]
- Pinto, L.; Andriolo, U.; Gonçalves, G. Detecting stranded macro-litter categories on drone orthophoto by a multi-class neural network. Mar. Pollut. Bull. 2021, 169, 112594. [Google Scholar] [CrossRef]
- Deidun, A.; Gauci, A.; Lagorio, S.; Galgani, F. Optimising beached litter monitoring protocols through aerial imagery. Mar. Pollut. Bull. 2018, 131, 212–217. [Google Scholar] [CrossRef]
- Fallati, L.; Polidori, A.; Salvatore, C.; Saponari, L.; Savini, A.; Galli, P. Anthropogenic marine debris assessment with unmanned aerial vehicle imagery and deep learning: A case study along the beaches of the Republic of Maldives. Sci. Total Environ. 2019, 693, 133581. [Google Scholar] [CrossRef]
- Martin, C.; Parkes, S.; Zhang, Q.; Zhang, X.; McCabe, M.F.; Duarte, C.M. Use of unmanned aerial vehicles for efficient beach litter monitoring. Mar. Pollut. Bull. 2018, 131, 662–673. [Google Scholar] [CrossRef] [Green Version]
- Nelms, S.E.; Coombes, C.; Foster, L.C.; Galloway, T.S.; Godley, B.J.; Lindeque, P.K.; Witt, M.J. Marine anthropogenic litter on british beaches: A 10-year nationwide assessment using citizen science data. Sci. Total Environ. 2017, 579, 1399–1409. [Google Scholar] [CrossRef] [Green Version]
- Andriolo, U.; Gonçalves, G.; Sobral, P.; Bessa, F. Spatial and size distribution of macro-litter on coastal dunes from drone images: A case study on the Atlantic Coast. Mar. Pollut. Bull. 2021, 169, 112490. [Google Scholar] [CrossRef]
- Andriolo, U.; Gonçalves, G.; Sobral, P.; Fontán-Bouzas, Á.; Bessa, F. Beach-dune morphodynamics and marine macro-litter abundance: An integrated approach with unmanned aerial system. Sci. Total Environ. 2020, 749, 432–439. [Google Scholar] [CrossRef] [PubMed]
- Andriolo, U.; Garcia-Garin, O.; Vighi, M.; Borrell, A.; Gonçalves, G. Beached and floating litter surveys by unmanned aerial vehicles: Operational analogies and differences. Remote Sens. 2022, 14, 1336. [Google Scholar] [CrossRef]
- Papakonstantinou, A.; Batsaris, M.; Spondylidis, S.; Topouzelis, K. A citizen science unmanned aerial system data acquisition protocol and deep learning techniques for the automatic detection and mapping of marine litter concentrations in the coastal zone. Drones 2021, 5, 6. [Google Scholar] [CrossRef]
- Merlino, S.; Paterni, M.; Locritani, M.; Andriolo, U.; Gonçalves, G.; Massetti, L. Citizen science for marine litter detection and classification on unmanned aerial vehicle images. Water 2021, 13, 3349. [Google Scholar] [CrossRef]
- Ham, S.; Oh, Y.; Choi, K.; Lee, I. Semantic segmentation and unregistered building detection from UAV images using a deconvolutional network. In Proceedings of the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences—ISPRS Archives; International Society for Photogrammetry and Remote Sensing, Nice, France, 30 May 2018; Volume 42, pp. 419–424. [Google Scholar]
- Kamilaris, A.; Prenafeta-Boldú, F.X. Disaster Monitoring using unmanned aerial vehicles and deep learning. arXiv 2018, arXiv:1807.11805. [Google Scholar]
- Zeggada, A.; Benbraika, S.; Melgani, F.; Mokhtari, Z. Multilabel conditional random field classification for UAV images. IEEE Geosci. Remote Sens. Lett. 2018, 15, 399–403. [Google Scholar] [CrossRef]
- Zhao, Z.; Zheng, P.; Xu, S.; Wu, X. Object detection with deep learning: A review. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 3212–3232. [Google Scholar] [CrossRef] [Green Version]
- Viola, P.; Jones, M.J. Robust real-time object detection. In Proceedings of the Workshop on Statistical and Computational Theories of Vision, Cambridge Research Laboratory, Cambridge, MA, USA, 25 February 2001; Volume 266, p. 56. [Google Scholar]
- Längkvist, M.; Kiselev, A.; Alirezaie, M.; Loutfi, A. Classification and segmentation of satellite orthoimagery using convolutional neural networks. Remote Sens. 2016, 8, 329. [Google Scholar] [CrossRef] [Green Version]
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef]
- Sermanet, P.; Eigen, D.; Zhang, X.; Mathieu, M.; Fergus, R.; LeCun, Y. OverFeat: Integrated recognition, localization and detection using convolutional networks. arXiv 2013, arXiv:1312.6229. [Google Scholar]
- Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533. [Google Scholar] [CrossRef] [PubMed]
- Maitra, D.S.; Bhattacharya, U.; Parui, S.K. CNN based common approach to handwritten character recognition of multiple scripts. In Proceedings of the International Conference on Document Analysis and Recognition, ICDAR; IEEE Computer Society, Tunis, Tunisia, 23–26 August 2015; Volume 2015, pp. 1021–1025. [Google Scholar]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 142–158. [Google Scholar] [CrossRef] [PubMed]
- Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271. [Google Scholar]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
- Lin, M.; Chen, Q.; Yan, S. Network in network. arXiv 2013, arXiv:1312.4400.b. [Google Scholar]
- Sarkar, P.; Gupta, M.A. Object Recognition with Text and Vocal Representation. Int. J. Eng. Res. Appl. 2020, 10, 63–77. [Google Scholar]
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with Convolutions. arXiv 2014, arXiv:1409.4842. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Redmon, J.; Farhadi, A. YOLOv3: An incremental improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
- Salimi, I.; Bayu Dewantara, B.S.; Wibowo, I.K. Visual-based trash detection and classification system for smart trash bin robot. In Proceedings of the 2018 International Electronics Symposium on Knowledge Creation and Intelligent Computing (IES-KCIC), Bali, Indonesia, 29–30 October 2018; pp. 378–383. [Google Scholar]
- Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. YOLOv4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
- Yao, X.; Shen, H.; Feng, X.; Cheng, G.; Han, J. R2 IPoints: Pursuing rotation-insensitive point representation for aerial object detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5623512. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 6000–6010. [Google Scholar]
- Bazi, Y.; Bashmal, L.; al Rahhal, M.M.; al Dayil, R.; al Ajlan, N. Vision Transformers for Remote Sensing Image Classification. Remote Sens. 2021, 13, 516. [Google Scholar] [CrossRef]
- Zhu, X.; Su, W.; Lu, L.; Li, B.; Wang, X.; Dai, J. Deformable DETR: Deformable transformers for end-to-end object detection. arXiv 2020, arXiv:2010.04159. [Google Scholar]
- Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-end object detection with transformers. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 213–229. [Google Scholar]
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11 October 2021. [Google Scholar]
- Touvron, H.; Cord, M.; Douze, M.; Massa, F.; Sablayrolles, A.; Jégou, H. Training data-efficient image transformers & distillation through attention. arXiv 2021, arXiv:2012.12877. [Google Scholar]
- Majchrowska, S.; Mikołajczyk, A.; Ferlin, M.; Klawikowska, Z.; Plantykow, M.A.; Kwasigroch, A.; Majek, K. Deep learning-based waste detection in natural and urban environments. Waste Manag. 2022, 138, 274–284. [Google Scholar] [CrossRef]
- Córdova, M.; Pinto, A.; Hellevik, C.C.; Alaliyat, S.A.A.; Hameed, I.A.; Pedrini, H.; da Torres, R.S. Litter detection with deep learning: A comparative study. Sensors 2022, 22, 548. [Google Scholar] [CrossRef]
- Kraft, M.; Piechocki, M.; Ptak, B.; Walas, K. Autonomous, onboard vision-based trash and litter detection in low altitude aerial images collected by an unmanned aerial vehicle. Remote Sens. 2021, 13, 965. [Google Scholar] [CrossRef]
- Tan, M.; Pang, R.; Le, Q.V. EfficientDet: Scalable and efficient object detection. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10778–10787. [Google Scholar] [CrossRef]
- Kumar, S.; Yadav, D.; Gupta, H.; Verma, O.P.; Ansari, I.A.; Ahn, C.W. A novel YOLOv3 algorithm-based deep learning approach for waste segregation: Towards smart waste management. Electronics 2021, 10, 14. [Google Scholar] [CrossRef]
- Fulton, M.; Hong, J.; Islam, M.J.; Sattar, J. Robotic detection of marine litter using deep visual detection models. arXiv 2018, arXiv:1804.01079. [Google Scholar]
- Tata, G.; Royer, S.-J.; Poirion, O.; Lowe, J. A robotic approach towards quantifying epipelagic bound plastic using deep visual models. arXiv 2021, arXiv:2105.01882. [Google Scholar]
- Luo, W.; Han, W.; Fu, P.; Wang, H.; Zhao, Y.; Liu, K.; Liu, Y.; Zhao, Z.; Zhu, M.; Xu, R.; et al. A water surface contaminants monitoring method based on airborne depth reasoning. Processes 2022, 10, 131. [Google Scholar] [CrossRef]
- Pati, B.M.; Kaneko, M.; Taparugssanagorn, A. A deep convolutional neural network based transfer learning method for non-cooperative spectrum sensing. IEEE Access 2020, 8, 164529–164545. [Google Scholar] [CrossRef]
- Huang, Z.; Pan, Z.; Lei, B. Transfer learning with deep convolutional neural network for SAR target classification with limited labeled data. Remote Sens. 2017, 9, 907. [Google Scholar] [CrossRef] [Green Version]
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440. [Google Scholar] [CrossRef] [Green Version]
- Li, L.; Zhang, S.; Wu, J. Efficient object detection framework and hardware architecture for remote sensing images. Remote Sens. 2019, 11, 2376. [Google Scholar] [CrossRef] [Green Version]
- Boutell, M.R.; Luo, J.; Shen, X.; Brown, C.M. Learning multi-label scene classification. Pattern Recognit. 2004, 37, 1757–1771. [Google Scholar] [CrossRef] [Green Version]
- Russell, S.; Norvig, P. Artificial Intelligence: A Modern Approach, 3rd ed.; Pearson Education, Inc.: Upper Saddle River, NJ, USA, 2009. [Google Scholar]
- Kwon, Y. Yolo_Label: GUI for Marking Bounded Boxes of Objects in Images for Training Neural Network Yolo v3 and v2. Available online: https://github.com/developer0hye/Yolo_Label.git (accessed on 24 December 2021).
- Huang, K.; Lei, H.; Jiao, Z.; Zhong, Z. Recycling waste classification using vision transformer on portable device. Sustainability 2021, 13, 1572. [Google Scholar] [CrossRef]
- Devries, T.; Misra, I.; Wang, C.; van der Maaten, L. Does object recognition work for everyone? arXiv 2019, arXiv:1906.02659. [Google Scholar] [CrossRef]
- van Lieshout, C.; van Oeveren, K.; van Emmerik, T.; Postma, E. Automated River plastic monitoring using deep learning and cameras. Earth Space Sci. 2020, 7, e2019EA000960. [Google Scholar] [CrossRef]
- Jakovljevic, G.; Govedarica, M.; Alvarez-Taboada, F. A deep learning model for automatic plastic mapping using unmanned aerial vehicle (UAV) data. Remote Sens. 2020, 12, 1515. [Google Scholar] [CrossRef]
- Lin, F.; Hou, T.; Jin, Q.; You, A. Improved yolo based detection algorithm for floating debris in waterway. Entropy 2021, 23, 1111. [Google Scholar] [CrossRef]
- Colica, E.; D’Amico, S.; Iannucci, R.; Martino, S.; Gauci, A.; Galone, L.; Galea, P.; Paciello, A. Using unmanned aerial vehicle photogrammetry for digital geological surveys: Case study of Selmun promontory, northern of Malta. Environ. Earth Sci. 2021, 80, 12538. [Google Scholar] [CrossRef]
- Lu, H.; Li, Y.; Xu, X.; He, L.; Li, Y.; Dansereau, D.; Serikawa, S. Underwater image descattering and quality assessment. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 1998–2002. [Google Scholar]
- Wolf, M.; van den Berg, K.; Garaba, S.P.; Gnann, N.; Sattler, K.; Stahl, F.; Zielinski, O. Machine learning for aquatic plastic litter detection, classification and quantification (APLASTIC-Q). Environ. Res. Lett. 2020, 15, 094075. [Google Scholar] [CrossRef]
- Silva, G.F.; Carneiro, G.B.; Doth, R.; Amaral, L.A.; de Azevedo, D.F.G. Near real-time shadow detection and removal in aerial motion imagery application. ISPRS J. Photogramm. Remote Sens. 2018, 140, 104–121. [Google Scholar] [CrossRef]
- Nelson, J.; Solawetz, J. Responding to the Controversy about YOLOv5. Available online: https://blog.roboflow.com/yolov4-versus-yolov5/ (accessed on 30 July 2020).
- Garcia-Garin, O.; Monleón-Getino, T.; López-Brosa, P.; Borrell, A.; Aguilar, A.; Borja-Robalino, R.; Cardona, L.; Vighi, M. Automatic detection and quantification of floating marine macro-litter in aerial images: Introducing a novel deep learning approach connected to a web application in R. Environ. Pollut. 2021, 273, 116490. [Google Scholar] [CrossRef] [PubMed]
- Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. YOLOX: Exceeding YOLO series in 2021. arXiv 2021, arXiv:2107.08430. [Google Scholar]
- Nepal, U.; Eslamiat, H. Comparing YOLOv3, YOLOv4 and YOLOv5 for autonomous landing spot detection in faulty UAVs. Sensors 2022, 22, 464. [Google Scholar] [CrossRef]
- Glenn, J. Ultralytics/Yolov5. Available online: https://github.com/ultralytics/yolov5/releases (accessed on 5 April 2022).
- Biermann, L.; Clewley, D.; Martinez-Vicente, V.; Topouzelis, K. Finding plastic patches in coastal waters using optical satellite data. Sci. Rep. 2020, 10, 5364. [Google Scholar] [CrossRef] [Green Version]
- Gonçalves, G.; Andriolo, U.; Gonçalves, L.; Sobral, P.; Bessa, F. Quantifying marine macro litter abundance on a sandy beach using unmanned aerial systems and object-oriented machine learning methods. Remote Sens. 2020, 12, 2599. [Google Scholar] [CrossRef]
- Escobar-Sánchez, G.; Haseler, M.; Oppelt, N.; Schernewski, G. Efficiency of aerial drones for macrolitter monitoring on Baltic Sea Beaches. Front. Environ. Sci. 2021, 8, 237. [Google Scholar] [CrossRef]
- Cao, H.; Gu, X.; Sun, Y.; Gao, H.; Tao, Z.; Shi, S. Comparing, validating and improving the performance of reflectance obtention method for UAV-remote sensing. Int. J. Appl. Earth Obs. Geoinf. 2021, 102, 102391. [Google Scholar] [CrossRef]
- Gonçalves, G.; Andriolo, U. Operational use of multispectral images for macro-litter mapping and categorization by unmanned aerial vehicle. Mar. Pollut. Bull. 2022, 176, 113431. [Google Scholar] [CrossRef]
- Guffogg, J.A.; Blades, S.M.; Soto-Berelov, M.; Bellman, C.J.; Skidmore, A.K.; Jones, S.D. Quantifying marine plastic debris in a beach environment using spectral analysis. Remote Sens. 2021, 13, 4548. [Google Scholar] [CrossRef]
- Garaba, S.P.; Aitken, J.; Slat, B.; Dierssen, H.M.; Lebreton, L.; Zielinski, O.; Reisser, J. Sensing ocean plastics with an airborne hyperspectral shortwave infrared imager. Environ. Sci. Technol. 2018, 52, 11699–11707. [Google Scholar] [CrossRef] [PubMed]
- Goddijn-Murphy, L.; Dufaur, J. Proof of concept for a model of light reflectance of plastics floating on natural waters. Mar. Pollut. Bull. 2018, 135, 1145–1157. [Google Scholar] [CrossRef] [PubMed]
- Taddia, Y.; Corbau, C.; Buoninsegni, J.; Simeoni, U.; Pellegrinelli, A. UAV approach for detecting plastic marine debris on the beach: A case study in the Po River Delta (Italy). Drones 2021, 5, 140. [Google Scholar] [CrossRef]
- Gonçalves, G.; Andriolo, U.; Pinto, L.; Bessa, F. Mapping marine litter using UAS on a beach-dune system: A multidisciplinary approach. Sci. Total Environ. 2020, 706, 135742. [Google Scholar] [CrossRef] [PubMed]
- Geraeds, M.; van Emmerik, T.; de Vries, R.; bin Ab Razak, M.S. Riverine plastic litter monitoring using unmanned aerial vehicles (UAVs). Remote Sens. 2019, 11, 2045. [Google Scholar] [CrossRef] [Green Version]
- Makarau, A.; Richter, R.; Muller, R.; Reinartz, P. Adaptive shadow detection using a blackbody radiator model. IEEE Trans. Geosci. Remote Sens. 2011, 49, 2049–2059. [Google Scholar] [CrossRef]
- Balsi, M.; Moroni, M.; Chiarabini, V.; Tanda, G. High-resolution aerial detection of marine plastic litter by hyperspectral sensing. Remote Sens. 2021, 13, 1557. [Google Scholar] [CrossRef]
- Andriolo, U.; Gonçalves, G.; Bessa, F.; Sobral, P. Mapping marine litter on coastal dunes with unmanned aerial systems: A showcase on the Atlantic Coast. Sci. Total Environ. 2020, 736, 139632. [Google Scholar] [CrossRef]
- Topouzelis, K.; Papakonstantinou, A.; Garaba, S.P. Detection of floating plastics from satellite and unmanned aerial systems (plastic litter project 2018). Int. J. Appl. Earth Obs. Geoinf. 2019, 79, 175–183. [Google Scholar] [CrossRef]
- Lo, H.S.; Wong, L.C.; Kwok, S.H.; Lee, Y.K.; Po, B.H.K.; Wong, C.Y.; Tam, N.F.Y.; Cheung, S.G. Field test of beach litter assessment by commercial aerial drone. Mar. Pollut. Bull. 2020, 151, 110823. [Google Scholar] [CrossRef]
Parameters | Value |
---|---|
Batch size * | 16, 32, 64 and 128 |
Learning rate | 0.01 to 0.001 |
No. of filters in YOLO layers | 18 ** |
Experiment | Training Dataset | Testing Dataset | Training Method | Models (YOLO Family) |
---|---|---|---|---|
I | HMH | TT | Scratch | YOLOv2, YOLOv2-tiny, YOLOv3, YOLOv3-tiny, YOLOv3-spp, YOLOv4, YOLOv4-tiny, YOLOv5s, YOLOv5m, YOLOv5l, YOLOv5x |
II | HMH | TT | Using pre-trained model | Same models as Experiment I |
III | TT | HMH | Scratch | Same models as Experiment I |
IV | TT | HMH | Using pre-trained model | Same models as Experiment I |
V | HMH | TT | Fine-tuning | YOLOv5s, YOLOv4, YOLOv3-spp, and YOLOv2 trained in II |
VI | TT | HMH | Fine-tuning | YOLOv5s, YOLOv4, YOLOv3-spp, and YOLOv2 trained in IV |
VII | Plastic volume estimation using pre-trained YOLOv5s in terms of surface area | | | |
Model | Training Time (h) | Inference Time per Image (s) | Model Size (MB) | Computational Complexity (GFLOPs) | mAP @ 0.5 IoU for Validation Dataset | mAP @ 0.5 IoU for Testing Dataset | Highest F1 Score | Computing Platform |
---|---|---|---|---|---|---|---|---|
Pre-trained YOLOv2 | 0.359 | 4.74 | 192.9 | 29.338 | 0.723 | 0.442 | 0.66 | Google Colab |
YOLOv2 scratch | 0.367 | 4.84 | 192.9 | 29.338 | 0.581 | 0.259 | 0.6 | |
Pre-trained YOLOv2-tiny | 0.166 | 3.53 | 42.1 | 5.344 | 0.467 | 0.293 | 0.38 | |
YOLOv2-tiny scratch | 0.23 | 3.52 | 42.1 | 5.344 | 0.348 | 0.286 | 0.44 | |
Pre-trained YOLOv3-tiny | 0.082 | 0.01 | 16.5 | 12.9 | 0.714 | 0.366 | 0.7 | Intel® Core™ i7-10750H CPU @ 2.60 GHz, 16 GB RAM, and an NVIDIA GeForce RTX 2060 GPU
YOLOv3-tiny scratch | 0.082 | 0.004 | 16.5 | 12.9 | 0.555 | 0.336 | 0.58 | |
Pre-trained YOLOv3 | 0.259 | 0.018 | 117 | 154.9 | 0.735 | 0.396 | 0.72 | |
YOLOv3 scratch | 0.258 | 0.017 | 117 | 154.9 | 0.479 | 0.311 | 0.54 | |
Pre-trained YOLOv3-spp | 0.266 | 0.017 | 119 | 155.7 | 0.787 | 0.402 | 0.75 | |
YOLOv3-spp scratch | 0.279 | 0.014 | 119 | 155.7 | 0.59 | 0.265 | 0.57 | |
Pre-trained YOLOv4 | 1.884 | 6.85 | 244.2 | 59.563 | 0.809 | 0.463 | 0.78 | Google Colab |
YOLOv4 scratch | 1.961 | 5.54 | 244.2 | 59.563 | 0.766 | 0.373 | 0.74 | |
Pre-trained YOLOv4-tiny | 0.899 | 2.92 | 22.4 | 6.787 | 0.758 | 0.418 | 0.76 | |
YOLOv4-tiny scratch | 0.968 | 2.72 | 22.4 | 6.787 | 0.732 | 0.355 | 0.73 | |
Pre-trained YOLOv5s | 0.146 | 0.019 | 13.6 | 16.3 | 0.810 | 0.424 | 0.78 | Intel® Core™ i7-10750H CPU @ 2.60 GHz, 16 GB RAM, and an NVIDIA GeForce RTX 2060 GPU
YOLOv5s scratch | 0.149 | 0.017 | 13.6 | 16.3 | 0.740 | 0.272 | 0.67 | |
Pre-trained YOLOv5m | 0.195 | 0.041 | 40.4 | 50.3 | 0.787 | 0.434 | 0.77 | |
YOLOv5m scratch | 0.197 | 0.04 | 40.4 | 50.3 | 0.695 | 0.331 | 0.70 | |
Pre-trained YOLOv5l | 0.265 | 0.027 | 89.3 | 114.1 | 0.810 | 0.422 | 0.78 | |
YOLOv5l scratch | 0.262 | 0.032 | 89.3 | 114.1 | 0.669 | 0.176 | 0.67 | |
Pre-trained YOLOv5x | 0.402 | 0.036 | 166 | 217.1 | 0.781 | 0.367 | 0.76 | |
YOLOv5x scratch | 0.399 | 0.042 | 166 | 217.1 | 0.710 | 0.316 | 0.69 |
Model | Training Time (h) | Inference Time per Image (s) | mAP @ 0.5 IoU for Validation Dataset | mAP @ 0.5 IoU for Testing Dataset | Highest F1 Score | Computing Platform |
---|---|---|---|---|---|---|
Pre-trained YOLOv2 | 0.649 | 4.74 | 0.499 | 0.452 | 0.52 | Google Colab |
YOLOv2 scratch | 0.648 | 4.94 | 0.368 | 0.327 | 0.44 | |
Pre-trained YOLOv2-tiny | 0.162 | 3.53 | 0.328 | 0.256 | 0.33 | |
YOLOv2-tiny scratch | 0.174 | 3.43 | 0.302 | 0.220 | 0.32 | |
Pre-trained YOLOv3-tiny | 0.087 | 0.007 | 0.495 | 0.483 | 0.53 | Intel® Core™ i7-10750H CPU @ 2.60 GHz, 16 GB RAM, and an NVIDIA GeForce RTX 2060 GPU
YOLOv3-tiny scratch | 0.088 | 0.007 | 0.409 | 0.562 | 0.47 | |
Pre-trained YOLOv3 | 0.282 | 0.017 | 0.571 | 0.743 | 0.59 | |
YOLOv3 scratch | 0.286 | 0.016 | 0.359 | 0.358 | 0.43 | |
Pre-trained YOLOv3-spp | 0.285 | 0.016 | 0.570 | 0.748 | 0.60 | |
YOLOv3-spp scratch | 0.28 | 0.016 | 0.390 | 0.511 | 0.41 | |
Pre-trained YOLOv4 | 1.86 | 4.54 | 0.608 | 0.553 | 0.78 | Google Colab |
YOLOv4 scratch | 1.89 | 4.63 | 0.544 | 0.524 | 0.75 | |
Pre-trained YOLOv4-tiny | 0.949 | 2.85 | 0.609 | 0.568 | 0.59 | |
YOLOv4-tiny scratch | 0.44 | 3.33 | 0.560 | 0.434 | 0.54 | |
Pre-trained YOLOv5s | 0.146 | 0.029 | 0.610 | 0.767 | 0.61 | Intel® Core™ i7-10750H CPU @ 2.60 GHz, 16 GB RAM, and an NVIDIA GeForce RTX 2060 GPU
YOLOv5s scratch | 0.155 | 0.025 | 0.530 | 0.622 | 0.59 | |
Pre-trained YOLOv5m | 0.22 | 0.036 | 0.562 | 0.761 | 0.57 | |
YOLOv5m scratch | 0.221 | 0.036 | 0.426 | 0.494 | 0.49 | |
Pre-trained YOLOv5l | 0.273 | 0.026 | 0.579 | 0.767 | 0.60 | |
YOLOv5l scratch | 0.283 | 0.027 | 0.442 | 0.529 | 0.49 | |
Pre-trained YOLOv5x | 0.41 | 0.035 | 0.575 | 0.779 | 0.57 | |
YOLOv5x scratch | 0.393 | 0.035 | 0.363 | 0.456 | 0.45 |
YOLO Family | Best Model (Pre-Trained) | Evaluation Dataset | mAP (Training from Scratch) | mAP (Pretraining on COCO; No Transfer Learning) | Transfer from | mAP (Pretraining on COCO + Transfer) |
---|---|---|---|---|---|---|
YOLOv5 | YOLOv5s | HMH | 0.74 | 0.81 | TT | 0.83 |
YOLOv5 | YOLOv5s | TT | 0.53 | 0.61 | HMH | 0.62 |
YOLOv4 | YOLOv4 | HMH | 0.76 | 0.80 | TT | 0.83 |
YOLOv4 | YOLOv4 | TT | 0.54 | 0.60 | HMH | 0.61 |
YOLOv3 | YOLOv3-spp | HMH | 0.59 | 0.79 | TT | 0.81 |
YOLOv3 | YOLOv3-spp | TT | 0.39 | 0.57 | HMH | 0.59 |
YOLOv2 | YOLOv2 | HMH | 0.58 | 0.72 | TT | 0.77 |
YOLOv2 | YOLOv2 | TT | 0.37 | 0.49 | HMH | 0.51 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).