YOLO-LWNet: A Lightweight Road Damage Object Detection Network for Mobile Terminal Devices
Figure 1. Building blocks of MobileNetV3. (a) The unit without channel expansion; (b) the unit with channel expansion and a residual connection (used when the stride is one and the input and output channel counts are equal). DWConv denotes depthwise convolution.
Figure 2. Building blocks of ShuffleNetV2. (a) The basic unit; (b) the spatial downsampling unit. DWConv denotes depthwise convolution.
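The defining operation of ShuffleNetV2's basic unit is the channel shuffle that mixes information between the two split branches. As a point of reference (a minimal PyTorch sketch, not the authors' code), it can be written as a reshape–transpose–flatten:

```python
import torch

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Channel shuffle from ShuffleNet: reshape channels to (groups, C//groups),
    transpose, and flatten so features from different branches interleave."""
    n, c, h, w = x.shape
    x = x.view(n, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(n, c, h, w)

# Example: shuffle the two equal halves produced by the unit's channel split.
y = channel_shuffle(torch.randn(1, 32, 20, 20), groups=2)
```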
Figure 3. The structure of YOLOv5-s.
Figure 4. The network structure of YOLO-LWNet-Small. LWConv (s set to one) is the basic unit of the LWC module, and LWConv (s set to two) is the spatial downsampling unit of the LWC module. LWConv (s set to one) uses the CBAM attention module, while LWConv (s set to two) uses the ECA attention module. In the small version, the hyperparameter N of the LWBlock (LWB) in backbone stages two, three, four, and five is set to one, four, four, and three, respectively.
Figure 5. Building blocks of our work (LWC). (a) The basic unit (stride set to one); (b) the spatial downsampling unit (stride set to two). DWConv denotes depthwise convolution, R denotes the expansion rate, Act denotes the activation function, and Attention denotes the attention mechanism.
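Since the figure itself is not reproduced here, the following PyTorch sketch shows one plausible reading of the LWC unit: a MobileNetV3-style inverted bottleneck with expansion rate R, a depthwise convolution, an attention module, and a residual connection when the stride is one and the channel counts match. The class name, the Hardswish default, and the exact layer ordering are assumptions for illustration, not the authors' published code.

```python
import torch.nn as nn

class LWConv(nn.Module):
    """Hypothetical sketch of the LWC unit in Figure 5. `exp` is the expanded
    channel count (R * in_ch); `attention` is CBAM for s=1 or ECA for s=2 per
    the Figure 4 caption. Ordering and the residual rule follow the
    MobileNetV3 convention and are assumptions about the authors' design."""
    def __init__(self, in_ch, out_ch, exp, stride, attention, act=nn.Hardswish):
        super().__init__()
        self.use_residual = stride == 1 and in_ch == out_ch
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, exp, 1, bias=False),   # 1x1 channel expansion
            nn.BatchNorm2d(exp), act(),
            nn.Conv2d(exp, exp, 3, stride, 1, groups=exp, bias=False),  # DWConv
            nn.BatchNorm2d(exp), act(),
            attention(exp),                          # attention mechanism
            nn.Conv2d(exp, out_ch, 1, bias=False),   # 1x1 linear projection
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        y = self.block(x)
        return x + y if self.use_residual else y
```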
Figure 6. Diagrams of SENet and ECANet. (a) SENet; (b) ECANet. k denotes the kernel size of the 1D convolution in ECANet and is determined adaptively from the channel dimension C. σ denotes the sigmoid function.
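In the original ECANet paper, the adaptive mapping is k = ψ(C) = |log2(C)/γ + b/γ|_odd with γ = 2 and b = 1, so, for example, C = 256 gives k = 5. A minimal PyTorch sketch in the style of the reference implementation (not the authors' code):

```python
import math
import torch.nn as nn

class ECA(nn.Module):
    """ECA-Net channel attention (Wang et al., CVPR 2020): global average
    pooling, a 1D convolution across channels with adaptively chosen kernel
    size k, and a sigmoid gate."""
    def __init__(self, channels: int, gamma: int = 2, b: int = 1):
        super().__init__()
        t = int(abs((math.log2(channels) + b) / gamma))
        k = t if t % 2 else t + 1                 # k must be odd
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        w = self.pool(x)                               # (N, C, 1, 1)
        w = self.conv(w.squeeze(-1).transpose(1, 2))   # 1D conv over channels
        w = self.sigmoid(w.transpose(1, 2).unsqueeze(-1))
        return x * w
```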
Figure 7. Diagram of CBAM. r denotes the dimension-reduction ratio in the channel attention module; k denotes the convolution kernel size in the spatial attention module.
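CBAM applies channel attention (a shared MLP with reduction ratio r over average- and max-pooled descriptors) followed by spatial attention (a k × k convolution over stacked channel-wise mean and max maps). A minimal sketch in the style of the reference implementation (default r and k follow the original paper, not necessarily this work):

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module (Woo et al., ECCV 2018)."""
    def __init__(self, channels: int, r: int = 16, k: int = 7):
        super().__init__()
        self.mlp = nn.Sequential(                     # shared MLP, reduction r
            nn.Conv2d(channels, channels // r, 1, bias=False),
            nn.ReLU(),
            nn.Conv2d(channels // r, channels, 1, bias=False),
        )
        self.spatial = nn.Conv2d(2, 1, k, padding=k // 2, bias=False)

    def forward(self, x):
        # Channel attention: MLP over average- and max-pooled descriptors.
        avg = self.mlp(x.mean((2, 3), keepdim=True))
        mx = self.mlp(x.amax((2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # Spatial attention: k x k conv over stacked mean and max maps.
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))
```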
Figure 8. Overview of LWNet.
Figure 9. Overview of the efficient feature fusion network.
Figure 10. Ablation results of the different methods on the test set.
Figure 11. Mean average precision on the RDD-2020 training set; the curves correspond to the different improvement methods and the final model structure.
Figure 12. Comparison of different object detection algorithms on the test set.
Figure 13. Comparison of the detection of longitudinal cracks (D00).
Figure 14. Comparison of the detection of transverse cracks (D10).
Figure 15. Comparison of the detection of alligator cracks (D20).
Figure 16. Comparison of the detection of potholes (D40).
Figure 17. Pictures of detection results.
Abstract
1. Introduction
2. Related Work on the YOLO Series Detection Network and Lightweight Networks
2.1. Lightweight Networks
2.2. YOLOv5
3. Proposed Method for Lightweight Road Damage Detection Network (YOLO-LWNet)
3.1. The Structure of YOLO-LWNet
3.2. Lightweight Network Building Block—LWC
3.3. Attention Mechanism
3.4. Activation Function
3.5. Lightweight Backbone—LWNet
3.6. Efficient Feature Fusion Network
4. Experiments on Road Damage Object Detection Network
4.1. Datasets
4.2. Experimental Environment
4.3. Evaluation Indicators
4.4. Comparison with Other Lightweight Networks
4.5. Ablation Experiments
4.6. Comparison with State-of-the-Art Object Detection Networks
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
Backbone specification of LWNet for the T and S versions (s: stride; n: number of block repetitions):

| Operator | Output Size | s | Attention | n | Output Channels (T) | Output Channels (S) | Exp Channels (T) | Exp Channels (S) |
|---|---|---|---|---|---|---|---|---|
| Image | 640 × 640 | - | - | - | 3 | 3 | 3 | 3 |
| Focus | 320 × 320 | - | - | 1 | 16 | 32 | - | - |
| LWConv | 160 × 160 | 2 | ECA | 1 | 32 | 64 | 60 | 120 |
| LWConv | 160 × 160 | 1 | ECA | 1 | 32 | 64 | 60 | 120 |
| LWConv | 80 × 80 | 2 | ECA | 1 | 64 | 96 | 120 | 180 |
| LWConv | 80 × 80 | 1 | ECA | 4 | 64 | 96 | 120 | 180 |
| LWConv | 40 × 40 | 2 | ECA | 1 | 96 | 128 | 180 | 240 |
| LWConv | 40 × 40 | 1 | ECA | 4 | 96 | 128 | 180 | 240 |
| LWConv | 20 × 20 | 2 | ECA | 1 | 128 | 168 | 240 | 300 |
| LWConv | 20 × 20 | 1 | ECA | 3 | 128 | 168 | 240 | 300 |
| SPPF | 20 × 20 | - | - | 1 | 256 | 512 | - | - |
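Read row by row, such a specification translates mechanically into a stage builder. The sketch below is a hypothetical helper that reuses the LWConv and ECA sketches shown earlier and hard-codes the S-version values from the table above; it illustrates the idea rather than reproducing the authors' code.

```python
import torch.nn as nn

# Each tuple mirrors one LWConv row of the table (S-version columns):
# (stride, n_repeats, out_channels, exp_channels).
SMALL_SPEC = [
    (2, 1, 64, 120), (1, 1, 64, 120),
    (2, 1, 96, 180), (1, 4, 96, 180),
    (2, 1, 128, 240), (1, 4, 128, 240),
    (2, 1, 168, 300), (1, 3, 168, 300),
]

def build_backbone(spec, in_ch=32, attention=ECA):
    """Expand each spec row into its repeated LWConv blocks; in_ch=32 is the
    S-version Focus output, and ECA follows the table's Attention column."""
    layers = []
    for stride, n, out_ch, exp in spec:
        for _ in range(n):
            layers.append(LWConv(in_ch, out_ch, exp, stride, attention))
            in_ch = out_ch
    return nn.Sequential(*layers)

backbone = build_backbone(SMALL_SPEC)
```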
Number of samples per damage category in the RDD-2020 dataset, by country (sample images omitted here; damage types per the figure captions above):

| Class Name | Damage Type | Japan | India | Czech |
|---|---|---|---|---|
| D00 | Longitudinal cracks | 4049 | 1555 | 988 |
| D10 | Transverse cracks | 3979 | 68 | 399 |
| D20 | Alligator cracks | 6199 | 2021 | 161 |
| D40 | Potholes | 2243 | 3187 | 197 |
| Operator | Output Size | s | SE | Output Channels | Exp Channels |
|---|---|---|---|---|---|
| Image | 640 × 640 | - | - | 3 | - |
| Focus | 320 × 320 | - | - | 64 | - |
| LWConv | 160 × 160 | 2 | √ | 64 | 120 |
| LWConv | 160 × 160 | 1 | √ | 64 | 120 |
| LWConv | 80 × 80 | 2 | √ | 96 | 180 |
| LWConv | 80 × 80 | 1 | - | 96 | 180 |
| LWConv | 80 × 80 | 1 | √ | 96 | 180 |
| LWConv | 80 × 80 | 1 | - | 96 | 180 |
| LWConv | 40 × 40 | 2 | √ | 128 | 240 |
| LWConv | 40 × 40 | 1 | √ | 128 | 240 |
| LWConv | 40 × 40 | 1 | √ | 128 | 240 |
| LWConv | 20 × 20 | 2 | √ | 168 | 300 |
| LWConv | 20 × 20 | 1 | √ | 168 | 300 |
| Operator | Output Size | s | SE | Output Channels | Exp Channels |
|---|---|---|---|---|---|
| Image | 640 × 640 | - | - | 3 | - |
| Focus | 320 × 320 | - | - | 64 | - |
| LWConv | 160 × 160 | 2 | - | 64 | 120 |
| LWConv | 160 × 160 | 1 | - | 64 | 120 |
| LWConv | 80 × 80 | 2 | √ | 96 | 180 |
| LWConv | 80 × 80 | 1 | - | 96 | 180 |
| LWConv | 80 × 80 | 1 | √ | 96 | 180 |
| LWConv | 80 × 80 | 1 | - | 96 | 180 |
| LWConv | 80 × 80 | 1 | √ | 96 | 180 |
| LWConv | 40 × 40 | 2 | √ | 128 | 240 |
| LWConv | 40 × 40 | 1 | √ | 128 | 240 |
| LWConv | 40 × 40 | 1 | - | 128 | 240 |
| LWConv | 40 × 40 | 1 | √ | 128 | 240 |
| LWConv | 40 × 40 | 1 | - | 128 | 240 |
| LWConv | 40 × 40 | 1 | √ | 128 | 240 |
| LWConv | 20 × 20 | 2 | √ | 168 | 300 |
| LWConv | 20 × 20 | 1 | - | 168 | 300 |
| LWConv | 20 × 20 | 1 | √ | 168 | 300 |
Comparison with other lightweight backbones (all evaluated within YOLOv5):

| Method | Backbone | mAP | Params (M) | FLOPs | FPS | Latency (ms) |
|---|---|---|---|---|---|---|
| YOLOv5 | MobileNetV3-Small | 43.0 | 3.55 | 6.3 | 93 | 10.7 |
| YOLOv5 | MobileNetV3-Large | 47.1 | 13.47 | 24.7 | 80 | 12.5 |
| YOLOv5 | ShuffleNetV2-x1 | 42.8 | 3.61 | 7.5 | 89 | 11.2 |
| YOLOv5 | ShuffleNetV2-x2 | 45.1 | 14.67 | 29.7 | 83 | 12.1 |
| YOLOv5 | BLWNet-Small (ours) | 45.9 | 3.16 | 11.8 | 101 | 9.9 |
| YOLOv5 | BLWNet-Large (ours) | 48.2 | 11.30 | 27.3 | 83 | 12.1 |
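FPS and latency are reciprocal views of the same measurement; for example, 9.9 ms per image corresponds to about 101 FPS. A generic timing harness of the kind typically behind such numbers (an illustrative sketch; the paper's exact measurement protocol is not specified here):

```python
import time
import torch

@torch.no_grad()
def benchmark(model, runs=100, size=640):
    """Return (latency in ms, FPS) for single-image inference."""
    model.eval()
    x = torch.randn(1, 3, size, size)
    for _ in range(10):                     # warm-up iterations
        model(x)
    start = time.perf_counter()
    for _ in range(runs):
        model(x)
    latency = (time.perf_counter() - start) / runs
    return latency * 1000.0, 1.0 / latency
```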
Ablation schemes (√ indicates the component is enabled; in the "-fast" schemes, the ECA module is used in place of the CBAM/ECA combination):

| Scheme | BLWNet | CBAM/ECA | Hardswish | Depth | SPPF | BiFPN | ENeck | ECA |
|---|---|---|---|---|---|---|---|---|
| LW | √ | | | | | | | |
| LW-SE | √ | √ | | | | | | |
| LW-SE-H | √ | √ | √ | | | | | |
| LW-SE-H-depth | √ | √ | √ | √ | | | | |
| LW-SE-H-depth-spp | √ | √ | √ | √ | √ | | | |
| LW-SE-H-depth-spp-bi | √ | √ | √ | √ | √ | √ | | |
| LW-SE-H-depth-spp-bi-ENeck | √ | √ | √ | √ | √ | √ | √ | |
| LW-SE-H-depth-spp-bi-fast | √ | | √ | √ | √ | √ | | √ |
| LW-SE-H-depth-spp-bi-ENeck-fast | √ | | √ | √ | √ | √ | √ | √ |