Research article · DOI: 10.1145/3568231.3568233

A Comparative Study of YOLOv5 models on American Sign Language Dataset

Published: 13 January 2023

Abstract

Sign language is the most common means of communication for people with hearing and speech impairments. One of the biggest problems for sign language users is that most people do not understand sign language. A promising solution to this problem is a sign language detection system built on an object detection algorithm. YOLOv5 is a state-of-the-art one-stage object detection algorithm available in a wide range of model complexities, from the simplest YOLOv5n to the most complex YOLOv5x. To enable efficient communication, a sign language detection system must be both fast and reliable. However, many previous studies used only the most complex model without considering the time the system needs to run. Because more complex models tend to perform better at the cost of computational time, the optimal model for a sign language detection system is one that performs well while maintaining a fast inference time. In this study, we compare the inference time and performance of every available YOLOv5 model, each trained on an American Sign Language dataset, to find the most suitable YOLOv5 model for sign language detection. The experimental results show that although YOLOv5x performs slightly better than the other models, with an mAP of 0.88 and an F1 score of 0.91, it requires roughly twice as long to detect the signs, with an inference time of 26.2 ms. The same holds for YOLOv5m and YOLOv5l, both with an mAP of 0.88 and F1 scores of 0.88 and 0.90 respectively, while requiring inference times of 16.2 ms and 19.1 ms. YOLOv5n is the fastest model, with an inference time of 7.2 ms, but its performance is considerably worse, with an mAP of 0.79 and an F1 score of 0.88. In conclusion, YOLOv5s is the most suitable model, with an mAP of 0.88, an F1 score of 0.90, and an inference time of 10.6 ms.
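
The comparison described above can be sketched with the public ultralytics/yolov5 PyTorch Hub interface. The snippet below is a minimal illustration rather than the authors' pipeline: the image path asl_sample.jpg, the number of timed runs, and the use of stock COCO-pretrained checkpoints (in place of weights fine-tuned on the ASL dataset) are all assumptions made only to show how the five variants might be timed on identical input.

```python
# Minimal timing sketch (assumed setup, not the paper's actual pipeline):
# load each YOLOv5 variant from PyTorch Hub and measure average per-image
# inference time on one placeholder image.
import time
import torch

VARIANTS = ["yolov5n", "yolov5s", "yolov5m", "yolov5l", "yolov5x"]
IMAGE = "asl_sample.jpg"   # hypothetical test image path
RUNS = 50                  # timed forward passes per model

for name in VARIANTS:
    # Downloads the COCO-pretrained checkpoint on first use; swap in
    # ASL-fine-tuned weights for a faithful reproduction.
    model = torch.hub.load("ultralytics/yolov5", name)
    model.eval()

    _ = model(IMAGE)  # warm-up pass so one-time overhead does not skew timing

    start = time.perf_counter()
    for _ in range(RUNS):
        _ = model(IMAGE)
    elapsed_ms = (time.perf_counter() - start) / RUNS * 1000.0

    print(f"{name}: ~{elapsed_ms:.1f} ms per image")
```

Accuracy figures such as mAP and F1 would come from evaluating each trained model on a held-out ASL validation split, not from this timing loop.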

Cited By

View all
  • (2023) "Yolo5-Based UAV Surveillance for Tiny Object Detection on Airport Runways," 2023 International Conference on Data Science, Agents & Artificial Intelligence (ICDSAAI), pp. 1–6. DOI: 10.1109/ICDSAAI59313.2023.10452584. Online publication date: 21 Dec 2023.

    Published In

    SIET '22: Proceedings of the 7th International Conference on Sustainable Information Engineering and Technology
    November 2022
    398 pages
    ISBN: 9781450397117
    DOI: 10.1145/3568231

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Deep learning
    2. Sign language detection
    3. YOLOv5

    Conference

    SIET '22

    Acceptance Rates

    Overall Acceptance Rate 45 of 57 submissions, 79%
