Article
DOI: 10.1007/978-3-030-78321-1_29

Investigation of Sign Language Motion Classification by Feature Extraction Using Keypoints Position of OpenPose

Published: 24 July 2021

Abstract

Until now, sign language motion classification with a monocular optical camera has been performed using a wristband and colored gloves with each finger dyed a different color. In that method, sign language movements are detected by extracting the color regions of the gloves. However, this approach has problems: wearing the colored gloves burdens the signer, and the color extraction accuracy changes with the ambient light, making it difficult to ensure stable classification accuracy.
We therefore used OpenPose, which can detect the movements of both hands without colored gloves, to classify sign language motions. Feature elements were extracted from the keypoint positions obtained with OpenPose. We then proposed three methods of constructing the feature elements for classifying each motion and compared their classification accuracy. In method 1, the feature elements are taken directly from the keypoint positions of the neck, shoulders, elbows, and wrists. In method 2, they are obtained as the relative distances of the target keypoints from the neck keypoint. In method 3, the feature vector has 30 elements, the 24 elements obtained in method 1 plus the 6 elements obtained in method 2.
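
The abstract does not give implementation details for these feature elements, and the exact composition of the 24- and 6-element vectors is not spelled out. Purely as a non-authoritative sketch of the idea, the following Python/NumPy snippet builds method 1-, method 2-, and method 3-style features from a single frame of OpenPose BODY_25 keypoints; the keypoint indices, helper names, and element counts here are illustrative assumptions and do not reproduce the paper's 24/6/30 split.

```python
import numpy as np

# BODY_25 indices for the keypoints named in the abstract (neck, shoulders,
# elbows, wrists). The paper does not state which OpenPose model it uses,
# so these indices are an assumption for illustration.
NECK, R_SHO, R_ELB, R_WRI, L_SHO, L_ELB, L_WRI = 1, 2, 3, 4, 5, 6, 7
TARGETS = [R_SHO, R_ELB, R_WRI, L_SHO, L_ELB, L_WRI]

def features_method1(kp):
    """Method 1 (sketch): use the (x, y) keypoint positions directly.
    kp: array of shape (25, 2) holding one frame of keypoint coordinates."""
    return kp[[NECK] + TARGETS].reshape(-1)

def features_method2(kp):
    """Method 2 (sketch): relative distance of each target keypoint from the neck."""
    return np.array([np.linalg.norm(kp[k] - kp[NECK]) for k in TARGETS])

def features_method3(kp):
    """Method 3 (sketch): concatenation of the method 1 and method 2 elements."""
    return np.concatenate([features_method1(kp), features_method2(kp)])

# Dummy frame standing in for OpenPose output (25 keypoints, x/y only).
frame = np.random.rand(25, 2)
print(features_method1(frame).shape,   # (14,)
      features_method2(frame).shape,   # (6,)
      features_method3(frame).shape)   # (20,)
```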
In the classification experiment, cross-validation was performed on the features obtained from sign language motion videos of five signers, and the accuracy of each method was investigated. Listing the signers from highest to lowest classification accuracy, method 1 gave B (68.05%), A (62.56%), C (62.19%), D (61.49%), and E (56.75%), an average of 62.21%; method 2 gave B (75.31%), A (75.09%), D (73.28%), E (69.97%), and C (69.81%), an average of 72.69%; and method 3 gave B (70.72%), A (69.65%), C (66.13%), D (64.27%), and E (62.72%), an average of 66.30%.
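
Beyond the author keywords (LSTM, cross-validation), the abstract does not state the classifier or how the folds were formed. As one plausible setup only, the sketch below runs k-fold cross-validation over one signer's labelled feature sequences with a small PyTorch LSTM; the feature dimension, class count, sequence length, hyperparameters, and dummy data are assumptions, not the authors' configuration.

```python
# A minimal sketch (not the authors' code): k-fold cross-validation of an LSTM
# classifier over one signer's sign-motion feature sequences.
import numpy as np
import torch
import torch.nn as nn
from sklearn.model_selection import KFold

class SignLSTM(nn.Module):
    def __init__(self, feat_dim, hidden=64, n_classes=10):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):               # x: (batch, frames, feat_dim)
        _, (h, _) = self.lstm(x)        # h: (num_layers, batch, hidden)
        return self.fc(h[-1])           # class scores per sequence

def cross_validate(X, y, n_splits=5, epochs=20):
    """Mean accuracy over k folds for one signer's data (illustration only)."""
    accs = []
    for tr, te in KFold(n_splits, shuffle=True, random_state=0).split(X):
        model = SignLSTM(feat_dim=X.shape[2], n_classes=int(y.max()) + 1)
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        loss_fn = nn.CrossEntropyLoss()
        Xtr, ytr = torch.tensor(X[tr]), torch.tensor(y[tr])
        Xte, yte = torch.tensor(X[te]), torch.tensor(y[te])
        for _ in range(epochs):         # full-batch training, for brevity
            opt.zero_grad()
            loss_fn(model(Xtr), ytr).backward()
            opt.step()
        with torch.no_grad():
            accs.append((model(Xte).argmax(1) == yte).float().mean().item())
    return float(np.mean(accs))

# Dummy data: 100 sign videos, 60 frames each, 30 features per frame, 10 classes.
X = np.random.rand(100, 60, 30).astype(np.float32)
y = np.random.randint(0, 10, size=100)
print(f"mean cross-validation accuracy: {cross_validate(X, y):.2%}")
```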



Information

Published In

Human Interface and the Management of Information. Information Presentation and Visualization: Thematic Area, HIMI 2021, Held as Part of the 23rd HCI International Conference, HCII 2021, Virtual Event, July 24–29, 2021, Proceedings, Part I
Jul 2021
416 pages
ISBN: 978-3-030-78320-4
DOI: 10.1007/978-3-030-78321-1

Publisher

Springer-Verlag

Berlin, Heidelberg


Author Tags

  1. Sign language
  2. Motion
  3. Classification
  4. OpenPose
  5. LSTM
  6. Cross-validation

