Abstract
Deep learning models have recently attracted great interest as an effective solution to human activity recognition (HAR), a challenging problem with widespread applications in medical rehabilitation and human–computer interaction. However, many existing models struggle to extract the global temporal features and the local spatial features of activity data simultaneously. This paper proposes ConvTransformer, a deep learning model that combines a convolutional neural network (CNN), a Transformer, and an attention mechanism. The model first uses convolutional layers to capture the local information of the sensor time-series signal, then uses a Transformer to model the temporal correlations of the feature sequence, applies an attention mechanism to highlight the most informative features, and finally performs activity recognition through a fully connected (FC) layer. We evaluate the model on four public datasets; it achieves 92%, 99%, 97%, and 86% accuracy on the OPPORTUNITY, PAMAP2, SKODA, and USC-HAD datasets, respectively. Compared with the baseline Transformer and existing state-of-the-art methods, the model offers higher recognition accuracy and better robustness. In addition, we explore the strengths and limitations of the proposed method through experiments on hyperparameters and optimization strategies.
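To make the pipeline described in the abstract concrete, the following PyTorch sketch stacks a 1D convolutional front end, a Transformer encoder, an attention-based pooling step, and an FC classifier in that order. It is a minimal illustration only: the channel count, window length, layer sizes, head counts, and the additive attention pooling are assumptions for readability, not the authors' reported configuration.

# Illustrative sketch of a CNN + Transformer + attention pipeline for sensor-based HAR.
# Hyperparameters below (d_model, heads, kernel size, window length) are assumed values.
import torch
import torch.nn as nn


class ConvTransformerSketch(nn.Module):
    def __init__(self, num_channels=113, num_classes=18, d_model=128,
                 num_heads=4, num_layers=2, conv_kernel=5):
        super().__init__()
        # 1D convolutions over the time axis model local features of the
        # multi-channel sensor signal.
        self.conv = nn.Sequential(
            nn.Conv1d(num_channels, d_model, kernel_size=conv_kernel, padding=conv_kernel // 2),
            nn.ReLU(),
            nn.Conv1d(d_model, d_model, kernel_size=conv_kernel, padding=conv_kernel // 2),
            nn.ReLU(),
        )
        # Transformer encoder captures temporal correlations across the feature sequence.
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=num_heads, dim_feedforward=2 * d_model, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        # Additive attention pooling highlights the most informative time steps.
        self.attn_score = nn.Linear(d_model, 1)
        # Fully connected classifier produces the activity logits.
        self.fc = nn.Linear(d_model, num_classes)

    def forward(self, x):
        # x: (batch, time, channels) window of raw sensor readings.
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)    # (batch, time, d_model)
        h = self.transformer(h)                              # (batch, time, d_model)
        weights = torch.softmax(self.attn_score(h), dim=1)   # (batch, time, 1)
        pooled = (weights * h).sum(dim=1)                    # (batch, d_model)
        return self.fc(pooled)                               # (batch, num_classes)


if __name__ == "__main__":
    model = ConvTransformerSketch()
    # e.g. 8 windows of 24 time steps with 113 sensor channels (assumed OPPORTUNITY-like setup)
    window = torch.randn(8, 24, 113)
    print(model(window).shape)  # torch.Size([8, 18])

The sketch pools the Transformer output with learned attention weights before classification; the paper's exact attention placement and layer dimensions may differ.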
Data availability
This study is an analysis of existing public datasets. The OPPORTUNITY dataset is available from the UCI Machine Learning Repository: http://archive.ics.uci.edu/ml/datasets/OPPORTUNITY+Activity+Recognition. The PAMAP2 dataset is available from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/PAMAP2+Physical+Activity+Monitoring. The SKODA dataset is from the Wearable Computing Laboratory at ETH Zurich and is free to use provided that the papers specified by its authors are cited. The USC-HAD dataset is available from the work of Mi Zhang and Alexander A. Sawchuk: https://sipi.usc.edu/had/.
References
Alawneh L, Mohsen B, Al-Zinati M, Shatnawi A, Al-Ayyoub M (2020) A comparison of unidirectional and bidirectional lstm networks for human activity recognition. In: 2020 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), pp. 1–6. IEEE
Alemayoh TT, Lee JH, Okamoto S (2021) New sensor data structuring for deeper feature extraction in human activity recognition. Sensors 21(8):2814
Bokhari SM, Sohaib S, Khan AR, Shafi M et al (2021) Dgru based human activity recognition using channel state information. Measurement 167:108245
Chavarriaga R, Sagha H, Calatroni A, Digumarti ST, Tröster G, Millán JR, Roggen D (2013) The opportunity challenge: A benchmark database for on-body sensor-based activity recognition. Pattern Recogn Lett 34(15):2033–2042
Despinoy F, Bouget D, Forestier G, Penet C, Zemiti N, Poignet P, Jannin P (2015) Unsupervised trajectory segmentation for surgical gesture recognition in robotic training. IEEE Trans Biomed Eng 63(6):1280–1291
Dirgová Luptáková I, Kubovčík M, Pospíchal J (2022) Wearable sensor-based human activity recognition with transformer model. Sensors 22(5):1911
Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, et al. (2020) An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929
Dua N, Singh SN, Semwal VB (2021) Multi-input cnn-gru based human activity recognition using wearable sensors. Computing 103(7):1461–1478
Guo J, Han K, Wu H, Tang Y, Chen X, Wang Y, Xu C (2022) Cmt: Convolutional neural networks meet vision transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 12175–12185
Hammerla NY, Halloran S, Plötz T (2016) Deep, convolutional, and recurrent models for human activity recognition using wearables. arXiv preprint arXiv:1604.08880
Ha S, Yun J-M, Choi S (2015) Multi-modal convolutional neural networks for activity recognition. In: 2015 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 3017–3022. IEEE
Hernandez V, Dadkhah D, Babakeshizadeh V, Kulić D (2021) Lower body kinematics estimation from wearable sensors for walking and running: A deep learning approach. Gait & Posture 83:185–193
Jebali M, Dakhli A, Jemni M (2021) Vision-based continuous sign language recognition using multimodal sensor fusion. Evol Syst 12(4):1031–1044
Khalid S, Khalil T, Nasreen S (2014) A survey of feature selection and feature extraction techniques in machine learning. In: 2014 Science and Information Conference (SAI), pp. 372–378. IEEE
Lee MKI, Rabindranath M, Faust K, Yao J, Gershon A, Alsafwani N, Diamandis P (2022) Compound computer vision workflow for efficient and automated immunohistochemical analysis of whole slide images. J Clin Pathol
Li Y, Wang L (2022) Human activity recognition based on residual network and bilstm. Sensors 22(2):635
Li X, Ding M, Pižurica A (2019) Deep feature fusion via two-stream convolutional neural network for hyperspectral image classification. IEEE Trans Geosci Remote Sens 58(4):2615–2629
Maaz M, Shaker A, Cholakkal H, Khan S, Zamir SW, Anwer RM, Khan FS (2022) Edgenext: efficiently amalgamated cnn-transformer architecture for mobile vision applications. arXiv preprint arXiv:2206.10589
Murahari VS, Plötz T (2018) On attention models for human activity recognition. In: Proceedings of the 2018 ACM International Symposium on Wearable Computers (ISWC), pp. 100–103
Ni Q, Fan Z, Zhang L, Nugent CD, Cleland I, Zhang Y, Zhou N (2020) Leveraging wearable sensors for human daily activity recognition with stacked denoising autoencoders. Sensors 20(18):5114
Noori FM, Wallace B, Uddin M, Torresen J et al (2019) A robust human activity recognition approach using openpose, motion features, and deep recurrent neural network. In: Scandinavian Conference on Image Analysis (SCIA) 299–310, Springer
Ordóñez FJ, Roggen D (2016) Deep convolutional and lstm recurrent neural networks for multimodal wearable activity recognition. Sensors 16(1):115
Patil P, Kumar KS, Gaud N, Semwal VB (2019) Clinical human gait classification: extreme learning machine approach. In: 2019 1st International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT), pp. 1–6. IEEE
Phyo CN, Zin TT, Tin P (2019) Deep learning for recognizing human activities using motions of skeletal joints. IEEE Trans Consum Electron 65(2):243–252
Pise N, Kulkarni P (2017) Evolving learners’ behavior in data mining. Evol Syst 8(4):243–259
Ramachandra S, Hoelzemann A, Van Laerhoven K (2021) Transformer networks for data augmentation of human physical activity recognition. arXiv preprint arXiv:2109.01081
Reiss A, Stricker D (2012) Introducing a new benchmarked dataset for activity monitoring. In: 2012 16th International Symposium on Wearable Computers (ISWC), pp. 108–109. IEEE
Riahi M, Eslami M, Safavi SH, Torkamani Azar F (2020) Human activity recognition using improved dynamic image. IET Image Proc 14(13):3223–3231
Sak H, Senior AW, Beaufays F (2014) Long short-term memory recurrent neural network architectures for large scale acoustic modeling. In: Fifteenth Annual Conference of the International Speech Communication Association (INTERSPEECH)
Sena J, Barreto J, Caetano C, Cramer G, Schwartz WR (2021) Human activity recognition based on smartphone and wearable sensors using multiscale dcnn ensemble. Neurocomputing 444:226–243
Singh SP, Sharma MK, Lay-Ekuakille A, Gangwar D, Gupta S (2020) Deep convlstm with self-attention for human activity decoding using wearable sensors. IEEE Sens J 21(6):8575–8582
Ullah A, Muhammad K, Del Ser J, Baik SW, de Albuquerque VHC (2018) Activity recognition using temporal optical flow convolutional features and multilayer lstm. IEEE Trans Industr Electron 66(12):9692–9702
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Advances in Neural Information Processing Systems 30
Walse KH, Dharaskar RV, Thakare VM (2016) Pca based optimal ann classifiers for human activity recognition using mobile sensors data. In: Proceedings of First International Conference on Information and Communication Technology for Intelligent Systems (ICTIS): Volume 1, pp. 429–436. Springer
Xia K, Huang J, Wang H (2020) Lstm-cnn architecture for human activity recognition. IEEE Access 8:56855–56866
Xu S, Zhang L, Huang W, Wu H, Song A (2022) Deformable convolutional networks for multimodal human activity recognition using wearable sensors. IEEE Trans Instrum Meas 71:1–14
Yao R, Lin G, Shi Q, Ranasinghe DC (2018) Efficient dense labelling of human activity sequences from wearables using fully convolutional networks. Pattern Recogn 78:252–266
Zappi P, Roggen D, Farella E, Tröster G, Benini L (2012) Network-level power-performance trade-off in wearable activity recognition: A dynamic sensor selection approach. ACM Transactions on Embedded Computing Systems (TECS) 11(3):1–30
Zeng M, Gao H, Yu T, Mengshoel OJ, Langseth H, Lane I, Liu X (2018) Understanding and improving recurrent networks for human activity recognition by continuous attention. In: Proceedings of the 2018 ACM International Symposium on Wearable Computers (ISWC), pp. 56–63
Zeng M, Nguyen LT, Yu B, Mengshoel OJ, Zhu J, Wu P, Zhang J (2014) Convolutional neural networks for human activity recognition using mobile sensors. In: 6th International Conference on Mobile Computing, Applications and Services (MobiCASE), pp. 197–205. IEEE
Zhang M, Sawchuk AA (2012) Usc-had: A daily activity dataset for ubiquitous activity recognition using wearable sensors. In: Proceedings of the 2012 ACM Conference on Ubiquitous Computing (UbiComp), pp. 1036–1043
Zhao Y, Yang R, Chevalier G, Xu X, Zhang Z (2018) Deep residual bidir-lstm for human activity recognition using wearable sensors. Mathematical Problems in Engineering 2018
Acknowledgements
This work was supported by the National Natural Science Foundation of China (61563032, 61963025), the Gansu Basic Research Innovation Group Project (18JR3RA133), and the Open Fund Project of Industrial Process Advanced Control of Gansu Province (2019KFJJ02), China.
Ethics declarations
Conflict of interest
The authors have no competing interests to declare.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, Z., Wang, W., An, A. et al. A human activity recognition method using wearable sensors based on convtransformer model. Evolving Systems 14, 939–955 (2023). https://doi.org/10.1007/s12530-022-09480-y