Article

Improving Human Activity Recognition Through 1D-ResNet: A Wearable Wristband for 14 Workout Movements

Sang-Un Kim and Joo-Yong Kim *
1 Department of Smart Wearable Engineering, Soongsil University, Seoul 06978, Republic of Korea
2 Department of Materials Science and Engineering, Soongsil University, Seoul 06978, Republic of Korea
* Author to whom correspondence should be addressed.
Processes 2025, 13(1), 207; https://doi.org/10.3390/pr13010207
Submission received: 12 December 2024 / Revised: 6 January 2025 / Accepted: 10 January 2025 / Published: 13 January 2025
(This article belongs to the Special Issue Smart Wearable Technology: Thermal Management and Energy Applications)

Abstract:
This study presents a 1D Residual Network (ResNet)-based algorithm for human activity recognition (HAR), focused on classifying 14 different workouts, which represent key exercises commonly performed in fitness training, using wearable inertial measurement unit (IMU) sensors. Unlike traditional 1D convolutional neural network (CNN) models, the proposed 1D ResNet incorporates residual blocks to prevent vanishing and exploding gradient problems, allowing for deeper networks with improved performance. The IMU sensor, placed on the wrist, provided Z-axis acceleration data, which were used to train the model. A total of 901 data samples were collected from five participants, with 600 used for training and 301 for testing. The model achieved a recognition accuracy of 97.09%, surpassing the 89.03% of a 1D CNN without residual blocks and the 92% of a cascaded 1D CNN from previous research. These results indicate that the 1D ResNet model is highly effective in recognizing a wide range of workouts. The findings suggest that wearable devices can autonomously classify human activities and provide personalized training recommendations, paving the way for AI-driven personal training systems.

1. Introduction

In recent years, human activity recognition (HAR) using wearable sensors has gained considerable attention in the wearable technology field. HAR, which involves classifying and recognizing human movements and behaviors, is being explored in a wide range of applications, including virtual reality (VR) [1,2,3,4], sports [5,6,7], healthcare [8], robotics [9,10,11], and entertainment [12,13]. In these areas, various types of sensors are utilized to track or monitor movements. Broadly, these can be categorized into vision-based sensors and wearable sensors. Vision-based sensors, such as Microsoft’s Kinect [14,15], radar [16,17], and optical devices like cameras, track movements externally by employing image processing or signal analysis techniques. In contrast, wearable sensors measure actual movements through data such as joint angles, acceleration, and gyroscopic information, often utilizing sensor fusion methods to provide more accurate motion tracking [18,19,20].
Vision-based sensors are capable of tracking human movement without limitations on the number of joint points or body parts, enabling real-time analysis and making them suitable for highly precise measurements and detailed analysis. However, there are significant limitations to this method. Optical sensors are prone to inaccuracies due to environmental factors such as lighting conditions, reflections, or obstacles, as well as the resolution limitations of the equipment itself. Consequently, the use of expensive equipment and the necessity of a controlled environment often make this approach less feasible for widespread use. Additionally, as the measurement duration increases, the volume of image processing data grows exponentially, requiring substantial computational resources and leading to increased costs and time for both measurement and analysis.
On the other hand, wearable sensors offer significant convenience: they can be minimally attached to critical joints or positions while still effectively analyzing or recognizing movements [21,22,23]. The smaller amount of data they generate allows for easy storage in compact modules or devices and enables real-time wireless transmission through technologies like Bluetooth or Wi-Fi. Furthermore, wearable sensors are less affected by external environmental factors, reducing the margin of error, which makes them particularly well suited for outdoor activities, training, and exercise. Their accessibility and ease of use are enhanced by their relatively low cost, allowing for widespread adoption. Moreover, these sensors are typically lightweight and flexible, enabling prolonged use without causing discomfort to the wearer. Because they are worn on the body, wearable sensors can also be used in specialized scenarios, such as monitoring soldiers, firefighters, or police officers in dangerous situations, driving artificial joints that move in coordination with the residual limbs of amputees, and sensing the abnormal muscle tremors caused by irregular brain signals in Parkinson's disease, as well as in everyday applications such as monitoring athletes, supporting dieting efforts, and measuring sleep patterns.
Notably, numerous studies have utilized wrist-mounted IMU sensors for human activity recognition (HAR), as the wrist exhibits significant movement during various physical activities, making it an ideal location for capturing representative HAR data. In a related study, a wearable device was developed to recognize gestures from wrist movements occurring during motion [24]. Another study focused on detecting abrupt gestures rather than repetitive ones [25]. Such studies highlight the extensive research being conducted to interpret user intentions and states from the complex data generated by the hand and wrist, which are involved in most activities [26]. In this study, taking the watch as the representative form of wrist-worn wearable product, we attached an IMU sensor to the wrist to obtain Z-axis acceleration data for recognizing 14 types of movements.
Another reason why research on human activity recognition (HAR) is receiving increased attention is the rapid advancement of recognition algorithms driven by artificial intelligence (AI) [27,28,29]. With the development of machine learning and deep learning techniques, AI can now efficiently classify and recognize the complex and nuanced features present in human movement data. This has greatly simplified the analysis and interpretation of intricate motion patterns that were previously difficult to capture and understand. AI's ability to handle large datasets and uncover hidden patterns has significantly enhanced the accuracy and efficiency of HAR systems. In previous research, a 1D CNN-based algorithm was designed to classify fitness exercises. To improve accuracy, a cascade structure was employed, in which a first stage grouped exercises into categories before the final classification; this approach improved the classification accuracy from 82% to 92% [30]. Although traditional deep learning models have demonstrated remarkable achievements across various domains, they face limitations from vanishing or exploding gradients in deep layers. These issues significantly lower training accuracy because the gradient propagated backward during training diminishes or amplifies excessively as network depth increases. Such limitations become more pronounced in deeper network architectures, posing a significant obstacle to effectively learning complex data patterns [31].
In this study, we employed ResNet, a state-of-the-art image classification model built on residual blocks, to develop a deeper human activity recognition (HAR) algorithm and enhance its accuracy. ResNet was introduced at the 2015 ImageNet Large Scale Visual Recognition Challenge (ILSVRC), where it achieved a breakthrough by extracting features through 152 layers, far deeper than the 20–30 layers used in previous algorithms, and reached an error rate of 3.57%, notably lower than the human error rate of around 5%, establishing itself as one of the most advanced recognition algorithms available [31]. The key innovation of ResNet is the residual block, which addresses the vanishing or exploding gradient problem that arises as neural networks become deeper. In traditional deep learning models, the gradient can diminish to zero or explode as it propagates back through many layers of calculations and weight adjustments. ResNet resolves this by adding the block's input (the residual) to the output of its deeper layers, a method known as a skip connection (or shortcut connection). These skip connections allow gradients to flow more easily through the network, preventing their loss or divergence. By stacking multiple residual blocks, ResNet preserves gradient flow and enables much deeper architectures, resulting in improved accuracy and more stable learning.
In summary, this study developed an algorithm to recognize 14 different workouts using the Z-axis accelerometer sensor of a wristband. To enhance accuracy, we utilized ResNet, which features a residual block structure designed to mitigate the vanishing and exploding gradient issues common in deep neural networks. A comparative analysis was conducted against existing studies, focusing on improvements in classification accuracy and model performance. The use of the Z-axis accelerometer was particularly advantageous for capturing vertical motion data, making it suitable for recognizing a wide range of workouts.

2. Materials and Methods

2.1. Z-Axis Acceleration Data for Workout Recognition

In this study, we used an inertial measurement unit (IMU) sensor as a wearable device to measure the Z-axis acceleration of the wrist. The IMU sensor is composed of a 3-axis accelerometer and a 3-axis gyroscope, providing a 6-axis measurement system. Some IMU sensors additionally include a 3-axis magnetometer, forming a 9-axis system that allows orientation to be measured relative to the Earth's magnetic field. All of these sensors are designed specifically for motion tracking. In this research, we employed the EBMotion V5.2, an IMU sensor from E2BOX (Hanam, Gyeonggi, Republic of Korea), which comprises a wireless sensor (EBIMU24GV52) and a wireless receiver (EBRCV24GV5); this setup was used to accurately capture the Z-axis acceleration data. When measuring motion related to physical activities, the Z-axis is more significant than the X and Y axes because it is the axis most affected by gravity when standing. One of the challenges with accelerometers is that it is difficult to determine the initial position of the object. When a person is standing still, the accelerometer detects gravitational acceleration along the Z-axis, resulting in a value of approximately −g; this value is recorded even when there is no motion, because the sensor continuously measures the gravitational force. During movement, or when assuming a preparatory posture for a specific activity, the Z-axis reading reflects both the gravity component, which changes with wrist orientation, and the acceleration produced by the movement itself. These variations enable the extraction of meaningful features for motion analysis. This concept is depicted in Figure 1, and the equation describing the accelerometer's reading as a function of its orientation is given below.
$$ A_z = g\cos\theta + l $$
where $A_z$ is the acceleration along the Z-axis, $g$ is the gravitational acceleration, $l$ is the acceleration contributed by the arm movement over time, and $\theta$ is the angle between $A_z$ and $g$. At the initial position, the sensor's acceleration value is determined by calculating the angle difference between the gravitational force and the sensor's Z-axis. This angle can be obtained through trigonometric functions or rotational transformations, depending on the orientation of the sensor. As the workout progresses, motion-induced acceleration is added to this initial value over time, allowing for the comprehensive tracking of dynamic movements.
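As a brief worked example (an illustrative computation, not taken from the paper): if the wrist tilts the sensor's Z-axis by $\theta = 60^\circ$ away from gravity before any arm motion occurs ($l = 0$), then

$$ A_z = g\cos\theta + l = 9.81 \times \cos 60^\circ + 0 \approx 4.91\ \mathrm{m/s^2}, $$

so roughly half of the gravitational component remains along the Z-axis; any subsequent arm acceleration $l$ is superimposed on this orientation-dependent baseline, which is what makes the Z-axis signal informative about both posture and movement.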

2.2. Data for the 14 Workouts

In this study, with reference to previous studies [30], a wristband equipped with an IMU sensor was designed, as shown in Figure 2, and worn on the right wrist to collect exercise data. The red arrow indicates the Z-axis of the sensor. A total of 14 different exercises, illustrated in Figure 3, were measured, with the initial movement of each exercise based on the action shown on the left side of its illustration. To minimize the effects of body tremors and measurement noise during movements, a low-pass filter with a cutoff frequency of 40 Hz was applied during data processing; the filter was designed to remove high-frequency noise while preserving the relevant motion signals. The sampling frequency was set to 10 Hz, and a total of 18,020 data points were collected. Data from five male participants were used in this study, with an average height of 176 cm and an average weight of 77.4 kg. Detailed information about the participants, including the variability in their physical characteristics, is provided in Table 1. A minimal MATLAB sketch of this preprocessing step is shown below.
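The sketch assumes the 40 Hz low-pass filter was applied to a faster raw sensor stream before the data were brought to the 10 Hz modeling rate; the raw rate `fsRaw`, the variable names, and the use of `lowpass`/`resample` are illustrative assumptions, not the authors' code.

```matlab
% Illustrative preprocessing sketch (assumed raw rate and variable names).
azRaw = -9.81 + 0.5*randn(1000, 1);      % placeholder raw Z-axis stream
fsRaw = 100;                             % assumed raw IMU streaming rate (Hz)
fsOut = 10;                              % sampling rate used in the study

azFilt = lowpass(azRaw, 40, fsRaw);      % suppress tremor/noise above 40 Hz
az     = resample(azFilt, fsOut, fsRaw); % downsample to 10 Hz for modeling
```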

2.3. 1D-ResNet

All algorithm structures and code in this study were designed and implemented using MATLAB R2023a. The original ResNet is designed for image classification and recognition, typically handling 2D and 3D image data. However, since the Z-axis data used in this study are one-dimensional, we modified the input data structure to accommodate 1D data by replacing the input layer with a sequence input layer. For the input data, 20 data points, corresponding to approximately 2 s of data at the 10 Hz sampling frequency and to one workout cycle, were grouped into sequences within single cells, resulting in a total of 901 sequences. The output data were labeled according to the corresponding workout type. Of the 901 data samples, 600 were randomly allocated for training and 301 for testing. This roughly 2:1 split ensures that the model has sufficient data to learn from while retaining enough data for evaluation, and the random split ensures an unbiased distribution between the training and test sets.
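The sequence construction and random split might look like the sketch below; `az` is the filtered 10 Hz signal from the previous step, and `labelList` is an assumed per-sequence label vector (in practice, windows from all subjects and workouts are pooled into the 901 sequences).

```matlab
% Sketch: group the 10 Hz Z-axis signal into 20-sample (~2 s) sequences.
winLen = 20;
nSeq   = floor(numel(az) / winLen);
X = cell(nSeq, 1);
for k = 1:nSeq
    % Each cell holds a 1-by-20 (features x time) sequence for the
    % sequence input layer.
    X{k} = reshape(az((k-1)*winLen+1 : k*winLen), 1, winLen);
end
Y = categorical(labelList);        % one of the 14 workout labels per sequence

% Random, unbiased 600/301 split (presumes the full 901-sequence pool).
idx    = randperm(numel(X));
nTrain = 600;
XTrain = X(idx(1:nTrain));     YTrain = Y(idx(1:nTrain));
XTest  = X(idx(nTrain+1:end)); YTest  = Y(idx(nTrain+1:end));
```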
The key concept of ResNet, the residual block, can be divided into a main path and a skip connection. The main path consists of convolutional layers, batch normalization layers, and ReLU layers, through which the input data undergo processing. The skip connection, in contrast, allows the initial input to bypass this processing and pass directly to the output. At the end of the block, the values from the main path and the skip connection are combined through residual mapping, which helps mitigate the vanishing gradient problem and allows deeper networks to perform better. The equation for this process is as follows:
$$ X + F(X) $$
where $X$ is the input value and $F(X)$ is the value that passes through the main path. The residual block in which this residual mapping occurs is illustrated in Figure 4.
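To make the structure concrete, the sketch below assembles one such 1D residual block in MATLAB's Deep Learning Toolbox (the environment named in Section 2.3). The layer names, the exact layer ordering inside the block, and the 1 × 1 convolution used here to match channel counts on the skip path are illustrative assumptions, not details reported in the paper.

```matlab
% A minimal 1D residual block (cf. Figure 4), assuming 64 filters as in
% the first block of the network. The serial layers form the main path;
% layerGraph connects them sequentially, ending at the addition layer.
lgraph = layerGraph([
    sequenceInputLayer(1, 'Name', 'in')             % 1-channel Z-axis input
    convolution1dLayer(3, 64, 'Padding', 'same', 'Name', 'conv1')
    batchNormalizationLayer('Name', 'bn1')
    reluLayer('Name', 'relu1')
    convolution1dLayer(3, 64, 'Padding', 'same', 'Name', 'conv2')
    batchNormalizationLayer('Name', 'bn2')
    additionLayer(2, 'Name', 'add')                 % residual mapping X + F(X)
    reluLayer('Name', 'reluOut')]);

% Skip connection: route the block input around the main path. A 1x1
% convolution (an assumption) matches the channel count at the addition.
lgraph = addLayers(lgraph, convolution1dLayer(1, 64, 'Name', 'skipConv'));
lgraph = connectLayers(lgraph, 'in', 'skipConv');
lgraph = connectLayers(lgraph, 'skipConv', 'add/in2');
```

Stacking three such blocks with 64, 128, and 256 filters, followed by the pooling and fully connected head described next, reproduces the overall shape of Figure 5.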
The 1D-ResNet model in this study, as depicted in Figure 5, consists of three residual blocks. Each convolutional layer has a kernel size of 3, with 64 filters in the first block, 128 in the second, and 256 in the third. Following the residual blocks, a global average pooling layer reduces the dimensionality while retaining important features. This is followed by two fully connected layers, a dropout layer with a dropout rate of 0.5 to prevent overfitting, and a softmax layer for calculating class probabilities. The final classification layer then recognizes the workout from these probabilities. The algorithm was trained using Root Mean Square Propagation (RMSprop) as the solver, chosen for its ability to adapt the learning rate during training, which helps improve convergence. The model was trained for 200 epochs (800 iterations) with a batch size of 128 and an initial learning rate of 0.001, decreased by a factor of 0.5 every 25 epochs, balancing learning speed and stability. Furthermore, we conducted the same training and validation processes with 3D input data comprising the X-, Y-, and Z-axis dimensions to investigate the performance differences relative to the 1D ResNet model.
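Under the same assumptions, the reported hyperparameters map onto `trainingOptions` roughly as follows, where `lgraph1D` stands for the full three-block network of Figure 5 (an assumed variable name).

```matlab
% Training configuration as reported: RMSprop, 200 epochs, batch 128,
% initial learning rate 0.001 halved every 25 epochs.
options = trainingOptions('rmsprop', ...
    'MaxEpochs', 200, ...
    'MiniBatchSize', 128, ...
    'InitialLearnRate', 1e-3, ...
    'LearnRateSchedule', 'piecewise', ...
    'LearnRateDropFactor', 0.5, ...
    'LearnRateDropPeriod', 25, ...
    'Shuffle', 'every-epoch', ...
    'ValidationData', {XTest, YTest}, ...  % used to monitor overfitting
    'Plots', 'training-progress');

net = trainNetwork(XTrain, YTrain, lgraph1D, options);
```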
To validate the performance of the trained 1D ResNet algorithm, we utilized 301 test data samples as input to generate predictions. These predictions were then used to calculate not only the overall accuracy but also key classification metrics such as precision, recall, and F1-score. These metrics were calculated as follows:
$$ \mathrm{Precision} = \frac{TP}{TP + FP} $$
$$ \mathrm{Recall} = \frac{TP}{TP + FN} $$
$$ F_1\ \mathrm{score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} $$
where the true positive (TP) refers to the number of positive samples correctly classified as positive. False positive (FP) is the number of negative samples incorrectly classified as positive, and false negative (FN) refers to the number of positive samples incorrectly classified as negative.
Precision measures the proportion of true positive predictions (TP) out of all predictions made for the positive class (TP + FP). It evaluates the model’s ability to avoid false positives. Recall, also known as sensitivity, measures the proportion of actual positive instances correctly identified (TP) out of the total actual positives (TP + FN). It evaluates the model’s ability to detect true positives. The F1-Score is the harmonic mean of precision and recall, balancing their trade-offs. It is particularly useful when dealing with imbalanced datasets.
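All three metrics follow directly from the confusion matrix; a short MATLAB sketch with illustrative variable names (the trained `net` and the test set come from the steps above):

```matlab
% Per-class precision, recall, and F1 from the test predictions.
YPred = classify(net, XTest);        % predicted workout labels
C  = confusionmat(YTest, YPred);     % rows: actual class, columns: predicted
tp = diag(C);                        % true positives per class

precision = tp ./ sum(C, 1)';        % TP ./ (TP + FP): column sums
recall    = tp ./ sum(C, 2);         % TP ./ (TP + FN): row sums
f1        = 2 * (precision .* recall) ./ (precision + recall);
accuracy  = sum(tp) / sum(C(:));     % overall recognition accuracy
```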
Finally, to verify the reliability of the algorithm modeling method and its applicability to unseen users, we excluded the 173 data samples of one participant (subject 2) who performed the 14 workout movements. The remaining 728 data samples from the other four participants were used for training, and the fresh data from subject 2 were used for testing.
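A sketch of this subject-held-out split follows, assuming a hypothetical `subjectID` vector mapping each of the 901 sequences to its subject (not part of any published code); the validation data in `options` would be swapped for the held-out set accordingly.

```matlab
% Fresh-test sketch: train on four subjects, test on subject 2.
testMask   = (subjectID == 2);                          % 173 sequences
XTrainLOSO = X(~testMask);  YTrainLOSO = Y(~testMask);  % 728 sequences
XTestLOSO  = X(testMask);   YTestLOSO  = Y(testMask);

netLOSO  = trainNetwork(XTrainLOSO, YTrainLOSO, lgraph1D, options);
accFresh = mean(classify(netLOSO, XTestLOSO) == YTestLOSO);
```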

3. Results and Discussion

3.1. Evaluation of 1D ResNet

Figure 6 presents the Z-axis acceleration data from a single sequence for the bench press exercise, recorded from subject 1 and used as input for the ResNet model in this study. With a sampling frequency of 10 Hz and an approximate workout duration of 2 s, each sequence is composed of 20 data points. For the bench press, the upward and downward movements of the bar, driven by the hands and arms, are clearly captured in the Z-axis data. This highlights the suitability of the Z-axis acceleration for accurately reflecting the motion dynamics of this exercise.
As illustrated in Figure 7, the training accuracy of both the 1D ResNet and 3D ResNet models was evaluated using the 600 training data samples across the training epochs. To monitor overfitting, the validation accuracy was also measured and compared using the 301 test data samples. While the 3D ResNet model achieved 100% training accuracy by epoch 200 and maintained it throughout, its validation accuracy fluctuated between 80% and 90%, ultimately stabilizing at 87.02%. This discrepancy can be attributed to the increased information complexity in the 3D data, which allowed the model to create more nuanced classifications for training data but led to overfitting, as evident from the lower validation accuracy compared to training accuracy. In contrast, the 1D ResNet model developed in this study also achieved a final training accuracy of 100% while maintaining a validation accuracy of 97.02%, closely matching its training performance. These results indicate that the 1D ResNet model achieved higher accuracy while effectively minimizing overfitting. This suggests that for workout data, which inherently include noise and human error, a simpler yet efficient model like 1D ResNet, which focuses on essential information rather than excessive data, is more suitable for classification tasks.
Figure 8 presents the confusion matrix, which compares the workout predictions of the trained 1D ResNet model on the final 301 test data samples with the actual workout labels. The confusion matrix helps visualize how well the model predicts each workout category. First, misclassifications were observed among workouts (a), (b), and (c), the Bench press, Incline bench press, and Dumbbell shoulder press. Although these exercises involve different degrees of upper-body inclination, the arm movements are highly similar, which likely contributed to the higher misclassification rate. Nevertheless, with a minimum accuracy of 88.2%, the model's performance is still sufficient for reliable workout recognition. Secondly, workouts (j) and (k), the Dumbbell kickback and One-arm dumbbell row, also had some misclassifications. The primary differences between these exercises involve whether both arms are engaged and whether the foot is elevated during the movement. Despite the similarity in arm motion, the misclassification rate remained low, with a maximum error rate of only 6.9%, suggesting that the algorithm is adequately robust for workout recognition.
Table 2 presents the classification metrics, including precision, recall, and F1-score, for each of the 14 workout categories evaluated in this study using the 1D ResNet model. The results demonstrate the model’s high effectiveness in recognizing diverse workout types. The average precision, recall, and F1-score across all categories were 0.98, 0.97, and 0.97, respectively, indicating consistent performance across various exercises. Notably, exercises such as (d) Dumbbell triceps extension, (j) Dumbbell kickback, and (k) One-arm dumbbell row achieved perfect scores of 1.0 across all metrics, showcasing the model’s robustness in identifying these movements.
Figure 9 and Table 3 present the results obtained using the same algorithm modeling method but excluding the data of one participant (subject 2) from the total of five. Compared to training with the complete dataset, the accuracy decreased to 90.17%. The confusion matrix, which compares actual and predicted classifications, revealed that while the confusion among exercises (a), (b), and (c) remained, additional confusion appeared between exercises (d) and (m). This confusion is presumed to stem from the similar degree of arm rotation in the two exercises, despite their differing directions, and may have been further compounded by human error. As shown in Table 3, precision, recall, and F1-score decreased overall but remained at approximately 90%. Considering that this represents performance on entirely new data, this level is deemed sufficiently high.

3.2. Comparison with Previous Research

Finally, the accuracy of the 1D ResNet model for workout recognition in this study was compared with previous studies and a 1D CNN model without the residual block structure to assess the impact of the residual block on performance. This comparison highlights how the addition of residual blocks improves the model’s ability to recognize workouts accurately. Although the 1D CNN without residual mapping had the same number of learning layers, its recognition accuracy was 89.03%. The cascaded 1D CNN structure from previous research achieved a higher accuracy of 92% [30]. However, the 1D ResNet model developed in this study achieved the highest recognition accuracy of 97.09%, demonstrating that the inclusion of residual mapping significantly improves model performance compared to both previous approaches.
The input values passed through residual mapping help retain crucial information from the initial IMU data related to the starting posture, minimizing the loss of valuable input features. Moreover, by preventing gradient vanishing and exploding issues in deep learning layers, the 1D ResNet model demonstrated higher accuracy in recognizing 14 different exercises compared to previous research that utilized a traditional 1D CNN algorithm. This shows that the use of residual blocks improves both information retention and overall model performance.

4. Conclusions

In summary, this study developed an algorithm for classifying 14 different workouts within the field of human activity recognition (HAR) using artificial intelligence in the wearable technology sector. By leveraging ResNet, which has demonstrated low error rates in recent studies, this research achieved substantial improvements in accuracy compared to previous algorithms. These findings suggest the potential for AI-powered personal training systems, where wearable devices can autonomously recognize user activities, gather data, and provide tailored workout techniques and plans to users. This system would enable more efficient and personalized training without requiring direct user input, paving the way for more advanced AI-driven personal training solutions.
Future research will focus on developing algorithms capable of classifying a wider variety of exercises and on enhancing the model's applicability by incorporating data from a larger and more diverse population. These efforts aim to improve the robustness and generalizability of the system, paving the way for broader adoption and effectiveness in real-world applications.

Author Contributions

As the corresponding author, J.-Y.K. was responsible for the overall structure of the work, while S.-U.K. was responsible for reviewing, supervising, experiment design, and modeling. S.-U.K. was also responsible for the data collection, processing, and material selection. S.-U.K. and J.-Y.K. drafted the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partly funded by the Technology Innovation Program (or Industrial Strategic Technology Development Program-Materials/Parts Package Type) (20016038, development of the textile-IT converged digital sensor modules for smart wear to monitor bio and activity signal in exercise, and KS standard) funded by the Ministry of Trade, Industry and Energy (MOTIE, Korea) and the Korea Institute for Advancement of Technology (KIAT) grant funded by the Korean Government (MOTIE) (P0012770).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Fangbemi, A.S.; Liu, B.; Yu, N.H.; Zhang, Y. Efficient human action recognition interface for augmented and virtual reality applications based on binary descriptor. In Augmented Reality, Virtual Reality, and Computer Graphics, Proceedings of the 5th International Conference, AVR 2018, Otranto, Italy, 24–27 June 2018, Proceedings, Part I; Springer International Publishing: Cham, Switzerland, 2018; pp. 252–260.
2. Xia, C.; Sugiura, Y. Optimizing sensor position with virtual sensors in human activity recognition system design. Sensors 2021, 21, 6893.
3. Xiao, F.; Pei, L.; Chu, L.; Zou, D.; Yu, W.; Zhu, Y.; Li, T. A deep learning method for complex human activity recognition using virtual wearable sensors. In Spatial Data and Intelligence, Proceedings of the First International Conference, SpatialDI 2020, Virtual Event, 8–9 May 2020, Proceedings; Springer International Publishing: Cham, Switzerland, 2021; pp. 261–270.
4. Jeyakumar, J.V.; Lai, L.; Suda, N.; Srivastava, M. SenseHAR: A robust virtual activity sensor for smartphones and wearables. In Proceedings of the 17th Conference on Embedded Networked Sensor Systems, New York, NY, USA, 10–13 November 2019; pp. 15–28.
5. Schuldhaus, D. Human Activity Recognition in Daily Life and Sports Using Inertial Sensors; FAU University Press: Boca Raton, FL, USA, 2019.
6. Host, K.; Ivašić-Kos, M. An overview of Human Action Recognition in sports based on Computer Vision. Heliyon 2022, 8, e09633.
7. Pajak, G.; Krutz, P.; Patalas-Maliszewska, J.; Rehm, M.; Pajak, I.; Dix, M. An approach to sport activities recognition based on an inertial sensor and deep learning. Sens. Actuators A Phys. 2022, 345, 113773.
8. Bibbò, L.; Vellasco, M.M. Human activity recognition (HAR) in healthcare. Appl. Sci. 2023, 13, 13009.
9. Frank, A.E.; Kubota, A.; Riek, L.D. Wearable activity recognition for robust human-robot teaming in safety-critical environments via hybrid neural networks. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3–8 November 2019; pp. 449–454.
10. Jaramillo, I.E.; Jeong, J.G.; Lopez, P.R.; Lee, C.-H.; Kang, D.-Y.; Ha, T.-J.; Oh, J.-H.; Jung, H.; Lee, J.H.; Lee, W.H. Real-time human activity recognition with IMU and encoder sensors in wearable exoskeleton robot via deep learning networks. Sensors 2022, 22, 9690.
11. Martínez-Villaseñor, L.; Ponce, H. A concise review on sensor signal acquisition and transformation applied to human activity recognition and human–robot interaction. Int. J. Distrib. Sens. Netw. 2019, 15, 1550147719853987.
12. Hoelzemann, A.; Romero, J.L.; Bock, M.; Laerhoven, K.V.; Lv, Q. Hang-time HAR: A benchmark dataset for basketball activity recognition using wrist-worn inertial sensors. Sensors 2023, 23, 5879.
13. Wang, Z.; Wu, D.; Chen, J.; Ghoneim, A.; Hossain, M.A. A triaxial accelerometer-based human activity recognition via EEMD-based features and game-theory-based feature selection. IEEE Sens. J. 2016, 16, 3198–3207.
14. Zhang, Z. Microsoft kinect sensor and its effect. IEEE Multimed. 2012, 19, 4–10.
15. Han, J.; Shao, L.; Xu, D.; Shotton, J. Enhanced computer vision with microsoft kinect sensor: A review. IEEE Trans. Cybern. 2013, 43, 1318–1334.
16. Li, X.; He, Y.; Jing, X. A survey of deep learning-based human activity recognition in radar. Remote Sens. 2019, 11, 1068.
17. Zhu, S.; Guendel, R.G.; Yarovoy, A.; Fioranelli, F. Continuous human activity recognition with distributed radar sensor networks and CNN–RNN architectures. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5115215.
18. Mohammadzadeh, F.F.; Liu, S.; Bond, K.A.; Nam, C.S. Feasibility of a wearable, sensor-based motion tracking system. Procedia Manuf. 2015, 3, 192–199.
19. Longo, U.G.; De Salvatore, S.; Sassi, M.; Carnevale, A.; De Luca, G.; Denaro, V. Motion tracking algorithms based on wearable inertial sensor: A focus on shoulder. Electronics 2022, 11, 1741.
20. Rana, M.; Mittal, V. Wearable sensors for real-time kinematics analysis in sports: A review. IEEE Sens. J. 2020, 21, 1187–1207.
21. Poitras, I.; Dupuis, F.; Bielmann, M.; Campeau-Lecours, A.; Mercier, C.; Bouyer, L.J.; Roy, J.-S. Validity and reliability of wearable sensors for joint angle estimation: A systematic review. Sensors 2019, 19, 1555.
22. Bakhshi, S.; Mahoor, M.H. Development of a wearable sensor system for measuring body joint flexion. In Proceedings of the 2011 International Conference on Body Sensor Networks, Dallas, TX, USA, 23–25 May 2011; pp. 35–40.
23. Teague, C.N.; Heller, J.A.; Nevius, B.N.; Carek, A.M.; Mabrouk, S.; Garcia-Vicente, F.; Inan, O.T.; Etemadi, M. A wearable, multimodal sensing system to monitor knee joint health. IEEE Sens. J. 2020, 20, 10323–10334.
24. Zhao, H.; Ma, Y.; Wang, S.; Watson, A.; Zhou, G. MobiGesture: Mobility-aware hand gesture recognition for healthcare. Smart Health 2018, 9, 129–143.
25. Digo, E.; Polito, M.; Pastorelli, S.; Gastaldi, L. Detection of upper limb abrupt gestures for human–machine interaction using deep learning techniques. J. Braz. Soc. Mech. Sci. Eng. 2024, 46, 227.
26. Rivera, P.; Valarezo, E.; Choi, M.-T.; Kim, T.-S. Recognition of human hand activities based on a single wrist IMU using recurrent neural networks. Int. J. Pharma Med. Biol. Sci. 2017, 6, 114–118.
27. Ayvaz, U.; Elmoughni, H.; Atalay, A.; Atalay, Ö.; Ince, G. Real-time human activity recognition using textile-based sensors. In Proceedings of the EAI International Conference on Body Area Networks, Tallinn, Estonia, 25–26 December 2020; pp. 168–183.
28. Zhang, S.; Li, Y.; Zhang, S.; Shahabi, F.; Xia, S.; Deng, Y.; Alshurafa, N. Deep learning in human activity recognition with wearable sensors: A review on advances. Sensors 2022, 22, 1476.
29. Mani, N.; Haridoss, P.; George, B. Evaluation of a Combined Conductive Fabric-Based Suspender System and Machine Learning Approach for Human Activity Recognition. IEEE Open J. Instrum. Meas. 2023, 2, 2500310.
30. Koo, B.; Nguyen, N.T.; Kim, J. Identification and Classification of Human Body Exercises on Smart Textile Bands by Combining Decision Tree and Convolutional Neural Networks. Sensors 2023, 23, 6223.
31. Shafiq, M.; Gu, Z. Deep residual learning for image recognition: A survey. Appl. Sci. 2022, 12, 8972.
Figure 1. The sensing mechanism for workout recognition using the Z-axis acceleration of an IMU sensor under gravity.
Figure 2. The wristband with the IMU sensor; the red arrows indicate the Z-axis of the sensor coordinate system.
Figure 3. The 14 workouts chosen for recognition: (a) Bench press, (b) Incline bench press, (c) Dumbbell shoulder press, (d) Dumbbell triceps extension, (e) Dumbbell kick, (f) Dumbbell front raise, (g) Lat pull down, (h) Straight arm lat pull down, (i) Deadlift, (j) Dumbbell bent row, (k) One-arm dumbbell row, (l) EZ-bar curls, (m) Machine preacher curl, (n) Seated dumbbell lateral raise [30].
Figure 4. The main path and skip connection of the residual block in ResNet.
Figure 5. The architecture of the 1D ResNet.
Figure 6. The Z-axis acceleration data of the bench press for subject 1.
Figure 7. The training and validation accuracy of (a) the 1D ResNet and (b) the 3D ResNet.
Figure 8. The confusion matrix of the 1D ResNet workout recognition.
Figure 9. The confusion matrix of the 1D ResNet workout recognition on the fresh test.
Table 1. The characteristics of the five subjects who performed the 14 workouts.

Subject     Height (cm)   Weight (kg)
Subject 1   170           75
Subject 2   183           80
Subject 3   180           78
Subject 4   173           80
Subject 5   174           74
Table 2. The metrics of the 1D ResNet model.

Workout   Precision   Recall   F1-Score
(a)       1           0.94     0.97
(b)       0.93        0.93     0.93
(c)       0.90        0.96     0.93
(d)       1           1        1
(e)       1           1        1
(f)       1           1        1
(g)       1           0.88     0.94
(h)       0.86        0.95     0.90
(i)       0.96        0.96     0.96
(j)       1           1        1
(k)       1           1        1
(l)       1           1        1
(m)       1           1        1
(n)       1           1        1
Average   0.98 *      0.97 *   0.97 *
* The average value of each metric.
Table 3. The metrics of the 1D ResNet model on the fresh test.

Workout   Precision   Recall   F1-Score
(a)       0.81        0.93     0.87
(b)       0.69        0.85     0.76
(c)       0.89        0.53     0.67
(d)       1           0.45     0.62
(e)       1           1        1
(f)       1           1        1
(g)       0.75        1        0.86
(h)       1           1        1
(i)       1           1        1
(j)       1           0.91     0.95
(k)       0.93        1        0.97
(l)       1           1        1
(m)       0.68        1        0.81
(n)       1           1        1
Average   0.91 *      0.91 *   0.89 *
* The average value of each metric.
