Image-Acceleration Multimodal Danger Detection Model on Mobile Phone for Phone Addicts
Figure 1. Framework of the multimodal phubbing danger detection system.
Figure 2. Examples of the nine dangerous states defined in this paper. (a) Walking and zebra crossing. (b) Climbing stairs. (c) Walking and wet surface. (d) Walking and darkness. (e) Climbing stairs and darkness. (f) Static and zebra crossing. (g) Static and stairs. (h) Static and wet surface. (i) Static and darkness.
Figure 3. Rules for pairing environment images with sensor data. In the sensor data, the red, green, and blue lines represent the X-, Y-, and Z-axes, respectively.
Figure 4. Data collection sample images. The first row shows the actual scene in which the user is located, while the second and third rows show the corresponding sensor time series and environmental real-life images. The X-, Y-, and Z-axis sensor time series are drawn in red, green, and blue, respectively. (a) Climbing stairs. (b) Walking and wet surface. (c) Walking and zebra crossing. (d) Walking and darkness.
Figure 5. Multimodal phubbing danger state recognition network.
Figure 6. Sensor curves and GADF pseudo-images corresponding to different behaviors. (a) Sensor curve and X-Y-Z pseudo-image when going up and down stairs. (b) Sensor curve and X-Y-Z pseudo-image when walking. (c) Sensor curve and X-Y-Z pseudo-image when stationary.
Figure 7. Confusion matrices of test results with GADF inputs X-Z and X-Y-Z.
Figure 8. Mobile application user interface.
Figure 9. Examples of online test results for the mobile phone app. The left side of each panel shows the real-time display of the app, while the right side shows photos of the user's actual activity state and environment. (a) Walking and wet surface. (b) Walking and zebra crossing. (c) Going upstairs. (d) Going downstairs. (e) Static and darkness. (f) Sitting and browsing on the phone.
Abstract
1. Introduction
- A mobile-end multimodal "phubbing" danger monitoring dataset pairing rear-view images with gravity-acceleration data was constructed. By combining smartphone users' behaviors with surrounding environmental information, a lightweight image-acceleration multimodal "phubbing" danger perception network model for mobile devices is proposed.
- The proposed lightweight multimodal network model has been deployed on Android devices. Online experiments demonstrate that the model identifies danger states with high accuracy and runs at fast speed, showing strong potential for practical adoption.
- User behavior analysis results based on acceleration data control when the phone's camera is switched on, thereby reducing the phone's battery consumption (a minimal sketch of this gating idea follows this list).
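A minimal sketch of the acceleration-gated camera idea, assuming a simple variance threshold on the acceleration magnitude decides whether the user is moving; the window length and threshold below are illustrative values, not the paper's:

```python
import numpy as np

def should_enable_camera(accel_window: np.ndarray, var_threshold: float = 0.5) -> bool:
    """Decide whether to switch the rear camera on.

    accel_window: (N, 3) array of recent X/Y/Z gravity-acceleration samples.
    Returns True only when the variance of the acceleration magnitude
    suggests the user is moving (walking, climbing stairs), so the camera
    and the image branch stay off while the user is static.
    """
    magnitude = np.linalg.norm(accel_window, axis=1)  # per-sample |a|
    return float(np.var(magnitude)) > var_threshold

# Example: a static user produces a near-constant magnitude -> camera stays off.
static = np.random.normal([0.0, 0.0, 9.8], 0.02, size=(128, 3))
print(should_enable_camera(static))  # False
```

Gating on a cheap accelerometer statistic means the power-hungry camera pipeline only runs when the behavior branch already indicates motion.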
2. Related Works
3. Multimodal Phubbing Danger Detection Dataset
4. Multimodal Phubbing Danger State Recognition Network
4.1. Surrounding Environment Feature Extraction Module
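As a hedged sketch of a lightweight environment branch, the snippet below uses a MobileNetV2 backbone (MobileNets are cited in the references [20]) truncated before its classifier as the image feature extractor; the backbone choice and the 1280-dimensional output are assumptions, not the paper's confirmed design.

```python
import torch
import torchvision.models as models

# Assumption: a MobileNetV2 backbone with the classifier head removed;
# in practice pretrained ImageNet weights would normally be loaded.
backbone = models.mobilenet_v2()
env_extractor = torch.nn.Sequential(
    backbone.features,              # convolutional feature maps
    torch.nn.AdaptiveAvgPool2d(1),  # global average pooling
    torch.nn.Flatten(),             # -> (batch, 1280) feature vector
)

rgb = torch.randn(1, 3, 224, 224)   # one rear-camera image
env_feat = env_extractor(rgb)
print(env_feat.shape)               # torch.Size([1, 1280])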
4.2. Motion Feature Extraction Module
4.2.1. GAF Image Encoding and Feature Extraction
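A minimal NumPy sketch of the Gramian Angular Difference Field (GADF) encoding of Wang and Oates [21], which maps a 1D acceleration window to a 2D pseudo-image; the window length is illustrative.

```python
import numpy as np

def gadf(series: np.ndarray) -> np.ndarray:
    """Encode a 1D time series as a Gramian Angular Difference Field.

    Steps: rescale to [-1, 1], map each value to a polar angle
    phi = arccos(x), then build GADF[i, j] = sin(phi_i - phi_j).
    """
    x = series.astype(np.float64)
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1  # rescale to [-1, 1]
    x = np.clip(x, -1.0, 1.0)                        # guard rounding errors
    phi = np.arccos(x)                               # polar encoding
    # sin(phi_i - phi_j) = x_j*sqrt(1-x_i^2) - x_i*sqrt(1-x_j^2)
    return np.sin(phi[:, None] - phi[None, :])

accel_x = np.sin(np.linspace(0, 4 * np.pi, 64))      # toy X-axis window
img = gadf(accel_x)                                  # (64, 64) pseudo-image
print(img.shape)
```

Stacking the encodings of several axes (e.g., X, Y, Z) channel-wise yields the multi-channel pseudo-images compared in Section 5.2.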
4.2.2. Statistical Feature Extraction
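The exact statistics used are not reproduced here; the sketch below assumes common time-domain descriptors (per-axis mean, standard deviation, min, max, and RMS) as the hand-crafted feature set.

```python
import numpy as np

def statistical_features(window: np.ndarray) -> np.ndarray:
    """Hand-crafted time-domain features for one sensor window.

    window: (N, 3) X/Y/Z acceleration samples. The chosen statistics
    are illustrative assumptions, not necessarily the paper's exact set.
    """
    feats = [
        window.mean(axis=0),
        window.std(axis=0),
        window.min(axis=0),
        window.max(axis=0),
        np.sqrt((window ** 2).mean(axis=0)),  # root mean square
    ]
    return np.concatenate(feats)              # -> (15,) feature vector

window = np.random.randn(128, 3)
print(statistical_features(window).shape)     # (15,)
```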
4.2.3. Fusion of Deep Features and Statistical Features
4.3. Multimodal Feature Fusion for Surrounding Environment and Behavior
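Gated convolutions [22] and squeeze-and-excitation networks [23] appear in the references, so one plausible (assumed, not confirmed) fusion is to concatenate the environment and motion feature vectors and reweight the channels with an SE-style gate; all dimensions below are placeholders.

```python
import torch
import torch.nn as nn

class SEFusion(nn.Module):
    """Concatenate two modality vectors, then reweight channels with a
    squeeze-and-excitation-style gate (after Hu et al. [23]).
    Dimensions are placeholders, not the paper's values."""

    def __init__(self, env_dim: int = 1280, motion_dim: int = 256, reduction: int = 16):
        super().__init__()
        fused = env_dim + motion_dim
        self.gate = nn.Sequential(
            nn.Linear(fused, fused // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(fused // reduction, fused),
            nn.Sigmoid(),
        )

    def forward(self, env_feat: torch.Tensor, motion_feat: torch.Tensor) -> torch.Tensor:
        z = torch.cat([env_feat, motion_feat], dim=1)  # channel concatenation
        return z * self.gate(z)                        # per-channel reweighting

fusion = SEFusion()
out = fusion(torch.randn(4, 1280), torch.randn(4, 256))
print(out.shape)  # torch.Size([4, 1536])
```

The same concatenation pattern applies to merging the GADF deep features with the statistical features in Section 4.2.3.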
4.4. Loss Function and Optimizer
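For a nine-class classification task, a standard (assumed) setup is cross-entropy loss with the Adam optimizer; the learning rate and classifier head below are illustrative stand-ins, not the paper's reported settings.

```python
import torch
import torch.nn as nn

num_classes = 9                                    # nine dangerous states (Figure 2)
model = nn.Linear(1536, num_classes)               # stand-in for the fused classifier head

criterion = nn.CrossEntropyLoss()                  # multi-class classification loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # lr is illustrative

fused = torch.randn(4, 1536)                       # a batch of fused features
labels = torch.randint(0, num_classes, (4,))
loss = criterion(model(fused), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```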
5. Experimental Results and Analysis
5.1. Experimental Environment and Evaluation Metrics
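The experiments report Accuracy and F1-score; for reference, the standard definitions in terms of true/false positives and negatives are given below (per-class scores are averaged over the nine classes; the averaging scheme is not specified here):

```latex
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \quad
\mathrm{Precision} = \frac{TP}{TP + FP}, \quad
\mathrm{Recall} = \frac{TP}{TP + FN}, \quad
F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
```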
5.2. Experimental Validation of Input Signal Effectiveness for GADF Pseudo-Image Generator
5.3. Ablation Experiment
5.4. Comparative Experiment
5.5. System Performance Testing Based on Mobile Smartphones
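The paper reports deployment on an Android phone with a PyTorch-trained model; one plausible export path, assuming TorchScript with PyTorch Mobile (the paper's actual toolchain is not stated), is sketched below.

```python
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

# Stand-in for the trained network; any nn.Module with a fixed input works.
model = torch.nn.Sequential(torch.nn.Linear(1536, 9))
model.eval()

example = torch.randn(1, 1536)
scripted = torch.jit.trace(model, example)       # trace to TorchScript
mobile_ready = optimize_for_mobile(scripted)     # mobile-specific graph passes
mobile_ready._save_for_lite_interpreter("danger_detector.ptl")
```

The saved `.ptl` file can then be bundled into the Android app and executed with the PyTorch Mobile lite interpreter.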
6. Conclusions
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- De-Sola Gutiérrez, J.; Rodríguez de Fonseca, F.; Rubio, G. Cell-phone addiction: A review. Front. Psychiatry 2016, 7, 216511. [Google Scholar] [CrossRef] [PubMed]
- Gangadharan, N.; Borle, A.L.; Basu, S.; Navya, G.; Borle, A.L. Mobile phone addiction as an emerging behavioral form of addiction among adolescents in India. Cureus 2022, 14, e23798. [Google Scholar] [CrossRef] [PubMed]
- Alshahrani, A.; Samy Abdrabo, M.; Aly, S.M.; Alshahrani, M.S.; Alqhtani, R.S.; Asiri, F.; Ahmad, I. Effect of smartphone usage on neck muscle endurance, hand grip and pinch strength among healthy college students: A cross-sectional study. Int. J. Environ. Res. Public Health 2021, 18, 6290. [Google Scholar] [CrossRef] [PubMed]
- Liu, X.; Tian, R.; Liu, H.; Bai, X.; Lei, Y. Exploring the Impact of Smartphone Addiction on Risk Decision-Making Behavior among College Students Based on fNIRS Technology. Brain Sci. 2023, 13, 1330. [Google Scholar] [CrossRef] [PubMed]
- Travieso-González, C.M.; Alonso-Hernández, J.B.; Canino-Rodríguez, J.M.; Pérez-Suárez, S.T.; Sánchez-Rodríguez, D.D.L.C.; Ravelo-García, A.G. Robust detection of fatigue parameters based on infrared information. IEEE Access 2021, 9, 18209–18221. [Google Scholar] [CrossRef]
- Jia, M.; Yang, L.; Chen, M. An SEMG-JASA evaluation model for the neck fatigue of subway phubbers. CAAI Trans. Intell. Syst. 2020, 15, 705–713. [Google Scholar]
- Zhuang, Y.; Fang, Z. Smartphone zombie context awareness at crossroads: A multi-source information fusion approach. IEEE Access 2020, 8, 101963–101977. [Google Scholar] [CrossRef]
- Shi, D.; Xiao, F. Study on driving behavior detection method based on improved long and short-term memory network. Automot. Eng 2021, 43, 1203–1209. [Google Scholar]
- Goh, H.; Kim, W.; Han, J.; Han, K.; Noh, Y. Smombie forecaster: Alerting smartphone users about potential hazards in their surroundings. IEEE Access 2020, 8, 153183–153191. [Google Scholar] [CrossRef]
- Bi, H.; Liu, J. CSEar: Metalearning for Head Gesture Recognition Using Earphones in Internet of Healthcare Things. IEEE Internet Things J. 2022, 9, 23176–23187. [Google Scholar] [CrossRef]
- Li, H.; Shrestha, A.; Heidari, H.; Le Kernec, J.; Fioranelli, F. Bi-LSTM network for multimodal continuous human activity recognition and fall detection. IEEE Sens. J. 2019, 20, 1191–1201. [Google Scholar] [CrossRef]
- Kim, D.; Han, K.; Sim, J.S.; Noh, Y. Smombie Guardian: We watch for potential obstacles while you are walking and conducting smartphone activities. PLoS ONE 2018, 13, e0197050. [Google Scholar] [CrossRef] [PubMed]
- Kim, H.S.; Kim, G.H.; Cho, Y.Z. Prevention of smombie accidents using deep learning-based object detection. ICT Express 2022, 8, 618–625. [Google Scholar] [CrossRef]
- Sun, C.; Su, J.; Shi, Z.; Guan, Y. P-Minder: A CNN based sidewalk segmentation approach for phubber safety applications. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 4160–4164. [Google Scholar]
- Sun, C. Improvements for pedestrian safety application P-Minder. EURASIP J. Adv. Signal Process. 2022, 2022, 105. [Google Scholar] [CrossRef]
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778. [Google Scholar]
- Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708. [Google Scholar]
- Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861. [Google Scholar]
- Wang, Z.; Oates, T. Imaging time-series to improve classification and imputation. arXiv 2015, arXiv:1506.00327. [Google Scholar]
- Dauphin, Y.N.; Fan, A.; Auli, M.; Grangier, D. Language modeling with gated convolutional networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; PMLR 2017. pp. 933–941. [Google Scholar]
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141. [Google Scholar]
- Hu, F.; Wang, H.; Feng, N.; Zhou, B.; Wei, C.; Lu, Y.; Qi, Y.; Jia, X.; Tang, H.; Gouda, M.A. A novel fusion strategy for locomotion activity recognition based on multimodal signals. Biomed. Signal Process. Control 2021, 67, 102524. [Google Scholar] [CrossRef]
- Zhuo, S.; Sherlock, L.; Dobbie, G.; Koh, Y.S.; Russello, G.; Lottridge, D. Real-time smartphone activity classification using inertial sensors—Recognition of scrolling, typing, and watching videos while sitting or walking. Sensors 2020, 20, 655. [Google Scholar] [CrossRef]
- Qin, Z.; Zhang, Y.; Meng, S.; Qin, Z.; Choo, K.K.R. Imaging and fusing time series for wearable sensor-based human activity recognition. Inf. Fusion 2020, 53, 80–87. [Google Scholar] [CrossRef]
- Kosar, E.; Barshan, B. A new CNN-LSTM architecture for activity recognition employing wearable motion sensor data: Enabling diverse feature extraction. Eng. Appl. Artif. Intell. 2023, 124, 106529. [Google Scholar] [CrossRef]
| Behavior | V | X-Y | X-Z | Y-Z | X-Y-Z |
|---|---|---|---|---|---|
| Climbing up and down stairs | 112.63 | 52.28 | 79.38 | 8.46 | 232.29 |
| Walking | 96.23 | 52.53 | 47.40 | 1.13 | 193.20 |
| Static | 95.02 | 52.02 | 66.36 | 0.93 | 209.66 |
| Server Parameters | Specifications | Mobile Device Parameters | Specifications |
|---|---|---|---|
| CPU | i5-12490F | Type | HUAWEI Mate 30 (HUAWEI, Shenzhen, China) |
| GPU | NVIDIA RTX 3060 | Processor | Kirin 990 5G |
| RAM | 16 GB | RAM | 8 GB |
| Operating system | Windows 10 | Operating system | Android 12 |
| IDE | PyCharm 2022.3.2 | -- | -- |
| Code language | Python 3.8.0 | -- | -- |
| Network framework | PyTorch 1.11.0 | -- | -- |
| CUDA | 11.3 | -- | -- |
| cuDNN | 8.2.0 | -- | -- |
| Input to GADF | Accuracy | F1-Score |
|---|---|---|
| X | 0.8750 | 0.9132 |
| Y | 0.9301 | 0.9586 |
| Z | 0.9396 | 0.9603 |
| V | 0.9479 | 0.9708 |
| X-Y | 0.8823 | 0.9184 |
| X-Z | 0.9574 | 0.9758 |
| Y-Z | 0.9135 | 0.9411 |
| X-Y-Z | 0.9598 | 0.9775 |
| Input | Accuracy | F1-Score |
|---|---|---|
| RGB | 0.5398 | 0.5585 |
| GADF | 0.5917 | 0.6149 |
| Statistical features | 0.5066 | 0.5265 |
| GADF + statistical features | 0.6151 | 0.6253 |
| RGB + GADF | 0.9210 | 0.9521 |
| RGB + statistical features | 0.8865 | 0.9016 |
| Ours | 0.9598 | 0.9775 |
| Type | Input | Model | Accuracy | F1-Score | Parameters | Time (s) | Size (MB) |
|---|---|---|---|---|---|---|---|
| Single-modal | ATS | SMCNN [24] | 0.8983 | 0.9117 | 11,190,537 | 0.12 | 42.76 |
| Single-modal | ATS | Extra-Trees [25] | 0.7765 | 0.7894 | 2,767,110 | 0.05 | 25.41 |
| Single-modal | ATS | Dempster–Shafer [7] | 0.8759 | 0.8893 | 3,525,126 | 0.06 | 27.10 |
| Multimodal | RGB + GADF | ResNet [26] | 0.9635 | 0.9792 | 22,362,249 | 0.47 | 85.46 |
| Multimodal | RGB + ATS | 2D CNN-LSTM [27] | 0.9536 | 0.9678 | 12,376,284 | 0.31 | 56.96 |
| Multimodal | RGB + GADF + statistical features | Ours | 0.9598 | 0.9775 | 6,874,567 | 0.08 | 28.36 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wang, H.; Ji, X.; Jin, L.; Ji, Y.; Wang, G. Image-Acceleration Multimodal Danger Detection Model on Mobile Phone for Phone Addicts. Sensors 2024, 24, 4654. https://doi.org/10.3390/s24144654