MCSM-Wri: A Small-Scale Motion Recognition Method Using WiFi Based on Multi-Scale Convolutional Neural Network
Figure 1. Overview structure of MCSM-Wri. CNN = convolutional neural network; CSI = channel state information.
Figure 2. The raw and pre-processed CSI phases.
Figure 3. Structure and parameter settings of the convolutional neural network (CNN).
Figure 4. The schematic diagram of the layers of a CNN. (a) An example of a convolution layer with a 3 × 3 kernel; (b) an example of an average-pooling layer; (c) the structure of the fully-connected layer, softmax layer, and classification layer.
Figure 5. Impact of the batch normalization layer, Inception module, ReLU layer, pooling layer, and dropout layer on recognition accuracy using 10-fold validation.
Figure 6. Comparison of the outputs of the convolution layer and the Inception module for the letters ‘M’ and ‘m’. (a) The output of the Inception module for the uppercase letter ‘M’; (b) the output of the Inception module for the lowercase letter ‘m’.
Figure 7. The outputs of all layers for the lowercase letter ‘l’: (a) input layer; (b) conv1 layer; (c) conv2 layer; (d) conv3 layer; (e) conv4 layer; (f) conv5 layer; (g) average-pooling layer in the Inception module; (h) depth concatenation layer; (i) batch normalization layer; (j) ReLU layer; (k) average-pooling layer; (l) dropout layer; (m) fully-connected layer; (n) softmax layer; (o) classification layer.
Figure 8. Floor plan and measurement settings of the lab and utility room environments.
Figure 9. The normalized manner of handwritten letters.
Figure 10. The impact of the number of samples on the accuracy over the 52 letter classes.
Figure 11. The accuracy of MCSM-Wri using the five validation methods for six users. (a) 5-fold cross-validation for users 1 to 6; (b) 10-fold cross-validation for users 1 to 6; (c) 10-hold-out validation for users 1 to 6; (d) leave-one-out validation for users 1 to 6; (e) the impact of the number of samples on accuracy; (f) the impact of the sampling rate on accuracy.
Figure 12. The accuracy of MCSM-Wri using different cross-validation methods and the impact of the experiment setting. (a) The accuracy of MCSM-Wri using 5-fold cross-validation, 10-fold cross-validation, 5-hold-out validation, 10-hold-out validation, and leave-one-out validation; (b) the impact of the experiment setting.
Figure 13. The training and testing process of MCSM-Wri and SignFi.
Figure 14. Comparison of the training time, testing time, and accuracy of existing methods.
Abstract
1. Introduction
- We propose a 10-layer multi-scale CNN to recognize small-scale motions, addressing the low recognition accuracy caused by the weak influence that small-scale motions have on WiFi signals in indoor environments.
- We introduce the Inception module, whose multi-scale characteristic enables the network to distinguish actions that follow the same trajectory but differ in size (a minimal sketch of such a multi-scale block is given after this list).
- We collect 6240 instances of 52 kinds of handwritten letters in two different environments. We verify the performance of MCSM-Wri using five different validation methods and explore the impact of dataset size and sampling rate on accuracy. We also conduct a user-independence test using data from six different users. The accuracy of MCSM-Wri is 95.31% for the lab and 96.68% for the utility room, and its average accuracy reaches up to 97.70%.
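To make the multi-scale idea concrete, the following is a minimal PyTorch sketch of an Inception-style block (parallel convolutions at several kernel sizes plus an average-pooling branch, depth-concatenated) followed by the batch-normalization, ReLU, average-pooling, dropout, fully-connected, and softmax stages listed in the Figure 7 caption. The kernel sizes, channel counts, input shape, and the names `InceptionBlock` and `MCSMNet` are illustrative assumptions, not the authors' exact implementation; the actual structure and parameter settings are given in Figure 3.

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Multi-scale block: parallel convolutions with different kernel
    sizes plus an average-pooling branch, depth-concatenated."""
    def __init__(self, in_ch, branch_ch=16):
        super().__init__()
        # Kernel sizes 1..9 are assumptions; the paper only states that the
        # Inception module mixes several receptive-field scales.
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, branch_ch, kernel_size=k, padding=k // 2)
            for k in (1, 3, 5, 7, 9)
        ])
        self.pool_branch = nn.Sequential(
            nn.AvgPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, branch_ch, kernel_size=1),
        )

    def forward(self, x):
        outs = [b(x) for b in self.branches] + [self.pool_branch(x)]
        return torch.cat(outs, dim=1)   # depth concatenation

class MCSMNet(nn.Module):
    """Sketch of the layer sequence from the Figure 7 caption:
    Inception module -> batch norm -> ReLU -> average pooling ->
    dropout -> fully-connected layer (52 letter classes)."""
    def __init__(self, in_ch=1, num_classes=52):
        super().__init__()
        self.inception = InceptionBlock(in_ch)
        ch = 16 * 6                      # five conv branches + pooling branch
        self.head = nn.Sequential(
            nn.BatchNorm2d(ch),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d((4, 4)),
            nn.Dropout(0.5),
            nn.Flatten(),
            nn.Linear(ch * 4 * 4, num_classes),
        )

    def forward(self, x):
        # Softmax is applied implicitly by nn.CrossEntropyLoss during training.
        return self.head(self.inception(x))

# Example: a batch of 8 single-channel "CSI phase images" (shape assumed).
if __name__ == "__main__":
    logits = MCSMNet()(torch.randn(8, 1, 30, 200))
    print(logits.shape)   # torch.Size([8, 52])
```

Because every branch keeps the spatial resolution, the depth concatenation simply stacks feature maps computed at different scales, which is what lets the classifier separate letters that share a trajectory but differ in size (e.g., ‘M’ vs. ‘m’).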
2. Related Work
2.1. Recognition Methods Based on WiFi
2.2. Handwritten Motion Recognition Methods
3. Background
3.1. Channel State Information (CSI)
3.2. Convolutional Neural Network (CNN)
4. MCSM-Wri Design
4.1. Overview of the System
4.2. Phase Processing
4.3. Structure of CNN
4.4. Input Layer
4.5. Convolution Layer
4.6. Inception Module
4.7. Batch Normalization Layer
4.8. ReLU Layer
4.9. Pooling Layer
4.10. Dropout Layer
4.11. Fully-Connected Layer
4.12. Softmax Layer
4.13. Classification Layer
4.14. Visualization of the Output of Each Layer in CNN
5. Evaluation
5.1. Experiment Setup
5.2. Impact of Number of Samples
5.3. Impact of Sampling Rate
5.4. Impact of Cross-Validation
5.5. User Independence Test
5.6. Impact of Different Experiment Setting
5.7. Comparison with Existing Methods
5.7.1. Training and Testing Process
5.7.2. Recognition Accuracy of Existing Methods
5.7.3. Time Consumption of Training Time and Testing Time
6. Conclusions and Future Work
Author Contributions
Funding
Conflicts of Interest
References
- Leap Motion. Available online: https://www.leapmotion.com (accessed on 25 September 2019).
- Chuan, C.H.; Regina, E.; Guardino, C. American Sign Language Recognition Using Leap Motion Sensor. In Proceedings of the International Conference on Machine Learning & Applications, Detroit, MI, USA, 3–6 December 2014.
- Fang, B.; Co, J.; Zhang, M. DeepASL: Enabling Ubiquitous and Non-Intrusive Word and Sentence-Level Sign Language Translation. In Proceedings of the 15th ACM Conference on Embedded Network Sensor Systems, Shenzhen, China, 4–7 November 2018.
- Chao, S.; Zhang, T.; Xu, C. Latent Support Vector Machine Modeling for Sign Language Recognition with Kinect. ACM Trans. Intell. Syst. Technol. 2015, 6, 1–20.
- Zafrulla, Z.; Brashear, H.; Starner, T.; Hamilton, H.; Presti, P. American sign language recognition with the Kinect. In Proceedings of the International Conference on Multimodal Interfaces, Alicante, Spain, 14–18 November 2011.
- Schick, A.; Morlock, D.; Amma, C.; Schultz, T.; Stiefelhagen, R. Vision-based handwriting recognition for unrestricted text input in mid-air. In Proceedings of the ACM International Conference on Multimodal Interaction, Santa Monica, CA, USA, 22–26 October 2012.
- Jin, L.; Yang, D.; Zhen, L.X.; Huang, J.C. A novel vision-based finger-writing character recognition system. J. Circuits Syst. Comput. 2007, 16, 421–436.
- Joshi, K.; Bharadia, D.; Kotaru, M.; Katti, S. WiDeo: Fine-grained device-free motion tracing using RF backscatter. In Proceedings of the USENIX Conference on Networked Systems Design & Implementation, Oakland, CA, USA, 16–18 March 2015.
- Xin, Z.; Ye, Z.; Jin, L.; Feng, Z.; Xu, S. A New Writing Experience: Finger Writing in the Air Using a Kinect Sensor. IEEE Multimed. 2013, 20, 85–93.
- Amma, C.; Gehrig, D.; Schultz, T. Airwriting recognition using wearable motion sensors. In Proceedings of the Augmented Human International Conference, Megève, France, 2–3 April 2010.
- Amma, C.; Georgi, M.; Schultz, T. Airwriting: Hands-Free Mobile Text Input by Spotting and Continuous Recognition of 3D-Space Handwriting with Inertial Sensors. In Proceedings of the International Symposium on Wearable Computers, Newcastle, UK, 18–22 June 2012.
- Agrawal, S.; Constandache, I.; Gaonkar, S.; Choudhury, R.R.; Caves, K.; Deruyter, F. Using mobile phones to write in air. In Proceedings of the International Conference on Mobile Systems, Bethesda, MD, USA, 28 June–1 July 2011.
- Zhang, D.; Ni, L.M. Dynamic clustering for tracking multiple transceiver-free objects. In Proceedings of the IEEE International Conference on Pervasive Computing & Communications, Galveston, TX, USA, 9–13 March 2009.
- Pan, W.; Wu, X.; Chen, G.; Shan, M.; Zhu, X. A few bits are enough: Energy efficient device-free localization. Comput. Commun. 2016, 83, 72–80.
- Pan, W.; Chen, G.; Zhu, X.; Wu, X. Minimizing receivers under link coverage model for device-free surveillance. Comput. Commun. 2015, 63, 53–64.
- Sigg, S.; Scholz, M.; Shi, S.; Ji, Y.; Beigl, M. RF-Sensing of Activities from Non-Cooperative Subjects in Device-Free Recognition Systems Using Ambient and Local Signals. IEEE Trans. Mob. Comput. 2014, 13, 907–920.
- Abdelnasser, H.; Youssef, M.; Harras, K.A. WiGest: A ubiquitous WiFi-based gesture recognition system. In Proceedings of the IEEE Conference on Computer Communications, Kowloon, Hong Kong, 26 April–1 May 2015.
- Arshad, S.; Feng, C.; Liu, Y.; Hu, Y.; Yu, R.; Zhou, S.; Li, H. Wi-Chase: A WiFi based human activity recognition system for sensorless environments. In Proceedings of the IEEE International Symposium on A World of Wireless, Mobile and Multimedia Networks (WoWMoM), Macau, China, 12–15 June 2017.
- Fang, B.; Lane, N.D.; Zhang, M.; Kawsar, F. HeadScan: A Wearable System for Radio-Based Sensing of Head and Mouth-Related Activities. In Proceedings of the 2016 15th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN), Vienna, Austria, 11–14 April 2016.
- Guo, L.; Lei, W.; Jialin, L.; Wei, Z.; Bingxian, L. HuAc: Human Activity Recognition Using Crowdsourced WiFi Signals and Skeleton Data. Wirel. Commun. Mob. Comput. 2018, 2018, 6163475.
- Li, M.; Yan, M.; Liu, J.; Zhu, H.; Liang, X. When CSI Meets Public WiFi: Inferring Your Mobile Phone Password via WiFi Signals. In Proceedings of the ACM SIGSAC Conference, Vienna, Austria, 24–28 October 2016.
- Qian, K.; Wu, C.; Zhou, Z.; Zheng, Y.; Yang, Z.; Liu, Y. Inferring Motion Direction using Commodity Wi-Fi for Interactive Exergames. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, Denver, CO, USA, 6–11 May 2017.
- Shang, J.; Wu, J. A Robust Sign Language Recognition System with Multiple Wi-Fi Devices. In Proceedings of the Workshop on Mobility in the Evolving Internet Architecture, Los Angeles, CA, USA, 25 August 2017.
- Tan, S.; Yang, J. WiFinger: Leveraging commodity WiFi for fine-grained finger gesture recognition. In Proceedings of the ACM International Symposium on Mobile Ad Hoc Networking & Computing, Paderborn, Germany, 5–8 July 2016.
- Li, F.; Wang, X.; Chen, H.; Sharif, K.; Wang, Y. ClickLeak: Keystroke leaks through multimodal sensors in cyber-physical social networks. IEEE Access 2017, 5, 27311–27321.
- Virmani, A.; Shahzad, M. Position and Orientation Agnostic Gesture Recognition Using WiFi. In Proceedings of the International Conference on Mobile Systems, Niagara Falls, NY, USA, 19–23 June 2017.
- Cao, X.; Bing, C.; Zhao, Y. Wi-Wri: Fine-Grained Writing Recognition Using Wi-Fi Signals. In Proceedings of TrustCom/BigDataSE/ISPA, Guangzhou, China, 12–15 December 2017.
- Fu, Z.; Xu, J.; Zhu, Z.; Liu, A.X.; Sun, X. Writing in the Air with WiFi Signals for Virtual Reality Devices. IEEE Trans. Mob. Comput. 2018, 18, 473–484.
- Ma, Y.; Zhou, G.; Wang, S.; Zhao, H.; Jung, W. SignFi: Sign Language Recognition Using WiFi. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2018, 2, 23.
- Ranzato, M.A.; Huang, F.J.; Boureau, Y.L.; LeCun, Y. Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition. In Proceedings of the IEEE Conference on Computer Vision & Pattern Recognition, Minneapolis, MN, USA, 17–22 June 2007.
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with Convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015.
- Pu, Q.; Gupta, S.; Gollakota, S.; Patel, S. Whole-home gesture recognition using wireless signals. In Proceedings of the ACM SIGCOMM Conference, Hong Kong, China, 12–16 August 2013.
- Adib, F.; Katabi, D. See through walls with WiFi! ACM SIGCOMM Comput. Commun. Rev. 2013, 43, 75–86.
- Adib, F.; Kabelac, Z.; Katabi, D.; Miller, R.C. 3D Tracking via Body Radio Reflections. In Proceedings of the 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI), Seattle, WA, USA, 2–4 April 2014.
- Wang, G.; Zou, Y.; Zhou, Z.; Wu, K.; Ni, L.M. We can hear you with Wi-Fi! IEEE Trans. Mob. Comput. 2016, 15, 2907–2920.
- Wei, W.; Liu, A.X.; Shahzad, M.; Kang, L.; Lu, S. Device-free Human Activity Recognition Using Commercial WiFi Devices. IEEE J. Sel. Areas Commun. 2017, 35, 1118–1131.
- Sun, L.; Sen, S.; Koutsonikolas, D.; Kim, K.H. WiDraw: Enabling hands-free drawing in the air on commodity WiFi devices. In Proceedings of the 21st Annual International Conference on Mobile Computing and Networking, Paris, France, 7–11 September 2015; ACM: New York, NY, USA, 2015; pp. 77–89.
- Halperin, D.; Hu, W.; Sheth, A.; Wetherall, D. Tool release: Gathering 802.11n traces with channel state information. ACM SIGCOMM Comput. Commun. Rev. 2011, 41, 53.
- Xie, Y. Precise Power Delay Profiling with Commodity WiFi. IEEE Trans. Mob. Comput. 2015, 18, 1342–1355.
- Ali, K.; Liu, A.X.; Wang, W.; Shahzad, M. Keystroke recognition using WiFi signals. In Proceedings of the 21st Annual International Conference on Mobile Computing and Networking, Paris, France, 7–11 September 2015; ACM: New York, NY, USA, 2015; pp. 90–102.
- Haykin, S.; Kosko, B. Gradient-Based Learning Applied to Document Recognition. In Intelligent Signal Processing; Wiley-IEEE Press: New York, NY, USA, 2009.
- Cun, Y.L.; Boser, B.; Denker, J.S.; Howard, R.E.; Hubbard, W.; Jackel, L.D.; Henderson, D. Handwritten digit recognition with a back-propagation network. Adv. Neural Inf. Process. Syst. 1990, 2, 396–404.
- LeCun, Y.; Bengio, Y. Word-Level Training of a Handwritten Word Recognizer Based on Convolutional Neural Networks. In Proceedings of the International Conference on Pattern Recognition, Vol. 2, Conference B: Computer Vision & Image Processing, Jerusalem, Israel, 9–13 October 1994.
- Hecht-Nielsen, R. Theory of the backpropagation neural network. In Proceedings of the International Joint Conference on Neural Networks, Honolulu, HI, USA, 12–17 May 2002.
- Wang, X.; Gao, L.; Mao, S. CSI Phase Fingerprinting for Indoor Localization with a Deep Learning Approach. IEEE Internet Things J. 2017, 3, 1113–1123.
- IEEE Standard for Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications; IEEE Std: Piscataway, NJ, USA, 2002.
- Yang, J.; Yu, K.; Gong, Y.; Huang, T.S. Linear spatial pyramid matching using sparse coding for image classification. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 1794–1801.
- Boureau, Y.L.; Ponce, J.; LeCun, Y. A theoretical analysis of feature pooling in visual recognition. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), Haifa, Israel, 21–24 June 2010; pp. 111–118.
- Wang, T.; Wu, D.J.; Coates, A.; Ng, A.Y. End-to-End Text Recognition with Convolutional Neural Networks. In Proceedings of the International Conference on Pattern Recognition, Tsukuba Science City, Japan, 11–15 November 2012.
- Hinton, G.E.; Srivastava, N.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R.R. Improving neural networks by preventing co-adaptation of feature detectors. arXiv 2012, arXiv:1207.0580.
| Method (Large-Scale Action) | Number of Classes | Accuracy | Method (Small-Scale Action) | Number of Classes | Accuracy |
|---|---|---|---|---|---|
| Wi-Chase [18] | 3 | 97% | HeadScan [19] | 5 | 86.3% |
| WiSign [23] | 5 | 93.8% | WiHear [35] | 6 | 91% |
| WiAG [26] | 6 | 91.4% | WiFinger [24] | 8 | 93% |
| CARM [36] | 8 | 96% | ClickLeak [25] | 10 | 83% |
| WiDance [22] | 9 | 92% | WindTalker [21] | 10 | 81.8% |
| HuAc [20] | 16 | 93% | WiKey [40] | 37 | 96.4% |
| SignFi [29] | 276 | 94.8% | | | |
| Method | Signal/Device Used | Number of Classes | Accuracy | Granularity or Medium | Intrusive? |
|---|---|---|---|---|---|
| Schick2012 [6] | Vision-based | 26 | 86.15% | Finger | Yes |
| FWCRS [7] | Vision-based | 26/26 | 95.6% (uppercase)/98.5% (lowercase) | Finger | Yes |
| Zhang2013 [9] | Sensor | 26/26 | 99.23% (uppercase)/98.46% (lowercase) | Finger | Yes |
| Amma2010 [10] | Sensor | 652 (words) | 97.5% | Finger | Yes |
| Amma2012 [11] | Sensor | 8000 (words) | 89% | Finger | Yes |
| Agrawal2011 [12] | Sensor | 26 (12-inch) | 91.9% | Finger | Yes |
| WiDraw [37] | WiFi | 26 (width: 25–40 cm) | 95% | Finger | No |
| WriFi [28] | WiFi (5 GHz) | 26 (25 cm × 25 cm) | 88.74% | Finger | No |
| Wi-Wri [27] | WiFi (5 GHz) | 26 (5 cm × 5 cm) | 82.7% | Pencil | No |
| Volunteer | Age | Gender | Weight (kg)/Height (cm) | Gesture Duration | Number of Instances |
|---|---|---|---|---|---|
| User 1 | 25 | Male | 75/175 | 2–3 s | 1040 (520 for lab, 520 for utility room) |
| User 2 | 22 | Female | 55/165 | 1.5–3 s | 1040 (520 for lab, 520 for utility room) |
| User 3 | 24 | Male | 85/175 | 2–3 s | 1040 (520 for lab, 520 for utility room) |
| User 4 | 25 | Male | 80/178 | 1–2 s | 1040 (520 for lab, 520 for utility room) |
| User 5 | 24 | Female | 49/162 | 1.5–3 s | 1040 (520 for lab, 520 for utility room) |
| User 6 | 24 | Male | 70/172 | 2–3 s | 1040 (520 for lab, 520 for utility room) |
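As a consistency check on the dataset size reported in the contributions, and assuming instances are spread evenly over the 52 letters, each volunteer's 1040 instances correspond to 52 letters × 10 instances per letter × 2 environments = 1040, and 1040 × 6 users = 6240 instances in total.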
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Ma, S.; Huang, T.; Li, S.; Huang, J.; Ma, T.; Liu, J. MCSM-Wri: A Small-Scale Motion Recognition Method Using WiFi Based on Multi-Scale Convolutional Neural Network. Sensors 2019, 19, 4162. https://doi.org/10.3390/s19194162