
ColloSSL: Collaborative Self-Supervised Learning for Human Activity Recognition

Published: 29 March 2022

Abstract

A major bottleneck in training robust Human Activity Recognition (HAR) models is the need for large-scale labeled sensor datasets. Because labeling large amounts of sensor data is expensive, unsupervised and semi-supervised learning techniques have emerged that can learn good features from the data without requiring any labels. In this paper, we extend this line of research and present a novel technique called Collaborative Self-Supervised Learning (ColloSSL), which leverages unlabeled data collected from multiple devices worn by a user to learn high-quality features of the data. A key insight that underpins the design of ColloSSL is that unlabeled sensor datasets simultaneously captured by multiple devices can be viewed as natural transformations of each other, and leveraged to generate a supervisory signal for representation learning. We present three technical innovations to extend conventional self-supervised learning algorithms to a multi-device setting: a Device Selection approach which selects positive and negative devices to enable contrastive learning, a Contrastive Sampling algorithm which samples positive and negative examples in a multi-device setting, and a loss function called Multi-view Contrastive Loss which extends the standard contrastive loss to a multi-device setting. Our experimental results on three multi-device datasets show that ColloSSL outperforms both fully-supervised and semi-supervised learning techniques in a majority of the experimental settings, resulting in an absolute increase of up to 7.9% in F1 score over the best-performing baselines. We also show that ColloSSL outperforms fully-supervised methods in a low-data regime, using just one-tenth of the available labeled data in the best case.
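To make the multi-device contrastive idea concrete, the following is a minimal sketch (not the authors' implementation) of an NT-Xent-style loss in which an anchor device's embedding is contrasted against embeddings drawn from positive and negative devices. The function name, array shapes, and temperature value are illustrative assumptions.

```python
import numpy as np

def multi_view_contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """Sketch of a multi-view contrastive loss over device embeddings.

    anchor:    (d,)   embedding from the anchor device
    positives: (P, d) embeddings from positive devices (same activity, same time)
    negatives: (N, d) embeddings from negative devices
    """
    def cos(a, b):
        # Cosine similarity between each row of a and each row of b.
        return (a @ b.T) / (np.linalg.norm(a, axis=-1, keepdims=True)
                            * np.linalg.norm(b, axis=-1, keepdims=True).T)

    anchor = anchor[None, :]                        # (1, d)
    pos_sim = cos(anchor, positives) / temperature  # (1, P)
    neg_sim = cos(anchor, negatives) / temperature  # (1, N)

    # For each positive, contrast it against all negatives:
    # -log( exp(s_pos) / (exp(s_pos) + sum_j exp(s_neg_j)) ), averaged over positives.
    neg_exp_sum = np.exp(neg_sim).sum()
    losses = -np.log(np.exp(pos_sim) / (np.exp(pos_sim) + neg_exp_sum))
    return float(losses.mean())
```

In this sketch, embeddings from positive devices that closely track the anchor yield a low loss, while embeddings that resemble the negatives yield a high one; the paper's actual loss and its device-selection criteria are described in the full text.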

Supplementary Material

jain.zip
Supplemental movie, appendix, image, and software files for "ColloSSL: Collaborative Self-Supervised Learning for Human Activity Recognition"




Published In

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 6, Issue 1
March 2022
1009 pages
EISSN: 2474-9567
DOI: 10.1145/3529514
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. Contrastive Learning
  2. Human Activity Recognition
  3. Self-Supervised Learning

Qualifiers

  • Research-article
  • Research
  • Refereed


