Abstract
Despite the popularity of home medical devices, serious safety concerns have been raised, because use errors of home medical devices have been linked to a large number of fatal hazards. To address this problem, we introduce a cognitive assistive system that automatically monitors the use of home medical devices. Accurately recognizing user operations is one of the most important functionalities of the proposed system. However, even though various action recognition algorithms have been proposed in recent years, it remains unknown whether they are adequate for recognizing the operations involved in using home medical devices, largely because no suitable database exists. In the first part of this paper, we therefore present a database specially designed for studying the use of home medical devices, and we evaluate the performance of existing approaches on it. Although we use state-of-the-art approaches that have demonstrated near-perfect performance in recognizing certain general human actions, we observe a significant performance drop when applying them to device operations. We conclude that the tiny actions involved in operating the devices are one of the most important causes of this decrease. To accurately recognize tiny actions, it is critical to focus on where the target action happens, namely the region of interest (ROI), and to build a more elaborate action model based on that ROI. In the second part of this paper, we therefore introduce a simple but effective approach to estimating the ROI for recognizing tiny actions. The key idea is to analyze the correlation between an action and the sub-regions of a frame. The estimated ROI is then used as a filter for building more accurate action representations. Experimental results show significant performance improvements over baseline methods when the estimated ROI is used for action recognition.
We also introduce an interaction framework that weighs both the confidence of the detection and the seriousness of the potential error, and delivers messages to the user that take both aspects into account.
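The ROI-estimation idea described above, correlating an action with sub-regions of the frame and then using the estimated ROI to filter features, can be sketched as follows. This is a minimal illustration under our own assumptions, not the chapter's implementation: the fixed grid layout, the Pearson-correlation score, and the helper names `estimate_roi` and `filter_features` are all hypothetical.

```python
import numpy as np

def estimate_roi(cell_counts, labels, top_k=10):
    """Estimate an ROI mask for one action class.

    cell_counts : (n_clips, n_cells) array; number of spatio-temporal
                  interest points falling in each grid cell of each clip.
    labels      : (n_clips,) binary array; 1 if the clip shows the action.
    Returns a boolean mask selecting the top_k cells whose feature counts
    correlate most strongly with the action label.
    """
    y = labels.astype(float) - labels.mean()
    x = cell_counts - cell_counts.mean(axis=0)
    # Pearson correlation between each cell's counts and the labels.
    denom = np.sqrt((x ** 2).sum(axis=0) * (y ** 2).sum()) + 1e-12
    corr = (x * y[:, None]).sum(axis=0) / denom
    mask = np.zeros(cell_counts.shape[1], dtype=bool)
    mask[np.argsort(corr)[-top_k:]] = True
    return mask

def filter_features(points, mask, grid_shape, frame_shape):
    """Keep only interest points whose (x, y) position falls in an ROI cell."""
    rows, cols = grid_shape
    h, w = frame_shape
    cy = np.minimum((points[:, 1] * rows // h).astype(int), rows - 1)
    cx = np.minimum((points[:, 0] * cols // w).astype(int), cols - 1)
    return points[mask[cy * cols + cx]]
```

Filtering the interest points through the mask before quantizing them into a bag-of-features histogram is one straightforward way to realize the "ROI as a filter" step; points landing outside the correlated cells are simply discarded.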
Acknowledgements
This work was supported in part by the National Science Foundation under Grant No. IIS-0917072. Any opinions, findings, and conclusions expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Cai, Y., Yang, Y., Hauptmann, A., Wactlar, H. (2015). Monitoring and Coaching the Use of Home Medical Devices. In: Briassouli, A., Benois-Pineau, J., Hauptmann, A. (eds) Health Monitoring and Personalized Feedback using Multimedia Data. Springer, Cham. https://doi.org/10.1007/978-3-319-17963-6_14
DOI: https://doi.org/10.1007/978-3-319-17963-6_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17962-9
Online ISBN: 978-3-319-17963-6
eBook Packages: Computer Science, Computer Science (R0)