Monitoring and Coaching the Use of Home Medical Devices

Abstract

Despite the popularity of home medical devices, serious safety concerns have been raised, because use errors with home medical devices have been linked to a large number of fatal hazards. To address this problem, we introduce a cognitive assistive system that automatically monitors the use of home medical devices. Accurately recognizing user operations is one of the most important functions of the proposed system. However, even though various action recognition algorithms have been proposed in recent years, it is still unknown whether they are adequate for recognizing the operations involved in using home medical devices. Since the lack of a suitable database is the main reason for this situation, in the first part of this paper we present a database specifically designed for studying the use of home medical devices. We then evaluate the performance of existing approaches on the proposed database. Although we use state-of-the-art approaches that have demonstrated near-perfect performance in recognizing certain general human actions, we observe a significant performance drop when applying them to device operations. We conclude that the tiny actions involved in using the devices are one of the most important causes of this decrease. To accurately recognize tiny actions, it is critical to focus on where the target action happens, namely the region of interest (ROI), and to build more elaborate action models based on that ROI. Therefore, in the second part of this paper, we introduce a simple but effective approach to estimating the ROI for recognizing tiny actions. The key idea is to analyze the correlation between an action and the sub-regions of a frame. The estimated ROI is then used as a filter for building more accurate action representations. Experimental results show significant performance improvements over the baseline methods when the estimated ROI is used for action recognition. We also introduce an interaction framework that considers both the confidence of the detection and the seriousness of the potential error, generating messages to the user that take both aspects into account.
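
The ROI idea can be illustrated with a short sketch. The code below is a minimal illustration, not the chapter's actual implementation: it assumes that local spatio-temporal interest points have already been extracted per video clip and pooled into a fixed grid of cells, and all names (estimate_roi, filter_points, the thresholding rule) are our own illustrative choices.

```python
import numpy as np

def estimate_roi(cell_activity, labels, target_class):
    """Correlate per-cell feature activity with an action class.

    cell_activity : (n_clips, gh, gw) counts of local spatio-temporal
                    features falling in each grid cell of each clip
    labels        : (n_clips,) integer action labels
    Returns a (gh, gw) boolean mask marking cells whose activity is
    strongly correlated with the target action, i.e. the estimated ROI.
    """
    y = (labels == target_class).astype(float)
    y -= y.mean()
    x = cell_activity - cell_activity.mean(axis=0, keepdims=True)
    num = (x * y[:, None, None]).sum(axis=0)
    den = np.sqrt((x ** 2).sum(axis=0) * (y ** 2).sum()) + 1e-8
    corr = num / den                         # Pearson correlation per cell
    return corr > corr.mean() + corr.std()   # keep well-correlated cells

def filter_points(points, roi_mask, frame_hw):
    """Drop interest points that fall outside the estimated ROI.

    points   : (n, 2) array of (x, y) pixel coordinates
    roi_mask : (gh, gw) boolean mask from estimate_roi
    frame_hw : (height, width) of the video frame
    """
    gh, gw = roi_mask.shape
    h, w = frame_hw
    row = np.clip(points[:, 1] * gh // h, 0, gh - 1).astype(int)
    col = np.clip(points[:, 0] * gw // w, 0, gw - 1).astype(int)
    return points[roi_mask[row, col]]
```

Under the same caveat, the interaction framework can be sketched as a rule that weighs detection confidence against the seriousness of the potential use-error; the thresholds and message wording below are placeholders, assuming both quantities are normalized to [0, 1].

```python
def coach_message(confidence, severity, step):
    """Pick user feedback from detection confidence and error
    seriousness (thresholds and wording are illustrative)."""
    if severity >= 0.8:
        # a dangerous error warrants a warning even at modest confidence
        return f"Warning: please stop and re-check step '{step}' now."
    if confidence >= 0.7 and severity >= 0.4:
        return f"Step '{step}' may have been done incorrectly; please verify."
    if confidence >= 0.7:
        return f"Reminder: double-check step '{step}' before continuing."
    return None  # low confidence, low risk: stay silent
```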

Acknowledgements

This work was supported in part by the National Science Foundation under Grant No. IIS-0917072. Any opinions, findings, and conclusions expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Author information

Correspondence to Alexander Hauptmann.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Cai, Y., Yang, Y., Hauptmann, A., Wactlar, H. (2015). Monitoring and Coaching the Use of Home Medical Devices. In: Briassouli, A., Benois-Pineau, J., Hauptmann, A. (eds) Health Monitoring and Personalized Feedback using Multimedia Data. Springer, Cham. https://doi.org/10.1007/978-3-319-17963-6_14

  • DOI: https://doi.org/10.1007/978-3-319-17963-6_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-17962-9

  • Online ISBN: 978-3-319-17963-6

  • eBook Packages: Computer Science, Computer Science (R0)
