Behavior recognition from video based on human constrained descriptor and adaptable neural networks

Published: 21 October 2013

Abstract

In this paper we introduce a new descriptor, the Human Constrained Pixel Change History (HC-PCH), which builds on Pixel Change History (PCH) but focuses on human body movements over time. We modify the conventional PCH by computing two probabilistic maps, based on human face detection and human body detection respectively. The features extracted from this descriptor serve as input to an HMM-based behavior recognition framework. We also introduce a framework for rectifying behavior recognition and classification that incorporates an expert user's feedback into the learning process through two proposed schemes: a plain non-linear scheme and an adaptable scheme; the latter requires fewer training samples and is more effective at reducing misclassification error. The methods presented are validated on a real-world computer vision dataset comprising challenging video sequences from an industrial environment.
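To make the idea of a human-constrained PCH concrete, the following is a minimal sketch, not the authors' implementation. It assumes the commonly used PCH update rule (a pixel's value rises by 255/accumulation when a change is detected and falls by 255/decay otherwise), and it assumes that the face and body probability maps are fused by a per-pixel maximum and used to scale the rise. The abstract does not specify either choice, and the names `human_weight`, `update_hc_pch`, `accumulation`, and `decay` are hypothetical.

```python
def human_weight(face_prob, body_prob):
    """Fuse two per-pixel probability maps into one weight map.

    Assumption: per-pixel maximum. The abstract only states that two
    probabilistic maps (face-based and body-based) are calculated.
    """
    return [
        [max(f, b) for f, b in zip(face_row, body_row)]
        for face_row, body_row in zip(face_prob, body_prob)
    ]


def update_hc_pch(prev, changed, weight,
                  accumulation=4.0, decay=16.0, max_value=255.0):
    """Apply one time step of a human-weighted PCH update.

    prev:    2D list with the descriptor values from the previous frame
    changed: 2D list of 0/1 flags, 1 where a pixel change was detected
    weight:  2D list of values in [0, 1], e.g. from human_weight()
    """
    rise = max_value / accumulation   # growth per frame while changing
    fall = max_value / decay          # decay per frame once change stops
    result = []
    for prev_row, changed_row, weight_row in zip(prev, changed, weight):
        row = []
        for p, c, w in zip(prev_row, changed_row, weight_row):
            if c:
                # Assumption: changes in likely-human regions count more.
                row.append(min(p + w * rise, max_value))
            else:
                row.append(max(p - fall, 0.0))
        result.append(row)
    return result


if __name__ == "__main__":
    face = [[0.5, 0.0]]
    body = [[0.2, 1.0]]
    weights = human_weight(face, body)
    frame = update_hc_pch([[0.0, 100.0]], [[1, 0]], weights)
    print(weights, frame)
```

With a weight of 1 everywhere the update reduces to the plain PCH rule, so the probability maps act only as a soft mask that suppresses changes outside likely human regions.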

    Published In

    ARTEMIS '13: Proceedings of the 4th ACM/IEEE international workshop on Analysis and retrieval of tracked events and motion in imagery stream
    October 2013
    94 pages
    ISBN:9781450323932
    DOI:10.1145/2510650
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. adaptable neural networks
    2. behavior recognition
    3. human constrained pixel change history

    Qualifiers

    • Research-article

    Conference

    MM '13
    Sponsor:
    MM '13: ACM Multimedia Conference
    October 21, 2013
    Barcelona, Spain
