
DMMs-Based Multiple Features Fusion for Human Action Recognition

Published: 01 October 2015

Abstract

The emergence of cost-effective depth sensors has greatly facilitated the action recognition task. In this paper, the authors address the action recognition problem using depth video sequences by combining three discriminative features. More specifically, the authors generate three Depth Motion Maps (DMMs) over the entire video sequence, corresponding to the front, side, and top projection views. Contourlet-based Histograms of Oriented Gradients (CT-HOG), Local Binary Patterns (LBP), and Edge Oriented Histograms (EOH) are then computed from the DMMs. To merge these features, the authors adopt decision-level fusion, in which a soft decision-fusion rule, the Logarithmic Opinion Pool (LOGP), combines the classification outcomes of multiple classifiers, each trained on an individual feature set. Experimental results on two datasets show that the fusion scheme achieves superior action recognition performance compared with using each feature individually.
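
A minimal sketch may help make the pipeline concrete. The Python/NumPy code below illustrates one plausible way to accumulate the three DMMs from a depth clip and to fuse per-feature classifier posteriors with a Logarithmic Opinion Pool; the simplified projection scheme, the helper names (depth_projections, depth_motion_maps, logp_fusion), and the uniform fusion weights are illustrative assumptions, not the authors' exact implementation.

import numpy as np

def depth_projections(depth, max_depth):
    # Project one depth frame (H x W, 0 = no reading) onto the three
    # orthogonal planes: front (x-y), side (y-z), and top (x-z).
    H, W = depth.shape
    front = depth.astype(np.float32)
    side = np.zeros((H, max_depth), np.float32)
    top = np.zeros((max_depth, W), np.float32)
    ys, xs = np.nonzero(depth)
    zs = depth[ys, xs].astype(int)
    side[ys, zs] = xs          # column coordinate as seen from the side view
    top[zs, xs] = ys           # row coordinate as seen from the top view
    return front, side, top

def depth_motion_maps(frames):
    # Accumulate absolute frame-to-frame differences of each projection
    # over the whole clip, yielding DMM_front, DMM_side, and DMM_top.
    max_depth = int(max(f.max() for f in frames)) + 1
    prev = depth_projections(frames[0], max_depth)
    dmms = [np.zeros_like(p) for p in prev]
    for frame in frames[1:]:
        cur = depth_projections(frame, max_depth)
        for k in range(3):
            dmms[k] += np.abs(cur[k] - prev[k])
        prev = cur
    return dmms

def logp_fusion(posteriors, weights=None):
    # Logarithmic Opinion Pool: a weighted geometric mean of the per-feature
    # classifier posteriors (each array is n_samples x n_classes).
    posteriors = [np.clip(p, 1e-12, 1.0) for p in posteriors]
    if weights is None:
        weights = [1.0 / len(posteriors)] * len(posteriors)
    log_pool = sum(w * np.log(p) for w, p in zip(weights, posteriors))
    fused = np.exp(log_pool)
    fused /= fused.sum(axis=1, keepdims=True)
    return fused.argmax(axis=1)   # fused class decision per sample

In the paper, the three sets of posteriors would come from kernel-based Extreme Learning Machine classifiers trained on the CT-HOG, LBP, and EOH descriptors of the DMMs; any classifier that outputs class probabilities can stand in for them in this sketch.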


Published In

International Journal of Multimedia Data Engineering & Management, Volume 6, Issue 4
October 2015
77 pages
ISSN:1947-8534
EISSN:1947-8542

Publisher

IGI Global

United States

Author Tags

  1. Action Recognition
  2. Depth Motion Maps
  3. Edge Oriented Histograms
  4. Kernel-based Extreme Learning Machine
  5. Local Binary Patterns

Qualifiers

  • Article

Cited By

  • (2023) MAVEN: A Memory Augmented Recurrent Approach for Multimodal Fusion. IEEE Transactions on Multimedia, 25, 3694-3708. DOI: 10.1109/TMM.2022.3164261. Online publication date: 1-Jan-2023.
  • (2023) Multi-view Multi-modal Approach Based on 5S-CNN and BiLSTM Using Skeleton, Depth and RGB Data for Human Activity Recognition. Wireless Personal Communications: An International Journal, 130(2), 1141-1159. DOI: 10.1007/s11277-023-10324-4. Online publication date: 1-May-2023.
  • (2022) Human action recognition method based on historical point cloud trajectory characteristics. The Visual Computer: International Journal of Computer Graphics, 38(8), 2971-2979. DOI: 10.1007/s00371-021-02167-6. Online publication date: 1-Aug-2022.
  • (2020) Deep learning-based multi-modal approach using RGB and skeleton sequences for human activity recognition. Multimedia Systems, 26(6), 671-685. DOI: 10.1007/s00530-020-00677-2. Online publication date: 25-Jul-2020.
  • (2019) Real-time human action recognition using depth motion maps and convolutional neural networks. International Journal of High Performance Computing and Networking, 13(3), 312-320. DOI: 10.5555/3337645.3337651. Online publication date: 1-Jan-2019.
  • (2019) Human action recognition using MHI and SHI based GLAC features and Collaborative Representation Classifier. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology, 36(4), 3385-3401. DOI: 10.3233/JIFS-181136. Online publication date: 1-Jan-2019.
  • (2019) 3D human action analysis and recognition through GLAC descriptor on 2D motion and static posture images. Multimedia Tools and Applications, 78(15), 21085-21111. DOI: 10.1007/s11042-019-7365-2. Online publication date: 1-Aug-2019.
  • (2019) An efficient end-to-end deep learning architecture for activity classification. Analog Integrated Circuits and Signal Processing, 99(1), 23-32. DOI: 10.1007/s10470-018-1306-2. Online publication date: 1-Apr-2019.
  • (2017) Fusing shape and spatio-temporal features for depth-based dynamic hand gesture recognition. Multimedia Tools and Applications, 76(20), 20525-20544. DOI: 10.1007/s11042-016-3988-8. Online publication date: 1-Oct-2017.
  • (2017) Parallelizing Convolutional Neural Networks for Action Event Recognition in Surveillance Videos. International Journal of Parallel Programming, 45(4), 734-759. DOI: 10.1007/s10766-016-0451-4. Online publication date: 1-Aug-2017.
