

Vision-based action recognition of earthmoving equipment using spatio-temporal features and support vector machine classifiers

Published: 01 October 2013

Abstract

Highlights:

  • We present a computer-vision-based method for equipment action recognition.
  • Our method is based on multiple binary SVM classifiers and spatio-temporal features.
  • A comprehensive real-world video dataset of excavator and truck actions is presented.
  • We achieve accuracies of 86.33% and 98.33% for excavator and truck action classes.
  • The presented method can be used for construction activity analysis on long video sequences.

Video recordings of earthmoving construction operations provide understandable data that can be used for benchmarking and analyzing their performance. These recordings further support project managers in taking corrective actions on performance deviations and, in turn, improving operational efficiency. Despite these benefits, manual stopwatch studies of previously recorded videos can be labor-intensive, may suffer from observer bias, and are impractical after substantial periods of observation. This paper presents a new computer-vision-based algorithm for recognizing single actions of earthmoving construction equipment. This is a particularly challenging task, as equipment can be partially occluded in site video streams and comes in a wide variety of sizes and appearances. The scale and pose of equipment actions can also vary significantly depending on the camera configuration. In the proposed method, a video is initially represented as a collection of spatio-temporal visual features by extracting space-time interest points and describing each feature with a Histogram of Oriented Gradients (HOG). The algorithm automatically learns the distributions of the spatio-temporal features and action categories using a multi-class Support Vector Machine (SVM) classifier. This strategy handles noisy feature points arising from typical dynamic backgrounds. Given a video sequence captured from a fixed camera, the multi-class SVM classifier recognizes and localizes equipment actions.
For the purpose of evaluation, a new video dataset is introduced which contains 859 sequences of excavator and truck actions. This dataset contains large variations in equipment pose and scale, with varied backgrounds and levels of occlusion. The experimental results, with average accuracies of 86.33% and 98.33%, show that our supervised method outperforms previous algorithms for excavator and truck action recognition. These results hold promise for the applicability of the proposed method to construction activity analysis.
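The pipeline the abstract describes (local spatio-temporal descriptors quantized into a bag-of-features representation, then fed to a multi-class SVM) can be sketched as follows. This is only a toy illustration under stated assumptions: random vectors stand in for the HOG descriptors that a real space-time interest point detector would produce, and the codebook size, descriptor dimensionality, and class means are all hypothetical values, not the paper's settings.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_video_descriptors(action_id, n_points=40, dim=72):
    # Stand-in for HOG descriptors at detected space-time interest points.
    # Each action class gets a different mean so the toy classes are separable.
    return rng.normal(loc=action_id, scale=0.5, size=(n_points, dim))

n_classes, videos_per_class = 3, 20
videos, labels = [], []
for c in range(n_classes):
    for _ in range(videos_per_class):
        videos.append(make_video_descriptors(c))
        labels.append(c)

# Build a bag-of-features codebook by clustering all descriptors,
# then represent each video as a normalized histogram of codeword hits.
n_words = 30
codebook = KMeans(n_clusters=n_words, n_init=5, random_state=0).fit(np.vstack(videos))

def to_histogram(descriptors):
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=n_words).astype(float)
    return hist / hist.sum()

X = np.array([to_histogram(v) for v in videos])
y = np.array(labels)

# Multi-class SVM over the histograms (scikit-learn's SVC handles
# the multi-class case via pairwise binary classifiers internally).
clf = SVC(kernel="rbf", gamma="scale").fit(X, y)
print(clf.score(X, y))  # training accuracy on this toy data
```

At recognition time, a sliding window over a long video sequence would be converted to the same histogram representation and scored by the classifier, which is what enables the action localization the abstract mentions.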




    Published In

    Advanced Engineering Informatics  Volume 27, Issue 4
    October, 2013
    255 pages

    Publisher

    Elsevier Science Publishers B. V.

    Netherlands


    Author Tags

    1. Action recognition
    2. Activity analysis
    3. Computer vision
    4. Construction productivity
    5. Operational efficiency
    6. Time-studies

    Qualifiers

    • Research-article


    Cited By

    • (2024) Improving single-stage activity recognition of excavators using knowledge distillation of temporal gradient data. Computer-Aided Civil and Infrastructure Engineering 39(13), 2028–2053. doi:10.1111/mice.13157. Online publication date: 9-Jun-2024.
    • (2024) A teacher–student deep learning strategy for extreme low resolution unsafe action recognition in construction projects. Advanced Engineering Informatics 59:C. doi:10.1016/j.aei.2023.102294. Online publication date: 1-Jan-2024.
    • (2023) Intelligent Identification Approach of Vibratory Roller Working Stages Based on Multi-dimensional Convolutional Neural Network. Intelligent Robotics and Applications, 463–475. doi:10.1007/978-981-99-6501-4_40. Online publication date: 5-Jul-2023.
    • (2022) Real-Time Activity Duration Extraction of Crane Works for Data-Driven Discrete Event Simulation. Proceedings of the Winter Simulation Conference, 2365–2376. doi:10.5555/3586210.3586408. Online publication date: 11-Dec-2022.
    • (2022) Optimization of excavator engine working points based on particle swarm algorithm. Proceedings of the Asia Conference on Electrical, Power and Computer Engineering, 1–8. doi:10.1145/3529299.3531508. Online publication date: 22-Apr-2022.
    • (2022) Vision-based method for semantic information extraction in construction by integrating deep learning object detection and image captioning. Advanced Engineering Informatics 53:C. doi:10.1016/j.aei.2022.101699. Online publication date: 1-Aug-2022.
    • (2022) Computer vision-based deep learning for supervising excavator operations and measuring real-time earthwork productivity. The Journal of Supercomputing 79(4), 4468–4492. doi:10.1007/s11227-022-04803-x. Online publication date: 27-Sep-2022.
    • (2021) Automated active and idle time measurement in modular construction factory using inertial measurement unit and deep learning for dynamic simulation input. Proceedings of the Winter Simulation Conference, 1–8. doi:10.5555/3522802.3522989. Online publication date: 13-Dec-2021.
    • (2021) 3D convolutional neural network-based one-stage model for real-time action detection in video of construction equipment. Computer-Aided Civil and Infrastructure Engineering 37(1), 126–142. doi:10.1111/mice.12695. Online publication date: 10-Jun-2021.
    • (2021) DeepHaul: a deep learning and reinforcement learning-based smart automation framework for dump trucks. Progress in Artificial Intelligence 10(2), 157–180. doi:10.1007/s13748-021-00233-7. Online publication date: 10-Feb-2021.
