Abstract
Video summarization condenses a video into a shorter form without losing valuable information. Efficient storage and the extraction of valuable information are both challenging tasks in video analysis. Intelligent video surveillance systems play an essential role in ensuring public safety and security, and they are now widely deployed in applications ranging from border security to street monitoring. Surveillance cameras and motion-sensitive cameras produce large volumes of data when recording video. Because manual analysis of this footage demands immense manpower, automatic video summarization is an important and growing research topic. It is therefore necessary to summarize the activities in a scene and eliminate redundant and uninformative events recorded in videos. The proposed work develops a video summarization framework that uses key moment-based frame selection and clustering of frames to identify only the informative frames. The key moment is a simple yet effective characteristic for summarizing a long video shot. Motion, the most salient feature for presenting actions or events in video, is used here to extract the key moments of the video frames: a key moment corresponds to the part of the scene in a frame where the motion shows the greatest acceleration and deceleration. Based on the extracted key moments, the frames are partitioned into groups using a novel similarity-based agglomerative clustering algorithm. The algorithm determines at most K clusters of frames based on the Jaccard similarity among the clusters, where K is a user-defined parameter set to 5% to 15% of the size of the video to be summarized. From each cluster, a few representative frames are identified based on the cluster centroids and arranged according to their original order in the video to generate the summary. The proposed clustering algorithm and the summarization method are evaluated on state-of-the-art video datasets and compared with related methodologies to demonstrate their effectiveness.
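The clustering and selection steps outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes each frame is already described by a set of integer key-moment identifiers (a hypothetical representation), that the similarity between two clusters is the Jaccard similarity of the unions of their members' sets, and that one representative is kept per cluster. Because a centroid is not defined for sets, the sketch substitutes a medoid (the frame most similar on average to the rest of its cluster). The names `jaccard`, `cluster_frames`, `representative`, and `summarize` are illustrative only.

```python
from typing import Dict, FrozenSet, List


def jaccard(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Jaccard similarity |a & b| / |a | b| (taken as 1.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_frames(frames: Dict[int, FrozenSet[int]], k: int) -> List[List[int]]:
    """Greedily merge the two most similar clusters until at most k remain."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Assumption: a cluster is summarized by the union of its members' sets.
    clusters = [([idx], feats) for idx, feats in frames.items()]
    while len(clusters) > k:
        best_sim, best_i, best_j = -1.0, 0, 1
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = jaccard(clusters[i][1], clusters[j][1])
                if sim > best_sim:
                    best_sim, best_i, best_j = sim, i, j
        merged = (
            clusters[best_i][0] + clusters[best_j][0],
            clusters[best_i][1] | clusters[best_j][1],
        )
        clusters = [
            c for n, c in enumerate(clusters) if n not in (best_i, best_j)
        ] + [merged]
    return [sorted(members) for members, _ in clusters]


def representative(members: List[int], frames: Dict[int, FrozenSet[int]]) -> int:
    """Medoid stand-in for the centroid: frame with the highest total similarity."""
    return max(
        members,
        key=lambda m: sum(jaccard(frames[m], frames[o]) for o in members),
    )


def summarize(frames: Dict[int, FrozenSet[int]], k: int) -> List[int]:
    """Return one representative frame index per cluster, in original order."""
    clusters = cluster_frames(frames, k)
    return sorted(representative(c, frames) for c in clusters)


# Toy input: frame index -> hypothetical key-moment identifiers.
toy_frames = {
    0: frozenset({1, 2}),
    1: frozenset({1, 2, 3}),
    2: frozenset({1, 2, 3, 4}),
    3: frozenset({7, 8}),
    4: frozenset({7, 8, 9}),
    5: frozenset({7, 8, 9, 10}),
}
print(summarize(toy_frames, 2))  # → [1, 4]
```

In the toy input, frames 0 to 2 and frames 3 to 5 share no identifiers, so they end up in two separate clusters, and the middle frame of each group is the most similar to its neighbours. In practice `k` would be derived from the video size (5% to 15% according to the abstract), and the exhaustive pairwise search shown here would be too slow for long videos without caching the similarities.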
Ethics declarations
Conflict of interest
The authors declare that this manuscript has no conflict of interest with any other published source and has not been published previously (partly or in full). No data have been fabricated or manipulated to support our conclusions.
About this article
Cite this article
Yasmin, G., Chowdhury, S., Nayak, J. et al. Key moment extraction for designing an agglomerative clustering algorithm-based video summarization framework. Neural Comput & Applic 35, 4881–4902 (2023). https://doi.org/10.1007/s00521-021-06132-1
DOI: https://doi.org/10.1007/s00521-021-06132-1