Abstract
This paper presents the generalized spatiotemporal analysis and lookup tool (GESTALT), an unsupervised framework for content-based video retrieval. Given a query video, GESTALT retrieves "similar" videos from a database. The motion of a prominent moving foreground object and the dynamics of its appearance (shape) are treated as the key components of the video content, and each is captured by a corresponding feature descriptor. GESTALT automatically segments the moving foreground object from the query video shot and estimates its motion trajectory. A graph-based framework explicitly captures the structural and kinematic properties of the motion trajectory, while an improved version of an existing spatiotemporal feature descriptor is proposed to model the change in object shape and movement over time. The match scores obtained from these two feature descriptors are merged by a convex combination into a single match cost, which is used to rank-order the retrieved video shots. Extensive experiments demonstrate the effectiveness of GESTALT, and a comparative study shows that it outperforms recent techniques.
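The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting or how the scores are normalized, so the weight `alpha`, the function names, and the sample values below are all hypothetical. The sketch also assumes that both scores are costs on a comparable scale and that a lower cost means a better match.

```python
from typing import Dict, List, Tuple


def combined_cost(traj_cost: float, shape_cost: float, alpha: float = 0.5) -> float:
    """Convex combination of two match costs (alpha is a hypothetical weight)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1] for a convex combination")
    return alpha * traj_cost + (1.0 - alpha) * shape_cost


def rank_shots(
    costs: Dict[str, Tuple[float, float]], alpha: float = 0.5
) -> List[Tuple[str, float]]:
    """Rank database shots by combined cost, best (lowest) first.

    `costs` maps a shot identifier to (trajectory cost, spatiotemporal cost).
    """
    scored = [
        (shot, combined_cost(traj, shape, alpha))
        for shot, (traj, shape) in costs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])


# Made-up per-shot costs, for illustration only.
example_costs = {
    "shot_a": (0.2, 0.6),
    "shot_b": (0.5, 0.1),
    "shot_c": (0.9, 0.8),
}
ranking = rank_shots(example_costs, alpha=0.5)
```

Setting `alpha` to 1 ranks by the trajectory descriptor alone, and setting it to 0 ranks by the spatiotemporal descriptor alone. Intermediate values trade the two off.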
About this article
Cite this article
Chattopadhyay, C., Das, S. Use of trajectory and spatiotemporal features for retrieval of videos with a prominent moving foreground object. SIViP 10, 319–326 (2016). https://doi.org/10.1007/s11760-014-0744-2