Learning topic of dynamic scene using belief propagation and weighted visual words approach

Authors: Shengrong Gong, Quan Liu

Soft Computing - A Fusion of Foundations, Methodologies and Applications, Volume 19, Issue 1

Pages 71 - 84

https://doi.org/10.1007/s00500-014-1384-8

Published: 01 January 2015

Abstract

This paper tackles the problem of distinguishing scenes, both static and dynamic. We propose a scene recognition framework based on bag of visual words and a topic model. We perform the task using the topic model by belief propagation (TMBP), which belongs to the latent Dirichlet allocation family of models. We also extend TMBP into what we call the knowledge TMBP model by introducing prior information about visual words and scenes. Experimental results on static and dynamic scenes demonstrate that the proposed framework is effective and efficient. Within the framework, scene semantics can be obtained at two levels: visual words and topics. Our results significantly outperform those of methods that use low-level visual features, such as spatial, temporal and spatiotemporal features.
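To make the two stages named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes local descriptors and a codebook are already available as NumPy arrays, builds an unweighted bag-of-visual-words histogram, and then runs a simplified synchronous belief-propagation update for a latent Dirichlet allocation style model. The function names, the hyperparameters `alpha` and `beta`, and the leave-one-entry-out approximation are assumptions made for illustration; the paper's word weighting and its knowledge TMBP extension are not modelled here.

```python
import numpy as np


def bovw_histogram(descriptors, codebook):
    """Count how many local descriptors fall nearest to each codeword."""
    # Squared Euclidean distance from every descriptor to every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # index of the nearest visual word
    return np.bincount(words, minlength=len(codebook)).astype(float)


def lda_belief_propagation(X, n_topics, alpha=0.1, beta=0.01, n_iter=50, seed=0):
    """Simplified synchronous BP for LDA on a (documents x words) count matrix.

    Assumption: each message excludes only its own (document, word) entry
    when it reads the aggregated topic statistics.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = X.shape
    # mu[d, w, k]: belief that word w in document d is assigned to topic k.
    mu = rng.random((n_docs, n_words, n_topics))
    mu /= mu.sum(axis=2, keepdims=True)

    for _ in range(n_iter):
        weighted = X[:, :, None] * mu                     # counts times messages
        doc_mass = weighted.sum(axis=1, keepdims=True)    # topic mass per document
        word_mass = weighted.sum(axis=0, keepdims=True)   # topic mass per word
        topic_mass = weighted.sum(axis=(0, 1), keepdims=True)
        # Leave out the current entry, then add the Dirichlet smoothing terms.
        doc_part = doc_mass - weighted + alpha
        word_part = word_mass - weighted + beta
        norm = topic_mass - weighted + n_words * beta
        mu = doc_part * word_part / norm
        mu /= mu.sum(axis=2, keepdims=True)               # normalise over topics

    weighted = X[:, :, None] * mu
    doc_topic = weighted.sum(axis=1) + alpha              # shape (docs, topics)
    doc_topic /= doc_topic.sum(axis=1, keepdims=True)
    topic_word = weighted.sum(axis=0).T + beta            # shape (topics, words)
    topic_word /= topic_word.sum(axis=1, keepdims=True)
    return doc_topic, topic_word
```

In this sketch each image or video clip would contribute one row of `X`, obtained by calling `bovw_histogram` on its descriptors. The rows of `doc_topic` could then serve as topic-level scene representations for a downstream classifier, which is one plausible way to read the two semantic levels described in the abstract.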

Cited By

Zhang X, Yu Y (2022) Optimization of Printing and Dyeing Energy Consumption Based on Multimedia Machine Learning Algorithm. Security and Communication Networks 2022. doi:10.1155/2022/1960425. Online publication date: 1-Jan-2022.
https://dl.acm.org/doi/10.1155/2022/1960425

Information

Published In

Soft Computing - A Fusion of Foundations, Methodologies and Applications Volume 19, Issue 1

January 2015

251 pages

ISSN: 1432-7643

EISSN: 1433-7479

Copyright © 2015 Springer-Verlag Berlin Heidelberg.

Publisher

Springer-Verlag

Berlin, Heidelberg
