research-article

Ontology-based context representation and reasoning for object tracking and scene interpretation in video

Authors: Juan Gómez-Romero, Miguel A. Patricio, Jesús García, José M. Molina

Expert Systems with Applications: An International Journal, Volume 38, Issue 6

Pages 7494 - 7510

https://doi.org/10.1016/j.eswa.2010.12.118

Published: 01 June 2011

Abstract

Research highlights: We have developed a general framework for Computer Vision systems. Perceived and contextual knowledge is represented with ontologies. Rule-based reasoning is applied to achieve scene interpretation and vision enhancement. The framework can be extended and applied in different application domains.

Computer vision research has traditionally focused on the development of quantitative techniques to calculate the properties and relations of the entities appearing in a video sequence. Most object tracking methods are based on statistical methods, which often prove inadequate for processing complex scenarios. Recently, new techniques based on the exploitation of contextual information have been proposed to overcome the problems that these classical approaches do not solve. The present paper is a contribution in this direction: we propose a Computer Vision framework aimed at the construction of a symbolic model of the scene by integrating tracking data and contextual information. The scene model, represented with formal ontologies, supports the execution of reasoning procedures in order to (i) obtain a high-level interpretation of the scenario and (ii) provide feedback to the low-level tracking procedure to improve its accuracy and performance. The paper describes the layered architecture of the framework and the structure of the knowledge model, which have been designed in compliance with the JDL model for Information Fusion. We also explain how deductive and abductive reasoning is performed within the model to accomplish scene interpretation and tracking improvement. To show the advantages of our approach, we develop an example of the use of the framework in a video-surveillance application.
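The two reasoning goals named in the abstract, scene interpretation and feedback to the tracker, can be illustrated with a small sketch. This is not the authors' framework: it uses plain Python data classes in place of formal ontologies and a hand-written rule in place of an ontology reasoner, and every name in it (`Box`, `ContextObject`, `Track`, `interpret`, `feedback`, the `occluder` and `restricted_zone` labels, the `keep_track_alive` recommendation) is a hypothetical choice made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in image coordinates (hypothetical model)."""
    x: float
    y: float
    w: float
    h: float

    def overlaps(self, other: "Box") -> bool:
        # Two boxes overlap when their extents intersect on both axes.
        return (self.x < other.x + other.w and other.x < self.x + self.w
                and self.y < other.y + other.h and other.y < self.y + self.h)


@dataclass(frozen=True)
class ContextObject:
    """Static scene element known in advance (contextual knowledge)."""
    name: str
    kind: str  # assumed labels: "occluder" or "restricted_zone"
    box: Box


@dataclass(frozen=True)
class Track:
    """Output of the low-level tracker (perceived knowledge)."""
    track_id: int
    box: Box
    visible: bool


def interpret(tracks, context):
    """Apply simple rules and return (subject, predicate, object) facts."""
    facts = set()
    for t in tracks:
        subject = f"track{t.track_id}"
        for c in context:
            if not t.box.overlaps(c.box):
                continue
            facts.add((subject, "overlaps", c.name))
            if c.kind == "restricted_zone":
                facts.add((subject, "isIn", c.name))
            if c.kind == "occluder" and not t.visible:
                # A lost track overlapping an occluder is explained as hidden.
                facts.add((subject, "occludedBy", c.name))
    return facts


def feedback(facts):
    """Turn inferred facts into recommendations for the tracker."""
    return {s: "keep_track_alive" for (s, p, _) in facts if p == "occludedBy"}


tracks = [Track(1, Box(10, 10, 5, 10), visible=False),
          Track(2, Box(50, 50, 5, 10), visible=True)]
context = [ContextObject("column1", "occluder", Box(12, 0, 4, 30)),
           ContextObject("zoneA", "restricted_zone", Box(45, 45, 20, 20))]

facts = interpret(tracks, context)
recommendations = feedback(facts)
```

Here `facts` plays the role of the high-level interpretation (for instance, that the second track is inside `zoneA`), while `recommendations` plays the role of the feedback loop: the tracker is advised to keep the first track alive because its disappearance is explained by a known occluder. A system along the lines described in the abstract would express the same knowledge as ontology axioms and rules rather than Python conditionals.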

Index Terms

Ontology-based context representation and reasoning for object tracking and scene interpretation in video
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Scene understanding
    2. Knowledge representation and reasoning
2. Theory of computation
  1. Logic

Published In

Expert Systems with Applications: An International Journal, Volume 38, Issue 6
June 2011
1507 pages
ISSN: 0957-4174

Copyright © Elsevier Ltd.

Publisher

Pergamon Press, Inc.
United States

Cited By

Patel A, Vyas R, Vyas O, Ojha M (2022). A study on video semantics; overview, challenges, and applications. Multimedia Tools and Applications 81:5, 6849-6897. Online publication date: 1-Feb-2022.
https://dl.acm.org/doi/10.1007/s11042-021-11722-1

Patel A, Merlino G, Bruneo D, Puliafito A, Vyas O, Ojha M (2021). Video representation and suspicious event detection using semantic technologies. Semantic Web 12:3, 467-491. Online publication date: 1-Jan-2021.
https://dl.acm.org/doi/10.3233/SW-200393

Cavaliere D, Loia V, Saggese A, Senatore S, Vento M (2019). A human-like description of scene events for a proper UAV-based video content analysis. Knowledge-Based Systems 178:C, 163-175. Online publication date: 15-Aug-2019.
https://dl.acm.org/doi/10.1016/j.knosys.2019.04.026

Kelathodi Kumaran S, Prosad Dogra D, Pratim Roy P (2019). Queuing theory guided intelligent traffic scheduling through video analysis using Dirichlet process mixture model. Expert Systems with Applications: An International Journal 118:C, 169-181. Online publication date: 15-Mar-2019.
https://dl.acm.org/doi/10.1016/j.eswa.2018.09.057

Sánchez-Nielsen E, Chávez-Gutiérrez F, Lorenzo-Navarro J (2019). A semantic parliamentary multimedia approach for retrieval of video clips with content understanding. Multimedia Systems 25:4, 337-354. Online publication date: 1-Aug-2019.
https://dl.acm.org/doi/10.1007/s00530-019-00610-2

Padilla W, García J, Molina J (2018). Improving Forecasting Using Information Fusion in Local Agricultural Markets. Hybrid Artificial Intelligent Systems, 479-489. Online publication date: 20-Jun-2018.
https://dl.acm.org/doi/10.1007/978-3-319-92639-1_40

Greco L, Ritrovato P, Vento M, Akerkar R, Cuzzocrea A, Cao J, Hacid M (2017). Advanced video analytics. Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics, 1-6. Online publication date: 19-Jun-2017.
https://dl.acm.org/doi/10.1145/3102254.3102276

Chen H, Wong C, Feng H (2017). Wireless image fuzzy recognition system for human activity. Multimedia Tools and Applications 76:23, 25231-25251. Online publication date: 1-Dec-2017.
https://dl.acm.org/doi/10.1007/s11042-016-4302-5

Sikos L (2017). RDF-powered semantic video annotation tools with concept mapping to Linked Data for next-generation video indexing. Multimedia Tools and Applications 76:12, 14437-14460. Online publication date: 1-Jun-2017.
https://dl.acm.org/doi/10.1007/s11042-016-3705-7

Sikos L, Powers D, Balog K, Dalton J, Doucet A, Ibrahim Y (2015). Knowledge-Driven Video Information Retrieval with LOD. Proceedings of the Eighth Workshop on Exploiting Semantic Annotations in Information Retrieval, 35-37. Online publication date: 22-Oct-2015.
https://dl.acm.org/doi/10.1145/2810133.2810141
