research-article

Learning Perceptual Causality from Video

Authors:

Song-Chun Zhu

ACM Transactions on Intelligent Systems and Technology (TIST), Volume 7, Issue 2

Article No.: 23, Pages 1 - 22

https://doi.org/10.1145/2809782

Published: 26 November 2015

Abstract

Perceptual causality is the perception of causal relationships from observation. Humans, even as infants, form such models from observation of the world around them [Saxe and Carey 2006]. For a deeper understanding, the computer must make similar models through the analogous form of observation: video. In this article, we provide a framework for the unsupervised learning of this perceptual causal structure from video. Our method takes action and object status detections as input and uses heuristics suggested by cognitive science research to produce the causal links perceived between them. We greedily modify an initial distribution featuring independence between potential causes and effects by adding dependencies that maximize information gain. We compile the learned causal relationships into a Causal And-Or Graph, a probabilistic and-or representation of causality that adds a prior to causality. Validated against human perception, experiments show that our method correctly learns causal relations, attributing status changes of objects to causing actions amid irrelevant actions. Our method outperforms Hellinger’s χ²-statistic by considering hierarchical action selection, and outperforms the treatment effect by discounting coincidental relationships.
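The greedy selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each video clip yields a binary indicator for an action and for an object status change, and it scores each candidate link by the KL divergence between the observed joint frequencies and the independence model (the product of the marginals). The function names, the `min_gain` threshold, and the toy data are hypothetical. Unlike a full pursuit, the sketch scores every candidate once and does not refit the model after each addition.

```python
import math
from itertools import product


def information_gain(pairs):
    """KL divergence between the empirical joint of (action, effect)
    and the product of its marginals (the independence model)."""
    n = len(pairs)
    joint = {ae: 0.0 for ae in product((0, 1), repeat=2)}
    for a, e in pairs:
        joint[(a, e)] += 1.0 / n
    p_a = {a: joint[(a, 0)] + joint[(a, 1)] for a in (0, 1)}
    p_e = {e: joint[(0, e)] + joint[(1, e)] for e in (0, 1)}
    gain = 0.0
    for (a, e), f in joint.items():
        if f > 0:  # empty cells contribute nothing to the divergence
            gain += f * math.log(f / (p_a[a] * p_e[e]))
    return gain


def greedy_links(observations, min_gain=0.05):
    """Rank candidate (action, effect) links by information gain and keep
    those above a hypothetical threshold, highest gain first."""
    scored = sorted(
        ((information_gain(p), link) for link, p in observations.items()),
        reverse=True,
    )
    return [link for gain, link in scored if gain >= min_gain]


# Toy data (invented for illustration): one (action, effect) pair per clip.
toy = {
    ("flip_switch", "light_on"): [(1, 1)] * 5 + [(0, 0)] * 5,
    ("drink", "light_on"): [(1, 1), (1, 0), (0, 1), (0, 0)] * 2,
}
selected = greedy_links(toy)
```

In the toy data the switch action always co-occurs with the light changing, so its link carries high gain, while the drinking action is statistically independent of the light and is discarded.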

References

[1] M. Albanese, R. Chellappa, N. Cuntoor, V. Moscato, A. Picariello, V. S. Subrahmanian, and O. Udrea. 2010. PADS: A probabilistic activity detection framework for video data. IEEE Trans. Pattern Anal. Mach. Intell. 32, 12 (2010), 2246--2261.

[2] M. Brand. 1997. The “Inverse Hollywood Problem”: From video to scripts and storyboards via causal analysis. In Proceedings of the National Conference on Artificial Intelligence. 132--137.

[3] S. Carey. 2009. The Origin of Concepts. Oxford University Press.

[4] I. Csiszár and P. C. Shields. 2004. Information theory and statistics: A tutorial. Commun. Inf. Theory 1, 4 (2004), 417--528.

[5] S. Della Pietra, V. Della Pietra, and J. Lafferty. 1997. Inducing features of random fields. IEEE Trans. Pattern Anal. Mach. Intell. 19, 4 (1997), 380--393.

[6] A. Fire and S.-C. Zhu. 2013a. Learning perceptual causality from video. In AAAI Workshop: Learning Rich Representations from Low-Level Sensors.

[7] A. Fire and S.-C. Zhu. 2013b. Using causal induction in humans to learn and infer causality from video. In Proceedings of the 35th Annual Conference of the Cognitive Science Society. 2297--2302.

[8] J. Friedman, T. Hastie, and R. Tibshirani. 2000. Additive logistic regression: A statistical view of boosting (With discussion and a rejoinder by the authors). Ann. Stat. 28, 2 (2000), 337--407.

[9] T. L. Griffiths and J. B. Tenenbaum. 2005. Structure and strength in causal induction. Cognitive Psychol. 51, 4 (2005), 334--384.

[10] Y. Hagmayer and M. R. Waldmann. 2002. How temporal assumptions influence causal judgments. Mem. Cognition 30, 7 (2002), 1128--1137.

[11] D. Heckerman. 1995. A Bayesian approach to learning causal networks. In Proceedings of the 11th Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann, 285--295.

[12] Y. A. Ivanov and A. F. Bobick. 2000. Recognition of visual activities and interactions by stochastic parsing. IEEE Trans. Pattern Anal. Mach. Intell. 22, 8 (2000), 852--872.

[13] S. Lazebnik, C. Schmid, and J. Ponce. 2006. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2. IEEE, 2169--2178.

[14] R. Mann, A. Jepson, and J. M. Siskind. 1997. The computational perception of scene dynamics. Comput. Vis. Image Und. 65, 2 (1997), 113--128.

[15] E. T. Mueller. 2006. Commonsense Reasoning. Morgan Kaufmann Publishers Inc., San Francisco.

[16] J. Pearl. 2009. Causality: Models, Reasoning and Inference (2nd ed.). Cambridge University Press, New York.

[17] M. Pei, Y. Jia, and S.-C. Zhu. 2011. Parsing video events with goal inference and intent prediction. In ICCV. 487--494.

[18] K. Prabhakar, S. Oh, P. Wang, G. D. Abowd, and J. M. Rehg. 2010. Temporal causality for the analysis of visual events. In CVPR.

[19] M. Richardson and P. Domingos. 2006. Markov logic networks. Machine Learning 62, 1 (2006), 107--136.

[20] D. B. Rubin. 2005. Causal inference using potential outcomes. J. Am. Statist. Assoc. 100, 469 (2005), 322--331.

[21] R. Saxe and S. Carey. 2006. The perception of causality in infancy. Acta Psychol. 123, 1 (2006), 144--165.

[22] R. Saxe, J. B. Tenenbaum, and S. Carey. 2005. Secret agents: Inferences about hidden causes by 10- and 12-month-old infants. Psychol. Sci. 16, 12 (2005), 995--1001.

[23] R. Scheines, H. Hoijtink, and A. Boomsma. 1999. Bayesian estimation and testing of structural equation models. Psychometrika 64, 1 (1999), 37--52.

[24] P. Spirtes, C. Glymour, and R. Scheines. 2000. Causation, Prediction, and Search. Vol. 81. MIT Press.

[25] S. Tran and L. Davis. 2008. Event modeling and recognition using Markov logic networks. ECCV (2008), 610--623.

[26] Z. Tu, X. Chen, A. L. Yuille, and S.-C. Zhu. 2005. Image parsing: Unifying segmentation, detection, and recognition. Int. J. Comput. Vis. 63, 2 (2005), 113--140.

[27] P. Wei, Y. Zhao, N. Zheng, and S.-C. Zhu. 2013. Modeling 4d human-object interactions for event and object recognition. In Proceedings of the 2013 IEEE International Conference on Computer Vision (ICCV). IEEE, 3272--3279.

[28] S.-C. Zhu and D. Mumford. 2006. A Stochastic Grammar of Images. Now Publishers Inc., Hanover, MA.

[29] S.-C. Zhu, Y. N. Wu, and D. Mumford. 1997. Minimax entropy principle and its application to texture modeling. Neural Comput. 9, 8 (1997), 1627--1660.

Index Terms

Learning Perceptual Causality from Video
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision representations
        Hierarchical representations
    2. Knowledge representation and reasoning
      1. Reasoning about belief and knowledge
  2. Machine learning
    1. Learning paradigms
      1. Unsupervised learning

Information & Contributors

Information

Published In

ACM Transactions on Intelligent Systems and Technology Volume 7, Issue 2

Special Issue on Causal Discovery and Inference

January 2016

270 pages

ISSN:2157-6904

EISSN:2157-6912

DOI:10.1145/2850424

Editor:
Yu Zheng
Microsoft Research, China

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 November 2015

Accepted: 01 July 2015

Revised: 01 November 2014

Received: 01 July 2014

Published in TIST Volume 7, Issue 2

Qualifiers

Research-article
Research
Refereed

Funding Sources

Office of Naval Research, under MURI

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 36
Total Downloads: 542
Downloads (Last 12 months): 46
Downloads (Last 6 weeks): 3

Reflects downloads up to 22 Nov 2024

Cited By

R. Li, T. Xu, X. Wu, Z. Shen, and J. Kittler. 2024. Perceiving Actions via Temporal Video Frame Pairs. ACM Transactions on Intelligent Systems and Technology. Online publication date: 17-Mar-2024.
https://doi.org/10.1145/3652611

X. Li, X. Yang, X. Wang, and C. Deng. 2024. Agree to Disagree: Exploring Partial Semantic Consistency Against Visual Deviation for Compositional Zero-Shot Learning. IEEE Transactions on Cognitive and Developmental Systems 16, 4 (2024), 1433--1444. Online publication date: Aug-2024.
https://doi.org/10.1109/TCDS.2024.3367957

A. Vassiliades, T. Patkos, V. Efthymiou, A. Bikakis, N. Bassiliades, and D. Plexousakis. 2024. Extraction of object-action and object-state associations from Knowledge Graphs. Journal of Web Semantics 81 (2024), 100816. Online publication date: Jul-2024.
https://doi.org/10.1016/j.websem.2024.100816

C. Yang, D. Hung, Y. Liu, and P. Chen. 2023. Treatment Learning Causal Transformer for Noisy Image Classification. In 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). 6128--6139. Online publication date: Jan-2023.
https://doi.org/10.1109/WACV56688.2023.00608

S. Ho. 2023. Why is that a Good or Not a Good Frying Pan? – Knowledge Representation for Functions of Objects and Tools for Design Understanding, Improvement, and Generation. In 2023 IEEE Symposium Series on Computational Intelligence (SSCI). 121--128. Online publication date: 5-Dec-2023.
https://doi.org/10.1109/SSCI52147.2023.10371873

J. Yang, W. Peng, X. Li, Z. Guo, L. Chen, B. Li, Z. Ma, K. Zhou, W. Zhang, C. Loy, and Z. Liu. 2023. Panoptic Video Scene Graph Generation. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 18675--18685. Online publication date: Jun-2023.
https://doi.org/10.1109/CVPR52729.2023.01791

W. Ali, W. Zuo, W. Ying, R. Ali, G. Rahman, and I. Ullah. 2023. Causality extraction: A comprehensive survey and new perspective. Journal of King Saud University - Computer and Information Sciences 35, 7 (2023), 101593. Online publication date: Jul-2023.
https://doi.org/10.1016/j.jksuci.2023.101593

L. Yuan and S. Zhu. 2023. Communicative Learning: A Unified Learning Formalism. Engineering 25 (2023), 77--100. Online publication date: Jun-2023.
https://doi.org/10.1016/j.eng.2022.10.017

A. Wich, H. Schultheis, M. Beetz, C. Pelachaud, M. Taylor, P. Faliszewski, and V. Mascardi. 2022. Empirical Estimates on Hand Manipulation are Recoverable: A Step Towards Individualized and Explainable Robotic Support in Everyday Activities. In Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems. 1382--1390. Online publication date: 9-May-2022.
https://dl.acm.org/doi/10.5555/3535850.3536004

A. Fender and C. Holz. 2022. Causality-preserving Asynchronous Reality. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems. 1--15. Online publication date: 29-Apr-2022.
https://dl.acm.org/doi/10.1145/3491102.3501836