research-article

Learning Perceptual Causality from Video

Published: 26 November 2015

Abstract

Perceptual causality is the perception of causal relationships from observation. Humans, even as infants, form such models from observation of the world around them [Saxe and Carey 2006]. For a deeper understanding, the computer must make similar models through the analogous form of observation: video. In this article, we provide a framework for the unsupervised learning of this perceptual causal structure from video. Our method takes action and object status detections as input and uses heuristics suggested by cognitive science research to produce the causal links perceived between them. We greedily modify an initial distribution featuring independence between potential causes and effects by adding dependencies that maximize information gain. We compile the learned causal relationships into a Causal And-Or Graph, a probabilistic and-or representation of causality that adds a prior to causality. Validated against human perception, experiments show that our method correctly learns causal relations, attributing status changes of objects to causing actions amid irrelevant actions. Our method outperforms Hellinger’s χ²-statistic by considering hierarchical action selection, and outperforms the treatment effect by discounting coincidental relationships.
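The greedy step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes each video clip has already been reduced to a set of detected actions and a set of detected status changes, it scores a candidate (action, effect) pair by the KL divergence between the observed 2x2 joint frequencies and an independence model built from the marginals, and it never re-fits the model after a relation is added, so selection here reduces to ranking by gain. The clip data, the names `information_gain` and `greedy_select`, and the `min_gain` threshold are all hypothetical.

```python
import math
from itertools import product


def information_gain(clips, action, effect):
    """KL divergence between the observed joint frequencies of
    (action present, effect present) and the independence baseline."""
    n = len(clips)
    p_a = sum(action in c["actions"] for c in clips) / n
    p_e = sum(effect in c["effects"] for c in clips) / n
    gain = 0.0
    for a, e in product((True, False), repeat=2):
        # Observed frequency of this cell of the 2x2 table.
        f = sum(
            (action in c["actions"]) == a and (effect in c["effects"]) == e
            for c in clips
        ) / n
        # Frequency the independence model expects for the same cell.
        h = (p_a if a else 1 - p_a) * (p_e if e else 1 - p_e)
        if f > 0:
            gain += f * math.log(f / h)
    return gain


def greedy_select(clips, actions, effects, min_gain=0.05):
    """Repeatedly add the candidate relation with the largest gain,
    stopping once the best remaining gain falls below min_gain."""
    candidates = sorted(product(actions, effects))
    selected = []
    while candidates:
        best = max(candidates, key=lambda r: information_gain(clips, *r))
        gain = information_gain(clips, *best)
        if gain < min_gain:
            break
        selected.append((best, gain))
        candidates.remove(best)
    return selected


# Hypothetical detections: the light changes only when the switch is flipped;
# drinking co-occurs with the light change exactly at chance level.
clips = [
    {"actions": {"flip_switch"}, "effects": {"light_on"}},
    {"actions": {"flip_switch", "drink"}, "effects": {"light_on"}},
    {"actions": {"drink"}, "effects": set()},
    {"actions": set(), "effects": set()},
    {"actions": {"flip_switch"}, "effects": {"light_on"}},
    {"actions": {"drink"}, "effects": set()},
    {"actions": {"flip_switch", "drink"}, "effects": {"light_on"}},
    {"actions": set(), "effects": set()},
]

selected = greedy_select(clips, ["flip_switch", "drink"], ["light_on"])
for (action, effect), gain in selected:
    print(f"{action} -> {effect}: gain = {gain:.3f}")
```

On this toy data only the switch relation is kept, while the irrelevant drinking action scores zero gain and is discarded. A fuller treatment would update the model's expected frequencies after each added relation before scoring the remaining candidates.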

References

[1] M. Albanese, R. Chellappa, N. Cuntoor, V. Moscato, A. Picariello, V. S. Subrahmanian, and O. Udrea. 2010. PADS: A probabilistic activity detection framework for video data. IEEE Trans. Pattern Anal. Mach. Intell. 32, 12 (2010), 2246--2261.
[2] M. Brand. 1997. The “Inverse Hollywood Problem”: From video to scripts and storyboards via causal analysis. In Proceedings of the National Conference on Artificial Intelligence. 132--137.
[3] S. Carey. 2009. The Origin of Concepts. Oxford University Press.
[4] I. Csiszár and P. C. Shields. 2004. Information theory and statistics: A tutorial. Commun. Inf. Theory 1, 4 (2004), 417--528.
[5] S. Della Pietra, V. Della Pietra, and J. Lafferty. 1997. Inducing features of random fields. IEEE Trans. Pattern Anal. Mach. Intell. 19, 4 (1997), 380--393.
[6] A. Fire and S.-C. Zhu. 2013a. Learning perceptual causality from video. In AAAI Workshop: Learning Rich Representations from Low-Level Sensors.
[7] A. Fire and S.-C. Zhu. 2013b. Using causal induction in humans to learn and infer causality from video. In Proceedings of the 35th Annual Conference of the Cognitive Science Society. 2297--2302.
[8] J. Friedman, T. Hastie, and R. Tibshirani. 2000. Additive logistic regression: A statistical view of boosting (With discussion and a rejoinder by the authors). Ann. Stat. 28, 2 (2000), 337--407.
[9] T. L. Griffiths and J. B. Tenenbaum. 2005. Structure and strength in causal induction. Cognitive Psychol. 51, 4 (2005), 334--384.
[10] Y. Hagmayer and M. R. Waldmann. 2002. How temporal assumptions influence causal judgments. Mem. Cognition 30, 7 (2002), 1128--1137.
[11] D. Heckerman. 1995. A Bayesian approach to learning causal networks. In Proceedings of the 11th Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann, 285--295.
[12] Y. A. Ivanov and A. F. Bobick. 2000. Recognition of visual activities and interactions by stochastic parsing. IEEE Trans. Pattern Anal. Mach. Intell. 22, 8 (2000), 852--872.
[13] S. Lazebnik, C. Schmid, and J. Ponce. 2006. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2. IEEE, 2169--2178.
[14] R. Mann, A. Jepson, and J. M. Siskind. 1997. The computational perception of scene dynamics. Comput. Vis. Image Und. 65, 2 (1997), 113--128.
[15] E. T. Mueller. 2006. Commonsense Reasoning. Morgan Kaufmann Publishers Inc., San Francisco.
[16] J. Pearl. 2009. Causality: Models, Reasoning and Inference (2nd ed.). Cambridge University Press, New York.
[17] M. Pei, Y. Jia, and S.-C. Zhu. 2011. Parsing video events with goal inference and intent prediction. In ICCV. 487--494.
[18] K. Prabhakar, S. Oh, P. Wang, G. D. Abowd, and J. M. Rehg. 2010. Temporal causality for the analysis of visual events. In CVPR.
[19] M. Richardson and P. Domingos. 2006. Markov logic networks. Machine Learning 62, 1 (2006), 107--136.
[20] D. B. Rubin. 2005. Causal inference using potential outcomes. J. Am. Statist. Assoc. 100, 469 (2005), 322--331.
[21] R. Saxe and S. Carey. 2006. The perception of causality in infancy. Acta Psychol. 123, 1 (2006), 144--165.
[22] R. Saxe, J. B. Tenenbaum, and S. Carey. 2005. Secret agents: Inferences about hidden causes by 10- and 12-month-old infants. Psychol. Sci. 16, 12 (2005), 995--1001.
[23] R. Scheines, H. Hoijtink, and A. Boomsma. 1999. Bayesian estimation and testing of structural equation models. Psychometrika 64, 1 (1999), 37--52.
[24] P. Spirtes, C. Glymour, and R. Scheines. 2000. Causation, Prediction, and Search. Vol. 81. MIT Press.
[25] S. Tran and L. Davis. 2008. Event modeling and recognition using Markov logic networks. ECCV (2008), 610--623.
[26] Z. Tu, X. Chen, A. L. Yuille, and S.-C. Zhu. 2005. Image parsing: Unifying segmentation, detection, and recognition. Int. J. Comput. Vis. 63, 2 (2005), 113--140.
[27] P. Wei, Y. Zhao, N. Zheng, and S.-C. Zhu. 2013. Modeling 4D human-object interactions for event and object recognition. In Proceedings of the 2013 IEEE International Conference on Computer Vision (ICCV). IEEE, 3272--3279.
[28] S.-C. Zhu and D. Mumford. 2006. A Stochastic Grammar of Images. Now Publishers Inc., Hanover, MA.
[29] S.-C. Zhu, Y. N. Wu, and D. Mumford. 1997. Minimax entropy principle and its application to texture modeling. Neural Comput. 9, 8 (1997), 1627--1660.


Information & Contributors

Information

Published In

ACM Transactions on Intelligent Systems and Technology, Volume 7, Issue 2
Special Issue on Causal Discovery and Inference
January 2016
270 pages
ISSN: 2157-6904
EISSN: 2157-6912
DOI: 10.1145/2850424
Editor: Yu Zheng
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 November 2015
Accepted: 01 July 2015
Revised: 01 November 2014
Received: 01 July 2014
Published in TIST Volume 7, Issue 2


Author Tags

  1. Perceptual causality
  2. causal induction
  3. information projection

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Office of Naval Research, under MURI

Cited By

  • (2024) Perceiving Actions via Temporal Video Frame Pairs. ACM Transactions on Intelligent Systems and Technology. DOI: 10.1145/3652611. Online publication date: 17-Mar-2024
  • (2024) Agree to Disagree: Exploring Partial Semantic Consistency Against Visual Deviation for Compositional Zero-Shot Learning. IEEE Transactions on Cognitive and Developmental Systems 16:4 (1433-1444). DOI: 10.1109/TCDS.2024.3367957. Online publication date: Aug-2024
  • (2024) Extraction of object-action and object-state associations from Knowledge Graphs. Journal of Web Semantics 81 (100816). DOI: 10.1016/j.websem.2024.100816. Online publication date: Jul-2024
  • (2023) Treatment Learning Causal Transformer for Noisy Image Classification. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (6128-6139). DOI: 10.1109/WACV56688.2023.00608. Online publication date: Jan-2023
  • (2023) Why is that a Good or Not a Good Frying Pan? – Knowledge Representation for Functions of Objects and Tools for Design Understanding, Improvement, and Generation. 2023 IEEE Symposium Series on Computational Intelligence (SSCI) (121-128). DOI: 10.1109/SSCI52147.2023.10371873. Online publication date: 5-Dec-2023
  • (2023) Panoptic Video Scene Graph Generation. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (18675-18685). DOI: 10.1109/CVPR52729.2023.01791. Online publication date: Jun-2023
  • (2023) Causality extraction: A comprehensive survey and new perspective. Journal of King Saud University - Computer and Information Sciences 35:7 (101593). DOI: 10.1016/j.jksuci.2023.101593. Online publication date: Jul-2023
  • (2023) Communicative Learning: A Unified Learning Formalism. Engineering 25 (77-100). DOI: 10.1016/j.eng.2022.10.017. Online publication date: Jun-2023
  • (2022) Empirical Estimates on Hand Manipulation are Recoverable: A Step Towards Individualized and Explainable Robotic Support in Everyday Activities. Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems (1382-1390). DOI: 10.5555/3535850.3536004. Online publication date: 9-May-2022
  • (2022) Causality-preserving Asynchronous Reality. Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (1-15). DOI: 10.1145/3491102.3501836. Online publication date: 29-Apr-2022
