poster

Multimedia event recounting with concept based representation

Published: 29 October 2012

Abstract

Multimedia event detection has drawn considerable attention in recent years. In this paper, we conduct a pilot study of the multimedia event recounting problem: given a video recognized as a particular event, explain why it was recognized as that event, i.e., what evidence the decision is based on. To provide a semantic recounting of the multimedia event, we adopt a concept-based event representation for learning a discriminative event model. We then present a recounting approach that exactly recovers the contribution of each piece of semantic evidence to the event classification decision. This approach can be applied to any additive discriminative classifier. Promising results are shown on the MED11 dataset, which contains 15 events in thousands of YouTube-like videos.
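
The idea of exactly recovering per-concept contributions from an additive classifier can be illustrated with a minimal sketch. It assumes the simplest additive case, a linear model over concept detector scores, where the decision value is a bias plus a sum of per-concept terms. The function name `recount`, the concept names, and all weights and scores are hypothetical and chosen for illustration only; this is not the authors' implementation.

```python
from typing import Dict, List, Tuple


def recount(
    weights: Dict[str, float],
    concept_scores: Dict[str, float],
    bias: float,
    top_k: int = 3,
) -> Tuple[float, List[Tuple[str, float]]]:
    """Split an additive (here: linear) decision value into per-concept terms.

    Assumes f(x) = bias + sum_i w_i * x_i, so each term w_i * x_i is exactly
    the contribution of concept i to the decision.
    """
    # Per-concept contribution; a concept missing from the video scores 0.
    contributions = {
        name: w * concept_scores.get(name, 0.0) for name, w in weights.items()
    }
    # The contributions plus the bias reproduce the decision value exactly.
    decision = bias + sum(contributions.values())
    # Rank concepts by how strongly they support the event.
    ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    return decision, ranked[:top_k]


# Hypothetical event model and concept detector outputs for one video.
weights = {"cake": 1.5, "candles": 2.0, "singing": 1.0, "car": -0.5}
scores = {"cake": 0.5, "candles": 0.75, "singing": 0.25, "car": 0.5}

decision, evidence = recount(weights, scores, bias=-1.0)
print(decision)  # 1.25
print(evidence)  # [('candles', 1.5), ('cake', 0.75), ('singing', 0.25)]
```

Because the model is additive, the ranked terms sum (with the bias) to the decision value, so the listed evidence accounts for the classification exactly rather than approximately.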

Information & Contributors

Information

Published In

MM '12: Proceedings of the 20th ACM international conference on Multimedia
October 2012
1584 pages
ISBN:9781450310895
DOI:10.1145/2393347
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. multimedia event recounting
  2. multimedia event representation
  3. textual descriptions of video content

Qualifiers

  • Poster

Conference

MM '12
MM '12: ACM Multimedia Conference
October 29 - November 2, 2012
Nara, Japan

Acceptance Rates

Overall Acceptance Rate 1,548 of 5,826 submissions, 27%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 2
  • Downloads (last 6 weeks): 1
Reflects downloads up to 16 Nov 2024

Citations

Cited By

  • (2019) Semantically Interpretable Activation Maps: what-where-how explanations within CNNs. 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 4207-4215. DOI: 10.1109/ICCVW.2019.00518. Online publication date: Oct-2019
  • (2019) Towards Interpretable Object Detection by Unfolding Latent Structures. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 6032-6042. DOI: 10.1109/ICCV.2019.00613. Online publication date: Oct-2019
  • (2019) Cell Fault Management Using Machine Learning Techniques. IEEE Access 7, pp. 124514-124539. DOI: 10.1109/ACCESS.2019.2938410. Online publication date: 2019
  • (2019) Evolving Rule-Based Explainable Artificial Intelligence for Unmanned Aerial Vehicles. IEEE Access 7, pp. 17001-17016. DOI: 10.1109/ACCESS.2019.2893141. Online publication date: 2019
  • (2018) Few-Shot Adaptation for Multimedia Semantic Indexing. Proceedings of the 26th ACM international conference on Multimedia, pp. 1110-1118. DOI: 10.1145/3240508.3240592. Online publication date: 15-Oct-2018
  • (2018) You Are What You Eat. IEEE Transactions on Multimedia 20:4, pp. 950-964. DOI: 10.1109/TMM.2017.2759499. Online publication date: 1-Apr-2018
  • (2017) VRFP: On-the-Fly Video Retrieval Using Web Images and Fast Fisher Vector Products. IEEE Transactions on Multimedia 19:7, pp. 1583-1595. DOI: 10.1109/TMM.2017.2671414. Online publication date: 1-Jul-2017
  • (2017) Design of an explainable machine learning challenge for video interviews. 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3688-3695. DOI: 10.1109/IJCNN.2017.7966320. Online publication date: May-2017
  • (2017) Joint Detection and Recounting of Abnormal Events by Learning Deep Generic Knowledge. 2017 IEEE International Conference on Computer Vision (ICCV), pp. 3639-3647. DOI: 10.1109/ICCV.2017.391. Online publication date: Oct-2017
  • (2017) Complex Activity Recognition Via Attribute Dynamics. International Journal of Computer Vision 122:2, pp. 334-370. DOI: 10.1007/s11263-016-0918-1. Online publication date: 1-Apr-2017
