short-paper

Open access

Multimedia Event Detection Using Event-Driven Multiple Instance Learning

Authors: Sang Phan, Duy-Dinh Le, Shin'ichi Satoh

MM '15: Proceedings of the 23rd ACM international conference on Multimedia

Pages 1255 - 1258

https://doi.org/10.1145/2733373.2806330

Published: 13 October 2015

Abstract

A complex event can be recognized by observing the evidence necessary for it. In real-world scenarios, this is difficult because the evidence can appear anywhere in a video. A straightforward solution is to decompose the video into several segments and search for evidence in each segment. This approach assumes that each segment inherits its annotation from the video label. However, this is a weak assumption because it ignores the importance of each segment. On the other hand, the importance of a segment to an event can be estimated by matching its detected concepts against the evidential description of that event. Leveraging this prior knowledge, we propose a new method, Event-driven Multiple Instance Learning (EDMIL), to learn the key evidence for event detection. We treat each segment as an instance and quantize the instance-event similarity into different levels of relatedness. The instance label is then learned by jointly optimizing the instance classifier and its relatedness level. A significant performance improvement on the TRECVID Multimedia Event Detection (MED) 2012 dataset demonstrates the effectiveness of our approach.
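The quantization step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the similarity measure or the binning scheme, so cosine similarity, equal-width bins, and the names `instance_event_similarity`, `quantize_relatedness`, and `num_levels` are all assumptions made for illustration. The joint optimization of the instance classifier and the relatedness level is not covered here.

```python
import numpy as np

def instance_event_similarity(concept_scores, event_weights):
    """Cosine similarity between each segment and an event description.

    concept_scores: (num_segments, num_concepts) detector scores per segment.
    event_weights:  (num_concepts,) hypothetical weights marking which
                    concepts the event description mentions.
    Cosine similarity is an assumption; the abstract does not name a measure.
    """
    concept_scores = np.asarray(concept_scores, dtype=float)
    event_weights = np.asarray(event_weights, dtype=float)
    num = concept_scores @ event_weights
    denom = np.linalg.norm(concept_scores, axis=1) * np.linalg.norm(event_weights)
    # Segments with all-zero scores get similarity 0 instead of a division error.
    return np.divide(num, denom, out=np.zeros_like(num), where=denom > 0)

def quantize_relatedness(similarity, num_levels=3):
    """Map similarities in [0, 1] to integer levels 0..num_levels-1.

    Equal-width bins are an assumption chosen for simplicity.
    """
    inner_edges = np.linspace(0.0, 1.0, num_levels + 1)[1:-1]
    return np.digitize(similarity, inner_edges)

# Three segments of one video, scored against three hypothetical concepts.
segments = [
    [0.9, 0.1, 0.0],  # strongly matches the event's concepts
    [0.0, 0.0, 1.0],  # matches only an unrelated concept
    [1.0, 0.0, 1.0],  # partial match
]
event = [1.0, 1.0, 0.0]  # the event description mentions concepts 0 and 1

similarity = instance_event_similarity(segments, event)
levels = quantize_relatedness(similarity, num_levels=3)
print(levels)
```

In this sketch a higher level means the segment is more related to the event. A learning procedure in the spirit of the abstract could use these levels as a prior on which instances in a positive video are likely to carry the key evidence.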

Index Terms

Multimedia Event Detection Using Event-Driven Multiple Instance Learning
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Video summarization

Information & Contributors

Information

Published In

MM '15: Proceedings of the 23rd ACM international conference on Multimedia

October 2015

1402 pages

ISBN:9781450334594

DOI:10.1145/2733373

General Chairs: Xiaofang Zhou (The University of Queensland, Australia), Alan F. Smeaton (Dublin City University, Ireland), Qi Tian (The University of Texas at San Antonio, USA)

Program Chairs: Dick C.A. Bulterman (FXPAL, USA), Heng Tao Shen (The University of Queensland, Australia), Ketan Mayer-Patel (The University of North Carolina, USA), Shuicheng Yan (National University of Singapore, Singapore)

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Conference

MM '15

Sponsor:

SIGMM

MM '15: ACM Multimedia Conference

October 26 - 30, 2015

Brisbane, Australia

Acceptance Rates

MM '15 Paper Acceptance Rate 56 of 252 submissions, 22%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics & Citations

Article Metrics

Total Citations: 6
Total Downloads: 629
Downloads (Last 12 months): 72
Downloads (Last 6 weeks): 24

Reflects downloads up to 13 Nov 2024

Cited By

Zhu Y, Chen Y, Zhao Z, Liu X, Guo J (2023). Local Self-attention-based Hybrid Multiple Instance Learning for Partial Spoof Speech Detection. ACM Transactions on Intelligent Systems and Technology 14:5 (1-18). Online publication date: 19-Aug-2023.
https://dl.acm.org/doi/10.1145/3616540

Perini L, Vercruyssen V, Davis J, Singh A, Sun Y, Akoglu L, Gunopulos D, Yan X, Kumar R, Ozcan F, Ye J (2023). Learning from Positive and Unlabeled Multi-Instance Bags in Anomaly Detection. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (1897-1906). Online publication date: 6-Aug-2023.
https://dl.acm.org/doi/10.1145/3580305.3599409

Han T, Qi Y, Zhu S (2021). A Continuous Semantic Embedding Method for Video Compact Representation. Electronics 10:24 (3106). Online publication date: 14-Dec-2021.
https://doi.org/10.3390/electronics10243106

Luo M, Chang X, Gong C (2021). Reliable shot identification for complex event detection via visual-semantic embedding. Computer Vision and Image Understanding (103300). Online publication date: Oct-2021.
https://doi.org/10.1016/j.cviu.2021.103300

Xie W, Yao H, Sun X, Han T, Zhao S, Chua T (2019). Discovering Latent Discriminative Patterns for Multi-Mode Event Representation. IEEE Transactions on Multimedia 21:6 (1425-1436). Online publication date: Jun-2019.
https://doi.org/10.1109/TMM.2018.2879749

Liu Y, Gu X, Huang L, Ouyang J, Liao M, Wu L (2019). Analyzing periodicity and saliency for adult video detection. Multimedia Tools and Applications. Online publication date: 4-Jul-2019.
https://doi.org/10.1007/s11042-019-7576-6
