Article

A mid-level representation framework for semantic sports video analysis

Authors:

Chang-Sheng Xu

MULTIMEDIA '03: Proceedings of the eleventh ACM international conference on Multimedia

Pages 33 - 44

https://doi.org/10.1145/957013.957020

Published: 02 November 2003

Abstract

Sports video has been widely studied because of its tremendous commercial potential. Despite encouraging results for various specific sports, it is almost impossible to extend an existing system to a new sport, because such systems usually employ sets of low-level features suited to one particular game and couple them closely with game-specific rules to detect events or highlights. What is missing is an internal representation and structure generic enough to apply to many different sports. In this paper, we present a generic mid-level representation framework for semantic sports video analysis. The mid-level representation layer is introduced between the low-level audio-visual processing and the high-level semantic analysis. It allows us to separate sport-specific knowledge and rules from the low-level and mid-level feature extraction. This makes sports video analysis more efficient, more effective, and less ad hoc across various types of sports. To make the low-level feature analysis robust, a non-parametric clustering technique, the mean shift procedure, has been applied to both color and motion analysis. The proposed framework has been tested on five field-ball sports, covering about 8 hours of video. Experiments have shown robust performance in semantic analysis and event detection. We believe that the proposed mid-level representation framework can be used for event detection, highlight extraction, summarization, and personalization of many types of sports video.
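
The abstract names mean shift as the non-parametric clustering used for color and motion analysis but gives no further detail. As a rough illustration only, the sketch below runs a flat-kernel mean shift over a handful of made-up RGB samples. The function names, the bandwidth of 20, and the sample values are assumptions chosen for this example; none of them come from the paper, and the authors' actual procedure may differ.

```python
import math

def mean_shift_modes(points, bandwidth, max_iter=100, tol=1e-3):
    """Flat-kernel mean shift: move each point to the mean of the
    samples inside its window until the shift becomes negligible."""
    modes = []
    for p in points:
        x = tuple(float(v) for v in p)
        for _ in range(max_iter):
            neighbours = [q for q in points if math.dist(x, q) <= bandwidth]
            if not neighbours:  # guard against floating-point edge cases
                break
            new_x = tuple(sum(c) / len(neighbours) for c in zip(*neighbours))
            shift = math.dist(x, new_x)
            x = new_x
            if shift < tol:
                break
        modes.append(x)
    return modes

def cluster_modes(modes, bandwidth):
    """Merge converged points lying within half a bandwidth of a centre."""
    centres, labels = [], []
    for m in modes:
        for i, c in enumerate(centres):
            if math.dist(m, c) <= bandwidth / 2:
                labels.append(i)
                break
        else:
            centres.append(m)
            labels.append(len(centres) - 1)
    return centres, labels

# Hypothetical pixel colors: three greenish (playfield-like), two reddish.
samples = [(34, 139, 34), (30, 135, 40), (36, 142, 30),
           (200, 30, 30), (205, 28, 35)]
centres, labels = cluster_modes(mean_shift_modes(samples, bandwidth=20.0),
                                bandwidth=20.0)
print(len(centres), labels)
```

No cluster count is supplied in advance: the number of centres falls out of the data and the bandwidth, which is the property that makes the method non-parametric.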

Published In

MULTIMEDIA '03: Proceedings of the eleventh ACM international conference on Multimedia

November 2003

670 pages

ISBN:1581137222

DOI:10.1145/957013

General Chairs: Lawrence Rowe (University of California, Berkeley), Harrick Vin (University of Texas, Austin)

Program Chairs: Thomas Plagemann (University of Oslo), Prashant Shenoy (University of Massachusetts, Amherst), John R. Smith (IBM T.J. Watson Research Center)

Copyright © 2003 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

MM03: 2003 11th Annual ACM International Conference on Multimedia

November 2 - 8, 2003

Berkeley, CA, USA

Acceptance Rates

Overall Acceptance Rate 995 of 4,171 submissions, 24%
