DOI: 10.1145/957013.957020
Article

A mid-level representation framework for semantic sports video analysis

Published: 02 November 2003

Abstract

Sports video has been widely studied because of its tremendous commercial potential. Despite encouraging results for various specific sports, it is almost impossible to extend an existing system to a new sport, because such systems usually employ sets of low-level features tailored to a specific game and closely coupled with game-specific rules for detecting events or highlights. They lack an internal representation and structure that is generic and applicable to many different sports. In this paper, we present a generic mid-level representation framework for semantic sports video analysis. The mid-level representation layer is introduced between low-level audio-visual processing and high-level semantic analysis. It allows us to separate sport-specific knowledge and rules from low-level and mid-level feature extraction, which makes sports video analysis more efficient, more effective, and less ad hoc across various types of sports. To make the low-level feature analysis robust, a non-parametric clustering technique, the mean shift procedure, has been successfully applied to both color and motion analysis. The proposed framework has been tested on five field-ball type sports, covering about 8 hours of video. Experiments have shown robust performance in semantic analysis and event detection. We believe that the proposed mid-level representation framework can be used for event detection, highlight extraction, summarization, and personalization of many types of sports video.
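
The abstract names the mean shift procedure but gives no implementation details. The following is a minimal sketch of generic mean shift mode seeking with a Gaussian kernel, not the authors' code. The function name `mean_shift_modes`, the `bandwidth` parameter, the mode-merging threshold, and the sample hue-like values are all assumptions made for illustration.

```python
import math

def mean_shift_modes(points, bandwidth, max_iter=100, tol=1e-6):
    """Move each point uphill on a Gaussian kernel density estimate and
    group points that settle on the same mode (hypothetical helper)."""

    def shift(x):
        # Gaussian weight of every data point relative to the current position.
        weights = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2 * bandwidth ** 2))
            for p in points
        ]
        total = sum(weights)
        # The weighted mean of the data is the next position.
        return tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(len(x))
        )

    modes, labels = [], []
    for start in points:
        x = tuple(start)
        for _ in range(max_iter):
            nxt = shift(x)
            moved = math.dist(x, nxt)
            x = nxt
            if moved < tol:
                break
        # Merge with an already discovered mode if close enough (assumed threshold).
        for i, m in enumerate(modes):
            if math.dist(x, m) < bandwidth / 2:
                labels.append(i)
                break
        else:
            modes.append(x)
            labels.append(len(modes) - 1)
    return modes, labels

# Made-up one-dimensional "hue" samples forming two groups.
samples = [(0.30,), (0.32,), (0.31,), (0.33,), (0.80,), (0.82,), (0.81,)]
modes, labels = mean_shift_modes(samples, bandwidth=0.05)
print(len(modes), labels)
```

No number of clusters is supplied here. The count falls out of the density estimate and the chosen bandwidth, which is the sense in which the clustering is non-parametric.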

Information & Contributors

Information

Published In

MULTIMEDIA '03: Proceedings of the eleventh ACM international conference on Multimedia
November 2003
670 pages
ISBN: 1581137222
DOI: 10.1145/957013

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. events
  2. mid-level representation
  3. semantics
  4. sports video

Qualifiers

  • Article

Conference

Acceptance Rates

Overall Acceptance Rate 995 of 4,171 submissions, 24%
