DOI: 10.1145/2964284.2967215

Learning Music Emotion Primitives via Supervised Dynamic Clustering

Published: 01 October 2016

Abstract

This paper explores a fundamental problem in music emotion analysis: how to segment a music sequence into a set of basic emotive units, which we call emotion primitives. Existing work on music emotion analysis mostly operates on fixed-length music segments, which makes accurate emotion recognition difficult. A short segment, such as an individual music frame, may fail to evoke an emotional response, while a long segment, such as an entire song, may convey several emotions over time. Moreover, the minimum segment length needed to convey an emotion varies with the type of emotion. To address these problems, we propose a novel method, dubbed supervised dynamic clustering (SDC), that automatically decomposes a music sequence into meaningful segments of varying length. First, the music sequence is represented as a set of music frames. The frames are then clustered according to their valence-arousal values in the emotion space, and the clustering result is used to initialize the segmentation. Finally, a dynamic programming scheme jointly optimizes the segmentation and grouping in the music feature space. Experimental results on a standard dataset show both the effectiveness and the rationality of the proposed method.
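The pipeline sketched in the abstract — frame-level representation, cluster-based initialization, then dynamic programming over segment boundaries — can be illustrated with a minimal sketch. The code below is not the authors' SDC algorithm; it is a generic dynamic-programming segmentation that splits a sequence of frame features into contiguous segments minimizing total within-segment variance. All names (`dp_segment`, `seg_cost`) are hypothetical.

```python
import numpy as np

def dp_segment(frames, num_segments):
    """Split a sequence of frame features into `num_segments` contiguous
    segments by dynamic programming, minimizing total within-segment
    variance. Returns the sorted start indices of segments 2..num_segments."""
    frames = np.asarray(frames, dtype=float)
    n = len(frames)

    def seg_cost(i, j):
        # Within-segment cost (sum of squared deviations) of frames[i:j+1].
        seg = frames[i:j + 1]
        return float(((seg - seg.mean(axis=0)) ** 2).sum())

    INF = float("inf")
    # best[k][j]: minimal cost of covering frames[0:j+1] with k+1 segments.
    best = np.full((num_segments, n), INF)
    split = np.zeros((num_segments, n), dtype=int)
    for j in range(n):
        best[0][j] = seg_cost(0, j)
    for k in range(1, num_segments):
        for j in range(k, n):
            for i in range(k, j + 1):  # i: start of the (k+1)-th segment
                c = best[k - 1][i - 1] + seg_cost(i, j)
                if c < best[k][j]:
                    best[k][j] = c
                    split[k][j] = i
    # Backtrack the optimal boundaries for the full sequence.
    bounds, j = [], n - 1
    for k in range(num_segments - 1, 0, -1):
        i = int(split[k][j])
        bounds.append(i)
        j = i - 1
    return sorted(bounds)
```

For example, `dp_segment([0, 0, 0, 5, 5, 5], 2)` places the single boundary at index 3, where the values change. The paper's method additionally supervises this step with valence-arousal labels and alternates segmentation with grouping, which this sketch omits.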




Published In

MM '16: Proceedings of the 24th ACM international conference on Multimedia
October 2016
1542 pages
ISBN:9781450336031
DOI:10.1145/2964284
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. emotion primitives
  2. music emotion analysis
  3. supervised dynamic clustering

Qualifiers

  • Short-paper

Funding Sources

  • National Natural Science Foundation of China

Conference

MM '16: ACM Multimedia Conference
October 15-19, 2016
Amsterdam, The Netherlands

Acceptance Rates

MM '16 paper acceptance rate: 52 of 237 submissions (22%)
Overall acceptance rate: 2,145 of 8,556 submissions (25%)

