DOI: 10.1145/2964284.2967215

Learning Music Emotion Primitives via Supervised Dynamic Clustering

Published: 01 October 2016

Abstract

This paper explores a fundamental problem in music emotion analysis: how to segment a music sequence into a set of basic emotive units, which we call emotion primitives. Existing work on music emotion analysis mostly operates on fixed-length music segments, which makes accurate emotion recognition difficult. A short segment, such as an individual music frame, may fail to evoke an emotional response, while a long segment, such as an entire song, may convey several emotions over time. Moreover, the minimum segment length needed to convey an emotion varies with the type of emotion. To address these problems, we propose a novel method, dubbed supervised dynamic clustering (SDC), that automatically decomposes a music sequence into meaningful segments of varying length. First, the music sequence is represented as a set of music frames. The frames are then clustered according to their valence-arousal values in the emotion space, and the clustering result is used to initialize the segmentation. Finally, a dynamic programming scheme jointly optimizes the segmentation and grouping in the music feature space. Experimental results on a standard dataset show both the effectiveness and the rationality of the proposed method.
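The pipeline sketched in the abstract — frame-level representation, cluster-based initialization, then dynamic programming over segment boundaries — can be illustrated with a minimal sketch. The code below is not the authors' SDC algorithm; it is a generic dynamic-programming segmentation that splits a sequence of frame features into contiguous segments minimizing total within-segment variance. All names (`dp_segment`, `seg_cost`) are hypothetical.

```python
import numpy as np

def dp_segment(frames, num_segments):
    """Split a sequence of frame features into `num_segments` contiguous
    segments by dynamic programming, minimizing total within-segment
    variance. Returns the sorted start indices of segments 2..num_segments."""
    frames = np.asarray(frames, dtype=float)
    n = len(frames)

    def seg_cost(i, j):
        # Within-segment cost (sum of squared deviations) of frames[i:j+1].
        seg = frames[i:j + 1]
        return float(((seg - seg.mean(axis=0)) ** 2).sum())

    INF = float("inf")
    # best[k][j]: minimal cost of covering frames[0:j+1] with k+1 segments.
    best = np.full((num_segments, n), INF)
    split = np.zeros((num_segments, n), dtype=int)
    for j in range(n):
        best[0][j] = seg_cost(0, j)
    for k in range(1, num_segments):
        for j in range(k, n):
            for i in range(k, j + 1):  # i: start of the (k+1)-th segment
                c = best[k - 1][i - 1] + seg_cost(i, j)
                if c < best[k][j]:
                    best[k][j] = c
                    split[k][j] = i
    # Backtrack the optimal boundaries for the full sequence.
    bounds, j = [], n - 1
    for k in range(num_segments - 1, 0, -1):
        i = int(split[k][j])
        bounds.append(i)
        j = i - 1
    return sorted(bounds)
```

For example, `dp_segment([0, 0, 0, 5, 5, 5], 2)` places the single boundary at index 3, where the values change. The paper's method additionally supervises this step with valence-arousal labels and alternates segmentation with grouping, which this sketch omits.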




Published In

MM '16: Proceedings of the 24th ACM international conference on Multimedia
October 2016
1542 pages
ISBN:9781450336031
DOI:10.1145/2964284
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. emotion primitives
  2. music emotion analysis
  3. supervised dynamic clustering

Qualifiers

  • Short-paper

Funding Sources

  • National Natural Science Foundation of China

Conference

MM '16: ACM Multimedia Conference
October 15-19, 2016
Amsterdam, The Netherlands

Acceptance Rates

MM '16 paper acceptance rate: 52 of 237 submissions (22%)
Overall acceptance rate: 2,145 of 8,556 submissions (25%)

