Abstract
We propose a method that applies the Generative Theory of Tonal Music (GTTM) directly to a spectrogram of music and produces a time-span segmentation in the form of a hierarchical clustering. We first regard a tall, narrow rectangle in the spectrogram (a bin) as a pitch event, and the spectrogram as a sequence of bins. The texture feature of each bin is extracted with a gray-level co-occurrence matrix, yielding a sequence of texture features. Proximity and change between phrases are measured by the distance between the texture features of adjacent bins. Global structures such as parallelism and repetition are detected with a self-similarity matrix over the sequence of bins. We develop an algorithm that, given the sequence of boundary strengths between adjacent bins, iteratively merges adjacent bins in a bottom-up manner and finally generates a dendrogram, which corresponds to a time-span segmentation. In an experiment on Mozart’s K.331 and K.550, we obtained promising results even though the algorithm uses almost no musical knowledge such as pitch or harmony.
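The pipeline summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the choice of GLCM statistics (contrast, energy, homogeneity), the vertical pixel offset, the Euclidean distance, and the rule that a merged bin takes the mean of its children's features are all assumptions made for illustration. The self-similarity matrix used for parallelism and repetition is left out.

```python
import math


def glcm_features(bin_cols, levels=4):
    """Texture features of one bin from a gray-level co-occurrence matrix.

    bin_cols: list of spectrogram columns, each a list of gray levels
    already quantized to range(levels). Co-occurrences are counted for
    vertically adjacent pixels (an assumed offset).
    Returns (contrast, energy, homogeneity).
    """
    glcm = [[0.0] * levels for _ in range(levels)]
    total = 0
    for col in bin_cols:
        for a, b in zip(col, col[1:]):
            glcm[a][b] += 1
            glcm[b][a] += 1  # keep the matrix symmetric
            total += 2
    contrast = energy = homogeneity = 0.0
    for i in range(levels):
        for j in range(levels):
            p = glcm[i][j] / total if total else 0.0
            contrast += p * (i - j) ** 2
            energy += p * p
            homogeneity += p / (1 + abs(i - j))
    return (contrast, energy, homogeneity)


def boundary_strengths(features):
    """Distance between the texture features of each pair of adjacent bins."""
    return [math.dist(f, g) for f, g in zip(features, features[1:])]


def build_dendrogram(features):
    """Merge the adjacent pair with the weakest boundary until one tree remains.

    Each cluster is (tree, feature). Leaves are bin indices and internal
    nodes are (left, right) tuples. The merged feature is the unweighted
    mean of the two children (an assumption, not taken from the paper).
    """
    clusters = [(i, tuple(f)) for i, f in enumerate(features)]
    while len(clusters) > 1:
        strengths = boundary_strengths([f for _, f in clusters])
        k = strengths.index(min(strengths))  # weakest boundary
        (left_tree, left_f), (right_tree, right_f) = clusters[k], clusters[k + 1]
        merged_f = tuple((x + y) / 2 for x, y in zip(left_f, right_f))
        clusters[k:k + 2] = [((left_tree, right_tree), merged_f)]
    return clusters[0][0]


# Toy input: four bins with one-dimensional features; bins 0-1 and 2-3 are alike.
toy_features = [(0.0,), (0.1,), (5.0,), (5.2,)]
tree = build_dendrogram(toy_features)
print(tree)
```

In this toy run the two weakest boundaries are merged first, so the nested tuple groups bins 0 and 1 together and bins 2 and 3 together before joining the two groups at the root. With real data, `glcm_features` would be applied to each bin of a quantized spectrogram to obtain the feature sequence passed to `build_dendrogram`.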
Notes
1. Since space is limited, see the literature [8, 5, 6] for more detail.
2. Note that \(b_{i,i+1}\) denotes the strength of the boundary between bins \(b_i\) and \(b_{i+1}\), and \(b_{i,i+1 i+2}\) denotes that between \(b_i\) and the merged bin \(b_{i+1 i+2}\).
References
1. Chen, R., Li, M.: Music structural segmentation by combining harmonic and timbral information. In: Proceedings of ISMIR, pp. 477–482 (2011)
2. Costa, Y.M.G., Oliveira, L.S., Koerich, A.L., Gouyon, F.: Comparing textural features for music genre classification. In: Proceedings of the 2012 International Joint Conference on Neural Networks, pp. 1867–1872 (2012)
3. Foote, J.: Visualizing music and audio using self similarity. In: Proceedings of the 7th ACM International Conference on Multimedia, pp. 77–80 (1999)
4. Foote, J.: Automatic audio segmentation using a measure of audio novelty. In: Proceedings of IEEE International Conference on Multimedia and Expo, vol. 1, pp. 452–455 (2000)
5. Hamanaka, M., Hirata, K., Tojo, S.: Implementing “A Generative Theory of Tonal Music”. J. New Music Res. 35(4), 249–277 (2007)
6. Hamanaka, M., Hirata, K., Tojo, S.: Implementing methods for analysing music based on Lerdahl and Jackendoff’s Generative Theory of Tonal Music. In: Computational Music Analysis, pp. 221–249. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-25931-4_9
7. Haralick, R.M.: Statistical and structural approaches to texture. Proc. IEEE 67(5), 786–804 (1979)
8. Lerdahl, F., Jackendoff, R.: A Generative Theory of Tonal Music. The MIT Press (1983)
9. McFee, B., Ellis, D.P.W.: Analyzing song structure with spectral clustering. In: Proceedings of ISMIR, pp. 405–410 (2014)
10. McFee, B., Ellis, D.P.W.: Learning to segment songs with ordinal linear discriminant analysis. In: Proceedings of ICASSP (2014)
11. Nakashika, T., Garcia, C., Takiguchi, T.: Local-feature-map integration using convolutional neural networks for music genre classification. In: Proceedings of Interspeech, ISCA, pp. 1752–1755 (2012)
12. Ullrich, K., Schlüter, J., Grill, T.: Boundary detection in music structure analysis using convolutional neural networks. In: Proceedings of ISMIR, pp. 417–422 (2014)
13. Goto, M., Hashiguchi, H., Nishimura, T., Oka, R.: RWC Music Database: popular, classical and jazz music databases. In: Proceedings of ISMIR, pp. 287–288 (2002)
Acknowledgement
This work has been supported by JSPS Kakenhi 16H01744.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Sawada, S., Takegawa, Y., Hirata, K. (2018). On Hierarchical Clustering of Spectrogram. In: Aramaki, M., Davies, M., Kronland-Martinet, R., Ystad, S. (eds) Music Technology with Swing. CMMR 2017. Lecture Notes in Computer Science, vol 11265. Springer, Cham. https://doi.org/10.1007/978-3-030-01692-0_16
DOI: https://doi.org/10.1007/978-3-030-01692-0_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01691-3
Online ISBN: 978-3-030-01692-0
eBook Packages: Computer Science, Computer Science (R0)