A Long-Range Self-similarity Approach to Segmenting DJ Mixed Music Streams

Tim Scarfe,
Wouter M. Koolen &
Yuri Kalnishkan

Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 412)

Included in the following conference series:

IFIP International Conference on Artificial Intelligence Applications and Innovations

Abstract

In this paper we describe an unsupervised, deterministic algorithm for segmenting DJ-mixed Electronic Dance Music (EDM) streams (for example, podcasts, radio shows and live events) into their respective tracks. We attempt to reconstruct boundaries as close as possible to those a human domain expert would produce. The goal of DJ mixing is to render track boundaries effectively invisible to human perception, which makes the problem difficult.

We use Dynamic Programming (DP) to optimally segment a cost matrix derived from a similarity matrix. The similarity matrix is based on the cosines of a time series of kernel-transformed, Fourier-based features designed with this domain in mind. Our method is applied to EDM streams. Its formulation combines long-term self-similarity, treated as a first-class concept, with DP, and it is qualitatively assessed on a large corpus of long streams that have been hand-labelled by a domain expert.
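
The pipeline outlined above (cosine self-similarity, then a DP over segment costs) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, the kernel transform, or the cost function, so the sketch assumes plain feature vectors, a hypothetical segment cost equal to the total pairwise dissimilarity (1 - cosine) within the segment, and a known number of tracks. The names `cosine_similarity_matrix` and `segment` are illustrative.

```python
import math


def cosine_similarity_matrix(frames):
    """S[i][j] = cosine of the angle between feature vectors i and j."""
    # A zero vector gets norm 1.0 to avoid division by zero (assumption).
    norms = [math.sqrt(sum(x * x for x in f)) or 1.0 for f in frames]
    return [
        [sum(a * b for a, b in zip(fi, fj)) / (ni * nj)
         for fj, nj in zip(frames, norms)]
        for fi, ni in zip(frames, norms)
    ]


def segment(frames, n_segments):
    """Return the interior boundaries of the cheapest split into n_segments.

    Hypothetical cost of a segment [a, b): the sum of (1 - cosine) over all
    frame pairs inside it, i.e. one square block on the matrix diagonal.
    """
    n = len(frames)
    sim = cosine_similarity_matrix(frames)

    # 2-D prefix sums of dissimilarity so any block sum is an O(1) lookup.
    pre = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(n):
            pre[i + 1][j + 1] = ((1.0 - sim[i][j]) + pre[i][j + 1]
                                 + pre[i + 1][j] - pre[i][j])

    def cost(a, b):
        return pre[b][b] - pre[a][b] - pre[b][a] + pre[a][a]

    inf = float("inf")
    # best[k][t]: minimal cost of splitting frames [0, t) into k segments.
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    back = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, n_segments + 1):
        for t in range(k, n + 1):
            for s in range(k - 1, t):  # s = start of the k-th segment
                c = best[k - 1][s] + cost(s, t)
                if c < best[k][t]:
                    best[k][t] = c
                    back[k][t] = s

    # Follow the back-pointers from the end of the stream to the start.
    bounds = []
    t = n
    for k in range(n_segments, 0, -1):
        t = back[k][t]
        bounds.append(t)
    bounds.reverse()
    return bounds[1:]  # drop the leading 0, keep interior boundaries


if __name__ == "__main__":
    # Toy "stream": three frames of one track followed by four of another.
    toy = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 4
    print(segment(toy, 2))
```

Because each segment cost covers a whole diagonal block of the similarity matrix, frames far apart in time but within the same candidate track contribute to the decision, which is one simple way to read "long-term self-similarity". The unnormalised cost and the cubic-time DP loop are simplifications chosen for brevity.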

Author information

Authors and Affiliations

Computer Learning Research Centre and Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, TW20 0EX, United Kingdom
Tim Scarfe, Wouter M. Koolen & Yuri Kalnishkan

Editor information

Editors and Affiliations

Computer Science and Engineering Department, Frederick University, 1036, Nicosia, Cyprus
Harris Papadopoulos
Department of Electrical Engineering and Information Technology, Cyprus University of Technology, 3603, Limassol, Cyprus
Andreas S. Andreou
Department of Forestry and Management of the Environment and Natural Resources, Democritus University of Thrace, 68200, Orestiada, Greece
Lazaros Iliadis
Department of Digital Systems, University of Piraeus, 18534, Piraeus, Greece
Ilias Maglogiannis

About this paper

Cite this paper

Scarfe, T., Koolen, W.M., Kalnishkan, Y. (2013). A Long-Range Self-similarity Approach to Segmenting DJ Mixed Music Streams. In: Papadopoulos, H., Andreou, A.S., Iliadis, L., Maglogiannis, I. (eds) Artificial Intelligence Applications and Innovations. AIAI 2013. IFIP Advances in Information and Communication Technology, vol 412. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41142-7_24

DOI: https://doi.org/10.1007/978-3-642-41142-7_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41141-0
Online ISBN: 978-3-642-41142-7
eBook Packages: Computer Science, Computer Science (R0)
