Impact of Time and Note Duration Tokenizations on Deep Learning Symbolic Music Modeling
Description
Symbolic music is widely used in various deep learning tasks, including generation, transcription, synthesis, and Music Information Retrieval (MIR). It is mostly employed with discrete models such as Transformers, which require music to be tokenized, i.e., formatted into sequences of distinct elements called tokens. Tokenization can be performed in different ways, and recent research has focused on developing more efficient methods. However, the key differences between these methods are often unclear, and few studies have compared them. In this work, we analyze the most common tokenization methods and experiment with time and note duration representations. We compare the performance of these two impactful criteria on several tasks, including composer classification, emotion classification, music generation, and sequence representation. We demonstrate that explicitly encoding time and duration information can lead to better results, depending on the task.
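To illustrate the kind of representational choice studied here, the sketch below contrasts two common timing encodings for symbolic music: one with explicit `Duration` tokens per note, and one where duration is implicit in `NoteOn`/`NoteOff` event pairs. This is a minimal illustrative example, not the paper's actual tokenizer; the token names and time-step units are assumptions.

```python
# Illustrative sketch of two timing encodings for symbolic music tokenization.
# A note is a tuple (pitch, start, duration), with times in integer time steps.

def tokenize_explicit_duration(notes):
    """Explicit representation: each note carries a Duration token."""
    tokens, current_time = [], 0
    for pitch, start, duration in sorted(notes, key=lambda n: n[1]):
        if start > current_time:  # advance time with an explicit TimeShift
            tokens.append(f"TimeShift_{start - current_time}")
            current_time = start
        tokens.append(f"NoteOn_{pitch}")
        tokens.append(f"Duration_{duration}")
    return tokens

def tokenize_note_off(notes):
    """Implicit representation: duration is the gap between NoteOn and NoteOff."""
    events = []
    for pitch, start, duration in notes:
        events.append((start, f"NoteOn_{pitch}"))
        events.append((start + duration, f"NoteOff_{pitch}"))
    events.sort(key=lambda e: e[0])  # stable sort keeps NoteOff before later events
    tokens, current_time = [], 0
    for time, tok in events:
        if time > current_time:
            tokens.append(f"TimeShift_{time - current_time}")
            current_time = time
        tokens.append(tok)
    return tokens

notes = [(60, 0, 4), (64, 4, 2)]  # C4 for 4 steps, then E4 for 2 steps
print(tokenize_explicit_duration(notes))
print(tokenize_note_off(notes))
```

The same two notes yield different sequence lengths and vocabularies under each scheme, which is exactly the kind of trade-off the comparison above evaluates across tasks.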
Files

Name | Size |
---|---|
000009.pdf (md5:f89b56890223537c92c2203f1a04b155) | 294.6 kB |