DOI: 10.1145/3409501.3409534
research-article

Coarse-To-Fine Framework For Music Generation via Generative Adversarial Networks

Published: 25 August 2020

Abstract

Automatic music generation is closely related to Natural Language Processing (NLP): a note in a melody depends on its context, just as a word does in a sentence. The difference is that music is built upon a set of chords that form the skeleton of the melody. To improve automatic music generation, we propose a two-step adversarial procedure. Step 1 learns to generate chords via a chord generative adversarial network (GAN); step 2 trains a melody GAN to generate music conditioned on the chords produced in the first step. Under this two-step procedure, the chords generated in the first step form a basic framework of the music, which can theoretically and practically improve the performance of melody generation in the second step. Experiments demonstrate that this cascading process is able to generate high-quality music samples with both acoustical and music-theoretical guarantees.
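
The data flow of the two-step procedure can be illustrated with a minimal sketch. This is not the paper's model: no GAN is trained here, and the chord vocabulary, function names, and sampling rules are all hypothetical stand-ins. It only shows how a coarse chord sequence from step 1 can condition the note-level output of step 2.

```python
import random

# Hypothetical chord vocabulary: chord name -> pitch classes (0 = C).
CHORDS = {
    "C": (0, 4, 7),
    "F": (5, 9, 0),
    "G": (7, 11, 2),
    "Am": (9, 0, 4),
}


def generate_chords(rng, n_bars):
    """Step 1 stand-in: map random noise to one chord per bar.

    A trained chord generator would replace this uniform sampling.
    """
    names = sorted(CHORDS)
    return [names[rng.randrange(len(names))] for _ in range(n_bars)]


def generate_melody(rng, chords, notes_per_bar=4):
    """Step 2 stand-in: emit notes conditioned on the step-1 chords.

    Conditioning is a hard constraint here (every note is a chord tone);
    a trained melody generator would instead take the chords as input.
    """
    melody = []
    for chord in chords:
        tones = CHORDS[chord]
        for _ in range(notes_per_bar):
            melody.append(60 + rng.choice(tones))  # MIDI pitch, 60 = middle C
    return melody


def generate_music(seed, n_bars=4):
    """Run the cascade: chords first (coarse), then melody (fine)."""
    rng = random.Random(seed)
    chords = generate_chords(rng, n_bars)
    melody = generate_melody(rng, chords)
    return chords, melody
```

Calling `generate_music(0)` returns a chord list and a melody whose notes in each bar belong to that bar's chord. In an adversarial setup, each stand-in function would be a generator network trained against its own discriminator.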

Cited By

  • (2024) Generating Rhythm Game Music with Jukebox. Intelligent Systems and Applications, 267-280. DOI: 10.1007/978-3-031-66428-1_16. Online publication date: 31-Jul-2024
  • (2022) A systematic review of artificial intelligence-based music generation: Scope, applications, and future trends. Expert Systems with Applications 209 (118190). DOI: 10.1016/j.eswa.2022.118190. Online publication date: Dec-2022

      Published In

      HPCCT & BDAI '20: Proceedings of the 2020 4th High Performance Computing and Cluster Technologies Conference & 2020 3rd International Conference on Big Data and Artificial Intelligence
      July 2020
      276 pages
      ISBN:9781450375603
      DOI:10.1145/3409501
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      In-Cooperation

      • Xi'an Jiaotong-Liverpool University

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Bi-LSTM
      2. adversarial learning
      3. melody generation

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      HPCCT & BDAI 2020

      Article Metrics

      • Downloads (last 12 months): 14
      • Downloads (last 6 weeks): 4
      Reflects downloads up to 24 Nov 2024.
