Coarse-To-Fine Framework For Music Generation via Generative Adversarial Networks

Authors:

Guosheng Yin

HPCCT & BDAI '20: Proceedings of the 2020 4th High Performance Computing and Cluster Technologies Conference & 2020 3rd International Conference on Big Data and Artificial Intelligence

Pages 192 - 198

https://doi.org/10.1145/3409501.3409534

Published: 25 August 2020

Abstract

Automatic music generation is closely related to natural language processing (NLP): the current note in a melody always depends on its context, just as a word does in NLP. The difference is that music is built upon a set of special chords that form the skeleton of the melody. To enhance automatic music generation, we propose a two-step adversarial procedure: step 1 learns to generate chords via a chord generative adversarial network (GAN), and step 2 trains a melody GAN to generate music conditioned on the chords produced in the first step. Under this two-step procedure, the chords generated in the first step form a basic framework for the music, which can theoretically and practically improve the performance of melody generation in the second step. Experiments demonstrate that this cascading process is able to generate high-quality music samples with both acoustical and music-theoretical guarantees.
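The data flow of the two-step cascade can be sketched as follows. This is only an illustration of how the second stage is conditioned on the output of the first; it is not the authors' implementation. The two generator functions are untrained stand-ins for the chord GAN and melody GAN generators, adversarial training and the discriminators are omitted, and every name and constant here (`CHORD_TONES`, `BARS`, `NOTES_PER_BAR`, `chord_generator`, `melody_generator`) is a hypothetical choice made for the example.

```python
import random

# Hypothetical toy vocabulary: chord index -> pitch classes of that chord.
CHORD_TONES = {0: (0, 4, 7), 1: (5, 9, 0), 2: (7, 11, 2), 3: (9, 0, 4)}
BARS = 8            # assumed sequence length, one chord per bar
NOTES_PER_BAR = 4   # assumed number of melody notes per bar


def chord_generator(z):
    """Step 1 stand-in: map a noise vector to one chord index per bar."""
    return [int(abs(v) * 1000) % len(CHORD_TONES) for v in z]


def melody_generator(z, chords):
    """Step 2 stand-in: map noise plus the chord condition to melody notes.

    Each note is drawn from the tones of the chord for its bar, so the
    output depends on both the noise and the step 1 chords.
    """
    melody = []
    for bar, chord in enumerate(chords):
        tones = CHORD_TONES[chord]
        for step in range(NOTES_PER_BAR):
            v = z[bar * NOTES_PER_BAR + step]
            melody.append(tones[int(abs(v) * 1000) % len(tones)])
    return melody


def generate(seed=0):
    """Run the cascade: sample noise, generate chords, then the melody."""
    rng = random.Random(seed)
    z_chord = [rng.gauss(0, 1) for _ in range(BARS)]
    z_melody = [rng.gauss(0, 1) for _ in range(BARS * NOTES_PER_BAR)]
    chords = chord_generator(z_chord)             # step 1: coarse skeleton
    melody = melody_generator(z_melody, chords)   # step 2: conditioned on chords
    return chords, melody


if __name__ == "__main__":
    chords, melody = generate(seed=0)
    print("chords:", chords)
    print("melody:", melody)
```

In a trained system each stand-in would be replaced by a learned generator network, with the chord sequence passed to the melody generator as its conditioning input.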

Cited By

Yan N. (2024). Generating Rhythm Game Music with Jukebox. Intelligent Systems and Applications, 267-280. Online publication date: 31-Jul-2024.
https://doi.org/10.1007/978-3-031-66428-1_16
Civit M., Civit-Masot J., Cuadrado F., Escalona M. (2022). A systematic review of artificial intelligence-based music generation: Scope, applications, and future trends. Expert Systems with Applications 209, 118190. Online publication date: Dec-2022.
https://doi.org/10.1016/j.eswa.2022.118190

Index Terms

Coarse-To-Fine Framework For Music Generation via Generative Adversarial Networks
1. Applied computing
  1. Arts and humanities
    1. Sound and music computing
2. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Neural networks

Information & Contributors

Information

Published In

HPCCT & BDAI '20: Proceedings of the 2020 4th High Performance Computing and Cluster Technologies Conference & 2020 3rd International Conference on Big Data and Artificial Intelligence

July 2020

276 pages

ISBN:9781450375603

DOI:10.1145/3409501

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

Xi'an Jiaotong-Liverpool University

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

HPCCT & BDAI 2020

HPCCT & BDAI 2020: 2020 4th High Performance Computing and Cluster Technologies Conference & 2020 3rd International Conference on Big Data and Artificial Intelligence

July 3 - 6, 2020

Qingdao, China
