DOI: 10.1145/3658852.3659072

Embodied exploration of deep latent spaces in interactive dance-music performance

Published: 27 June 2024

Abstract

In recent years, significant advances have been made in deep learning models for audio generation, offering promising tools for musical creation. In this work, we investigate the use of deep audio generative models in interactive dance/music performance. We adopted a performance-led research design approach, establishing an art-research collaboration between a researcher/musician and a dancer. First, we describe our motion-sound interactive system integrating a deep audio generative model and propose three methods for embodied exploration of deep latent spaces. Then, we detail the creative process for building the performance, centered on the co-design of the system. Finally, we report feedback from interviews with the dancer and discuss the results and perspectives. The code implementation is publicly available on our GitHub.
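
To give a concrete sense of what embodied exploration of a latent space can look like in code, the sketch below maps a single motion feature (movement energy) onto an interpolation between two latent anchor points and decodes each step into audio. This is an illustrative assumption only, not the authors' released system: the decoder here is a stand-in for a pretrained audio generative model, and the latent dimensionality, anchors, and smoothing constant are hypothetical placeholders.

import numpy as np

LATENT_DIM = 8          # assumed latent dimensionality (hypothetical)
SMOOTHING = 0.9         # exponential smoothing of the latent trajectory

def decode(z):
    """Placeholder decoder: maps a latent vector to a short audio buffer.
    Stands in for the decoder of a pretrained audio generative model."""
    t = np.linspace(0.0, 1.0, 2048, endpoint=False)
    freqs = 110.0 * (1.0 + np.arange(LATENT_DIM))       # one partial per latent dimension
    return np.tanh(z) @ np.sin(2.0 * np.pi * np.outer(freqs, t))

def motion_to_latent(energy, anchor_a, anchor_b):
    """Interpolate between two latent anchors according to movement energy in [0, 1]."""
    return (1.0 - energy) * anchor_a + energy * anchor_b

# Two anchor points chosen in latent space (e.g. two sounds selected by ear).
a = np.zeros(LATENT_DIM)
b = np.ones(LATENT_DIM)
z = a.copy()

for energy in (0.1, 0.5, 0.9):                          # simulated motion-energy stream
    target = motion_to_latent(energy, a, b)
    z = SMOOTHING * z + (1.0 - SMOOTHING) * target      # smooth the latent trajectory
    audio = decode(z)                                   # one audio frame per control step
    print(f"energy={energy:.1f}  frame RMS={np.sqrt(np.mean(audio ** 2)):.3f}")

In an actual setup, the placeholder decoder would be replaced by the decoder of a pretrained audio model and the scalar energy value by a live motion feature stream; the released code on the authors' GitHub documents their own approach.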

Published In

MOCO '24: Proceedings of the 9th International Conference on Movement and Computing
May 2024
245 pages
ISBN: 9798400709944
DOI: 10.1145/3658852
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. HCI
  2. dance-music-AI performance
  3. deep learning
  4. embodied exploration
  5. generative models
  6. latent space
  7. motion-sound interaction

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

MOCO '24

Acceptance Rates

MOCO '24 Paper Acceptance Rate: 35 of 75 submissions, 47%
Overall Acceptance Rate: 85 of 185 submissions, 46%

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 85
  • Downloads (Last 12 months): 85
  • Downloads (Last 6 weeks): 17
Reflects downloads up to 11 Feb 2025
