DOI: 10.1145/3658852.3659072

Embodied exploration of deep latent spaces in interactive dance-music performance

Published: 27 June 2024

Abstract

In recent years, significant advances have been made in deep learning models for audio generation, offering promising tools for musical creation. In this work, we investigate the use of deep audio generative models in interactive dance/music performance. We adopted a performance-led research design approach, establishing an art-research collaboration between a researcher/musician and a dancer. First, we describe our motion-sound interactive system integrating a deep audio generative model and propose three methods for embodied exploration of deep latent spaces. Then, we detail the creative process for building the performance, centered on the co-design of the system. Finally, we report feedback from interviews with the dancer and discuss the results and perspectives. The code implementation is publicly available on our GitHub.
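
To give a concrete sense of what embodied exploration of a latent space can look like in code, the sketch below maps a single motion feature (movement energy) onto an interpolation between two latent anchor points and decodes each step into audio. This is an illustrative assumption only, not the authors' released system: the decoder here is a stand-in for a pretrained audio generative model, and the latent dimensionality, anchors, and smoothing constant are hypothetical placeholders.

import numpy as np

LATENT_DIM = 8          # assumed latent dimensionality (hypothetical)
SMOOTHING = 0.9         # exponential smoothing of the latent trajectory

def decode(z):
    """Placeholder decoder: maps a latent vector to a short audio buffer.
    Stands in for the decoder of a pretrained audio generative model."""
    t = np.linspace(0.0, 1.0, 2048, endpoint=False)
    freqs = 110.0 * (1.0 + np.arange(LATENT_DIM))       # one partial per latent dimension
    return np.tanh(z) @ np.sin(2.0 * np.pi * np.outer(freqs, t))

def motion_to_latent(energy, anchor_a, anchor_b):
    """Interpolate between two latent anchors according to movement energy in [0, 1]."""
    return (1.0 - energy) * anchor_a + energy * anchor_b

# Two anchor points chosen in latent space (e.g. two sounds selected by ear).
a = np.zeros(LATENT_DIM)
b = np.ones(LATENT_DIM)
z = a.copy()

for energy in (0.1, 0.5, 0.9):                          # simulated motion-energy stream
    target = motion_to_latent(energy, a, b)
    z = SMOOTHING * z + (1.0 - SMOOTHING) * target      # smooth the latent trajectory
    audio = decode(z)                                   # one audio frame per control step
    print(f"energy={energy:.1f}  frame RMS={np.sqrt(np.mean(audio ** 2)):.3f}")

In an actual setup, the placeholder decoder would be replaced by the decoder of a pretrained audio model and the scalar energy value by a live motion feature stream; the released code on the authors' GitHub documents their own approach.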

Published In

MOCO '24: Proceedings of the 9th International Conference on Movement and Computing
May 2024
245 pages
ISBN: 9798400709944
DOI: 10.1145/3658852
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. HCI
  2. dance-music-AI performance
  3. deep learning
  4. embodied exploration
  5. generative models
  6. latent space
  7. motion-sound interaction

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

MOCO '24

Acceptance Rates

MOCO '24 Paper Acceptance Rate: 35 of 75 submissions, 47%
Overall Acceptance Rate: 85 of 185 submissions, 46%

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 85
  • Downloads (Last 12 months): 85
  • Downloads (Last 6 weeks): 17
Reflects downloads up to 11 Feb 2025
