A Generative Model to Embed Human Expressivity into Robot Motions
Figure 1. Overview of the proposed framework. Light blue highlights the components related to the robot's task, $\mathbf{x}_\mathrm{R}$. Pink represents everything connected to human movement, $\mathbf{x}_\mathrm{H}$. In addition to $\mathbf{x}_\mathrm{H}$, there is a second human input: the neutral movement, defined as $\mathbf{x}_\mathrm{NH}$. Two blocks are shown in dotted lines: one used during training (blue) and the other during the inference stage (turquoise). The blocks that compose the framework's generator are feature extraction (dark blue) and feature combination (red). The latent space of the neutral motion, i.e., the output of the Variational Autoencoder (VAE) encoder, is denoted by $\mathbf{z}_\mathrm{NH}$; $\mathbf{z}_\mathrm{H}$ denotes the latent representation of the expressive human movement, and $\mathbf{z}_\mathrm{HS}$ corresponds to the latent features obtained by subtracting the neutral latent representation from the expressive one. $\mathbf{z}_\mathrm{R}$ represents the latent space of the robot task, and $\hat{\mathbf{x}}_\mathrm{R}$ is the output of the generator. The generated expressive robot motion has an expressive latent representation $\hat{\mathbf{z}}_\mathrm{HS}$, obtained by passing $\hat{\mathbf{x}}_\mathrm{R}$ through the human VAE encoder. The parameter $\lambda$ acts as an expressive gain that can be tuned to increase or decrease the expressive content of the generated motion as required.
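As a rough illustration of the latent-space combination described above, the sketch below shows one plausible reading in which the expressive residual $\mathbf{z}_\mathrm{HS} = \mathbf{z}_\mathrm{H} - \mathbf{z}_\mathrm{NH}$ is scaled by $\lambda$ and concatenated with the robot-task latent before decoding. This is a minimal PyTorch sketch under assumed interfaces (the encoder and decoder modules are placeholders), not the authors' implementation; the actual feature-combination block may differ.

```python
# Minimal sketch of the latent combination suggested by Figure 1.
# Assumption: the expressive residual z_HS is scaled by lambda and
# concatenated with the robot-task latent z_R before decoding.
import torch
import torch.nn as nn


class ExpressiveGenerator(nn.Module):
    def __init__(self, human_encoder: nn.Module, robot_encoder: nn.Module,
                 decoder: nn.Module):
        super().__init__()
        self.human_encoder = human_encoder  # VAE encoder for human motion
        self.robot_encoder = robot_encoder  # VAE encoder for the robot task
        self.decoder = decoder              # maps the combined latent to a robot motion

    def forward(self, x_r, x_h, x_nh, lam: float = 1.0):
        z_r = self.robot_encoder(x_r)    # robot-task latent z_R
        z_h = self.human_encoder(x_h)    # expressive human latent z_H
        z_nh = self.human_encoder(x_nh)  # neutral human latent z_NH
        z_hs = z_h - z_nh                # expressive residual z_HS
        z = torch.cat([z_r, lam * z_hs], dim=-1)  # feature combination with gain lambda
        x_r_hat = self.decoder(z)        # generated expressive robot motion
        return x_r_hat, z_hs
```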
Figure 2. Network output distribution and representation analysis. (A) Kernel density of the Space Laban Effort quality for the human (dark purple), robot (purple), and generated outputs at $\lambda = 1$ (light blue), $\lambda = 50$ (orange), and $\lambda = 100$ (mint green). Increasing $\lambda$ makes the generated dataset more similar to the human one while retaining robot features. (B) t-SNE plots of human data and network outputs at varying $\lambda$. Emotion labels: sad (blue), angry (green), and happy (yellow). As $\lambda$ rises, the clustering of the sad emotion becomes clearer in the generated output.
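The two analyses in Figure 2 can be reproduced in spirit with standard tooling. The snippet below is only an illustrative sketch operating on random placeholder arrays (not the paper's data or pipeline), using SciPy's Gaussian kernel density estimate and scikit-learn's t-SNE.

```python
# Illustrative sketch of the Figure 2 analyses; the feature arrays are
# random placeholders, not the paper's Laban Effort features or latents.
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# (A) Kernel density estimate of a Laban Effort quality for two motion sets.
space_effort_human = rng.normal(size=500)              # placeholder feature values
space_effort_generated = rng.normal(loc=0.3, size=500)
grid = np.linspace(-4.0, 4.0, 200)
kde_human = gaussian_kde(space_effort_human)(grid)
kde_generated = gaussian_kde(space_effort_generated)(grid)

# (B) 2D t-SNE projection of latent features, to be colored by emotion label.
latent_features = rng.normal(size=(300, 32))           # placeholder latent vectors
embedding_2d = TSNE(n_components=2, perplexity=30).fit_transform(latent_features)
```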
Figure 3. Similarity analysis. (A) Jensen–Shannon distance of Laban Effort qualities between generated and human datasets. As $\lambda$ increases, the Time and Space qualities converge. (B) Jensen–Shannon distance for Laban Effort qualities between generated and robot datasets. Time and Space drift apart with increasing $\lambda$, while Flow remains stable and Weight decreases. (C) Cosine similarity between network output and robot motion; higher $\lambda$ values diminish similarity. (D) Mean squared error between the network output and robot motion; increasing $\lambda$ amplifies discrepancies.
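For reference, the three similarity measures reported in Figure 3 can be computed as in the sketch below. It assumes the Laban Effort qualities are first binned into histograms before taking the Jensen–Shannon distance; function and variable names are illustrative, not the authors' code.

```python
# Hedged sketch of the similarity measures in Figure 3: Jensen–Shannon
# distance between feature distributions, cosine similarity, and MSE.
import numpy as np
from scipy.spatial.distance import jensenshannon


def js_distance(feature_a: np.ndarray, feature_b: np.ndarray, bins: int = 50) -> float:
    """Jensen–Shannon distance between two 1D feature distributions."""
    lo = min(feature_a.min(), feature_b.min())
    hi = max(feature_a.max(), feature_b.max())
    p, _ = np.histogram(feature_a, bins=bins, range=(lo, hi), density=True)
    q, _ = np.histogram(feature_b, bins=bins, range=(lo, hi), density=True)
    return float(jensenshannon(p, q))  # jensenshannon normalizes p and q internally


def cosine_similarity(traj_a: np.ndarray, traj_b: np.ndarray) -> float:
    """Cosine similarity between two flattened trajectories of equal size."""
    a, b = traj_a.ravel(), traj_b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def mse(traj_a: np.ndarray, traj_b: np.ndarray) -> float:
    """Mean squared error between two trajectories of equal shape."""
    return float(np.mean((traj_a - traj_b) ** 2))
```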
Figure 4. Effect of $\lambda$ values on generated trajectories. Trajectories for $\lambda = 1$ (light blue), $\lambda = 50$ (orange), and $\lambda = 100$ (mint green) on different robots; the base task is shown in purple. (A) The double pendulum shows no variation with $\lambda$. (B) The robot arm modifies the task at $\lambda = 1$ and loses it as $\lambda$ rises. (C) The mobile base alters the task at $\lambda = 1$ and deviates more with higher $\lambda$.
Figure 5. Effect of emotion labels on generated trajectories. For each emotion from the human dataset, sad (blue), angry (green), and happy (yellow), movements were generated across morphologies, with the base tasks in purple. (A) Double pendulum: trajectories vary by emotion. (B) Robot arm: similar paths but different end positions. (C) Mobile base: distinct paths for each emotion, covering more of the task space.
Figure 6. Experimental setup for (A) the mobile base and (B) the 5-DoF robot arm.
Abstract
1. Introduction
2. Related Works
2.1. Expressive Qualifiers
2.2. Feature Learning
2.3. Style Transfer and Expressive Movement Generation
3. Contribution
4. Materials and Methods
4.1. Method Overview
4.2. Laban Effort Qualities
4.3. Feature Extraction for Movement Representation in Sub-Spaces
4.4. Adversarial Generation Implementation
4.5. Neural Network Architecture Specifications
4.6. Training Procedure
5. Results
5.1. Expressive and Affective Evaluation
5.2. Simulation
5.3. Real World Implementation
6. Discussion
7. Conclusions
8. Limitations and Future Works
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
| --- | --- |
| LMA | Laban Movement Analysis |
| PAD | Pleasure–Arousal–Dominance |
| VAD | Valence–Arousal–Dominance |
| GAN | Generative Adversarial Networks |
| DoF | Degrees of Freedom |
| ELBO | Evidence Lower Bound |
| VAE | Variational Autoencoders |
| KL | Kullback–Leibler |
| MSE | Mean Squared Error |
| LSTM | Long Short-Term Memory |
| LEQ | Laban Effort Qualities |
| KDE | Kernel Density Estimation |
| JSD | Jensen–Shannon Distance |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Osorio, P.; Sagawa, R.; Abe, N.; Venture, G. A Generative Model to Embed Human Expressivity into Robot Motions. Sensors 2024, 24, 569. https://doi.org/10.3390/s24020569