Abstract
We present the results of an experiment probing whether adults exhibit categorical perception when affectively rating robot-like sounds (Non-linguistic Utterances). The experimental design followed the traditional methodology from psychology for measuring categorical perception: stimulus continua of robot sounds were presented to subjects, who completed a discrimination task and an identification task. In the former, subjects rated whether stimulus pairs were affectively different; in the latter, they affectively rated single stimuli. The experiment confirms that Non-linguistic Utterances can convey affect and shows that their interpretation is drawn towards prototypical emotions, demonstrating that people exhibit categorical perception at the level of inferred affective meaning when hearing robot-like sounds. We speculate on how these insights could be used to automatically design and generate affect-laden robot-like utterances.
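As a concrete illustration of the continuum-based design described above, the following Python snippet shows one way a stimulus continuum of robot-like utterances could be generated by interpolating a pitch contour between two affective prototypes. This is a minimal, hypothetical sketch and not the stimulus-generation procedure used in the experiment; the sine-based synthesis, the prototype parameters and the six-step continuum are illustrative assumptions.

import numpy as np
from scipy.io import wavfile

SR = 22050  # sample rate in Hz

def synthesise(f0_start, f0_end, duration=0.6):
    """Render a simple sine sweep as a stand-in for a robot-like utterance."""
    n_samples = int(SR * duration)
    f0 = np.linspace(f0_start, f0_end, n_samples)  # linear pitch contour
    phase = 2.0 * np.pi * np.cumsum(f0) / SR       # integrate frequency to get phase
    return (0.5 * np.sin(phase)).astype(np.float32)

# Two hypothetical affective prototypes: a falling contour and a rising contour.
proto_a = {"f0_start": 400.0, "f0_end": 250.0}
proto_b = {"f0_start": 300.0, "f0_end": 600.0}

# Six equally spaced steps along the continuum between the two prototypes.
for i, alpha in enumerate(np.linspace(0.0, 1.0, 6)):
    params = {k: (1 - alpha) * proto_a[k] + alpha * proto_b[k] for k in proto_a}
    wavfile.write(f"utterance_step_{i}.wav", SR, synthesise(**params))

In a categorical perception design, such steps would be presented singly in the identification task and as adjacent pairs in the discrimination task.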
Notes
Hackett [28] proposes 13 properties in total that are universal to language; however, the remaining properties (the vocal-auditory channel, broadcast transmission and directional reception, rapid fading, specialisation, and total feedback, whereby the listener can reproduce what they hear) relate specifically to language expressed through the vocal/acoustic channel, and in the light of artificial languages such as sign language or programming languages, their value with respect to the broader concept of language is deemed limited.
We argue that NLUs do not contain linguistic semantic content. They do, however, contain semantic content in the same way that the audible sounds made by computers, smart-phones, etc., contain semantic content.
Given the close resemblance between Gibberish Speech and Natural Language, it may be argued that Gibberish Speech could be perceived as a foreign language rather than meaningless nonsense to the naive observer.
Such settings tend to be dynamic and unpredictable real-world environments, far removed from protected, controlled, “safe” laboratory environments.
To listen to the utterances, please refer to the Online Resources. Resources 1–6 are the utterances in Set 1, and Resources 7–12 are those in Set 2.
Python and Java source code for the AffectButton can be downloaded at http://www.joostbroekens.com/
Broekens and Brinkman [9] provide a detailed description of the AffectButton's functionality, so it is not described here.
The basic emotion theory as proposed by Ekman and Friesen [19] states that there are certain facial behaviours which are universally associated with particular emotions, namely anger, happiness, sadness, surprise, fear and disgust.
By “minimal situational context” we refer to the fact that the robot did not engage in vocal interaction, nor did the robot and subject engage in a complex interaction (e.g. a game of chess). Subjects were simply asked to rate sounds made by the robot, with the knowledge that the sounds were pre-recorded, and touching the robot on the head would play the next sound. In this scenario, there are no other cues that subjects can turn to in order to aid in the interpretation of the sounds.
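For concreteness, the sketch below illustrates the kind of touch-triggered playback loop this note describes. It is an assumed reconstruction rather than the actual experimental software; the playback callable and the touch-sensor hook are hypothetical.

import random

class UtterancePlayer:
    """Plays pre-recorded utterances one at a time, advancing on each head touch."""

    def __init__(self, wav_files, play_fn):
        self.queue = random.sample(wav_files, len(wav_files))  # randomised order
        self.play_fn = play_fn  # injected audio-playback callable (hypothetical)
        self.index = 0

    def on_head_touch(self):
        """Callback to be wired to the robot's head touch sensor (assumed interface)."""
        if self.index < len(self.queue):
            self.play_fn(self.queue[self.index])
            self.index += 1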
References
Banse R, Scherer K (1996) Acoustic profiles in vocal emotion expression. J Pers Soc Psychol 70(3):614–636
Bänziger T, Scherer K (2005) The role of intonation in emotional expressions. Speech Commun 46(3–4):252–267
Beck A, Stevens B, Bard KA, Cañamero L (2012) Emotional body language displayed by artificial agents. Trans Interact Intell Syst 2(1):1–29
Bimler D, Kirkland J (2001) Categorical perception of facial expressions of emotion: evidence from multidimensional scaling. Cogn Emot 15(5):633–658
Blattner M, Sumikawa D, Greenberg R (1989) Earcons and icons: their structure and common design principles. Hum Comput Interact 4:11–44
Bornstein MH, Kessen W, Weiskopf S (1976) Color vision and hue categorization in young human infants. J Exp Psychol Hum Percept Perform 2(1):115–129
Breazeal C (2002) Designing sociable robots. The MIT Press, Cambridge
Breazeal C (2003) Emotion and sociable humanoid robots. Int J Hum Comput Stud 59(1–2):119–155
Broekens J, Brinkman WP (2013) Affectbutton: a method for reliable and valid affective self-report. Int J Hum Comput Stud 71(6):641–667
Broekens J, Pronker A, Neuteboom M (2010) Real time labelling of affect in music using the affect button. In: Proceedings of the 3rd international workshop on affective interaction in natural environments (AFFINE 2010) at ACM multimedia 2010. ACM, Firenze, pp 21–26
Cassell J (1998) A framework for gesture generation and interpretation. In: Cipolla R, Pentland A (eds) Computer vision for human–machine interaction. Cambridge University Press, Cambridge, pp 191–216
Cheal JL, Rutherford MD (2011) Categorical perception of emotional facial expressions in preschoolers. J Exp Child Psychol 110(3):434–443
Cowie R, Cornelius R (2003) Describing the emotional states that are expressed in speech. Speech Commun 40(1–2):5–32
Cowie R, Douglas-Cowie E, Savvidou S, McMahon E, Sawey M, Schröder M (2000) ’FEELTRACE’: An instrument for recording perceived emotion in real time. In: Proceedings of the ISCA tutorial and research workshop (ITRW) on speech and emotion. Newcastle, pp 19–24
Delaunay F, de Greeff J, Belpaeme T (2009) Towards retro-projected robot faces: An alternative to mechatronic and android faces. In: Proceedings of the 18th international symposium on robot and human interactive communication (ROMAN 2009). Toyama, pp 306–311
Delaunay F, de Greeff J, Belpaeme T (2010) A study of a retro-projected robotic face and its effectiveness for gaze reading by humans. In: Proceedings of the 5th international conference on human–robot interaction (HRI’10). ACM/IEEE, Osaka, pp 39–44
Duffy BR (2003) Anthropomorphism and the social robot. Robot Autonom Syst 42(3–4):177–190
Ekman P (2005) Basic emotions. In: Dalgleish T, Power M (eds) Handbook of cognition and emotion. Wiley, Chichester, pp 45–60
Ekman P, Friesen W (1971) Constants across cultures in the face and emotion. J Pers Soc Psychol 17(2):124–129
Embgen S, Luber M, Becker-Asano C, Ragni M, Evers V, Arras K (2012) Robot-specific social cues in emotional body language. In: Proceedings of the 21st international symposium on robot and human interactive communication (RO-MAN 2012). IEEE, Paris, pp 1019–1025
Etcoff N, Magee J (1992) Categorical perception of facial expressions. Cognition 44:227–240
Eyssel F, Hegel F (2012) (S)he’s got the look: gender stereotyping of robots. J Appl Soc Psychol 42(9):2213–2230
Franklin A, Davies IR (2004) New evidence for infant colour categories. Br J Dev Psychol 22(3):349–377
Funakoshi K, Kobayashi K, Nakano M, Yamada S, Kitamura Y, Tsujino H (2008) Smoothing human-robot speech interactions by using a blinking-light as subtle expression. In: Proceedings of the 10th international conference on multimodal interfaces (ICMI’08). ACM, Chania, pp 293–296
Gaver W (1986) Auditory icons: using sound in computer interfaces. Hum Comput Interact 2(2):167–177
Gerrits E, Schouten M (2004) Categorical perception depends on the discrimination task. Percept Psychophys 66(3):363–376
Goldstone RL, Hendrickson AT (2009) Categorical perception. Wiley Interdiscip Rev 1(1):69–78
Hackett C (1960) The origin of speech. Sci Am 203:88–96
Harnad S (ed) (1987) Categorical perception: the groundwork of cognition. Cambridge University Press, Cambridge
Heider F, Simmel M (1944) An experimental study of apparent behavior. Am J Psychol 57:243–259
Jee E, Jeong Y, Kim C, Kobayashi H (2010) Sound design for emotion and intention expression of socially interactive robots. Intel Serv Robot 3:199–206
Jee ES, Kim CH, Park SY, Lee KW (2007) Composition of musical sound expressing an emotion of robot based on musical factors. In: Proceedings of the 16th international symposium on robot and human interactive communication (RO-MAN 2007). IEEE, Jeju Island, pp 637–641
Johannsen G (2004) Auditory displays in human–machine interfaces. Proc IEEE 92(4):742–758
Karg M, Samadani AA, Gorbet R, Kühnlenz K (2013) Body movements for affective expression: a survey of automatic recognition and generation. Trans Affect Comput 4(4):341–359
Komatsu T, Kobayashi K (2012) Can users live with overconfident or unconfident systems?: A comparison of artificial subtle expressions with human-like expression. In: Proceedings of conference on human factors in computing systems (CHI 2012). Austin, pp 1595–1600
Komatsu T, Yamada S (2007) How appearance of robotic agents affects how people interpret the agents’ attitudes. In: Proceedings of the international conference on advances in computer entertainment technology (ACE ’07)
Komatsu T, Yamada S (2011) How does the agents’ appearance affect users’ interpretation of the agents’ attitudes: experimental investigation on expressing the same artificial sounds from agents with different appearances. Int J Hum Comput Interact 27(3):260–279
Komatsu T, Yamada S, Kobayashi K, Funakoshi K, Nakano M (2010) Artificial subtle expressions: intuitive notification methodology of artifacts. In: Proceedings of the 28th international conference on human factors in computing systems (CHI’10). ACM, New York, pp 1941–1944
Kuhl PK (1991) Human adults and human infants show a “perceptual magnet effect” for the prototypes of speech categories, monkeys do not. Percept Psychophys 50(2):93–107
Kuratate T, Matsusaka Y, Pierce B, Cheng G (2011) “Mask-bot”: A life-size robot head using talking head animation for human–robot communication. In: Proceedings of the 11th IEEE-RAS international conference on humanoid robots (Humanoids 2011). IEEE, Bled, pp 99–104
Lang P, Bradley M (1994) Measuring emotion: the self-assessment manikin and the semantic differential. J Behav Therapy Exp psychiatry 25(1):49–59
Laukka P (2005) Categorical perception of vocal emotion expressions. Emotion 5(3):277–295
Levitin DJ, Rogers SE (2005) Absolute pitch: perception, coding, and controversies. Trends Cognit Sci 9(1):26–33
Liberman A, Harris K, Hoffman H (1957) The discrimination of speech sounds within and across phoneme boundaries. J Exp Psychol 54(5):358–368
Moore RK (2012) A Bayesian explanation of the ’Uncanny Valley’ effect and related psychological phenomena. Sci Rep 2:864
Moore RK (2013) Spoken language processing: where do we go from here? In: Trappl R (ed) Your virtual butler. Springer, Berlin, pp 119–133
Mori M (1970) The Uncanny Valley. Energy 7:33–35
Mubin O, Bartneck C, Leijs L, Hooft van Huysduynen H, Hu J, Muelver J (2012) Improving speech recognition with the robot interaction language. Disrupt Sci Technol 1(2):79–88
Mumm J, Mutlu B (2011) Human–robot proxemics: physical and psychological distancing in human–robot interaction. In: Proceedings of the 6th international conference on human–robot interaction (HRI’11), Lausanne
Oudeyer PY (2003) The production and recognition of emotions in speech: features and algorithms. Int J Hum Comput Stud 59(1–2):157–183
Paepcke S, Takayama L (2010) Judging a bot by its cover: an experiment on expectation setting for personal robots. In: Proceedings of the 5th international conference on human–robot interaction (HRI’10). ACM/IEEE, Osaka, pp 45–52
Picard RW (1997) Affective computing. MIT Press, Cambridge
Plutchik R (1994) The psychology and biology of emotion. HarperCollins College Publishers, New York
Rae I, Takayama L, Mutlu B (2013) The influence of height in robot-mediated communication. In: Proceedings of the 8th international conference on human–robot interaction (HRI’13). IEEE, Tokyo, pp 1–8
Read R, Belpaeme T (2010) Interpreting non-linguistic utterances by robots : studying the influence of physical appearance. In: Proceedings of the 3rd international workshop on affective interaction in natural environments (AFFINE 2010) at ACM multimedia 2010. ACM, Firenze, pp 65–70
Read R, Belpaeme T (2012) How to use non-linguistic utterances to convey emotion in child–robot interaction. In: Proceedings of the 7th international conference on human–robot interaction (HRI’12). ACM/IEEE, Boston, pp 219–220
Read R, Belpaeme T (2014) Situational context directs how people affectively interpret robotic non-linguistic utterances. In: Proceedings of the 9th international conference on human–robot interaction (HRI’14). ACM/IEEE, Bielefeld
Reeves B, Nass C (1996) The media equation: how people treat computers, television, and new media like real people and places. CSLI Publications, Stanford
Repp B (1984) Categorical perception: issues, methods, findings. Speech Lang 10:243–335
Ros Espinoza R, Nalin M, Wood R, Baxter P, Looije R, Demiris Y, Belpaeme T (2011) Child-robot interaction in the wild: Advice to the aspiring experimenter. In: Proceedings of the 13th international conference on multimodal interfaces (ICMI’11). ACM, Valencia, pp 335–342
Saerbeck M, Bartneck C (2010) Perception of affect elicited by robot motion. In: Proceedings of the 5th international conference on human–robot interaction (HRI’10). ACM/IEEE, Osaka, pp 53–60
Scherer K (2003) Vocal communication of emotion: a review of research paradigms. Speech Commun 40(1–2):227–256
Schouten B, Gerrits E, van Hessen A (2003) The end of categorical perception as we know it. Speech Commun 41(1):71–80
Schröder M, Burkhardt F, Krstulovic S (2010) Synthesis of emotional speech. In: Scherer KR, Bänziger T, Roesch E (eds) Blueprint for affective computing. Oxford University Press, Oxford, pp 222–231
Schwenk M, Arras K (2014) R2-D2 reloaded: a flexible sound synthesis system for sonic human–robot interaction design. In: Proceedings of the 23rd international symposium on robot and human interactive communication (RO-MAN 2014), Edinburgh
Siegel J, Siegel W (1977) Categorical perception of tonal intervals: musicians can’t tell sharp from flat. Percept Psychophys 21(5):399–407
Siegel M, Breazeal C, Norton M (2009) Persuasive robotics: the influence of robot gender on human behavior. In: International conference on intelligent robots and systems (IROS 2009). IEEE, St. Louis, pp 2563–2568
Singh A, Young J (2012) Animal-inspired human–robot interaction: a robotic tail for communicating state. In: Proceedings of the 7th international conference on human–robot interaction (HRI’12), Boston, pp 237–238
Stedeman A, Sutherland D, Bartneck C (2011) Learning ROILA. CreateSpace, Charleston
Tay B, Jung Y, Park T (2014) When stereotypes meet robots: the double-edge sword of robot gender and personality in human–robot interaction. Comput Hum Behav 38:75–84
Terada K, Yamauchi A, Ito A (2012) Artificial emotion expression for a robot by dynamic colour change. In: Proceedings of the 21st international symposium on robot and human interactive communication (RO-MAN 2012). IEEE, Paris, pp 314–321
Walters ML, Syrdal DS, Dautenhahn K, te Boekhorst R, Koay KL (2007) Avoiding the uncanny valley: robot appearance, personality and consistency of behaviour in an attention-seeking home scenario for a robot companion. Auton Robots 24(2):159–178
Yilmazyildiz S, Athanasopoulos G, Patsis G, Wang W, Oveneke MC, Latacz L, Verhelst W, Sahli H, Henderickx D, Vanderborght B, Soetens E, Lefeber D (2013) Voice modification for Wizard-of-Oz experiments in robot–child interaction. In: Proceedings of the workshop on affective social speech signals, Grenoble
Yilmazyildiz S, Henderickx D, Vanderborght B, Verhelst W, Soetens E, Lefeber D (2011) EMOGIB: emotional gibberish speech database for affective human–robot interaction. In: Proceedings of the international conference on affective computing and intelligent interaction (ACII’11). Springer, Memphis, pp 163–172
Yilmazyildiz S, Henderickx D, Vanderborght B, Verhelst W, Soetens E, Lefeber D (2013) Multi-modal emotion expression for affective human–robot interaction. In: Proceedings of the workshop on affective social speech signals (WASSS 2013), Grenoble
Yilmazyildiz S, Latacz L, Mattheyses W, Verhelst W (2010) Expressive Gibberish speech synthesis for affective human–computer interaction. In: Proceedings of the 13th international conference on text., speech and dialogue (TSD’10). Springer, Brno, pp 584–590
Zhou K, Mo L, Kay P, Kwok VPY, Ip TNM, Tan LH (2010) Newly trained lexical categories produce lateralized categorical perception of color. Proc Natl Acad Sci USA 107(22):9974–9978
Acknowledgments
This work was (partially) funded by the EU FP7 ALIZ-E project (Grant 248116).
Cite this article
Read, R., Belpaeme, T. People Interpret Robotic Non-linguistic Utterances Categorically. Int J of Soc Robotics 8, 31–50 (2016). https://doi.org/10.1007/s12369-015-0304-0