Cognitive Systems Research 43 (2017) 190–207
www.elsevier.com/locate/cogsys

Enabling robotic social intelligence by engineering human social-cognitive mechanisms

Travis J. Wiltshire a,b, Samantha F. Warta a, Daniel Barber a, Stephen M. Fiore a,*

a University of Central Florida, Orlando, FL, United States
b University of Southern Denmark, Odense, Denmark

* Corresponding author at: 3100 Technology Parkway, Orlando, FL 32816, United States. E-mail address: sfiore@ist.ucf.edu (S.M. Fiore).

Received 28 March 2016; received in revised form 12 September 2016; accepted 20 September 2016; available online 24 September 2016.

http://dx.doi.org/10.1016/j.cogsys.2016.09.005
1389-0417/© 2016 Published by Elsevier B.V.

Abstract

For effective human-robot interaction, we argue that robots must gain social-cognitive mechanisms that allow them to function naturally and intuitively during social interactions with humans. However, a lack of consensus on social cognitive processes poses a challenge for how to design such mechanisms for artificial cognitive systems. We discuss a recent integrative perspective of social cognition to provide a systematic theoretical underpinning for computational instantiations of these mechanisms. We highlight several commitments of our approach, which we refer to as Engineering Human Social Cognition. We then provide a series of recommendations to facilitate the development of the perceptual, motor, and cognitive architecture for this proposed artificial cognitive system in future work. For each recommendation, we highlight its relation to the discussed social-cognitive mechanisms, provide the rationale and potential benefits, and detail examples of associated computational formalisms that could be leveraged to instantiate our recommendations. Overall, the goal of this paper is to outline an interdisciplinary and multi-theoretic approach to facilitate the design of robots that will one day function, and be perceived, as socially interactive and effective teammates.
© 2016 Published by Elsevier B.V.

Keywords: Human-robot interaction; Social cognition; Artificial cognitive systems; Robotics; Interaction dynamics

1. Introduction

There is an increasing need to advance the state of the art in Human-Robot Interaction (HRI) by transitioning the human perceptions of robots such that they are viewed as teammates, collaborators, or partners (e.g., Fiore, Elias, Gallagher, & Jentsch, 2008; Hoffman & Breazeal, 2004; Lackey, Barber, Reinerman, Badler, & Hudson, 2011; Phillips, Ososky, Grove, & Jentsch, 2011; Warta, Kapalo, Best, & Fiore, 2016). Greater consideration for the social capabilities of robots is warranted to advance HRI. Support for this claim comes in part from the 2013 Robotics Roadmap, which specifically highlighted the need to advance social aspects of robotics, including social cognition and modeling (see Christensen et al., 2013). While the study of robot social-cognitive mechanisms has received less emphasis in HRI, when compared to issues like autonomy, trust, and reliability (e.g., Hancock et al., 2011; Lindblom & Andreasson, 2016; Sheridan, 2016; Tsarouchi, Makris, & Chryssolouris, 2016), support for pursuing such efforts comes from the essentiality of social cognition for HRI and human-robot teaming (Christensen et al., 2013; Fiore et al., 2013; Warta et al., 2016). In particular, such mechanisms will allow robots to interact with humans in a more natural and intuitive way (Breazeal, 2004).
More pertinently, social-cognitive mechanisms are necessary for effective human-robot coordination and cooperation, as they would afford a robot the capacity to interpret the mental states (i.e., intentions, emotions, beliefs, desires) of humans during socially interactive contexts and support the concurrent display of appropriate behaviors (e.g., Vernon, 2014). In turn, such mechanisms can enable robotic teammates to dynamically work with humans toward shared goals (Hoffman & Breazeal, 2004) and function adaptively in an inherently information-rich social environment (Dautenhahn, Ogden, & Quick, 2002).

Some of the more well-known artificial general intelligence (AGI) models, or cognitive architectures, began as a subset of theories before evolving into the AGI models of today. It is in a similar fashion that we present our ideas as an integration of theory and modeling recommendations to advance the social capabilities of robots. Attempts to develop integrated theories of human cognition that have detailed functional mechanisms, and can be instantiated in artificial systems, are often presented as cognitive architectures (e.g., Langley, Laird, & Rogers, 2009). These computationally-based architectures serve as models for artificial intelligence (AI) and detail the constructs incorporated within a given theory of human cognition. Several different iterations of these cognitive architectures exist, reflecting the diversity of perspectives regarding human cognition and cognitive processes. Examples of these AGI models include cognitive architectures such as the Learning Intelligent Distribution Agent (LIDA; Ramamurthy, Baars, D'Mello, & Franklin, 2006), Soar (Laird, Newell, & Rosenbloom, 1987), and the Adaptive Control of Thought – Rational architecture (ACT-R; Anderson, 1993; Anderson et al., 2004).[1] Each of these AGI models is grounded in a series of conceptual commitments, which allows for integration of the theoretical components into computational mechanisms that characterize each model.

Toward a similar end, but focusing on the social-cognitive components of artificial intelligence, this paper represents an elaboration of our previous work regarding social-cognitive mechanisms in robots (Wiltshire, Barber, & Fiore, 2013). This paper details progress on key components of our approach to engineering human social cognition and an elaboration of each of the modeling recommendations.

[1] LIDA is considered a working model of cognition designed to imitate human-like learning (i.e., episodic learning), as it utilizes artificial feelings and emotions to learn to act in a human-like manner. On the other hand, Soar represents a much less specialized form of cognitive architecture designed to be capable of general intelligence, which is characterized by processes such as problem solving, learning, and intelligent behavior. Conversely, ACT-R integrates theories of human cognition, visual attention, and motor movement to enable researchers to create models pertaining to various tasks (e.g., learning and memory, problem solving and decision making, language and communication, perception and attention) by supporting the collection and comparison of quantitative measures (e.g., accuracy, time elapsed, neurological data) generated from the ACT-R model and human participants (ACT-R Research Group, 2013; Anderson et al., 2004).
While our aim is not to propose a new AGI system per se, we are proposing a systematic set of conceptual commitments that we hope will aid in the development of socially intelligent machines. As such, the first section of our paper introduces human social cognition and its associated theoretical mechanisms. In the second section, relevant sub-components of cognition are defined in further detail to emphasize the significance of their role in our approach to engineering human social cognition and to shed light on the importance of their inclusion in robotic design. From this, the third section specifies the framework of recommendations we make based on numerous disciplines, including HRI, philosophy, psychology, robotics, and neuroscience. For these, we draw from theories of embodied cognition, dual-process theory, ecological psychology, and dynamical systems theory. Our recommendations do not just represent singular ideas from each of these areas, but instead outline a hierarchical approach to modeling social-cognitive mechanisms in robots by synthesizing ideas across disciplinary perspectives to facilitate their future instantiation. As a final point, we review potential next steps toward engineering human social cognition in robots and expound on future research opportunities.

2. Human social cognition

Social cognition encompasses the perceptions, actions, and cognitive processes involved in the observation of, and interaction with, others (Frith & Frith, 2007, 2008, 2012). Research in social cognition is primarily concerned with the mechanisms through which humans are able to understand their own and others' mental states (e.g., intentions, emotions, beliefs, desires). An understanding of mental states facilitates the explanation and prediction of the behavior of others and, in turn, the ability to act accordingly. This is often termed theory of mind, mentalizing, or mindreading. Whereas some approaches argue social cognition is primarily for figuring out the minds of others, other approaches suggest it is for interacting with others (e.g., Di Paolo & De Jaegher, 2012) and shaping our relationships with them (Bohl, 2015; Fiske & Haslam, 1996). Thus, there is currently a lack of consensus on the specific functions served by social cognition.

In part, this lack of consensus is due to a tendency for research to overlook the distinction between online versus offline social cognition. Online social cognition is characterized by actual social interactions, where a reciprocal exchange of behaviors between two or more agents leads to a recursive change in each agent's mental states and, in turn, their behaviors (Przyrembel, Smallwood, Pauen, & Singer, 2012). Offline social cognition is characterized by cases of passive observation, where one agent is merely observing another (ibid.). This distinction is relevant in that recent research has shown support for distinct neural activation during online versus offline social cognition (e.g., Tylén, Allen, Hunter, & Roepstorff, 2012). As such, there may also be distinct social-cognitive mechanisms at play in service of the different functions needed for these types of interactions. Further, both behavioral and neuroscientific research in social cognition has tended to study offline social cognition (e.g., Przyrembel et al., 2012; Schilbach, 2014).
Therefore, this distinction between online versus offline social cognition can lend precision to approaches for developing artificial social intelligence (see also Pezzulo, 2012; Wiltshire, Lobato, Jentsch, & Fiore, 2013).

An important area of research in this domain is the debate about the fundamental mechanisms of social cognition. One of the first posited mechanisms is espoused by the Theory of Mind (ToM) approach, which asserts that mental states are primarily understood through the use of theoretical inference mechanisms (e.g., Gopnik & Wellman, 1992; Premack & Woodruff, 1978). According to this theory, given some perceptual social information, for example, the social cues of a furrowed brow and pursed lips, an inferential mechanism is required to probabilistically determine the mental state of the person displaying those cues (e.g., anger). Another proposed mechanism is perceptual-motor simulation routines (Blakemore & Decety, 2001; Goldman, 2006). In this case, upon encountering the aforementioned cues, an individual would employ a simulation mechanism to assess the mental state they experience under conditions where they displayed those observable cues and, from this, attribute that state to the other person. Lastly, in cases of actual social interaction (i.e., during online social cognition), a direct perception mechanism posits that there is enough information afforded by the embodiment of an interactor that their mental states can be understood without the need for inferential or simulative mechanisms (De Jaegher, 2009; De Jaegher & Di Paolo, 2007; Gallagher, 2007, 2008; Gangohopadhyay & Schilbach, 2011; Wiltshire, Lobato, McConnell, & Fiore, 2015).

In short, there is a rich history of theoretical and empirical research in social cognition, with various nuanced associations and commitments aligned with differing approaches (see Table 1 for an overview), and this continues to be an active area of research. This brief summary of key social-cognitive mechanisms provides initial conceptual grounding for our discussion of how to conceptualize engineering social cognition in artificial systems. In addition to the references provided in Table 1, a number of integrative reviews and meta-analyses have also explored the various positions in social cognition (Adolphs, 1999; Bodenhausen & Todd, 2010; Bohl & van den Bos, 2012; Fiske & Haslam, 1996; Macrae & Bodenhausen, 2000; Schilbach et al., 2013; Van Overwalle, 2009; Wiltshire et al., 2015).

Given the complex nature of social cognition, we next suggest that explanatory pluralism is needed in this area of inquiry (e.g., Dale, 2008; Dale, Dietrich, & Chemero, 2009). Specifically, we argue that a better understanding of the mechanisms of social cognition might be obtainable through the integration of competing theories into meta-level frameworks that sustain the co-existence of each (Dale, 2008). With the growing body of evidence for the various proposed mechanisms, it seems increasingly likely that humans may have the capacity to employ each of them depending upon the social-cognitive functions needed (see Wiltshire et al., 2015). However, with so many diverse perspectives, much work remains to integrate these mechanisms and empirically demonstrate the contexts and situations in which one mechanism would be adopted as opposed to another.
For instance, in economic exchange situations, where decision-making has financial consequences, ToM mechanisms endow an individual with the ability to recognize others with whom a trusting relationship can be formed (McCabe, Houser, Ryan, Smith, & Trouard, 2001). Additionally, ToM mechanisms assist in the detection of cheaters through recognition of (non)cooperative or deceptive cues (Dunbar, 1998). Relatedly, direct perception would serve to prime and potentially reinforce ToM mechanisms in that an individual would be able to see cooperative or competitive intentions simply by observing kinematic movement (Sartori, Becchio, & Castiello, 2011). Simulation mechanisms would enable an individual to model and anticipate the behavior of others across multiple outcomes (Kourtis, Sebanz, & Knoblich, 2010). In the case of social situations, for example, this would allow for prediction of friendly or threatening behavior.

Recently, dual-process theorizing was used to integrate the distinctions between online/offline cognition and the varied findings and proposed mechanisms associated with theory of mind, direct perception, and simulation routines (Bohl & van den Bos, 2012). Dual-process theories of cognition are supported by decades of research in social and cognitive psychology and, more recently, by findings in cognitive neuroscience. Overall, this research provides evidence for two distinct types of cognitive processes that adhere to different neural pathways and that are evident at both the neural and functional levels (e.g., Bargh, 1984; Chaiken & Trope, 1999; Satpute & Lieberman, 2006). Type 1 (T1) processes are often characterized as implicit, automatic, reflexive, and stimulus-driven, with primacy assigned to action, and thus qualify as online (cf. Wilson, 2002). Contrarily, Type 2 (T2) processes are explicit, controlled, reflective, and characteristically offline (e.g., Bohl & van den Bos, 2012; Chaiken & Trope, 1999). While the overlap here between T1 processes with online social cognition and T2 processes with offline social cognition is clear, the relationship between dual-process theory and the aforementioned social-cognitive mechanisms is less so (cf. Wilkinson & Ball, 2013).

Within Bohl and van den Bos' (2012) integrative framework, T1 processes align with the direct perception mechanism of social cognition, whereby mirror neurons and other sensory-motor areas contribute to the understanding of others' mental states. Several researchers argue that this type of mechanism allows for rapid access to the intentional and affective states of other agents (e.g., De Jaegher, 2009; Gallagher, 2008).
T2 processes, in this account, align with both the inferential and simulative inference mechanisms of social cognition, which are supported by distinct regions that have been referred to as the Theory of Mind system (see also the X system in Satpute & Lieberman, 2006). Both types of processes are interdependent, but the general function and form of each is distinct.

Table 1. Social cognition and its mechanisms.

Social Cognition
Definition: The perceptions, actions, and cognitive processes involved in observing and understanding others as well as interacting with others and shaping our relationships with them.
References: Bohl (2015), Di Paolo and De Jaegher (2012), Fiske and Haslam (1996), and Frith and Frith (2007, 2008, 2012).

Theory of Mind (i.e., mentalizing, mindreading)
Definition: Theoretical inference mechanisms use social information to probabilistically determine mental states.
References: Gopnik and Wellman (1992) and Premack and Woodruff (1978).

Perceptual-motor Simulation Routines
Definition: Simulation mechanisms enable the modeling and attribution of a mental state based upon the social cues exhibited.
References: Blakemore and Decety (2001) and Goldman (2006).

Direct Perception
Definition: Mental states can be perceived through the embodiment of the interactor without additional cognitive mechanisms.
References: De Jaegher and Di Paolo (2007), De Jaegher (2009), Gallagher (2007, 2008), Gangohopadhyay and Schilbach (2011), and Wiltshire et al. (2015).

What is also worth emphasizing is that actual social interaction is complex and can be characterized by a number of dimensions: verbal and nonverbal behavior, varying contexts, quantity of participants, and strict timing demands for reciprocal and joint activity (De Jaegher, Di Paolo, & Gallagher, 2010). Thus, both T1 and T2 processes, and their underpinning mechanisms, are required to successfully navigate the complex social environment.

We have only briefly outlined the theorizing on mechanisms and functions of social cognition. But the aforementioned ideas provide the foundation on which we build our approach for developing artificial social cognition. We next detail the theoretical underpinnings for defining our approach to Engineering Human Social Cognition such that the design of machines capable of engaging in actual social interaction and collaborative teamwork with humans will one day become a reality.

3. Engineering human social cognition

In this section, we advance and extend our approach termed Engineering Human Social Cognition (EHSC), which aims to leverage an interdisciplinary and multi-level understanding of human social-cognitive processes for the development and design of robotic systems that possess social intelligence (Streater, Bockelman Morrow, & Fiore, 2012; Wiltshire, Barber et al., 2013). Our goal with EHSC is to address a number of longstanding problems in human-robot and human-machine interaction such that robots and machines can interact more naturally with people as an effective and collaborative teammate (e.g., Fiore et al., 2011; Wiltshire, Smith, & Keebler, 2013). For example, EHSC can contribute to the development of trustworthy human-machine social interfaces (Atkinson & Clark, 2013), and also provide robots with the capacity to convey important aspects of their status and intentions while also being able to interpret the intentions of their human teammates (Klein, Woods, Bradshaw, Hoffman, & Feltovich, 2004).
Similarly, EHSC can aid development of agents capable of playing the role of teammate in the context of complex collaborative problem solving (Fiore et al., 2008; Wiltshire & Fiore, 2014). This approach aims to support inter-predictability between robots and human teammates and, thus, enable effective human-robot coordination (e.g., Bradshaw et al., 2008; Bradshaw et al., 2009).

The EHSC approach is comprised of four key components crucial to robotic design. The combination of these will facilitate the transition of robotic agents perceived as tools to robotic agents perceived as teammates. First, the field of social signal processing (SSP; e.g., Vinciarelli et al., 2012) lays the foundation for the creation of machines that can better understand humans and, in turn, communicate more effectively with them. Our approach integrates SSP with dual-processing theories to account for the functional differences of T1 and T2 processes at play during social interaction. Second, our emphasis on social robotics (i.e., robots as socially interactive agents) builds upon and incorporates the mechanisms enabled by SSP to allow for more natural and automatic interaction in HRIs. Here, we have a specific focus on the processes that motivate the verbal and nonverbal communicative cues supporting interaction. Third, to achieve more automatic interaction in HRI, we also attend to embodied cognition theory, as this provides a more natural grounding for the expression and comprehension of interactive behaviors between humans and robots. Finally, the convergence of these components will help lead to development of a more autonomous robotic system, capable of coordinating with teammates and, to some degree, self-regulation of behaviors in the team context. This, we suggest, is crucial to the goal of transitioning from tool to teammate. We next elaborate on each of the key components of the EHSC approach (see Table 2 for a definition of each component).

Table 2. Components of the engineering human social cognition approach.

Social Signal Processing: A multi-disciplinary field focused on developing social mechanisms for AI capable of interpreting social signals from social cues.

Dual Social Signal Processing System: A social signal processing system that integrates mechanisms characteristic of T1 and T2 processes to respectively enable more reflexive and analytic forms of artificial cognition.

Socially Interactive Robots: Physically embodied agents with a level of autonomy that enables meaningful interaction with humans.

Embodied Cognition: An interaction of the brain, body, and environment to influence cognitive processes.

Autonomous Robotic System: An independent, embodied system that operates without the need for external control as it works towards its goals and maintains itself.

3.1. Social signal processing

EHSC draws heavily from the multi-disciplinary field of Social Signal Processing (SSP). EHSC shares SSP's aim of providing social mechanisms for computers that are able to interpret high-level social signals (i.e., mental states) from combinations of low-level social cues (see Vinciarelli et al. (2012) for review).
In this case, social cues are the physiological and observable activities that are apparent in a person or group of people, and social signals are meaningful interpretations of these cues as a function of the mental states attributed to said agents (see also Fiore et al., 2013; Lobato, Warta, Wiltshire, & Fiore, 2015; Lobato, Wiltshire, Hudak, & Fiore, 2014; Wiltshire, Lobato, Wedell et al., 2013; Wiltshire et al., 2015; for more on this distinction). The interpretation of social cues, in turn, can be characterized by the type of cognitive process used (i.e., T1 or T2) and the specific characteristics of the social situation (see Wiltshire, Snow, Lobato, & Fiore, 2014). Interestingly, the social cues and signals distinction applies both to robots interpreting humans and to humans interpreting robots capable of displaying social cues (e.g., DeSteno et al., 2012; Fiore et al., 2013; Warta, 2015; Wiltshire, Barber et al., 2013). The description of dual-process theory given above therefore requires further explanation with regard to the dual way in which social-cognitive processes occur in biological systems and the ways they can be implemented in artificial cognitive systems.

3.2. Dual paths for social signal processing

Mechanisms characteristic of T1 processes in humans allow for a more direct understanding of, and automatic interaction with, humans as well as the environment. Additionally, mechanisms characteristic of T2 processes in humans allow for more complex and deliberate forms of cognition, such as mental simulation and theoretical inferences that support the prediction and interpretation of complex and novel social situations (see Bohl & van den Bos, 2012; Wiltshire et al., 2015). For example, a robot interacting with a teammate during situations with high temporal demands can leverage T1 processes in service of directly engaging in actions appropriate for the situation. However, when temporal demand is lower, or if a robot is taking on a passive role, such as during a surveillance task, T2 processes would allow for more analytic forms of cognition that may enable the robot and, in turn, its teammates to better understand and predict the future states of a situation. To the best of our knowledge, no such approach currently exists that explicitly attempts to provide a foundation for creating a dual social signal processing system in an embodied robot. A minimal sketch of how such a dual path might be gated is given below.
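To make the dual-path idea concrete, the following sketch shows one way such gating might look. It is a minimal illustration under our own assumptions, not an existing architecture: the cue names, reflex table, likelihood values, and the half-second time budget are all hypothetical, and the T2 path is only a naive Bayes-style scorer standing in for a genuine inference mechanism.

```python
from dataclasses import dataclass

# Hypothetical cue representation: low-level observable features.
@dataclass
class SocialCue:
    name: str         # e.g., "furrowed_brow"
    intensity: float  # normalized 0..1

# T1 path: reflexive cue -> response mapping, no inference.
T1_REFLEXES = {
    "furrowed_brow": "increase_interpersonal_distance",
    "open_palm": "approach",
}

# T2 path: naive Bayes-style scoring of mental-state hypotheses
# from cue likelihoods (toy numbers for illustration only).
CUE_LIKELIHOODS = {
    "anger":    {"furrowed_brow": 0.8, "pursed_lips": 0.7, "open_palm": 0.1},
    "interest": {"furrowed_brow": 0.2, "pursed_lips": 0.1, "open_palm": 0.6},
}

def t2_infer(cues):
    """Deliberative path: attribute the most probable mental state."""
    scores = {}
    for state, likelihood in CUE_LIKELIHOODS.items():
        score = 1.0
        for cue in cues:
            score *= likelihood.get(cue.name, 0.05) * cue.intensity + 1e-6
        scores[state] = score
    return max(scores, key=scores.get)

def process_social_input(cues, time_budget_s):
    """Gate between T1 and T2 on temporal demand, as in Section 3.2."""
    if time_budget_s < 0.5:  # high temporal demand: react
        for cue in cues:
            if cue.name in T1_REFLEXES:
                return ("T1", T1_REFLEXES[cue.name])
        return ("T1", "hold_position")
    # low temporal demand: analyze and predict
    return ("T2", f"attributed_state:{t2_infer(cues)}")

print(process_social_input([SocialCue("furrowed_brow", 0.9)], 0.1))
print(process_social_input([SocialCue("furrowed_brow", 0.9),
                            SocialCue("pursed_lips", 0.8)], 5.0))
```

In a fuller system, the gate itself would be learned or situation-sensitive rather than a fixed threshold, but the division of labor is the point: the T1 branch never consults the hypothesis space that the T2 branch searches.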
3.3. Socially interactive robotic teammates

Central to our approach is an emphasis on social robotics, since we view this as an enabling factor in the establishment of human-robot teams (Wiltshire, Smith et al., 2013). We take the position that, for robots to be teammates, they must, first and foremost, be socially interactive (e.g., Gallagher, 2007, 2013; Wiltshire, Barber et al., 2013). We draw from recent neuro-philosophical accounts that posit the primacy of social interaction for the emergence of more complex social-cognitive mechanisms (Di Paolo & De Jaegher, 2012). With this in mind, social robots are defined as: (a) physically embodied agents that (b) function at some level of autonomy and (c) interact and communicate with humans by (d) adhering to normative and expected social behaviors (Bartneck & Forlizzi, 2004). Extending upon this definition, robots can be described as socially interactive if they are able to: (a) express or perceive emotions, (b) use high-level dialogue, (c) learn about, and recognize, other agents, (d) establish and maintain social relationships with humans, and (e) use and perceive natural social cues (Fong, Nourbakhsh, & Dautenhahn, 2003). Naturally, any single component of these definitions poses a significant challenge to designers of robotic systems.

Given the above, we suggest that, in attempting to develop socially interactive robots, a combinatorial approach is required that ultimately focuses on enabling meaningful interaction. To do so requires the understanding and integration of perceptual, motor, and cognitive factors. A key example of such an approach is outlined in Pezzulo's (2012) description of the Interaction Engine. In particular, Pezzulo (2012) describes the design of a computational system that not only accounts for the online and offline distinction (though listed as observer versus actor), but also specifies the tasks required for communication in increasingly complex interaction scenarios (see Table 3 for more details).

Table 3. Overview of the social robotics interaction engine (adapted from Pezzulo, 2012). Observer tasks correspond to offline social cognition; actor tasks correspond to online social cognition.

Individual Scenario
Tasks of the Observer (perceptual processes): Estimating the state of the observed system.
Tasks of the Actor (action processes): Achieving goals related to the environment.

Interaction Scenario (Non-Communicative)
Tasks of the Observer: Mindreading (estimating cognitive variables of another agent).
Tasks of the Actor: Achieving goals relative to another's actions.

Interaction Scenario (Communicative)
Tasks of the Observer: Mindreading for recognizing communicative intentions.
Tasks of the Actor: Achieving goals relative to another's internal variables (changing mental states of other agents).

Joint Action Scenario
Tasks of the Observer: Formation of shared representations.
Tasks of the Actor: Joint action control (takes the joint goal into consideration, uses the shared representation).

Linguistic Scenario
Tasks of the Observer: Mindreading for recognizing communicative intentions in speech acts.
Tasks of the Actor: Using language to achieve goals relative to another's internal variables; common ground formation.

Built upon Levinson's (2006) ideas, Pezzulo's (2012) Interaction Engine provides the foundational basis for both linguistic and nonlinguistic interaction. There are several mechanisms comprising the Interaction Engine, but the mindreading, communicative, and representation-sharing mechanisms, three components grounded in predictive processes, play a fundamental role in meaningful interaction. Whereas mindreading mechanisms enable abilities akin to a theory of mind and allow an individual to assess the beliefs and intentions of others, communicative mechanisms accomplish interactional goals, such as manipulating others' beliefs and intentions. Lastly, representation-sharing mechanisms allow for conveying mental models or representations to others, which, in turn, enhances the effectiveness of an individual's actions when operating alone in pursuit of an individual goal or coupled with the actions of others in pursuit of a joint goal. Broadly, approaches such as the Interaction Engine and its mechanisms are applicable to a number of contexts encountered in the social environment, namely observation and interaction scenarios, and, in the case of mindreading tasks, offer the promise of a greater understanding of human social cognition and its instantiation in artificial cognitive systems. A minimal sketch of these three mechanism families follows.
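As a rough illustration of how these three mechanism families might be exposed in software, consider the following sketch. The class and method names are our own hypothetical shorthand for Pezzulo's (2012) mechanisms, and the method bodies are placeholders rather than his implementation, which is grounded in far richer predictive models.

```python
class InteractionEngine:
    """Minimal sketch of the three mechanism families Pezzulo (2012)
    describes. Names and return values are illustrative assumptions."""

    def __init__(self):
        # Joint task state that both agents maintain (Table 3, joint action row).
        self.shared_representation = {}

    def mindread(self, observed_behavior):
        # Estimate another agent's cognitive variables (beliefs, intentions)
        # from observed behavior; a predictive model would go here.
        return {"intention": "reach_for_cup", "confidence": 0.7}

    def communicate(self, target_belief):
        # Select an act expected to move the partner's internal
        # variables toward target_belief (e.g., a gesture or utterance).
        return f"signal:{target_belief}"

    def share_representation(self, key, value):
        # Expose part of one's own task model so joint action control
        # can take the joint goal into consideration.
        self.shared_representation[key] = value

engine = InteractionEngine()
print(engine.mindread("partner reaches toward the table"))
engine.share_representation("joint_goal", "hand_over_cup")
print(engine.communicate("handoff_ready"))
```

The design point the sketch preserves is that the three mechanisms operate over a common, predictive substrate: mindreading reads another agent's variables, communication writes to them, and representation sharing makes a portion of one's own variables jointly accessible.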
3.4. Embodied cognitive systems

Within embodied cognition, the brain, body, and environment interact in such a way as to allow for the formation of cognitive processes (Varela, Thompson, & Rosch, 1991). In framing our approach, we suggest that robots be designed with serious consideration of their embodiment. Motivation for our commitment to embodiment, when developing social-cognitive mechanisms, comes from work in cognitive robotics through the idea that "when two cognitive systems interact or couple, the shared consensus of meaning. . .will only be semantically similar if the experiences of the two systems are compatible" (Vernon, 2010, p. 93). For example, spatial orientation concepts such as up or down only have meaning if the system can directly experience these concepts through its own physical body (Lakoff & Johnson, 1980).

There is increasing support for the modeling of artificial cognitive systems, and social robots, on commitments of embodied cognition across disciplines including AI, HRI, philosophy, psychology, robotics, and neuroscience (e.g., Anderson, 2003; Barsalou, 2008; Breazeal, Gray, & Berlin, 2009; Chaminade & Cheng, 2009; Dautenhahn et al., 2002; Fiore et al., 2008; Franklin, Strain, McCall, & Baars, 2013; Gallagher, 2007, 2013; Hoffman, 2012; Pezzulo et al., 2013; Pfeifer, Lungarella, & Iida, 2007). Through our commitment to embodied cognition, we mean that the designers of artificial cognitive systems must account for: (a) the environment, or ecological niche, and the associated physical laws in which (b) the body and morphological structures of the agent are grounded, (c) the sensorimotor couplings (i.e., the relations between sensors and effectors), as a function of the agent's morphology, that shape the dynamic interactions between the agent and the environment, and (d) the situatedness of the agent's cognitive processes as a function of varying contexts (e.g., Pezzulo et al., 2013; Pfeifer et al., 2007; Wiltshire, Barber et al., 2013). This commitment to embodiment is central to social cognition in humans (e.g., Barsalou, Niedenthal, Barbey, & Ruppert, 2003; Bohl & van den Bos, 2012). Here, committing to an embodied view of social cognition is crucial, especially when trying to instantiate social-cognitive mechanisms in robots. Given that the intent is for humans and robots to have the ability to collaborate with each other, this commitment increases in importance. Put in other words, if humans and robots are to understand each other and work effectively together, the more similar their embodiment and, in turn, their cognition, the easier it is for interaction between these biological and artificial cognitive systems to occur (Vernon, 2010).

3.5. Autonomous robotic teammates

Given that one of the long-term goals for designers of robotic teammates, and EHSC, is for robots to function autonomously, it is essential to frame our approach with a definition of autonomous agents. An autonomous agent is characterized as an embodied system that is designed to "satisfy internal and external goals by its own actions while in continuous long-term interaction with the environment in which it is situated" (Beer, 1995, p. 173). Although there are inherent challenges associated with automated systems performing as team members (cf. Klein et al., 2004), we submit that providing foundational social-cognitive mechanisms is a necessary step towards mitigating many of these issues.
Our aim with this paper is not to articulate the challenges that could be solved by the instantiation of such mechanisms, but rather to provide an outline of such a system. We suggest that leveraging the efforts of SSP, incorporating dual-process theories of cognition, supporting social interaction in the design of robotic agents, grounding robotic design in an embodied approach to cognition, and integrating the aforementioned in an effort to strive for autonomy in the way robots interact with people will provide a roadmap to chart the path forward in EHSC. In turn, we expect these concepts to improve human-robot interaction as a function of robots being provided with the necessary social-cognitive mechanisms (Wiltshire, Barber et al., 2013). If these mechanisms can be computationally modeled, we argue that it may be a step toward approximating the sophisticated and flexible nature of human social-cognitive processes. It is our position that adopting the factors associated with the EHSC approach would ultimately lead to, not only a more effective robot, but also a more effective artificial teammate. Now that we have outlined the key components of EHSC, in the remainder of our paper we provide the recommendations for modeling an artificial social-cognitive system based on our approach.

4. Recommendations for modeling social-cognitive mechanisms in robots

The recommendations we next advance are drawn from multiple disciplines, including HRI, philosophy, psychology, robotics, and neuroscience, as well as from theories of embodied cognition, dual-process theory, ecological psychology, and dynamical systems theory. We provide a novel means for conceptualizing ideas across disciplines and theories by organizing them into an integrated set of hierarchical recommendations for engineering human social cognition (i.e., an attempt toward the goals of explanatory pluralism; see Dale, 2008; Dale et al., 2009). Notably, we draw heavily from, and build upon, the ideas detailed in the Computational Grounded Cognition approach (Pezzulo et al., 2013), by enriching this account with a conceptualization of embodied cognition drawn from ecological psychology as well as human social cognition. At this point, our goal is not to provide computational formalisms, as many of these are included in the references we cite.

Because design recommendations for artificial social-cognitive systems are sparse (see, however, Vernon, 2014), we provide interdisciplinary and multi-theoretic recommendations that can aid in the design and development of robots that will one day function, and be perceived, as socially interactive and effective teammates. The focus of these recommendations is on developing the perceptual, motor, and cognitive architecture for such a system. Taken together, these recommendations serve as an ambitious first step toward improving human-robot interaction and teaming.

The subsequent architectural design recommendations are presented in the following manner: first, relevant background information and a brief description of the recommendation are outlined to better orient and frame the given rationale in light of the EHSC approach. Then, several example cases are provided to further illustrate the relevance and real-world application of the presented recommendation.
As such, future efforts can leverage these recommendations in adopting a technical approach; what is presented here is meant to be illustrative rather than exhaustive.

In Fig. 1, we provide a schematic to offer a visual frame of reference for the recommendations that follow. While Fig. 1 is a clear oversimplification, it is useful in conveying several aspects related to the organization of our modeling recommendations. Specifically, the arrows represent the flow of social information through the system. In cases where our recommendations are most associated with T1 processes, there is a more direct relationship between the system's receptors and its effectors. Some information must be stored for use by the system (see recommendation 4) in more computationally intensive processing. Thus, the left side of the figure represents the least computationally demanding recommendations and the right side represents the most demanding. Note that there are plausible bidirectional and coupled relationships between modeling recommendations and components of the system (that would be represented in part by recommendation 5); however, the figure was constructed in a parsimonious fashion for legibility. Recommendation 5 is thus represented by the arrows in the figure, although in a simplified form. Fig. 1 is best used in conjunction with the summary of the modeling recommendations included in Table 4.

Fig. 1. Schematic overview of the flow of social information and the relations between T1 and T2 processes, and the EHSC modeling recommendations. The arrows represent recommendation 5.

4.1. Leverage the ecological approach to robotics

Many extant computational approaches to embodied cognition (e.g., Pezzulo et al., 2013) do not incorporate insights from ecological psychology and its transitional work in robotics, despite the fact that ecological psychology was fundamental to embodied approaches to cognition (Wilson, 2002). In robotics, the ecological approach is characterized by the following principles: (a) the agent and the environment are treated as a system, (b) behavior emerges from the dynamics of the agent-environment system, (c) a direct coupling exists between perception and action, (d) information for adaptive behavior is available in the environment, and (e) an agent does not always need to represent the environment in a centralized model (Duchon, Kaelbling, & Warren, 1998; see also Gibson, 1979; Wiltshire, Barber et al., 2013). Accordingly, we suggest that artificial cognitive systems should be designed to provide the opportunity for direct and non-representational interaction with the physical and social environment (Wiltshire, Barber et al., 2013).

We argue that the rationale for leveraging the ecological approach, in light of EHSC, is that it provides a framework to computationally instantiate T1 processes in robots. That is, incorporating elements of this approach provides the mechanisms for direct interaction with the physical and social environment. Such an approach would also align with the principle of ecological balance (Pfeifer & Scheier, 1999), which relies on matching task environment complexity to the proper embodied model. The potential benefits of adhering to this recommendation are that it may serve to minimize the expense of computational resources and reduce latency in the behavioral responses of the robot.
Example instantiations of efforts in robotics related to this recommendation can be found in Brooks (1999) as well as Duchon et al. (1998). Duchon et al. (1998) pioneered ecological robotics and developed robotic systems that could navigate the environment solely through optic flow rather than construction of an internal model of the world. Likewise, Brooks (1999) developed the subsumption architecture, where each control sub-system for the robot was added to more basic sub-systems without interfering with previous systems. This approach was also one of the first efforts in robotics to emphasize direct perception-action links, which provided real-time mechanisms for interaction with the environment that did not rely on a centralized control model (Bermúdez, 2010). It is evident, however, that both of these instantiations only emphasized interaction with the physical environment. Therefore, efforts are needed to develop mechanisms for interaction with the social environment. A minimal sketch of the optic-flow idea is given below, followed by the summary of all eight modeling recommendations in Table 4.
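As an illustration of this ecological, model-free style of control, the following sketch implements a balance strategy in the spirit of Duchon et al. (1998): steering is computed directly from left/right optic-flow magnitudes, with no internal world model. The function name, gain, and synthetic flow values are our own illustrative assumptions, not the published controller.

```python
import numpy as np

def balance_strategy_turn(flow_left: np.ndarray, flow_right: np.ndarray,
                          gain: float = 1.0) -> float:
    """Ecological control in the spirit of Duchon et al. (1998): steer away
    from the visual hemifield with the larger optic-flow magnitude (nearer
    surfaces generate faster flow), with no internal world model.

    flow_left / flow_right: arrays of per-pixel flow magnitudes.
    Returns a turn rate (positive = turn left)."""
    left, right = float(np.mean(flow_left)), float(np.mean(flow_right))
    # Relative difference keeps the behavior roughly speed-invariant.
    return gain * (right - left) / (left + right + 1e-9)

# An obstacle close on the right produces strong right-hemifield flow,
# so the agent turns left, away from it.
rng = np.random.default_rng(0)
turn = balance_strategy_turn(rng.uniform(0.0, 0.2, 100),
                             rng.uniform(0.5, 1.0, 100))
print(f"turn rate: {turn:+.2f}")
```

Nothing in this loop stores or updates a map; the relevant information is picked up anew from the optic array on every cycle, which is precisely the T1-style economy the recommendation targets.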
Table 4. Modeling recommendations with potential computational formalisms and representative references.

1. Leverage the ecological approach to robotics
Description: Design the system to provide the opportunity for direct and non-representational interaction with the physical and social environment.
Rationale: Provides a framework for instantiating T1 processes in robots.
Benefits: Minimizes the expense of computational resources; reduces latency in responses.
Example instantiations: Subsumption architectures; optical flow.
References: Brooks (1999) and Duchon et al. (1998).

2. Utilize physical and social affordances
Description: Design the system for detecting affordances, the directly perceivable opportunities for action or interaction arising between the agent and environment across objects, substances, surfaces, and other agents.
Rationale: Provides the robot with more direct or T1 processes for action and interaction.
Benefits: Robot behavioral control mechanisms that specify the relations between the robot and the environment, physical objects, and human interactors.
Example instantiations: Visuo-spatial perspective taking, effort and affordance analysis; multiperspectival affordance control mechanisms (effect, (entity, behavior)).
References: Hafner and Kaplan (2008), Pandey and Alami (2012), Şahin et al. (2007), and Uyanik et al. (2013).

3. Incorporate analysis of interaction dynamics
Description: The system should be designed with mechanisms for analyzing, as well as synchronizing and complementing, emergent interaction dynamics.
Rationale: Could provide a T1 mechanism for analysis of interaction dynamics unfolding between robot and human teammates.
Benefits: Improvements in the human-robot and robot-robot coordination required for joint actions.
Example instantiations: Dynamical systems modeling techniques; HKB circuits and equations.
References: Ansermin et al. (2016), Beer (1995), Kelso et al. (2009), Marsh et al. (2009), and Treur (2013).

4. Instantiate modal perceptual and motor representations
Description: To the extent that the robot cannot rely on non-representational interaction with the physical and social environment, the embodied cognitive system of a social robot must rely on multi-modal sensory and motor representations.
Rationale: Commitment of Grounded Computational Cognition: incoming information from the robot's sensors must be represented in a form that is linked to its modality (visual, auditory, etc.).
Benefits: Provides the foundation from which a robot could begin to manipulate modal perceptions to not only interpret a human teammate, but also to form concepts and memories and make decisions.
Example instantiations: Afferent and efferent modality streams, nodes, action networks.
References: Breazeal et al. (2009), Hoffman (2012), and Pezzulo et al. (2013).

5. Couple perception, action, and cognition
Description: Modal representations require integration and association with one another to couple perceptual sensors with motor effectors.
Rationale: Helps enable both T1 and T2 processes; provides a basis for perceptual learning.
Benefits: May lead to more fluent interaction between robots and human teammates.
Example instantiations: Convergence zones; action-perception activation networks.
References: Brooks (1999), Damasio (1989), Hoffman (2012), and Simmons and Barsalou (2003).

6. Provide motor and perceptual resonance mechanisms
Description: The system should be designed with motor and perceptual resonance mechanisms that allow for recursive entrainment and behavior matching.
Rationale: Provides T1 motor and perceptual resonance mechanisms akin to those elicited in humans by the mirror neuron system.
Benefits: Bi-directionality allows robots to engage in social interactions more similar to those occurring in humans, which would likely: (a) improve overall social competence, (b) contribute to a better understanding of human teammates' intentions, and (c) lead to more coordinative joint actions.
Example instantiations: Mirror circuits; Hebbian learning networks associating visual and motor representations.
References: Amit and Mataric (2002), Barakova and Lourens (2009), Billard (2002), Chaminade and Cheng (2009), Elias (2008), Ito and Tani (2004), and Schütz-Bosbach and Prinz (2007).

7. Abstract from modal experiences
Description: Provide a simulation mechanism that allows for the "reenactment of perceptual, motor, and introspective states acquired during experience with the world, body, and mind" (Barsalou, 2008).
Rationale: Instantiates a T2 simulation/inference process for going beyond information directly available in the environment.
Benefits: Provides the robot with a key mechanism for engaging in mental state attributions of others, explaining current events, predicting future events, and imagining new events.
Example instantiations: Generation and simulation modes; Dynamic Bayesian Networks.
References: Breazeal et al. (2009), Johnson and Demiris (2005), Pezzulo (2012), and Vernon (2010).

8. Leverage simulation-based top-down perceptual biasing
Description: The system should be designed to provide a means for a robot to predict future states in service of anticipating, and thus engaging in, coordinative actions with the physical and social environment.
Rationale: Constitutes an essential link between T1 and T2 processes that stimulates and triggers the motor system towards the selection of appropriate actions while also informing direct perception-action links.
Benefits: "Simulation-based top-down perceptual biasing may specifically be the key to more fluent coordination between humans and robots working together in a socially structured interaction" (Hoffman, 2012).
Example instantiations: Markov-chain Bayesian anticipatory simulations; intermodal Hebbian reinforcing; weighted feature maps.
References: Hoffman (2012), Hoffman and Breazeal (2010), and Schütz-Bosbach and Prinz (2007).
4.2. Utilize physical and social affordances

Drawing yet again on the ecological approach, we emphasize the utilization of affordances. Affordances can be defined as the directly perceivable opportunities for action or interaction arising between the agent and environment, where any number of objects, substances, surfaces, and other agents comprise the interaction (Chemero, 2003; Gibson, 1979). Direct perception in the ecological sense typically means perception that does not rely on mental representations (Chemero, 2009, 2013), an interpretation echoed in the direct perception approaches to social cognition (see De Jaegher, 2009; De Jaegher & Di Paolo, 2007; Gallagher, 2007, 2008; Gangohopadhyay & Schilbach, 2011). We have advanced the notion of "social affordances" to describe how physically embodied cues experienced during interaction are sometimes enough to make rapid mental state attributions (Best, Warta, Kapalo, & Fiore, 2016). As such, affordances regarding both the physical and social environment are dependent upon the appropriate structuring of the environment and the agents in that environment. The combination of these provides information perceivable to an agent through its optic array, but is dependent upon the form of embodiment the agent maintains and the sensory modalities available to it (e.g., Chemero, 2003; Kono, 2009).

Incorporating affordances in a robotic system is significant in that such integration is essential to providing a robot with direct or T1 processes for interaction, which offers the underlying rationale for this recommendation. The potential benefit of doing so will be a positive influence on the behavioral control mechanisms in a robot, such that it will be possible to specify the relations between the robot, the environment, physical objects, and human interactors (e.g., Şahin, Çakmak, Doğar, Uğur, & Üçoluk, 2007), which may allow for more fluent interaction.

As far as example instantiations of affordances are concerned, mechanisms for control of autonomous robots have been utilized based on affordances that specify relations between the agent and the environment and have proven useful for navigation through a physical environment (e.g., Şahin et al., 2007). Much of the affordance-based effort in robotics is primarily of this nature; that is, focused on the physical environment with little emphasis on the social environment. Although not clearly following an ecological approach, Pandey and Alami (2012) developed a system to allow for a set of complex social-cognitive behaviors that includes mechanisms for the analysis of affordances not only between an agent, objects, and the environment, but also between multiple agents. They highlighted three key capabilities – visuo-spatial perspective taking, effort analysis, and affordance analysis – as necessary for providing a robot with social-cognitive mechanisms. In combination, these mechanisms allow a robot to interpret what a teammate visually perceives at a given point in time, determine the amount of effort required by a teammate to execute a certain task given the current situation, positioning, and morphology of the teammate's body, and then identify opportunities for action existing between agent-agent, agent-location, agent-object, and object-agent (see Pandey and Alami (2012) for details). A toy sketch of this style of affordance analysis follows.
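The following sketch is a deliberately crude stand-in for affordance analysis in the spirit of Pandey and Alami (2012): agents and objects are points in a plane, "effort" is just the distance beyond arm reach, and all names and numbers are hypothetical.

```python
from dataclasses import dataclass
import math

@dataclass
class Agent:
    name: str
    position: tuple   # (x, y) in meters
    reach: float      # arm reach in meters

@dataclass
class Obj:
    name: str
    position: tuple

def reach_affordances(agents, objects):
    """Toy affordance analysis: enumerate which agent-object 'graspable'
    relations currently hold, with a crude effort estimate (distance
    beyond reach that would have to be closed; 0 => graspable as-is)."""
    affordances = []
    for agent in agents:
        for obj in objects:
            d = math.dist(agent.position, obj.position)
            effort = max(0.0, d - agent.reach)
            affordances.append((agent.name, obj.name, round(effort, 2)))
    return affordances

human = Agent("human", (0.0, 0.0), reach=0.7)
robot = Agent("robot", (2.0, 0.0), reach=0.9)
cup = Obj("cup", (0.5, 0.2))
for a, o, e in reach_affordances([human, robot], [cup]):
    print(f"{a} -> {o}: extra effort {e} m")
```

A real system would replace the distance heuristic with perspective taking, kinematic effort models, and learned effect predictions, but even this toy version shows the output format that matters for teaming: a relational map of who can act on what, and at what cost.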
Likewise, Hafner and Kaplan (2008) developed an affordance-based mechanism to allow interaction behaviors in robots. Their approach centered on the creation of interpersonal maps, whereby an agent is able to directly map its own body structure onto that of an observed body, thus allowing for derivation of action possibilities for the conglomerate of the two agents. More recently, Uyanik et al. (2013) developed a mechanism for a robot to learn social affordances (albeit in a simple context) that allowed the robot to manipulate objects and request the assistance of a human, as well as create multi-step plans to accomplish a collaborative goal. Naturally, many of these implementations of affordances for robots are subject to contention, as some maintain that affordances, even in robotics, must be non-representational (e.g., Chemero & Turvey, 2007). Addressing this argument is beyond the scope of this paper; instead, we merely posit that there may be room for both representational and non-representational approaches and that, ultimately, both may be necessary in providing an account of cognition capable of explaining complex cognitive processes, such as those characteristic of human social cognition (cf. Dale, 2010; Horton, Chakraborty, & Amant, 2012; Wiltshire et al., 2015). Regardless, further research is needed to extend these affordance-based efforts to focus more on the nuances of social interaction.

4.3. Incorporate analysis of interaction dynamics

From neural activity within an individual, to the behavioral coordination of multiple individuals interacting in an environment, and even to socio-cultural levels, there are emergent and dynamic self-organizational patterns that prescribe and constrain interaction possibilities as they occur across temporal and spatial dimensions (e.g., Coey, Varlet, & Richardson, 2012; Eiler, Kallen, Harrison, & Richardson, 2013; Marsh, Richardson, & Schmidt, 2009; Pfeifer et al., 2007). Many of these interaction dynamics are emergent when humans engage in joint action and can take on many forms of coordination (e.g., Knoblich, Butterfill, & Sebanz, 2011). In accord with this growing body of research, we suggest the system be designed with the capacity to analyze, and either synchronize or complement, interaction dynamics. The rationale here is that this could provide the robot with another mechanism characteristic of T1 processes, which would enable the analysis of interaction dynamics. Such mechanisms could prove beneficial in that both human-robot and robot-robot coordinative mechanisms, which are crucial in supporting joint action, could be enhanced to provide more effective and efficient interactions.

One example instantiation relevant to this recommendation is drawn from the tools of dynamical systems theory and has been used for analyzing human interaction dynamics from motion sensing data (e.g., Marsh et al., 2009). With such a mechanism, robots could conceivably analyze interaction dynamics unfolding between teammates and, in turn, couple with these dynamics to perform effectively. Another example comes from Kelso, de Guzman, Reveley, and Tognoli (2009), who developed the Virtual Partner Interaction (VPI) paradigm. This paradigm leveraged the Haken-Kelso-Bunz circuit (e.g., Haken, Kelso, & Bunz, 1985) to allow the virtual partner (i.e., the computer) to couple its interaction with humans. Further, the authors suggest that the VPI serves as a foundational instantiation that can facilitate more complex forms of human-machine coordination. A minimal numerical sketch of the underlying dynamics is given below.
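For concreteness, the following sketch numerically integrates the Haken-Kelso-Bunz relative-phase equation, dphi/dt = dw - a*sin(phi) - 2*b*sin(2*phi), the dynamic underlying the circuit used in VPI. The parameter values and simple Euler integration are our own illustrative choices, not the published circuit.

```python
import math

def hkb_step(phi, dw, a, b, dt):
    """One Euler step of the Haken-Kelso-Bunz relative-phase equation:
        dphi/dt = dw - a*sin(phi) - 2*b*sin(2*phi)
    phi: relative phase between the two movements (radians)
    dw:  natural frequency difference between the partners
    a, b: coupling strengths (the ratio b/a governs whether
          anti-phase coordination remains stable)."""
    return phi + dt * (dw - a * math.sin(phi) - 2.0 * b * math.sin(2.0 * phi))

# Starting near anti-phase with these parameters, the relative phase
# relaxes toward in-phase coordination (phi -> 0): the kind of stable
# pattern a virtual partner can exploit to entrain with a human.
phi, dt = 2.8, 0.01
for _ in range(3000):
    phi = hkb_step(phi, dw=0.0, a=1.0, b=0.2, dt=dt)
print(f"settled relative phase: {phi:.3f} rad")
```

The same equation, run the other way, lets a machine perturb or stabilize a human partner's coordination, which is what makes it attractive as a T1-style coupling mechanism rather than a mere analysis tool.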
In a more general sense, Delaherche et al. (2012) review methods for identifying interpersonal synchronization and articulate how such mechanisms, when instantiated in machines, have the potential to favorably advance the social capabilities of artificial cognitive systems. An example instantiation that models robot synchronization with humans can be found in Ansermin, Mostafaoui, Beaussé, and Gaussier (2016). The major takeaway here is that many forms of human interaction occur over time (i.e., dynamically), and socially intelligent machines must be able to attune to this information to interact accordingly.

4.4. Instantiate modal perceptual and motor representations

While the essence of the previously outlined recommendations is that these mechanisms are minimally reliant or non-reliant on representations, as noted above, the system may need to rely on representations to support more complex forms of cognition (Dale, 2010; Horton et al., 2012). Toward this end, the artificial cognitive system should rely on multi-modal sensory and motor representations (e.g., Hoffman, 2012). According to Pezzulo et al. (2013), modal perceptual and motor representations mean that incoming sensory information is represented in a form linked to its modality (e.g., visual, auditory, etc.). The rationale for this recommendation, and several of the following, is that they are commitments of the Grounded Computational Cognition approach (Pezzulo et al., 2013). Additionally, in accord with our account, modal perceptual and motor representations can be an essential form of input for both T1 and T2 processes. Cognitive concepts of a more complex nature would necessarily entail processes characteristic of autonomous agents, such as affect and motivation, internal modalities that a system must be capable of independently constructing in pursuit of goal-oriented qualities (Hoffman, 2012; Pezzulo et al., 2013). Social-cognitive mechanisms concerning such a non-representational concept as affect may actually endow a system with the ability to imitate the affective states of its human teammates and, furthermore, understand them. Some researchers have argued for the primacy of social learning and understanding through robotic imitation of humans (Breazeal & Scassellati, 2002). The benefits of this recommendation are that this type of grounding provides the foundation from which a robot could begin to manipulate perceptual information in order to interpret humans (Breazeal et al., 2009) as well as form concepts and memories and make decisions (Hoffman, 2012).

An example instance of this recommendation is Hoffman's (2012) use of modality streams and action networks to link perceptual data with motor action activation. Modality streams are connected to action networks and comprised of multiple perceptual process nodes, each representing a different form of perceptual data, ranging from raw sensory input to features, properties, or higher-level concepts. Action nodes, which trigger motor actions, make up the structure of the action network. Activation of the modality stream can progress in either an afferent direction, from the sensory system to concepts and actions, or an efferent direction, which follows the inverse path. Ultimately, these efforts would facilitate modeling of affective and cognitive states while taking their dynamic and interdependent nature into account (e.g., Pezzulo, 2012; Treur, 2013). A minimal sketch of such a stream follows.
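As a minimal sketch of an afferent modality stream, the following toy network propagates activation from a raw visual node through a concept node to an action node. The node names, weights, and threshold are hypothetical; Hoffman's (2012) networks are substantially richer and also run in the efferent direction.

```python
# Toy modality stream: perceptual nodes hold an activation, propagate it
# afferently toward concept nodes, which in turn activate action nodes.
class Node:
    def __init__(self, name):
        self.name, self.activation, self.out = name, 0.0, []  # out: (node, weight)

    def connect(self, node, weight):
        self.out.append((node, weight))

    def stimulate(self, value):
        # Afferent propagation: raw input -> features/concepts -> actions.
        self.activation = max(self.activation, value)
        for node, w in self.out:
            node.stimulate(value * w)

# A visual modality stream feeding a "greeting" concept and a wave action.
raw_face = Node("visual:face_detected")
concept_greeting = Node("concept:greeting_opportunity")
action_wave = Node("action:wave")
raw_face.connect(concept_greeting, 0.9)
concept_greeting.connect(action_wave, 0.8)

raw_face.stimulate(1.0)
if action_wave.activation > 0.5:  # a threshold triggers the motor action
    print("trigger wave:", round(action_wave.activation, 2))
```

The representational point is that the face percept never leaves its visual "stream": higher levels operate on modality-tagged activations rather than on an amodal symbol, which is the grounding commitment this recommendation inherits from Pezzulo et al. (2013).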
4.5. Couple perception, action, and cognition

Modal representations require integration and association with one another to couple motor effectors with perceptual sensors, effectively supporting both T1 and T2 processes. This notion is supported by ecological psychology, in which Gibson (1979) first suggested that cognitive processes are substantiated by the convergence of perception and action. Neisser (1976), though he took a contrasting approach to Gibson with regard to reasoning about the environment, also argued that perception could not be separated from action. More recent work in neuroscience reinforces Gibson's speculation (e.g., Gangopadhyay & Schilbach, 2011; Knoblich & Sebanz, 2006). The rationale for this recommendation is that designing such a system would enable both T1 and T2 processes. The benefits are that it would not only provide a basis for perceptual learning, but may also lead to more fluent interaction between robots and human teammates (Hoffman, 2012). In essence, this would create an architecture in which perception (i.e., sensors), action (i.e., actuators), and cognition function as interacting components within an interdependent system. These architectural components would be layered in such a way that they function in parallel with one another, allowing for information exchange and more efficient behavior. This architecture would incorporate the concept of Convergence Zones (Damasio, 1989; Simmons & Barsalou, 2003), one of the example instantiations pertaining to this recommendation. In architectures possessing convergence zones, the perceptual and motor layers of the models overlap and interact to enable the simulation and generation of behavior (e.g., Lallee & Dominey, 2013). Brooks' (1999) subsumption architecture was one of the first robotics efforts emphasizing direct perception-action links, designed out of necessity to address issues the Sense-Plan-Act (SPA) paradigm faced (Nilsson, 1980). Where the SPA paradigm took a considerable amount of time to plan an action and relied largely on internal models rather than perception, the subsumption architecture supported the execution of motor commands in direct response to sensory input, allowing the robot to interact with a dynamic environment in real time, more quickly and reactively. Symmetrical action-perception activation networks, a more recent example of this recommendation, were proposed by Hoffman (2012) to achieve multi-modal integration. Within these networks, perceptions influence higher-level associations that contribute to the selection of actions; conversely, perceptions are biased through motor activities. As such, this line of thinking has become increasingly adopted in robotics through recognition of the reciprocal interdependencies of the perception-action continuum. The perception-action continuum is similar in concept to the action-perception cycle, in which perception is recognized as the fundamental basis for an agent capable of interacting with its environment (Murphy, 2000).
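To illustrate the contrast with SPA, the following is a minimal, hypothetical subsumption-style controller in which behaviors map sensing directly to action and higher-priority layers override lower ones. The layer names and sensor fields are invented for this example, and the arbitration loop is a simplification of Brooks' (1999) actual wiring of suppression and inhibition links between layers.

    from typing import Optional

    # Each layer maps raw sensor readings directly to an optional motor command;
    # there is no world model and no separate planning stage (contrast SPA).
    def avoid(sensors) -> Optional[str]:
        return "turn_away" if sensors["range_m"] < 0.3 else None

    def follow_human(sensors) -> Optional[str]:
        return "approach_person" if sensors["person_visible"] else None

    def wander(sensors) -> Optional[str]:
        return "wander"

    def arbitrate(sensors) -> str:
        # First active layer wins; in full subsumption this arbitration is
        # realized by suppression links rather than an explicit loop.
        for layer in (avoid, follow_human, wander):
            command = layer(sensors)
            if command is not None:
                return command

    print(arbitrate({"range_m": 1.2, "person_visible": True}))  # -> approach_person

Because each layer reacts directly to sensing, the controller's latency is bounded by sensor readout rather than by a deliberative planning step, which is the property that made subsumption-style robots responsive in dynamic environments.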
4.6. Provide motor and perceptual resonance mechanisms

Motor and perceptual resonance mechanisms implemented in robots would allow for recursive entrainment and behavior matching. The rationale for this recommendation is that these mechanisms would be akin to the T1 motor and perceptual resonance mechanisms elicited in humans by the mirror neuron system (Elias & Fiore, 2008). The mirror neuron system, a biological structure found in primates and humans (di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992; Gallese & Goldman, 1998), is argued to be involved in predicting others' mental states and behaviors through simulation and imitative capabilities (Gallese & Goldman, 1998). Imitation was later proposed as the connection between mirror neurons and "mindreading" skills (Meltzoff & Decety, 2003). One benefit of modeling resonance mechanisms after the mirror neuron system is that it would enable design of a mechanism that could be trained to perceive and perform any number of novel actions. Such structures would not be limited in recognizing the wide array of possible actions (Arbib, Metta, & van der Smagt, 2008), thus holding the potential for supporting T1 processes (Bohl & van den Bos, 2012). Further benefits of such a system would entail a greater understanding of human teammates' intentions as they move through the environment and an improvement in the overall social competence of robots, as a function of enabling social interactions that better reflect those occurring among humans (e.g., Chaminade & Cheng, 2009; Schütz-Bosbach & Prinz, 2007). Motor and perceptual resonance mechanisms would ultimately facilitate better-coordinated joint actions. Barakova and Lourens (2009) implemented one example of a mirror neuron framework that provided simulated and embodied robots with a mechanism for synchronizing movements and entraining neuronal firing patterns. This facilitated turn taking and other teaming-related behaviors between two agents. Additionally, Ito and Tani (2004; see also Ito, Noda, Hoshino, & Tani, 2006) utilized an approach based on dynamical systems to design a mechanism capable of enabling a humanoid robot to learn through imitative interactions. While the system uses a recurrent neural network, which "encodes" sensorimotor trajectories for later recall, to learn behaviors, it can also be interpreted in terms of a mirror neuron system in that the encoding and imitation of the action are analogous to observation and motor generation. Additionally, robot imitation learning has drawn from the concept of mirror neurons in the past, imitating these systems through hidden Markov models (Amit & Mataric, 2002) or neural networks (Billard, 2002). In an interactive context, such a mechanism could enable a robot to synchronize its movements with a human team member, adapting to dynamic situations and generating learned behaviors as required.
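As a toy illustration of the hidden Markov model route to imitation-style recognition (cf. Amit & Mataric, 2002), the sketch below scores an observed movement sequence against internally stored action models, a crude analogue of recognizing an action by internally re-enacting it. The action models, motion symbols, and all parameter values are invented for the example.

    import numpy as np

    # Each known action is a small discrete HMM over motion symbols
    # (0 = reach, 1 = grasp, 2 = retract). Parameters are illustrative.
    def forward_loglik(obs, start, trans, emit):
        # Standard HMM forward algorithm: log-likelihood of the observation.
        alpha = start * emit[:, obs[0]]
        for o in obs[1:]:
            alpha = (alpha @ trans) * emit[:, o]
        return np.log(alpha.sum())

    pick_up = dict(start=np.array([1.0, 0.0]),
                   trans=np.array([[0.7, 0.3], [0.0, 1.0]]),
                   emit=np.array([[0.8, 0.2, 0.0], [0.0, 0.3, 0.7]]))
    wave = dict(start=np.array([0.5, 0.5]),
                trans=np.array([[0.5, 0.5], [0.5, 0.5]]),
                emit=np.array([[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]))

    observed = [0, 0, 1, 2]  # symbolized motion capture of a teammate's movement
    scores = {name: forward_loglik(observed, **m)
              for name, m in [("pick_up", pick_up), ("wave", wave)]}
    print(max(scores, key=scores.get))  # best-matching simulated action: pick_up

The best-scoring internal model could then seed motor generation, closing the observation-to-imitation loop that the mirror-system interpretation of such architectures emphasizes.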
4.7. Abstract from modal experiences

Mechanisms such as simulation have the ability to abstract upon modal representations, which Barsalou (2008) defines as the "reenactment of perceptual, motor, and introspective states acquired during experience with the world, body, and mind" (pp. 618–619). According to several interpretations of grounded cognition, simulation plays a central role in supporting cognitive processes (Barsalou, 1999; Decety & Grèzes, 2006; Goldman, 2006). Thus, the rationale for this recommendation is that the instantiation of simulation or inference mechanisms capable of abstracting from modal experiences would support T2 processes for use when information is not directly available in the environment. The benefits of designing a system to support these mechanisms are that a robot could better engage in mental state attributions of others as well as explain current events, predict future events, and imagine new events (Vernon, 2010). Essentially, this would provide the robot with a mechanism for engaging in mental state attribution as a function of initial exposure to a mental state and storage of the associated multimodal sensory cues (e.g., in the case of anger: a raised voice, a characteristic facial expression, etc.). In the event that the robot encounters the same mental state again, the associated cues can be retrieved and used to simulate the mental state to enable first recognition and then attribution. An example instantiation related to this recommendation is Breazeal et al.'s (2009) design of an embodied social robot system composed of interconnected perceptual, motor, belief, and intention modules. From these modules, the robot generates its own states and re-uses them to simulate and infer the perspective and intentions of humans during an interaction, which in part rests on a foundation of social learning through imitation (Breazeal & Scassellati, 2002). In support of interactive capabilities, other efforts have used Dynamic Bayesian Networks for understanding mental states (Pezzulo, 2012). Similarly, Johnson and Demiris (2005) designed the Hierarchical Attentive Multiple Models for Execution and Recognition (HAMMER) architecture, grounding it in the structure of mirror neurons, to employ simulation theory in the recognition and imitation of actions. The HAMMER architecture enables action recognition through observation of another agent's actions while simultaneously utilizing a simulation mechanism to compare a set of models (analogous to motor programs) to the observed action. Essentially, their approach provides a robot with the ability to observe and recognize the actions of another and, in turn, simulate and generate an imitation of the observed action by temporarily adopting the other agent's perspective and engaging the motor system in the recognition process. While the HAMMER architecture has only been applied to action recognition, it could be modified for use in a dynamic social environment to allow for the inference of mental states as well.
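To sketch how stored multimodal cue constellations could drive simulation-based attribution, the following toy example stores each previously encountered mental state as a vector of cue activations and attributes a new observation to the best-matching re-enacted state. The cue names and all numeric values are invented for illustration.

    import numpy as np

    # A mental state is stored as a constellation of multimodal cue activations
    # acquired during a prior encounter; attribution re-simulates each stored
    # state and selects the best match.
    CUES = ["raised_voice", "smile", "frown", "lowered_gaze"]
    memory = {
        "anger":     np.array([1.0, 0.0, 1.0, 0.0]),
        "happiness": np.array([0.0, 1.0, 0.0, 0.0]),
        "sadness":   np.array([0.0, 0.0, 1.0, 1.0]),
    }

    def attribute(observed):
        # Cosine similarity between observed cues and each re-enacted state.
        sims = {s: observed @ v / (np.linalg.norm(observed) * np.linalg.norm(v))
                for s, v in memory.items()}
        return max(sims, key=sims.get)

    print(attribute(np.array([0.1, 0.0, 0.9, 0.8])))  # -> "sadness"

A fuller treatment would make the stored vectors probabilistic (e.g., the Dynamic Bayesian Network formulations discussed above) so that attribution degrades gracefully when only some cues in a constellation are observed.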
4.8. Leverage simulation-based top-down perceptual biasing

Simulation mechanisms also provide a means for a robot to predict future states in service of engaging in coordinative action with the physical and social environment (Hoffman & Breazeal, 2004). Research findings on human visual psychophysics indicate that scene analysis, and by extension, analysis of a robot's working environment, is heavily influenced by top-down processes (Wang, 2003). The ability to perceive one's environment is, in itself, intrinsically linked to action. For example, Noton and Stark's (1971) Scanpath Theory of Attention states that top-down cognitive control drives the eye movements that, in turn, enable perception. Essentially, the primary influence on perception is a cognitive model based on expectations; thus, an individual often only perceives what they expect to see. The rationale behind this recommendation is that it would provide a link between T1 and T2 processes, essential to stimulating and triggering the motor system toward the selection of appropriate actions while also informing direct perception-action links. The benefits of this recommendation are that it would support the evolution of robots from artificial systems perceived as tools to artificial systems perceived as teammates. In particular, such simulation-based top-down perceptual biasing "may specifically be the key to more fluent coordination between humans and robots working together in a socially structured interaction" (Hoffman, 2012, p. 6). Such a mechanism would additionally contribute to the learning and perception of affordances. Specifically, a mechanism grounded in scanpath theory would enable a robot to analyze its environment more efficiently by observing specific details, in accordance with their saliency, that would support enhancement of the original model. An example of a simulation-based top-down approach to such a mechanism would be perceptual priming, which essentially stimulates and triggers the motor system toward the selection of the most appropriate action in a given situation (Marsh et al., 2009; Schütz-Bosbach & Prinz, 2007). Hoffman (2012) describes two subsystems that support simulation-based top-down perceptual biasing as perceptual priming mechanisms: Markov-chain Bayesian anticipatory simulations and intermodal Hebbian reinforcing. Markov-chain Bayesian anticipatory simulation mechanisms allow the system to probabilistically anticipate the activation of a state to reduce reaction latency, whereas intermodal Hebbian reinforcing strengthens the connections between activation nodes. For example, if a happy mental state is generally attributed when a social agent exhibits social cues resembling a smile, then the perception of a smile will activate the pathway responsible for the judgment of happiness. The activation of this pathway, when employing a Markov-chain Bayesian anticipatory simulation mechanism, serves to prime the robot's perceptual system in that the sensors responsible for detecting the features of a happy mental state are now more responsive toward these social cues, which reduces the delay of a happy mental state attribution. The application of intermodal Hebbian reinforcing will influence the decision-making process as exposure to a given constellation of social cues (i.e., multiple co-occurring social cues) increases. If sadness is commonly communicated with social cues resembling a frown and heavy-lidded gaze (i.e., eyes not completely open), then the connection between the attribution of a sad mental state and these two co-occurring social cues is strengthened. In essence, a mechanism grounded in these two computational subsystems will trigger the simulation of a high-level concept, like a mental state, and prime the sensors responsible for detecting low-level features, like social cues.
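The following toy sketch illustrates both subsystems in spirit: a Markov transition matrix primes detectors for likely next states, and a Hebbian rule strengthens cue-to-state links as constellations of cues co-occur. The states, cues, learning rate, and transition probabilities are all invented for the example and are not drawn from Hoffman's (2012) implementation.

    import numpy as np

    STATES = ["neutral", "happy", "sad"]
    T = np.array([[0.6, 0.3, 0.1],   # invented transition probabilities between
                  [0.3, 0.6, 0.1],   # attributed mental states
                  [0.2, 0.1, 0.7]])

    def prime(current_state):
        # Anticipatory simulation: boost sensitivity of the detectors for the
        # states most likely to follow, reducing attribution latency.
        return dict(zip(STATES, T[STATES.index(current_state)]))

    # Hebbian reinforcing: strengthen cue-to-state links that co-occur.
    w = {("smile", "happy"): 0.5, ("frown", "sad"): 0.5, ("lowered_gaze", "sad"): 0.4}
    def hebbian_update(cues, state, lr=0.1):
        for c in cues:
            w[(c, state)] = w.get((c, state), 0.0) + lr

    print(prime("happy"))  # detectors for likely next-state cues become more responsive
    hebbian_update(["frown", "lowered_gaze"], "sad")
    print(w[("frown", "sad")], w[("lowered_gaze", "sad")])

Together, the transition matrix realizes the anticipatory component (what to expect next) while the weight updates realize the associative component (which cue constellations signal which states), matching the division of labor described above.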
5. Discussion and conclusions

For robots to become more capable of complex social interactions, it is essential to advance the state of the art in HRI through innovations made in the social-cognitive mechanisms featured in artificial systems (Christensen et al., 2013). Such innovations would enable more effective human-robot coordination and cooperation, and shift the perception of robots as tools to robots viewed as teammates, collaborators, or partners. The most well-known artificial general intelligence (AGI) models – LIDA (Ramamurthy, Baars, D'Mello, & Franklin, 2006), Soar (Laird et al., 1987), and ACT-R (Anderson, 1993) – began as a series of theories that could be applied to design an artificial cognitive model. Eventually, those theories evolved into conceptual commitments that computational efforts would adhere to in instantiating such work, resulting in the current AGI models of today. Our goal with this paper was to establish a roadmap for engineering human social cognition. To that end, we provided a theoretical foundation for modeling social-cognitive mechanisms in robotic agents such that these artificial systems can be designed to function, and be perceived, as effective teammates. It is our primary position that understanding the many facets of human social cognition will be foundational to the design of artificial social intelligence systems. This is but one step in research for advancing the social capabilities of robotic agents. With advancements in artificial social intelligence, concomitant considerations will also need to be made regarding, for example, artificial moral capabilities (e.g., Wallach & Allen, 2008; Wiltshire, 2015). Throughout this paper, we have outlined an integrated set of mechanisms that are key to instantiating the EHSC approach, which we suggest as a path toward artificial social intelligence. Unraveling human social cognition is vital to mapping how people understand others' mental states, explain as well as predict their behavior, determine the most appropriate behavioral response, and maintain relationships. As we have detailed, although the fundamental mechanisms of social cognition remain debated, they likely comprise some combination of the Theory of Mind approach (Gopnik & Wellman, 1992), perceptual-motor simulation routines (Blakemore & Decety, 2001; Goldman, 2006), and direct perception (De Jaegher, 2009; De Jaegher & Di Paolo, 2007; Gallagher, 2007, 2008; Gangopadhyay & Schilbach, 2011; Wiltshire et al., 2015). Meta-level approaches, such as dual-process theories of cognition, have integrated the aforementioned social-cognitive mechanisms in an attempt to more fully explain social cognition (Bohl & van den Bos, 2012; Wiltshire et al., 2015). Building upon this, EHSC emphasizes the potential functional efficacy of leveraging the dual approach to social signal processing. In this way, we provided a framework that integrates competing theories with the goal of differentiating mechanisms of social cognition to make their computational instantiation more tractable. The EHSC approach is unique in its identification and advancement of four specific components that we believe will contribute to the design of robotic systems that possess social intelligence. First, we identified the importance of SSP in enabling more effective communication between human and robotic teammates and its contribution to providing the basis for mechanisms that will allow a robot to gain an improved understanding of humans. Maintaining this emphasis on social robotics places the objective of attaining more natural and automatic interaction at the forefront of our approach, with a focus on verbal and nonverbal communicative cues.
However, we also stress that, for more natural and automatic interaction to be possible in HRI, embodied cognition must be taken into account, given that this will establish the foundation for humans and robots to construct a shared understanding of the world. Finally, we outlined our commitment to autonomous robotic systems, one of the long-term goals for EHSC and robotics designers alike. We propose that leveraging these four key components of the EHSC approach will provide an initial roadmap toward modeling social-cognitive mechanisms that will give rise to not only a more effective robot, but also a more effective artificial teammate. Our modeling recommendations have centered primarily on the perceptual, motor, and cognitive modeling of a robotic system in a way that spans disciplinary perspectives. Indeed, this is the area that will require extensive work in the future. However, while we have posited these recommendations as aligned with the components of our approach, they can also be viewed as open challenges for consideration in the instantiation of artificial social intelligence for embodied agents (cf. Vinciarelli et al., 2015). Admittedly, the aim of our paper was not to outline the challenges that will present themselves given the instantiation and integration of these mechanisms, but such challenges will certainly need to be addressed to ensure that robots will perform as effective team members (cf. Klein et al., 2004). As such, the next steps in this area must include both research and modeling efforts that assess the issues and challenges of integrating the proposed types of models and formalisms. That effort can aid in the development of an integrated and working system based on these recommendations. These recommendations, if instantiated, would provide some very basic perceptual, motor, and cognitive abilities, but future efforts should address whether these would also support more complex forms of social interaction. Such a capability would permit an artificial system to better express or perceive emotions while interacting and communicating with humans (cf. Pezzulo, 2012) in even more complex social scenarios requiring shared decision-making and problem-solving. Table 4 lists the EHSC modeling recommendations outlined within this paper, their relation to T1 and T2 processes, examples of associated computational formalisms supporting their instantiation, and representative references essential for consideration in their implementation. In sum, our goal has been to outline the basic engineering of human social cognition to illustrate how an embodied social robot can be designed to function autonomously as an efficient teammate. Adopting the EHSC recommendations as an approach to modeling social-cognitive mechanisms in robots will not only provide a sophisticated and flexible perceptual, motor, and cognitive architecture for robots, it will also allow for a more direct understanding of, and natural interaction with, the environment and human teammates. It will also provide the mechanisms for better understanding human behavior and mental states, as well as allow for the prediction and interpretation of novel and complex social situations. Social robots exist within environments suited to humans; similar embodiment under these conditions indicates the need for robots to take advantage of similar cognitive mechanisms if humans and robots are to coexist.
Acknowledgements

The authors wish to thank Christian Mosbæk Johannessen for his graphic design recommendations regarding Fig. 1. This work was partially supported by the Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-10-2-0016. Views contained here are those of the authors and should not be interpreted as representing official policies, either expressed or implied, of the Army Research Laboratory, the U.S. Government, or the University of Central Florida. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation.

References

ACT-R Research Group (2013). About ACT-R. Retrieved from <http://act-r.psy.cmu.edu/about/>.
Adolphs, R. (1999). Social cognition and the human brain. Trends in Cognitive Sciences, 3(12), 469–479.
Amit, R., & Mataric, M. (2002). Learning movement sequences from demonstration. In Proceedings of the 2nd international conference on development and learning (pp. 203–208). IEEE.
Anderson, J. R. (1993). Rules of the mind. Hillsdale, NJ: Lawrence Erlbaum Associates.
Anderson, M. L. (2003). Embodied cognition: A field guide. Artificial Intelligence, 149(1), 91–130.
Anderson, J. R., Bothell, D., Byrne, M. D., Douglass, S., Lebiere, C., & Qin, Y. (2004). An integrated theory of the mind. Psychological Review, 111(4), 1036.
Ansermin, E., Mostafaoui, G., Beaussé, N., & Gaussier, P. (2016, August). Learning to synchronously imitate gestures using entrainment effect. In International conference on simulation of adaptive behavior (pp. 219–231). Springer International Publishing.
Arbib, M. A., Metta, G., & van der Smagt, P. (2008). Neurorobotics: From vision to action. In B. Sicilliano & O. Khatib (Eds.), Springer handbook of robotics (pp. 1453–1480). Springer.
Atkinson, D. J., & Clark, M. H. (2013, March). Autonomous agents and human interpersonal trust: Can we engineer a human-machine social interface for trust? (Technical Report No. SS-13-07). In Trust and autonomous systems: Papers from the 2013 AAAI spring symposium. Menlo Park, CA: AAAI Press.
Barakova, E. I., & Lourens, T. (2009). Mirror neuron framework yields representations for robot interaction. Neurocomputing, 72(4), 895–900.
Bargh, J. A. (1984). Automatic and conscious processing of social information. In R. S. Wyer, Jr. & T. K. Srull (Eds.), Handbook of social cognition (Vol. 3, pp. 1–43). Hillsdale, NJ: Lawrence Erlbaum Associates.
Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22(4), 577–660.
Barsalou, L. W. (2008). Grounded cognition. Annual Review of Psychology, 59, 617–645.
Barsalou, L. W., Niedenthal, P. M., Barbey, A. K., & Ruppert, J. A. (2003). Social embodiment. Psychology of Learning and Motivation, 43, 43–92.
Bartneck, C., & Forlizzi, J. (2004). A design-centered framework for social human-robot interaction. In Proceedings of RO-MAN 2004 (pp. 591–594). http://dx.doi.org/10.1109/ROMAN.2004.1374827.
Beer, R. D. (1995). A dynamical systems perspective on agent-environment interaction. Artificial Intelligence, 72(1–2), 173–215.
Bermúdez, J. L. (2010). Cognitive science: An introduction to the science of the mind. New York: Cambridge University Press.
Best, A., Warta, S. F., Kapalo, K. A., & Fiore, S. M. (2016). Of mental states and machine learning: How social cues and signals can help develop artificial social intelligence. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 60(1), 1362–1366.
Billard, A. (2002). Imitation: A means to enhance learning of a synthetic protolanguage in autonomous robots. In K. Dautenhahn & C. L. Nehaniv (Eds.), Imitation in animals and artifacts (pp. 281–310). Cambridge, MA: MIT Press.
Blakemore, S. J., & Decety, J. (2001). From the perception of action to the understanding of intention. Nature Reviews Neuroscience, 2(8), 561–567.
Bodenhausen, G. V., & Todd, A. R. (2010). Social cognition. Wiley Interdisciplinary Reviews: Cognitive Science, 1(2), 160–171.
Bohl, V. (2015). We read minds to shape relationships. Philosophical Psychology, 28(5), 674–694.
Bohl, V., & van den Bos, W. (2012). Toward an integrative account of social cognition: Marrying theory of mind and interactionism to study the interplay of Type 1 and Type 2 processes. Frontiers in Human Neuroscience, 6, 1–15. http://dx.doi.org/10.3389/fnhum.2012.00274.
Bradshaw, J. M., Feltovich, P. J., Johnson, M. J., Breedy, M., Bunch, L., Eskridge, T. C., ... van Diggelen, J. (2009). From tools to teammates: Joint activity in human-agent-robot teams. In Human centered design (pp. 935–944). Berlin, Heidelberg: Springer.
Bradshaw, J. M., Feltovich, P. J., Johnson, M., Bunch, L., Breedy, M., Eskridge, T., ... Uszok, A. (2008). Coordination in human-agent-robot teamwork. In International symposium on collaborative technologies and systems (pp. 467–476). IEEE.
Breazeal, C. (2004). Social interactions in HRI: The robot view. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 34(2), 181–186.
Breazeal, C., Gray, J., & Berlin, M. (2009). An embodied cognition approach to mindreading skills for socially intelligent robots. The International Journal of Robotics Research, 28(5), 656–680.
Breazeal, C., & Scassellati, B. (2002). Robots that imitate humans. Trends in Cognitive Sciences, 6(11), 481–487.
Brooks, R. A. (1999). Cambrian intelligence: The early history of the new AI. Cambridge, MA: MIT Press.
Chaiken, S., & Trope, Y. (1999). Dual-process theories in social psychology. New York: Guilford Press.
Chaminade, T., & Cheng, G. (2009). Social cognitive neuroscience and humanoid robotics. Journal of Physiology-Paris, 103(3), 286–295.
Chemero, A. (2003). An outline of a theory of affordances. Ecological Psychology, 15(2), 181–195.
Chemero, A. (2009). Radical embodied cognitive science. Cambridge, MA: MIT Press.
Chemero, A. (2013). Radical embodied cognitive science. Review of General Psychology, 17(2), 145–150.
Chemero, A., & Turvey, M. T. (2007). Gibsonian affordances for roboticists. Adaptive Behavior, 15(4), 473–480.
Christensen, H., Batzinger, T., Bekris, K., Bohringer, K., Bordogna, J., Bradski, G., ... Zhang, M. (2013). A roadmap for US robotics: From internet to robotics. Washington, DC, US: Computing Community Consortium and Computing Research Association.
Coey, C. A., Varlet, M., & Richardson, M. J. (2012). Coordination dynamics in a socially situated nervous system. Frontiers in Human Neuroscience, 6, 1–12. http://dx.doi.org/10.3389/fnhum.2012.00164.
Dale, R. (2008). The possibility of a pluralist cognitive science. Journal of Experimental and Theoretical Artificial Intelligence, 20(3), 155–179.
Dale, R. (2010). Review of radical embodied cognitive science. Journal of Mind and Behavior, 31(1–2), 127–140.
Dale, R., Dietrich, E., & Chemero, A. (2009). Explanatory pluralism in cognitive science. Cognitive Science, 33(5), 739–742.
Damasio, A. R. (1989). Time-locked multiregional retroactivation: A systems-level proposal for the neural substrates of recall and recognition. Cognition, 33(1), 25–62.
Dautenhahn, K., Ogden, B., & Quick, T. (2002). From embodied to socially embedded agents – Implications for interaction-aware robots. Cognitive Systems Research, 3(3), 397–428.
De Jaegher, H. (2009). Social understanding through direct perception? Yes, by interacting. Consciousness and Cognition, 18(2), 535–542.
De Jaegher, H., & Di Paolo, E. (2007). Participatory sense-making. Phenomenology and the Cognitive Sciences, 6(4), 485–507.
De Jaegher, H., Di Paolo, E. D., & Gallagher, S. (2010). Can social interaction constitute social cognition? Trends in Cognitive Science, 14(10), 441–447.
Decety, J., & Grèzes, J. (2006). The power of simulation: Imagining one's own and other's behavior. Brain Research, 1079(1), 4–14.
Delaherche, E., Chetouani, M., Mahdhaoui, A., Saint-Georges, C., Viaux, S., & Cohen, D. (2012). Interpersonal synchrony: A survey of evaluation methods across disciplines. IEEE Transactions on Affective Computing, 3(3), 349–365.
DeSteno, D., Breazeal, C., Frank, R. H., Pizarro, D., Baumann, J., Dickens, L., & Lee, J. J. (2012). Detecting the trustworthiness of novel partners in economic exchange. Psychological Science, 23(12), 1549–1556.
Di Paolo, E., & De Jaegher, H. (2012). The interactive brain hypothesis. Frontiers in Human Neuroscience, 6, 1–16. http://dx.doi.org/10.3389/fnhum.2012.00163.
di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: A neurophysiological study. Experimental Brain Research, 91(1), 176–180.
Duchon, A. P., Kaelbling, L. P., & Warren, W. H. (1998). Ecological robotics. Adaptive Behavior, 6(3–4), 473–507.
Dunbar, R. I. M. (1998). The social brain hypothesis. Evolutionary Anthropology, 6, 178–190.
Eiler, B. A., Kallen, R. W., Harrison, S. J., & Richardson, M. J. (2013). Origins of order in joint activity and social behavior. Ecological Psychology, 25(3), 316–326.
Elias, J., & Fiore, S. M. (2008, May). From psychology, to neuroscience, to robots: An interdisciplinary approach to bio-inspired robotics. Paper presented at the 20th annual convention of the American Psychological Society, Chicago, IL.
Fiore, S. M., Badler, N. L., Boloni, L., Goodrich, M. A., Wu, A. S., & Chen, J. (2011). Human-robot teams collaborating socially, organizationally, and culturally. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 55(1), 465–469. http://dx.doi.org/10.1177/1071181311551096.
Fiore, S. M., Elias, J., Gallagher, S., & Jentsch, F. (2008). Cognition and coordination: Applying cognitive science to understand macrocognition in human-agent teams. In Proceedings of the 8th annual symposium on human interaction with complex systems. Norfolk, VA.
Fiore, S. M., Wiltshire, T. J., Lobato, E. J. C., Jentsch, F. G., Huang, W. H., & Axelrod, B. (2013). Toward understanding social cues and signals in human-robot interaction: Effects of robot gaze and proxemics behavior. Frontiers in Psychology, 4, 1–15. http://dx.doi.org/10.3389/fpsyg.2013.00859.
Fiske, A. P., & Haslam, N. (1996). Social cognition is thinking about relationships. Current Directions in Psychological Science, 5(5), 143–148.
Fong, T., Nourbakhsh, I., & Dautenhahn, K. (2003). A survey of socially interactive robots. Robotics and Autonomous Systems, 42(3), 143–166.
Franklin, S., Strain, S., McCall, R., & Baars, B. (2013). Conceptual commitments of the LIDA model of cognition. Journal of Artificial General Intelligence, 4(2), 1–22.
Frith, C. D., & Frith, U. (2007). Social cognition in humans. Current Biology, 17(16), R724–R732.
Frith, C. D., & Frith, U. (2008). Implicit and explicit processes in social cognition. Neuron, 60(3), 503–510.
Frith, C. D., & Frith, U. (2012). Mechanisms of social cognition. Annual Review of Psychology, 63, 287–313.
Gallagher, S. (2007). Social cognition and social robots. Pragmatics & Cognition, 15(3), 435–453.
Gallagher, S. (2008). Direct perception in the intersubjective context. Consciousness and Cognition, 17(2), 535–543.
Gallagher, S. (2013). You and I, robot. AI & Society, 28(4), 455–460.
Gallese, V., & Goldman, A. (1998). Mirror neurons and the simulation theory of mind-reading. Trends in Cognitive Sciences, 2(12), 493–501.
Gangopadhyay, N., & Schilbach, L. (2011). Seeing minds: A neurophilosophical investigation of the role of perception-action coupling in social perception. Social Neuroscience, 7(4), 410–423.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.
Goldman, A. I. (2006). Simulating minds: The philosophy, psychology, and neuroscience of mindreading. Oxford, England: Oxford University Press.
Gopnik, A., & Wellman, H. M. (1992). Why the child's theory of mind really is a theory. Mind & Language, 7(1–2), 145–171.
Hafner, V. V., & Kaplan, F. (2008). Interpersonal maps: How to map affordances for interaction behaviour. In Towards affordance-based robot control (pp. 1–15). Berlin, Heidelberg: Springer.
Haken, H., Kelso, J. A. S., & Bunz, H. (1985). A theoretical model of phase transitions in human hand movements. Biological Cybernetics, 51(5), 347–356.
Hancock, P. A., Billings, D. R., Schaefer, K. E., Chen, J. Y., De Visser, E. J., & Parasuraman, R. (2011). A meta-analysis of factors affecting trust in human-robot interaction. Human Factors: The Journal of the Human Factors and Ergonomics Society, 53(5), 517–527.
Hoffman, G. (2012). Embodied cognition for autonomous interactive robots. Topics in Cognitive Science, 4(4), 759–772.
Hoffman, G., & Breazeal, C. (2004). Collaboration in human-robot teams. In Proceedings of the AIAA 1st intelligent systems technical conference (pp. 1–18). Chicago, IL: AIAA.
Hoffman, G., & Breazeal, C. (2010). Effects of anticipatory perceptual simulation on practiced human-robot tasks. Autonomous Robots, 28(4), 403–423.
Horton, T. E., Chakraborty, A., & Amant, R. S. (2012). Affordances for robots: A brief survey. AVANT, 3(2), 70–84.
Ito, M., Noda, K., Hoshino, Y., & Tani, J. (2006). Dynamic and interactive generation of object handling behaviors by a small humanoid robot using a dynamic neural network model. Neural Networks, 19(3), 323–337.
Ito, M., & Tani, J. (2004). On-line imitative interaction with a humanoid robot using a dynamic neural network model of a mirror system. Adaptive Behavior, 12(2), 93–115.
Johnson, M., & Demiris, Y. (2005). Perceptual perspective taking and action recognition. International Journal of Advanced Robotic Systems, 2(4), 301–308.
Kelso, J. A. S., de Guzman, G. C., Reveley, C., & Tognoli, E. (2009). Virtual partner interaction (VPI): Exploring novel behaviors via coordination dynamics. PLoS One, 4(6). http://dx.doi.org/10.1371/journal.pone.0005749.
Klein, G., Woods, D. D., Bradshaw, J. M., Hoffman, R. R., & Feltovich, P. J. (2004). Ten challenges for making automation a "team player" in joint human-agent activity. IEEE Intelligent Systems, 19(6), 91–95.
Knoblich, G., Butterfill, S., & Sebanz, N. (2011). Psychological research on joint action: Theory and data. In The psychology of learning and motivation: Advances in research and theory (pp. 59–101). San Diego, CA: Elsevier Academic Press.
Knoblich, G., & Sebanz, N. (2006). The social nature of perception and action. Current Directions in Psychological Science, 15(3), 99–104.
Kono, T. (2009). Social affordances and the possibility of ecological linguistics. Integrative Psychological and Behavioral Science, 43(4), 356–373.
Kourtis, D., Sebanz, N., & Knoblich, G. (2010). Favoritism in the motor system: Social interaction modulates action simulation. Biology Letters, 6(6), 758–761.
Lackey, S., Barber, D., Reinerman, L., Badler, N. I., & Hudson, I. (2011). Defining next-generation multi-modal communication in human-robot interaction. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 55(1), 461–464. http://dx.doi.org/10.1177/1071181311551095.
Laird, J. E., Newell, A., & Rosenbloom, P. S. (1987). Soar: An architecture for general intelligence. Artificial Intelligence, 33(1), 1–64.
Lakoff, G., & Johnson, M. (1980). Conceptual metaphor in everyday language. The Journal of Philosophy, 77(8), 453–486.
Lallee, S., & Dominey, P. F. (2013). Multi-modal convergence maps: From body schema and self-representation to mental imagery. Adaptive Behavior, 21(4), 274–285.
Langley, P., Laird, J. E., & Rogers, S. (2009). Cognitive architectures: Research issues and challenges. Cognitive Systems Research, 10(2), 141–160.
Levinson, S. C. (2006). On the human 'Interaction Engine'. In N. J. Enfield & S. C. Levinson (Eds.), Roots of human sociality: Culture, cognition and interaction (pp. 39–69). Oxford, UK: Berg.
Lindblom, J., & Andreasson, R. (2016). Current challenges for UX evaluation of human-robot interaction. In Advances in ergonomics of manufacturing: Managing the enterprise of the future (pp. 267–277).
Lobato, E. J. C., Warta, S. F., Wiltshire, T. J., & Fiore, S. M. (2015). Varying social cue constellations results in different attributed social signals in a simulated surveillance task. In Proceedings of the twenty-eighth international Florida artificial intelligence research society conference (pp. 61–66). Hollywood, FL: AAAI.
Lobato, E. J., Wiltshire, T. J., Hudak, S., & Fiore, S. M. (2014). No time, no problem: Mental state attributions made quickly or after reflection do not differ. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 58(1), 1341–1345. SAGE Publications.
Macrae, C. N., & Bodenhausen, G. V. (2000). Social cognition: Thinking categorically about others. Annual Review of Psychology, 51(1), 93–120.
Marsh, K. L., Richardson, M. J., & Schmidt, R. C. (2009). Social connection through joint action and interpersonal coordination. Topics in Cognitive Science, 1(2), 320–339.
McCabe, K., Houser, D., Ryan, L., Smith, V., & Trouard, T. (2001). A functional imaging study of cooperation in two-person reciprocal exchange. Proceedings of the National Academy of Sciences, 98(20), 11832–11835. http://dx.doi.org/10.1073/pnas.211415698.
Meltzoff, A. N., & Decety, J. (2003). What imitation tells us about social cognition: A rapprochement between developmental psychology and cognitive neuroscience. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 358(1431), 491–500.
Murphy, R. R. (2000). Introduction to AI robotics. Cambridge, MA: MIT Press.
Neisser, U. (1976). Cognition and reality: Principles and implications of cognitive psychology. New York, NY: W. H. Freeman/Times Books/Henry Holt & Co.
Nilsson, N. J. (1980). Principles of artificial intelligence. San Francisco, CA: Morgan Kaufmann Publishers, Inc.
Noton, D., & Stark, L. (1971). Scanpaths in eye movements during pattern perception. Science, 171(3968), 308–311.
Pandey, A. K., & Alami, R. (2012, July). Visuo-spatial ability, effort and affordance analyses: Towards building blocks for robot's complex socio-cognitive behaviors. In Workshops at the twenty-sixth AAAI conference on artificial intelligence. <http://www.aaai.org/ocs/index.php/WS/AAAIW12/paper/view/5270>.
Pezzulo, G. (2012). The "Interaction Engine": A common pragmatic competence across linguistic and nonlinguistic interactions. IEEE Transactions on Autonomous Mental Development, 4(2), 105–123.
Pezzulo, G., Barsalou, L. W., Cangelosi, A., Fischer, M. H., McRae, K., & Spivey, M. (2013). Computational Grounded Cognition: A new alliance between grounded cognition and computational modeling. Frontiers in Psychology, 3, 1–11. http://dx.doi.org/10.3389/fpsyg.2012.00612.
Pfeifer, R., Lungarella, M., & Iida, F. (2007). Self-organization, embodiment, and biologically inspired robotics. Science, 318(5853), 1088–1093.
Pfeifer, R., & Scheier, C. (1999). Understanding intelligence. Cambridge, MA: MIT Press.
Phillips, E., Ososky, S., Grove, J., & Jentsch, F. (2011). From tools to teammates: Toward the development of appropriate mental models for intelligent robots. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 55(1), 1491–1495. http://dx.doi.org/10.1177/1071181311551310.
Premack, D., & Woodruff, G. (1978). Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences, 1(4), 515–526.
Przyrembel, M., Smallwood, J., Pauen, M., & Singer, T. (2012). Illuminating the dark matter of social neuroscience: Considering the problem of social interaction from philosophical, psychological, and neuroscientific perspectives. Frontiers in Human Neuroscience, 6, 1–15. http://dx.doi.org/10.3389/fnhum.2012.00190.
Ramamurthy, U., Baars, B. J., D'Mello, S. K., & Franklin, S. (2006). LIDA: A working model of cognition. In Proceedings of the 7th international conference on cognitive modeling (pp. 244–249). Trieste: Edizioni Goliardiche.
Şahin, E., Çakmak, M., Doğar, M. R., Uğur, E., & Üçoluk, G. (2007). To afford or not to afford: A new formalization of affordances toward affordance-based robot control. Adaptive Behavior, 15(4), 447–472.
Sartori, L., Becchio, C., & Castiello, U. (2011). Cues to intention: The role of movement information. Cognition, 119(2), 242–252.
Satpute, A. B., & Lieberman, M. D. (2006). Integrating automatic and controlled processes into neurocognitive models of social cognition. Brain Research, 1079(1), 86–97.
Schilbach, L. (2014). On the relationship of online and offline social cognition. Frontiers in Human Neuroscience, 8, 1–8. http://dx.doi.org/10.3389/fnhum.2014.00278.
Schilbach, L., Timmermans, B., Reddy, V., Costall, A., Bente, G., Schlicht, T., & Vogeley, K. (2013). Toward a second-person neuroscience. Behavioral and Brain Sciences, 36(4), 393–414.
Schütz-Bosbach, S., & Prinz, W. (2007). Perceptual resonance: Action-induced modulation of perception. Trends in Cognitive Sciences, 11(8), 349–355.
Sheridan, T. B. (2016). Human–robot interaction: Status and challenges. Human Factors: The Journal of the Human Factors and Ergonomics Society, 58(4), 525–532.
Simmons, W. K., & Barsalou, L. W. (2003). The similarity-in-topography principle: Reconciling theories of conceptual deficits. Cognitive Neuropsychology, 20(3–6), 451–486.
Streater, J. P., Bockelman Morrow, P., & Fiore, S. M. (2012, October). Making things that understand people: The beginnings of an interdisciplinary approach for engineering computational social intelligence. Paper presented at the 56th annual meeting of the Human Factors and Ergonomics Society, Boston, MA.
Treur, J. (2013). An integrative dynamical systems perspective on emotions. Biologically Inspired Cognitive Architectures, 4, 27–40.
Tsarouchi, P., Makris, S., & Chryssolouris, G. (2016). Human–robot interaction review and challenges on task planning and programming. International Journal of Computer Integrated Manufacturing, 29(8), 916–931.
Tylén, K., Allen, M., Hunter, B. K., & Roepstorff, A. (2012). Interaction vs. observation: Distinctive modes of social cognition in human brain and behavior? A combined fMRI and eye-tracking study. Frontiers in Human Neuroscience, 6, 1–11. http://dx.doi.org/10.3389/fnhum.2012.00331.
Uyanik, K. F., Caliskan, Y., Bozcuoglu, A. K., Yuruten, O., Kalkan, S., & Sahin, E. (2013). Learning social affordances and using them for planning. In Proceedings of the cognitive science society annual meeting (pp. 3604–3609). Berlin, Germany: Cognitive Science Society.
Van Overwalle, F. (2009). Social cognition and the brain: A meta-analysis. Human Brain Mapping, 30(3), 829–858.
Varela, F. J., Thompson, E., & Rosch, E. (1991). The embodied mind: Cognitive science and human experience. Cambridge, MA: MIT Press.
Vernon, D. (2010). Enaction as a conceptual framework for developmental cognitive robotics. Paladyn, Journal of Behavioral Robotics, 1(2), 89–98.
Vernon, D. (2014). Artificial cognitive systems: A primer. MIT Press.
Vinciarelli, A., Esposito, A., André, E., Bonin, F., Chetouani, M., Cohn, J. F., ... Heylen, D. (2015). Open challenges in modelling, analysis and synthesis of human behaviour in human–human and human–machine interactions. Cognitive Computation, 7(4), 397–413.
Vinciarelli, A., Pantic, M., Heylen, D., Pelachaud, C., Poggi, I., D'Errico, F., & Schröder, M. (2012). Bridging the gap between social animal and unsocial machine: A survey of social signal processing. IEEE Transactions on Affective Computing, 3(1), 69–87.
Wallach, W., & Allen, C. (2008). Moral machines: Teaching robots right from wrong. New York, NY: Oxford University Press.
Wang, D. (2003). Visual scene segmentation. In M. A. Arbib (Ed.), The handbook of brain theory and neural networks (pp. 1215–1219). Cambridge, MA: MIT Press.
Warta, S. F. (2015). If a robot did "The Robot", would it still be called "The Robot" or just dancing? Perceptual and social factors in human-robot interactions. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 59(1), 796–800. SAGE Publications.
Warta, S. F., Kapalo, K. A., Best, A., & Fiore, S. M. (2016). Similarity, complementarity, and agency in HRI: Theoretical issues in shifting the perception of robots from tools to teammates. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 60(1), 1230–1234.
Wilkinson, M. R., & Ball, L. J. (2013). Dual processes in mental state understanding: Is theorising synonymous with intuitive thinking and is simulation synonymous with reflective thinking? In Proceedings of the 35th annual conference of the cognitive science society (pp. 3771–3776). Austin, TX: Cognitive Science Society.
Wilson, M. (2002). Six views of embodied cognition. Psychonomic Bulletin & Review, 9(4), 625–636.
Wiltshire, T. J. (2015). A prospective framework for the design of ideal artificial moral agents: Insights from the science of heroism in humans. Minds and Machines, 25(1), 57–71.
Wiltshire, T. J., Barber, D., & Fiore, S. M. (2013). Towards modeling social-cognitive mechanisms in robots to facilitate human-robot teaming. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 57(1), 1278–1282. http://dx.doi.org/10.1177/1541931213571283.
Wiltshire, T. J., & Fiore, S. M. (2014). Social cognitive and affective neuroscience in human-machine systems: A roadmap for improving training, human-robot interaction, and team performance. IEEE Transactions on Human-Machine Systems, 44(6), 779–787.
Wiltshire, T. J., Lobato, E. J. C., Jentsch, F. G., & Fiore, S. M. (2013). Will (dis)embodied LIDA agents be socially interactive? A commentary on the target article entitled "Conceptual commitments of the LIDA model of cognition". Journal of Artificial General Intelligence, 4(2), 23–58.
Wiltshire, T. J., Lobato, E. J., McConnell, D. S., & Fiore, S. M. (2015). Prospects for direct social perception: A multi-theoretical integration to further the science of social cognition. Frontiers in Human Neuroscience, 8, 1–22. http://dx.doi.org/10.3389/fnhum.2014.01007.
Wiltshire, T. J., Lobato, E. J. C., Wedell, A. V., Huang, W., Axelrod, B., & Fiore, S. M. (2013). Effects of robot gaze and proxemic behavior on perceived social presence during a hallway navigation scenario. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 57(1), 1273–1277. http://dx.doi.org/10.1177/1541931213571282.
Wiltshire, T. J., Smith, D. C., & Keebler, J. R. (2013). Cybernetic teams: Towards the implementation of team heuristics in HRI. In Virtual augmented and mixed reality: Designing and developing augmented and virtual environments. Lecture notes in computer science (Vol. 8021, pp. 321–330). Berlin, Heidelberg: Springer.
Wiltshire, T. J., Snow, S. L., Lobato, E. J., & Fiore, S. M. (2014). Leveraging social judgment theory to examine the relationship between social cues and signals in human-robot interactions. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 58(1), 1336–1340. http://dx.doi.org/10.1177/1541931214581279.