
DOI: 10.1007/978-3-030-77817-0_17
Article

The Role of Embodiment and Simulation in Evaluating HCI: Experiments and Evaluation

Published: 24 July 2021

Abstract

In this paper series, we argue for the role that embodiment plays in the evaluation of systems developed for Human-Computer Interaction. We use a simulation platform, VoxWorld, for building Embodied Human Computer Interactions (EHCI). VoxWorld enables multimodal dialogue systems that communicate through language, gesture, action, facial expressions, and gaze tracking in the context of task-oriented interactions. A multimodal simulation is an embodied 3D virtual realization of both the situational environment and the co-situated agents, as well as of the most salient content denoted by communicative acts in a discourse. It is built on the modeling language VoxML, which encodes objects with rich semantic typing and action affordances, and actions themselves as multimodal programs, enabling contextually salient inferences and decisions in the environment. Through simulation experiments in VoxWorld, we can begin to identify and then evaluate the diverse parameters involved in multimodal communication between agents. In this second part of the series, we discuss the consequences of embodiment and common ground and how they help evaluate parameters of the interaction between humans and agents, and we compare and contrast the evaluation schemes enabled by different levels of embodied interaction.
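Since the abstract describes VoxML as encoding objects with rich semantic typing and action affordances, a small data-structure sketch may help make the idea concrete. The Python below is a hypothetical illustration only, not actual VoxML syntax: the names VoxObject and Affordance and all example attribute values are invented here for exposition.

from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch of a VoxML-style object entry (not real VoxML syntax).
# Each affordance pairs a conditioning event over the object with the
# behavior that event enables.
@dataclass
class Affordance:
    condition: str   # an event involving the object, bound to "this"
    outcome: str     # the behavior the event affords

@dataclass
class VoxObject:
    lex: str                        # lexical anchor of the entry
    sem_type: str                   # rich semantic type
    habitat: str                    # placement/orientation constraint
    affordances: List[Affordance] = field(default_factory=list)

# Illustrative entry for a cup: graspable, and a container when upright.
cup = VoxObject(
    lex="cup",
    sem_type="physobj*container",
    habitat="upright(align(y_axis))",
    affordances=[
        Affordance("grasp(agent, this)", "hold(agent, this)"),
        Affordance("put(x, in(this))", "contain(this, x)"),
    ],
)

Actual VoxML entries are richer than this sketch; in particular, they separate habitats (the environments that condition an affordance) from the affordance structure itself, and they encode actions as multimodal programs rather than as the bare strings used above.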


Cited By

• Multimodal Semantics for Affordances and Actions. In: Human-Computer Interaction. Theoretical Approaches and Design Methods, pp. 137–160 (2022). DOI: 10.1007/978-3-031-05311-5_9. Online publication date: 26 June 2022.


Published In
          Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management. Human Body, Motion and Behavior: 12th International Conference, DHM 2021, Held as Part of the 23rd HCI International Conference, HCII 2021, Virtual Event, July 24–29, 2021, Proceedings, Part I
          Jul 2021
          392 pages
ISBN: 978-3-030-77816-3
DOI: 10.1007/978-3-030-77817-0

          Publisher

          Springer-Verlag

          Berlin, Heidelberg

          Publication History

          Published: 24 July 2021

          Author Tags

          1. Embodiment
          2. HCI
          3. Common ground
          4. Multimodal dialogue
          5. VoxML

          Qualifiers

          • Article

