
DOI: 10.1007/978-3-030-77817-0_17
Article

The Role of Embodiment and Simulation in Evaluating HCI: Experiments and Evaluation

Published: 24 July 2021

Abstract

In this paper series, we argue for the role that embodiment plays in the evaluation of systems developed for Human-Computer Interaction. We use a simulation platform, VoxWorld, for building Embodied Human Computer Interactions (EHCI). VoxWorld enables multimodal dialogue systems that communicate through language, gesture, action, facial expressions, and gaze tracking in the context of task-oriented interactions. A multimodal simulation is an embodied 3D virtual realization of both the situational environment and the co-situated agents, as well as of the most salient content denoted by communicative acts in a discourse. It is built on the modeling language VoxML, which encodes objects with rich semantic typing and action affordances, and actions themselves as multimodal programs, enabling contextually salient inferences and decisions in the environment. Through simulation experiments in VoxWorld, we can begin to identify and then evaluate the diverse parameters involved in multimodal communication between agents. In this second part of the series, we discuss the consequences of embodiment and common ground and how they help evaluate parameters of the interaction between humans and agents, and we compare and contrast the evaluation schemes enabled by different levels of embodied interaction.
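Since the abstract describes VoxML as encoding objects with rich semantic typing and action affordances, a small data-structure sketch may help make the idea concrete. The Python below is a hypothetical illustration only, not actual VoxML syntax: the names VoxObject and Affordance and all example attribute values are invented here for exposition.

from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch of a VoxML-style object entry (not real VoxML syntax).
# Each affordance pairs a conditioning event over the object with the
# behavior that event enables.
@dataclass
class Affordance:
    condition: str   # an event involving the object, bound to "this"
    outcome: str     # the behavior the event affords

@dataclass
class VoxObject:
    lex: str                        # lexical anchor of the entry
    sem_type: str                   # rich semantic type
    habitat: str                    # placement/orientation constraint
    affordances: List[Affordance] = field(default_factory=list)

# Illustrative entry for a cup: graspable, and a container when upright.
cup = VoxObject(
    lex="cup",
    sem_type="physobj*container",
    habitat="upright(align(y_axis))",
    affordances=[
        Affordance("grasp(agent, this)", "hold(agent, this)"),
        Affordance("put(x, in(this))", "contain(this, x)"),
    ],
)

Actual VoxML entries are richer than this sketch; in particular, they separate habitats (the environments that condition an affordance) from the affordance structure itself, and they encode actions as multimodal programs rather than as the bare strings used above.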


Cited By

• Multimodal Semantics for Affordances and Actions. In: Human-Computer Interaction. Theoretical Approaches and Design Methods, pp. 137–160 (2022). DOI: 10.1007/978-3-031-05311-5_9. Online publication date: 26 June 2022.


Published In
          Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management. Human Body, Motion and Behavior: 12th International Conference, DHM 2021, Held as Part of the 23rd HCI International Conference, HCII 2021, Virtual Event, July 24–29, 2021, Proceedings, Part I
          Jul 2021
          392 pages
ISBN: 978-3-030-77816-3
DOI: 10.1007/978-3-030-77817-0

          Publisher

          Springer-Verlag

          Berlin, Heidelberg

          Publication History

          Published: 24 July 2021

          Author Tags

          1. Embodiment
          2. HCI
          3. Common ground
          4. Multimodal dialogue
          5. VoxML

          Qualifiers

          • Article

