This paper presents multimodal interfaces for tasking and controlling multiple robots managed by an agent-based architecture. For the past few years, SRI International has followed an approach based on the “Coach Metaphor”. In sports or business, coaches are expected to apply predefined strategies to their teams or, if something goes wrong, to devise new means and plans during an ongoing game, so as to retask either the entire team or specific players. This is also the challenge facing a robot's operator. SRI's agent-based framework, the Open Agent Architecture™ (OAA), provides communication between the members of a team and the external world. The coach, or robot operator, who is an active member of the team, is provided with a multimodal interface that uses pen and voice. The analogy of a coach talking and drawing on a white clipboard, which represents the virtual world where the players are developing their game, reinforces the metaphor. We present several interfaces developed specifically for SRI's robots, and we show an example (controlling robots on a soccer field) where the metaphor matches the real world one to one. To clarify our views, we give an overview of the technologies in use, such as the agent architecture, the speech and gesture recognizers, and the robot controller.
© 1998 Springer-Verlag Berlin Heidelberg
Cite this paper
Julia, L. (1998). Tasking robots through multimodal interfaces: The “Coach Metaphor”. In: Drogoul, A., Tambe, M., Fukuda, T. (eds) Collective Robotics. CRW 1998. Lecture Notes in Computer Science, vol 1456. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0033372
DOI: https://doi.org/10.1007/BFb0033372
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64768-3
Online ISBN: 978-3-540-68723-8
eBook Packages: Springer Book Archive