DOI: 10.1145/958432.958438
Article

Mutual disambiguation of 3D multimodal interaction in augmented and virtual reality

Published: 05 November 2003

Abstract

We describe an approach to 3D multimodal interaction in immersive augmented and virtual reality environments that accounts for the uncertain nature of the information sources. The resulting multimodal system fuses symbolic and statistical information from a set of 3D gesture, spoken language, and referential agents. The referential agents employ visible or invisible volumes that can be attached to 3D trackers in the environment, and which use a time-stamped history of the objects that intersect them to derive statistics for ranking potential referents. We discuss the means by which the system supports mutual disambiguation of these modalities and information sources, and show through a user study how mutual disambiguation accounts for over 45% of the successful 3D multimodal interpretations. An accompanying video demonstrates the system in action.
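The two mechanisms the abstract describes, ranking potential referents from a time-stamped history of objects intersecting a selection volume, and mutually disambiguating that ranking against an uncertain speech hypothesis list, can be sketched in a few lines. This is an illustrative assumption, not the authors' implementation: the class and function names, the time window, and the multiplicative fusion rule are hypothetical stand-ins for the paper's statistical ranking and symbolic/statistical fusion.

```python
from dataclasses import dataclass, field
from collections import defaultdict

@dataclass
class IntersectionHistory:
    """Hypothetical time-stamped record of objects intersecting a
    selection volume attached to a 3D tracker."""
    events: list = field(default_factory=list)  # (timestamp, object_id)

    def add(self, t, obj):
        self.events.append((t, obj))

    def rank(self, t_now, window=2.0):
        """Score each object by how often it appeared in the volume
        within a recent time window, normalized to sum to 1."""
        scores = defaultdict(float)
        for t, obj in self.events:
            if t_now - t <= window:
                scores[obj] += 1.0
        total = sum(scores.values()) or 1.0
        return {obj: s / total for obj, s in scores.items()}

def fuse(speech_nbest, gesture_scores):
    """Mutual disambiguation sketch: combine ranked hypotheses from
    two uncertain sources, so a referent ranked low by one modality
    can still win if the other modality supports it strongly."""
    fused = {obj: p * gesture_scores.get(obj, 0.0)
             for obj, p in speech_nbest.items()}
    # Fall back to the speech ranking if the volume saw no candidate.
    if not any(fused.values()):
        return max(speech_nbest, key=speech_nbest.get)
    return max(fused, key=fused.get)
```

In this sketch, a speech recognizer's n-best list that slightly prefers a wrong referent is overridden when the gesture volume's intersection history strongly favors another candidate, which is the kind of error recovery the reported 45% figure measures.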





Published In

ICMI '03: Proceedings of the 5th international conference on Multimodal interfaces
November 2003
318 pages
ISBN:1581136218
DOI:10.1145/958432
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. augmented/virtual reality
  2. evaluation
  3. multimodal interaction

Qualifiers

  • Article

Conference

ICMI-PUI03
Sponsor:
ICMI-PUI03: International Conference on Multimodal User Interfaces
November 5 - 7, 2003
Vancouver, British Columbia, Canada

Acceptance Rates

ICMI '03 Paper Acceptance Rate: 45 of 130 submissions, 35%
Overall Acceptance Rate: 453 of 1,080 submissions, 42%


Article Metrics

  • Downloads (Last 12 months): 90
  • Downloads (Last 6 weeks): 11
Reflects downloads up to 14 Feb 2025


Cited By

  • (2024) Investigating the Effects of Avatarization and Interaction Techniques on Near-field Mixed Reality Interactions with Physical Components. IEEE Transactions on Visualization and Computer Graphics, 30(5), 2756–2766. DOI: 10.1109/TVCG.2024.3372050. Online publication date: 4 Mar 2024.
  • (2024) Personalized decision-making for agents in face-to-face interaction in virtual reality. Multimedia Systems, 31(1). DOI: 10.1007/s00530-024-01591-7. Online publication date: 24 Dec 2024.
  • (2024) Neuro-Symbolic Reasoning for Multimodal Referring Expression Comprehension in HMI Systems. New Generation Computing, 42(4), 579–598. DOI: 10.1007/s00354-024-00243-8. Online publication date: 1 Nov 2024.
  • (2023) Recording multimodal pair-programming dialogue for reference resolution by conversational agents. Proceedings of the 25th International Conference on Multimodal Interaction, 731–735. DOI: 10.1145/3577190.3614231. Online publication date: 9 Oct 2023.
  • (2023) A Human-Computer Collaborative Editing Tool for Conceptual Diagrams. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1–29. DOI: 10.1145/3544548.3580676. Online publication date: 19 Apr 2023.
  • (2023) Give Me a Hand: Improving the Effectiveness of Near-field Augmented Reality Interactions By Avatarizing Users' End Effectors. IEEE Transactions on Visualization and Computer Graphics, 29(5), 2412–2422. DOI: 10.1109/TVCG.2023.3247105. Online publication date: 1 May 2023.
  • (2023) MEinVR: Multimodal interaction techniques in immersive exploration. Visual Informatics, 7(3), 37–48. DOI: 10.1016/j.visinf.2023.06.001. Online publication date: Sep 2023.
  • (2023) Mixed Reality Interaction Techniques. Springer Handbook of Augmented Reality, 109–129. DOI: 10.1007/978-3-030-67822-7_5. Online publication date: 1 Jan 2023.
  • (2022) Research on Equipment and Algorithm of a Multimodal Perception Gameplay Virtual and Real Fusion Intelligent Experiment. Applied Sciences, 12(23), 12184. DOI: 10.3390/app122312184. Online publication date: 28 Nov 2022.
  • (2022) MoonBuddy: A Voice-based Augmented Reality User Interface That Supports Astronauts During Extravehicular Activities. Adjunct Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology, 1–4. DOI: 10.1145/3526114.3558690. Online publication date: 29 Oct 2022.
