DOI: 10.1145/1452392.1452439

Robust gesture processing for multimodal interaction

Published: 20 October 2008

Abstract

With the explosive growth in mobile computing and communication over the past few years, it is possible to access almost any information from virtually anywhere. However, the efficiency and effectiveness of this interaction are severely limited by the inherent characteristics of mobile devices, including small screen size and the lack of a viable keyboard or mouse. This paper concerns the use of multimodal language processing techniques to enable interfaces combining speech and gesture input that overcome these limitations. Specifically, we focus on robust processing of pen gesture inputs in a local search application and demonstrate that edit-based techniques that have proven effective in spoken language processing can also be used to overcome unexpected or errorful gesture input. We also examine the use of a bottom-up gesture aggregation technique to improve the coverage of multimodal understanding.
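The abstract names two techniques: edit-based robust processing of pen gesture input, and bottom-up gesture aggregation. As a rough illustration of the first idea, the sketch below maps a rejected gesture symbol sequence to the nearest sequence the grammar accepts under Levenshtein edit distance. This is a minimal sketch under stated assumptions, not the paper's implementation: the paper's approach is built on finite-state methods (see the author tags), whereas this version simply enumerates a small hypothetical catalogue of in-grammar sequences, and the symbol names (`G`, `area`, `sel`, `rest`) are invented for illustration.

```python
# A minimal sketch (assumptions, not the authors' implementation) of
# edit-based recovery for gesture streams: when the multimodal grammar
# rejects a recognized sequence of gesture symbols, map it to the
# in-grammar sequence reachable with the fewest insertions, deletions,
# and substitutions. The symbol vocabulary below is illustrative.

def edit_distance(a, b):
    """Levenshtein distance over two symbol sequences."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n]

def closest_in_grammar(recognized, grammar):
    """Pick the grammar sequence closest to the recognized one."""
    return min(grammar, key=lambda g: edit_distance(recognized, g))

# Hypothetical gesture symbol sequences for a local-search map task:
# G = gesture, area/point/line = shape, sel = selection, rest = restaurant.
GRAMMAR = [
    ("G", "area", "sel", "rest"),   # circle around restaurants
    ("G", "point", "sel", "rest"),  # tap on a single restaurant
    ("G", "line", "sel", "rest"),   # underline / scratch-out
]

# An errorful recognition with a spuriously duplicated shape symbol.
recognized = ("G", "area", "area", "sel", "rest")
print(closest_in_grammar(recognized, GRAMMAR))
# -> ('G', 'area', 'sel', 'rest')
```

The second technique, bottom-up gesture aggregation, can be pictured as folding a run of individual selection gestures into one aggregate selection whose plurality can then combine with plural spoken expressions such as "these three restaurants". Again a hedged sketch; the selection record format here is an assumption, not the paper's representation.

```python
# A hedged sketch of bottom-up gesture aggregation: fold adjacent
# selection gestures into one aggregate selection so that plural
# speech can combine with several individual taps. The record format
# is an illustrative assumption.

def aggregate(selections):
    """Combine a run of selection gestures into one selection."""
    items = [item for s in selections for item in s["items"]]
    return {"type": "selection", "items": items, "number": len(items)}

taps = [{"type": "selection", "items": ["r12"]},
        {"type": "selection", "items": ["r47"]},
        {"type": "selection", "items": ["r03"]}]
print(aggregate(taps))
# -> {'type': 'selection', 'items': ['r12', 'r47', 'r03'], 'number': 3}
```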




Published In

ICMI '08: Proceedings of the 10th international conference on Multimodal interfaces
October 2008
322 pages
ISBN: 9781605581989
DOI: 10.1145/1452392


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. finite-state methods
  2. local search
  3. mobile
  4. multimodal interfaces
  5. robustness
  6. speech-gesture integration

Qualifiers

  • Research-article

Conference

ICMI '08: International Conference on Multimodal Interfaces
October 20-22, 2008
Chania, Crete, Greece

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%
