Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3340555.3356097acmotherconferencesArticle/Chapter ViewAbstractPublication Pagesicmi-mlmiConference Proceedingsconference-collections
research-article

Streamlined Decoder for Chinese Spoken Language Understanding

Published: 14 October 2019 Publication History

Abstract

As a critical component of Spoken Dialog System (SDS), spoken language understanding (SLU) attracts a lot of attention, especially for methods based on unaligned data. Recently, a new approach has been proposed that utilizes the hierarchical relationship between act-slot-value triples. However, it ignores the transfer of internal information which may record the intermediate information of the upper level and contribute to the prediction of the lower level. So, we propose a novel streamlined decoding structure with attention mechanism, which uses three successively connected RNN to decode act, slot and value respectively. On the first Chinese Audio-Textual Spoken Language Understanding Challenge (CATSLU), our model exceeds state-of-the-art model on an unaligned multi-turn task-oriented Chinese spoken dialogue dataset provided by the contest.

References

[1]
Lina M Rojas Barahona, Milica Gasic, Nikola Mrkšić, Pei-Hao Su, Stefan Ultes, Tsung-Hsien Wen, and Steve Young. 2016. Exploiting sentence and context representations in deep neural models for spoken language understanding. arXiv preprint arXiv:1610.04120(2016).
[2]
Alex Graves and Jürgen Schmidhuber. 2005. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks 18, 5-6 (2005), 602–610.
[3]
Matthew Henderson, Milica Gašić, Blaise Thomson, Pirros Tsiakoulis, Kai Yu, and Steve Young. 2012. Discriminative spoken language understanding using word confusion networks. In 2012 IEEE Spoken Language Technology Workshop (SLT). IEEE, 176–181.
[4]
Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural computation 9, 8 (1997), 1735–1780.
[5]
Chunxi Liu, Puyang Xu, and Ruhi Sarikaya. 2015. Deep contextual language understanding in spoken dialogue systems. In Sixteenth annual conference of the international speech communication association.
[6]
James H Martin and Daniel Jurafsky. 2009. Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition. Pearson/Prentice Hall Upper Saddle River.
[7]
Christian Raymond and Giuseppe Riccardi. 2007. Generative and discriminative algorithms for spoken language understanding. In Eighth Annual Conference of the International Speech Communication Association.
[8]
Mike Schuster and Kuldip K Paliwal. 1997. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing 45, 11 (1997), 2673–2681.
[9]
Tiejun Zhao Chengqing Zong Su Zhu, Zijian Zhao and Kai Yu. 2019. CATSLU: The 1st Chinese Audio-Textual Spoken Language Understanding Challenge. In 21st ACM International Conference on Multimodal Interaction.
[10]
Kaisheng Yao, Geoffrey Zweig, Mei-Yuh Hwang, Yangyang Shi, and Dong Yu. 2013. Recurrent neural networks for language understanding. In Interspeech. 2524–2528.
[11]
Lin Zhao and Zhe Feng. 2018. Improving Slot Filling in Spoken Language Understanding with Joint Pointer and Attention. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 426–431.
[12]
Zijian Zhao, Su Zhu, and Kai Yu. 2019. A Hierarchical Decoding Model for Spoken Language Understanding from Unaligned Data. In ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 7305–7309.
[13]
Su Zhu and Kai Yu. 2017. Encoder-decoder with focus-mechanism for sequence labelling based spoken language understanding. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 5675–5679.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Other conferences
ICMI '19: 2019 International Conference on Multimodal Interaction
October 2019
601 pages
ISBN:9781450368605
DOI:10.1145/3340555
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 October 2019

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. attention mechanisms
  2. long short term memory networks
  3. pointer network
  4. spoken dialog system
  5. spoken language understanding
  6. streamlined decoder

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Major Project of Zhijiang Lab
  • NSFC
  • SFSMBRP
  • NSFB
  • CETC
  • BIGKE

Conference

ICMI '19

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 109
    Total Downloads
  • Downloads (Last 12 months)5
  • Downloads (Last 6 weeks)0
Reflects downloads up to 20 Nov 2024

Other Metrics

Citations

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media