DOI: 10.1145/2974804.2980495
poster

Towards an Interactive Voice Agent for Singapore Hokkien

Published: 04 October 2016

Abstract

Singapore Hokkien (SH) is the most commonly spoken non-Mandarin Chinese dialect in Singapore. It is an important language for many members of Singapore's pioneer generation, but much less so for the younger generation, who prefer English. In recent years, the greying of the pioneer generation has placed an increasing demand for assistive devices to support them. We report ongoing efforts to build limited-vocabulary speech recognition, with the eventual goal of a conversational voice agent in SH that can support home-automation or in-hospital applications. This process is challenging because sizeable SH speech corpora do not yet exist, and SH differs enough from Mandarin and Minnan that existing corpora for those languages cannot be used directly. We document our efforts at building language resources (audio corpora and pronunciation lexicons) and present some preliminary findings on multilingual training.

Cited By

  • (2018) Multitask Learning for Phone Recognition of Underresourced Languages Using Mismatched Transcription. IEEE/ACM Transactions on Audio, Speech and Language Processing 26:3 (501-514). DOI: 10.1109/TASLP.2017.2782360. Online publication date: 1-Mar-2018
  • (2018) Recognizing Zero-Resourced Languages Based on Mismatched Machine Transcriptions. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (5979-5983). DOI: 10.1109/ICASSP.2018.8462481. Online publication date: Apr-2018

    Published In

    HAI '16: Proceedings of the Fourth International Conference on Human Agent Interaction
    October 2016
    414 pages
    ISBN: 9781450345088
    DOI: 10.1145/2974804
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Sponsors

    • Chinese and Oriental Language Information Processing Society

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. acoustic modeling
    2. singapore hokkien
    3. speech recognition
    4. voice agents

    Qualifiers

    • Poster

    Acceptance Rates

    HAI '16 Paper Acceptance Rate: 29 of 182 submissions, 16%
    Overall Acceptance Rate: 121 of 404 submissions, 30%

    Article Metrics

    • Downloads (Last 12 months): 18
    • Downloads (Last 6 weeks): 0
    Reflects downloads up to 23 Nov 2024