Task adaptation in stochastic language model for Chinese homophone disambiguation

Published: 01 March 2003

Abstract

The runtime application domain strongly affects the performance of practical corpus-based applications. Previous smoothing techniques, class-based models, and similarity-based models cannot fully handle this dynamic situation. In this paper, an adaptive learning algorithm is proposed for task adaptation that best fits the runtime application domain when applied to Chinese homophone disambiguation. The proposed algorithm is first formulated as a neural network model and then generalized to avoid the problem of slow convergence. The resulting techniques are greatly simplified and robust. The experimental results demonstrate the effect of the learning algorithm in adapting from a generic domain to a specific one. A methodology is also presented to show how these techniques can be extended to various language models and corpus-based applications.
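The abstract does not give the algorithm itself, so the following is only an illustrative sketch of the general idea of task adaptation, not the authors' neural-network formulation. It assumes a toy add-one-smoothed character bigram model, a hypothetical two-syllable lexicon, and a simple weighted count update standing in for adaptive learning. The names `AdaptiveBigram`, `decode`, `lexicon`, and the weight value are all assumptions made for illustration.

```python
import math
from collections import defaultdict


class AdaptiveBigram:
    """Add-one smoothed character bigram model with runtime count updates.

    Illustrative stand-in only; not the model described in the paper.
    """

    def __init__(self, vocab):
        self.vocab = set(vocab)
        self.bigram = defaultdict(float)   # (prev, cur) -> weighted count
        self.unigram = defaultdict(float)  # prev -> weighted count

    def observe(self, chars, weight=1.0):
        """Add weighted counts from one character sequence."""
        seq = ["<s>"] + list(chars)
        for prev, cur in zip(seq, seq[1:]):
            self.bigram[(prev, cur)] += weight
            self.unigram[prev] += weight

    def logprob(self, prev, cur):
        """Log of the add-one smoothed estimate of P(cur | prev)."""
        num = self.bigram.get((prev, cur), 0.0) + 1.0
        den = self.unigram.get(prev, 0.0) + len(self.vocab)
        return math.log(num / den)


def decode(model, syllables, lexicon):
    """Viterbi search for the most probable character sequence."""
    # best[c] = (log-probability, path) of the best sequence ending in c
    best = {"<s>": (0.0, [])}
    for syl in syllables:
        nxt = {}
        for cur in lexicon[syl]:
            nxt[cur] = max(
                ((score + model.logprob(prev, cur), path + [cur])
                 for prev, (score, path) in best.items()),
                key=lambda item: item[0],
            )
        best = nxt
    return "".join(max(best.values(), key=lambda item: item[0])[1])


# Hypothetical toneless-pinyin lexicon: each syllable has two candidates.
lexicon = {"shi": ["时", "事"], "jian": ["间", "件"]}
model = AdaptiveBigram(c for chars in lexicon.values() for c in chars)

# "Generic domain" counts: 时间 (time) is more frequent than 事件 (event).
for _ in range(3):
    model.observe("时间")
model.observe("事件")
before = decode(model, ["shi", "jian"], lexicon)

# Runtime adaptation: text from the specific domain is weighted more heavily.
model.observe("事件", weight=5.0)
after = decode(model, ["shi", "jian"], lexicon)

print(before, after)
```

In this sketch the same syllable input decodes differently before and after the update, which is the behavior task adaptation is meant to produce when the runtime domain differs from the generic training domain. A fixed weight is the simplest possible choice here; the paper's learning procedure may differ substantially.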

Cited By

  • (2018) On the use of acoustic features for automatic disambiguation of homophones in spontaneous German. Computer Speech & Language 52, 209-224. DOI: 10.1016/j.csl.2017.12.011. Online publication date: Nov-2018
  • (2008) Ambiguity solution of pinyin segmentation in continuous Pinyin-to-Character conversion. 2008 International Conference on Natural Language Processing and Knowledge Engineering, 1-7. DOI: 10.1109/NLPKE.2008.4906775. Online publication date: Oct-2008
  • (2006) Using word support model to improve Chinese input system. Proceedings of the COLING/ACL on Main conference poster sessions, 842-849. DOI: 10.5555/1273073.1273181. Online publication date: 17-Jul-2006
  • (2006) A Maximum Entropy Approach to Chinese Pin Yin-To-Character Conversion. 2006 IEEE International Conference on Systems, Man and Cybernetics, 2956-2959. DOI: 10.1109/ICSMC.2006.384567. Online publication date: Oct-2006

    Published In

    ACM Transactions on Asian Language Information Processing  Volume 2, Issue 1
    March 2003
    77 pages
    ISSN:1530-0226
    EISSN:1558-3430
    DOI:10.1145/964161

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Adaptive learning
    2. Chinese homophone disambiguation
    3. language model
    4. neural network
    5. runtime application domain
    6. task adaptation
