DOI: 10.3115/981863.981892

Compilation of weighted finite-state transducers from decision trees

Published: 24 June 1996

Abstract

We report on a method for compiling decision trees into weighted finite-state transducers. The key assumptions are that the tree predictions specify how to rewrite symbols from an input string, and that the decision at each tree node can be stated in terms of regular expressions on the input string. Each leaf node can then be treated as a separate rule whose left and right contexts are constructed from the decisions made in traversing the tree from the root to that leaf. These rules are compiled into transducers using the weighted rewrite-rule compilation algorithm described in (Mohri and Sproat, 1996).
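The leaf-to-rule step described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the class names (Leaf, Node, Rule), the representation of a context as a list of (pattern, must_match) pairs, the toy tree, and the choice of negative log probabilities as weights. It only collects the path decisions into one weighted rule per leaf; it does not perform the transducer compilation of (Mohri and Sproat, 1996).

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

# Hypothetical data model for illustration; none of these names come from the paper.

@dataclass
class Leaf:
    probs: Dict[str, float]          # output symbol -> estimated probability

@dataclass
class Node:
    side: str                        # "left" or "right": which context is tested
    pattern: str                     # regular expression asked about that context
    yes: Union["Node", Leaf]         # subtree taken when the pattern matches
    no: Union["Node", Leaf]          # subtree taken when it does not

Constraint = Tuple[str, bool]        # (pattern, must_match)

@dataclass
class Rule:
    phi: str                         # input symbol being rewritten
    left: List[Constraint]           # constraints collected on the left context
    right: List[Constraint]          # constraints collected on the right context
    outputs: Dict[str, float]        # output symbol -> weight, here log(1/p)


def leaf_rules(tree, phi, left=(), right=()):
    """Yield one weighted rule per leaf, recording every decision on the path."""
    if isinstance(tree, Leaf):
        # Assumed weighting: negative log probability, written as log(1/p).
        weights = {out: math.log(1.0 / p) for out, p in tree.probs.items() if p > 0}
        yield Rule(phi, list(left), list(right), weights)
        return
    for branch, must_match in ((tree.yes, True), (tree.no, False)):
        constraint = (tree.pattern, must_match)
        if tree.side == "left":
            yield from leaf_rules(branch, phi, left + (constraint,), right)
        else:
            yield from leaf_rules(branch, phi, left, right + (constraint,))


# Toy tree for the input symbol "a" (invented data).
tree = Node(
    "left", "[aeiou]$",
    yes=Leaf({"a": 0.75, "e": 0.25}),
    no=Node("right", "^n", yes=Leaf({"a": 1.0}), no=Leaf({"o": 0.5, "a": 0.5})),
)

rules = list(leaf_rules(tree, "a"))
for rule in rules:
    print(rule)
```

Each resulting Rule pairs an input symbol with weighted outputs and the conjunction of context tests that lead to its leaf. In a full system those tests would be intersected as regular languages before rule compilation; the sketch keeps them symbolic.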

References

[1] Leo Breiman, Jerome H. Friedman, Richard A. Olshen, and Charles J. Stone. 1984. Classification and Regression Trees. Wadsworth & Brooks, Pacific Grove CA.
[2] William Fisher, Victor Zue, D. Bernstein, and David Pallett. 1987. An acoustic-phonetic data base. Journal of the Acoustical Society of America, 91, December. Supplement 1.
[3] Daniel Gildea and Daniel Jurafsky. 1995. Automatic induction of finite state transducers for simple phonological rules. In 33rd Annual Meeting of the Association for Computational Linguistics, pages 9--15, Morristown, NJ. Association for Computational Linguistics.
[4] C. Douglas Johnson. 1972. Formal Aspects of Phonological Description. Mouton, The Hague.
[5] Ronald Kaplan and Martin Kay. 1994. Regular models of phonological rule systems. Computational Linguistics, 20:331--378.
[6] Kimmo Koskenniemi. 1983. Two-Level Morphology: a General Computational Model for Word-Form Recognition and Production. Ph.D. thesis, University of Helsinki, Helsinki.
[7] Andrej Ljolje and Michael D. Riley. 1992. Optimal speech recognition using phone recognition and lexical access. In Proceedings of ICSLP, pages 313--316, Banff, Canada, October.
[8] David Magerman. 1995. Statistical decision-tree models for parsing. In 33rd Annual Meeting of the Association for Computational Linguistics, pages 276--283, Morristown, NJ. Association for Computational Linguistics.
[9] Mehryar Mohri and Richard Sproat. 1996. An efficient compiler for weighted rewrite rules. In 34th Annual Meeting of the Association for Computational Linguistics, Morristown, NJ. Association for Computational Linguistics.
[10] José Oncina, Pedro García, and Enrique Vidal. 1993. Learning subsequential transducers for pattern recognition tasks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15:448--458.
[11] Douglas Paul and Janet Baker. 1992. The design for the Wall Street Journal-based CSR corpus. In Proceedings of International Conference on Spoken Language Processing, Banff, Alberta. ICSLP.
[12] Fernando Pereira and Michael Riley. 1996. Speech recognition by composition of weighted finite automata. CMP-LG archive paper 9603001.
[13] Fernando Pereira, Michael Riley, and Richard Sproat. 1994. Weighted rational transductions and their application to human language processing. In ARPA Workshop on Human Language Technology, pages 249--254. Advanced Research Projects Agency, March 8--11.
[14] Patty Price, William Fisher, Jared Bernstein, and David Pallett. 1988. The DARPA 1000-word Resource Management Database for continuous speech recognition. In Proceedings of ICASSP88, volume 1, pages 651--654, New York. ICASSP.
[15] Michael Riley. 1989. Some applications of tree-based modelling to speech and language. In Proceedings of the Speech and Natural Language Workshop, pages 339--352, Cape Cod MA, October. DARPA, Morgan Kaufmann.
[16] Michael Riley. 1991. A statistical model for generating pronunciation networks. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pages S11.1.--S11.4. ICASSP91, October.
[17] Richard Sproat. 1995. A finite-state architecture for tokenization and grapheme-to-phoneme conversion for multilingual text analysis. In Susan Armstrong and Evelyne Tzoukermann, editors, Proceedings of the EACL SIGDAT Workshop, pages 65--72, Dublin, Ireland. Association for Computational Linguistics.
[18] Richard Sproat. 1996. Multilingual text analysis for text-to-speech synthesis. In Proceedings of the ECAI-96 Workshop on Extended Finite State Models of Language, Budapest, Hungary. European Conference on Artificial Intelligence.
[19] Michelle Wang and Julia Hirschberg. 1992. Automatic classification of intonational phrase boundaries. Computer Speech and Language, 6:175--196.



Published In

ACL '96: Proceedings of the 34th annual meeting on Association for Computational Linguistics
June 1996, 399 pages
Program Chairs: Aravind Joshi, Martha Palmer
Publisher: Association for Computational Linguistics, United States
Qualifiers: Article
Overall Acceptance Rate: 85 of 443 submissions, 19%

Cited By

  • (2018) A UNIFIED APPROACH TO GRAPHEME-TO-PHONEME CONVERSION FOR THE PLATTOS SLOVENIAN TEXT-TO-SPEECH SYSTEM. Applied Artificial Intelligence, 21:6, pages 563-603. DOI: 10.1080/08839510701409086. Online publication date: 25-Dec-2018.
  • (2007) Embodied conversational agents in Wizard-of-Oz and multimodal interaction applications. Proceedings of the 2007 COST action 2102 international conference on Verbal and nonverbal communication behaviours, pages 294-309. DOI: 10.5555/1783474.1783506. Online publication date: 29-Mar-2007.
  • (2004) Compiling boostexter rules into a finite-state transducer. Proceedings of the ACL 2004 on Interactive poster and demonstration sessions, pages 21-es. DOI: 10.3115/1219044.1219065. Online publication date: 21-Jul-2004.
  • (2002) Parameter estimation for probabilistic finite-state transducers. Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pages 1-8. DOI: 10.3115/1073083.1073085. Online publication date: 6-Jul-2002.
  • (2001) Re-engineering letter-to-sound rules. Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies, pages 1-7. DOI: 10.3115/1073336.1073351. Online publication date: 2-Jun-2001.
