Compilation of weighted finite-state transducers from decision trees

Authors: Richard Sproat, Michael Riley

ACL '96: Proceedings of the 34th annual meeting on Association for Computational Linguistics

Pages 215 - 222

https://doi.org/10.3115/981863.981892

Published: 24 June 1996

Abstract

We report on a method for compiling decision trees into weighted finite-state transducers. The key assumptions are that the tree predictions specify how to rewrite symbols from an input string, and that the decision at each tree node can be stated in terms of regular expressions on the input string. Each leaf node can then be treated as a separate rule whose left and right contexts are constructed from the decisions made in traversing the tree from the root to that leaf. These rules are compiled into transducers using the weighted rewrite-rule compilation algorithm described in (Mohri and Sproat, 1996).
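The leaf-to-rule step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the tree layout (`Node`, `Leaf`), the toy alphabet, the function name `leaf_rules`, and the choice of negative log probability as the weight are all assumptions made for illustration. Each node is assumed to ask whether the symbol at a fixed offset from the rewritten symbol belongs to a given set, which is one simple kind of decision expressible as a regular expression. Compiling the resulting rules into transducers (Mohri and Sproat, 1996) is not shown.

```python
import math
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterator, Optional, Tuple, Union

ALPHABET = frozenset("abcd")  # hypothetical toy alphabet


@dataclass
class Leaf:
    probs: Dict[str, float]  # output symbol -> estimated probability


@dataclass
class Node:
    offset: int              # -1 = symbol to the left, +1 = symbol to the right
    symbols: FrozenSet[str]  # question: "is the symbol at `offset` in this set?"
    yes: Union[Leaf, "Node"]
    no: Union[Leaf, "Node"]


Context = Tuple[FrozenSet[str], ...]
Rule = Tuple[str, str, Context, Context, float]


def leaf_rules(tree: Union[Leaf, Node], phi: str,
               constraints: Optional[Dict[int, FrozenSet[str]]] = None
               ) -> Iterator[Rule]:
    """Yield one rule (phi, psi, left, right, weight) per leaf outcome.

    `constraints` maps a context offset to the symbols still allowed there
    after the decisions taken on the path from the root.
    """
    constraints = constraints or {}
    if isinstance(tree, Leaf):
        lo = min((k for k in constraints if k < 0), default=0)
        hi = max((k for k in constraints if k > 0), default=0)
        # Offsets never questioned on this path admit any symbol.
        left = tuple(constraints.get(k, ALPHABET) for k in range(lo, 0))
        right = tuple(constraints.get(k, ALPHABET) for k in range(1, hi + 1))
        for psi, p in sorted(tree.probs.items()):
            if p > 0:
                yield (phi, psi, left, right, -math.log(p))
        return
    allowed = constraints.get(tree.offset, ALPHABET)
    # "Yes" narrows the position to the question set, "no" to its complement.
    for subset, child in ((allowed & tree.symbols, tree.yes),
                          (allowed - tree.symbols, tree.no)):
        if subset:  # skip branches that earlier decisions made unreachable
            yield from leaf_rules(child, phi,
                                  {**constraints, tree.offset: subset})


# Hypothetical tree predicting how the input symbol "a" is rewritten.
TOY_TREE = Node(
    offset=-1, symbols=frozenset("b"),
    yes=Node(offset=1, symbols=frozenset("cd"),
             yes=Leaf({"d": 1.0}),
             no=Leaf({"a": 0.75, "d": 0.25})),
    no=Leaf({"a": 1.0}),
)
```

Calling `list(leaf_rules(TOY_TREE, "a"))` gives one weighted rule per leaf outcome, each readable as "rewrite phi as psi between the left and right contexts", where every context position is a set of admissible symbols. Intersecting the constraints along the path is what turns a chain of yes/no decisions into a single pair of contexts per leaf.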

References

[1] Leo Breiman, Jerome H. Friedman, Richard A. Olshen, and Charles J. Stone. 1984. Classification and Regression Trees. Wadsworth & Brooks, Pacific Grove CA.

[2] William Fisher, Victor Zue, D. Bernstein, and David Pallet. 1987. An acoustic-phonetic data base. Journal of the Acoustical Society of America, 91, December. Supplement 1.

[3] Daniel Gildea and Daniel Jurafsky. 1995. Automatic induction of finite state transducers for simple phonological rules. In 33rd Annual Meeting of the Association for Computational Linguistics, pages 9--15, Morristown, NJ. Association for Computational Linguistics.

[4] C. Douglas Johnson. 1972. Formal Aspects of Phonological Description. Mouton, The Hague.

[5] Ronald Kaplan and Martin Kay. 1994. Regular models of phonological rule systems. Computational Linguistics, 20:331--378.

[6] Kimmo Koskenniemi. 1983. Two-Level Morphology: a General Computational Model for Word-Form Recognition and Production. Ph.D. thesis, University of Helsinki, Helsinki.

[7] Andrej Ljolje and Michael D. Riley. 1992. Optimal speech recognition using phone recognition and lexical access. In Proceedings of ICSLP, pages 313--316, Banff, Canada, October.

[8] David Magerman. 1995. Statistical decision-tree models for parsing. In 33rd Annual Meeting of the Association for Computational Linguistics, pages 276--283, Morristown, NJ. Association for Computational Linguistics.

[9] Mehryar Mohri and Richard Sproat. 1996. An efficient compiler for weighted rewrite rules. In 34th Annual Meeting of the Association for Computational Linguistics, Morristown, NJ. Association for Computational Linguistics.

[10] José Oncina, Pedro García, and Enrique Vidal. 1993. Learning subsequential transducers for pattern recognition tasks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15:448--458.

[11] Douglas Paul and Janet Baker. 1992. The design for the Wall Street Journal-based CSR corpus. In Proceedings of International Conference on Spoken Language Processing, Banff, Alberta. ICSLP.

[12] Fernando Pereira and Michael Riley. 1996. Speech recognition by composition of weighted finite automata. CMP-LG archive paper 9603001.

[13] Fernando Pereira, Michael Riley, and Richard Sproat. 1994. Weighted rational transductions and their application to human language processing. In ARPA Workshop on Human Language Technology, pages 249--254. Advanced Research Projects Agency, March 8--11.

[14] Patty Price, William Fisher, Jared Bernstein, and David Pallett. 1988. The DARPA 1000-word Resource Management Database for continuous speech recognition. In Proceedings of ICASSP88, volume 1, pages 651--654, New York. ICASSP.

[15] Michael Riley. 1989. Some applications of tree-based modelling to speech and language. In Proceedings of the Speech and Natural Language Workshop, pages 339--352, Cape Cod MA, October. DARPA, Morgan Kaufmann.

[16] Michael Riley. 1991. A statistical model for generating pronunciation networks. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pages S11.1.--S11.4. ICASSP91, October.

[17] Richard Sproat. 1995. A finite-state architecture for tokenization and grapheme-to-phoneme conversion for multilingual text analysis. In Susan Armstrong and Evelyne Tzoukermann, editors, Proceedings of the EACL SIGDAT Workshop, pages 65--72, Dublin, Ireland. Association for Computational Linguistics.

[18] Richard Sproat. 1996. Multilingual text analysis for text-to-speech synthesis. In Proceedings of the ECAI-96 Workshop on Extended Finite State Models of Language, Budapest, Hungary. European Conference on Artificial Intelligence.

[19] Michelle Wang and Julia Hirschberg. 1992. Automatic classification of intonational phrase boundaries. Computer Speech and Language, 6:175--196.


Published In

ACL '96: Proceedings of the 34th annual meeting on Association for Computational Linguistics

June 1996, 399 pages

Program Chairs: Aravind Joshi (University of Pennsylvania, Philadelphia, PA), Martha Palmer (University of Pennsylvania, Philadelphia, PA)

Publisher: Association for Computational Linguistics, United States

Overall Acceptance Rate: 85 of 443 submissions, 19%

Cited By

Rojc M, Kacic Z (2018). A unified approach to grapheme-to-phoneme conversion for the PLATTOS Slovenian text-to-speech system. Applied Artificial Intelligence 21:6, 563-603. doi:10.1080/08839510701409086. Online publication date: 25-Dec-2018.
https://dl.acm.org/doi/10.1080/08839510701409086

Rojc M, Rotovnik T, Brus M, Jan D, Kačič Z (2007). Embodied conversational agents in Wizard-of-Oz and multimodal interaction applications. Proceedings of the 2007 COST action 2102 international conference on Verbal and nonverbal communication behaviours, 294-309. doi:10.5555/1783474.1783506. Online publication date: 29-Mar-2007.
https://dl.acm.org/doi/10.5555/1783474.1783506

Bangalore S (2004). Compiling boostexter rules into a finite-state transducer. Proceedings of the ACL 2004 on Interactive poster and demonstration sessions, 21-es. doi:10.3115/1219044.1219065. Online publication date: 21-Jul-2004.
https://dl.acm.org/doi/10.3115/1219044.1219065

Eisner J, Isabelle P (2002). Parameter estimation for probabilistic finite-state transducers. Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, 1-8. doi:10.3115/1073083.1073085. Online publication date: 6-Jul-2002.
https://dl.acm.org/doi/10.3115/1073083.1073085

Jansche M (2001). Re-engineering letter-to-sound rules. Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies, 1-7. doi:10.3115/1073336.1073351. Online publication date: 2-Jun-2001.
https://dl.acm.org/doi/10.3115/1073336.1073351
