A Spectral Learning Algorithm for Finite State Transducers

Borja Balle²³,
Ariadna Quattoni²³ &
Xavier Carreras²³

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 6911))

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

3056 Accesses
6 Citations

Abstract

Finite-State Transducers (FSTs) are a popular tool for modeling paired input-output sequences, and have numerous applications in real-world problems. Most training algorithms for learning FSTs rely on gradient-based or EM optimizations which can be computationally expensive and suffer from local optima issues. Recently, Hsu et al. [13] proposed a spectral method for learning Hidden Markov Models (HMMs) which is based on an Observable Operator Model (OOM) view of HMMs. Following this line of work we present a spectral algorithm to learn FSTs with strong PAC-style guarantees. To the best of our knowledge, ours is the first result of this type for FST learning. At its core, the algorithm is simple, and scalable to large data sets. We present experiments that validate the effectiveness of the algorithm on synthetic and real data.

This work was partially supported the EU PASCAL2 Network of Excellence (FP7-ICT-216886), and by a Google Research Award. B.B. was supported by an FPU fellowship (AP2008-02064) of the Spanish Ministry of Education. The Spanish Ministry of Science and Innovation supported A.Q. (JCI-2009-04240) and X.C. (RYC-2008-02223 and “KNOW2” TIN2009-14715-C04-04).

Download to read the full chapter text

Chapter PDF

Spectral Learning of Sequence Taggers over Continuous Sequences

Learning Probability Distributions Generated by Finite-State Machines

Learning Sequential Tree-to-Word Transducers

Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

Abe, N., Takeuchi, J., Warmuth, M.: Polynomial Learnability of Stochastic Rules with Respect to the KL-Divergence and Quadratic Distance. IEICE Transactions on Information and Systems 84(3), 299–316 (2001)
Google Scholar
Bailly, R., Denis, F., Ralaivola, L.: Grammatical inference as a principal component analysis problem. In: Proc. ICML (2009)
Google Scholar
Bernard, M., Janodet, J.-C., Sebban, M.: A discriminative model of stochastic edit distance in the form of a conditional transducer. In: Sakakibara, Y., Kobayashi, S., Sato, K., Nishino, T., Tomita, E. (eds.) ICGI 2006. LNCS (LNAI), vol. 4201, pp. 240–252. Springer, Heidelberg (2006)
Chapter Google Scholar
Carlyle, J.W., Paz, A.: Realization by stochastic finite automaton. Journal of Computer and System Sciences 5, 26–40 (1971)
Article MathSciNet MATH Google Scholar
Casacuberta, F.: Inference of finite-state transducers by using regular grammars and morphisms. In: Oliveira, A.L. (ed.) ICGI 2000. LNCS (LNAI), vol. 1891, pp. 1–14. Springer, Heidelberg (2000)
Chapter Google Scholar
Chang, J.T.: Full reconstruction of markov models on evolutionary trees: Identifiability and consistency. Mathematical Biosciences 137, 51–73 (1996)
Article MathSciNet MATH Google Scholar
Chen, S., Goodman, J.: An empirical study of smoothing techniques for language modeling. In: Proc. of ACL, pp. 310–318 (1996)
Google Scholar
Clark, A.: Partially supervised learning of morphology with stochastic transducers. In: Proc. of NLPRS, pp. 341–348 (2001)
Google Scholar
Clark, A., Costa Florêncio, C., Watkins, C.: Languages as hyperplanes: grammatical inference with string kernels. Machine Learning, 1–23 (2010)
Google Scholar
Eisner, J.: Parameter estimation for probabilistic finite-state transducers. In: Proc. of ACL, pp. 1–8 (2002)
Google Scholar
Fliess, M.: Matrices de Hankel. Journal de Mathematiques Pures et Appliquees 53, 197–222 (1974)
MathSciNet MATH Google Scholar
Haardt, M., Nossek, J.A.: Simultaneous schur decomposition of several nonsymmetric matrices to achieve automatic pairing in multidimensional harmonic retrieval problems. IEEE Transactions on Signal Processing 46(1) (1998)
Google Scholar
Hsu, D., Kakade, S.M., Zhang, T.: A spectral algorithm for learning hidden markov models. In: Proc. of COLT (2009)
Google Scholar
Jaeger, H.: Observable operator models for discrete stochastic time series. Neural Computation 12, 1371–1398 (2000)
Article Google Scholar
Jelinek, F.: Statistical Methods for Speech Recognition (Language, Speech, and Communication). MIT Press, Cambridge (1998)
Google Scholar
Knight, K., Graehl, J.: Machine transliteration. Computational Linguistics 24(4), 599–612 (1998)
Google Scholar
Li, H., Kumaran, A., Pervouchine, V., Zhang, M.: Report of news 2009 machine transliteration shared task. In: Proc. Named Entities Workshop (2009)
Google Scholar
Mossel, E., Roch, S.: Learning nonsingular phylogenies and hidden markov models. In: Proc. of STOC (2005)
Google Scholar
Neuts, M.F.: Matrix-geometric solutions in stochastic models: an algorithmic approach. Johns Hopkins University Press, Baltimore (1981)
MATH Google Scholar
Ng, A., Jordan, M.: On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes. In: NIPS (2002)
Google Scholar
Ron, D., Singer, Y., Tishby, N.: On the learnability and usage of acyclic probabilistic finite automata. In: Proc. of COLT, pp. 31–40 (1995)
Google Scholar
Schützenberger, M.: On the definition of a family of automata. Information and Control 4, 245–270 (1961)
Article MathSciNet MATH Google Scholar
Siddiqi, S.M., Boots, B., Gordon, G.J.: Reduced-Rank Hidden Markov Models. In: Proc. AISTATS, pp. 741–748 (2010)
Google Scholar

Download references

Author information

Authors and Affiliations

Universitat Politècnica de Catalunya, Spain
Borja Balle, Ariadna Quattoni & Xavier Carreras

Authors

Borja Balle
View author publications
You can also search for this author in PubMed Google Scholar
Ariadna Quattoni
View author publications
You can also search for this author in PubMed Google Scholar
Xavier Carreras
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Department of Informatics and Telecommunications, University of Athens, Panepistimioupolis, Ilisia, 15784, Athens, Greece
Dimitrios Gunopulos
Google Switzerland GmbH, Brandschenkestrasse 110, 8002, Zurich, Switzerland
Thomas Hofmann
Department of Computer Science, University of Bari “Aldo Moro”, via Orabona 4, 70125, Bari, Italy
Donato Malerba
Deptartment of Informatics, Athens University of Economics and Business, Patision 76, 10434, Athens, Greece
Michalis Vazirgiannis

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Balle, B., Quattoni, A., Carreras, X. (2011). A Spectral Learning Algorithm for Finite State Transducers. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2011. Lecture Notes in Computer Science(), vol 6911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23780-5_20

Download citation

DOI: https://doi.org/10.1007/978-3-642-23780-5_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23779-9
Online ISBN: 978-3-642-23780-5
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

A Spectral Learning Algorithm for Finite State Transducers

Abstract

Chapter PDF

Similar content being viewed by others

Spectral Learning of Sequence Taggers over Continuous Sequences

Learning Probability Distributions Generated by Finite-State Machines

Learning Sequential Tree-to-Word Transducers

Keywords

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Navigation

A Spectral Learning Algorithm for Finite State Transducers

Abstract

Chapter PDF

Similar content being viewed by others

Spectral Learning of Sequence Taggers over Continuous Sequences

Learning Probability Distributions Generated by Finite-State Machines

Learning Sequential Tree-to-Word Transducers

Keywords

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation