Polynomial-time identification of very simple grammars from positive data

Author: Takashi Yokomori

Theoretical Computer Science, Volume 298, Issue 1

Pages 179 - 206

https://doi.org/10.1016/S0304-3975(02)00423-1

Published: 04 April 2003

Abstract

This paper concerns a subclass of simple deterministic grammars, called very simple grammars, and studies the problem of identifying this subclass in the limit from positive data. The class of very simple languages forms a proper subclass of the simple deterministic languages and is incomparable to the class of regular languages. This class of languages is also known as the class of left Szilard languages of context-free grammars.

After providing some properties of very simple languages, we show that the class of very simple grammars is polynomial-time identifiable in the limit from positive data in the following sense: there effectively exists an algorithm that, for a target very simple grammar G* over alphabet Σ, identifies a very simple grammar G equivalent to G* in the limit from positive data, such that the time for updating a conjecture is bounded by O(m) and the total number of prediction errors made by the algorithm is bounded by O(n), where n is the size of G*, m = max{N^(|Σ|+1), |Σ|^3}, and N is the total length of all positive data provided.
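To make the object of study concrete, the following is a minimal sketch of a very simple grammar and a membership check for it. It assumes the usual definition, which the abstract does not spell out: a context-free grammar in Greibach normal form in which each terminal symbol begins exactly one production A → aα, with α a string of nonterminals. The encoding (`Rules`, `EXAMPLE_RULES`, `accepts`) is hypothetical and chosen only for illustration; it is not the identification algorithm of the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (an assumption of this sketch, not taken from the paper):
# each terminal maps to its unique production A -> a alpha,
# stored as (A, [nonterminals of alpha]).
Rules = Dict[str, Tuple[str, List[str]]]

# Example grammar: S -> a S B | c,  B -> b.
# It generates { a^n c b^n : n >= 0 }, which is not a regular language.
EXAMPLE_RULES: Rules = {
    "a": ("S", ["S", "B"]),
    "c": ("S", []),
    "b": ("B", []),
}


def accepts(rules: Rules, start: str, word: str) -> bool:
    """Deterministic leftmost parse: each terminal selects exactly one rule."""
    stack: List[str] = [start]  # the end of the list is the top of the stack
    for symbol in word:
        if symbol not in rules or not stack:
            return False
        lhs, rhs = rules[symbol]
        # The leftmost pending nonterminal must be the rule's left-hand side.
        if stack.pop() != lhs:
            return False
        # Push alpha so that its first nonterminal ends up on top.
        stack.extend(reversed(rhs))
    # The word is generated exactly when no nonterminal is left pending.
    return not stack
```

Because the rule table is keyed by terminal, every input symbol determines the production to apply, so the check runs in time linear in the length of the word. For the example grammar, `accepts(EXAMPLE_RULES, "S", "aacbb")` holds while `accepts(EXAMPLE_RULES, "S", "ab")` does not.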

Index Terms

Polynomial-time identification of very simple grammars from positive data
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning
    1. Learning paradigms
2. Theory of computation
  1. Formal languages and automata theory
    1. Grammars and context-free languages

Publisher

Elsevier Science Publishers Ltd.

United Kingdom
