Nothing Special   »   [go: up one dir, main page]

Skip to main content

Identification in the Limit of Systematic-Noisy Languages

  • Conference paper
Grammatical Inference: Algorithms and Applications (ICGI 2006)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 4201))

Included in the following conference series:

Abstract

To study the problem of learning from noisy data, the common approach is to use a statistical model of noise. The influence of the noise is then considered according to pragmatic or statistical criteria, by using a paradigm taking into account a distribution of the data. In this article, we study the noise as a nonstatistical phenomenon, by defining the concept of systematic noise. We establish various ways of learning (in the limit) from noisy data. The first is based on a technique of reduction between problems and consists in learning from the data which one knows noisy, then in denoising the learned function. The second consists in denoising on the fly the training examples, thus to identify in the limit good examples, and then to learn from noncorrupted data. We give in both cases sufficient conditions so that learning is possible and we show through various examples (coming in particular from the field of the grammatical inference) that our techniques are complementary.

This work was supported in part by the IST Programme of the European Community, under the PASCAL Network of Excellence, IST-2002-506778. This publication only reflects the authors’views.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Sakakibara, Y.: Recent Advances of Grammatical Inference. Theoretical Computer Science 185, 15–45 (1997)

    Article  MATH  MathSciNet  Google Scholar 

  2. de la Higuera, C.: A bibliographical study of grammatical inference. Pattern Recognition 38, 1332–1348 (2005)

    Article  Google Scholar 

  3. Lang, K.J., Pearlmutter, B.A., Price, R.A.: Results of the abbadingo one DFA learning competition and a new evidence-driven state merging algorithm. In: Honavar, V.G., Slutzki, G. (eds.) ICGI 1998. LNCS (LNAI), vol. 1433, pp. 1–12. Springer, Heidelberg (1998)

    Chapter  Google Scholar 

  4. de la Higuera, C.: Data complexity in Grammatical Inference. In: Data complexity in Pattern Recognition. Advanced Information and Knowledge Processing, Springer, Heidelberg (2006); ISBN: 1-84628-171-7

    Google Scholar 

  5. Case, J., Jain, S., Sharma, A.: Synthesizing noise-tolerant language learners. Theoretical Computer Science 261, 31–56 (2001)

    Article  MATH  MathSciNet  Google Scholar 

  6. Stephan, F.: Noisy inference and oracles. Theoretical Computer Science 185, 129–157 (1997)

    Article  MATH  MathSciNet  Google Scholar 

  7. Wharton, R.M.: Approximate language identification. Information and Control 26, 236–255 (1974)

    Article  MATH  MathSciNet  Google Scholar 

  8. Kearns, M., Valiant, L.: Cryptographic limitations on learning boolean formulae and finite automata. In: 21st ACM Symposium on Theory of Computing, pp. 433–444 (1989)

    Google Scholar 

  9. Kearns, M.: Efficient noise-tolerant learning from statistical queries. In: Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, pp. 392–401 (1993)

    Google Scholar 

  10. Sebban, M., Janodet, J.C.: On state merging in grammatical inference: a statistical approach for dealing with noisy data. In: Proceedings of ICML (2003)

    Google Scholar 

  11. Habrard, A., Bernard, M., Sebban, M.: Improvement of the state merging rule on noisy data in probabilistic grammatical inference. In: Lavrač, N., Gamberger, D., Todorovski, L., Blockeel, H. (eds.) ECML 2003. LNCS (LNAI), vol. 2837, pp. 169–180. Springer, Heidelberg (2003)

    Chapter  Google Scholar 

  12. Coste, F., Fredouille, D.: Unambiguous automata inference by means of state-merging methods. In: Lavrač, N., Gamberger, D., Todorovski, L., Blockeel, H. (eds.) ECML 2003. LNCS (LNAI), vol. 2837, pp. 60–71. Springer, Heidelberg (2003)

    Chapter  Google Scholar 

  13. Giles, C.L., Lawrence, S., Tsoi, A.: Noisy time series prediction using recurrent neural networks and grammatical inference. Machine Learning Journal 44, 161–183 (2001)

    Article  MATH  Google Scholar 

  14. Yokomori, T., Kobayashi, S.: Inductive learning of regular sets from examples: a rough set approach. In: Proc. of International Workshop on Rough Sets and Soft Computing (1994)

    Google Scholar 

  15. Miclet, L., Bayoudh, S., Delhay, A.: Définitions et premières expériences en apprentissage par analogie dans les séquences. In: Denis, F. (ed.) CAP, PUG, pp. 31–48 (2005)

    Google Scholar 

  16. Vidal, E., Thollard, F., de la Higuera, C., Casacuberta, F., Carrasco, R.C.: Probabilistic finite state automata – part I and II. Pattern Analysis and Machine Intelligence 27, 1013–1039 (2005)

    Article  Google Scholar 

  17. Abe, N., Warmuth, M.: On the computational complexity of approximating distributions by probabilistic automata. Machine Learning Journal 9, 205–260 (1992)

    MATH  Google Scholar 

  18. Thollard, F., Clark, A.: Pac-learnability of probabilistic deterministic finite state automata. Journal of Machine Learning Research, 473–497 (2004)

    Google Scholar 

  19. Carrasco, R.C., Oncina, J.: Learning stochastic regular grammars by means of a state merging method. In: Carrasco, R.C., Oncina, J. (eds.) ICGI 1994. LNCS (LNAI), vol. 862, pp. 139–150. Springer, Heidelberg (1994)

    Google Scholar 

  20. Gold, M.: Language identification in the limit. Information and Control 10, 447–474 (1967)

    Article  MATH  Google Scholar 

  21. Gold, E.M.: Complexity of automaton identification from given data. Information and Control 37, 302–320 (1978)

    Article  MATH  MathSciNet  Google Scholar 

  22. de la Higuera, C.: Complexity and reduction issues in grammatical inference. Technical Report, Universität Tübingen (2005) ISSN 0946-3852

    Google Scholar 

  23. Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions, and reversals. Cybernetics and Control Theory 10, 707–710 (1965); Original in Doklady Akademii Nauk SSSR, 163(4), 845–848 (1965)

    MathSciNet  Google Scholar 

  24. Wagner, R., Fisher, M.: The string-to-string correction problem. Journal of the ACM 21, 168–178 (1974)

    Article  MATH  Google Scholar 

  25. Takada, Y.: Grammatical inference for even linear languages based on control sets. Information Processing Letters 28, 193–199 (1988)

    Article  MATH  MathSciNet  Google Scholar 

  26. de la Higuera, C., Casacuberta, F.: Topology of strings: median string is NP-complete. Theoretical Computer Science 230, 39–48 (2000)

    Article  MATH  MathSciNet  Google Scholar 

  27. Belmandt, Z.: Manuel de prétopologie et ses applications. Hermés (1993)

    Google Scholar 

  28. Pawlak, Z.: Theory of rough sets: A new methodology for knowledge discovery (abstract). In: ICCI 1990, p. 11 (1990)

    Google Scholar 

  29. Kobayashi, S., Yokomori, T.: On approximately identifying concept classes in the limit. In: ALT 1995, pp. 298–312 (1995)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tantini, F., de la Higuera, C., Janodet, JC. (2006). Identification in the Limit of Systematic-Noisy Languages. In: Sakakibara, Y., Kobayashi, S., Sato, K., Nishino, T., Tomita, E. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2006. Lecture Notes in Computer Science(), vol 4201. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11872436_3

Download citation

  • DOI: https://doi.org/10.1007/11872436_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45264-5

  • Online ISBN: 978-3-540-45265-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics