DOI: 10.1145/2330784.2331000
poster

Automatic generation of regular expressions from examples with genetic programming

Published: 07 July 2012

Abstract

We explore the practical feasibility of a system based on genetic programming (GP) for the automatic generation of regular expressions. The user describes the desired task by providing a set of labeled examples, in the form of text lines. The system uses these examples to drive the evolutionary search towards a regular expression suitable for the specified task. Using the system should require familiarity with neither GP nor regular expression syntax. In our GP implementation each individual represents a syntactically correct regular expression. We performed an experimental evaluation on two different extraction tasks applied to real-world datasets and obtained promising results in terms of precision and recall, even in comparison to an earlier state-of-the-art proposal.
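The abstract does not spell out the representation or the fitness function, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a hypothetical tree encoding (nested tuples with `concat`, `or`, and `repeat` nodes) that yields a syntactically valid pattern by construction, and it scores a candidate by exact-match extraction on made-up labeled lines. The node names, the example data, and the scoring rule are all illustrative assumptions.

```python
import re

# Hypothetical labeled examples: (text line, substring to extract, or None).
EXAMPLES = [
    ("call 555-1234 today", "555-1234"),
    ("order id 99", None),
    ("fax: 555-9876", "555-9876"),
    ("no digits here", None),
]


def render(node):
    """Turn an (assumed) expression tree into a regex string."""
    if isinstance(node, str):  # leaf: a literal or character class
        return node
    op, *args = node
    if op == "concat":
        return "".join(render(a) for a in args)
    if op == "or":
        return "(?:" + "|".join(render(a) for a in args) + ")"
    if op == "repeat":  # ("repeat", child, count)
        child, count = args
        return "(?:" + render(child) + "){" + str(count) + "}"
    raise ValueError(f"unknown node type: {op}")


def precision_recall(pattern, examples):
    """Score a candidate regex by exact-match extraction on labeled lines."""
    regex = re.compile(pattern)
    true_pos = extracted = expected = 0
    for line, target in examples:
        match = regex.search(line)
        found = match.group(0) if match else None
        if found is not None:
            extracted += 1
        if target is not None:
            expected += 1
        if found is not None and found == target:
            true_pos += 1
    precision = true_pos / extracted if extracted else 0.0
    recall = true_pos / expected if expected else 0.0
    return precision, recall


# One candidate individual: three digits, a hyphen, four digits.
candidate = ("concat", ("repeat", r"\d", 3), "-", ("repeat", r"\d", 4))
scores = precision_recall(render(candidate), EXAMPLES)
```

In a GP loop, scores such as these could feed the fitness that ranks individuals. Because every individual is a tree over valid operators, crossover and mutation on subtrees cannot produce a malformed pattern, provided the leaves themselves are valid fragments.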

    Published In

    GECCO '12: Proceedings of the 14th annual conference companion on Genetic and evolutionary computation
    July 2012
    1586 pages
    ISBN:9781450311786
    DOI:10.1145/2330784

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. genetic programming
    2. regular expression

    Qualifiers

    • Poster

    Conference

    GECCO '12: Genetic and Evolutionary Computation Conference
    July 7 - 11, 2012
    Philadelphia, Pennsylvania, USA

    Acceptance Rates

    Overall Acceptance Rate 1,669 of 4,410 submissions, 38%

    Cited By

    • (2024) Automatic Regular Expression Generation for Extracting Relevant Image Data From Web Pages Using Genetic Algorithms. IEEE Access 12, 90660-90669. DOI: 10.1109/ACCESS.2024.3420734. Online publication date: 2024
    • (2024) Computational peptide discovery with a genetic programming approach. Journal of Computer-Aided Molecular Design 38:1. DOI: 10.1007/s10822-024-00558-0. Online publication date: 3-Apr-2024
    • (2022) Exploiting input sanitization for regex denial of service. Proceedings of the 44th International Conference on Software Engineering, 883-895. DOI: 10.1145/3510003.3510047. Online publication date: 21-May-2022
    • (2022) Synthesis of Java Deserialisation Filters from Examples. 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC), 736-745. DOI: 10.1109/COMPSAC54236.2022.00123. Online publication date: Jun-2022
    • (2022) Learning Markov Decision Processes Based on Genetic Programming. 2022 2nd Asia Conference on Information Engineering (ACIE), 72-76. DOI: 10.1109/ACIE55485.2022.00023. Online publication date: Jan-2022
    • (2021) Automatic Search-and-Replace From Examples With Coevolutionary Genetic Programming. IEEE Transactions on Cybernetics 51:5, 2612-2624. DOI: 10.1109/TCYB.2019.2918337. Online publication date: May-2021
    • (2021) Automatic generation of regular expressions for the Regex Golf challenge using a local search algorithm. Genetic Programming and Evolvable Machines 23:1, 105-131. DOI: 10.1007/s10710-021-09411-x. Online publication date: 1-Oct-2021
    • (2020) Cognification of Program Synthesis—A Systematic Feature-Oriented Analysis and Future Direction. Computers 9:2, 27. DOI: 10.3390/computers9020027. Online publication date: 12-Apr-2020
    • (2020) Learning Regular Expressions for Interpretable Medical Text Classification Using a Pool-based Simulated Annealing Approach. 2020 IEEE Congress on Evolutionary Computation (CEC), 1-7. DOI: 10.1109/CEC48606.2020.9185650. Online publication date: Jul-2020
    • (2020) Data-Driven Regular Expressions Evolution for Medical Text Classification Using Genetic Programming. 2020 IEEE Congress on Evolutionary Computation (CEC), 1-8. DOI: 10.1109/CEC48606.2020.9185500. Online publication date: Jul-2020
