
Evolutionary learning of syntax patterns for genic interaction extraction


A Bartoli, A De Lorenzo, E Medvet, F Tarlao, M Virgolin

Proceedings of the 2015 Annual Conference on Genetic and Evolutionary …, 2015 (dl.acm.org)

There is increasing interest in techniques for automatic relation extraction from unstructured text. The biomedical domain, in particular, may greatly benefit from those techniques because of the huge and ever-increasing number of scientific publications describing observed phenomena of potential clinical interest. In this paper, we consider the problem of automatically identifying sentences that contain interactions between genes and proteins, based solely on a dictionary of genes and proteins and a small set of sample sentences in natural language. We propose an evolutionary technique for learning a classifier that is capable of detecting the desired sentences within scientific publications with high accuracy. The key feature of our proposal, which is internally based on Genetic Programming, is the construction of a model of the relevant syntax patterns in terms of standard part-of-speech annotations. The model consists of a set of regular expressions that are learned automatically despite the large alphabet size involved. We assess our approach on two realistic datasets and obtain 74% accuracy, a value that is high enough to be of practical interest and is in line with significant baseline methods.
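To make the idea of a model built from regular expressions over part-of-speech annotations concrete, here is a minimal sketch. It is not the authors' implementation: the tag-to-symbol mapping, the single hand-written pattern, and the function names `encode` and `contains_interaction` are all hypothetical, and the pattern is written by hand here, whereas the paper learns its patterns with Genetic Programming. The sketch assumes each sentence has already been tagged and that dictionary hits for genes and proteins have been replaced by a `GENE` tag.

```python
import re

# Hypothetical mapping from annotations to single characters, so that a
# sentence becomes a string a regular expression can scan.
# "GENE" stands for a token found in the gene/protein dictionary.
TAG_SYMBOLS = {
    "GENE": "G",
    "VB": "v",
    "VBZ": "z",
    "NN": "n",
    "IN": "i",
    "DT": "d",
    "JJ": "j",
}

def encode(tags):
    """Turn a sequence of annotations into a string, one symbol per token."""
    return "".join(TAG_SYMBOLS.get(tag, "?") for tag in tags)

# Illustrative hand-written pattern (an assumption, not a learned one):
# two dictionary entries with a verb somewhere between them.
PATTERNS = [re.compile(r"G[^G]*[vz][^G]*G")]

def contains_interaction(tags):
    """Classify a sentence as positive if any pattern in the set matches."""
    encoded = encode(tags)
    return any(pattern.search(encoded) for pattern in PATTERNS)
```

Under these assumptions, a sentence tagged `GENE VBZ DT NN IN GENE` is classified as positive, while `GENE IN GENE` is not, because no verb separates the two dictionary entries.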
