DOI: 10.5555/1873781.1873870

Entity-focused sentence simplification for relation extraction

Published: 23 August 2010

Abstract

Relations between entities in text have been widely researched in the natural language processing and information-extraction communities. The region connecting a pair of entities (in a parsed sentence) is often used to construct kernels or feature vectors that can recognize and extract interesting relations. Such regions are useful, but they can also incorporate unnecessary distracting information. In this paper, we propose a rule-based method to remove the information that is unnecessary for relation extraction. Protein-protein interaction (PPI) is used as an example relation extraction problem. A dozen simple rules are defined on output from a deep parser. Each rule specifically examines the entities in one target interaction pair. These simple rules were tested using several PPI corpora. The PPI extraction performance was improved on all the PPI corpora.
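
To make the idea of pair-specific simplification concrete, here is a minimal sketch in Python. It is not one of the paper's rules: the paper defines its rules on deep-parser output, whereas this toy works on the raw string and only drops parenthesized spans that mention neither target entity. The function name `simplify_for_pair` and the placeholder entity names `PROT1`, `PROT2`, `PROT3` are assumptions made for illustration.

```python
import re


def simplify_for_pair(sentence: str, entity_a: str, entity_b: str) -> str:
    """Drop parenthesized spans that mention neither target entity.

    Hypothetical, string-level stand-in for a parser-based rule.
    Entity matching is plain substring matching, which is an assumption
    that only holds for distinct placeholder names like those below.
    """

    def keep_or_drop(match: re.Match) -> str:
        span = match.group(0)
        # Keep the parenthetical only if it contains a target entity.
        if entity_a in span or entity_b in span:
            return span
        return ""

    # Match a non-nested "( ... )" together with any whitespace before it.
    simplified = re.sub(r"\s*\([^()]*\)", keep_or_drop, sentence)
    # Collapse any doubled spaces left behind by the removal.
    return re.sub(r"\s{2,}", " ", simplified).strip()


sentence = "PROT1 (also known as PROT3) binds PROT2 (see Figure 1) in vitro."

# The same sentence is simplified differently for each target pair.
print(simplify_for_pair(sentence, "PROT1", "PROT2"))
print(simplify_for_pair(sentence, "PROT1", "PROT3"))
```

The point of the sketch is that simplification depends on the target pair: a span that is noise for one pair can carry an entity of another pair and must then be kept.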

Published In

COLING '10: Proceedings of the 23rd International Conference on Computational Linguistics
August 2010
1408 pages

Publisher

Association for Computational Linguistics

United States

Qualifiers

  • Research-article

Acceptance Rates

Overall Acceptance Rate 1,537 of 1,537 submissions, 100%

Cited By

  • (2021) Automated Text Simplification. ACM Computing Surveys 54:2 (1-36). DOI: 10.1145/3442695. Online publication date: 5-Mar-2021
  • (2015) Document Layout Optimization with Automated Paraphrasing. Proceedings of the 2015 ACM Symposium on Document Engineering (13-16). DOI: 10.1145/2682571.2797095. Online publication date: 8-Sep-2015
  • (2014) SimConcept. Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (138-146). DOI: 10.1145/2649387.2649420. Online publication date: 20-Sep-2014
  • (2014) Transforming graph-based sentence representations to alleviate overfitting in relation extraction. Proceedings of the 2014 ACM symposium on Document engineering (53-62). DOI: 10.1145/2644866.2644875. Online publication date: 16-Sep-2014
  • (2013) Learning bayesian network using parse trees for extraction of protein-protein interaction. Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume 2 (347-358). DOI: 10.1007/978-3-642-37256-8_29. Online publication date: 24-Mar-2013
  • (2011) Learning to simplify sentences using Wikipedia. Proceedings of the Workshop on Monolingual Text-To-Text Generation (1-9). DOI: 10.5555/2107679.2107680. Online publication date: 24-Jun-2011
  • (2011) Simple English Wikipedia. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2 (665-669). DOI: 10.5555/2002736.2002865. Online publication date: 19-Jun-2011
