Abstract
We address the problem of extracting implicit and explicit relationships between entities in biomedical text. We argue that entities seldom occur in text in their simple form and that the relationships expressed in text hold between the modified, complex forms of entities. We present a rule-based method for (1) extracting such complex entities, (2) extracting the relationships between them, and (3) converting those relationships into RDF. Furthermore, we present results that demonstrate the utility of the generated RDF for discovering knowledge from text corpora by locating paths composed of the extracted relationships.
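To make the last two steps concrete, the following is a minimal sketch, not the method of the paper. It assumes the extracted relationships are already available as (subject, predicate, object) triples, serializes them as N-Triples-style lines, and locates a path between two entities with a breadth-first search. The triples, the `http://example.org/` namespace, and the function names `to_ntriples` and `find_path` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque

# Hypothetical toy triples (subject, predicate, object) for illustration only;
# they are not results reported in the paper.
TRIPLES = [
    ("fish_oil", "reduces", "blood_viscosity"),
    ("blood_viscosity", "aggravates", "raynauds_syndrome"),
    ("fish_oil", "contains", "eicosapentaenoic_acid"),
]


def to_ntriples(triples, base="http://example.org/"):
    """Serialize triples as N-Triples-style lines under an assumed namespace."""
    return [f"<{base}{s}> <{base}{p}> <{base}{o}> ." for s, p, o in triples]


def find_path(triples, source, target):
    """Return the shortest directed path from source to target as a list of
    triples, or None if the target cannot be reached."""
    # Index outgoing edges by subject.
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append((p, o))

    # Breadth-first search, carrying the path of triples taken so far.
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for p, o in graph.get(node, []):
            if o not in visited:
                visited.add(o)
                queue.append((o, path + [(node, p, o)]))
    return None


if __name__ == "__main__":
    for line in to_ntriples(TRIPLES):
        print(line)
    print(find_path(TRIPLES, "fish_oil", "raynauds_syndrome"))
```

Breadth-first search is used here only because it is the simplest way to return a shortest path; a real system would more likely query an RDF store and rank the candidate paths.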
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ramakrishnan, C., Kochut, K.J., Sheth, A.P. (2006). A Framework for Schema-Driven Relationship Discovery from Unstructured Text. In: Cruz, I., et al. (eds) The Semantic Web - ISWC 2006. ISWC 2006. Lecture Notes in Computer Science, vol 4273. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11926078_42
DOI: https://doi.org/10.1007/11926078_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49029-6
Online ISBN: 978-3-540-49055-5
eBook Packages: Computer Science (R0)