Abstract
This paper describes Vulcain, an information extraction system dedicated to message filtering for a specific domain. It focuses on a method for identifying domain-specific terms and concepts using syntactic information and an existing domain ontology. Terms are identified by partial syntactic analysis based on Tree Adjoining Grammars (TAG). The domain ontology is represented in description logics (DL), and DL inference mechanisms are used to validate the candidate concepts.
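The two-stage idea in the abstract (propose candidate terms from syntax, then validate them against the ontology) can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: a flat part-of-speech pattern stands in for partial TAG parsing, and a walk over an is-a hierarchy stands in for description-logic subsumption. The ontology, the lexicon, the tag names, and every function name below are hypothetical, and the head noun is assumed to be the last word of the chunk, as in English.

```python
# Hypothetical toy ontology: concept -> set of direct parent concepts.
ONTOLOGY = {
    "Thing": set(),
    "Device": {"Thing"},
    "Firewall": {"Device"},
    "Attack": {"Thing"},
    "Intrusion": {"Attack"},
}

# Hypothetical lexicon linking head nouns to ontology concepts.
LEXICON = {"firewall": "Firewall", "intrusion": "Intrusion", "attack": "Attack"}


def subsumes(general, specific, ontology=ONTOLOGY):
    """Return True if `general` equals `specific` or is one of its ancestors."""
    stack, seen = [specific], set()
    while stack:
        concept = stack.pop()
        if concept == general:
            return True
        if concept in seen:
            continue
        seen.add(concept)
        stack.extend(ontology.get(concept, ()))
    return False


def candidate_terms(tagged):
    """Collect runs of ADJ/NOUN tokens that end on a noun (a stand-in for
    partial syntactic analysis; a real system would use a parser)."""
    chunks, current = [], []
    for token, tag in list(tagged) + [("", "END")]:
        if tag in ("ADJ", "NOUN"):
            current.append((token, tag))
            continue
        # Trim trailing adjectives so the chunk ends on its head noun.
        while current and current[-1][1] != "NOUN":
            current.pop()
        if current:
            chunks.append([tok for tok, _ in current])
        current = []
    return chunks


def validate(chunk, domain_root):
    """Map the head noun to a concept and keep it only if the domain root
    subsumes that concept; otherwise return None."""
    concept = LEXICON.get(chunk[-1].lower())
    if concept is not None and subsumes(domain_root, concept):
        return concept
    return None


sentence = [
    ("The", "DET"), ("new", "ADJ"), ("firewall", "NOUN"), ("blocked", "VERB"),
    ("a", "DET"), ("network", "NOUN"), ("intrusion", "NOUN"), ("yesterday", "ADV"),
]
terms = candidate_terms(sentence)
validated = [(" ".join(t), validate(t, "Attack")) for t in terms]
```

With `"Attack"` as the domain root, the candidate "network intrusion" is retained and "new firewall" is rejected, which mirrors, in a very reduced form, how an ontology can filter syntactically plausible candidates.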
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
Cite this paper
Todirascu, A., Romary, L., Bekhouche, D. (2002). Vulcain — An Ontology-Based Information Extraction System. In: Andersson, B., Bergholtz, M., Johannesson, P. (eds) Natural Language Processing and Information Systems. NLDB 2002. Lecture Notes in Computer Science, vol 2553. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36271-1_6
DOI: https://doi.org/10.1007/3-540-36271-1_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00307-6
Online ISBN: 978-3-540-36271-5
eBook Packages: Springer Book Archive