Abstract

We report on a set of experiments carried out in the context of the Flemish OntoBasis project. Our purpose is to extract semantic relations from text corpora in an unsupervised way and to use the output as preprocessed material for the construction of ontologies from scratch. The experiments are evaluated both quantitatively and in an "impressionistic" manner.

We have worked on two corpora: a 13M-word corpus composed of Medline abstracts related to proteins (SwissProt), and a small legal corpus (EU VAT directive) consisting of 43K words. Using a shallow parser, we select functional relations from the syntactic structure subject-verb-direct-object. Those functional relations correspond to what is called a "lexon". The selection uses prepositional structures and statistical measures in order to retain the most relevant lexons. The paper therefore stresses the filtering carried out to automatically discard all irrelevant structures.
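The selection step can be illustrated with a minimal sketch. It assumes the shallow parser has already produced subject-verb-object triples, and it uses a plain frequency threshold as a stand-in for the statistical measures, which the abstract does not specify. The names `Triple`, `select_lexons` and `min_count`, as well as the sample triples, are hypothetical and not taken from the paper.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class Triple(NamedTuple):
    """One subject-verb-direct-object relation (a candidate lexon)."""
    subject: str
    verb: str
    obj: str


def select_lexons(triples: Iterable[Triple], min_count: int = 2) -> List[Tuple[Triple, int]]:
    """Keep triples seen at least `min_count` times, most frequent first.

    The frequency cut-off is an assumption made for illustration only;
    it is not the filtering method used in the paper.
    """
    counts = Counter(triples)
    kept = [(triple, n) for triple, n in counts.items() if n >= min_count]
    # Sort by descending frequency, then alphabetically for a stable order.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return kept


# Invented example input, standing in for shallow parser output.
parsed = [
    Triple("protein", "bind", "receptor"),
    Triple("protein", "bind", "receptor"),
    Triple("enzyme", "catalyse", "reaction"),
    Triple("protein", "bind", "receptor"),
    Triple("enzyme", "catalyse", "reaction"),
    Triple("study", "show", "result"),
]

for triple, n in select_lexons(parsed):
    print(n, triple.subject, triple.verb, triple.obj)
```

In this toy input the triple that occurs only once is discarded, which mirrors the idea of filtering out structures that are unlikely to be relevant to the domain.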

Domain experts have evaluated the precision of the outcomes on the SwissProt corpus. The global precision has been rated at 55%, with a precision of 42% for the functional relations (lexons) and 76% for the prepositional relations. For the VAT corpus, a knowledge engineer has judged that the outcomes are useful to support, and can speed up, his modelling task. In addition, a quantitative scoring method has been applied, with coverage and accuracy measures yielding scores of 52.38% and 47.12% respectively.
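The relation between the three precision figures can be made concrete with a small sketch. It assumes that the global precision is a pooled figure, that is, a weighted average of the two per-type precisions with weights equal to each type's share of the evaluated relations. The abstract does not state this, so the share computed below is only what that assumption would imply, not a result reported by the authors. The function names are hypothetical.

```python
def pooled_precision(p_a: float, share_a: float, p_b: float) -> float:
    """Precision over two pooled groups, where group A makes up `share_a` of the items."""
    if not 0.0 <= share_a <= 1.0:
        raise ValueError("share_a must lie between 0 and 1")
    return share_a * p_a + (1.0 - share_a) * p_b


def implied_share(p_global: float, p_a: float, p_b: float) -> float:
    """Share of group A that makes the pooled precision equal `p_global`.

    Solves p_global = s * p_a + (1 - s) * p_b for s.
    """
    if p_a == p_b:
        raise ValueError("the share is undetermined when both precisions are equal")
    return (p_global - p_b) / (p_a - p_b)


# Figures from the abstract: 55% global, 42% lexons, 76% prepositional relations.
share_lexons = implied_share(0.55, 0.42, 0.76)
print(f"implied share of lexons under the pooling assumption: {share_lexons:.3f}")
print(f"check: {pooled_precision(0.42, share_lexons, 0.76):.2f}")
```

Under this pooling assumption, lexons would account for a little over 60% of the relations judged by the experts. If the global figure was obtained differently (for example as a separate expert rating), this reading does not apply.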

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Reinberger, ML., Spyns, P., Pretorius, A.J., Daelemans, W. (2004). Automatic Initiation of an Ontology. In: Meersman, R., Tari, Z. (eds) On the Move to Meaningful Internet Systems 2004: CoopIS, DOA, and ODBASE. OTM 2004. Lecture Notes in Computer Science, vol 3290. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30468-5_39

  • DOI: https://doi.org/10.1007/978-3-540-30468-5_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23663-4

  • Online ISBN: 978-3-540-30468-5

  • eBook Packages: Springer Book Archive
