Abstract
This paper presents a semantic case-based reasoning framework for text categorization. Text categorization is the task of classifying text documents under predefined categories.
Accidentology is our application field. The goal of our framework is to classify documents describing real road accidents under predefined road accident prototypes, which are themselves described by text documents. Accidents are described by accident reports, while accident prototypes are described by accident scenarios. Text categorization is thus done by assigning each accident report to an accident scenario, which highlights the particular mechanisms leading to the accident.
We propose a textual case-based reasoning (TCBR) approach, which allows us to integrate both textual and domain knowledge aspects in order to carry out this categorization. CBR solves a new problem (target case) by identifying its similarity to one or several previously solved problems (source cases) stored in a case base and by adapting their known solutions. Cases in our framework are created from text. Most TCBR applications create cases from text by using Information Retrieval techniques, which leads to knowledge-poor descriptions of cases. We show that using semantic resources (two ontologies of accidentology) makes it possible to overcome this difficulty and allows us to enrich cases with formal knowledge.
In this paper, we argue that semantic resources are likely to improve the quality of cases created from text, and, therefore, such resources can support the reasoning cycle. We illustrate this claim with our framework developed to classify documents in the accidentology domain.
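The retrieval step of the CBR cycle described above, in which a target case is matched against source cases in a case base, can be illustrated with a minimal sketch. It is not the framework's actual method: the `Case` class, the `jaccard` and `retrieve` functions, the representation of a case as a flat set of concepts, the choice of Jaccard overlap as the similarity measure, and all data are assumptions made for illustration only. Adaptation of the retrieved solution is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """A case reduced to a set of domain concepts (hypothetical representation)."""
    name: str
    concepts: frozenset


def jaccard(a, b):
    """Jaccard overlap between two concept sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(target, case_base):
    """Return the most similar source case and its similarity to the target."""
    if not case_base:
        raise ValueError("case base is empty")
    best = max(case_base, key=lambda source: jaccard(target.concepts, source.concepts))
    return best, jaccard(target.concepts, best.concepts)


# Invented toy data: two accident scenarios (source cases) and one report (target case).
scenarios = [
    Case("pedestrian_crossing", frozenset({"pedestrian", "crossing", "urban_road", "car"})),
    Case("loss_of_control_in_bend", frozenset({"bend", "wet_road", "car", "speed"})),
]
report = Case("report_001", frozenset({"car", "bend", "speed", "night"}))

scenario, score = retrieve(report, scenarios)
print(scenario.name, round(score, 2))  # loss_of_control_in_bend 0.6
```

In this toy setting the report shares three of five distinct concepts with the second scenario, so it is assigned to that scenario. A knowledge-richer variant would presumably replace the flat set overlap with a similarity that exploits relations between ontology concepts.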
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ceausu, V., Desprès, S. (2007). A Semantic Case-Based Reasoning Framework for Text Categorization. In: Aberer, K., et al. The Semantic Web. ISWC 2007, ASWC 2007. Lecture Notes in Computer Science, vol 4825. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76298-0_53
DOI: https://doi.org/10.1007/978-3-540-76298-0_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76297-3
Online ISBN: 978-3-540-76298-0
eBook Packages: Computer Science (R0)