Abstract
Within the last decade, text mining, i.e., extracting sensitive information from text corpora, has become a major factor in business intelligence. The automated textual analysis of law corpora is highly valuable because of its impact on a company’s legal options and the sheer amount of available jurisdiction. The study of supreme court jurisdiction and international law corpora is equally important due to its effects on business sectors.
In this paper, we use text mining methods to investigate Austrian supreme administrative court jurisdictions concerning dues and taxes. We analyze the law corpora using R with the new text mining package tm. Applications include clustering the jurisdiction documents into groups that model tax classes (such as income tax or value-added tax) and identifying jurisdiction properties. The findings are compared to results obtained by law experts.
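The clustering application mentioned above can be illustrated with a small, language-neutral sketch. It is written in Python with the standard library only and does not use R or the tm package. The abstract does not say which document representation or clustering algorithm the authors apply, so TF-IDF weighting and a cosine-based k-means are assumptions chosen for illustration. The four toy documents are hypothetical English snippets, not excerpts from the court corpus.

```python
import math
from collections import Counter

# Hypothetical toy "decisions"; the real court corpus is not reproduced here.
DOCS = [
    "income tax assessment of employee salary income",
    "salary income and income tax deduction for employee",
    "value added tax on invoice for delivered goods",
    "invoice correction and value added tax on goods",
]


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors) for whitespace tokens."""
    tokenised = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenised for t in doc})
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(t for doc in tokenised for t in set(doc))
    vectors = []
    for doc in tokenised:
        tf = Counter(doc)
        vec = [tf[t] * math.log(n / df[t]) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def cosine(a, b):
    """Cosine similarity of two vectors that are already L2-normalised."""
    return sum(x * y for x, y in zip(a, b))


def kmeans(vectors, k, iterations=10):
    """Cosine-based k-means with deterministic farthest-first seeding."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Seed with the vector least similar to its closest existing centroid.
        nxt = min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        centroids.append(nxt)
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each document joins its most similar centroid.
        labels = [
            max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors
        ]
        # Update step: re-normalised mean of each cluster's members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                norm = math.sqrt(sum(x * x for x in mean)) or 1.0
                centroids[j] = [x / norm for x in mean]
    return labels


_, vectors = tfidf_vectors(DOCS)
labels = kmeans(vectors, k=2)
print(labels)  # → [0, 0, 1, 1]
```

In this toy setting the two income tax snippets fall into one cluster and the two value-added tax snippets into the other. The word "tax" occurs in every document, so its inverse document frequency is zero and it does not influence the grouping.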
References
ACHATZ, M., KAMPER, K., and RUPPE, H. (1987): Die Rechtsprechung des VwGH in Abgabensachen. Orac Verlag, Wien.
CONRAD, J., AL-KOFAHI, K., ZHAO, Y. and KARYPIS, G. (2005): Effective Document Clustering for Large Heterogeneous Law Firm Collections. In: 10th International Conference on Artificial Intelligence and Law (ICAIL). 177-187.
FEINERER, I. (2007): tm: Text Mining Package, R package version 0.1-2.
HORNIK, K. (2007): Snowball: Snowball Stemmers, R package version 0.0-1.
KARATZOGLOU, A. and FEINERER, I. (2007): Text Clustering with String Kernels in R. In: Advances in Data Analysis (Proceedings of the 30th Annual Conference of the GfKl). 91-98. Springer-Verlag.
KARATZOGLOU, A., SMOLA, A. and HORNIK, K. (2006): kernlab: Kernel-based machine learning methods including support vector machines, R package version 0.9-1.
KARATZOGLOU, A., SMOLA, A., HORNIK, K. and ZEILEIS, A. (2004): kernlab — An S4 Package for Kernel Methods in R. Journal of Statistical Software, 11(9), 1-20.
LODHI, H., SAUNDERS, C., SHAWE-TAYLOR, J., WATKINS, C., and CRISTIANINI, N. (2002): Text classification using string kernels. Journal of Machine Learning Research, 2, 419-444.
NAGEL, H. and MAMUT, M. (2006): Rechtsprechung des VwGH in Abgabensachen 2000-2004.
PORTER, M. (1980): An algorithm for suffix stripping. Program, 14(3), 130-137.
R DEVELOPMENT CORE TEAM (2006): R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org/.
SCHWEIGHOFER, E. (1999): Legal Knowledge Representation, Automatic Text Analysis in Public International and European Law. Kluwer Law International, Law and Electronic Commerce, Volume 7, The Hague. ISBN 9041111484.
TEMPLE LANG, D. (2006): Rstem: Interface to Snowball implementation of Porter’s word stemming algorithm, R package version 0.3-1.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Feinerer, I., Hornik, K. (2008). Text Mining of Supreme Administrative Court Jurisdictions. In: Preisach, C., Burkhardt, H., Schmidt-Thieme, L., Decker, R. (eds) Data Analysis, Machine Learning and Applications. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78246-9_67
DOI: https://doi.org/10.1007/978-3-540-78246-9_67
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78239-1
Online ISBN: 978-3-540-78246-9
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)