4. Conclusion
We presented HITEC, an automated text classifier, and its application to categorizing the English and German patent collections of WIPO under the IPC taxonomy. The IPC covers all areas of technology and is currently used by the industrial property offices of many countries. Patent classification is indispensable for the retrieval of patent documents in the search for prior art. Such retrieval is crucial to patent-issuing authorities, potential inventors, research and development units, and others concerned with the application or development of technology. An efficient automated patent classifier is a crucial component of an automated classification assistance system for categorizing patent applications in the IPC, which is a main aim at WIPO (Fall et al., 2002). HITEC is a promising candidate for this purpose.
References
Aas, K. and Eikvil, L. (1999). Text categorisation: A survey. Report NR 941, Norwegian Computing Center.
Baker, L. D. and McCallum, A. K. (1998). Distributional clustering of words for text classification. In Proc. of the 21st Annual Int. ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’98), pages 96–103, Melbourne, Australia.
Chakrabarti, S., Dom, B., Agrawal, R., and Raghavan, P. (1998). Scalable feature selection, classification and signature generation for organizing large text databases into hierarchical topic taxonomies. The VLDB Journal, 7(3):163–178.
Fall, C. J., Törcsvári, A., Benzineb, K., and Karetka, G. (2003a). Automated categorization in the international patent classification. ACM SIGIR Forum, 37(1):10–25.
Fall, C. J., Törcsvári, A., Fiévet, P., and Karetka, G. (2003b). Additional readme information for WIPO-de autocategorization data set. http://www.wipo.int/ibis/datasets/wipo-de-readme.html.
Fall, C. J., Törcsvári, A., and Karetka, G. (2002). Readme information for WIPO-alpha autocategorization training set. http://www.wipo.int/ibis/datasets/wipo-alpha-readme.html.
Koller, D. and Sahami, M. (1997). Hierarchically classifying documents using a very few words. In International Conference on Machine Learning, volume 14, San Mateo, CA. Morgan-Kaufmann.
McCallum, A., Rosenfeld, R., Mitchell, T., and Ng, A. (1998). Improving text classification by shrinkage in a hierarchy of classes. In Proc. of ICML-98. http://www-2.cs.cmu.edu/~mccallum/papers/hier-icml98.ps.gz.
Salton, G. and McGill, M. J. (1983). An Introduction to Modern Information Retrieval. McGraw-Hill.
Sebastiani, F. (2002). Machine learning in automated text categorization. ACM Computing Surveys, 34(1):1–47.
Tikk, D. and Biró, G. (2003). Experiments with multilabel text classifier on the Reuters collection. In International Conference on Computational Cybernetics (ICCC03), pages 33–38, Siófok, Hungary.
Tikk, D., Yang, J. D., and Bang, S. L. (2003). Hierarchical text categorization using fuzzy relational thesaurus. Kybernetika, 39(5):583–600.
van Rijsbergen, C. J. (1979). Information Retrieval. Butterworths, London, 2nd edition. http://www.dcs.gla.ac.uk/Keith.
Weiss, S. M., Apte, C., Damerau, F. J., Johnson, D. E., Oles, F. J., Goetz, T., and Hampp, T. (1999). Maximizing text-mining performance. IEEE Intelligent Systems, 14(4):2–8.
Wibowo, W. and Williams, H. E. (2002). Simple and accurate feature selection for hierarchical categorisation. In Proc. of the 2002 ACM Symposium on Document Engineering, pages 111–118, McLean, Virginia, USA.
Wiener, E., Pedersen, J. O., and Weigend, A. S. (1993). A neural network approach to topic spotting. In Proc. of the 4th Annual Symposium on Document Analysis and Information Retrieval, pages 22–34.
Yang, Y. (1999). An evaluation of statistical approaches to text categorization. Information Retrieval, 1(1–2):69–90. http://citeseer.nj.nec.com/yang97evaluation.html.
Copyright information
© 2005 Springer Science+Business Media, Inc.
About this chapter
Cite this chapter
Tikk, D., Biró, G., Yang, J.D. (2005). Experiment with a Hierarchical Text Categorization Method on WIPO Patent Collections. In: Attoh-Okine, N.O., Ayyub, B.M. (eds) Applied Research in Uncertainty Modeling and Analysis. International Series in Intelligent Technologies, vol 20. Springer, Boston, MA. https://doi.org/10.1007/0-387-23550-7_13
DOI: https://doi.org/10.1007/0-387-23550-7_13
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-23535-6
Online ISBN: 978-0-387-23550-9
eBook Packages: Mathematics and Statistics (R0)