Automated categorization in the international patent classification

Published: 01 April 2003

Abstract

A new reference collection of patent documents for training and testing automated categorization systems is established and described in detail. This collection is tailored for automating the attribution of international patent classification codes to patent applications and is made publicly available for future research work. We report the results of applying a variety of machine learning algorithms to the automated categorization of English-language patent documents. This procedure involves a complex hierarchical taxonomy, within which we classify documents into 114 classes and 451 subclasses. Several measures of categorization success are described and evaluated. We investigate how best to resolve the training problems related to the attribution of multiple classification codes to each patent document.
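
To make the class and subclass levels and the multiple-code issue concrete, the following is a minimal sketch and not the authors' method. It assumes the standard IPC symbol layout (section letter, two-digit class, subclass letter, as in "A01B 1/00") and uses one illustrative success measure: a prediction counts as correct if it matches any of the codes assigned to the document at the chosen level. The function names `ipc_levels` and `any_code_accuracy`, the measure itself, and the sample data are hypothetical; the abstract does not specify which measures the paper evaluates.

```python
import re

# Assumed IPC symbol layout: section letter (A-H), two-digit class,
# subclass letter, e.g. "A01B 1/00" -> section "A", class "A01", subclass "A01B".
_IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])")


def ipc_levels(symbol):
    """Split an IPC symbol into its section, class, and subclass prefixes."""
    match = _IPC_PATTERN.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"not an IPC symbol: {symbol!r}")
    section, class_digits, subclass_letter = match.groups()
    ipc_class = section + class_digits
    return {
        "section": section,
        "class": ipc_class,
        "subclass": ipc_class + subclass_letter,
    }


def any_code_accuracy(predicted, assigned, level):
    """Fraction of documents whose single predicted code matches
    any of the assigned codes at the given hierarchy level
    (hypothetical measure, chosen for illustration only)."""
    hits = 0
    for pred, codes in zip(predicted, assigned):
        gold = {ipc_levels(code)[level] for code in codes}
        if ipc_levels(pred)[level] in gold:
            hits += 1
    return hits / len(predicted)


# Made-up sample: one predicted code per document, one or more assigned codes.
predicted = ["A01B 1/00", "G06F 17/30", "H04L 9/00"]
assigned = [["A01B 3/00", "A01C 5/00"], ["G06N 3/02"], ["H04L 9/32"]]

class_accuracy = any_code_accuracy(predicted, assigned, "class")
subclass_accuracy = any_code_accuracy(predicted, assigned, "subclass")
```

In this made-up sample the second document is predicted in the right class (G06) but the wrong subclass (G06F instead of G06N), which shows why success at the class level and at the subclass level need to be reported separately.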

    Published In

    ACM SIGIR Forum, Volume 37, Issue 1
    Spring 2003
    43 pages
    ISSN: 0163-5840
    DOI: 10.1145/945546

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. IPC taxonomy
    2. automated categorization
    3. patent
    4. support vector machines
