Patent2Vec: Multi-view representation learning on patent-graphs for patent classification

World Wide Web

Abstract

Patent classification has long been treated as a crucial task to support related services. Although considerable effort has gone into automatic patent classification, prior work mainly focuses on mining textual information such as titles and abstracts. Few approaches pay attention to metadata, e.g., the inventors and the assignee company, and the potential correlations captured by metadata-based graphs have been largely ignored. To that end, in this paper we develop a new paradigm for the patent classification task from the perspective of multi-view patent graph analysis, and propose a novel framework called Patent2Vec to learn low-dimensional representations of patents for patent classification. Specifically, we first apply graph representation learning to each individual graph, so that view-specific representations are learned by capturing the network structure and side information. Then, we propose a view enhancement module that enriches single-view representations by exploiting cross-view correlation knowledge. Afterward, we deploy an attention-based multi-view fusion method to obtain a refined representation for each patent, and further design a view alignment module to constrain the final fused representation in a relational embedding space that preserves latent relational information. Empirical results demonstrate that our model improves not only classification accuracy but also the interpretability of patent classification, as reflected in the multi-source data.
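To make the attention-based fusion step concrete, the following is a minimal sketch and not the authors' implementation. It assumes each patent already has one embedding per view and that a query vector scores the views by dot product; in a real model the query would be learned. The view names (`inventor`, `assignee`, `text`), the function names, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_views(view_embeddings, query):
    """Fuse the per-view embeddings of one patent with dot-product attention.

    view_embeddings: list of equal-length vectors, one per view (assumed given).
    query: vector of the same length; hypothetical stand-in for a learned
           attention parameter.
    Returns the fused vector and the attention weight of each view.
    """
    scores = [sum(q * x for q, x in zip(query, emb)) for emb in view_embeddings]
    weights = softmax(scores)
    dim = len(query)
    fused = [
        sum(w * emb[i] for w, emb in zip(weights, view_embeddings))
        for i in range(dim)
    ]
    return fused, weights


# Hypothetical view-specific embeddings for a single patent.
views = {
    "inventor": [1.0, 0.0, 0.0],
    "assignee": [0.0, 1.0, 0.0],
    "text": [0.0, 0.0, 1.0],
}
embeddings = list(views.values())

# A zero query scores every view equally, so the fusion is a plain average.
uniform_fused, uniform_weights = fuse_views(embeddings, [0.0, 0.0, 0.0])

# A query aligned with the first view shifts attention toward it.
biased_fused, biased_weights = fuse_views(embeddings, [2.0, 0.0, 0.0])
```

In this sketch the attention weights sum to one, so they can be read as the relative contribution of each view to the fused representation. That is one plausible way a fusion of this kind could support the interpretability the abstract refers to.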

Acknowledgments

This research was partially supported by grants from the National Key Research and Development Program of China (Grant No. 2018YFB1402600) and the National Natural Science Foundation of China (Grant No. 62072423).

Author information

Corresponding author

Correspondence to Tong Xu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Explainability in the Web

Guest Editors: Guandong Xu, Hongzhi Yin, Irwin King, and Lin Li

About this article

Cite this article

Fang, L., Zhang, L., Wu, H. et al. Patent2Vec: Multi-view representation learning on patent-graphs for patent classification. World Wide Web 24, 1791–1812 (2021). https://doi.org/10.1007/s11280-021-00885-4

  • DOI: https://doi.org/10.1007/s11280-021-00885-4
