Meta-Information Fusion of Hierarchical Semantics Dependency and Graph Structure for Structured Text Classification

Published: 20 February 2023

Abstract

Structured text, rich in hierarchical structure information, is an important part of real-world complex texts. Structured text classification is attracting increasing attention in natural language processing as application scenarios grow more complex. Most existing methods treat structured text from a local hierarchy perspective, modeling its semantic dependencies and its graph structure independently. However, compared to unstructured text, structured text exhibits global hierarchical structures with sophisticated dependencies, so existing methods cannot be applied directly across the variety of structured texts. The distinguishing information carried jointly by semantic dependencies and graph structure, which we refer to as meta-information, needs to be characterized more precisely. In this article, we propose HGMETA, a novel meta-information embedding framework for structured text classification, which obtains a fused embedding of hierarchical semantic dependencies and graph structure and distills meta-information from the fused features. To integrate global hierarchical features with the fused structured-text information, we design a hierarchical LDA module and a structured text embedding module. Specifically, we employ a multi-hop message passing mechanism to explicitly incorporate complex dependencies into a meta-graph, and we construct the meta-information from the meta-graph via neighborhood-based propagation to distill out redundant information. Furthermore, using an attention-based network, we exploit the complementarity of semantic dependencies and graph structure based on the global hierarchical features and the meta-information. Finally, the fused embedding and the meta-information can be straightforwardly combined for structured text classification. Experiments conducted on three real-world datasets show the effectiveness of meta-information and demonstrate the superiority of our method.
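To make the pipeline described above more concrete, the sketch below illustrates two of the ideas in isolation: multi-hop message passing over a meta-graph, and attention-based fusion of a semantic-dependency view with a graph-structure view conditioned on distilled meta-information. This is a minimal illustration written for this summary, not the authors' HGMETA implementation; the module names, dimensions, residual connection, mean-pooling step, and the toy classifier head are all assumptions.

```python
# Illustrative sketch only (assumed formulation, not the paper's released code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiHopMessagePassing(nn.Module):
    """Propagate node states over a row-normalized meta-graph for a fixed number of hops."""

    def __init__(self, dim: int, hops: int = 2):
        super().__init__()
        self.transforms = nn.ModuleList(nn.Linear(dim, dim) for _ in range(hops))

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (num_nodes, dim) node features; adj: (num_nodes, num_nodes) normalized adjacency.
        h = x
        for linear in self.transforms:
            h = F.relu(linear(adj @ h)) + h  # one hop of neighborhood aggregation plus a residual
        return h


class AttentionFusion(nn.Module):
    """Score each view (semantic vs. structural) against the meta-information and mix them."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(2 * dim, 1)

    def forward(self, views: torch.Tensor, meta: torch.Tensor) -> torch.Tensor:
        # views: (batch, num_views, dim); meta: (batch, dim) distilled meta-information.
        query = meta.unsqueeze(1).expand_as(views)
        weights = torch.softmax(self.score(torch.cat([views, query], dim=-1)), dim=1)
        return (weights * views).sum(dim=1)  # fused embedding: (batch, dim)


if __name__ == "__main__":
    dim, num_nodes, num_classes = 64, 10, 4
    nodes = torch.randn(num_nodes, dim)                          # toy meta-graph node features
    adj = torch.softmax(torch.randn(num_nodes, num_nodes), -1)   # toy row-normalized adjacency

    # Distill a single meta-information vector from the propagated meta-graph (mean pooling here).
    meta = MultiHopMessagePassing(dim, hops=2)(nodes, adj).mean(dim=0, keepdim=True)

    semantic = torch.randn(1, dim)    # stand-in for the hierarchical semantic-dependency embedding
    structural = torch.randn(1, dim)  # stand-in for the graph-structure embedding
    fused = AttentionFusion(dim)(torch.stack([semantic, structural], dim=1), meta)

    logits = nn.Linear(dim, num_classes)(fused)  # toy classification head
    print(logits.shape)  # torch.Size([1, 4])
```

In this toy form, the stacked neighborhood aggregations play the role of multi-hop propagation over the meta-graph, and the attention weights decide how much of each view enters the fused embedding consumed by the downstream classifier.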



Index Terms

  1. Meta-Information Fusion of Hierarchical Semantics Dependency and Graph Structure for Structured Text Classification

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.


    Published In

    ACM Transactions on Knowledge Discovery from Data, Volume 17, Issue 2
    February 2023
    355 pages
    ISSN: 1556-4681
    EISSN: 1556-472X
    DOI: 10.1145/3572847

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 20 February 2023
    Online AM: 17 May 2022
    Accepted: 08 May 2022
    Received: 07 August 2021
    Published in TKDD Volume 17, Issue 2


    Author Tags

    1. Structured text
    2. meta-information
    3. hierarchical semantics
    4. meta-graph

    Qualifiers

    • Research-article

    Funding Sources

    • National Natural Science Foundation of China

    Article Metrics

    • Downloads (last 12 months): 173
    • Downloads (last 6 weeks): 10
    Reflects downloads up to 21 Nov 2024.


    Cited By

    • (2024) Knowledge Graph-Based Hierarchical Text Semantic Representation. International Journal of Intelligent Systems, 2024. DOI: 10.1155/2024/5583270. Online publication date: 12-Jan-2024.
    • (2024) CoBjeason: Reasoning Covered Object in Image by Multi-Agent Collaboration Based on Informed Knowledge Graph. ACM Transactions on Knowledge Discovery from Data 18, 5, 1–56. DOI: 10.1145/3643565. Online publication date: 28-Feb-2024.
    • (2024) Credit Card Fraud Detection via Intelligent Sampling and Self-supervised Learning. ACM Transactions on Intelligent Systems and Technology 15, 2, 1–29. DOI: 10.1145/3641283. Online publication date: 28-Mar-2024.
    • (2024) Relevance Feedback with Brain Signals. ACM Transactions on Information Systems 42, 4, 1–37. DOI: 10.1145/3637874. Online publication date: 9-Feb-2024.
    • (2024) Evolving Knowledge Graph Representation Learning with Multiple Attention Strategies for Citation Recommendation System. ACM Transactions on Intelligent Systems and Technology 15, 2, 1–26. DOI: 10.1145/3635273. Online publication date: 28-Mar-2024.
    • (2023) Multi-aspect Graph Contrastive Learning for Review-enhanced Recommendation. ACM Transactions on Information Systems 42, 2, 1–29. DOI: 10.1145/3618106. Online publication date: 8-Nov-2023.
    • (2023) Reinforced PU-learning with Hybrid Negative Sampling Strategies for Recommendation. ACM Transactions on Intelligent Systems and Technology 14, 3, 1–25. DOI: 10.1145/3582562. Online publication date: 8-May-2023.
    • (2023) Learn to be Fair without Labels: A Distribution-based Learning Framework for Fair Ranking. Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval, 23–32. DOI: 10.1145/3578337.3605132. Online publication date: 9-Aug-2023.
    • (2023) Generative Relevance Feedback with Large Language Models. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2026–2031. DOI: 10.1145/3539618.3591992. Online publication date: 19-Jul-2023.
    • (2023) Augmenting Passage Representations with Query Generation for Enhanced Cross-Lingual Dense Retrieval. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1827–1832. DOI: 10.1145/3539618.3591952. Online publication date: 19-Jul-2023.
