DOI: 10.1145/3638584.3638631
research-article

Analyzing the 'Belt and Road' topic of overseas Chinese media based on the BERTopic modeling

Published: 14 March 2024

Abstract

The year 2023 marks the 10th anniversary of the "Belt and Road" initiative. Overseas Chinese-language media are an important channel through which overseas Chinese access information, a window for external communication and cultural exchange, and an important force in the construction of the "Belt and Road". To better explore topic information related to the "Belt and Road" in overseas Chinese-language media, this study borrows the idea of experimental engineering and uses an orthogonal experimental method to study four important parameters of the BERTopic model. The best parameter values were selected by analyzing main effects, a BERTopic model was constructed with them, and the content and evolution trends of the topics were analyzed based on the model results. The method provides a new approach to text topic mining and a useful reference for BERTopic topic modeling.
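
As a rough illustration of the orthogonal-experiment idea described in the abstract, the sketch below runs a standard L9(3^4) orthogonal array over four three-level factors and picks the best level of each factor from its main effect (the mean score over the runs at that level). The abstract does not name the four parameters, their levels, or the evaluation metric, so the factor names (borrowed from UMAP and HDBSCAN), the level values, and the `score` function are all assumptions made for illustration; in a real study `score` would fit a BERTopic model and return a topic-quality measure.

```python
from collections import defaultdict

# Standard L9(3^4) orthogonal array: 9 runs, 4 factors, 3 levels each (0..2).
# Every pair of columns contains each of the 9 level combinations exactly once.
L9 = [
    (0, 0, 0, 0),
    (0, 1, 1, 1),
    (0, 2, 2, 2),
    (1, 0, 1, 2),
    (1, 1, 2, 0),
    (1, 2, 0, 1),
    (2, 0, 2, 1),
    (2, 1, 0, 2),
    (2, 2, 1, 0),
]

# Hypothetical factors and levels; the paper's actual choices are not given
# in the abstract.
FACTORS = {
    "n_neighbors": [10, 15, 30],
    "n_components": [5, 10, 15],
    "min_cluster_size": [10, 20, 40],
    "min_samples": [5, 10, 15],
}


def score(params):
    """Placeholder quality score (higher is better).

    Assumption: stands in for fitting a topic model with `params` and
    computing a metric such as topic coherence.
    """
    return (
        -0.01 * abs(params["n_neighbors"] - 15)
        - 0.02 * abs(params["n_components"] - 5)
        - 0.005 * abs(params["min_cluster_size"] - 20)
        - 0.01 * abs(params["min_samples"] - 10)
    )


def main_effects(design, factors, score_fn):
    """Return, per factor, the mean score observed at each of its levels."""
    names = list(factors)
    sums = {name: defaultdict(float) for name in names}
    counts = {name: defaultdict(int) for name in names}
    for run in design:
        params = {name: factors[name][lvl] for name, lvl in zip(names, run)}
        y = score_fn(params)
        for name in names:
            sums[name][params[name]] += y
            counts[name][params[name]] += 1
    return {
        name: {v: sums[name][v] / counts[name][v] for v in factors[name]}
        for name in names
    }


effects = main_effects(L9, FACTORS, score)
# For each factor, keep the level with the highest mean score.
best = {name: max(levels, key=levels.get) for name, levels in effects.items()}
print(best)
```

The design needs only 9 model fits instead of the 81 a full grid over four three-level factors would require. Main-effects selection assumes the factors act roughly additively; strong interactions between parameters would call for a follow-up check of the selected combination.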



    Published In

    CSAI '23: Proceedings of the 2023 7th International Conference on Computer Science and Artificial Intelligence
    December 2023
    563 pages
    ISBN:9798400708688
    DOI:10.1145/3638584

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. BERTopic
    2. Belt and Road initiative
    3. Topic Modeling
    4. Orthogonal Experiment

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    CSAI 2023
