Document Annotation Tool for News Content Analysis

  • Conference paper
Progress on Pattern Classification, Image Processing and Communications (CORES 2023, IP&C 2023)

Abstract

Every day, the average Internet user unintentionally consumes an abundance of content. We frequently hear the seemingly obvious remark that the modern world is full of data. We are bombarded with links to amusing content circulated by our friends, by various news and content providers, and on social media. Unfortunately, an increasing amount of this information is only loosely related to the truth. Some low-quality news content could be detected automatically by modern large language models (LLMs). However, training such models requires a large number of annotated articles. In this paper, we describe our tool for news content annotation. In particular, we explain our research methodology, the tool architecture, and the analysis of annotation quality. In our experiments, we engaged more than 100 volunteers, who annotated almost 4000 articles.
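The abstract does not say how annotation quality was measured. As an illustration only, one common way to quantify agreement between two annotators who labelled the same articles is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below assumes categorical labels; the function name `cohen_kappa` and the label values `"fake"` and `"real"` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Hypothetical helper for illustration; not the paper's method.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two annotators'
    # marginal frequencies, summed over all labels.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    # Both annotators used one identical label throughout: kappa is
    # undefined (0/0), so report perfect agreement by convention.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Example labels (assumed values, for illustration only).
annotator_1 = ["fake", "real", "fake", "real"]
annotator_2 = ["fake", "real", "real", "real"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

A value near 1 indicates strong agreement and a value near 0 indicates agreement no better than chance. With more than two annotators per article, a multi-rater measure such as Fleiss' kappa would be the usual substitute.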

Notes

  1. MongoDB homepage: https://www.mongodb.com/.

  2. Scrapy homepage: https://scrapy.org/.

Corresponding author

Correspondence to Rafał Kozik.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Gackowska, M., Katek, G., Śrutek, M., Kozik, R., Choraś, M. (2023). Document Annotation Tool for News Content Analysis. In: Burduk, R., Choraś, M., Kozik, R., Ksieniewicz, P., Marciniak, T., Trajdos, P. (eds) Progress on Pattern Classification, Image Processing and Communications. CORES 2023, IP&C 2023. Lecture Notes in Networks and Systems, vol 766. Springer, Cham. https://doi.org/10.1007/978-3-031-41630-9_21
