Weakly-Supervised Multimodal Learning for Predicting the Gender of Twitter Users

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13913))

Abstract

Social media platforms such as Twitter are significant sources of information, with users posting vast amounts of content every day. Analyzing this content can offer valuable insights for commercial and research purposes. To understand the information comprehensively, it is crucial to consider the demographics of users, with gender being a particularly important factor. However, the gender of Twitter users is usually not available, which makes predicting it from tweet data challenging. In this paper, we introduce a weakly supervised method that builds the supervision data automatically. The experimental results show that our weak supervision component can generate well-annotated data automatically with an accuracy exceeding 85%. Furthermore, we conduct a comparative analysis of multimodal learning architectures for predicting the gender of Twitter users from the weak supervision data. We propose five multimodal learning architectures: 1) Early Fusion, 2) Late Fusion, 3) Dense Fusion, 4) Caption Fusion, and 5) Ensemble Fusion. The experimental results on the evaluation data indicate that Caption Fusion outperforms the other multimodal learning architectures and the baselines.
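To make the difference between the first two architectures concrete, the following is a minimal sketch of the generic idea behind early and late fusion, not the models used in the paper. It assumes that early fusion concatenates text and image features before a single classifier, and that late fusion averages the probabilities of per-modality classifiers. The function names, the toy feature vectors, and the logistic classifier are all hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def linear_score(features: Sequence[float], weights: Sequence[float]) -> float:
    """Dot product of a feature vector with a weight vector."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(f * w for f, w in zip(features, weights))


def early_fusion(text_vec, image_vec, weights) -> float:
    """Assumed early fusion: concatenate modality features, then classify once."""
    fused = list(text_vec) + list(image_vec)
    return sigmoid(linear_score(fused, weights))


def late_fusion(text_vec, image_vec, text_weights, image_weights) -> float:
    """Assumed late fusion: classify each modality, then average the probabilities."""
    p_text = sigmoid(linear_score(text_vec, text_weights))
    p_image = sigmoid(linear_score(image_vec, image_weights))
    return (p_text + p_image) / 2.0


# Toy (hypothetical) features: a 2-d text embedding and a 1-d image embedding.
text_vec = [0.5, -1.0]
image_vec = [2.0]

p_early = early_fusion(text_vec, image_vec, weights=[1.0, 0.5, 0.25])
p_late = late_fusion(text_vec, image_vec,
                     text_weights=[1.0, 0.5], image_weights=[0.25])
print(f"early fusion probability: {p_early:.4f}")
print(f"late fusion probability:  {p_late:.4f}")
```

In this toy setting the two strategies give different probabilities for the same inputs, because early fusion lets one classifier weigh both modalities jointly while late fusion combines independent decisions.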

Notes

  1. https://meyou.jp/ranking/follower_allcat

  2. https://www.faceplusplus.com

  3. https://huggingface.co/cl-tohoku/bert-base-japanese

  4. https://pytorch.org/vision/main/models/vision_transformer.html

  5. https://pypi.org/project/googletrans/

Author information

Corresponding author

Correspondence to Natthawut Kertkeidkachorn.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Hirota, H., Kertkeidkachorn, N., Shirai, K. (2023). Weakly-Supervised Multimodal Learning for Predicting the Gender of Twitter Users. In: Métais, E., Meziane, F., Sugumaran, V., Manning, W., Reiff-Marganiec, S. (eds) Natural Language Processing and Information Systems. NLDB 2023. Lecture Notes in Computer Science, vol 13913. Springer, Cham. https://doi.org/10.1007/978-3-031-35320-8_39

  • DOI: https://doi.org/10.1007/978-3-031-35320-8_39

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-35319-2

  • Online ISBN: 978-3-031-35320-8

  • eBook Packages: Computer Science, Computer Science (R0)
