Weakly-Supervised Multimodal Learning for Predicting the Gender of Twitter Users

Haruka Hirota,
Natthawut Kertkeidkachorn &
Kiyoaki Shirai

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13913)

Included in the following conference series:

International Conference on Applications of Natural Language to Information Systems

Abstract

Social media platforms such as Twitter are significant sources of information, with users posting vast amounts of content every day. Analyzing this content can offer valuable insights for commercial and research purposes. To understand the information comprehensively, it is crucial to consider the demographics of users, with gender being a particularly important factor. However, the gender of Twitter users is usually not available, and predicting it from tweet data is challenging. In this paper, we introduce a weakly supervised method to automatically build the supervision data. The experimental results show that our weak supervision component can generate well-annotated data automatically with an accuracy exceeding 85%. Furthermore, we conduct a comparative analysis of various multimodal learning architectures for predicting the gender of Twitter users from the weak supervision data. We propose five multimodal learning architectures: 1) Early Fusion, 2) Late Fusion, 3) Dense Fusion, 4) Caption Fusion, and 5) Ensemble Fusion. The experimental results on the evaluation data indicate that Caption Fusion outperforms the other multimodal learning architectures and the baselines.
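
To make the first two architecture names concrete, the sketch below contrasts early and late fusion in their generic textbook sense. It is not the authors' implementation: the abstract does not specify the encoders, classifiers, or combination rule, so the function names, the linear classifiers, the probability averaging in late fusion, and all numeric values are assumptions chosen for illustration only.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, bias, features):
    """One score per class: dot(row, features) + bias."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]

def early_fusion(text_vec, image_vec, weights, bias):
    """Concatenate modality features, then apply a single classifier."""
    fused = text_vec + image_vec  # list concatenation
    return softmax(linear(weights, bias, fused))

def late_fusion(text_vec, image_vec, text_clf, image_clf):
    """Classify each modality separately, then average the probabilities."""
    p_text = softmax(linear(*text_clf, text_vec))
    p_image = softmax(linear(*image_clf, image_vec))
    return [(a + b) / 2 for a, b in zip(p_text, p_image)]

# Hypothetical toy features standing in for text and image embeddings.
text_vec = [0.2, -0.1]
image_vec = [0.5]

# Hypothetical two-class weights (not learned, purely illustrative).
joint_w, joint_b = [[1.0, 0.0, 0.5], [-1.0, 0.0, -0.5]], [0.0, 0.0]
text_clf = ([[1.0, 0.0], [-1.0, 0.0]], [0.0, 0.0])
image_clf = ([[1.0], [-1.0]], [0.0, 0.0])

p_early = early_fusion(text_vec, image_vec, joint_w, joint_b)
p_late = late_fusion(text_vec, image_vec, text_clf, image_clf)
print("early fusion:", p_early)
print("late fusion: ", p_late)
```

The design difference is where the modalities meet: early fusion lets one classifier see the joint feature vector, while late fusion combines independent per-modality decisions. The other three architectures named in the abstract are not sketched here because the abstract gives no details about them.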

Author information

Authors and Affiliations

Japan Advanced Institute of Science and Technology, Ishikawa, Japan
Haruka Hirota, Natthawut Kertkeidkachorn & Kiyoaki Shirai

Corresponding author

Correspondence to Natthawut Kertkeidkachorn.

Editor information

Editors and Affiliations

Conservatoire National des Arts et Métiers, Paris, France
Elisabeth Métais
University of Derby, Derby, UK
Farid Meziane
Oakland University, Rochester, NY, USA
Vijayan Sugumaran
University of Derby, Derby, UK
Warren Manning
University of Derby, Derby, UK
Stephan Reiff-Marganiec

About this paper

Cite this paper

Hirota, H., Kertkeidkachorn, N., Shirai, K. (2023). Weakly-Supervised Multimodal Learning for Predicting the Gender of Twitter Users. In: Métais, E., Meziane, F., Sugumaran, V., Manning, W., Reiff-Marganiec, S. (eds) Natural Language Processing and Information Systems. NLDB 2023. Lecture Notes in Computer Science, vol 13913. Springer, Cham. https://doi.org/10.1007/978-3-031-35320-8_39

DOI: https://doi.org/10.1007/978-3-031-35320-8_39
Published: 14 June 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-35319-2
Online ISBN: 978-3-031-35320-8
eBook Packages: Computer Science, Computer Science (R0)
