DOI: 10.1145/3625007.3629129
research-article
Open access

Emoji are Effective Predictors of User’s Demographics

Published: 15 March 2024

Abstract

Social media platforms like Twitter provide rich data that can offer insights into various aspects of users' behavior. In this study, we explore the potential of emoji usage as a means for demographic prediction. Leveraging a Twitter dataset of 18,689 users, annotated with gender and ethnicity labels, we analyze the proportion of tweets containing emoji across different demographic groups. We identify significant variations in emoji usage, with women utilizing emoji more frequently than men and users of African descent displaying a higher tendency for emoji usage compared to users of European descent. Moreover, we investigate the most distinctive emoji for each group, revealing intriguing patterns that are closely tied to the cultural and demographic backgrounds of users. Building upon these findings, we employ machine learning models with different feature extraction techniques to predict users' gender and ethnicity. Our results demonstrate the predictive power of emoji, outperforming traditional text-based features. Furthermore, our study provides evidence that emoji usage can be a valuable resource for inferring user demographic characteristics on social media platforms, contributing to our understanding of user behavior in digital environments.
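
As a rough illustration of the first analysis step described in the abstract (the proportion of tweets containing emoji in each demographic group), the following Python sketch may help. It is not the authors' code: the code point ranges in EMOJI_RANGES are an incomplete heuristic chosen for illustration, and the function names, group labels, and sample tweets are all hypothetical.

```python
from collections import defaultdict

# ASSUMPTION: a rough, incomplete set of emoji code point ranges.
# A real study would rely on the full Unicode emoji data files instead.
EMOJI_RANGES = [
    (0x1F300, 0x1FAFF),  # pictographs, emoticons, transport, extended symbols
    (0x2600, 0x27BF),    # miscellaneous symbols and dingbats
    (0x1F1E6, 0x1F1FF),  # regional indicators (used for flags)
]


def contains_emoji(text):
    """Return True if any character falls inside one of EMOJI_RANGES."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in EMOJI_RANGES)


def emoji_tweet_share(tweets):
    """Compute, per group, the fraction of tweets that contain an emoji.

    tweets: iterable of (group_label, tweet_text) pairs (hypothetical format).
    """
    total = defaultdict(int)
    with_emoji = defaultdict(int)
    for group, text in tweets:
        total[group] += 1
        if contains_emoji(text):
            with_emoji[group] += 1
    return {group: with_emoji[group] / total[group] for group in total}


# Made-up sample data; the labels do not come from the paper's dataset.
sample = [
    ("group_a", "good morning \u2600\ufe0f"),
    ("group_a", "see you later"),
    ("group_b", "so happy \U0001F602\U0001F602"),
    ("group_b", "love this \u2764\ufe0f"),
]
shares = emoji_tweet_share(sample)
print(shares)
```

Per-group shares of this kind could then be compared across groups; the abstract does not specify which statistical test or feature extraction techniques the authors used, so those steps are left out here.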

Published In

ASONAM '23: Proceedings of the 2023 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
November 2023
835 pages
ISBN: 9798400704093
DOI: 10.1145/3625007

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. emoji
2. demographic prediction
3. user profiling

Conference

ASONAM '23

Acceptance Rates

ASONAM '23 Paper Acceptance Rate: 53 of 145 submissions, 37%
Overall Acceptance Rate: 116 of 549 submissions, 21%

Article Metrics (reflects downloads up to 18 Feb 2025)

• Total Citations: 0
• Total Downloads: 259
• Downloads (Last 12 months): 259
• Downloads (Last 6 weeks): 25
