Abstract
In recent years, several studies have explored the potential of social media data for forecasting electoral outcomes and public opinion trends, with mixed results. This paper presents a novel approach to approval forecasting, employing data from daily approval polls for 21 executives across Asia, Europe, North America, and Latin America, as well as social media data from their respective Twitter accounts. Machine learning models are trained on these data to predict future executive approval ratings. Extensive testing of different models reveals that a combination of previous approval ratings and social media data yields the best performance in predicting future approval. Models using exclusively social media data perform slightly worse, but their performance remains acceptable in political contexts where executives are highly active on these platforms. This study offers valuable insights into the effective use of social media data for executive approval forecasting, presenting a comparative analysis across diverse political contexts.
Data Availability Statement
The datasets generated and analyzed during the current study are available in the Figshare repository, https://figshare.com/s/963ba9bf9b79293da4a3
Notes
Note that this count only includes ratings obtained on days when the executive also tweeted.
The choice to present monthly averages rather than daily counts was made purely for visualization purposes, as the former provides a clearer depiction of the data.
The cyclical model of public support suggests that executives experience a honeymoon period at the beginning of their mandates. This is followed by a decline and then boosts during electoral periods. Finally, after electoral contests in which they lose the opportunity for reelection, they enter a lame-duck period.
An analysis including sentiment in four languages (English, Italian, Portuguese, and Spanish) is presented in Appendix 2. The results suggest that classifying tweets from executives and then including sentiment variables in the models leads to improvements in only a few cases, yet these do not outperform the best models presented here.
For a better understanding of the data distribution of the main variables in the analysis, see histograms in Appendix 1.
This is hypothetical because it is unlikely that all the approval values will be as extreme as 16 and 84.
References
Ahmed, S., Jaidka, K., & Skoric, M. (2016). Tweets and votes: A four country comparison of volumetric and sentiment analysis approaches. Proceedings of the International AAAI Conference on Web and Social Media, 10(1), 507–510.
Ali, H., et al. (2022). Deep learning-based election results prediction using Twitter activity. Soft Computing, 26(16), 7535–7543.
Beauchamp, N. (2017). Predicting and interpolating state-level polls using Twitter textual data. American Journal of Political Science, 61(2), 490–503.
Bermingham, A., & Smeaton, A. F. (2011). On using Twitter to monitor political sentiment and predict election results.
Borzemski, L., & Wojtkiewicz, M. (2011). Evaluation of chaotic internet traffic predictor using MAPE accuracy measure (pp. 173–182). https://doi.org/10.1007/978-3-642-21771-5_19.
Boyle, J., et al. (2011). Predicting emergency department admissions. Emergency Medicine Journal, 29, 358–365. https://doi.org/10.1136/emj.2010.103531
Bozanta, A., Bayrak, F., & Basar, A. (2023). Prediction of the 2023 Turkish presidential election results using social media data. arXiv preprint arXiv:2305.18397.
Brito, K., et al. (2019). Social media and presidential campaigns: Preliminary results of the 2018 Brazilian presidential election. In Proceedings of the 20th annual international conference on digital government research (pp. 332–341).
Cameron, M. P., Barrett, P., & Stewardson, B. (2016). Can social media predict election results? Evidence from New Zealand. Journal of Political Marketing, 15(4), 416–432.
Carlin, R. E., & Singh, S. P. (2015). Executive power and economic accountability. The Journal of Politics, 77(4), 1031–1044.
Ceron, A., et al. (2014). Every tweet counts? How sentiment analysis of social media can improve our knowledge of citizens’ political preferences with an application to Italy and France. New Media and Society, 16(2), 340–358.
Morning Consult. (2023). Global leader approval rating tracker. Morning Consult Political Intelligence.
Fu, K., & Chan, C. (2013). Analyzing online sentiment to predict telephone poll results. Cyberpsychology, Behavior and Social Networking, 16(9), 702–707.
Gaurav, M., et al. (2013). Leveraging candidate popularity on Twitter to predict election outcome. In Proceedings of the 7th workshop on social network mining and analysis (pp. 1–8).
Graham, T., Jackson, D., & Broersma, M. (2016). New platform, old habits? Candidates’ use of Twitter during the 2010 British and Dutch general election campaigns. New Media and Society, 18(5), 765–783.
He, L., et al. (2018). Random forest as a predictive analytics alternative to regression in institutional research. Practical Assessment, Research and Evaluation, 23, 1. https://doi.org/10.7275/1WPR-M024
Heredia, B., Prusa, J. D., & Khoshgoftaar, T. M. (2018). Social media for polling and predicting United States election outcome. Social Network Analysis and Mining, 8, 1–16.
Horne, W. (2024). mDeBERTa-EAD-Sentiment-bilingual_1.0. Available at https://huggingface.co/rwillh11/mDeBERTa-EAD-Sentiment-bilingual_1.0
Ibrahim, M., et al. (2015). Buzzer detection and sentiment analysis for predicting presidential election results in a Twitter nation. In 2015 IEEE international conference on data mining workshop (ICDMW) (pp. 1348–1353). IEEE.
Jaidka, K., et al. (2019). Predicting elections from social media: A three-country, three method comparative study. Asian Journal of Communication, 29(3), 252–273.
Li, W., et al. (2015). Outlier detection and removal improves accuracy of machine learning approach to multispectral burn diagnostic imaging. Journal of Biomedical Optics, 20(12), 121305.
Liashchynskyi, P., & Liashchynskyi, P. (2019). Grid search, random search, genetic algorithm: A big comparison for NAS. arXiv preprint arXiv:1912.06059.
Livne, A., et al. (2011). The party is over here: Structure and content in the 2010 election. Proceedings of the International AAAI Conference on Web and Social Media, 5(1), 201–208.
Long, T., et al. (2023). Tree-based techniques for predicting the compression index of clayey soils. Journal of Soft Computing in Civil Engineering, 7(3), 52–67.
Marchetti-Bowick, M., & Chambers, N. (2012). Learning for microblogs with distant supervision: Political forecasting with Twitter. In Proceedings of the 13th conference of the European chapter of the association for computational linguistics (pp. 603–612).
NLP Town. (2023). bert-base-multilingual-uncased-sentiment (Revision edd66ab). https://doi.org/10.57967/hf/1515. https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment
O’Connor, B., et al. (2010). From tweets to polls: Linking text sentiment to public opinion time series. Proceedings of the International AAAI Conference on Web and Social Media, 4(1), 122–129.
Patel, J., et al. (2015). Predicting stock and stock price index movement using trend deterministic data preparation and machine learning techniques. Expert Systems with Applications, 42(1), 259–268.
Pérez, J. M., et al. (2023). pysentimiento: A Python Toolkit for Opinion Mining and Social NLP tasks. arXiv: 2106.09462 [cs.CL].
Pérez-Liñán, A. (2007). Presidential impeachment and the new political instability in Latin America. Cambridge University Press.
Phillips, L., et al. (2017). Using social media to predict the future: A systematic literature review. arXiv preprint arXiv:1706.06134.
Pimenta, F., Obradovic, D., & Dengel, A. (2013). A comparative study of social media prediction potential in the 2012 US Republican presidential preelections. In 2013 International conference on cloud and green computing (pp. 226–232). IEEE.
Roth, M., Peters, G., & Seruga, J. (2013). Some insights into the role of social media in political communication (pp. 351–360). https://doi.org/10.5220/0004418603510360.
Sabuncu, I., Balci, M. A., & Akguller, O. (2020). Prediction of USA November 2020 election results using multifactor Twitter data analysis method. arXiv preprint arXiv:2010.15938.
Saleiro, P., Gomes, L., & Soares, C. (2016). Sentiment aggregate functions for political opinion polling using microblog streams. In Proceedings of the ninth international C* conference on computer science & software engineering (pp. 44–50).
Skoric, M., & Jaidka, K. (2023). Electoral predictions from social media data. In A. Ceron (Ed.), Elgar encyclopedia of technology and politics.
Skoric, M. M., Liu, J., & Jaidka, K. (2020). Electoral and public opinion forecasts with social media data: A meta-analysis. Information, 11(4), 187.
Stimson, J. A. (1976). Public support for American presidents: A cyclical model. Public Opinion Quarterly, 40(1), 1–21.
Tsakalidis, A., Aletras, N., et al. (2018). Nowcasting the stance of social media users in a sudden vote: The case of the Greek referendum. In Proceedings of the 27th ACM international conference on information and knowledge management (pp. 367–376).
Tsakalidis, A., Papadopoulos, S., et al. (2015). Predicting elections for multiple countries using Twitter and polls. IEEE Intelligent Systems, 30(2), 10–17.
Tumasjan, A., et al. (2010). Predicting elections with Twitter: What 140 characters reveal about political sentiment. Proceedings of the International AAAI Conference on Web and Social Media, 4(1), 178–185.
Utami, Y., et al. (2022). Forecasting the sum of new college students with linear regression approach. Jurnal Teknik Informatika C.I.T Medicom. https://doi.org/10.35335/cit.vol14.2022.231.
Vergeer, M. (2017). Adopting, networking, and communicating on Twitter. Social Science Computer Review, 35, 698–712. https://doi.org/10.1177/0894439316672826
Wang, J., et al. (2020). An innovative random forest-based nonlinear ensemble paradigm of improved feature extraction and deep learning for carbon price forecasting. The Science of the Total Environment. https://doi.org/10.1016/j.scitotenv.2020
Wirth, K. (2020). Predicting changes in public opinion with Twitter: What social media data can and can’t tell us about opinion formation. American University.
Zeng, A., et al. (2023). Are transformers effective for time series forecasting? Proceedings of the AAAI Conference on Artificial Intelligence, 37(9), 11121–11128.
Ziegler, A., & König, I. (2014). Mining data with random forests: Current options for real-world applications. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery. https://doi.org/10.1002/widm.1114
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1: Data distribution of variables of interest
This appendix provides an overview of how data for the main variables in the analysis are distributed. As mentioned in the text, the distribution is not a real issue for models using the Random Forest algorithm; however, these figures are helpful in understanding the shape the data take (Figs. 6, 7, 8, 9, 10, 11, 12, 13).
Appendix 2: Sensitivity analysis for models with sentiment
As mentioned above, social media sentiment analysis has been a recurrent method for forecasting elections or public opinion [20]. Sentiment analysis using social media data involves assigning a sentiment label to each post, tweet, or comment used for forecasting. This study contends that incorporating sentiment information into the forecasting method proposed here would increase the complexity of the analysis, making it more challenging to apply in real-life situations. Nevertheless, it is important to test the impact of adding or omitting sentiment in the various models presented.
The most robust method to include sentiment in the models entails collecting data from users responding to executives, assigning a sentiment label to each response, and then assessing the prevalent sentiment toward the executive for a specific day or week. Assigning sentiment to social media texts is becoming easier due to the increasing availability of pre-trained models and packages that support bilingual or multilingual sentiment classification, such as bert-base-multilingual-uncased-sentiment [26], mDeBERTa-EAD-Sentiment-bilingual [18], and pysentimiento [29]. However, collecting large data volumes, especially from Twitter, is becoming more challenging. Therefore, with the data at hand, obtaining responses for each executive is not feasible for this analysis.
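As an illustration of the labeling step, the sketch below assigns a sentiment label to each text with pysentimiento [29]. It applies regardless of whether responses or the executives' own posts are classified; the example tweets and the choice of the Spanish-language analyzer are assumptions for illustration, not the study's actual pipeline.

```python
from pysentimiento import create_analyzer

# Load a pre-trained sentiment analyzer; "es" is used here as an example
# language, but pysentimiento also ships analyzers for other languages.
analyzer = create_analyzer(task="sentiment", lang="es")

# Hypothetical tweets standing in for an executive's posts.
tweets = [
    "Hoy firmamos un acuerdo histórico para el país.",
    "Lamentamos profundamente las pérdidas causadas por la tormenta.",
]

# Each prediction returns a label (POS, NEG, or NEU) and class probabilities.
for text in tweets:
    result = analyzer.predict(text)
    print(result.output, result.probas)
```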
While not as robust as collecting data in response to executives, analyzing the sentiment in the posts these actors publish might also help us gain additional information to forecast approval and disapproval. The underlying theory is that presidents and prime ministers may exhibit varying sentiments in their social media posts, influencing public perception and evaluation. To assess this impact, I have selected a sample of executives and classified the sentiment in their tweets during the period analyzed.
I evaluated the sentiment of tweets from 10 executives across four languages: English, Italian, Portuguese, and Spanish, using the Python package pysentimiento for classification. After this sentiment classification, I re-ran all the models discussed in the main analysis and added new models incorporating three sentiment variables: positive count, negative count, and neutral count. These variables represent the daily tweet counts for each sentiment category.
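A minimal sketch of how such daily sentiment counts can be built from the per-tweet labels is shown below; the data frame layout and column names are illustrative assumptions rather than the exact variables used in the study.

```python
import pandas as pd

# One row per tweet: the executive, the date, and the sentiment label
# ("POS", "NEG", "NEU") assigned in the classification step.
tweets = pd.DataFrame({
    "executive": ["A", "A", "A", "B"],
    "date": pd.to_datetime(["2022-01-01", "2022-01-01", "2022-01-02", "2022-01-01"]),
    "sentiment": ["POS", "NEG", "POS", "NEU"],
})

# Daily counts of positive, negative, and neutral tweets per executive,
# ready to be merged with the lagged approval and social media features.
daily_counts = (
    tweets.groupby(["executive", "date", "sentiment"])
          .size()
          .unstack(fill_value=0)
          .rename(columns={"POS": "positive_count",
                           "NEG": "negative_count",
                           "NEU": "neutral_count"})
          .reset_index()
)
print(daily_counts)
```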
The results of these additional analyses are displayed in Tables 8 and 9. In most cases, introducing sentiment variables reduced performance, as indicated by higher MAE and MAPE scores. The exceptions were the single-lagged models that used only approval data or only social media data to predict approval. Specifically, as illustrated in Table 8, the model that forecasts approval using only approval data improved its MAE score from 0.618 to 0.572 and its MAPE score from 1.21% to 1.13% after the inclusion of sentiment variables. For the model relying exclusively on social media data, the MAE score improved from 1.404 to 1.390 and the MAPE score decreased from 2.68% to 2.67%. Among the double-lagged models, only the approval-only model showed improved predictive ability with sentiment variables, reducing its MAE score from 0.603 to 0.552, as shown in Table 9. These findings indicate that incorporating the sentiment of leaders' tweets does not generally enhance the predictive accuracy of the models.
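For reference, MAE and MAPE scores of the kind reported above can be computed as in the sketch below; it uses scikit-learn and invented example values rather than the study's actual forecasts.

```python
from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error

# Illustrative values only: y_true would be observed approval ratings and
# y_pred the model's out-of-sample forecasts.
y_true = [48.0, 47.5, 46.8, 47.2]
y_pred = [47.4, 47.9, 46.1, 47.8]

mae = mean_absolute_error(y_true, y_pred)
# scikit-learn returns MAPE as a fraction, so multiply by 100 to report a percentage.
mape = mean_absolute_percentage_error(y_true, y_pred) * 100

print(f"MAE: {mae:.3f}, MAPE: {mape:.2f}%")
```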
Considering these results and the goal of developing a simple, general model to predict approval across various political contexts, incorporating the sentiment of executives' tweets appears to be an unnecessary step. However, if technological advancements make data collection more accessible, integrating sentiment from responses to executives could significantly enhance forecasting accuracy in multilingual and cross-country models (Tables 10, 11, 12, 13, 14, 15, 16, 17).
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Cruces, J.S.G. Forecasting executive approval with social media data: opportunities, challenges and limitations. J Comput Soc Sc 7, 2029–2065 (2024). https://doi.org/10.1007/s42001-024-00299-y