Abstract
Large pre-trained language models (LPLMs) have shown spectacular success when fine-tuned on downstream supervised tasks. It is known, however, that their performance can drastically drop when there is a distribution shift between the data used during training and that used at inference time. In this paper we focus on data distributions that naturally change over time and introduce four Reddit datasets, namely the Wallstreetbets, AskScience, The Donald, and Politics sub-reddits. First, we empirically demonstrate that LPLMs can display average performance drops of about 79% in the best cases when predicting the popularity of future posts. We then introduce a methodology that leverages neural variational dynamic topic models and attention mechanisms to infer temporal language model representations for regression tasks. Our models display performance drops of only about 33% in the best cases when predicting the popularity of future posts, while using only about 7% of the total number of parameters of LPLMs and providing interpretable representations that offer insight into real-world events, like the GameStop short squeeze of 2021. Source code to reproduce our experiments is available online.
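The abstract does not spell out architectural details, but the general recipe it describes (a neural variational dynamic topic model whose evolving topic state attends over a post's tokens to produce a temporal representation for popularity regression) can be illustrated with a small sketch. The PyTorch snippet below is purely illustrative and is not the authors' implementation: the GRU-based temporal prior, the module sizes, and the single-query attention over token embeddings are assumptions made for this example.

```python
# Minimal sketch (NOT the paper's implementation) of a variational dynamic
# topic model whose per-time-step topic state conditions, via attention, a
# lightweight text encoder used for popularity regression. All names, sizes,
# and the GRU temporal prior are assumptions made for illustration only.
import torch
import torch.nn as nn


class VariationalDynamicTopicRegressor(nn.Module):
    def __init__(self, vocab_size, num_topics=32, embed_dim=128, hidden_dim=128):
        super().__init__()
        # Temporal prior over topic proportions: a GRU evolves a latent topic
        # state across time steps (e.g. weekly buckets of Reddit posts).
        self.topic_rnn = nn.GRU(num_topics, hidden_dim, batch_first=True)
        self.to_mu = nn.Linear(hidden_dim, num_topics)
        self.to_logvar = nn.Linear(hidden_dim, num_topics)
        # Bag-of-words decoder used for the topic-model (ELBO) objective.
        self.decoder = nn.Linear(num_topics, vocab_size)
        # Text side: token embeddings attended by the current topic state.
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=4, batch_first=True)
        self.topic_to_query = nn.Linear(num_topics, embed_dim)
        # Regression head predicting post popularity (e.g. log upvote score).
        self.regressor = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1)
        )

    def forward(self, token_ids, topic_history):
        # topic_history: (batch, time, num_topics) past topic proportions.
        h, _ = self.topic_rnn(topic_history)
        mu, logvar = self.to_mu(h[:, -1]), self.to_logvar(h[:, -1])
        theta = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        theta = torch.softmax(theta, dim=-1)                         # topic proportions
        bow_logits = self.decoder(theta)                             # reconstruct word counts
        # The topic state acts as the attention query over the post's tokens.
        tokens = self.embed(token_ids)                               # (batch, seq, embed)
        query = self.topic_to_query(theta).unsqueeze(1)              # (batch, 1, embed)
        context, _ = self.attn(query, tokens, tokens)
        popularity = self.regressor(context.squeeze(1)).squeeze(-1)
        return popularity, bow_logits, mu, logvar


# Toy usage: 8 posts of 20 tokens each, with 5 past time steps of topic history.
model = VariationalDynamicTopicRegressor(vocab_size=1000)
tokens = torch.randint(1, 1000, (8, 20))
history = torch.softmax(torch.randn(8, 5, 32), dim=-1)
pred, bow_logits, mu, logvar = model(tokens, history)
print(pred.shape)  # torch.Size([8])
```

In a setting like the paper's, the regression target would be a measure of future post popularity, and the bag-of-words reconstruction term together with a KL penalty on (mu, logvar) would form the variational training objective alongside the regression loss.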
Acknowledgments
This research has been funded by the Federal Ministry of Education and Research of Germany and the state of North Rhine-Westphalia as part of the Lamarr Institute for Machine Learning and Artificial Intelligence, LAMARR22B. César Ojeda is supported by Deutsche Forschungsgemeinschaft (DFG) - Project-ID 318763901 - SFB1294.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Cvejoski, K., Sánchez, R.J., Ojeda, C. (2024). The Future is Different: Predicting Reddits Popularity with Variational Dynamic Language Models. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol. 14941. Springer, Cham. https://doi.org/10.1007/978-3-031-70341-6_25
DOI: https://doi.org/10.1007/978-3-031-70341-6_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70340-9
Online ISBN: 978-3-031-70341-6