Abstract
As social media becomes increasingly popular, more and more public health activity is reflected on it, which is valuable for pandemic monitoring and government decision-making. Current techniques for public health analysis rely on popular models such as BERT and large language models (LLMs). Although recent progress in LLMs has shown a strong ability to comprehend knowledge when fine-tuned on domain-specific datasets, training an in-domain LLM for every specific public health task is prohibitively expensive. Furthermore, such in-domain datasets collected from social media are generally highly imbalanced, which hinders the efficiency of LLM tuning. To tackle these challenges, the data imbalance issue can be overcome with sophisticated data augmentation methods for social media datasets, and the ability of LLMs can be effectively exploited by prompting the model properly. In light of the above discussion, this paper proposes a novel ALEX framework for social media analysis on public health. Specifically, an augmentation pipeline is developed to resolve the data imbalance issue, and an LLM explanation mechanism is proposed that prompts an LLM with the predicted results from BERT models. Extensive experiments on three tasks of the Social Media Mining for Health 2023 (SMM4H) competition, including first place in two of the tasks, demonstrate the superior performance of the proposed ALEX method. Our code has been released at https://github.com/YanJiangJerry/ALEX.
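To make the two-stage idea in the abstract concrete, the sketch below shows one way the explanation mechanism might look in practice: a BERT-family classifier first predicts a label for a tweet, and an LLM is then prompted with the tweet plus the predicted label to justify or question that prediction. This is only an illustrative sketch; the checkpoint name, prompt wording, and LLM model are placeholder assumptions, not the authors' actual configuration.

# Illustrative sketch of the two-stage idea described in the abstract:
# (1) a fine-tuned BERT-family classifier predicts a label for a tweet,
# (2) an LLM is prompted with the tweet and the predicted label to produce
#     an explanation / sanity check of that prediction.
# The checkpoint, prompt, and LLM model below are placeholders, not the
# authors' actual configuration.
from transformers import pipeline
from openai import OpenAI

# Stand-in checkpoint; in ALEX this would be a BERT model fine-tuned on the SMM4H task.
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
llm = OpenAI()  # reads OPENAI_API_KEY from the environment


def classify_and_explain(tweet: str) -> dict:
    # Stage 1: BERT-style prediction (label + confidence score).
    pred = classifier(tweet)[0]

    # Stage 2: prompt an LLM with the tweet and the predicted result.
    prompt = (
        "A classifier labeled the following tweet for a public health task.\n"
        f"Tweet: {tweet}\n"
        f"Predicted label: {pred['label']} (score {pred['score']:.2f})\n"
        "Briefly explain whether this label is reasonable."
    )
    response = llm.chat.completions.create(
        model="gpt-4",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return {"prediction": pred, "explanation": response.choices[0].message.content}

On the data-balance side, the augmentation pipeline could likewise be approximated by applying an off-the-shelf text augmenter (for example, TextAttack's EmbeddingAugmenter) to minority-class tweets before fine-tuning, though the exact augmentation steps used in ALEX are not specified in the abstract.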
Acknowledgements
This work is supported by the project "Developing a proof-of-concept self-contact tracing app to support epidemiological investigations and outbreak response" (Australia-Korea Joint Research Projects - ATSE Tech Bridge Grant).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Jiang, Y., Qiu, R., Zhang, Y., Zhang, P.F. (2024). Balanced and Explainable Social Media Analysis for Public Health with Large Language Models. In: Bao, Z., Borovica-Gajic, R., Qiu, R., Choudhury, F., Yang, Z. (eds) Databases Theory and Applications. ADC 2023. Lecture Notes in Computer Science, vol 14386. Springer, Cham. https://doi.org/10.1007/978-3-031-47843-7_6
DOI: https://doi.org/10.1007/978-3-031-47843-7_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47842-0
Online ISBN: 978-3-031-47843-7
eBook Packages: Computer Science (R0)