Abstract
This study aims to automatically process and extract knowledge from large amounts of oncology-related content on online social networks (OSN). To this end, a large number of OSN textual posts concerning major cancer types are automatically scraped and structured using natural language processing techniques. Machine learning classifiers are trained to assign multiple labels to each post based on the type of knowledge it contains, if any. The trained classifiers are then used to automatically classify textual posts at large scale, and statistical inferences are drawn from these predictions to extract general concepts and abstract knowledge. Different approaches for constructing document feature vectors showed no tangible effect on classification accuracy. Among the classifiers compared, logistic regression achieved the highest overall accuracy (96.4%) and \(\overline{F1}\) (73.4) in a 13-way multi-label classification of textual posts. The most common topic was seeking or providing moral support for cancer patients, followed by providing technical information about cancer causes and treatments. The causes and treatments most commonly mentioned on OSN for different types of cancer are also automatically detected in this study. Seeking or providing moral support for cancer patients shared the largest overlap with other topics, i.e., moral support tends to be present even in OSN posts that focus on other topics. In contrast, providing technical information about cancer diagnosis and about cancer prevention were the most isolated topics: OSN posts on these topics tend not to allude to other topics. OSN posts that seek financial support overlap only with the moral support topic, if with any. Our methodology and results give public health professionals an opportunity to monitor which topics are being discussed on OSN and to what extent, to see what specific information and knowledge are being disseminated over OSN, and to assess their veracity in close to real time. This helps them develop policies that encourage, discourage, or modify the consumption of viral oncology-related information on OSN.
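To make the two reported scores concrete, the following is a minimal sketch of how an overall accuracy and a mean F1 can be computed for a multi-label problem. It rests on assumptions that the abstract does not confirm: overall accuracy is taken to be the fraction of correct (post, label) decisions, and \(\overline{F1}\) is taken to be the unweighted mean of the per-label F1 scores. The function name `multilabel_scores` and the toy labels are hypothetical.

```python
from typing import List, Tuple


def multilabel_scores(
    y_true: List[List[int]], y_pred: List[List[int]]
) -> Tuple[float, float]:
    """Return (overall accuracy, mean per-label F1) for 0/1 indicator matrices.

    Assumed definitions (not taken from the paper):
      - overall accuracy = correct (post, label) decisions / all decisions
      - mean F1          = unweighted average of the F1 score of each label
    """
    n_posts = len(y_true)
    n_labels = len(y_true[0])
    correct = 0
    f1_per_label = []
    for j in range(n_labels):
        tp = fp = fn = 0
        for i in range(n_posts):
            t, p = y_true[i][j], y_pred[i][j]
            if t == p:
                correct += 1
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
        denom = 2 * tp + fp + fn
        # A label with no true or predicted positives gets F1 = 0 here.
        f1_per_label.append(2 * tp / denom if denom else 0.0)
    accuracy = correct / (n_posts * n_labels)
    mean_f1 = sum(f1_per_label) / n_labels
    return accuracy, mean_f1


# Toy data: 4 posts, 3 hypothetical labels (rows = posts, columns = labels).
y_true = [[1, 0, 0], [1, 1, 0], [0, 0, 0], [1, 0, 1]]
y_pred = [[1, 0, 0], [1, 0, 0], [0, 0, 0], [1, 0, 0]]
acc, mean_f1 = multilabel_scores(y_true, y_pred)
print(f"overall accuracy = {acc:.3f}, mean F1 = {mean_f1:.3f}")
```

In the toy data only the first label is ever predicted, yet 10 of the 12 decisions are correct because most labels are absent from most posts. Under these assumed definitions, accuracy can therefore sit well above the mean F1 when labels are sparse.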
Notes
I, you’d, below, so, who, is, mightn’t, did, for, the, any, each, hers, more, own, mustn, about, o, wouldn’t, between, off, s, doesn, ve, it’s, as, just, be, won’t, they, your, yourselves, isn’t, from, where, y, d, ourselves, she’s, at, our, why, him, you, can, himself, such, haven, to, most, you’ve, above, myself, than, now, here, only, it, through, aren, while, has, am, aren’t, but, down, too, hadn, he, other, there, having, not, itself, shouldn’t, up, until, on, didn, how, been, both, her, wouldn, shouldn, nor, being, shan’t, further, themselves, or, herself, all, theirs, during, no, out, after, needn’t, ain, should’ve, which, under, couldn’t, whom, doesn’t, their, ma, yours, you’re, if, these, my, again, wasn, weren’t, you’ll, wasn’t, when, don, because, hadn’t, that’ll, once, over, will, some, isn, does, shan, its, had, what, didn’t, were, an, re, and, are, against, into, have, mustn’t, this, do, in, before, yourself, t, same, was, doing, mightn, we, weren, haven’t, that, needn, few, hasn’t, me, she, ours, of, with, don’t, m, a, couldn, by, hasn, won, then, should, them, those, very, his, ll.
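Assuming the list above serves as a stop-word list whose entries are dropped from posts before feature vectors are built (the note itself does not say so), the filtering step could look like the sketch below. The function name `remove_stop_words`, the tokenization rule, and the example sentence are illustrative assumptions, and only a small subset of the list is reproduced.

```python
import re

# Small subset of the words listed above; a real run would use the full list.
STOP_WORDS = {"i", "the", "is", "a", "of", "for", "and", "to", "my", "was", "with", "it's"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    # Normalize curly apostrophes so that contractions match the list entries.
    text = text.lower().replace("\u2019", "'")
    # Hypothetical tokenization rule: runs of letters, optionally one contraction.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text)
    return [t for t in tokens if t not in stop_words]


tokens = remove_stop_words(
    "My mother was diagnosed with breast cancer and the treatment is working"
)
print(tokens)
```

Keeping the stop words in a set makes each membership check constant-time, which matters when the same filter runs over a large number of posts.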
Cite this article
Hashemi, M., Hall, M. Multi-label classification and knowledge extraction from oncology-related content on online social networks. Artif Intell Rev 53, 5957–5994 (2020). https://doi.org/10.1007/s10462-020-09839-0
DOI: https://doi.org/10.1007/s10462-020-09839-0