Abstract
The extensive use of online social media has highlighted the importance of privacy in the digital space. As more scientists analyse the data created on these platforms, privacy concerns have extended to data usage within academia. Although text analysis is a well-documented topic in academic literature with a multitude of applications, ensuring the privacy of user-generated content has been overlooked. In an effort to reduce the exposure of online users’ information, we propose a privacy-preserving text labelling method for varying applications, based on crowdsourcing. We transform text with different levels of privacy and analyse the effectiveness of each transformation with regard to label correlation. To demonstrate the adaptive nature of our approach, we also employ a TF/IDF filtering transformation. Our results suggest that total privacy can be implemented in labelling while retaining the annotational diversity and subjectivity of traditional labelling. The privacy-preserving labelling, with the use of the NRC lexicon, demonstrates a mean Spearman’s rho correlation of 0.11, which increases to 0.124 with TF/IDF filtering.
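The abstract reports label agreement as a mean Spearman’s rho. As a rough illustration only, the sketch below shows one way such a figure could be computed: each text has a label vector from traditional labelling and another from privacy-preserving labelling, rho is computed per text, and the values are averaged. The function names, the per-text averaging, and the toy vectors are assumptions made for this example; the abstract does not describe the authors’ actual procedure.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one vector is constant
    return cov / (var_x * var_y) ** 0.5


def mean_spearman(original_labels, private_labels):
    """Average the per-text rho values, skipping undefined (NaN) ones."""
    rhos = [spearman_rho(o, p) for o, p in zip(original_labels, private_labels)]
    defined = [r for r in rhos if r == r]  # NaN is not equal to itself
    return mean(defined) if defined else float("nan")


# Hypothetical label vectors (e.g. annotator counts per emotion) for two texts.
original = [[5, 1, 0, 2], [0, 3, 4, 1]]
private = [[4, 2, 0, 1], [1, 2, 5, 0]]
print(mean_spearman(original, private))
```

Working on ranks rather than raw counts means the measure only asks whether the two labelling schemes order the labels similarly, which suits subjective annotations where absolute counts vary between crowds.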
Copyright information
© 2021 IFIP International Federation for Information Processing
About this paper
Cite this paper
Haralabopoulos, G., Torres, M.T., Anagnostopoulos, I., McAuley, D. (2021). Privacy-Preserving Text Labelling Through Crowdsourcing. In: Maglogiannis, I., Macintyre, J., Iliadis, L. (eds) Artificial Intelligence Applications and Innovations. AIAI 2021 IFIP WG 12.5 International Workshops. AIAI 2021. IFIP Advances in Information and Communication Technology, vol 628. Springer, Cham. https://doi.org/10.1007/978-3-030-79157-5_35
DOI: https://doi.org/10.1007/978-3-030-79157-5_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-79156-8
Online ISBN: 978-3-030-79157-5
eBook Packages: Computer Science, Computer Science (R0)