Abstract
The Australian Digital Observatory (ADO) was funded by the Australian Research Data Commons (ARDC) in 2021. The goal of the project was to establish the national social media data repository for Australia. This includes collecting social media data at scale, in the first instance from platforms including Twitter, Reddit, Flickr, Foursquare and YouTube. This paper describes the technical architecture of the ADO platform and provides examples of the capabilities it offers for data discovery, analysis and subsequent download of targeted social media data to support diverse research purposes. The platform must respect the terms and conditions of the various social media platforms on data licensing and use; in particular, direct user access to the original raw data is not possible, as this would violate the platforms' licensing arrangements. We present a case study in the use of the platform.
Acknowledgments
The authors would like to thank the collaborators involved in the ADO project at Queensland University of Technology and the University of New South Wales. The authors also thank the ARDC for funding the ADO.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sinnott, R.O., Li, Q., Mohammad, A., Morandini, L. (2023). The Australian Digital Observatory: Social Media Collection, Discovery and Analytics at Scale. In: Hsu, CH., Xu, M., Cao, H., Baghban, H., Shawkat Ali, A.B.M. (eds) Big Data Intelligence and Computing. DataCom 2022. Lecture Notes in Computer Science, vol 13864. Springer, Singapore. https://doi.org/10.1007/978-981-99-2233-8_23
DOI: https://doi.org/10.1007/978-981-99-2233-8_23
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-2232-1
Online ISBN: 978-981-99-2233-8
eBook Packages: Computer Science (R0)