Abstract
Text analytics is a broad umbrella term covering countless techniques, models, and methods for the automatic and quantitative analysis of textual data. Its development can be traced back to the introduction of the computer, although its precursors are older. The importance of text analysis has grown over time and has been greatly enriched by the spread of the Internet and social media, which constitute an important flow of information, also in support of official statistics. This paper describes, through a timeline, the past, the present, and the possible future scenarios of text analysis. The main macro-steps of a practical study are also illustrated.
Notes
- 1.
- 2. Stephens-Davidowitz, in an article published in the New York Times (April 5, 2020), suggested that this methodology could be adopted to search for places where dissemination has escaped the reporting of official data, as in the case of the state of Ecuador.
- 3. See the website: https://unstats.un.org/bigdata/.
- 4. See the website: https://www.istat.it/it/archivio/219585.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Iezzi, D.F., Celardo, L. (2020). Text Analytics: Present, Past and Future. In: Iezzi, D.F., Mayaffre, D., Misuraca, M. (eds) Text Analytics. JADT 2018. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-030-52680-1_1
DOI: https://doi.org/10.1007/978-3-030-52680-1_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-52679-5
Online ISBN: 978-3-030-52680-1
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)