Abstract
This paper analyzes how changes in word concreteness ratings correlate with semantic change. To perform the analysis, we apply a neural network to diachronic data to obtain concreteness ratings of English words. As input to the model, we use statistics of co-occurrence with the most frequent words, extracted from the Google Books Ngram diachronic corpus. We show that the model, initially trained on data averaged over a long time interval, predicts concreteness ratings with high accuracy from the word co-occurrence data of a particular year. The impact of lexical semantic change on the concreteness rating is analyzed using 69 words taken from previous works. As the considered cases show, the neural network estimate of a word's concreteness rating is very sensitive to changes in its semantics. Among the factors that influence changes in the concreteness rating, we identify the emergence of new meanings of a word, competition between word meanings belonging to different parts of speech, the use of a word as a proper name, and the use of a word as part of collocations. We show that changes in the concreteness rating can, along with changes in other word properties, serve as a marker of semantic change.
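As a rough illustration of the kind of input described above, the following sketch builds a per-year co-occurrence vector for a target word over a fixed list of frequent context words and passes it to a rating predictor. It is not the authors' implementation: the function names, the split into left and right neighbors, the normalization by the total count, and the `model` callable are all assumptions made for the example, and the bigram counts are toy data.

```python
from collections import Counter


def top_context_words(bigram_counts, k):
    """Pick the k most frequent words to serve as context features (assumed scheme)."""
    freq = Counter()
    for (w1, w2), n in bigram_counts.items():
        freq[w1] += n
        freq[w2] += n
    return [w for w, _ in freq.most_common(k)]


def cooccurrence_vector(bigram_counts, target, context_words):
    """Relative frequencies of context words to the left and right of `target`."""
    left = [bigram_counts.get((c, target), 0) for c in context_words]
    right = [bigram_counts.get((target, c), 0) for c in context_words]
    total = sum(left) + sum(right)
    if total == 0:
        # Word not observed with any context word in this year.
        return [0.0] * (2 * len(context_words))
    return [n / total for n in left + right]


def rating_series(counts_by_year, target, context_words, model):
    """Apply a (hypothetical) trained predictor to each year's vector."""
    return {
        year: model(cooccurrence_vector(counts, target, context_words))
        for year, counts in sorted(counts_by_year.items())
    }


# Toy bigram counts for a single year (not real corpus data).
toy_counts = {("the", "mouse"): 8, ("mouse", "ran"): 2}
toy_vector = cooccurrence_vector(toy_counts, "mouse", ["the", "ran"])
```

A series of yearly ratings produced this way could then be inspected for shifts that coincide with known changes in a word's meaning.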
Funding
This research was financially supported by the Russian Science Foundation, grant no. 20-18-00206.
Ethics declarations
The authors of this work declare that they have no conflicts of interest.
Additional information
(Submitted by E. K. Lipachev)
Cite this article
Bochkarev, V., Khristoforov, S., Shevlyakova, A. et al. Diachronic Analysis of a Word Concreteness Rating: Impact of Semantic Change. Lobachevskii J Math 45, 961–971 (2024). https://doi.org/10.1134/S1995080224600559
DOI: https://doi.org/10.1134/S1995080224600559