
Comparison of Semantic Similarity Models on Constrained Scenarios

Published: 09 November 2022

Abstract

The technological world has grown by incorporating billions of small sensing devices that collect and share large amounts of diverse data over the new generation of wireless and mobile networks. Semantic similarity models can help organize and optimize these devices. Even so, many of the proposed semantic similarity models do not consider the constrained and dynamic environments in which these devices operate (IoT, edge computing, 5G, and next-generation networks). In this paper, we review the commonly used models, discuss the limitations of our previous model, and explore latent space methods (through matrix factorization) to reduce noise and correct the model profiles with no additional data. The new proposal is evaluated against corpus-based state-of-the-art approaches and achieves competitive results while training four times faster than the next fastest model and occupying 36 times less disk space than the next smallest model.
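
To make the latent space idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each term is described by a non-negative profile vector (for example, co-occurrence counts) and that the factorization is non-negative matrix factorization with multiplicative updates; the abstract names neither choice. The toy matrix, the term labels in the comments, and all function names are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate non-negative v (n x m) as w (n x rank) times h (rank x m).

    Uses multiplicative updates for the squared Frobenius error.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # Update h while holding w fixed.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # Update w while holding h fixed.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def cosine(a, b):
    """Cosine similarity between two profile vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def frobenius_error(v, w, h):
    """Frobenius norm of the difference between v and w times h."""
    approx = matmul(w, h)
    return math.sqrt(sum((x - y) ** 2
                         for row_v, row_a in zip(v, approx)
                         for x, y in zip(row_v, row_a)))


# Hypothetical toy profiles (made-up counts, not data from the paper).
# Rows: temperature, humidity, voltage, current; columns: four context words.
profiles = [
    [8, 6, 0, 1],
    [7, 5, 1, 0],
    [0, 1, 9, 7],
    [1, 0, 6, 8],
]

w, h = nmf(profiles, rank=2)
smoothed = matmul(w, h)  # low-rank reconstruction of the profiles

similar = cosine(smoothed[0], smoothed[1])    # temperature vs humidity
dissimilar = cosine(smoothed[0], smoothed[2])  # temperature vs voltage
```

Comparing terms through the low-rank reconstruction (or directly through the rows of `w`) is one way a factorization can suppress noisy counts without extra data. The rank is a tuning parameter in this sketch, and the only persistent state is the two small factor matrices.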


Published In

Information Systems Frontiers  Volume 26, Issue 4
Aug 2024
353 pages

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 09 November 2022
Accepted: 04 October 2022

Author Tags

  1. Semantic Similarity
  2. Distributional profiles
  3. Constrained datasets
  4. Word embeddings
  5. IoT
  6. NGN

Qualifiers

  • Research-article
