Applying Latent Semantic Analysis to Optimize Second-order Co-occurrence Vectors for Semantic Relatedness Measurement

Ahmad Pesaranghader²¹,
Ali Pesaranghader²² &
Azadeh Rezaei²¹

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 8284))

2673 Accesses
2 Citations

Abstract

Measures of semantic relatedness are largely applicable in intelligent tasks of NLP and Bioinformatics. By taking these automated measures into account, this paper attempts to improve Second-order Co-occurrence Vector semantic relatedness measure for more effective estimation of relatedness between two given concepts. Typically, this measure, after constructing concepts definitions (Glosses) from a thesaurus, considers the cosine of the angle between the concepts’ gloss vectors as the degree of relatedness. Nonetheless, these computed gloss vectors of concepts are impure and rather large in size which would hinder the expected performance of the measure. By employing latent semantic analysis (LSA), we try to conduct some level of insignificant feature elimination to generate economic gloss vectors. Applying both approaches to the biomedical domain, using MEDLINE as corpus, UMLS as thesaurus, and reference standard of biomedical concept-pairs manually rated for relatedness, we show LSA implementation enforces positive impact in terms of performance and efficiency.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

tESA: a distributional measure for calculating semantic relatedness

Article Open access 28 December 2016

DomESA: a novel approach for extending domain-oriented lexical relatedness calculations with domain-specific semantics

Article 13 January 2017

Adapting Gloss Vector Semantic Relatedness Measure for Semantic Similarity Estimation: An Evaluation in the Biomedical Domain

References

Muthaiyah, S., Kerschberg, L.: A Hybrid Ontology Mediation Approach for the Semantic Web. International Journal of E-Business Research 4, 79–91 (2008)
Article Google Scholar
Pekar, V., Ou, S., Constantin Orasan, C., Spurk, C., Negri, M.: Development and alignment of a domain-specific ontology for question answering. In: Proceedings of the 6th Edition of the Language Resources and Evaluation Conference, LREC-08 (May 2008)
Google Scholar
Chen, B., Foster, G., Kuhn, R.: Bilingual Sense Similarity for Statistical Machine Translation. In: Proceedings of the ACL, pp. 834–843 (2010)
Google Scholar
Bousquet, C., Lagier, G., LilloLe, L.A., Le Beller, C., Venot, A., Jaulent, M.C.: Appraisal of the MedDRA Conceputal Structure for describing and grouping adverse drug reactions. Drug Safety 28(1), 19–34 (2005)
Article Google Scholar
Firth, J.R.: A Synopsis of Linguistic Theory 1930-1955. In: Studies in Linguistic Analysis, pp. 1–32 (1957)
Google Scholar
Rada, R., Mili, H., Bicknell, E., Blettner, M.: Development and Application of a Metric on Semantic Nets. IEEE Transactions on Systems, Man and Cybernetics 19, 17–30 (1989)
Article Google Scholar
Wu, Z., Palmer, M.: Verb Semantics and Lexical Selections. In: Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics (1994)
Google Scholar
Resnik, P.: Using Information Content to Evaluate Semantic Similarity in a Taxonomy. In: Proceedings of the 14th International Joint Conference on Artificial Intelligence, pp. 448–453 (1995)
Google Scholar
Jiang, J.J., Conrath, D.W.: Semantic Similarity based on Corpus Statistics and Lexical Taxonomy. In: International Conference on Research in Computational Linguistics (1997)
Google Scholar
Lin, D.: An Information-theoretic Definition of Similarity. In: 15th International Conference on Machine Learning, Madison, USA (1998)
Google Scholar
Pesaranghader, A., Muthaiyah, S.: Definition-based information content vectors for semantic similarity measurement. In: Proceedings of the 2nd International Multi-Conference on Artificial Intelligence Technology (M-CAIT), pp. 268–282 (2013)
Google Scholar
Lesk, M.: Automatic Sense Disambiguation Using Machine Readable Dictionaries: How to Tell a Pine Cone from an Ice-cream Cone. In: Proceedings of the 5th Annual International Conference on Systems Documentation, New York, USA, pp. 24–26 (1986)
Google Scholar
Banerjee, S., Pedersen, T.: An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In: Proceedings of the 3rd International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City (2002)
Google Scholar
Patwardhan, S., Pedersen, T.: Using WordNet-based Context Vectors to Estimate the Semantic Relatedness of Concepts. In: Proceedings of the EACL 2006 Workshop, Making Sense of Sense: Bringing Computational Linguistics and Psycholinguistics together, Trento, Italy, pp. 1–8 (2006)
Google Scholar
Liu, Y., McInnes, B.T., Pedersen, T., Melton-Meaux, G., Pakhomov, S.: Semantic relatedness study using second order co-occurrence vectors computed from biomedical corpora, UMLS and WordNet. In: Proceedings of the 2nd ACM SIGHIT IHI, pp. 363–371
Google Scholar
Pakhomov, S., McInnes, B., Adam, T., Liu, Y., Pedersen, T., Melton, G.: Semantic Similarity and Relatedness between Clinical Terms: An Experimental Study. In: Proceedings of AMIA, pp. 572–576 (2010)
Google Scholar
Landauer, T.K., Dumais, S.T.: A Solution to Plato’s Problem: The Latent Semantic Analysis Theory of the Acquisition, Induction and Representation of Knowledge. Psychological Review 104, 211–240 (1997)
Article Google Scholar

Download references

Author information

Authors and Affiliations

Jalan Multimedia, Multimedia University (MMU), 63100, Cyberjaya, Malaysia
Ahmad Pesaranghader & Azadeh Rezaei
Universiti Putra Malaysia (UPM), 43400, Serdang, Selangor, Malaysia
Ali Pesaranghader

Authors

Ahmad Pesaranghader
View author publications
You can also search for this author in PubMed Google Scholar
Ali Pesaranghader
View author publications
You can also search for this author in PubMed Google Scholar
Azadeh Rezaei
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Department of Business Information Systems, FSIC, National University of Ireland, University College Cork, O’Rahilly Buildings, Cork, Ireland
Rajendra Prasath
Research Centre in Computer Science, V.H.N.Senthikumara Nadar College (Autonomous), 626 001, Virudhunagar, Tamil Nadu, India
T. Kathirvalavakumar

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Pesaranghader, A., Pesaranghader, A., Rezaei, A. (2013). Applying Latent Semantic Analysis to Optimize Second-order Co-occurrence Vectors for Semantic Relatedness Measurement. In: Prasath, R., Kathirvalavakumar, T. (eds) Mining Intelligence and Knowledge Exploration. Lecture Notes in Computer Science(), vol 8284. Springer, Cham. https://doi.org/10.1007/978-3-319-03844-5_58

Download citation

DOI: https://doi.org/10.1007/978-3-319-03844-5_58
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03843-8
Online ISBN: 978-3-319-03844-5
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Applying Latent Semantic Analysis to Optimize Second-order Co-occurrence Vectors for Semantic Relatedness Measurement

Abstract

Access this chapter

Subscribe and save

Buy Now

Preview

Similar content being viewed by others

tESA: a distributional measure for calculating semantic relatedness

DomESA: a novel approach for extending domain-oriented lexical relatedness calculations with domain-specific semantics

Adapting Gloss Vector Semantic Relatedness Measure for Semantic Similarity Estimation: An Evaluation in the Biomedical Domain

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Subscribe and save

Buy Now

Navigation

Applying Latent Semantic Analysis to Optimize Second-order Co-occurrence Vectors for Semantic Relatedness Measurement

Abstract

Access this chapter

Subscribe and save

Buy Now

Preview

Similar content being viewed by others

tESA: a distributional measure for calculating semantic relatedness

DomESA: a novel approach for extending domain-oriented lexical relatedness calculations with domain-specific semantics

Adapting Gloss Vector Semantic Relatedness Measure for Semantic Similarity Estimation: An Evaluation in the Biomedical Domain

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation