Unsupervised Hidden Topic Framework for Extracting Keywords (Synonym, Homonym, Hyponymy and Polysemy) and Topics in Meeting Transcripts

J. I. Sheeba, K. Vivekanandan, G. Sabitha & P. Padmavathi

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 177)

Abstract

A keyword is an important item in a document that provides efficient access to the document's content. Keywords can be used to search for information or to decide whether a document is worth reading. This paper focuses on extracting hidden topics from meeting transcripts. Existing systems handle web documents, whereas the proposed framework addresses the synonym, homonym, hyponymy and polysemy problems in meeting transcripts. In the synonym problem, different words with similar meanings are grouped and a single keyword is extracted. In the hyponymy problem, a word denoting a subclass is considered and the superclass keyword is extracted. A homonym is a word that has two or more different meanings. For example, "left" might appear in two different contexts: "the car left" (past tense of "leave") and "left side" (opposite of "right"). A polysemous word has different but related senses. For example, "count" has related meanings: to say numbers in the right order, and to calculate. Hidden topics in meeting transcripts are found using the LDA model. Finally, a MaxEnt classifier is used to extract keywords and topics, which are then used for information retrieval.
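
The synonym and hyponymy handling described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the paper relies on an LDA model and a MaxEnt classifier, whereas the sketch below only shows the grouping idea. The lexicon entries and the names `SYNONYMS`, `HYPERNYMS`, `canonical` and `extract_keywords` are hypothetical and chosen for illustration only.

```python
from collections import Counter

# Hypothetical toy lexicon (an assumption, not taken from the paper).
# Synonyms collapse to one representative keyword;
# hyponyms (subclass words) map to their superclass keyword.
SYNONYMS = {"purchase": "buy", "acquire": "buy", "buy": "buy"}
HYPERNYMS = {"laptop": "computer", "desktop": "computer", "sedan": "car"}


def canonical(word):
    """Return the keyword a word collapses to under the toy lexicon."""
    word = word.lower()
    word = SYNONYMS.get(word, word)    # synonym -> single representative
    return HYPERNYMS.get(word, word)   # subclass -> superclass


def extract_keywords(utterances, top_n=3):
    """Count canonical keywords over a list of transcript utterances."""
    counts = Counter()
    for utterance in utterances:
        for token in utterance.split():
            token = token.strip(".,?!").lower()
            # Only words known to the toy lexicon are treated as candidates.
            if token in SYNONYMS or token in HYPERNYMS:
                counts[canonical(token)] += 1
    return counts.most_common(top_n)


transcript = [
    "We should purchase a laptop for every new hire.",
    "Can we buy a desktop instead?",
    "Finance wants to acquire one laptop first.",
]
keywords = extract_keywords(transcript)
print(keywords)
```

Here "purchase", "buy" and "acquire" are counted under the single keyword "buy", and "laptop" and "desktop" are counted under the superclass "computer". Homonyms and polysemous words cannot be resolved by a lookup table like this, because the correct sense depends on the surrounding context; that is the part the abstract assigns to hidden topics.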

Author information

Authors and Affiliations

Department of Computer Science & Engineering, Pondicherry Engineering College, Puducherry, 605 014, India
J. I. Sheeba, K. Vivekanandan, G. Sabitha & P. Padmavathi

Corresponding author

Correspondence to J. I. Sheeba.

Editor information

Editors and Affiliations

Department of Computer Science, Jackson State University, John R. Lynch Street 1400, Jackson, 39217, USA
Natarajan Meghanathan
Wireilla Net Solutions PTY Ltd, Melbourne, Australia
Dhinaharan Nagamalai
Department of Computer Science & Eng., University of Calcutta, Calcutta, 700 073, India
Nabendu Chaki

About this paper

Cite this paper

Sheeba, J.I., Vivekanandan, K., Sabitha, G., Padmavathi, P. (2013). Unsupervised Hidden Topic Framework for Extracting Keywords (Synonym, Homonym, Hyponymy and Polysemy) and Topics in Meeting Transcripts. In: Meghanathan, N., Nagamalai, D., Chaki, N. (eds) Advances in Computing and Information Technology. Advances in Intelligent Systems and Computing, vol 177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31552-7_32

DOI: https://doi.org/10.1007/978-3-642-31552-7_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31551-0
Online ISBN: 978-3-642-31552-7
eBook Packages: Engineering, Engineering (R0)
