research-article

Extractive Summarization of Telugu Text Using Modified Text Rank and Maximum Marginal Relevance

Published: 22 September 2023

Abstract

With the rapid growth of digital content, there is a need for automatic text summarizers that condense a long document into a short text. Many approaches to extractive text summarization (ETS) have been proposed. This article focuses on graph-based ETS for multiple Telugu text documents. A modified Text-Rank algorithm is employed in which the initial score of each node is the noun and verb count of the corresponding sentence. To obtain the optimal features, a novel feature selection algorithm called the improved Flamingo Search Algorithm is proposed. Although graph-based ETS is an important approach, the summaries it generates can be redundant. To reduce this redundancy, maximum marginal relevance is combined with the modified Text-Rank. The proposed approach is tested with different word-embedding techniques such as FastText, Word2Vec, TF-IDF, and one-hot encoding. Its performance is evaluated with BLEU and ROUGE in terms of F-measure, precision, and recall.
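
The combination of a graph ranking step with maximum marginal relevance (MMR) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes sentences are already embedded as numeric vectors, that noun and verb counts come from an external POS tagger, and that those counts only seed the ranking iteration; the abstract does not say how the paper uses them beyond "initial score". The TextRank score stands in for the relevance term of MMR, and the function names, `damping`, `iterations`, and `lam` are hypothetical choices made for this illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(vectors):
    """Pairwise sentence similarities; the diagonal is zero (no self-loops)."""
    n = len(vectors)
    return [[0.0 if i == j else cosine(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]


def text_rank(sim, initial_scores, damping=0.85, iterations=50):
    """Weighted TextRank started from caller-supplied scores.

    Assumption: initial_scores (e.g. noun + verb counts) are normalised and
    used as the starting vector. After many iterations the starting vector
    has little influence on the result; the paper may use the counts differently.
    """
    n = len(sim)
    total = sum(initial_scores)
    scores = [s / total for s in initial_scores] if total else [1.0 / n] * n
    out_weight = [sum(row) for row in sim]
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(sim[j][i] / out_weight[j] * scores[j]
                            for j in range(n) if out_weight[j] > 0)
            for i in range(n)
        ]
    return scores


def mmr_select(sim, relevance, k, lam=0.7):
    """Greedy MMR: trade relevance against similarity to sentences already chosen."""
    top = max(relevance) or 1.0
    rel = [r / top for r in relevance]  # rescale so both terms lie in [0, 1]
    selected, candidates = [], list(range(len(rel)))
    while candidates and len(selected) < k:
        def mmr_score(i):
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            return lam * rel[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return sorted(selected)  # restore document order


# Toy data: sentences 0 and 1 are duplicates, 3 overlaps with every other sentence.
vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
noun_verb_counts = [3, 3, 2, 4]  # made-up counts standing in for POS-tagger output
sim = similarity_matrix(vectors)
scores = text_rank(sim, noun_verb_counts)
summary_indices = mmr_select(sim, scores, k=3, lam=0.3)
```

With a small `lam`, the redundancy penalty dominates, so the second copy of a duplicated sentence is passed over in favour of a less similar one. With `lam` close to 1, selection reduces to picking the top-ranked sentences.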

    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 22, Issue 9
    September 2023
    226 pages
    ISSN: 2375-4699
    EISSN: 2375-4702
    DOI: 10.1145/3625383

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 22 September 2023
    Online AM: 12 June 2023
    Accepted: 23 April 2023
    Revised: 13 July 2022
    Received: 09 June 2022

    Author Tags

    1. Deep learning
    2. Telugu text summarizer
    3. improved flamingo search algorithm
    4. encoding technique
