Automated Text Simplification: A Survey

Authors: Suha S. Al-Thanyyan and Aqil M. Azmi

ACM Computing Surveys (CSUR), Volume 54, Issue 2

Article No.: 43, Pages 1 - 36

https://doi.org/10.1145/3442695

Published: 05 March 2021

Abstract

Text simplification (TS) reduces the complexity of a text to improve its readability and understandability while retaining, as far as possible, its original information content. Over time, TS has become an essential tool for helping people with low literacy levels, non-native learners, and readers with various reading comprehension problems. It is also used as a preprocessing stage to improve other NLP tasks. This survey presents an extensive review of current research in TS and covers the resources, corpora, and evaluation methods used in those studies.

Index Terms

Automated Text Simplification: A Survey
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing

Recommendations

Lexical Complexity Prediction: An Overview
The occurrence of unknown words in texts significantly hinders reading comprehension. To improve accessibility for specific target populations, computational modeling has been applied to identify complex words in texts and substitute them for simpler ...
Comparing resources for spanish lexical simplification
SLSP'13: Proceedings of the First international conference on Statistical Language and Speech Processing

In this paper we study the effect of different lexical resources and strategies for selecting synonyms in a lexical simplification system for the Spanish language. The resources used for the experiments are the Spanish EuroWordNet, the Spanish Open ...
Exploring language technologies to provide support to WCAG 2.0 and E2R guidelines
Interacción '15: Proceedings of the XVI International Conference on Human Computer Interaction

Part of citizenship faces accessibility barriers when they read and understand texts containing long sentences, unusual words, complex linguistic structures, etc. Readability and understanding should be considered when texts are created. In order to ...



Published In


ACM Computing Surveys Volume 54, Issue 2

March 2022

800 pages

ISSN: 0360-0300

EISSN: 1557-7341

DOI: 10.1145/3450359

Editor:
Albert Zomaya
University of Sydney, Australia


Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 March 2021

Accepted: 01 December 2020

Revised: 01 December 2020

Received: 01 January 2020

Published in CSUR Volume 54, Issue 2



Qualifiers

Research-article
Research
Refereed

Funding Sources

Deputyship for Research and Innovation, “Ministry of Education” in Saudi Arabia


Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 40
Total Downloads: 2,132
Downloads (Last 12 months): 514
Downloads (Last 6 weeks): 69

Reflects downloads up to 03 Oct 2024


Citations

Cited By

Swanson K, He S, Calvano J, Chen D, Telvizian T, Jiang L, Chong P, Schwell J, Mak G, Lee J (2024). Biomedical text readability after hypernym substitution with fine-tuned large language models. PLOS Digital Health 3:4 (e0000489). Online publication date: 16-Apr-2024.
https://doi.org/10.1371/journal.pdig.0000489
Huotala A, Kuutila M, Ralph P, Mäntylä M (2024). The Promise and Challenges of Using LLMs to Accelerate the Screening Process of Systematic Reviews. Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering (262-271). Online publication date: 18-Jun-2024.
https://dl.acm.org/doi/10.1145/3661167.3661172
Wu G, Qian J, Castelo Quispe S, Chen S, Rulff J, Silva C (2024). ARTiST: Automated Text Simplification for Task Guidance in Augmented Reality. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-24). Online publication date: 11-May-2024.
https://dl.acm.org/doi/10.1145/3613904.3642772
Säuberli A, Holzknecht F, Haller P, Deilen S, Schiffl L, Hansen-Schirra S, Ebling S (2024). Digital Comprehensibility Assessment of Simplified Texts among Persons with Intellectual Disabilities. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-11). Online publication date: 11-May-2024.
https://dl.acm.org/doi/10.1145/3613904.3642570
Xia B (2024). Machine Automatic Translation Evaluation Based on Big Data Algorithms. 2024 Second International Conference on Data Science and Information System (ICDSIS) (1-5). Online publication date: 17-May-2024.
https://doi.org/10.1109/ICDSIS61070.2024.10594232
McTeague C, Chatzimichali A (2024). An approach for enhancing and measuring information comprehensibility for engineering designers: applied to patent documents. Artificial Intelligence for Engineering Design, Analysis and Manufacturing 38. Online publication date: 20-Sep-2024.
https://doi.org/10.1017/S0890060424000076
North K, Ranasinghe T, Shardlow M, Zampieri M (2024). Deep learning approaches to lexical simplification: A survey. Journal of Intelligent Information Systems. Online publication date: 2-Sep-2024.
https://doi.org/10.1007/s10844-024-00882-9
