Automated Text Simplification: A Survey

Published: 05 March 2021

Abstract

Text simplification (TS) reduces the complexity of a text to improve its readability and understandability while retaining, where possible, its original information content. Over time, TS has become an essential tool for helping people with low literacy levels, non-native learners, and readers with various types of reading comprehension problems. It is also used as a preprocessing stage to improve other NLP tasks. This survey presents an extensive review of current research in TS and covers the resources, corpora, and evaluation methods used in those studies.

Published In

ACM Computing Surveys, Volume 54, Issue 2
March 2022
800 pages
ISSN: 0360-0300
EISSN: 1557-7341
DOI: 10.1145/3450359
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 March 2021
Accepted: 01 December 2020
Revised: 01 December 2020
Received: 01 January 2020
Published in CSUR Volume 54, Issue 2

Author Tags

  1. Text simplification
  2. lexical simplification
  3. monolingual machine translation
  4. survey
  5. syntactic simplification

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Deputyship for Research and Innovation, “Ministry of Education” in Saudi Arabia

Cited By

  • (2024) SATS: simplification aware text summarization of scientific documents. Frontiers in Artificial Intelligence 7. DOI: 10.3389/frai.2024.1375419. Online publication date: 10-Jul-2024
  • (2024) Reaching quality and efficiency with a parameter-efficient controllable sentence simplification approach. Computer Science and Information Systems 21:3 (899-921). DOI: 10.2298/CSIS230912017M. Online publication date: 2024
  • (2024) Intelligent Text Processing: A Review of Automated Summarization Methods. Virtual Communication and Social Networks 3:3 (203-222). DOI: 10.21603/2782-4799-2024-3-3-203-222. Online publication date: 1-Oct-2024
  • (2024) Biomedical text readability after hypernym substitution with fine-tuned large language models. PLOS Digital Health 3:4 (e0000489). DOI: 10.1371/journal.pdig.0000489. Online publication date: 16-Apr-2024
  • (2024) The Promise and Challenges of Using LLMs to Accelerate the Screening Process of Systematic Reviews. Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering (262-271). DOI: 10.1145/3661167.3661172. Online publication date: 18-Jun-2024
  • (2024) ARTiST: Automated Text Simplification for Task Guidance in Augmented Reality. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-24). DOI: 10.1145/3613904.3642772. Online publication date: 11-May-2024
  • (2024) Digital Comprehensibility Assessment of Simplified Texts among Persons with Intellectual Disabilities. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-11). DOI: 10.1145/3613904.3642570. Online publication date: 11-May-2024
  • (2024) Machine Automatic Translation Evaluation Based on Big Data Algorithms. 2024 Second International Conference on Data Science and Information System (ICDSIS) (1-5). DOI: 10.1109/ICDSIS61070.2024.10594232. Online publication date: 17-May-2024
  • (2024) An approach for enhancing and measuring information comprehensibility for engineering designers: applied to patent documents. Artificial Intelligence for Engineering Design, Analysis and Manufacturing 38. DOI: 10.1017/S0890060424000076. Online publication date: 20-Sep-2024
  • (2024) Deep learning approaches to lexical simplification: A survey. Journal of Intelligent Information Systems. DOI: 10.1007/s10844-024-00882-9. Online publication date: 2-Sep-2024
