Overview of the CLEF 2023 SimpleText Lab: Automatic Simplification of Scientific Texts

Liana Ermakova, Eric SanJuan, Stéphane Huet, Hosein Azarbonyad, Olivier Augereau & Jaap Kamps

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14163)

Included in the following conference series:

International Conference of the Cross-Language Evaluation Forum for European Languages

Abstract

There is universal consensus on the importance of objective scientific information, yet the general public tends to avoid scientific literature because of access restrictions, its complex language, or a lack of background knowledge. Academic text simplification promises to remove some of these barriers by improving the accessibility of scientific text and promoting science literacy. This paper presents an overview of the CLEF 2023 SimpleText track, which addresses the challenges of text simplification in the context of promoting scientific information access. The track provides appropriate data and benchmarks and builds a community of IR and NLP researchers working together on one of the greatest challenges of today. It provides a corpus of scientific literature abstracts and popular science requests, and it features three tasks. First, content selection (what is in, or out?) challenges systems to select passages to include in a simplified summary in response to a query. Second, complexity spotting (what is unclear?) asks systems, given a passage and a query, to rank the terms and concepts that need to be explained to understand the passage (definitions, context, applications). Third, text simplification (rewrite this!) asks systems, given a query, to simplify passages from scientific abstracts while preserving the main content.
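
As a rough illustration of the first task, content selection, the sketch below ranks candidate passages by the fraction of query terms each one contains. It is a toy baseline under simplifying assumptions, not a method used by the track or its participants; the function names and the example query and passages are hypothetical.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_passages(query: str, passages: list[str]) -> list[tuple[float, str]]:
    """Rank passages by the fraction of distinct query terms they contain."""
    query_terms = set(tokenize(query))
    scored = []
    for passage in passages:
        passage_terms = set(tokenize(passage))
        overlap = len(query_terms & passage_terms)
        # Guard against an empty query to avoid division by zero.
        score = overlap / len(query_terms) if query_terms else 0.0
        scored.append((score, passage))
    # Highest score first; the sort is stable, so ties keep input order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Hypothetical query and passages, for illustration only.
    example_passages = [
        "Quantum computing uses qubits.",
        "Deep learning needs data.",
        "Computing hardware is evolving.",
    ]
    for score, passage in rank_passages("quantum computing", example_passages):
        print(f"{score:.2f}  {passage}")
```

A realistic system would replace the overlap score with a proper retrieval model; the point here is only the shape of the task: a query and a set of passages go in, a ranked list comes out.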

Acknowledgments

This research was funded, in whole or in part, by the French National Research Agency (ANR) under the project ANR-22-CE23-0019-01. We would like to thank Sarah Bertin, Radia Hannachi, Silvia Araújo, Pierre De Loor, Olga Popova, Diana Nurbakova, Quentin Dubreuil, Helen McCombie, Aurianne Damoy, Angelique Robert, and all other colleagues and participants who helped run this track.

Author information

Authors and Affiliations

Université de Bretagne Occidentale, HCTI, Brest, France
Liana Ermakova
Avignon Université, LIA, Avignon, France
Eric SanJuan & Stéphane Huet
Elsevier, Amsterdam, The Netherlands
Hosein Azarbonyad
ENIB, Lab-STICC UMR CNRS 6285, Brest, France
Olivier Augereau
University of Amsterdam, Amsterdam, The Netherlands
Jaap Kamps

Corresponding author

Correspondence to Liana Ermakova.

Editor information

Editors and Affiliations

Democritus University of Thrace, Xanthi, Greece
Avi Arampatzis
University of Amsterdam, Amsterdam, The Netherlands
Evangelos Kanoulas
CERTH-ITI, Thessaloniki, Greece
Theodora Tsikrika
CERTH-ITI, Thessaloniki, Greece
Stefanos Vrochidis
Utrecht University, Utrecht, The Netherlands
Anastasia Giachanou
Elsevier, Amsterdam, The Netherlands
Dan Li
University of Amsterdam, Amsterdam, The Netherlands
Mohammad Aliannejadi
University of Lausanne, Lausanne, Switzerland
Michalis Vlachos
University of Padua, Padova, Italy
Guglielmo Faggioli
University of Padua, Padova, Italy
Nicola Ferro

About this paper

Cite this paper

Ermakova, L., SanJuan, E., Huet, S., Azarbonyad, H., Augereau, O., Kamps, J. (2023). Overview of the CLEF 2023 SimpleText Lab: Automatic Simplification of Scientific Texts. In: Arampatzis, A., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2023. Lecture Notes in Computer Science, vol 14163. Springer, Cham. https://doi.org/10.1007/978-3-031-42448-9_30

DOI: https://doi.org/10.1007/978-3-031-42448-9_30
Published: 11 September 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-42447-2
Online ISBN: 978-3-031-42448-9
eBook Packages: Computer Science; Computer Science (R0)
