Continuous Integration for Reproducible Shared Tasks with TIRA.io

Maik Fröbe,
Matti Wiegmann,
Nikolay Kolyada,
Bastian Grahm,
Theresa Elstner,
Frank Loebe,
Matthias Hagen,
Benno Stein &
Martin Potthast

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13982)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

A major obstacle to the long-term impact of most shared tasks is their lack of reproducibility. Often only the test collections and the papers of the organizers and participants are published. Third parties who want to independently evaluate the state of the art for a task on other data must re-implement the participants’ software. The tools developed to collect software from participants in shared tasks only partially verify its reliability at the time of submission, much less long-term, and do not enable third parties to reuse it later. We have overhauled the TIRA Integrated Research Architecture to address all of these issues. The new version simplifies task setup for organizers and software submission for participants, scales from a local computer to the cloud, supports on-demand resource allocation up to parallel CPU and GPU processing, and enables export for local reproduction with just a few lines of code. This is achieved by implementing the TIRA protocol with an industry-standard continuous integration and deployment (CI/CD) pipeline using Git, Docker, and Kubernetes.
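The idea of collecting submissions as Docker images that third parties can later rerun locally can be illustrated with a minimal sketch. This is not TIRA's client or API: the image name, the `/input` and `/output` mount points, and the helper names `build_docker_command` and `run_submission` are assumptions made for illustration. The sketch relies only on standard `docker run` flags and disables networking, one plausible way to keep a rerun from depending on external resources.

```python
import subprocess
from pathlib import Path


def build_docker_command(image, command, input_dir, output_dir):
    """Assemble a `docker run` call for a sandboxed, repeatable execution.

    The mount points /input and /output are hypothetical conventions
    chosen for this sketch, not something prescribed by the paper.
    """
    input_dir = Path(input_dir).resolve()
    output_dir = Path(output_dir).resolve()
    return [
        "docker", "run", "--rm",          # remove the container afterwards
        "--network", "none",              # no network access during the run
        "-v", f"{input_dir}:/input:ro",   # dataset is mounted read-only
        "-v", f"{output_dir}:/output",    # predictions are written here
        image,
        "sh", "-c", command,
    ]


def run_submission(image, command, input_dir, output_dir):
    """Run a submitted image on a dataset and return the output directory."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    cmd = build_docker_command(image, command, input_dir, output_dir)
    # check=True raises CalledProcessError if the container exits non-zero.
    subprocess.run(cmd, check=True)
    return Path(output_dir)


if __name__ == "__main__":
    # Hypothetical image and command; replace with a real submission.
    print(build_docker_command(
        "example/submission:latest",
        "cp /input/* /output/",
        "./dataset",
        "./run-output",
    ))
```

Calling `run_submission` with the same image and input directory repeats the same execution, which is the property a reproducible shared task needs; an evaluator could be run the same way on the resulting output directory.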

Acknowledgments

This work has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement No 101070014 (OpenWebSearch.EU, https://doi.org/10.3030/101070014).

Author information

Authors and Affiliations

Friedrich-Schiller-Universität Jena, Jena, Germany
Maik Fröbe & Matthias Hagen
Bauhaus-Universität Weimar, Weimar, Germany
Matti Wiegmann, Nikolay Kolyada & Benno Stein
Leipzig University, Leipzig, Germany
Bastian Grahm, Theresa Elstner, Frank Loebe & Martin Potthast
ScaDS.AI, Leipzig, Germany
Frank Loebe & Martin Potthast

Corresponding author

Correspondence to Maik Fröbe.

Editor information

Editors and Affiliations

University of Amsterdam, Amsterdam, The Netherlands
Jaap Kamps
Université Grenoble-Alpes, Saint-Martin-d’Hères, France
Lorraine Goeuriot
Università della Svizzera Italiana, Lugano, Switzerland
Fabio Crestani
University of Copenhagen, Copenhagen, Denmark
Maria Maistro
University of Tsukuba, Ibaraki, Japan
Hideo Joho
Dublin City University, Dublin, Ireland
Brian Davis
Dublin City University, Dublin, Ireland
Cathal Gurrin
Universität Regensburg, Regensburg, Germany
Udo Kruschwitz
Dublin City University, Dublin, Ireland
Annalina Caputo

About this paper

Cite this paper

Fröbe, M. et al. (2023). Continuous Integration for Reproducible Shared Tasks with TIRA.io. In: Kamps, J., et al. Advances in Information Retrieval. ECIR 2023. Lecture Notes in Computer Science, vol 13982. Springer, Cham. https://doi.org/10.1007/978-3-031-28241-6_20

DOI: https://doi.org/10.1007/978-3-031-28241-6_20
Published: 16 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-28240-9
Online ISBN: 978-3-031-28241-6
eBook Packages: Computer Science, Computer Science (R0)
