DOI: 10.1145/3626772.3657921

Dense Retrieval with Continuous Explicit Feedback for Systematic Review Screening Prioritisation

Published: 11 July 2024

Abstract

The goal of screening prioritisation in systematic reviews is to identify relevant documents with high recall and rank them in early positions for review. This saves reviewing effort if paired with a stopping criterion, and speeds up review completion if performed alongside downstream tasks. Recent studies have shown that neural models have good potential on this task, but their time-consuming fine-tuning and inference discourage their widespread use for screening prioritisation. In this paper, we propose an alternative approach that still relies on neural models, but leverages dense representations and relevance feedback to enhance screening prioritisation, without the need for costly model fine-tuning and inference. This method exploits continuous relevance feedback from reviewers during document screening to efficiently update the dense query representation, which is then applied to rank the remaining documents to be screened. We evaluate this approach across the CLEF TAR datasets for this task. Results suggest that the investigated dense query-driven approach is more efficient than directly using neural models and shows promising effectiveness compared to previous methods developed on the considered datasets. Our code is available at https://github.com/ielab/dense-screening-feedback.
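As a concrete illustration of the feedback loop the abstract describes, the sketch below assumes a Rocchio-style update of the dense query embedding from explicit reviewer judgements; the coefficients (alpha, beta, gamma), the batch size, and the function names are illustrative assumptions rather than the authors' implementation, which is available in the linked repository.

import numpy as np

def update_query(query_emb, rel_embs, nonrel_embs, alpha=1.0, beta=0.75, gamma=0.15):
    # Rocchio-style update in embedding space: pull the query towards the
    # centroid of reviewer-judged relevant documents and push it away from
    # the centroid of non-relevant ones. The coefficients are assumptions.
    q = alpha * query_emb
    if rel_embs:
        q = q + beta * np.mean(rel_embs, axis=0)
    if nonrel_embs:
        q = q - gamma * np.mean(nonrel_embs, axis=0)
    return q

def prioritised_screening(query_emb, doc_embs, judge, batch_size=25):
    # Screen in rounds: rank the unscreened pool by dot-product similarity
    # to the current query embedding, collect explicit feedback on the top
    # batch (judge(i) -> True/False), then refresh the query embedding.
    unscreened = set(range(len(doc_embs)))
    order, rel, nonrel = [], [], []
    while unscreened:
        ranked = sorted(unscreened, key=lambda i: -float(doc_embs[i] @ query_emb))
        for i in ranked[:batch_size]:
            order.append(i)
            (rel if judge(i) else nonrel).append(doc_embs[i])
            unscreened.discard(i)
        query_emb = update_query(query_emb, rel, nonrel)
    return order  # the prioritised order in which documents were screened

Because each round requires only a vector update and a similarity ranking over precomputed document embeddings, the per-feedback cost stays far below repeated neural fine-tuning or inference, which is the efficiency argument the abstract makes.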



    Published In

    SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2024
    3164 pages
    ISBN: 9798400704314
    DOI: 10.1145/3626772
    Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.


    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Author Tags

    1. dense retrieval
    2. relevance feedback
    3. systematic review automation

    Qualifiers

    • Short-paper


    Conference

    SIGIR 2024

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%
