Seed-driven Document Ranking for Systematic Reviews in Evidence-Based Medicine

Published: 27 June 2018

Abstract

A systematic review (SR) in evidence-based medicine is a literature review that provides a conclusion to a specific clinical question. To ensure credible and reproducible conclusions, SRs are conducted in well-defined steps. One of the key steps, the screening step, is to identify relevant documents from a pool of candidate documents. Typically, about 2000 candidate documents are retrieved from databases using keyword queries for an SR. Of these, about 20 relevant documents are manually identified by SR experts, based on detailed relevance conditions or eligibility criteria. Recent studies show that document ranking, or screening prioritization, is a promising way to improve the manual screening process. In this paper, we propose a seed-driven document ranking (SDR) model for effective screening, under the assumption that one relevant document, the seed document, is known. Based on a detailed analysis of the characteristics of relevant documents, SDR represents documents using a bag of clinical terms rather than the commonly used bag of words. More importantly, we propose a method to estimate the importance of the clinical terms based on their distribution in candidate documents. On the benchmark dataset released by CLEF'17 eHealth Task 2, we show that the proposed SDR outperforms state-of-the-art solutions. Interestingly, we also observe that ranking based on word embedding representations of documents complements SDR well. The best ranking is achieved by combining the relevance scores estimated by SDR and by word embeddings. Additionally, we report results of simulating the manual screening process with SDR.
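
The idea of ranking candidates against a seed document can be illustrated with a minimal Python sketch. It is not the SDR model itself: the abstract does not give the weighting formula, so the IDF-style weight, the normalized overlap score, the min-max fusion with `alpha`, and all function names, document ids, and embedding scores below are assumptions made for illustration only. Clinical terms are assumed to be already extracted for each document.

```python
import math
from collections import Counter


def term_weights(candidates):
    """Assumed IDF-style weight: terms that are rarer in the candidate pool weigh more."""
    n_docs = len(candidates)
    doc_freq = Counter(t for terms in candidates.values() for t in set(terms))
    return {t: math.log(1 + n_docs / df) for t, df in doc_freq.items()}


def seed_scores(seed_terms, candidates, weights):
    """Score each candidate by the weighted share of seed terms it contains."""
    seed = set(seed_terms)
    seed_mass = sum(weights.get(t, 0.0) for t in seed) or 1.0
    return {
        doc_id: sum(weights[t] for t in seed & set(terms)) / seed_mass
        for doc_id, terms in candidates.items()
    }


def min_max(scores):
    """Rescale scores to [0, 1] so two scoring methods become comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {d: (s - lo) / span for d, s in scores.items()}


def combine(scores_a, scores_b, alpha=0.5):
    """Linear interpolation of two normalized score sets (alpha is hypothetical)."""
    a, b = min_max(scores_a), min_max(scores_b)
    return {d: alpha * a[d] + (1 - alpha) * b[d] for d in a}


def rank(scores):
    """Document ids sorted by descending score, ties broken by id."""
    return sorted(scores, key=lambda d: (-scores[d], d))


# Toy candidate pool: each document is a bag of (hypothetical) clinical terms.
candidates = {
    "d1": ["ultrasound", "bile duct stone", "liver function test"],
    "d2": ["ultrasound", "pregnancy"],
    "d3": ["bile duct stone", "liver function test", "sensitivity"],
    "d4": ["asthma", "inhaler"],
}
seed = ["ultrasound", "bile duct stone", "liver function test", "sensitivity"]

weights = term_weights(candidates)
term_based = seed_scores(seed, candidates, weights)
term_ranking = rank(term_based)

# Made-up similarity scores standing in for a word-embedding-based ranker.
embedding_based = {"d1": 0.9, "d2": 0.4, "d3": 0.6, "d4": 0.1}
combined_ranking = rank(combine(term_based, embedding_based))

print(term_ranking)      # ['d3', 'd1', 'd2', 'd4']
print(combined_ranking)  # ['d1', 'd3', 'd2', 'd4']
```

In this toy pool, the term-based score places `d3` first because it shares the rare term "sensitivity" with the seed, while the combined score promotes `d1`, which the stand-in embedding scores favor. This mirrors, in a very simplified way, how two complementary relevance estimates can produce a different final ranking.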

    Published In

    SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval
    June 2018
    1509 pages
    ISBN:9781450356572
    DOI:10.1145/3209978
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. document ranking
    2. seed document
    3. systematic reviews

    Qualifiers

    • Research-article

    Conference

    SIGIR '18

    Acceptance Rates

    SIGIR '18 Paper Acceptance Rate: 86 of 409 submissions, 21%
    Overall Acceptance Rate: 464 of 2,392 submissions, 19%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 30
    • Downloads (last 6 weeks): 1
    Reflects downloads up to 14 Nov 2024

    Cited By

    • (2024) Dense Retrieval with Continuous Explicit Feedback for Systematic Review Screening Prioritisation. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2357-2362. DOI: 10.1145/3626772.3657921. Online publication date: 10-Jul-2024
    • (2024) Zero-Shot Generative Large Language Models for Systematic Review Screening Automation. Advances in Information Retrieval, 403-420. DOI: 10.1007/978-3-031-56027-9_25. Online publication date: 20-Mar-2024
    • (2023) Generating Natural Language Queries for More Effective Systematic Review Screening Prioritisation. Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, 73-83. DOI: 10.1145/3624918.3625322. Online publication date: 26-Nov-2023
    • (2023) Hierarchical Transformer-based Query by Multiple Documents. Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval, 105-115. DOI: 10.1145/3578337.3605130. Online publication date: 9-Aug-2023
    • (2023) SciMine: An Efficient Systematic Prioritization Model Based on Richer Semantic Information. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 205-215. DOI: 10.1145/3539618.3591764. Online publication date: 19-Jul-2023
    • (2023) MeSH Suggester: A Library and System for MeSH Term Suggestion for Systematic Review Boolean Query Construction. Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, 1176-1179. DOI: 10.1145/3539597.3573025. Online publication date: 27-Feb-2023
    • (2023) Proximity-Aware Clinical Passage Retrieval Framework by Exploiting Knowledge Structure. IEEE Access 11, 37681-37693. DOI: 10.1109/ACCESS.2023.3266004. Online publication date: 2023
    • (2023) The use of artificial intelligence for automating or semi-automating biomedical literature analyses: A scoping review. Journal of Biomedical Informatics 142, 104389. DOI: 10.1016/j.jbi.2023.104389. Online publication date: Jun-2023
    • (2023) Towards semantic-driven boolean query formalization for biomedical systematic literature reviews. International Journal of Medical Informatics 170, 104928. DOI: 10.1016/j.ijmedinf.2022.104928. Online publication date: Feb-2023
    • (2022) An active learning-based approach for screening scholarly articles about the origins of SARS-CoV-2. PLOS ONE 17:9, e0273725. DOI: 10.1371/journal.pone.0273725. Online publication date: 16-Sep-2022
