DOI: 10.1145/3626772.3657853

FeB4RAG: Evaluating Federated Search in the Context of Retrieval Augmented Generation

Published: 11 July 2024

Abstract

Federated search systems aggregate results from multiple search engines, selecting appropriate sources to enhance result quality and align with user intent. With the increasing uptake of Retrieval-Augmented Generation (RAG) pipelines, federated search can play a pivotal role in sourcing relevant information across heterogeneous data sources to generate informed responses. However, existing datasets, such as those developed in the past TREC FedWeb tracks, predate the RAG paradigm shift and lack representation of modern information retrieval challenges.
To bridge this gap, we present FeB4RAG, a novel dataset specifically designed for federated search within RAG frameworks. Derived from 16 sub-collections of the widely used BEIR benchmarking collection, the dataset includes 790 information requests (akin to conversational queries) tailored for chatbot applications, along with the top results returned by each resource and associated LLM-derived relevance judgements. Additionally, to demonstrate the need for this collection, we show the impact on response generation of a high-quality federated search system for RAG compared to a naive approach to federated search, through a qualitative side-by-side comparison of the answers generated by the RAG pipeline. Our collection fosters and supports the development and evaluation of new federated search methods, especially in the context of RAG pipelines. The resource is publicly available at https://github.com/ielab/FeB4RAG.
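The contrast the abstract draws, between a naive federated search that queries every resource and a higher-quality system that first selects appropriate sources, can be sketched in a few lines. The sketch below is illustrative only, not the paper's implementation: the resource names, the toy scoring lambdas, and the `keyword_selector` are all hypothetical stand-ins for real search engines and a learned resource-selection model.

```python
def naive_federated_search(query, resources, top_k=3):
    """Naive baseline: query every resource and merge all results by score."""
    merged = []
    for name, search in resources.items():
        merged.extend((score, name, doc) for doc, score in search(query))
    merged.sort(reverse=True)
    return merged[:top_k]

def selective_federated_search(query, resources, select, top_k=3):
    """Select promising resources first, then search only those."""
    chosen = select(query, resources)
    return naive_federated_search(query, {n: resources[n] for n in chosen}, top_k)

# Toy resources: each maps a query to (document, score) pairs.
resources = {
    "medical": lambda q: [("aspirin dosage guide", 0.9 if "drug" in q else 0.1)],
    "finance": lambda q: [("bond yield primer", 0.8 if "bond" in q else 0.1)],
}

# A trivial keyword-based selector; a real system would learn this step.
def keyword_selector(query, resources):
    return ["medical"] if "drug" in query else list(resources)

results = selective_federated_search("drug interactions", resources, keyword_selector)
```

In a RAG pipeline, the merged results would then be passed to an LLM as context for answer generation; the quality of the resource-selection step directly shapes that context, which is the effect the paper's side-by-side comparison measures.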




Published In

SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2024
3164 pages
ISBN: 9798400704314
DOI: 10.1145/3626772
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. federated search
  2. large language models (LLMs)
  3. retrieval augmented generation (RAG)
  4. test collection

Qualifiers

  • Research-article

Conference

SIGIR 2024

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
