Enhancing LLM’s Reliability by Iterative Verification Attributions with Keyword Fronting

  • Conference paper
  • First Online:
Machine Learning and Knowledge Discovery in Databases. Research Track (ECML PKDD 2024)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14946)

Abstract

Attribution in retrieval-augmented text generation is of great significance for knowledge-intensive tasks, as it can enhance the credibility and verifiability of large language models (LLMs). However, existing research often ignores the adverse effect of the “Middle Loss” problem in lengthy input contexts on answer correctness, as well as the potential negative impact of unverified citations on attribution quality. To address these challenges, we propose IVAKF (Iterative Verified Attribution with Keyword Fronting), a framework that better exploits long-context information and integrates attribution verification throughout the response generation process. Specifically, for the “Middle Loss” issue, we employ a keyword-fronting strategy based on Named Entity Recognition (NER), guiding the model’s attention to key entities and their relationships with the rest of the context. For the issue of poor attribution quality, we design a verification-based iterative optimization algorithm that continuously updates candidate statements and citations until it produces a satisfactory output. Experiments on three public knowledge-intensive datasets demonstrate that the proposed framework significantly improves the quality of the final response, raising answer correctness by 6.4% and citation quality by 9.1% over the baselines.
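
The abstract outlines two mechanisms: keyword fronting via NER, and a verification-based iterative loop over candidate statements and their citations. The sketch below is a minimal illustration of how such a pipeline could be wired together, not the authors’ implementation: spaCy’s en_core_web_sm model stands in for the NER component, and llm_generate and nli_verify are hypothetical placeholders for the LLM and the citation verifier, whose interfaces the abstract does not specify.

```python
# Minimal sketch of the two ideas described in the abstract; not the authors' code.
# Assumptions: spaCy's "en_core_web_sm" model stands in for the NER step, and
# `llm_generate` / `nli_verify` are hypothetical callables for the LLM and the
# citation verifier (the latter returning a support score in [0, 1]).
import spacy

nlp = spacy.load("en_core_web_sm")  # requires: python -m spacy download en_core_web_sm


def front_keywords(passage: str) -> str:
    """Keyword fronting: list the named entities of a retrieved passage up front,
    so key entities are not 'lost in the middle' of a long prompt."""
    seen, keywords = set(), []
    for ent in nlp(passage).ents:
        if ent.text not in seen:
            seen.add(ent.text)
            keywords.append(ent.text)
    return f"Key entities: {', '.join(keywords)}\n{passage}"


def iterative_verified_attribution(question, passages, llm_generate, nli_verify,
                                   max_rounds=3, threshold=0.9):
    """Verification-based iterative loop: draft statements with citations, score
    each statement against its cited passages, and regenerate unsupported ones
    until everything verifies or the round budget is exhausted."""
    context = "\n\n".join(front_keywords(p) for p in passages)
    prompt = f"{context}\n\nQuestion: {question}\nAnswer with citations [i]:"
    # Assumed output format: a list of (statement, [cited passage indices]) pairs.
    statements = llm_generate(prompt)

    for _ in range(max_rounds):
        unsupported = [(s, ids) for s, ids in statements
                       if nli_verify(s, [passages[i] for i in ids]) < threshold]
        if not unsupported:
            break  # every statement is backed by its citations
        revision_prompt = (context
                           + "\n\nRevise these statements (and their citations) so each "
                           + "is fully supported by the cited passages:\n"
                           + "\n".join(s for s, _ in unsupported))
        revised = llm_generate(revision_prompt)
        failed = {s for s, _ in unsupported}
        statements = [st for st in statements if st[0] not in failed] + revised
    return statements
```

The loop mirrors the abstract’s description of updating candidate statements and citations until a satisfactory output is produced; the specific verifier, threshold, round budget, and prompt formats are choices of this sketch rather than details given by the paper.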

Acknowledgements

This work is partially supported by the National Natural Science Foundation of China (No. 91948303-1, No. 61803375, No. 12002380, No. 62106278, No. 62101575, No. 61906210), the National University of Defense Technology Foundation (No. ZK20-52), and the Independent and Open Subject Fund (Grant No. 202201-06) of the State Key Laboratory of High Performance Computing.

Author information

Corresponding author

Correspondence to Jing Ren.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Sui, Y., Ren, J., Tan, H., Chen, H., Li, Z., Wang, J. (2024). Enhancing LLM’s Reliability by Iterative Verification Attributions with Keyword Fronting. In: Bifet, A., Davis, J., Krilavičius, T., Kull, M., Ntoutsi, E., Žliobaitė, I. (eds) Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2024. Lecture Notes in Computer Science(), vol 14946. Springer, Cham. https://doi.org/10.1007/978-3-031-70365-2_15

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-70365-2_15

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70364-5

  • Online ISBN: 978-3-031-70365-2

  • eBook Packages: Computer Science, Computer Science (R0)
