Abstract
Retrieval-augmented text generation with attribution is of great significance for knowledge-intensive tasks, as it enhances the credibility and verifiability of large language models (LLMs). However, existing research often overlooks the adverse effect of the “Middle Loss” phenomenon in lengthy input contexts on answer correctness, as well as the potential negative impact of unverified citations on attribution quality. To address these challenges, we propose IVAKF (Iterative Verified Attribution with Keyword Fronting), a framework that makes better use of long-context information and integrates attribution verification throughout the response generation process. Specifically, to mitigate the “Middle Loss” issue, we employ a keyword fronting strategy based on Named Entity Recognition (NER), which guides the model’s attention to key entities and their relationships with the rest of the context. To address poor attribution quality, we design a verification-based iterative optimization algorithm that continuously updates candidate statements and citations until a satisfactory output is produced. Experiments on three public knowledge-intensive datasets demonstrate that the proposed framework significantly improves the quality of the final response, raising answer correctness by 6.4% and citation quality by 9.1% over the baselines.
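To make the described pipeline concrete, below is a minimal sketch, not the authors’ implementation, of the two components summarized above: a keyword-fronting step that prepends NER-extracted entities to the prompt, and an iterative generate–verify–revise loop that keeps updating candidate statements and citations until a verification threshold is met. The helpers `extract_entities`, `generate_with_citations`, and `verify_citations` are hypothetical stand-ins for an NER model, an LLM call, and an attribution verifier; all names and thresholds here are illustrative assumptions.

```python
# Sketch only: hypothetical helpers are passed in as callables, not real library APIs.
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    statement: str          # candidate answer statement
    citations: List[int]    # indices of supporting passages
    score: float            # verification score in [0, 1]


def keyword_fronting(question: str,
                     passages: List[str],
                     extract_entities: Callable[[str], List[str]]) -> str:
    """Prepend key entities found in the retrieved passages to the prompt,
    drawing the model's attention to them before it reads the long context."""
    entities: List[str] = []
    for p in passages:
        for e in extract_entities(p):
            if e not in entities:
                entities.append(e)
    header = "Key entities: " + ", ".join(entities)
    context = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages))
    return f"{header}\n\nQuestion: {question}\n\nPassages:\n{context}"


def iterative_verified_attribution(
        question: str,
        passages: List[str],
        extract_entities: Callable[[str], List[str]],
        generate_with_citations: Callable[[str], Tuple[str, List[int]]],
        verify_citations: Callable[[str, List[str]], float],
        threshold: float = 0.9,
        max_iters: int = 3) -> Candidate:
    """Generate, verify, and revise candidate statements until the cited
    passages are judged to support the statement or the budget is spent."""
    prompt = keyword_fronting(question, passages, extract_entities)
    best = Candidate(statement="", citations=[], score=0.0)

    for _ in range(max_iters):
        statement, cited = generate_with_citations(prompt)
        evidence = [passages[i] for i in cited if 0 <= i < len(passages)]
        score = verify_citations(statement, evidence)
        if score > best.score:
            best = Candidate(statement, cited, score)
        if score >= threshold:
            break
        # Feed the verifier's verdict back so the next draft can be revised.
        prompt += (f"\n\nPrevious draft (verification score {score:.2f}):\n"
                   f"{statement}\nRevise the statement and its citations so that "
                   f"every claim is supported by the cited passages.")
    return best
```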
Acknowledgements
This work is partially supported by the National Natural Science Foundation of China (No. 91948303-1, No. 61803375, No. 12002380, No. 62106278, No. 62101575, No. 61906210), the National University of Defense Technology Foundation (No. ZK20-52), and the Independent and Open Subject Fund (Grant No. 202201-06) of the State Key Laboratory of High Performance Computing.