DOI: 10.1145/3661638.3661653
Research article

Detecting and Mitigating the Ungrounded Hallucinations in Text Generation by LLMs

Published: 01 June 2024

Abstract

Large language models (LLMs) have achieved impressive success in generating fluent and coherent natural-language text. However, inaccurate or low-quality training data can unintentionally lead to the retention of incorrect knowledge, producing hallucinations that hinder progress in content generation. In this paper, we propose a comprehensive framework for detecting and mitigating these hallucinations. In the detection phase, our approach uses Named Entity Recognition (NER) and Entity Relationship (ER) models to identify hallucinated entities and sentences. In the mitigation phase, we combine prompt engineering with an LLM to correct the flagged sentences. Tests on real articles confirm that our approach rectifies LLM-associated hallucinations without introducing new ones, thereby enhancing the reliability and credibility of generated text.
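The detection phase described above, grounding the entities of a generated text against a source document and flagging sentences that contain unsupported entities, can be illustrated with a minimal sketch. This is not the authors' implementation: the trained NER and ER models are replaced by a hypothetical capitalized-token extractor (`toy_entities`), and the prompt-based mitigation step is omitted.

```python
# Minimal sketch of entity-grounded hallucination detection, assuming a toy
# stand-in for the NER model. A real system would run trained NER/ER models
# over both texts and then send flagged sentences to an LLM for correction.
import re

def toy_entities(text):
    """Stand-in for an NER model: non-sentence-initial capitalized tokens."""
    ents = set()
    for sent in re.split(r"(?<=[.!?])\s+", text.strip()):
        tokens = sent.split()
        for i, tok in enumerate(tokens):
            word = tok.strip(".,!?;:")
            if i > 0 and word[:1].isupper():
                ents.add(word)
    return ents

def flag_ungrounded_sentences(source, generated):
    """Return sentences of `generated` whose entities are absent from `source`."""
    source_ents = toy_entities(source)
    flagged = []
    for sent in re.split(r"(?<=[.!?])\s+", generated.strip()):
        if toy_entities(sent) - source_ents:
            flagged.append(sent)
    return flagged
```

A sentence mentioning an entity the source never introduces (e.g. a person name absent from the article) is returned for the mitigation phase; sentences whose entities all appear in the source pass through unflagged.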



    Published In

    AISNS '23: Proceedings of the 2023 International Conference on Artificial Intelligence, Systems and Network Security
    December 2023
    467 pages
    ISBN:9798400716966
    DOI:10.1145/3661638
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    AISNS 2023

