Abstract
This chapter surveys recent accountability tools for the evolving landscape of machine learning models applied to complex reasoning tasks. As language models are increasingly deployed in real-world settings, concerns about misuse, bias, adversarial manipulation, and unintended behavior have grown. The chapter then turns to recent advances in explainability techniques for complex reasoning, highlighting three key areas: interactive explanations, logical reasoning, and textual explanations produced through chain-of-thought reasoning. Finally, it examines diagnostic explainability methods, with an emphasis on recent work assessing the quality of natural language explanations and chain-of-thought reasoning. The faithfulness of explanations is probed through dedicated tests, and information-theoretic measures are proposed to evaluate text explanation methods, offering insight into how information flows through explanation-generation architectures. The chapter concludes by acknowledging open challenges and stressing the need for continued research and development in these areas to foster responsible and accountable use of language models in real-world applications.
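To make the idea of faithfulness testing concrete, the sketch below illustrates one common style of test discussed in the chain-of-thought faithfulness literature: truncating the generated reasoning and checking whether the final answer changes. This is a minimal, illustrative sketch only; the `generate` callable, the prompt format, and the `cot_truncation_test` helper are hypothetical placeholders, not the chapter's own method or any specific library API.

```python
# Minimal sketch of a chain-of-thought truncation probe, assuming a generic
# text-generation callable. All names here are illustrative placeholders.

from typing import Callable, List


def cot_truncation_test(
    generate: Callable[[str], str],   # hypothetical: prompt -> model output
    question: str,
    cot_steps: List[str],             # reasoning steps the model already produced
    final_answer: str,
) -> float:
    """Return the fraction of truncation points at which the answer is unchanged."""
    unchanged = 0
    for k in range(len(cot_steps)):
        truncated = " ".join(cot_steps[:k])  # keep only the first k reasoning steps
        prompt = (
            f"Question: {question}\n"
            f"Reasoning so far: {truncated}\n"
            f"Answer:"
        )
        answer = generate(prompt).strip()
        if answer == final_answer.strip():
            unchanged += 1
    return unchanged / max(len(cot_steps), 1)
```

If the final answer remains largely unchanged even when most of the reasoning is removed, the stated chain of thought may play little causal role in producing it, which is the kind of signal such faithfulness tests are designed to surface.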
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Atanasova, P. (2024). Recent Developments on Accountability and Explainability for Complex Reasoning Tasks. In: Accountable and Explainable Methods for Complex Reasoning over Text. Springer, Cham. https://doi.org/10.1007/978-3-031-51518-7_9
DOI: https://doi.org/10.1007/978-3-031-51518-7_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-51517-0
Online ISBN: 978-3-031-51518-7
eBook Packages: Computer Science, Computer Science (R0)