Computer Science > Artificial Intelligence

arXiv:2205.08452 (cs)

[Submitted on 17 May 2022 (v1), last revised 9 Jun 2022 (this version, v2)]

Title: A Psychological Theory of Explainability

Authors: Scott Cheng-Hsin Yang, Tomas Folke, Patrick Shafto

Abstract: The goal of explainable Artificial Intelligence (XAI) is to generate human-interpretable explanations, but there are no computationally precise theories of how humans interpret AI-generated explanations. The lack of theory means that validation of XAI must be done empirically, on a case-by-case basis, which prevents systematic theory-building in XAI. We propose a psychological theory of how humans draw conclusions from saliency maps, the most common form of XAI explanation, which for the first time allows for precise prediction of explainee inference conditioned on explanation. Our theory posits that, absent explanation, humans expect the AI to make similar decisions to themselves, and that they interpret an explanation by comparison to the explanations they themselves would give. Comparison is formalized via Shepard's universal law of generalization in a similarity space, a classic theory from cognitive science. A pre-registered user study on AI image classifications with saliency map explanations demonstrates that our theory quantitatively matches participants' predictions of the AI.
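The comparison step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes explanations are already embedded as points in a similarity space, uses Euclidean distance, and introduces the hypothetical names `shepard_similarity`, `posterior_over_ai_decisions`, and `scale` for illustration only. Shepard's law states that generalization falls off exponentially with distance in psychological space; here that similarity reweights a prior in which the explainee expects the AI to decide as they would.

```python
import math


def shepard_similarity(x, y, scale=1.0):
    """Exponential decay of similarity with distance (Shepard's law).

    `scale` is an assumed free parameter controlling how fast similarity decays.
    """
    distance = math.dist(x, y)  # Euclidean distance; an assumption of this sketch
    return math.exp(-distance / scale)


def posterior_over_ai_decisions(prior, own_explanations, observed, scale=1.0):
    """Reweight a prior over candidate AI decisions by explanation similarity.

    prior:            dict mapping decision label -> prior probability
    own_explanations: dict mapping decision label -> the explanation (as a point)
                      the explainee would give for that decision
    observed:         the explanation actually shown (as a point)
    """
    weights = {
        label: prior[label] * shepard_similarity(observed, own_explanations[label], scale)
        for label in prior
    }
    total = sum(weights.values())
    # Normalize so the beliefs sum to one.
    return {label: weight / total for label, weight in weights.items()}


# Hypothetical two-class example with made-up coordinates.
prior = {"cat": 0.5, "dog": 0.5}
own_explanations = {"cat": (0.0, 0.0), "dog": (3.0, 4.0)}
observed = (0.0, 0.0)
print(posterior_over_ai_decisions(prior, own_explanations, observed))
```

In this toy setup the observed explanation coincides with the explanation the explainee would give for "cat", so the belief that the AI chose "cat" rises above the prior. How explanations are mapped into the similarity space is left open here.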

Comments:	15 pages, 3 figures, ICML 2022
Subjects:	Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2205.08452 [cs.AI]
	(or arXiv:2205.08452v2 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2205.08452

Submission history

From: Scott Cheng-Hsin Yang
[v1] Tue, 17 May 2022 15:52:24 UTC (1,467 KB)
[v2] Thu, 9 Jun 2022 17:44:34 UTC (1,764 KB)
