The University of Southampton
University of Southampton Institutional Repository

Concept-based explainable artificial intelligence: metrics and benchmarks

cs.AI, cs.LG
arXiv
Aysel, Halil Ibrahim
9db69eca-47c7-4443-86a1-33504e172d60
Cai, Xiaohao
de483445-45e9-4b21-a4e8-b0427fc72cee
Prugel-Bennett, Adam
b107a151-1751-4d8b-b8db-2c395ac4e14e

Record type: UNSPECIFIED

Abstract

Concept-based explanation methods, such as concept bottleneck models (CBMs), aim to improve the interpretability of machine learning models by linking their decisions to human-understandable concepts, under the critical assumption that such concepts can be accurately attributed to the network's feature space. However, this foundational assumption has not been rigorously validated, mainly because the field lacks standardised metrics and benchmarks to assess the existence and spatial alignment of such concepts. To address this, we propose three metrics: the concept global importance metric, the concept existence metric, and the concept location metric, the last of which includes a technique for visualising concept activations, i.e., concept activation mapping. We benchmark post-hoc CBMs to illustrate their capabilities and challenges. Through qualitative and quantitative experiments, we demonstrate that, in many cases, even the most important concepts determined by post-hoc CBMs are not present in input images; moreover, when they are present, their saliency maps fail to align with the expected regions, either activating across an entire object or misidentifying relevant concept-specific regions. We analyse the root causes of these limitations, such as the natural correlation of concepts. Our findings underscore the need for more careful application of concept-based explanation techniques, especially in settings where spatial interpretability is critical.

Text
2501.19271v1 - Author's Original
Available under License Creative Commons Attribution.

More information

Published date: 31 January 2025
Additional Information: 17 pages in total, 8 main pages
Keywords: cs.AI, cs.LG

Identifiers

Local EPrints ID: 503867
URI: http://eprints.soton.ac.uk/id/eprint/503867
PURE UUID: f882ba94-d4cd-4852-afb8-16fb64ef6a5a
ORCID for Halil Ibrahim Aysel: orcid.org/0000-0002-4981-0827
ORCID for Xiaohao Cai: orcid.org/0000-0003-0924-2834

Catalogue record

Date deposited: 15 Aug 2025 16:41
Last modified: 16 Aug 2025 02:02

Contributors

Author: Halil Ibrahim Aysel
Author: Xiaohao Cai
Author: Adam Prugel-Bennett
