University of Southampton Institutional Repository

AccessGuru: leveraging LLMs to detect and correct web accessibility violations in HTML code

Association for Computing Machinery
Fathallah, Nadeen
c5ca2ceb-ccfe-4e5c-85bb-5c14317e0747
Hernandez, Daniel
39723173-ccff-4015-b506-cec5aec76936
Staab, Steffen
bf48d51b-bd11-4d58-8e1c-4e6e03b30c49

Fathallah, Nadeen, Hernandez, Daniel and Staab, Steffen (2025) AccessGuru: leveraging LLMs to detect and correct web accessibility violations in HTML code. In ASSETS '25: Proceedings of the 27th International ACM SIGACCESS Conference on Computers and Accessibility. Association for Computing Machinery. 25 pp. (In Press)

Record type: Conference or Workshop Item (Paper)

Abstract

The vast majority of Web pages fail to comply with established Web accessibility guidelines, excluding a range of users with diverse abilities from interacting with their content. Making Web pages accessible to all users requires dedicated expertise and additional manual effort from Web page providers. To reduce this effort and thus promote inclusiveness, we aim to automatically detect and correct Web accessibility violations in HTML code. While previous work has made progress in detecting certain types of accessibility violations, automatically detecting and correcting them remains an open challenge, which we address. We introduce a novel taxonomy that classifies Web accessibility violations into three key categories: Syntactic, Semantic, and Layout. This taxonomy provides a structured foundation for developing our detection and correction method and for selecting and redefining evaluation metrics. We propose a novel method, AccessGuru, which combines existing accessibility testing tools with Large Language Models (LLMs) to detect violations of Web accessibility guidelines, and uses taxonomy-driven LLM prompting strategies to correct violations in all three categories. To evaluate these capabilities, we have developed a novel benchmark comprising Web accessibility violations from real-world Web pages. Our benchmark quantifies syntactic and layout compliance and judges semantic accuracy through a comparative analysis against human expert corrections. On this benchmark, our method achieves up to an 84% average decrease in violation score, significantly outperforming existing methods, which achieve at most a 50% average decrease.
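
As a rough illustration of what a "violation score decrease" could mean for a syntactic violation, the sketch below counts images without an alt attribute before and after a correction and reports the relative decrease. Everything in it is an assumption made for illustration: the abstract does not define the paper's violation score, and the names MissingAltChecker, count_missing_alt, and score_decrease are hypothetical. AccessGuru itself relies on existing accessibility testing tools and LLMs rather than a hand-written check like this one.

```python
from html.parser import HTMLParser


class MissingAltChecker(HTMLParser):
    """Toy detector: counts <img> tags that carry no alt attribute."""

    def __init__(self):
        super().__init__()
        self.violations = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the start tag.
        if tag == "img" and "alt" not in dict(attrs):
            self.violations += 1


def count_missing_alt(html: str) -> int:
    """Return the number of <img> tags without an alt attribute."""
    checker = MissingAltChecker()
    checker.feed(html)
    checker.close()
    return checker.violations


def score_decrease(before: int, after: int) -> float:
    """Relative decrease of a violation count (hypothetical stand-in
    for the paper's violation score, which the abstract does not define)."""
    if before == 0:
        return 0.0  # nothing to fix, so no decrease to report
    return (before - after) / before


# Hypothetical page fragment before and after a partial correction.
original = '<p><img src="a.png"><img src="b.png"></p>'
corrected = '<p><img src="a.png" alt="Company logo"><img src="b.png"></p>'

before = count_missing_alt(original)
after = count_missing_alt(corrected)
decrease = score_decrease(before, after)
print(f"violations: {before} -> {after}, decrease: {decrease:.0%}")
```

A check like this only covers the Syntactic category. According to the abstract, semantic accuracy (for example, whether the alt text is meaningful) is judged by comparison against human expert corrections, which a simple counter cannot capture.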

Text
ASSETS_2025 - Accepted Manuscript
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 31 July 2025
Venue - Dates: The 27th International ACM SIGACCESS Conference on Computers and Accessibility, Denver, United States, 2025-10-26 - 2025-10-29

Identifiers

Local EPrints ID: 504030
URI: http://eprints.soton.ac.uk/id/eprint/504030
PURE UUID: 57d1afcf-bce7-4989-9606-8318682096b1
ORCID for Steffen Staab: orcid.org/0000-0002-0780-4154

Catalogue record

Date deposited: 21 Aug 2025 15:53
Last modified: 22 Aug 2025 02:13

Contributors

Author: Nadeen Fathallah
Author: Daniel Hernandez
Author: Steffen Staab
