AccessGuru: leveraging LLMs to detect and correct web accessibility violations in HTML code
Fathallah, Nadeen, Hernandez, Daniel and Staab, Steffen (2025) AccessGuru: leveraging LLMs to detect and correct web accessibility violations in HTML code. In ASSETS '25: Proceedings of the 27th International ACM SIGACCESS Conference on Computers and Accessibility. Association for Computing Machinery. 25 pp. (In Press)
Record type: Conference or Workshop Item (Paper)
Abstract
The vast majority of Web pages fail to comply with established Web accessibility guidelines, excluding a range of users with diverse abilities from interacting with their content. Making Web pages accessible to all users requires dedicated expertise and additional manual effort from Web page providers. To lower this effort and thus promote inclusiveness, we aim to automatically detect and correct Web accessibility violations in HTML code. While previous work has made progress in detecting certain types of accessibility violations, automatically detecting and correcting accessibility violations remains an open challenge that we address. We introduce a novel taxonomy classifying Web accessibility violations into three key categories: Syntactic, Semantic, and Layout. This taxonomy provides a structured foundation for developing our detection and correction method and for selecting and redefining evaluation metrics. We propose our novel method, AccessGuru, which combines existing accessibility testing tools and Large Language Models (LLMs) to detect violations of Web accessibility guidelines, and taxonomy-driven LLM prompting strategies to correct all three accessibility violation categories. To evaluate these capabilities, we have developed a novel benchmark encompassing Web accessibility violations from real-world Web pages. Our benchmark quantifies syntactic and layout compliance and judges semantic accuracy through comparative analysis against human expert corrections. Evaluation on this benchmark shows that our method achieves up to an 84% average decrease in violation score, significantly outperforming existing methods, which achieve at most a 50% average decrease.
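To make the detect-then-correct pipeline described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation: it assumes axe-core as the accessibility testing tool, accessed through the axe-selenium-python package, and substitutes a hypothetical call_llm placeholder for the paper's taxonomy-driven prompting strategies.

    # Illustrative sketch only: axe-core detects violations, an LLM is asked
    # for a corrected snippet. call_llm is a hypothetical placeholder; the
    # prompt below is a simplified stand-in for AccessGuru's taxonomy-driven
    # prompting strategies described in the paper.
    from selenium import webdriver
    from axe_selenium_python import Axe

    def detect_violations(url: str) -> list[dict]:
        """Run axe-core against a page and return its reported violations."""
        driver = webdriver.Chrome()
        try:
            driver.get(url)
            axe = Axe(driver)
            axe.inject()          # inject the axe-core script into the page
            results = axe.run()   # run the full axe rule set
            return results["violations"]
        finally:
            driver.quit()

    def build_correction_prompt(violation: dict, category: str) -> str:
        """Compose a category-specific prompt (Syntactic / Semantic / Layout)."""
        snippet = violation["nodes"][0]["html"]
        return (
            f"The following HTML violates accessibility rule '{violation['id']}' "
            f"({category} category): {violation['description']}\n\n"
            f"{snippet}\n\n"
            "Return only the corrected HTML."
        )

    # corrected_html = call_llm(build_correction_prompt(v, "Semantic"))  # hypothetical LLM call

The sketch only shows where the two components connect; the actual prompts, violation taxonomy, and correction logic are specified in the paper.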
Text: ASSETS_2025 - Accepted Manuscript
More information
Accepted/In Press date: 31 July 2025
Venue - Dates: The 27th International ACM SIGACCESS Conference on Computers and Accessibility, Denver, United States, 2025-10-26 - 2025-10-29
Identifiers
Local EPrints ID: 504030
URI: http://eprints.soton.ac.uk/id/eprint/504030
PURE UUID: 57d1afcf-bce7-4989-9606-8318682096b1
Catalogue record
Date deposited: 21 Aug 2025 15:53
Last modified: 22 Aug 2025 02:13
Contributors
Author: Nadeen Fathallah
Author: Daniel Hernandez
Author: Steffen Staab