AccessGuru: leveraging LLMs to detect and correct web accessibility violations in HTML code
Fathallah, Nadeen, Hernandez, Daniel and Staab, Steffen (2025) AccessGuru: leveraging LLMs to detect and correct web accessibility violations in HTML code. In ASSETS '25: Proceedings of the 27th International ACM SIGACCESS Conference on Computers and Accessibility. Association for Computing Machinery. 25 pp. (In Press)
Record type: Conference or Workshop Item (Paper)
Abstract
The vast majority of Web pages fail to comply with established Web accessibility guidelines, excluding a range of users with diverse abilities from interacting with their content. Making Web pages accessible to all users requires dedicated expertise and additional manual effort from Web page providers. To lower this effort and thus promote inclusiveness, we aim to automatically detect and correct Web accessibility violations in HTML code. While previous work has made progress in detecting certain types of accessibility violations, automatically detecting and correcting accessibility violations remains an open challenge that we address. We introduce a novel taxonomy classifying Web accessibility violations into three key categories: Syntactic, Semantic, and Layout. This taxonomy provides a structured foundation for developing our detection and correction method and for selecting and redefining evaluation metrics. We propose our novel method, AccessGuru, which combines existing accessibility testing tools and Large Language Models (LLMs) to detect violations of Web accessibility guidelines, and taxonomy-driven LLM prompting strategies to correct all three accessibility violation categories. To evaluate these capabilities, we have developed a novel benchmark encompassing Web accessibility violations from real-world Web pages. Our benchmark quantifies syntactic and layout compliance and judges semantic accuracy through comparative analysis against human expert corrections. Evaluation on this benchmark shows that our method achieves up to an 84% average decrease in violation score, significantly outperforming existing methods, which achieve at most a 50% average decrease.
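To make the detect-then-correct pipeline described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation: it assumes axe-core as the accessibility testing tool, accessed through the axe-selenium-python package, and substitutes a hypothetical call_llm placeholder for the paper's taxonomy-driven prompting strategies.

    # Illustrative sketch only: axe-core detects violations, an LLM is asked
    # for a corrected snippet. call_llm is a hypothetical placeholder; the
    # prompt below is a simplified stand-in for AccessGuru's taxonomy-driven
    # prompting strategies described in the paper.
    from selenium import webdriver
    from axe_selenium_python import Axe

    def detect_violations(url: str) -> list[dict]:
        """Run axe-core against a page and return its reported violations."""
        driver = webdriver.Chrome()
        try:
            driver.get(url)
            axe = Axe(driver)
            axe.inject()          # inject the axe-core script into the page
            results = axe.run()   # run the full axe rule set
            return results["violations"]
        finally:
            driver.quit()

    def build_correction_prompt(violation: dict, category: str) -> str:
        """Compose a category-specific prompt (Syntactic / Semantic / Layout)."""
        snippet = violation["nodes"][0]["html"]
        return (
            f"The following HTML violates accessibility rule '{violation['id']}' "
            f"({category} category): {violation['description']}\n\n"
            f"{snippet}\n\n"
            "Return only the corrected HTML."
        )

    # corrected_html = call_llm(build_correction_prompt(v, "Semantic"))  # hypothetical LLM call

The sketch only shows where the two components connect; the actual prompts, violation taxonomy, and correction logic are specified in the paper.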
Text: ASSETS_2025 - Accepted Manuscript
More information
Accepted/In Press date: 31 July 2025
Venue - Dates: The 27th International ACM SIGACCESS Conference on Computers and Accessibility, Denver, United States, 2025-10-26 - 2025-10-29
Identifiers
Local EPrints ID: 504030
URI: http://eprints.soton.ac.uk/id/eprint/504030
PURE UUID: 57d1afcf-bce7-4989-9606-8318682096b1
Catalogue record
Date deposited: 21 Aug 2025 15:53
Last modified: 22 Aug 2025 02:13
Contributors
Author: Nadeen Fathallah
Author: Daniel Hernandez
Author: Steffen Staab