AI Is Not Your Accessibility Expert: What LLMs Still Miss About WCAG
Artificial intelligence is rapidly reshaping software development. AI-assisted coding tools and large language models (LLMs) are increasingly integrated into everyday engineering workflows, helping developers accelerate code generation, automate routine tasks, produce documentation, and solve problems across the software development lifecycle.

Accessibility has not been left behind. Developers increasingly rely on AI assistants to generate accessible HTML, recommend ARIA attributes, create keyboard interactions, flag accessibility violations, and address compliance requirements.

At first glance, this looks like a genuine leap forward. Accessibility expertise is scarce, WCAG guidelines are notoriously difficult to interpret, and teams face constant pressure on time and resources. AI promises to bridge the gap.

However, there is an important reality that many organizations are discovering the hard way: AI is a helpful accessibility assistant, but it is not an accessibility expert.

While modern LLMs can produce accessibility-related code and offer guidance on WCAG requirements, they frequently miss critical nuances, context-dependent considerations, and real-world usability challenges. Blindly accepting AI-generated accessibility recommendations can introduce new barriers rather than remove existing ones.

This article explores where AI delivers genuine value, where it falls short, and why human expertise remains essential for building truly inclusive digital experiences.

What AI Gets Right

Before covering the limitations, it is worth recognizing what AI already does well. Many developers receive limited accessibility training. For them, AI can serve as an accessible entry point into best practices.
Consider a simple request: "Create an accessible button." An LLM might generate a button labeled "Submit Form". Reasonable. Similarly, for an accessible form field, it might generate a field labeled "Email Address". Both follow basic accessibility principles correctly.

AI can also:

- Explain WCAG success criteria in plain language
- Suggest semantic HTML elements
- Recommend keyboard interaction patterns
- Draft image alternative text
- Spot common accessibility anti-patterns
- Help with accessibility documentation

These capabilities meaningfully improve developer productivity and help teams build accessibility awareness earlier in the development lifecycle.

The problem is that accessibility is far more than labels and ARIA attributes.

Accessibility Is Context-Dependent

One of the most significant limitations of LLMs is their inability to fully understand context. WCAG compliance frequently depends on business requirements, user goals, content relationships, and interaction patterns that no AI tool can infer from a code snippet.

Consider a modal dialog. Many AI tools generate markup for something like a "Settings" dialog with a "Close" control. This markup is technically valid, but several critical behaviors remain completely unaddressed:

- Is focus moved into the dialog when it opens?
- Is background content hidden from assistive technologies?
- Is focus trapped inside the dialog while it is open?
- Is focus returned to the triggering element when the dialog closes?
- Is keyboard navigation fully supported?

The generated code may appear accessible but fail completely in real user testing. Accessibility success often depends on behavior, not markup alone, and this is precisely where AI frequently struggles.

The ARIA Problem

Perhaps the most common accessibility issue introduced by AI is the misuse of ARIA. Developers often ask: "Make this component accessible." A model will typically respond by adding multiple ARIA attributes, for example by wrapping the text "Submit" in an element that carries several of them. This is technically functional, but it is not the correct solution.
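To see why "technically functional" carries a cost, consider the keyboard behavior that a generic element with role="button" has to reimplement by hand. The following is a minimal sketch, not a production pattern: the names `KeyLike`, `isActivationKey`, and `makeKeydownHandler` are hypothetical, and it simplifies real browser behavior (native buttons typically activate on keydown for Enter but on keyup for Space).

```typescript
// Minimal shape of a keyboard event; a real DOM KeyboardEvent satisfies it.
interface KeyLike {
  key: string;
  preventDefault?: () => void;
}

// Keys that trigger a native <button>. KeyboardEvent.key reports Space as " ".
const ACTIVATION_KEYS: ReadonlySet<string> = new Set(["Enter", " "]);

// Hypothetical helper: does this key press count as "clicking" the control?
function isActivationKey(event: KeyLike): boolean {
  return ACTIVATION_KEYS.has(event.key);
}

// Hypothetical helper: wraps a click callback so that a custom control
// also responds to the keyboard.
function makeKeydownHandler(onActivate: () => void): (event: KeyLike) => void {
  return (event) => {
    if (!isActivationKey(event)) return;
    // Stops Space from scrolling the page while the custom control has focus.
    event.preventDefault?.();
    onActivate();
  };
}
```

Even with this handler attached, the element still needs a tabindex to be focusable and its own handling of disabled state. Every one of those pieces is a place where generated code can silently go wrong.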
The correct solution is simpler: a native element labeled "Submit". A native element automatically provides keyboard support, focusability, screen reader compatibility, touch support, and correct browser accessibility mappings, all without a single ARIA attribute.

AI frequently violates one of the most important accessibility principles: no ARIA is better than bad ARIA. Because LLMs are trained on enormous quantities of internet content, including incorrect examples, they routinely reproduce the accessibility anti-patterns that are widespread across the web.

WCAG Is More Than a Checklist

Many developers believe accessibility is simply a matter of satisfying technical requirements. AI reinforces this misconception by presenting accessibility as a checklist to be ticked off.

A page might include:

- Alt text on all images
- Labels on all form fields
- A logical heading structure
- Sufficient color contrast

...and still deliver a poor experience for users with disabilities.

Imagine a checkout flow containing twenty form fields, repetitive instructions, confusing error messages, and a disorganized keyboard flow. An AI tool might detect zero WCAG violations. A real user with a disability may be unable to complete the purchase.

Genuine accessibility includes:

- Cognitive usability: Is the interface mentally manageable?
- Information architecture: Is content logically organized?
- Content clarity: Are instructions and labels unambiguous?
- Navigation efficiency: Can users reach their goal without unnecessary friction?
- Error prevention: Are mistakes anticipated and recoverable?

These human-centered considerations are difficult for current LLMs to evaluate reliably.

AI Cannot Experience Disability

The most fundamental limitation of AI is straightforward: AI does not experience disability.

A screen reader user experiences a webpage sequentially, through audio output. A keyboard-only user navigates through focus order. A low-vision user may rely on magnification and high contrast. A user with cognitive disabilities processes information differently. AI can describe these experiences, but it cannot genuinely simulate them.

This creates a significant gap between theoretical accessibility and practical accessibility. For example, an AI system might recommend a brief alt text for an image. Technically valid. But if this is the primary product image on an e-commerce site, the description is inadequate. A better description would be: "Men's blue long-sleeve performance running shirt with reflective detailing."

Only a human reviewer can reliably determine whether alternative text actually supports the user's goal in context.

AI Often Misses Dynamic Accessibility Issues

Modern web applications depend heavily on JavaScript frameworks such as React, Angular, Vue, and Next.js. Many accessibility problems only surface after page load. Common examples include:

- Focus management after route transitions
- Live region announcements for dynamic content
- Modal interactions and focus trapping
- Infinite scroll and late-loaded content
- Async form validation and error announcements

AI-generated code frequently overlooks these scenarios. Opening a modal is easy: setIsOpen(true); Managing accessibility after the modal opens is harder. Where should focus move? What should screen readers announce? How should the Escape key behave? What happens when modals are stacked?

These implementation details often determine whether an experience is genuinely accessible or merely appears to be.

Hallucinations and Incorrect Guidance

Another real challenge is hallucination. LLMs occasionally produce guidance that sounds authoritative but is simply incorrect. Documented examples include:

- Recommending ARIA attributes that do not exist
- Misinterpreting WCAG success criteria
- Suggesting outdated accessibility techniques from older specifications
- Confusing requirements between WCAG 2.1 and 2.2
- Creating browser-unsupported ARIA patterns

Many developers lack the expertise to identify these inaccuracies.
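The first item in that list, ARIA attributes that do not exist, is one of the few that can be checked mechanically. The sketch below is illustrative only: `KNOWN_ARIA_SUFFIXES` and `findUnknownAriaAttributes` are hypothetical names, and the attribute list was transcribed by hand from WAI-ARIA 1.2, so it should be verified against the current specification (later drafts add more attributes) before anyone relies on it.

```typescript
// States and properties defined by WAI-ARIA 1.2, without the "aria-" prefix.
// Assumption: hand-transcribed for illustration; verify against the spec.
const KNOWN_ARIA_SUFFIXES: ReadonlySet<string> = new Set([
  "activedescendant", "atomic", "autocomplete", "busy", "checked",
  "colcount", "colindex", "colspan", "controls", "current",
  "describedby", "details", "disabled", "dropeffect", "errormessage",
  "expanded", "flowto", "grabbed", "haspopup", "hidden", "invalid",
  "keyshortcuts", "label", "labelledby", "level", "live", "modal",
  "multiline", "multiselectable", "orientation", "owns", "placeholder",
  "posinset", "pressed", "readonly", "relevant", "required",
  "roledescription", "rowcount", "rowindex", "rowspan", "selected",
  "setsize", "sort", "valuemax", "valuemin", "valuenow", "valuetext",
]);

const ARIA_PREFIX = "aria-";

// Hypothetical helper: returns the aria-* attribute names (lowercased)
// that do not appear in the list above. Non-ARIA attributes are ignored.
function findUnknownAriaAttributes(attributeNames: string[]): string[] {
  return attributeNames
    .map((name) => name.toLowerCase())
    .filter((name) => name.startsWith(ARIA_PREFIX))
    .filter((name) => !KNOWN_ARIA_SUFFIXES.has(name.slice(ARIA_PREFIX.length)));
}
```

Given the names "aria-label", "aria-tooltip", and "class", this returns only "aria-tooltip". A check like this catches invented attribute names and nothing else: misread success criteria and outdated techniques pass straight through it.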
As a result, accessibility defects enter production even when developers believe they have followed AI recommendations carefully. Accessibility decisions should never rest on a single AI-generated answer without independent validation.

Automated Testing Has Hard Limits

A common misconception is that AI-powered tools can replace accessibility testing. They cannot.

Current automated tools, including those enhanced with AI, typically catch only a portion of real-world issues. They can reliably detect missing alt text, missing form labels, color contrast failures, invalid ARIA usage, and heading structure problems. They cannot reliably evaluate:

- Whether alternative text is meaningful in context
- Whether a workflow is cognitively manageable
- Whether error messages are clearly actionable
- Whether a business task can actually be completed
- Whether navigation patterns are efficient under assistive technology

Industry research has consistently found that automated tools identify only a fraction of the accessibility barriers present on a given page. Manual testing is not optional.

What Human Experts Do That AI Cannot

Accessibility professionals provide capabilities that AI cannot currently replicate.

User-centered evaluation. Experts evaluate experiences from the perspective of actual users, not technical specifications. They understand what it feels like to use a screen reader or navigate by keyboard.

Contextual decision-making. Accessibility decisions involve trade-offs between business goals, user needs, and technical constraints. Human experts navigate those trade-offs intelligently.

Assistive technology testing. Real testing means using screen readers (NVDA, JAWS, VoiceOver), keyboard-only navigation, voice control software, and magnification tools across browsers and operating systems. AI cannot independently perform these evaluations.

Governance and training. Organizations require policies, standards, design system reviews, developer training programs, and ongoing compliance oversight. These responsibilities extend far beyond code generation.

The Right Way to Use AI for