Test-driven HTML and accessibility

by David Luhr published on Dec 12, 2023

When I started writing unit tests and following a test-driven development (TDD) workflow, I was stoked with the immediate feedback and confidence I gained in every line of JavaScript I wrote.

TDD improved my software design with simpler, more predictable code. It saved me countless hours of manual debugging by reporting errors directly in failing tests. And with solid test coverage, it allowed me to add new logic and functionality while keeping everything in an always-working state.

I followed the process of Red, Green, Refactor, meaning I wrote a failing test, wrote just enough code to make the test pass, and then freely cleaned up my code, knowing everything was working. My JavaScript code had never been better.

Then one day, I wrote a UI-oriented test that used document.querySelector(). Before the test even finished running, it errored and revealed that JavaScript tests run outside of the browser in Node, where no document object model (DOM) exists. Since the focus of my career is evaluating, designing, and developing accessible user interfaces, primarily with HTML and CSS, I wanted to apply TDD techniques to HTML and accessibility. But it seemed like my dream testing setup wasn't possible.

Months passed of continuing to enjoy TDD with JavaScript, while wishing I could do more to test my accessible markup. Most of my research pointed to slow, flaky approaches like scripting an entire headless browser just to check if an HTML element rendered or if a menu opened on click.

Other testing libraries use synthetic (faked) interactions by dispatching DOM events that might lead to false positives in tests if they don't accurately replicate real user interaction. Plus, these libraries don't have the necessary DOM events for all interactions. I wanted to write small, focused, fast unit tests that provided trustworthy, immediate feedback with every change I made.

Then I came across a tool called Web test runner from the team behind Open Web Components. Instead of running JavaScript tests in Node, Web Test Runner executes tests directly in the browser.

This is a big deal for multiple reasons:

Tests run your production JavaScript code in the production environment—the browser
Tests have access to the real DOM, unlocking the ability to write tests for HTML, accessibility, and even CSS
Tests can send native interactions, such as actual mouse clicks and keyboard input, instead of faking them with DOM events
And it's even possible do things like set the viewport dimensions to test responsive design

Shift accessibility feedback earlier

With this approach, I write tests to capture my expectations around semantic markup and accessibility upfront, then get instant feedback on whether my code meets those expectations.

Let's say I expect paragraph (<p>) elements shouldn't be empty. I can write the following test, which gets all <p> elements, loops over them, and expects each one doesn't have empty text content:

describe("paragraphs", () => {
	const docAllParagraphs = Array.from(document.querySelectorAll("p"));

	it("have text content", () => {
		docAllParagraphs.forEach((paragraph) => {
			expect(paragraph.textContent).to.not.equal("");
		});
	});
});

With access to the real DOM in tests, I can continuously check if I'm using heading levels, ARIA, and landmark regions correctly. I can build custom UI patterns and ensure they respond to keyboard input and manage focus as expected. And as I add new features or refactor my code, my tests re-run with every change to make sure I don't harm accessibility as the project evolves.

Accessibility-focused linters give feedback as you code, but have limited awareness outside of very specific violations. Automated accessibility audits and most manual review happens after you write the code. Test-driven accessibility is about providing continuous feedback before, during, and after you write the code.

Using all these forms of accessibility testing together provides the most value. Unit testing HTML allows me find, fix, and prevent accessibility issues as I'm working. It also allows me to focus my manual accessibility reviews on more holistic evaluations, such as navigating through a page with a screen reader to asses the user experience.

This test-driven approach makes something possible that I've been striving for in my work: an accessibility-first workflow. With an accessibility-first mindset, we plan for and preserve accessibility from the beginning of a project instead of leaving it as an afterthought. If I could achieve one thing in my career, it would be to make accessibility-first a foundational practice alongside usability, responsive design, performance, and security.

Evolve code with confidence

In addition to getting feedback upfront and continuously thereafter, having tests allows us to change our code freely as features and user needs evolve.

The downside of automated accessibility testing tools and manual review is they can't be performed continuously, and often focus on the current page in the browser. We either have to manually re-test with each change we make exhaustively across all pages, or we depend on getting feedback after batches of changes.

With tests in place, we can keep our code (and accessibility) in an always-working state. Every time we make a change, our tests run across the entire codebase. If our tests pass, we can more confidently and frequently deploy our changes to users.

The setup

To begin testing our HTML in the browser, we first install our dependencies:

npm i --save-dev @web/test-runner @web/test-runner-mocha @esm-bundle/chai @web/test-runner-commands @web/test-runner-playwright

We need Web test runner to run tests in the browser, Web Test Runner's implementation of Mocha for directly testing HTML files, Chai for test assertions, Web Test Runner commands for sending native interactions (click, keyboard, etc.) to our UI elements, and Web Test Runner's implementation of Playwright for headless browsers to run the tests in.

With this setup, we run our tests in Chromium, Firefox, and WebKit all at once to make sure our code works across the major browsers. This happens super quickly in the background and works in different operating systems.

Even through browser cross-compatibility continues to improve, this testing setup reveals discrepancies that would be incredibly difficult to otherwise detect. For example, I recently found that getting the value of an element's role attribute with Element.role was failing only in Firefox, and using Element.getAttribute("role") was the fix. I'm not sure if or how I would have found that issue without tests that run directly in each of the browsers.

Next, we add a couple scripts to our package.json to run tests either in a single pass or in watch mode:

"scripts": {
  "test": "web-test-runner \"/**/*.test.js\" --node-resolve --playwright --browsers chromium firefox webkit",
  "test:watch": "web-test-runner \"/**/*.test.js\" --node-resolve --playwright --browsers chromium firefox webkit --watch"
},

Last, we can run our watch mode script as we work. This will run all tests, then detect any file changes and re-run only the impacted tests:

npm run test:watch

If our website is made with simple HTML pages, we can run the tests directly in the HTML files (with a <script> tag) to test specific markup. If our HTML is rendered with JavaScript, as is the case with web components and many other approaches, then we can test the rendering in JS files alongside any logic or functionality that we'd usually unit test. We'll compare the tradeoffs of these two options at the end of this article.

Writing tests

To demonstrate the value of this setup, let's write tests to prevent some of the most common (non-visual) WCAG 2 failures found in the annual WebAIM Million study:

Low contrast text
Missing alt text
Empty links
Missing form input labels
Empty buttons
Missing document language

Writing tests for low contrast text is possible but too involved for this post. So let's focus on tests for the remaining most common failures.

First, we'll create a test file called example.test.js with the required imports:

import { runTests } from "@web/test-runner-mocha";
import { expect } from "@esm-bundle/chai";
import { sendKeys } from "@web/test-runner-commands";

runTests(() => {
  // TODO: Write tests here
});

runTests allows us to directly test HTML files, which makes this setup quick and easy for static HTML. @web/test-runner-commands provides native interaction utilities, such as sendKeys, that we'll use in a later example.

Then we'll create an example HTML file called example.html with the usual boilerplate and link to the test file:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Our test file</title>

    <script type="module" src="example.test.js"></script>
  </head>
  <body>
    <!-- TODO: Write markup here -->
  </body>
</html>

Missing alt text

In example.test.js, let's write a test to check for missing image alt text:

// ...
runTests(() => {
  describe("images", () => {
    const docAllImages = Array.from(document.querySelectorAll("img"));

    it("have an alt attribute", () => {
      docAllImages.forEach((image) => {
        expect(image.getAttribute("alt"), image.outerHTML).to.exist;
      });
    });
  });
});

This test gets all images in the DOM, loops over them, and checks to make sure each one has an alt attribute.

Error messages in JavaScript include the line and character numbers to help quickly locate and fix errors, but we don't have access to this with HTML elements. By passing image.outerHTML as the second argument of our expect() function, we can include the element in the error message to make finding and correcting the offending element easier. We'll use this for custom error messages any time we're checking all instances of an element.

If this is too verbose or noisy for more complex elements, we could instead only include the element's opening tag, or assign all elements unique data-test-id values to identify them directly in test error messages.

In our example.html file, let's add an image without an alt attribute to get a failing test:

<img src="/example.jpg" />

Web Test Runner reports the following in the command line:

❌ images > have an alt attribute
	AssertionError: <img src="/example.jpg">: expected null to exist
		at example.test.js:9:41

Chromium: 1 failed
Firefox: 1 failed
Webkit: 1 failed

Our test fails as expected. To get the test to pass, we simply need to add an alt attribute, either as an empty string "" for decorative content or with a useful description for meaningful content.

The neat thing about this test is it passes if no <img> elements exist, but fails if they exist and don't have an alt attribute. So, this test can be used from the beginning of any project without creating irrelevant noise.

Let's write tests for other common WCAG 2 failures.

Empty links

We'll add another test in example.test.js:

runTests(() => {
  // ...
  describe("links", () => {
    const docAllLinks = Array.from(document.querySelectorAll("a"));

    it("have a non-empty href attribute", () => {
      docAllLinks.forEach((link) => {
        const hrefValue = link.getAttribute("href");
        expect(hrefValue, link.outerHTML).to.exist;
        expect(hrefValue, link.outerHTML).to.not.equal("");
      });
    });

    it("are not empty", () => {
      docAllLinks.forEach((link) => {
        expect(link.textContent, link.outerHTML).to.not.equal("");
      });
    });
  });
});

Similar to the previous test, we get all links from the DOM, loop over them, and check that their text content is not empty. We're also checking that links have a non-empty href attribute, just as an example of other HTML validation we'd like to do.

Missing form input labels

Time for another test suite:

runTests(() => {
  // ...
  describe("form inputs", () => {
    const docAllFormInputs = Array.from(
      document.querySelectorAll("input, textarea, select")
    );

    it("have a dedicated label", () => {
      docAllFormInputs.forEach((formInput) => {
        const inputId = formInput.id;
        expect(inputId, formInput.outerHTML).to.exist;

        const inputLabel = document.querySelector(`label[for="${inputId}"]`);
        expect(inputLabel, formInput.outerHTML).to.exist;
        expect(inputLabel.textContent, formInput.outerHTML).to.not.equal("");
      });
    });
  });
});

In this test, we get all form input elements, loop over them, and assert multiple things:

Each form input element needs an id attribute
We expect to find a <label> element with a for attribute that has the corresponding input's id as the value
We expect the <label> element to not be empty

It's possible to provide an accessible name for form inputs with other techniques, such as aria-label or aria-labelledby, but I'm being opinionated here because I prefer actual <label> elements and text content. Testing for an accessible name through other means is just as easy.

Empty buttons

This is nearly identical to our empty links test:

runTests(() => {
  // ...
  describe("buttons", () => {
    const docAllButtons = Array.from(document.querySelectorAll("button"));

    it("are not empty", () => {
      docAllButtons.forEach((button) => {
        expect(button.textContent, button.outerHTML).to.not.equal("");
      });
    });
  });
});

Missing document language

Let's write a test for the final common WCAG 2 failure:

runTests(() => {
  // ...
  describe("document", () => {
    it("has a set language", () => {
      const languageValue = document.querySelector("html").getAttribute("lang");

      expect(languageValue).to.equal("en");
    });
  });
});

This one is the most straightforward, but valuable nonetheless. If your site has internationalization, you'd want to check the lang value matches the current locale and updates based on user preference.

Testing heading levels

We're starting to build a universal test suite that will preserve accessibility across our projects.

Let's add some further assertions around heading levels, which are a common source of errors in accessibility audits:

runTests(() => {
  // ...
  describe("headings", () => {
    const docAllHeadings = Array.from(
      document.querySelectorAll("h1, h2, h3, h4, h5, h6")
    );

    it("have <h1> element as the first heading", () => {
      expect(
        getHeadingLevel(docAllHeadings[0]),
        docAllHeadings[0].outerHTML
      ).to.equal(1);
    });

    it("have a single <h1>", () => {
      docAllHeadings.forEach((heading, index) => {
        // Don't fail the test if the first heading on the page is `<h1>`
        if (index === 0 && getHeadingLevel(heading) === 1) {
          return;
        }

        expect(getHeadingLevel(heading), heading.outerHTML).to.not.equal(1);
      });
    });

    it("don't skip heading levels", () => {
      docAllHeadings.forEach((heading, index) => {
        let previousHeadingLevel = 0;
        const currentHeadingLevel = getHeadingLevel(heading);

        if (index !== 0) {
          previousHeadingLevel = getHeadingLevel(docAllHeadings[index - 1]);
        }

        expect(currentHeadingLevel, heading.outerHTML).to.be.lessThanOrEqual(
          previousHeadingLevel + 1
        );
      });
    });
  });
});

function getHeadingLevel(heading) {
  return +heading.tagName.toLowerCase().replace("h", "");
}

With these tests, we can ensure heading levels are used properly in our page. We should only have a single <h1> heading. This heading should also be the first heading on the page. Lastly, headings should never skip levels, such as <h2> followed by <h4>. We make use of small utility function, getHeadingLevel(), to keep our code more concise and mistake-proof.

Although these tests enforce correct heading logic, they can't evaluate if heading levels are appropriate for the content they describe. That always requires thoughtful consideration. Tests free the developer to spend more time on wider UX considerations like this.

Testing user interaction

So far, we've made simple assertions about our static HTML to make sure it's valid and not causing common accessibility failures. But as we create interactive patterns in our UI, the accessibility considerations become much more complex. We need to use HTML and ARIA to create custom semantics. We need to handle click, tap, and keyboard events. And we need to display and hide content, all while updating multiple attributes in concert.

Let's follow the Red, Green, Refactor workflow as we build a custom disclosure (show/hide) pattern.

We'll start with our tests in example.test.js to make sure our initial HTML and ARIA are correct:

runTests(() => {
  // ...
  describe("disclosure", () => {
    const docDisclosureToggle = document.querySelector(
      `[data-component="disclosureToggle"]`
    );
    const docDisclosureContent = document.querySelector(
      `[data-component="disclosureContent"]`
    );

    it("has a toggle button", () => {
      expect(docDisclosureToggle).to.exist;
      expect(docDisclosureToggle.tagName).to.equal("BUTTON");
    });

    it("has a toggle button with the expected ARIA attributes", () => {
      expect(docDisclosureToggle.getAttribute("aria-expanded")).to.exist;
      expect(docDisclosureToggle.getAttribute("aria-expanded")).to.equal(
        "false"
      );
      expect(docDisclosureToggle.getAttribute("aria-controls")).to.exist;
      expect(docDisclosureToggle.getAttribute("aria-expanded")).to.not.equal(
        ""
      );
    });

    it("has a content panel with the corresponding controls ID", () => {
      expect(docDisclosureContent).to.exist;
      expect(docDisclosureContent.id).to.equal(
        docDisclosureToggle.getAttribute("aria-controls")
      );
    });

    it("has a hidden content panel by default", () => {
      expect(docDisclosureContent.getAttribute("hidden")).to.exist;
    });
  });
});

These tests will fail, meaning we're in the "Red" stage. Now, we can author our initial HTML in example.html to get our tests to pass:

<button
  type="button"
  aria-expanded="false"
  aria-controls="disclosureContent"
  data-component="disclosureToggle"
>
  Open disclosure
</button>

<div id="disclosureContent" data-component="disclosureContent" hidden>
  <p>Disclosure content</p>
</div>

Our tests now pass, so we're in the "Green" stage. I'd make a Git commit at this point. If there are any improvements we'd like to make, we can make them with confidence as long as our tests are passing, and commit again each time we're in a working state.

Let's write a failing test to begin interacting with our disclosure in example.test.js:

runTests(() => {
  // ...
  describe("disclosure", () => {
    // ...
    it("opens the disclosure on keyboard Enter press", async () => {
      docDisclosureToggle.focus();

      await sendKeys({
        down: "Enter",
      });

      expect(docDisclosureToggle.getAttribute("aria-expanded")).to.equal(
        "true"
      );
      expect(docDisclosureContent.getAttribute("hidden")).to.not.exist;

      // TODO: Reset the disclosure
    });
  });
});

With this test, we move focus to our toggle button, then use the sendKeys function from Web Test Runner commands to send a native keyboard Enter key press. We're checking that our toggle button gets updated attributes, and that our disclosure content is no longer hidden. We'd want similar tests for mouse click and the keyboard Space press, as well as testing that the disclosure closes on these interactions if it's already open.

We're in the "Red" stage again, but I'll leave the rest of the work to you to add the remaining tests and get them to pass. A cool thing is that the functionality needed for this component to work also creates some handy utility functions for our tests, such as resetting the disclosure at the end of each test to create a consistent starting point for other tests. And these utilities are easy to unit test with this approach as well.

Wrapping up

This workflow for test-driven HTML and accessibility provides so much value in my daily work, and I hope teams can adopt this technique to create a more responsible and usable web for everyone. I've used this setup extensively in large projects with complex UI patterns and it has scaled gracefully with several hundred tests running at all times.

With this setup, we can write unit tests that have access to the real DOM across major browsers, which allows us to check our HTML for validity, interactivity, and accessibility. We can run these tests in multiple browsers at once with every change, and preserve accessibility from the very start of our project. These expectations are permanently captured in our code, so we can freely refactor our work and add new features without introducing regressions.

The examples in this post assume a static HTML file that we directly linked our tests to. This is the fastest and most direct way to test our HTML, but we don't want our test file to load in production. As a result, we'd need to remove the <script> tag before deploying to the web. To make this approach more convenient, it'd be good to do this automatically with a build command that creates a /dist folder or something similar.

Most projects probably won't just have static HTML files, and instead render components with JavaScript. In this case, we create standalone .test.js files that import corresponding JavaScript modules and call render functions or methods directly. From there, the workflow and benefits are all the same, but there may be a little more effort in setting up specific UI situations to test vs. having HTML ready to go.

As a final consideration, this testing technique enhances, not replaces, other forms of accessibility testing and shifts a lot of the feedback earlier in the process. Using accessibility-focused code linters, automated accessibility testing tools, and manual accessibility review together provides the most value.

Many of the examples we covered in this post are checks we would otherwise have to make manually, either through code review or running accessibility audit tools. By writing unit tests for accessibility, we can discover and fix more issues as they arise. And we free more capacity to manually evaluate other aspects of accessibility, such as using assistive technology to evaluate the flow of a page or automated testing tools to check color contrast.

Automated tools cover about 30% of the WCAG success criteria and detect about 57% of overall issues, leaving the rest for manual evaluation. With this approach, unit tests can increase coverage, particularly with interactive UI patterns. Even with more automated test coverage, manual review remains the most effective accessibility testing technique for finding issues and building empathy.

My long-term goal with this work is to build a robust, universal accessibility test collection that can be used across projects. As we explored with our disclosure example, it's also possible to create test suites for custom UI patterns such as accordions, menus, tooltips, and others.

Let me know if you'd be interested in using this test collection or if you'd like to collaborate. Together, we can make test-driven accessibility and accessibility-first development a reality.

About David Luhr

David Luhr is an independent consultant who helps teams of all sizes with accessible design and development. He is passionate about creating a more responsible web for everyone, eliminating waste, and creating free educational content through his Build UX YouTube channel.

Personal website and blog: luhr.co
YouTube: youtube.com/@buildux
LinkedIn: linkedin.com/in/davidluhr