Screen Reader Testing Done Right: NVDA, JAWS, and VoiceOver Playbook for Testers

Most accessibility testing guides read like compliance checklists written by lawyers. Run an automated scanner, fix the red items, call it done. That approach catches maybe 30–40% of real issues. The remaining 60–70% only surfaces when an actual human navigates your product using a screen reader, and that experience can be absolutely broken even when axe-core gives you a green checkmark.

This article is a working playbook. It covers the three screen readers that matter — NVDA, JAWS, and VoiceOver — with the exact key combos, testing flows, and gotchas that QA engineers and developers need to actually catch bugs, not just audit for them.

No theory lectures. Let’s get into it.


Why These Three, and Not Others

NVDA and JAWS together dominate Windows desktop usage (WebAIM’s survey consistently puts their combined share above 70%). VoiceOver ships with every Apple device, which makes it the default for iOS testing. If your app works well with all three, you’ve covered the vast majority of real users who depend on assistive technology.

TalkBack (Android), Narrator (Windows), and Orca (Linux) matter too, but this guide keeps scope tight on purpose. Master these three first.


Setting Up Your Testing Environment

NVDA (Free, Windows)

NVDA is open source and free. Get it from nvaccess.org. Install it on a real Windows machine or a VM — both work fine.

Key install note: during setup, enable "Use NVDA during sign-in" if you want it available on the Windows sign-in screen. Not critical for testing but useful.

By default NVDA uses the Insert key as its modifier (the "NVDA key"). If you’re on a laptop without a dedicated Insert key, go to NVDA Menu → Preferences → Settings → Keyboard and switch the modifier to CapsLock. This prevents constant conflict with your other workflows.

Install the NVDA Remote add-on if you’re testing remotely — it lets you share a session so a developer can hear exactly what you’re hearing.

For testing sessions where you need silence, go to NVDA Menu → Preferences → Settings → Speech and set the synthesizer to "No speech". Then follow the output on a Braille display if you have one, or read it in the Speech Viewer (NVDA Menu → Tools → Speech Viewer). The Speech Viewer is a floating window showing everything NVDA speaks, which makes it essential for asynchronous documentation and bug reports.

JAWS (Commercial, Windows)

JAWS from Freedom Scientific is the most common enterprise screen reader. A full license is expensive, but there’s a 40-minute trial mode — after 40 minutes the machine needs a reboot to get another 40 minutes. That’s enough for structured test sessions if you plan them well.

Download from freedomscientific.com. The JAWS key is also Insert by default (or CapsLock on laptops).

One critical JAWS-specific thing: JAWS has its own virtual cursor mode called "Virtual PC Cursor" (on by default for web content). The behavior differs subtly from NVDA’s browse mode, which is why the same page can behave differently across the two.

Also install JAWS Tandem if you’re doing remote sessions — same idea as NVDA Remote.

VoiceOver (macOS, iOS)

On macOS, VoiceOver is built in. Enable it with Command + F5 or through System Settings → Accessibility → VoiceOver. The VoiceOver modifier is by default Control + Option (abbreviated as VO in shortcut notation).

On iOS, go to Settings → Accessibility → VoiceOver and toggle it on. Warning: once enabled, all touch interactions change. A single tap announces an element; a double tap activates it. If you’ve never used it before, practice for 10 minutes before starting a test session or you’ll constantly activate things you didn’t mean to.


Understanding Browse Mode vs. Application Mode

This is the single concept that trips up developers most often.

Screen readers on Windows have two primary modes when handling web content:

Browse mode (virtual cursor mode): the screen reader builds a virtual representation of the DOM and lets users navigate it with arrow keys and quick-nav letters (H for headings, B for buttons, F for form fields, etc.). The browser doesn’t receive most keystrokes; the screen reader intercepts them.

Application/Forms mode (NVDA calls it focus mode, JAWS calls it forms mode): the screen reader passes keys through to the browser/application. Required for interactive widgets like custom autocomplete, date pickers, sliders, or anything that uses arrow keys internally.

The switch between modes:

  • NVDA: NVDA + Space to toggle
  • JAWS: Insert + Z toggles the Virtual PC Cursor; JAWS also often auto-switches to forms mode when focus lands on a form element
  • VoiceOver: doesn’t use browse mode in the same way — it uses the VO+Arrow navigation model instead

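Gotcha #1: Your custom JavaScript widget intercepts arrow keys but the screen reader is still in browse mode. The user presses arrow-down expecting to move to the next option, but instead the screen reader moves its virtual cursor further down the page. Fix: give the widget the specific role that matches it (role="listbox", role="combobox", role="slider", and so on) and implement the matching ARIA keyboard pattern, so the screen reader knows to switch modes. Treat role="application" as a last resort, because it turns off browse-mode navigation for everything inside it. This is one of the most common real-world bugs.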
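
As a rough illustration of the keyboard half of that fix, here is a minimal sketch of the arrow-key logic for a listbox that tracks its active option by index. The name nextListboxIndex and the choice to stop at the ends rather than wrap are assumptions made for this example, not part of any particular library.

```typescript
// Keys a listbox is expected to handle itself once the screen reader
// has switched out of browse mode.
type ListboxKey = "ArrowDown" | "ArrowUp" | "Home" | "End";

// Returns the index of the option that should become active.
// Hypothetical helper: stops at the first/last option instead of wrapping.
function nextListboxIndex(key: ListboxKey, current: number, optionCount: number): number {
  if (optionCount <= 0) return -1; // nothing to select
  switch (key) {
    case "ArrowDown":
      return Math.min(current + 1, optionCount - 1);
    case "ArrowUp":
      return Math.max(current - 1, 0);
    case "Home":
      return 0;
    case "End":
      return optionCount - 1;
  }
}
```

In a real widget the returned index would typically decide which option id is written to aria-activedescendant on the element carrying role="listbox". Verify the result with NVDA and JAWS rather than trusting the markup alone.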


Core Navigation Shortcuts You Must Know

Don’t rely on mouse testing with screen readers enabled. Navigate exclusively with the keyboard. Here’s the reference you’ll actually use:

NVDA / JAWS (Browse Mode)

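Action                          NVDA            JAWS
Next/previous heading           H / Shift+H     H / Shift+H
Next/previous level-2 heading   2 / Shift+2     2 / Shift+2
Next/previous link              K / Shift+K     U / Shift+U (unvisited), V / Shift+V (visited)
Next/previous button            B / Shift+B     B / Shift+B
Next/previous form field        F / Shift+F     F / Shift+F
Next/previous landmark region   D / Shift+D     R / Shift+R
Next/previous table             T / Shift+T     T / Shift+T
Next/previous list              L / Shift+L     L / Shift+L
Read from here                  NVDA+↓          Insert+↓
Stop speech                     Ctrl            Ctrl
Read current line               NVDA+↑          Insert+↑
Elements list                   NVDA+F7         Insert+F7 (links), Insert+F6 (headings)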

The Elements List (NVDA+F7) is your best friend. It lets you pull up all headings, links, form elements, or landmarks on a page in a filterable list. Use it to quickly spot structural problems — missing headings, duplicate link text like "click here", unlabeled buttons.

VoiceOver (macOS, Safari)

Action                          Key
Move to next/previous element   VO+→ / VO+←
Enter a group                   VO+Shift+↓
Exit a group                    VO+Shift+↑
Activate element                VO+Space
Open rotor                      VO+U
Next/previous heading           VO+Command+H / VO+Command+Shift+H
Browse headings in the rotor    VO+U, then ←/→ to Headings, then ↑/↓
Read page from cursor           VO+A
Stop speech                     Ctrl

The Rotor (VO+U) is VoiceOver’s equivalent of NVDA’s Elements List. It gives you a spinning menu to navigate by headings, links, landmarks, form controls, tables. Cycle through rotor categories with ← and →, navigate items with ↑ and ↓.

Gotcha #2: VoiceOver on Safari and VoiceOver on Chrome behave differently. Safari is the reference browser for VoiceOver testing on macOS — Apple tunes the integration specifically for it. Don’t assume Chrome results reflect what VoiceOver users actually experience.


A Structured Testing Workflow

Ad-hoc "let me just tab around" testing misses too much. Use a structured flow.

Pass 1: Page Structure Audit

Before interacting with any functionality, check the bones of the page.

Open the Elements List (NVDA+F7) or the Rotor and look at:

  1. Heading hierarchy — Does H1 appear exactly once? Do H2s and H3s nest logically? A page going H1 → H4 → H2 is a structural bug.
  2. Landmarks — Is there a <main>, <nav>, <header>, <footer>? These are how screen reader users jump directly to sections. Missing landmarks force users to linear-tab through everything.
  3. Link text — Are there multiple "Read more", "Click here", or "Learn more" links? Out of context they’re meaningless. The link text must describe the destination.
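
The heading and link-text checks above can be partly scripted once you have exported the lists. This is a minimal sketch that assumes you already have the heading levels and link texts as plain arrays (for example, copied out of the Elements List). The function names and the list of generic phrases are illustrative.

```typescript
// Link texts that say nothing about the destination when read out of context.
const GENERIC_LINK_TEXT = ["read more", "click here", "learn more"];

// Flags heading-structure problems from levels listed in document order,
// e.g. [1, 2, 3, 2] for H1, H2, H3, H2.
function auditHeadingLevels(levels: number[]): string[] {
  const problems: string[] = [];
  const h1Count = levels.filter((level) => level === 1).length;
  if (h1Count !== 1) problems.push(`expected exactly one H1, found ${h1Count}`);
  for (let i = 1; i < levels.length; i++) {
    // Going deeper by more than one level means a heading level was skipped.
    if (levels[i] - levels[i - 1] > 1) {
      problems.push(`H${levels[i - 1]} followed by H${levels[i]} at position ${i}`);
    }
  }
  return problems;
}

// Returns the generic link texts found, ignoring case and surrounding spaces.
function findGenericLinkText(linkTexts: string[]): string[] {
  return linkTexts.filter((text) => GENERIC_LINK_TEXT.indexOf(text.trim().toLowerCase()) !== -1);
}
```

For the H1 → H4 → H2 page mentioned above, auditHeadingLevels([1, 4, 2]) reports the H1 to H4 jump. A script like this only narrows down where to listen; it does not tell you whether the headings describe the content well.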

Pass 2: Keyboard Navigation Flow

Navigate through the entire primary user journey using only Tab, Shift+Tab, Enter, Space, and arrow keys (no mouse). Document every point where:

  • Focus is lost (jumps to top of page or becomes invisible)
  • Focus order is illogical (jumps around visually)
  • An interactive element can’t be reached by keyboard at all
  • An element is reachable but can’t be activated

Gotcha #3: Focus trap in modals. When a modal opens, focus must be trapped inside it. Users should not be able to Tab out of an open modal into the underlying page. When the modal closes, focus must return to the element that triggered it. This breaks in roughly 40% of custom modal implementations.
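
For reference when reviewing a custom modal, here is a minimal sketch of the two behaviors described above: wrapping Tab and Shift+Tab inside the modal, and returning focus to the trigger on close. FocusTarget, trappedFocusIndex, and ModalFocusSession are hypothetical names for this example; a real implementation would also have to collect the focusable elements and listen for keydown events.

```typescript
// Minimal shape of anything that can take focus; a DOM HTMLElement satisfies it.
interface FocusTarget {
  focus(): void;
}

// Index to focus next when Tab or Shift+Tab is pressed inside an open modal.
// Wraps at both ends so focus never escapes to the page underneath.
function trappedFocusIndex(current: number, count: number, shiftKey: boolean): number {
  if (count <= 0) return -1; // no focusable elements in the modal
  const step = shiftKey ? -1 : 1;
  return (current + step + count) % count;
}

// Remembers the trigger on open and hands focus back on close.
class ModalFocusSession {
  private readonly trigger: FocusTarget;
  private readonly items: FocusTarget[];

  constructor(trigger: FocusTarget, items: FocusTarget[]) {
    this.trigger = trigger;
    this.items = items;
  }

  open(): void {
    // Move focus into the modal as soon as it opens.
    if (this.items.length > 0) this.items[0].focus();
  }

  close(): void {
    // Return focus to the element that opened the modal.
    this.trigger.focus();
  }
}
```

When testing, the expected outcome is simple to state: Tab from the last control lands on the first, Shift+Tab from the first lands on the last, and closing the modal puts focus back on the button that opened it.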

Pass 3: Form Testing

Forms are where accessibility really earns its keep. For each form:

  • Tab through every field and confirm the label is announced (not just placeholder text — placeholders disappear and aren’t reliably announced)
  • Submit an intentionally invalid form and check that errors are announced. Does focus move to the error? Does the error message have an aria-live region or is it only visual?
  • Confirm that required fields are identified as required before the user submits — not just after

Gotcha #4: placeholder is not a label. This comes up constantly. A field with only a placeholder attribute reads as "edit text" in some screen readers with no further description. Always use <label for="..."> or aria-label or aria-labelledby. Always.
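
To make the distinction concrete, here is a minimal sketch that classifies a field by where its name comes from. The FieldLabelling shape is a simplified, hypothetical description of the markup. It is not the full accessible name computation that browsers perform, and it deliberately ignores fallbacks such as the title attribute.

```typescript
// Simplified description of a form field's labelling markup.
// Hypothetical shape for illustration; real checks should inspect the DOM.
interface FieldLabelling {
  hasAssociatedLabel: boolean;   // <label for="..."> or a wrapping <label>
  ariaLabel?: string;
  ariaLabelledbyText?: string;   // text of the elements referenced by aria-labelledby
  placeholder?: string;
}

type LabelVerdict = "labelled" | "placeholder-only" | "unlabelled";

function isNonEmpty(value: string | undefined): boolean {
  return value !== undefined && value.trim().length > 0;
}

// A placeholder never counts as a label in this check.
function classifyFieldLabel(field: FieldLabelling): LabelVerdict {
  if (field.hasAssociatedLabel || isNonEmpty(field.ariaLabel) || isNonEmpty(field.ariaLabelledbyText)) {
    return "labelled";
  }
  return isNonEmpty(field.placeholder) ? "placeholder-only" : "unlabelled";
}
```

Anything that comes back as "placeholder-only" is a candidate for the bug described above and is worth confirming by tabbing to the field with each screen reader.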

Pass 4: Dynamic Content

Any content that updates without a page reload needs explicit ARIA announcements, or screen reader users simply won’t know it changed.

Test:

  • Toast/snackbar notifications — are they in an aria-live="polite" region?
  • Error messages appearing after async validation
  • Loading states — is the spinner announced? Does focus move somewhere useful when loading completes?
  • Infinite scroll / lazy load — is new content reachable by keyboard after loading?

aria-live="polite" announces content when the user is idle. aria-live="assertive" interrupts immediately. Use assertive only for genuine errors or critical alerts — it’s disruptive.
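
A minimal sketch of that rule of thumb, assuming your app can tag each update with a kind. The UpdateKind labels and both function names are invented for this example.

```typescript
// Hypothetical labels for the kinds of update an app might announce.
type UpdateKind = "toast" | "loading-status" | "validation-error" | "critical-alert";
type Politeness = "polite" | "assertive";

// Only errors and critical alerts are allowed to interrupt the user.
function politenessFor(kind: UpdateKind): Politeness {
  return kind === "validation-error" || kind === "critical-alert" ? "assertive" : "polite";
}

// Markup for an empty live region container for the given kind of update.
function liveRegionMarkup(kind: UpdateKind): string {
  return `<div aria-live="${politenessFor(kind)}"></div>`;
}
```

The container is typically rendered empty up front and the message text inserted later, since screen readers generally announce changes to a region they already know about. Treat that as something to verify in each reader rather than a guarantee.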

Pass 5: Images and Media

For every image, confirm meaningful images have descriptive alt text and decorative images have alt="". This is table stakes — but still breaks constantly in production.
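
A quick triage helper can sort images before you listen to each one. This is a minimal sketch that assumes you have each image's alt value as a string, or undefined when the attribute is absent. The list of placeholder words and the filename pattern are illustrative guesses, not an established rule set.

```typescript
type AltVerdict = "missing" | "decorative" | "suspicious" | "ok";

// alt is undefined when the attribute is absent, "" when explicitly empty.
function classifyAltText(alt: string | undefined): AltVerdict {
  if (alt === undefined) return "missing";
  if (alt === "") return "decorative"; // alt="" marks the image as decorative
  const text = alt.trim().toLowerCase();
  const placeholderWords = ["image", "photo", "picture", "graphic", "img"];
  const looksLikeFilename = /\.(png|jpe?g|gif|svg|webp)$/.test(text);
  // Whitespace-only values, bare placeholder words, and filenames carry no meaning.
  if (text === "" || placeholderWords.indexOf(text) !== -1 || looksLikeFilename) {
    return "suspicious";
  }
  return "ok";
}
```

A verdict of "ok" only means the text is not obviously empty or generic. Whether it actually describes the image, and whether a "decorative" image really is decorative, still needs a human.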

For video: are captions available? Does the media player itself have keyboard controls that are labeled? Can the user reach and operate Play, Pause, Volume, Fullscreen without a mouse?


Cross-Reader Comparison: What to Expect

Running the same test across NVDA, JAWS, and VoiceOver will surface differences. Not all of them are bugs — screen readers interpret the same markup differently by design. Here’s what to watch for:

ARIA live regions — NVDA and VoiceOver handle aria-live fairly consistently. JAWS has historically been more aggressive about which updates it announces and can sometimes be noisy or miss updates depending on the region’s DOM position. Test all three.

aria-label on non-interactive elements — NVDA reads aria-label on <div> with role="group". JAWS sometimes ignores it. If you’re labeling a group of related content, test this explicitly.

Table headers — JAWS and NVDA both announce <th> scope correctly if scope="col" or scope="row" is set. VoiceOver reads table headers too, but its announcement format is different. None of them reliably extrapolate header associations for complex merged cells — if you have a table with colspan and rowspan, manually test it with all three readers.

Select elements vs. custom dropdowns — Native <select> elements work everywhere with minimal effort. Every custom dropdown is a gamble. If you’re inheriting a custom dropdown, test it hard. It’s almost always broken for at least one reader.


Bug Reporting for Screen Reader Issues

A vague "doesn’t work with screen reader" report gets ignored or deprioritized. Good bug reports look like this:

Environment: Windows 11, Chrome 124, NVDA 2024.1
Component: User profile dropdown (header)
Mode: Browse mode

Steps:
1. Navigate to /profile
2. Tab to the "Account" button in the header
3. Press Enter to open the dropdown

Expected: Dropdown opens, focus moves to first menu item, 
          items are announced as a menu
Actual: Dropdown opens visually, focus remains on "Account" 
        button, no announcement. Arrow keys do not navigate items.
        NVDA speech output: "Account, button collapsed"
        (never announces "expanded" state or menu items)

ARIA: button has aria-expanded="true" after click, but 
      role="menu" is missing from the dropdown container. 
      Children have no role="menuitem".

Always include the NVDA speech viewer output or a screen recording with audio. "Trust me, it sounds wrong" doesn’t give developers enough to reproduce and fix.
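
If your team files these reports often, a small template helper keeps them consistent. This is a minimal sketch. The ScreenReaderBugReport shape simply mirrors the fields of the example report above, and formatBugReport is a hypothetical name.

```typescript
// Fields taken from the example report; all values are free text.
interface ScreenReaderBugReport {
  environment: string;   // OS, browser, and screen reader with versions
  component: string;
  mode: string;          // e.g. "Browse mode"
  steps: string[];
  expected: string;
  actual: string;
  speechOutput: string;  // copied from the Speech Viewer
}

// Renders the report as plain text with numbered reproduction steps.
function formatBugReport(report: ScreenReaderBugReport): string {
  const numberedSteps = report.steps.map((step, index) => `${index + 1}. ${step}`);
  const lines = [
    `Environment: ${report.environment}`,
    `Component: ${report.component}`,
    `Mode: ${report.mode}`,
    "",
    "Steps:",
  ].concat(numberedSteps, [
    "",
    `Expected: ${report.expected}`,
    `Actual: ${report.actual}`,
    `Speech output: "${report.speechOutput}"`,
  ]);
  return lines.join("\n");
}
```

Making speechOutput a required field is the point of the exercise: the report cannot be created without the evidence a developer needs to reproduce the problem.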


Production-Ready Practices

A few things that make the difference between theoretical compliance and an actually usable product:

Test with speech on, not just the Speech Viewer. Reading the text output is not the same as experiencing the full audio stream with pauses, punctuation, and tone. Some bugs only become apparent when you hear the announcement — particularly overly verbose announcements that technically work but are cognitively exhausting.

Use real content, not Lorem Ipsum. Screen reader announcements depend heavily on actual text. "Lorem ipsum dolor" tells you nothing about whether link text is meaningful or whether headings communicate the right hierarchy.

Set up automated gates, but treat them as a floor. Tools like axe-core, Deque’s axe DevTools, or the IBM Equal Access Accessibility Checker in your CI pipeline are worth having. They catch the mechanical issues: missing alt text, incorrect ARIA usage, color contrast math. But they don’t catch focus management bugs, announcement order problems, or anything requiring judgment. Automate the easy stuff, manually test the rest.
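
One way to turn scanner output into a gate is to fail the build only above a chosen severity. This is a minimal sketch. ViolationSummary mirrors just the id and impact fields that axe-core reports for each violation, while blockingViolations and the threshold logic are assumptions made for this example.

```typescript
// Impact levels used by axe-core, from least to most severe.
type AxeImpact = "minor" | "moderate" | "serious" | "critical";

// Subset of an axe-core violation entry; only the fields used here.
interface ViolationSummary {
  id: string;                // rule id, e.g. "image-alt"
  impact: AxeImpact | null;  // may be null when no impact was assigned
}

const IMPACT_ORDER: AxeImpact[] = ["minor", "moderate", "serious", "critical"];

// Returns the violations at or above minImpact. Violations without an
// impact are ignored in this sketch. A CI job would fail when the
// returned list is non-empty.
function blockingViolations(violations: ViolationSummary[], minImpact: AxeImpact): ViolationSummary[] {
  const threshold = IMPACT_ORDER.indexOf(minImpact);
  return violations.filter(
    (violation) => violation.impact !== null && IMPACT_ORDER.indexOf(violation.impact) >= threshold
  );
}
```

An empty result means the floor is met and nothing more. It says nothing about focus order or what the screen reader actually announces.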

Create a regression suite. Every screen reader bug you fix is a test case. Write a quick checklist (or an automated test with Playwright + axe) to ensure it doesn’t regress. Screen reader bugs have a way of silently reappearing after refactors.

Involve actual users. Nothing replaces a usability session with someone who uses a screen reader as their primary interface. If you can’t arrange this, at minimum spend 30 minutes navigating your own product blind — screen turned off, audio on, keyboard only. It’s humbling and surfaces issues you’d never find any other way.


Quick Reference: Most Common Bugs by Category

Labeling:

  • Buttons with only icon content and no aria-label
  • Form fields with placeholder but no <label>
  • Images with missing or meaningless alt text (e.g., alt="image")

Focus management:

  • Modal/dialog doesn’t trap focus; focus doesn’t return on close
  • SPA route changes don’t move focus to new page content
  • Dynamic content loads but focus stays on the trigger

Structure:

  • Multiple H1s or heading levels skipped
  • No landmark regions (missing <main>, <nav>)
  • Generic <div> and <span> used for interactive elements

Dynamic content:

  • Toast notifications not in a live region
  • Error messages appear visually but are not announced
  • Loading spinners have no text equivalent

Keyboard:

  • Custom widgets intercept arrow keys in browse mode without switching to application mode
  • Custom dropdowns/datepickers unreachable or inoperable by keyboard
  • Keyboard traps (infinite Tab loop with no escape)

Screen reader testing is a skill. It takes time to get fast at it, and even longer to develop an intuition for which patterns will break and which won’t. But the fundamentals are learnable in a single afternoon, and once you’ve sat through a session listening to a real user struggle through a broken form, you don’t need to be convinced why it matters.

Build the habit of opening NVDA on the feature you just built before it goes to code review. The fix when you’re the author is 20 minutes. The fix when it surfaces in a VPAT review is a sprint.
