pdf-accessibility/screen_reader_simulator_proposal.md
DJP 2a683f1edb Add third-party integration analysis
New Documents:
- INTEGRATION_OPTIONS.md - Comprehensive analysis of tools to integrate
- screen_reader_simulator_proposal.md - Feasibility study

Analysis covers:
 veraPDF (FREE) - STRONGLY RECOMMENDED
  - Open source PDF/UA validator
  - 1-2 day integration, $0 cost
  - Adds 30% more coverage
  - Structure tree validation, reading order, heading hierarchy

 PDFix SDK (/mo) - Commercial option
  - Full remediation capabilities
  - Only if processing >20 PDFs/month

⚠️ PAC, Adobe SDK, NVDA - Not recommended
  - Various limitations (platform, cost, complexity)

Recommendations:
1. Integrate veraPDF immediately (free, huge value)
2. Build tab order validator (1 day, free)
3. Consider screen reader simulator (3-4 days, nice UX feature)

Result: 24% → 59% coverage with veraPDF + tab validator

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-21 09:18:38 -04:00


Screen Reader Simulator - Feasibility Analysis

What We COULD Build (Realistic)

1. PDF Reading Order Simulator FEASIBLE

What it does:

  • Parse PDF structure tree
  • Extract content in screen reader order
  • Show approximately what would be announced
  • Highlight reading order issues

Output Example:

Screen Reader Output Simulation:
-----------------------------------
[Heading Level 1] "Annual Report 2024"
[Paragraph] "This document presents..."
[Image] "Bar chart showing revenue growth" (alt text)
[Heading Level 2] "Financial Summary"
[Table with 3 columns, 5 rows]
  [Header Row] "Quarter | Revenue | Profit"
  [Row 1] "Q1 | $1M | $100K"
  ...

Technical approach:

def simulate_screen_reader_output(pdf_path):
    # Parse structure tree (parse_structure_tree is a helper still to be written)
    struct_tree = parse_structure_tree(pdf_path)

    # Walk tree in reading order
    for element in struct_tree:
        if element.type == 'H1':
            print(f"[Heading Level 1] {element.text}")
        elif element.type == 'P':
            print(f"[Paragraph] {element.text}")
        elif element.type == 'Figure':
            alt_text = element.get_alt_text()
            print(f"[Image] {alt_text or 'NO ALT TEXT'}")
        elif element.type == 'Table':
            # col_count / row_count: assumed attributes derived from the TR children
            print(f"[Table with {element.col_count} columns, {element.row_count} rows]")

Tools needed:

  • pypdf for structure tree parsing
  • Custom tree walker
  • Tag-to-announcement mapping

Time to build: 2-3 days Value: High - shows exact reading order issues
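
The tag-to-announcement mapping listed above can be sketched without any PDF library. This is a minimal illustration, assuming the structure tree has already been parsed into a simple in-memory form: `StructNode`, `LABELS`, and `announce` are hypothetical names, tables are assumed to have TR rows directly under Table (real files may add THead/TBody), and the announcement wording is illustrative rather than what any specific screen reader says.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class StructNode:
    """Hypothetical stand-in for one parsed PDF structure element."""
    tag: str                      # structure type, e.g. "H1", "P", "Figure"
    text: str = ""
    alt: Optional[str] = None     # alternate text, if any
    children: List["StructNode"] = field(default_factory=list)


# Illustrative labels only; real screen readers word these differently.
LABELS = {"P": "Paragraph", "LI": "List item", "Link": "Link"}


def announce(node: StructNode) -> Iterator[str]:
    """Yield one announcement line per element, depth-first in tag order."""
    tag = node.tag
    if len(tag) == 2 and tag[0] == "H" and tag[1].isdigit():
        yield f'[Heading Level {tag[1]}] "{node.text}"'
    elif tag == "Figure":
        yield f'[Image] "{node.alt}" (alt text)' if node.alt else "[Image] NO ALT TEXT"
    elif tag == "Table":
        rows = [c for c in node.children if c.tag == "TR"]
        cols = max((len(r.children) for r in rows), default=0)
        yield f"[Table with {cols} columns, {len(rows)} rows]"
        data_row = 0
        for row in rows:
            cells = " | ".join(c.text for c in row.children)
            if row.children and all(c.tag == "TH" for c in row.children):
                yield f'  [Header Row] "{cells}"'
            else:
                data_row += 1
                yield f'  [Row {data_row}] "{cells}"'
        return  # cells were announced with the table, so do not recurse
    elif tag in LABELS:
        yield f'[{LABELS[tag]}] "{node.text}"'
    # Containers such as Document or Sect announce nothing themselves.
    for child in node.children:
        yield from announce(child)
```

Calling `"\n".join(announce(root))` on a parsed tree would produce text in the same shape as the output example above.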


2. Reading Order Validator FEASIBLE

What it does:

  • Compare visual order vs. tag order
  • Detect reading order problems
  • Flag if content reads incorrectly

Example issues it would catch:

Visual layout:
┌─────────────┬─────────────┐
│ Column 1    │ Column 2    │
│ Paragraph A │ Paragraph C │
│ Paragraph B │ Paragraph D │
└─────────────┴─────────────┘

Tag order (what SR reads):
1. Column 1 Paragraph A
2. Column 2 Paragraph C  ← WRONG! Should be #3
3. Column 1 Paragraph B  ← WRONG! Should be #2
4. Column 2 Paragraph D

ISSUE: Multi-column layout not properly tagged!

Time to build: 3-4 days Value: Medium-High - catches common layout issues
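
One way the visual-vs-tag comparison might work is sketched below. It assumes each tagged block already carries a bounding-box position; `Block`, `expected_order`, `reading_order_issues`, and the 50-point `column_gap` threshold are all hypothetical, and the column detection is deliberately naive (blocks whose left edges are close together count as one column).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """Hypothetical tagged block with its position on the page."""
    name: str
    x0: float    # left edge, in points
    top: float   # distance from the top of the page (smaller = higher up)


def expected_order(blocks: List[Block], column_gap: float = 50.0) -> List[Block]:
    """Group blocks into columns by left edge, then read each column top to bottom."""
    columns: List[List[Block]] = []
    for block in sorted(blocks, key=lambda b: b.x0):
        if columns and block.x0 - columns[-1][0].x0 < column_gap:
            columns[-1].append(block)
        else:
            columns.append([block])
    return [b for col in columns for b in sorted(col, key=lambda b: b.top)]


def reading_order_issues(tag_order: List[Block]) -> List[str]:
    """Compare the order of the tags with the order the layout suggests."""
    issues = []
    pairs = zip(tag_order, expected_order(tag_order))
    for position, (actual, wanted) in enumerate(pairs, start=1):
        if actual is not wanted:
            issues.append(
                f"Position {position}: tags read {actual.name!r}, "
                f"layout suggests {wanted.name!r}"
            )
    return issues
```

For a two-column page, tags ordered A, B, C, D (down column 1, then column 2) produce no issues, while tags ordered row by row (A, C, B, D) are flagged at positions 2 and 3.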


3. Accessibility Tree Inspector FEASIBLE

What it does:

  • Show PDF accessibility tree (like Chrome DevTools)
  • Display all accessible properties
  • Highlight missing names/roles/values

Visual output:

Document
├─ Article
│  ├─ H1 "Annual Report" ✅
│  ├─ P "This year we..." ✅
│  ├─ Figure [NO ALT TEXT] ❌
│  └─ Table
│     ├─ TR (header=true) ✅
│     └─ TR (header=false) ✅
└─ Form
   ├─ Field "email" (tooltip="Email Address") ✅
   └─ Field "phone" (NO TOOLTIP) ❌

Time to build: 4-5 days Value: High - visual debugging tool
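
The tree view above could be produced by a small recursive renderer. The sketch below is an assumption about how that might look, not a finished design: `Node` and `render` are hypothetical names, and only the missing-alt-text check is implemented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical accessibility-tree node."""
    tag: str
    text: str = ""
    alt: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def render(node: Node, prefix: str = "", is_last: bool = True,
           is_root: bool = True) -> List[str]:
    """Return the tree as lines drawn with box characters, flagging missing alt text."""
    label = node.tag + (f' "{node.text}"' if node.text else "")
    if node.tag == "Figure" and not node.alt:
        label += " [NO ALT TEXT] ❌"

    if is_root:
        lines = [label]
        child_prefix = ""
    else:
        lines = [prefix + ("└─ " if is_last else "├─ ") + label]
        # Keep a vertical guide going while this node still has later siblings.
        child_prefix = prefix + ("   " if is_last else "│  ")

    for i, child in enumerate(node.children):
        last_child = i == len(node.children) - 1
        lines += render(child, child_prefix, last_child, is_root=False)
    return lines
```

`print("\n".join(render(root)))` gives a text version of the tree; a web interface would likely render the same data as nested HTML instead.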


What We CANNOT Build (Unrealistic)

Full Screen Reader

Why not:

  • Requires OS-level hooks (Windows MSAA/UIA, macOS Accessibility API)
  • Need TTS (Text-to-Speech) engine integration
  • Complex rendering pipeline
  • Must support ALL applications, not just PDFs
  • Years of development, 100,000+ lines of code

Equivalent effort: Building a web browser from scratch


Real-Time Audio Output

Why not:

  • Need professional TTS engine (expensive licensing)
  • Voice customization
  • Speech rate controls
  • Pronunciation dictionaries
  • Multi-language support

Better alternative: Use existing screen readers (NVDA is free!)


⌨️ Keyboard Navigation Testing

What We COULD Build (Partially)

1. Tab Order Validator FEASIBLE

What it does:

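  • Read the tab order setting of each page that contains form fields
  • Detect whether the page's /Tabs entry is set
  • Flag pages whose fields have no defined tab order
  • Check that /Tabs is /S (structure order), which PDF/UA requires on pages with annotations

Code example:

from pypdf import PdfReader

def check_tab_order(pdf_path):
    # PDF has no per-field tab index ('/T' is the field's name). Tab order is
    # set per page by '/Tabs': /R (row), /C (column), or /S (structure order).
    reader = PdfReader(pdf_path)
    problems = []

    for page_number, page in enumerate(reader.pages, start=1):
        annots = page['/Annots'] if '/Annots' in page else []
        widgets = [a for a in annots if a.get_object().get('/Subtype') == '/Widget']
        if not widgets:
            continue  # no form fields on this page

        tabs = page.get('/Tabs')
        if tabs is None:
            problems.append(f"Page {page_number}: no /Tabs entry, tab order is up to the viewer")
        elif tabs != '/S':
            problems.append(f"Page {page_number}: /Tabs is {tabs}, PDF/UA expects /S")

    return problems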

Time to build: 1-2 days Value: Medium - catches common form issues


2. Focus Order Detection FEASIBLE

What it does:

  • Map visual position of form fields
  • Compare to programmatic tab order
  • Detect if focus jumps around illogically

Example:

Visual layout:     Tab order:
┌─────────┐       1. Name ✅
│ Name    │ 1     2. Email ✅
│ Email   │ 2     3. Submit ❌ WRONG! Should be #4
│ Phone   │ 4     4. Phone ❌ WRONG! Should be #3
│ Submit  │ 3
└─────────┘

ISSUE: Tab order doesn't match visual layout!

Time to build: 2-3 days Value: Medium - useful for complex forms
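
The comparison in the example above might be implemented roughly as follows. This is a sketch under stated assumptions: `FormField`, `focus_order_issues`, and the `row_height` bucketing are hypothetical, field positions are assumed to have been extracted already, and "visual order" is taken to mean top to bottom, then left to right.

```python
from typing import List, NamedTuple


class FormField(NamedTuple):
    """Hypothetical form field with its position on the page."""
    name: str
    top: float    # distance from the top of the page, in points
    left: float   # distance from the left edge, in points


def focus_order_issues(tab_order: List[FormField],
                       row_height: float = 20.0) -> List[str]:
    """Compare the programmatic tab order with a top-to-bottom, left-to-right order."""
    # Fields whose tops fall in the same row_height band count as one visual row.
    visual = sorted(tab_order, key=lambda f: (int(f.top // row_height), f.left))
    issues = []
    for stop, (actual, wanted) in enumerate(zip(tab_order, visual), start=1):
        if actual != wanted:
            issues.append(
                f"Tab stop {stop}: focus lands on {actual.name}, "
                f"visual order expects {wanted.name}"
            )
    return issues
```

With the form shown above (Name, Email, Phone, Submit stacked vertically, tabbed as Name, Email, Submit, Phone), this reports stops 3 and 4.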


What We CANNOT Build

Actual Keyboard Navigation Simulation

Why not:

  • Need to launch PDF reader (Adobe, Preview, etc.)
  • Simulate keyboard input (requires automation framework)
  • Capture behavior (focus changes, interactions)
  • Different readers behave differently
  • Slow and brittle

What this would require:

  1. Launch PDF in Adobe Acrobat
  2. Use a desktop UI automation tool to send keyboard events (Selenium/Playwright only drive browsers, not Acrobat)
  3. Monitor focus changes
  4. Detect keyboard traps
  5. Verify all functionality accessible

Problems:

  • Adobe Acrobat not automation-friendly
  • Each PDF reader has different keyboard shortcuts
  • Slow (30+ seconds per test)
  • Flaky (automation breaks with UI changes)
  • Requires GUI (can't run headless)

Better solution: Manual testing with actual keyboard


Build What's Useful:

Phase 1 (High Value, Quick Wins):

  1. Screen Reader Output Simulator (3 days)

    • Show what SR would announce
    • Detect reading order issues
    • Most valuable feature
  2. Tab Order Validator (2 days)

    • Check form field tab order
    • Detect pages with no tab order set (/Tabs)
    • Quick win for forms

Phase 2 (Medium Value):

  3. ⚠️ Accessibility Tree Inspector (4 days)

    • Visual tree viewer
    • Helpful for debugging
  4. ⚠️ Focus Order Detector (3 days)

    • Compare visual vs. programmatic order
    • Useful for complex forms

Don't Build (Not Worth It):

  • Full screen reader (years of work, low ROI)
  • TTS integration (expensive, existing solutions better)
  • Keyboard automation (brittle, slow, limited value)

🚀 My Recommendation

Option A: Build Screen Reader Simulator (Best ROI)

Effort: 3-4 days Value: HIGH What you get:

📄 Screen Reader Preview
─────────────────────────────
[Document Title] "Annual Report 2024"
[Heading 1] "Executive Summary"
[Paragraph] "This year saw significant growth..."
[Image] NO ALT TEXT ❌
[Heading 2] "Financial Results"
[Table: 4 columns, 10 rows]
  [Row 1, Header] "Quarter" "Revenue" "Profit" "Growth"
  [Row 2] "Q1" "$1.2M" "$150K" "12%"
  ...

Benefits:

  • Approximates what screen reader users hear (exact wording varies by screen reader and PDF viewer)
  • Catches reading order problems
  • Validates alt text presence
  • No screen reader needed for a first-pass check
  • Works in web interface

This would be VERY valuable!


Option B: Add Tab Order Checking (Quick Win)

Effort: 1-2 days Value: MEDIUM What you get:

  • Verify tab order exists
  • Detect illogical tab sequences
  • Flag forms with no tab order
  • ⚠️ Can't test actual behavior (still need manual)

Option C: Do Nothing (Use Existing Tools)

Existing screen readers:

  • NVDA (Windows) - Free, excellent
  • VoiceOver (Mac) - Built-in
  • JAWS (Windows) - Commercial, industry standard

Recommendation: Train users to test with NVDA (5 minutes to learn)

Keyboard testing: Just manually test (Tab through the PDF)


🎯 My Suggestion:

Build the Screen Reader Simulator

Why:

  1. High value - Shows reading order issues (common problem)
  2. Differentiating feature - uncommon in comparable tools (worth verifying against competitors)
  3. Fast to build - 3-4 days with existing code
  4. Integrates well - Add to Visual Page Inspector
  5. Educational - Helps users understand accessibility

What it would show:

  • Text content in SR order
  • Image alt text (or "MISSING")
  • Table structure
  • Heading hierarchy
  • Form field labels
  • Link text
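
As an illustration of the heading hierarchy item, the check below flags skipped heading levels in a sequence of structure tags. It is a minimal sketch: `heading_level_issues` is a hypothetical name, and the input is assumed to be the list of tags in reading order.

```python
from typing import List


def heading_level_issues(tags: List[str]) -> List[str]:
    """Flag headings that skip a level, e.g. an H1 followed directly by an H3."""
    issues = []
    previous = 0  # 0 means no heading has been seen yet
    for tag in tags:
        if len(tag) == 2 and tag[0] == "H" and tag[1].isdigit():
            level = int(tag[1])
            if previous == 0 and level != 1:
                issues.append(f"First heading is {tag}, expected H1")
            elif previous and level > previous + 1:
                issues.append(f"{tag} follows H{previous}: heading level skipped")
            previous = level
    return issues
```

Non-heading tags such as P or Figure are ignored, so the full tag sequence from the simulator can be passed in unchanged.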

How it helps:

  • Catch reading order bugs without screen reader
  • Verify alt text before publishing
  • Educational for non-technical users
  • Great demo feature

Want Me To Build It?

I can build a Screen Reader Output Simulator that:

  • Parses PDF structure tree
  • Simulates screen reader announcements
  • Shows reading order issues
  • Displays in web interface
  • Highlights problems visually

Estimated time: 3-4 days of development

Would you like me to:

  1. Build the Screen Reader Simulator (high value)
  2. ⚠️ Build Tab Order Validator (quick win, lower value)
  3. Skip it and use existing screen readers (practical approach)

What do you think? The Screen Reader Simulator would be a really cool feature! 🎯