pdf-accessibility/screen_reader_simulator_proposal.md at master

DJP 2a683f1edb Add third-party integration analysis

New Documents:
- INTEGRATION_OPTIONS.md - Comprehensive analysis of tools to integrate
- screen_reader_simulator_proposal.md - Feasibility study

Analysis covers:
✅ veraPDF (FREE) - STRONGLY RECOMMENDED
  - Open source PDF/UA validator
  - 1-2 day integration, $0 cost
  - Adds 30% more coverage
  - Structure tree validation, reading order, heading hierarchy

✅ PDFix SDK (/mo) - Commercial option
  - Full remediation capabilities
  - Only if processing >20 PDFs/month

⚠️ PAC, Adobe SDK, NVDA - Not recommended
  - Various limitations (platform, cost, complexity)

Recommendations:
1. Integrate veraPDF immediately (free, huge value)
2. Build tab order validator (1 day, free)
3. Consider screen reader simulator (3-4 days, nice UX feature)

Result: 24% → 59% coverage with veraPDF + tab validator

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-10-21 09:18:38 -04:00


Screen Reader Simulator - Feasibility Analysis

What We COULD Build (Realistic)

1. PDF Reading Order Simulator ✅ FEASIBLE

What it does:

Parse PDF structure tree
Extract content in screen reader order
Show exactly what would be announced
Highlight reading order issues

Output Example:

Screen Reader Output Simulation:
-----------------------------------
[Heading Level 1] "Annual Report 2024"
[Paragraph] "This document presents..."
[Image] "Bar chart showing revenue growth" (alt text)
[Heading Level 2] "Financial Summary"
[Table with 3 columns, 5 rows]
  [Header Row] "Quarter | Revenue | Profit"
  [Row 1] "Q1 | $1M | $100K"
  ...

Technical approach:

def simulate_screen_reader_output(pdf_path):
    # Parse the structure tree (parse_structure_tree is a helper still to be written)
    struct_tree = parse_structure_tree(pdf_path)

    # Walk the tree in reading order
    for element in struct_tree:
        if element.type == 'H1':
            print(f"[Heading Level 1] {element.text}")
        elif element.type == 'P':
            print(f"[Paragraph] {element.text}")
        elif element.type == 'Figure':
            alt_text = element.get_alt_text()
            print(f"[Image] {alt_text or 'NO ALT TEXT'}")
        elif element.type == 'Table':
            # cols and rows are assumed to be counted by the parser
            print(f"[Table with {element.cols} columns, {element.rows} rows]")

Tools needed:

pypdf for structure tree parsing
Custom tree walker
Tag-to-announcement mapping

Time to build: 2-3 days
Value: High - shows exact reading order issues
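The custom tree walker and tag-to-announcement mapping listed above could look roughly like the sketch below. It assumes a parser has already turned the PDF structure tree into plain Python objects. `StructElement`, `ANNOUNCEMENTS` and `announce` are hypothetical names for illustration, not part of pypdf.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional

# Hypothetical mapping from PDF structure types to announcement labels.
ANNOUNCEMENTS = {
    "H1": "Heading Level 1",
    "H2": "Heading Level 2",
    "P": "Paragraph",
}


@dataclass
class StructElement:
    """Stand-in for a parsed structure element (not a pypdf class)."""
    type: str
    text: str = ""
    alt_text: Optional[str] = None
    children: List["StructElement"] = field(default_factory=list)


def announce(element: StructElement) -> Iterator[str]:
    """Yield announcement lines for an element and its descendants, depth first."""
    if element.type == "Figure":
        yield f"[Image] {element.alt_text or 'NO ALT TEXT'}"
    elif element.type == "Table":
        rows = [c for c in element.children if c.type == "TR"]
        cols = max((len(r.children) for r in rows), default=0)
        yield f"[Table with {cols} columns, {len(rows)} rows]"
        return  # this sketch does not announce individual cells
    elif element.type in ANNOUNCEMENTS:
        yield f'[{ANNOUNCEMENTS[element.type]}] "{element.text}"'
    # Containers such as Document are silent, but their children are walked.
    for child in element.children:
        yield from announce(child)


doc = StructElement("Document", children=[
    StructElement("H1", "Annual Report 2024"),
    StructElement("P", "This document presents..."),
    StructElement("Figure"),
])
lines = list(announce(doc))
print("\n".join(lines))
```

Keeping the mapping in a dictionary means new tag types can be added without touching the walker.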

2. Reading Order Validator ✅ FEASIBLE

What it does:

Compare visual order vs. tag order
Detect reading order problems
Flag if content reads incorrectly

Example issues it would catch:

Visual layout:
┌─────────────┬─────────────┐
│ Column 1    │ Column 2    │
│ Paragraph A │ Paragraph C │
│ Paragraph B │ Paragraph D │
└─────────────┴─────────────┘

Tag order (what SR reads):
1. Column 1 Paragraph A
2. Column 2 Paragraph C  ← WRONG! Should be #3
3. Column 1 Paragraph B  ← WRONG! Should be #2
4. Column 2 Paragraph D

ISSUE: Multi-column layout not properly tagged!

Time to build: 3-4 days
Value: Medium-High - catches common layout issues
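One possible way to compare visual order against tag order is sketched below, for a two-column page whose tags run row by row. It assumes each text block already carries a position and its index in the tag order. `Block`, `visual_order`, `reading_order_issues` and the `column_gap` threshold are illustrative names and values, and real layouts would need a sturdier column detector.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """A tagged text block: page position plus its place in the tag order."""
    label: str
    x: float        # left edge
    y: float        # distance from the top of the page
    tag_index: int  # position in the structure tree


def visual_order(blocks: List[Block], column_gap: float = 50.0) -> List[Block]:
    """Order blocks column by column (left to right), top to bottom within each."""
    columns: List[List[Block]] = []
    for block in sorted(blocks, key=lambda b: b.x):
        # Start a new column when the left edge moves by more than column_gap.
        if columns and block.x - columns[-1][0].x <= column_gap:
            columns[-1].append(block)
        else:
            columns.append([block])
    return [b for col in columns for b in sorted(col, key=lambda b: b.y)]


def reading_order_issues(blocks: List[Block]) -> List[str]:
    """Report each position where the tag order and the visual order disagree."""
    tagged = sorted(blocks, key=lambda b: b.tag_index)
    expected = visual_order(blocks)
    issues = []
    for position, (seen, wanted) in enumerate(zip(tagged, expected), start=1):
        if seen is not wanted:
            issues.append(f"#{position}: reads '{seen.label}', expected '{wanted.label}'")
    return issues


page = [
    Block("Paragraph A", x=50, y=100, tag_index=1),
    Block("Paragraph C", x=300, y=100, tag_index=2),
    Block("Paragraph B", x=50, y=200, tag_index=3),
    Block("Paragraph D", x=300, y=200, tag_index=4),
]
issues = reading_order_issues(page)
for line in issues:
    print(line)
```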

3. Accessibility Tree Inspector ✅ FEASIBLE

What it does:

Show PDF accessibility tree (like Chrome DevTools)
Display all accessible properties
Highlight missing names/roles/values

Visual output:

Document
├─ Article
│  ├─ H1 "Annual Report" ✅
│  ├─ P "This year we..." ✅
│  ├─ Figure [NO ALT TEXT] ❌
│  └─ Table
│     ├─ TR (header=true) ✅
│     └─ TR (header=false) ✅
└─ Form
   ├─ Field "email" (tooltip="Email Address") ✅
   └─ Field "phone" (NO TOOLTIP) ❌

Time to build: 4-5 days
Value: High - visual debugging tool
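The tree view could be rendered with a small recursive function. The sketch below assumes the accessibility tree is already available as nested objects. `Node`, `needs_name` and `render` are hypothetical names, and a real inspector would check more properties than the accessible name.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Stand-in for one accessibility-tree node (hypothetical shape)."""
    role: str
    name: Optional[str] = None
    needs_name: bool = False  # True for roles that must expose a name, e.g. Figure
    children: List["Node"] = field(default_factory=list)


def render(node: Node, prefix: str = "", is_last: bool = True,
           is_root: bool = True) -> List[str]:
    """Return the tree as indented lines, marking nodes that lack a required name."""
    if not node.needs_name:
        label = node.role
    elif node.name:
        label = f'{node.role} "{node.name}" ✅'
    else:
        label = f"{node.role} [NO NAME] ❌"

    connector = "" if is_root else ("└─ " if is_last else "├─ ")
    lines = [prefix + connector + label]

    # Children of a last sibling get blank padding, others a vertical rule.
    child_prefix = prefix if is_root else prefix + ("   " if is_last else "│  ")
    for i, child in enumerate(node.children):
        lines += render(child, child_prefix, i == len(node.children) - 1, False)
    return lines


tree = Node("Document", children=[
    Node("H1", "Annual Report", needs_name=True),
    Node("Figure", needs_name=True),
])
print("\n".join(render(tree)))
```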

What We CANNOT Build (Unrealistic)

❌ Full Screen Reader

Why not:

Requires OS-level hooks (Windows MSAA/UIA, macOS Accessibility API)
Need TTS (Text-to-Speech) engine integration
Complex rendering pipeline
Must support ALL applications, not just PDFs
Years of development, 100,000+ lines of code

Equivalent effort: Building a web browser from scratch

❌ Real-Time Audio Output

Why not:

Need professional TTS engine (expensive licensing)
Voice customization
Speech rate controls
Pronunciation dictionaries
Multi-language support

Better alternative: Use existing screen readers (NVDA is free!)

⌨️ Keyboard Navigation Testing

What We COULD Build (Partially)

1. Tab Order Validator ✅ FEASIBLE

What it does:

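Extract the tab order setting from each page that has form fields
Detect if a tab order is set
Flag fields with no tab order
Verify tab order follows the document structure (so focus moves 1, 2, 3... not 1, 5, 2, 8)

Code example:

def check_tab_order(pdf):
    # PDF has no per-field tab index ('/T' is the field's partial name).
    # Tab order is set per page by the '/Tabs' entry: '/R' (row),
    # '/C' (column) or '/S' (structure); PDF/UA requires '/S'.
    # get_pages, get_widgets and issue are helpers still to be written.
    for page_number, page in enumerate(get_pages(pdf), start=1):
        if not get_widgets(page):  # no form fields on this page
            continue

        tabs = page.get('/Tabs')
        if tabs is None:
            issue(f"Page {page_number}: fields have no tab order (/Tabs missing)")
        elif tabs != '/S':
            issue(f"Page {page_number}: tab order is {tabs}, not structure order")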

Time to build: 1-2 days
Value: Medium - catches common form issues

2. Focus Order Detection ✅ FEASIBLE

What it does:

Map visual position of form fields
Compare to programmatic tab order
Detect if focus jumps around illogically

Example:

Visual layout:     Tab order:
┌─────────┐       1. Name ✅
│ Name    │ 1     2. Email ✅
│ Email   │ 2     3. Submit ❌ WRONG! Should be #4
│ Phone   │ 4     4. Phone ❌ WRONG! Should be #3
│ Submit  │ 3
└─────────┘

ISSUE: Tab order doesn't match visual layout!

Time to build: 2-3 days
Value: Medium - useful for complex forms
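The comparison in the example above can be sketched as follows, assuming each field's vertical position and tab position have already been extracted. `FormField`, `top`, `tab_position` and `focus_order_issues` are hypothetical names, and `top` is taken as distance from the top of the page, so raw PDF coordinates (which start at the bottom-left corner) would need converting first.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FormField:
    """A form field's label, vertical position, and place in the tab order."""
    label: str
    top: float         # distance from the top of the page; smaller is higher up
    tab_position: int  # 1-based position in the programmatic tab order


def focus_order_issues(fields: List[FormField]) -> List[str]:
    """Compare top-to-bottom visual order with tab order and list the mismatches."""
    visual = sorted(fields, key=lambda f: f.top)
    tabbed = sorted(fields, key=lambda f: f.tab_position)
    issues = []
    for position, (got, want) in enumerate(zip(tabbed, visual), start=1):
        if got is not want:
            issues.append(f"Tab stop {position} is {got.label}, expected {want.label}")
    return issues


form = [
    FormField("Name", top=10, tab_position=1),
    FormField("Email", top=20, tab_position=2),
    FormField("Phone", top=30, tab_position=4),
    FormField("Submit", top=40, tab_position=3),
]
problems = focus_order_issues(form)
print(problems)
```

This only handles a single column of fields. A form with side-by-side fields would also need the horizontal position in the sort key.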

What We CANNOT Build

❌ Actual Keyboard Navigation Simulation

Why not:

Need to launch PDF reader (Adobe, Preview, etc.)
Simulate keyboard input (requires automation framework)
Capture behavior (focus changes, interactions)
Different readers behave differently
Slow and brittle

What this would require:

Launch PDF in Adobe Acrobat
Use a desktop UI automation framework to send keyboard events (Selenium/Playwright only drive web browsers)
Monitor focus changes
Detect keyboard traps
Verify all functionality accessible

Problems:

Adobe Acrobat not automation-friendly
Each PDF reader has different keyboard shortcuts
Slow (30+ seconds per test)
Flaky (automation breaks with UI changes)
Requires GUI (can't run headless)

Better solution: Manual testing with actual keyboard

💡 Recommended Approach

Build What's Useful:

Phase 1 (High Value, Quick Wins):

1. ✅ Screen Reader Output Simulator (3 days)
   - Show what SR would announce
   - Detect reading order issues
   - Most valuable feature
2. ✅ Tab Order Validator (2 days)
   - Check form field tab order
   - Detect missing tab order
   - Quick win for forms

Phase 2 (Medium Value):

3. ⚠️ Accessibility Tree Inspector (4 days)
   - Visual tree viewer
   - Helpful for debugging
4. ⚠️ Focus Order Detector (3 days)
   - Compare visual vs. programmatic order
   - Useful for complex forms

Don't Build (Not Worth It):

❌ Full screen reader (years of work, low ROI)
❌ TTS integration (expensive, existing solutions better)
❌ Keyboard automation (brittle, slow, limited value)

🚀 My Recommendation

Option A: Build Screen Reader Simulator (Best ROI)

Effort: 3-4 days
Value: HIGH
What you get:

📄 Screen Reader Preview
─────────────────────────────
[Document Title] "Annual Report 2024"
[Heading 1] "Executive Summary"
[Paragraph] "This year saw significant growth..."
[Image] NO ALT TEXT ❌
[Heading 2] "Financial Results"
[Table: 4 columns, 10 rows]
  [Row 1, Header] "Quarter" "Revenue" "Profit" "Growth"
  [Row 2] "Q1" "$1.2M" "$150K" "12%"
  ...

Benefits:

Approximates what screen reader users hear
Catches reading order problems
Validates alt text presence
No need for actual screen reader
Works in web interface

This would be VERY valuable!

Option B: Add Tab Order Checking (Quick Win)

Effort: 1-2 days
Value: MEDIUM
What you get:

✅ Verify tab order exists
✅ Detect illogical tab sequences
✅ Flag forms with no tab order
⚠️ Can't test actual behavior (still need manual)

Option C: Do Nothing (Use Existing Tools)

Existing screen readers:

NVDA (Windows) - Free, excellent
VoiceOver (Mac) - Built-in
JAWS (Windows) - Commercial, industry standard

Recommendation: Train users to test with NVDA (5 minutes to learn)

Keyboard testing: Just manually test (Tab through the PDF)

🎯 My Suggestion:

Build the Screen Reader Simulator

Why:

High value - Shows reading order issues (common problem)
Unique feature - Competitors don't have this
Fast to build - 3-4 days with existing code
Integrates well - Add to Visual Page Inspector
Educational - Helps users understand accessibility

What it would show:

Text content in SR order
Image alt text (or "MISSING")
Table structure
Heading hierarchy
Form field labels
Link text

How it helps:

Catch reading order bugs without screen reader
Verify alt text before publishing
Educational for non-technical users
Great demo feature

❓ Want Me To Build It?

I can build a Screen Reader Output Simulator that:

Parses PDF structure tree
Simulates screen reader announcements
Shows reading order issues
Displays in web interface
Highlights problems visually

Estimated time: 3-4 days of development

Would you like me to:

✅ Build the Screen Reader Simulator (high value)
⚠️ Build Tab Order Validator (quick win, lower value)
❌ Skip it and use existing screen readers (practical approach)

What do you think? The Screen Reader Simulator would be a really cool feature! 🎯
