pdf-accessibility/screen_reader_simulator_proposal.md
DJP 2a683f1edb Add third-party integration analysis
New Documents:
- INTEGRATION_OPTIONS.md - Comprehensive analysis of tools to integrate
- screen_reader_simulator_proposal.md - Feasibility study

Analysis covers:
 veraPDF (FREE) - STRONGLY RECOMMENDED
  - Open source PDF/UA validator
  - 1-2 day integration, $0 cost
  - Adds 30% more coverage
  - Structure tree validation, reading order, heading hierarchy

 PDFix SDK (/mo) - Commercial option
  - Full remediation capabilities
  - Only if processing >20 PDFs/month

⚠️ PAC, Adobe SDK, NVDA - Not recommended
  - Various limitations (platform, cost, complexity)

Recommendations:
1. Integrate veraPDF immediately (free, huge value)
2. Build tab order validator (1 day, free)
3. Consider screen reader simulator (3-4 days, nice UX feature)

Result: 24% → 59% coverage with veraPDF + tab validator

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-21 09:18:38 -04:00


# Screen Reader Simulator - Feasibility Analysis
## What We COULD Build (Realistic)
### 1. PDF Reading Order Simulator ✅ FEASIBLE
**What it does:**
- Parse PDF structure tree
- Extract content in screen reader order
- Show approximately what would be announced (real screen readers differ in wording)
- Highlight reading order issues
**Output Example:**
```
Screen Reader Output Simulation:
-----------------------------------
[Heading Level 1] "Annual Report 2024"
[Paragraph] "This document presents..."
[Image] "Bar chart showing revenue growth" (alt text)
[Heading Level 2] "Financial Summary"
[Table with 3 columns, 5 rows]
[Header Row] "Quarter | Revenue | Profit"
[Row 1] "Q1 | $1M | $100K"
...
```
**Technical approach:**
```python
def simulate_screen_reader_output(pdf_path):
    # parse_structure_tree() is a placeholder for the pypdf-based parser
    # listed under "Tools needed"; it yields elements in tag order.
    struct_tree = parse_structure_tree(pdf_path)

    # Walk the tree in reading order
    for element in struct_tree:
        if element.type == 'H1':
            print(f"[Heading Level 1] {element.text}")
        elif element.type == 'P':
            print(f"[Paragraph] {element.text}")
        elif element.type == 'Figure':
            alt_text = element.get_alt_text()
            print(f"[Image] {alt_text or 'NO ALT TEXT'}")
        elif element.type == 'Table':
            # cols and rows are assumed attributes of a table element
            print(f"[Table with {element.cols} columns, {element.rows} rows]")
```
**Tools needed:**
- pypdf for structure tree parsing
- Custom tree walker
- Tag-to-announcement mapping
**Time to build:** 2-3 days
**Value:** High - shows exact reading order issues
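**Sketch (tag-to-announcement mapping):** The tree walker and mapping listed above could look roughly like the following. This is a minimal sketch that runs on an in-memory tree rather than a real PDF; the `Node` class, the `ANNOUNCEMENTS` table, and the `announce` function are hypothetical names introduced for illustration, and a real implementation would build the nodes from the parsed structure tree.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical mapping from PDF structure tags to announcement labels.
ANNOUNCEMENTS = {
    "H1": "Heading Level 1",
    "H2": "Heading Level 2",
    "P": "Paragraph",
}


@dataclass
class Node:
    tag: str
    text: str = ""
    alt: Optional[str] = None
    children: list["Node"] = field(default_factory=list)


def announce(node: Node) -> list[str]:
    """Return announcement lines for node and its descendants, depth-first."""
    lines = []
    if node.tag == "Figure":
        lines.append(f"[Image] {node.alt or 'NO ALT TEXT'}")
    elif node.tag in ANNOUNCEMENTS:
        lines.append(f'[{ANNOUNCEMENTS[node.tag]}] "{node.text}"')
    # Container tags (Document, Article, ...) produce no line of their own.
    for child in node.children:
        lines.extend(announce(child))
    return lines
```

Calling `announce(root)` on a document node returns the lines in tag order, so the same list can feed both the preview and the reading order checks.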
---
### 2. Reading Order Validator ✅ FEASIBLE
**What it does:**
- Compare visual order vs. tag order
- Detect reading order problems
- Flag if content reads incorrectly
**Example issues it would catch:**
```
Visual layout:
┌─────────────┬─────────────┐
│ Column 1    │ Column 2    │
│ Paragraph A │ Paragraph C │
│ Paragraph B │ Paragraph D │
└─────────────┴─────────────┘
Tag order (what SR reads):
1. Column 1 Paragraph A
2. Column 2 Paragraph C ← WRONG! Should be #3
3. Column 1 Paragraph B ← WRONG! Should be #2
4. Column 2 Paragraph D
ISSUE: Multi-column layout not properly tagged!
```
**Time to build:** 3-4 days
**Value:** Medium-High - catches common layout issues
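**Sketch (visual order vs. tag order):** One way to approach the comparison is to derive an expected order from block positions and diff it against the tag order. The sketch below assumes a simple column heuristic (blocks whose left edges fall within a tolerance share a column); `Block`, `expected_order`, `reading_order_issues`, and the 20-point tolerance are all hypothetical choices, and real layouts would need something more robust.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    label: str
    x0: float  # left edge, in PDF points
    y1: float  # top edge; PDF y grows upward from the bottom of the page


def expected_order(blocks, column_tolerance=20.0):
    """Order blocks column by column (left to right), top to bottom within each."""
    columns = []  # each column is a list of blocks with similar left edges
    for block in sorted(blocks, key=lambda b: b.x0):
        if columns and abs(columns[-1][0].x0 - block.x0) <= column_tolerance:
            columns[-1].append(block)
        else:
            columns.append([block])
    ordered = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda b: -b.y1))
    return [b.label for b in ordered]


def reading_order_issues(tagged_blocks):
    """Compare tag order (the input order) with the expected visual order."""
    expected = expected_order(tagged_blocks)
    issues = []
    for position, (block, want) in enumerate(zip(tagged_blocks, expected), start=1):
        if block.label != want:
            issues.append(
                f"Position {position}: tags read {block.label!r}, layout suggests {want!r}"
            )
    return issues
```

For a two-column page tagged row by row, the function reports each position where the tagged block differs from the column-wise order.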
---
### 3. Accessibility Tree Inspector ✅ FEASIBLE
**What it does:**
- Show PDF accessibility tree (like Chrome DevTools)
- Display all accessible properties
- Highlight missing names/roles/values
**Visual output:**
```
Document
├─ Article
│  ├─ H1 "Annual Report" ✅
│  ├─ P "This year we..." ✅
│  ├─ Figure [NO ALT TEXT] ❌
│  └─ Table
│     ├─ TR (header=true) ✅
│     └─ TR (header=false) ✅
└─ Form
   ├─ Field "email" (tooltip="Email Address") ✅
   └─ Field "phone" (NO TOOLTIP) ❌
```
**Time to build:** 4-5 days
**Value:** High - visual debugging tool
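**Sketch (tree rendering):** The text form of the inspector can be produced with a short recursive renderer. This is a sketch under the assumption that the parsed tree is available as nested dicts with `tag`, optional `text`, optional `alt`, and optional `children` keys; that shape and the function names `render_tree` and `describe` are hypothetical.

```python
def render_tree(node, prefix=""):
    """Return lines drawing node's children with box-drawing connectors."""
    lines = []
    children = node.get("children", [])
    for index, child in enumerate(children):
        last = index == len(children) - 1
        connector = "└─ " if last else "├─ "
        lines.append(prefix + connector + describe(child))
        # Children of the last sibling no longer need the vertical guide.
        lines.extend(render_tree(child, prefix + ("   " if last else "│  ")))
    return lines


def describe(node):
    """Label a node and flag a figure that has no alt text."""
    tag = node["tag"]
    if tag == "Figure":
        alt = node.get("alt")
        return f'Figure (alt="{alt}") ✅' if alt else "Figure [NO ALT TEXT] ❌"
    text = node.get("text")
    return f'{tag} "{text}" ✅' if text else tag
```

Printing the root tag followed by `render_tree(root)` gives a tree view; a web interface could emit the same structure as nested HTML lists instead.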
---
## What We CANNOT Build (Unrealistic)
### ❌ Full Screen Reader
**Why not:**
- Requires OS-level hooks (Windows MSAA/UIA, macOS Accessibility API)
- Need TTS (Text-to-Speech) engine integration
- Complex rendering pipeline
- Must support ALL applications, not just PDFs
- Years of development, 100,000+ lines of code
**Equivalent effort:** Building a web browser from scratch
---
### ❌ Real-Time Audio Output
**Why not:**
- Need professional TTS engine (expensive licensing)
- Voice customization
- Speech rate controls
- Pronunciation dictionaries
- Multi-language support
**Better alternative:** Use existing screen readers (NVDA is free!)
---
## ⌨️ Keyboard Navigation Testing
### What We COULD Build (Partially)
#### 1. Tab Order Validator ✅ FEASIBLE
**What it does:**
- Extract tab order from PDF form fields
- Detect if tab indices are set
- Flag fields with no tab order
- Verify tab order is logical (1, 2, 3... not 1, 5, 2, 8)
**Code example:**
```python
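def check_tab_order(pdf):
    # get_form_fields() and issue() are placeholder helpers.
    # Note: /T is a field's partial name, not a tab index. PDF has no
    # per-field tab index, so field.tab_index is assumed to be the field's
    # position derived from the page's /Tabs entry and /Annots order.
    form_fields = get_form_fields(pdf)
    for field in form_fields:
        if field.tab_index is None:
            issue(f"Field {field.get('/T')!r} has no tab order")

    # Check for gaps/skips between consecutive indices
    indices = sorted(f.tab_index for f in form_fields if f.tab_index is not None)
    for previous, current in zip(indices, indices[1:]):
        if current != previous + 1:
            issue(f"Tab order jumps from {previous} to {current}")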
```
**Time to build:** 1-2 days
**Value:** Medium - catches common form issues
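**Sketch (page-level tab order check):** In PDF, tab order is declared per page through the `/Tabs` entry (`/R` row, `/C` column, `/S` structure), and PDF/UA expects `/S` on pages that carry annotations. A check along those lines might look like the sketch below. It assumes pages are available as plain dict-like objects so that it runs without a PDF library; `find_tab_order_issues` is a hypothetical name, and wiring it to a parser such as pypdf (including resolving indirect references) is left out.

```python
def find_tab_order_issues(pages):
    """Flag pages that contain form widgets but do not declare /Tabs /S.

    `pages` is a list of dict-like page dictionaries (an assumption made
    for this sketch so that it runs without a PDF library).
    """
    issues = []
    for number, page in enumerate(pages, start=1):
        widgets = [a for a in page.get("/Annots", [])
                   if a.get("/Subtype") == "/Widget"]
        if not widgets:
            continue  # no form fields on this page, nothing to check
        tabs = page.get("/Tabs")
        if tabs is None:
            issues.append(f"Page {number}: {len(widgets)} field(s), no /Tabs entry")
        elif tabs != "/S":
            issues.append(f"Page {number}: /Tabs is {tabs}, expected /S (structure order)")
    return issues
```

This only checks that a tab order is declared; whether the resulting order is logical still needs the position-based comparison or manual testing.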
---
#### 2. Focus Order Detection ✅ FEASIBLE
**What it does:**
- Map visual position of form fields
- Compare to programmatic tab order
- Detect if focus jumps around illogically
**Example:**
```
Visual layout:        Tab order:
┌─────────┐           1. Name ✅
│ Name    │ 1         2. Email ✅
│ Email   │ 2         3. Submit ❌ WRONG! Should be #4
│ Phone   │ 4         4. Phone ❌ WRONG! Should be #3
│ Submit  │ 3
└─────────┘
ISSUE: Tab order doesn't match visual layout!
```
**Time to build:** 2-3 days
**Value:** Medium - useful for complex forms
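**Sketch (focus order comparison):** For a single-column form, the comparison can be reduced to checking each field's tab position against its top-to-bottom position. The sketch below assumes fields arrive as `(name, top_y)` pairs in tab order with unique names; the function name `focus_order_issues` and that input shape are hypothetical, and multi-column forms would need a layout-aware sort.

```python
def focus_order_issues(fields_in_tab_order):
    """Compare tab order with top-to-bottom visual order.

    Each field is a (name, top_y) pair; PDF y coordinates grow upward,
    so a larger top_y means the field sits higher on the page.
    """
    visual = sorted(fields_in_tab_order, key=lambda f: f[1], reverse=True)
    visual_position = {name: i for i, (name, _) in enumerate(visual, start=1)}
    issues = []
    for tab_position, (name, _) in enumerate(fields_in_tab_order, start=1):
        expected = visual_position[name]
        if expected != tab_position:
            issues.append(
                f"{name}: tab position {tab_position}, visual position {expected}"
            )
    return issues
```

A form whose Submit button is reached before a Phone field placed above it would yield one issue for each of the two fields.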
---
### What We CANNOT Build
#### ❌ Actual Keyboard Navigation Simulation
**Why not:**
- Need to launch PDF reader (Adobe, Preview, etc.)
- Simulate keyboard input (requires automation framework)
- Capture behavior (focus changes, interactions)
- Different readers behave differently
- Slow and brittle
**What this would require:**
1. Launch PDF in Adobe Acrobat
2. Use a desktop UI automation framework to send keyboard events (Selenium/Playwright only drive browsers)
3. Monitor focus changes
4. Detect keyboard traps
5. Verify all functionality accessible
**Problems:**
- Adobe Acrobat not automation-friendly
- Each PDF reader has different keyboard shortcuts
- Slow (30+ seconds per test)
- Flaky (automation breaks with UI changes)
- Requires GUI (can't run headless)
**Better solution:** Manual testing with actual keyboard
---
## 💡 **Recommended Approach**
### Build What's Useful:
**Phase 1 (High Value, Quick Wins):**
1. ✅ **Screen Reader Output Simulator** (3 days)
   - Show what SR would announce
   - Detect reading order issues
   - Most valuable feature
2. ✅ **Tab Order Validator** (2 days)
   - Check form field tab order
   - Detect missing tab indices
   - Quick win for forms
**Phase 2 (Medium Value):**
3. ⚠️ **Accessibility Tree Inspector** (4 days)
   - Visual tree viewer
   - Helpful for debugging
4. ⚠️ **Focus Order Detector** (3 days)
   - Compare visual vs. programmatic order
   - Useful for complex forms
**Don't Build (Not Worth It):**
- ❌ Full screen reader (years of work, low ROI)
- ❌ TTS integration (expensive, existing solutions better)
- ❌ Keyboard automation (brittle, slow, limited value)
---
## 🚀 **My Recommendation**
### **Option A: Build Screen Reader Simulator** (Best ROI)
**Effort:** 3-4 days
**Value:** HIGH
**What you get:**
```
📄 Screen Reader Preview
─────────────────────────────
[Document Title] "Annual Report 2024"
[Heading 1] "Executive Summary"
[Paragraph] "This year saw significant growth..."
[Image] NO ALT TEXT ❌
[Heading 2] "Financial Results"
[Table: 4 columns, 10 rows]
[Row 1, Header] "Quarter" "Revenue" "Profit" "Growth"
[Row 2] "Q1" "$1.2M" "$150K" "12%"
...
```
**Benefits:**
- Approximates what screen reader users hear (actual announcements vary by screen reader and PDF viewer)
- Catches reading order problems
- Validates alt text presence
- Catches many issues before testing with an actual screen reader
- Works in web interface
**This would be VERY valuable!**
---
### **Option B: Add Tab Order Checking** (Quick Win)
**Effort:** 1-2 days
**Value:** MEDIUM
**What you get:**
- ✅ Verify tab order exists
- ✅ Detect illogical tab sequences
- ✅ Flag forms with no tab order
- ⚠️ Can't test actual behavior (still need manual)
---
### **Option C: Do Nothing** (Use Existing Tools)
**Existing screen readers:**
- NVDA (Windows) - Free, excellent
- VoiceOver (Mac) - Built-in
- JAWS (Windows) - Commercial, industry standard
**Recommendation:** Train users to test with NVDA (5 minutes to learn)
**Keyboard testing:** Just manually test (Tab through the PDF)
---
## 🎯 **My Suggestion:**
### **Build the Screen Reader Simulator**
**Why:**
1. **High value** - Shows reading order issues (common problem)
2. **Unique feature** - Competitors don't have this
3. **Fast to build** - 3-4 days with existing code
4. **Integrates well** - Add to Visual Page Inspector
5. **Educational** - Helps users understand accessibility
**What it would show:**
- Text content in SR order
- Image alt text (or "MISSING")
- Table structure
- Heading hierarchy
- Form field labels
- Link text
**How it helps:**
- Catch reading order bugs without screen reader
- Verify alt text before publishing
- Educational for non-technical users
- Great demo feature
---
## ❓ **Want Me To Build It?**
I can build a **Screen Reader Output Simulator** that:
- Parses PDF structure tree
- Simulates screen reader announcements
- Shows reading order issues
- Displays in web interface
- Highlights problems visually
**Estimated time:** 3-4 days of development
**Would you like me to:**
1. ✅ Build the Screen Reader Simulator (high value)
2. ⚠️ Build Tab Order Validator (quick win, lower value)
3. ❌ Skip it and use existing screen readers (practical approach)
What do you think? The Screen Reader Simulator would be a really cool feature! 🎯