# API Integration Quick Reference
## 🚀 One-Page Integration Guide
### What Can Each API Do?
```
┌─────────────────────────────────────────────────────────────────┐
│ WCAG GAP → API SOLUTION │
├─────────────────────────────────────────────────────────────────┤
│ Alt Text Quality → GPT-4V, Claude, Google Vision │
│ Color Contrast → PIL + pdf2image (FREE) │
│ OCR for Scans → Tesseract (FREE) / Google Doc AI │
│ Content Readability → TextBlob (FREE) / GPT-4 │
│ Link Text Quality → Regex + NLP (FREE) / GPT-4 │
│ Heading Structure → pypdf parsing (FREE) │
│ Form Field Labels → pypdf parsing (FREE) │
└─────────────────────────────────────────────────────────────────┘
```
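The color contrast row can be made concrete with the WCAG 2.1 contrast-ratio formula. The sketch below is plain Python and assumes you have already sampled a foreground and a background RGB color from a rendered page (for example with pdf2image and PIL); the function names are illustrative and not part of the checker's API. The thresholds are the WCAG AA values: 4.5:1 for normal text and 3:1 for large text.

```python
def _linearize(channel):
    """Convert an 8-bit sRGB channel to linear light, per the WCAG 2.1 definition."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) tuple with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg, bg):
    """WCAG contrast ratio between two colors, from 1.0 up to 21.0."""
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(fg, bg, large_text=False):
    """True if the pair meets WCAG AA (4.5:1 normal text, 3:1 large text)."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)


if __name__ == "__main__":
    # Black on white is the maximum possible ratio
    print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 2))
```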
---
## 💰 Cost Comparison Table
| Service | Cost | Best For | Setup Complexity |
|---------|------|----------|------------------|
| **Tesseract OCR** | FREE | Scanned documents | ⭐ Easy |
| **TextBlob** | FREE | Readability checks | ⭐ Easy |
| **PIL/Pillow** | FREE | Color contrast | ⭐⭐ Medium |
| **OpenAI GPT-4V** | $0.01-0.03/image | Alt text validation | ⭐⭐ Medium |
| **Claude Vision** | $0.015/image | Alt text + context | ⭐⭐ Medium |
| **Google Vision** | $1.50/1000 images | Bulk processing | ⭐⭐⭐ Hard |
| **Google Doc AI** | $1.50/1000 pages | Complex OCR | ⭐⭐⭐ Hard |
---
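To turn the per-unit prices in the cost comparison table into a budget estimate, a small helper like the following may be useful. It is only a sketch: the dictionary keys and function names are made up for illustration, and the prices are copied from the table above, so check them against each provider's current pricing before relying on the numbers.

```python
# Prices in USD per 1,000 units (images or pages), copied from the table above.
# Each entry is a (low, high) range; the keys are illustrative names.
PRICE_PER_THOUSAND = {
    "openai_gpt4v": (10.0, 30.0),   # $0.01-0.03 per image
    "claude_vision": (15.0, 15.0),  # $0.015 per image
    "google_vision": (1.5, 1.5),    # $1.50 per 1000 images
    "google_doc_ai": (1.5, 1.5),    # $1.50 per 1000 pages
}


def estimate_cost(service, units):
    """Return the (low, high) estimated cost in USD for `units` images or pages."""
    low, high = PRICE_PER_THOUSAND[service]
    return (low * units / 1000, high * units / 1000)


def units_within_budget(service, budget):
    """How many units fit in `budget` USD at the high end of the price range."""
    _, high = PRICE_PER_THOUSAND[service]
    return int(budget * 1000 // high)


if __name__ == "__main__":
    print(estimate_cost("openai_gpt4v", 100))
    print(units_within_budget("claude_vision", 10))
```

Budgeting against the high end of the range is a deliberately conservative choice; swap in the low end if you prefer an optimistic estimate.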
## 🎯 Recommended Setups by Budget
### $0/month - Basic (60% coverage)
```bash
pip install pypdf pdfplumber pytesseract textblob pillow pdf2image
# Enables:
# ✅ Document structure checks
# ✅ OCR for scanned docs
# ✅ Readability analysis
# ✅ Color contrast checks
# ✅ Link validation
```
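As an illustration of what a free, regex-based link text check might look like, here is a minimal sketch. The list of generic phrases and the function name are assumptions chosen for the example; the actual checker may apply different rules.

```python
import re

# Phrases that say nothing about the link target (illustrative, not exhaustive)
GENERIC_LINK_TEXT = {"click here", "here", "read more", "more", "learn more", "link"}
# Link text that is just a bare URL
URL_PATTERN = re.compile(r"^(https?://|www\.)\S+$", re.IGNORECASE)


def link_text_issues(text):
    """Return a list of problems found in one link's visible text."""
    # Collapse whitespace, lowercase, and drop trailing punctuation
    normalized = " ".join(text.split()).lower().rstrip(".!:")
    if not normalized:
        return ["empty link text"]
    if normalized in GENERIC_LINK_TEXT:
        return ["generic link text"]
    if URL_PATTERN.match(normalized):
        return ["raw URL used as link text"]
    return []


if __name__ == "__main__":
    for sample in ["Click here", "https://example.com/a", "Download the annual report"]:
        print(repr(sample), link_text_issues(sample))
```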
### $10/month - Intermediate (80% coverage)
```bash
# All free tools PLUS:
pip install openai
export OPENAI_API_KEY="sk-..."
# Enables:
# ✅ All free features
# ✅ AI alt text validation (10 images/doc)
# ✅ Content quality analysis
```
### $50/month - Advanced (90% coverage)
```bash
# All tools PLUS:
# - Unlimited image analysis
# - Advanced content analysis
# - Batch processing
```
### $100/month - Enterprise (95% coverage)
```bash
# All tools PLUS:
pip install google-cloud-vision google-cloud-documentai
# Enables:
# ✅ Google Document AI (best OCR)
# ✅ Unlimited image processing
# ✅ Full automation pipeline
```
---
## ⚡ Quick Start Commands
### 1. Install Free Tools (5 minutes)
```bash
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install tesseract-ocr poppler-utils
# macOS
brew install tesseract poppler
# Python packages
pip install pypdf pdfplumber pytesseract textblob pillow pdf2image numpy --break-system-packages
# Download language data
python -m textblob.download_corpora
```
### 2. Basic Check (No APIs)
```bash
python pdf_accessibility_checker.py document.pdf
```
### 3. With OCR
```bash
python enhanced_pdf_checker.py document.pdf --enable-ocr
```
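OCR is only worth running on pages that have no usable text layer. One simple heuristic, sketched below, is to flag pages whose extracted text is empty or very short. It assumes you already have one string per page (for example from pypdf's `page.extract_text()`); the function name and the 25-character threshold are illustrative guesses, not values taken from the checker.

```python
def pages_needing_ocr(page_texts, min_chars=25):
    """Return 1-based page numbers whose extracted text is shorter than min_chars.

    `page_texts` is a list with one entry per page; None is treated as empty.
    """
    return [
        page_number
        for page_number, text in enumerate(page_texts, start=1)
        if len((text or "").strip()) < min_chars
    ]


if __name__ == "__main__":
    sample = ["A full paragraph of real extracted text lives here.", "", None, "  12 "]
    print(pages_needing_ocr(sample))
```

Pages that contain only a page number or a short caption will also be flagged, so treat the result as a list of candidates for OCR.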
### 4. With All Free Tools
```bash
python enhanced_pdf_checker.py document.pdf \
--enable-ocr \
--check-contrast \
--analyze-content \
--check-links \
--verbose
```
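For a sense of what a content-readability check can compute, the sketch below implements the Flesch Reading Ease formula in plain Python. Whether `--analyze-content` uses this particular formula is not stated here, and the syllable counter is a rough vowel-group heuristic, so treat the scores as approximate.

```python
import re


def count_syllables(word):
    """Rough estimate: count vowel groups, discounting a silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(1, count)


def flesch_reading_ease(text):
    """Flesch Reading Ease score (higher is easier); None if there is no text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return None
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


if __name__ == "__main__":
    print(flesch_reading_ease("The cat sat. The dog ran."))
```

Scores around 60 to 70 are commonly described as plain English; long sentences and long words both pull the score down.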
### 5. With OpenAI Vision
```bash
export OPENAI_API_KEY="sk-your-key"
python enhanced_pdf_checker.py document.pdf \
--vision-api openai \
--vision-api-key "$OPENAI_API_KEY"
```
---
## 📝 API Setup Instructions
### OpenAI (GPT-4 Vision)
```python
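# 1. Get API key from https://platform.openai.com/api-keys
# 2. Install library (run in your shell): pip install openai
# 3. Use in code
import base64

import openai

# Encode the image to send; "image.jpg" is a placeholder path
with open("image.jpg", "rb") as f:
    base64_image = base64.b64encode(f.read()).decode("utf-8")

client = openai.OpenAI(api_key="sk-...")
response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # substitute a vision-capable model available to your account
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image"},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}}
        ]
    }]
)
print(response.choices[0].message.content)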
```
### Anthropic (Claude Vision)
```python
# 1. Get API key from https://console.anthropic.com/
# 2. Install library (run in your shell): pip install anthropic
# 3. Use in code
import base64

import anthropic

# Encode the image to send; "image.jpg" is a placeholder path
with open("image.jpg", "rb") as f:
    base64_image = base64.b64encode(f.read()).decode("utf-8")

client = anthropic.Anthropic(api_key="sk-ant-...")
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image", "source": {"type": "base64", "media_type": "image/jpeg", "data": base64_image}},
            {"type": "text", "text": "Provide alt text for accessibility"}
        ]
    }]
)
print(message.content[0].text)
```
### Google Cloud Vision
```bash
# 1. Create project at https://console.cloud.google.com/
# 2. Enable Vision API
# 3. Create service account & download credentials
# 4. Install library
pip install google-cloud-vision
# 5. Set credentials
export GOOGLE_APPLICATION_CREDENTIALS="path/to/credentials.json"
```
```python
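from google.cloud import vision

# Read the image to analyze; "image.jpg" is a placeholder path
with open("image.jpg", "rb") as f:
    image_bytes = f.read()
client = vision.ImageAnnotatorClient()
image = vision.Image(content=image_bytes)
response = client.label_detection(image=image)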
```
---
## 🔧 Common Integration Patterns
### Pattern 1: Smart Sampling (Cost Control)
```python
# Check at most max_images images per document, sampled across the document
def check_images_smart(pdf_path, max_images=10):
    # extract_all_images and check_all_images are helpers from your own checker
    images = extract_all_images(pdf_path)
    if len(images) <= max_images:
        return check_all_images(images)
    else:
        # Sample evenly throughout document
        step = len(images) // max_images
        sampled = images[::step][:max_images]
        return check_all_images(sampled)
```
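One caveat with the integer step above: when the image count is not a multiple of `max_images`, the sample leans toward the start of the document (with 19 images and a limit of 10, the step is 1 and only the first 10 are checked). A possible alternative, sketched here with a hypothetical `evenly_sample` helper, spreads the picks by computing each index directly.

```python
def evenly_sample(items, max_items):
    """Pick up to max_items elements spread across the whole list, in order."""
    n = len(items)
    if max_items <= 0:
        return []
    if n <= max_items:
        return list(items)
    # Index i maps to position i * n // max_items, which spans the full range
    return [items[i * n // max_items] for i in range(max_items)]


if __name__ == "__main__":
    print(evenly_sample(list(range(19)), 10))
```

The first element is always included and the selected positions never repeat, because consecutive indices differ by at least one whenever the list is longer than `max_items`.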
### Pattern 2: Caching Results
```python
import hashlib
import json
from pathlib import Path

def get_cached_result(image_bytes):
    """Cache API results to avoid repeat calls"""
    cache_dir = Path(".cache")
    cache_dir.mkdir(exist_ok=True)
    # Create hash of image (MD5 serves only as a cache key here, not for security)
    img_hash = hashlib.md5(image_bytes).hexdigest()
    cache_file = cache_dir / f"{img_hash}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    # Call API (call_vision_api is your provider-specific wrapper)
    result = call_vision_api(image_bytes)
    # Cache result
    cache_file.write_text(json.dumps(result))
    return result
```
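A cache keyed only on the image bytes will return stale answers if you later change the provider, model, or prompt. The variant below is a sketch of one way to handle that: it folds a caller-supplied `namespace` string into the key and takes the API function as a parameter so it can be exercised without network access. The names `cached_call` and `namespace` are illustrative, not part of the checker.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cached_call(image_bytes, api_fn, cache_dir, namespace="default"):
    """Return api_fn(image_bytes), reusing a JSON file on disk when one exists.

    `namespace` should change whenever the provider, model, or prompt changes.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Key on the namespace and the image content together
    key = hashlib.sha256(namespace.encode("utf-8") + b"\0" + image_bytes).hexdigest()
    cache_file = cache_dir / f"{key}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    result = api_fn(image_bytes)
    cache_file.write_text(json.dumps(result), encoding="utf-8")
    return result


if __name__ == "__main__":
    def fake_api(data):
        # Stand-in for a real vision API call
        return {"alt_text": f"{len(data)} bytes"}

    with tempfile.TemporaryDirectory() as tmp:
        print(cached_call(b"image-data", fake_api, tmp, namespace="openai:v1"))
```

The result must be JSON-serializable, so convert SDK response objects to plain dicts before caching them.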
### Pattern 3: Batch Processing
```python
from pathlib import Path

def process_directory(directory, max_cost=10.0):
    """Process all PDFs with cost limit"""
    total_cost = 0.0
    for pdf_file in Path(directory).glob("*.pdf"):
        if total_cost >= max_cost:
            print(f"Reached cost limit of ${max_cost:.2f}")
            break
        # check_pdf is assumed to return a dict with an 'estimated_cost' entry
        result = check_pdf(pdf_file)
        total_cost += result['estimated_cost']
        print(f"Processed {pdf_file.name} - Total cost: ${total_cost:.2f}")
```
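Note that the loop in Pattern 3 checks the limit only before starting each document, so the final document can push the total past `max_cost`. If you can estimate a document's cost up front (for example, image count times price per image), a small budget object like this sketch lets you skip work that would not fit. The `CostBudget` class is hypothetical and not part of the checker.

```python
class CostBudget:
    """Track spending against a hard limit, in USD."""

    def __init__(self, limit):
        self.limit = float(limit)
        self.spent = 0.0

    @property
    def remaining(self):
        """Amount still available, never negative."""
        return max(0.0, self.limit - self.spent)

    def try_spend(self, amount):
        """Record `amount` and return True only if it fits in what remains."""
        # Small tolerance so float rounding does not reject an exact fit
        if amount > self.remaining + 1e-9:
            return False
        self.spent += amount
        return True


if __name__ == "__main__":
    budget = CostBudget(1.00)
    for estimate in (0.60, 0.60, 0.40):
        print(estimate, budget.try_spend(estimate), round(budget.remaining, 2))
```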
---
## 🎨 Example: Complete Integration
```python
#!/usr/bin/env python3
"""
Complete PDF accessibility checker with all integrations
"""
import os
import sys

from enhanced_pdf_checker import EnhancedPDFAccessibilityChecker, EnhancedCheckConfig


def main():
    pdf_path = sys.argv[1] if len(sys.argv) > 1 else "document.pdf"
    # Configure with your API keys
    config = EnhancedCheckConfig(
        # Free tools
        enable_ocr=True,
        enable_contrast_check=True,
        enable_content_analysis=True,
        enable_link_validation=True,
        # Paid APIs (optional)
        vision_api_provider="openai",  # or "anthropic" or "google"
        # Read the key from the environment instead of hard-coding it; None to skip
        vision_api_key=os.environ.get("OPENAI_API_KEY"),
        verbose=True
    )
    # Run checks
    print(f"Analyzing {pdf_path}...")
    checker = EnhancedPDFAccessibilityChecker(pdf_path, config)
    issues = checker.check_all()
    # Generate reports
    checker.generate_report("text")  # Console output
    # Build report names from the path stem so the input PDF is never overwritten
    base = os.path.splitext(pdf_path)[0]
    html_output = f"{base}_report.html"
    with open(html_output, "w", encoding="utf-8") as f:
        f.write(checker.generate_report("html"))
    json_output = f"{base}_report.json"
    with open(json_output, "w", encoding="utf-8") as f:
        f.write(checker.generate_report("json"))
    print("\n✅ Complete!")
    print(f"📊 Found {len(issues)} issues")
    print(f"📄 HTML report: {html_output}")
    print(f"📄 JSON report: {json_output}")


if __name__ == "__main__":
    main()
```
**Run it:**
```bash
python complete_checker.py my_document.pdf
```
---
## 📊 Expected Results by Coverage Level
### 20% Coverage (Basic Tool Only)
```
Issues Found: 5-10
- Missing title
- No language set
- PDF not tagged
- No bookmarks
- Security issues
```
### 60% Coverage (+ Free Tools)
```
Issues Found: 15-30
- All basic issues
- 5-10 OCR issues (scanned pages)
- 3-5 readability issues
- 2-4 contrast warnings
- 1-3 link text issues
```
### 80% Coverage (+ Budget APIs)
```
Issues Found: 25-45
- All previous issues
- 10-15 image alt text issues
- 5-8 content quality issues
- Specific improvement suggestions
```
### 95% Coverage (+ Full APIs)
```
Issues Found: 40-60+
- Comprehensive coverage
- Every image analyzed
- Detailed contrast analysis
- AI-powered suggestions
- Production-ready reports
```
---
## 🆘 Troubleshooting
### "ModuleNotFoundError: No module named 'pytesseract'"
```bash
pip install pytesseract pdf2image --break-system-packages
sudo apt-get install tesseract-ocr # Linux
brew install tesseract # macOS
```
### "TesseractNotFoundError"
```bash
# Linux
sudo apt-get install tesseract-ocr
# macOS
brew install tesseract
# Windows
# Download from: https://github.com/UB-Mannheim/tesseract/wiki
```
### OpenAI API Rate Limits
```python
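# Add rate limiting
import time

def check_with_rate_limit(images, max_per_minute=50):
    results = []
    for i, img in enumerate(images):
        # check_image is your per-image API call
        results.append(check_image(img))
        # Pause after each full batch, unless that was the last image
        if (i + 1) % max_per_minute == 0 and i + 1 < len(images):
            time.sleep(60)  # Wait 1 minute
    return results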
```
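Fixed pauses do not help when the API rejects a request anyway. A common complement is to retry with exponential backoff, sketched below in a provider-neutral form. The helper name, its parameters, and the default delays are assumptions for this example; with the `openai` package (version 1 and later) you would typically pass `retry_on=(openai.RateLimitError,)`.

```python
import random
import time


def call_with_backoff(fn, *, retry_on=(Exception,), max_attempts=5,
                      base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fn(), retrying on the given exceptions with exponential backoff.

    Delays are base_delay, 2x, 4x, ... capped at max_delay, plus up to 10% jitter.
    The last failure is re-raised once max_attempts is reached.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky():
        # Fails twice, then succeeds, to show the retry path
        state["calls"] += 1
        if state["calls"] < 3:
            raise RuntimeError("simulated rate limit")
        return "ok"

    print(call_with_backoff(flaky, retry_on=(RuntimeError,), base_delay=0.01))
```

The `sleep` parameter exists so tests can substitute a recorder for real waiting.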
### High API Costs
```python
# Strategy 1: Use low-detail mode
image_url = {"url": f"data:image/jpeg;base64,{img}", "detail": "low"}
# Strategy 2: Sample images
images_to_check = images[::5] # Every 5th image
# Strategy 3: Set hard limits
MAX_COST = 5.00 # Stop at $5
```
---
## 🎓 Learning Resources
- **WCAG 2.1**: https://www.w3.org/WAI/WCAG21/quickref/
- **PDF/UA**: https://www.pdfa.org/resource/pdfua-in-a-nutshell/
- **OpenAI Vision**: https://platform.openai.com/docs/guides/vision
- **Anthropic Claude**: https://docs.anthropic.com/claude/docs
- **Google Vision**: https://cloud.google.com/vision/docs
---
## ⚡ TL;DR
**Free (60% coverage):**
```bash
pip install pypdf pdfplumber pytesseract textblob pillow pdf2image
python enhanced_pdf_checker.py doc.pdf --enable-ocr --check-contrast --analyze-content
```
**With AI ($10/month, 80% coverage):**
```bash
pip install openai
export OPENAI_API_KEY="sk-..."
python enhanced_pdf_checker.py doc.pdf --vision-api openai --vision-api-key "$OPENAI_API_KEY"
```
**Start simple and add APIs as needed. By the estimates above, each paid tier adds roughly 5-20 percentage points of coverage.**