obsidian/01 Projects/pdf-accessibility/PDF Accessibility Checker.md
2026-04-24 10:49:33 +01:00

name: PDF Accessibility Checker
client: Oliver Internal
status: active
tech: Python, PHP, PostgreSQL, Redis, Docker, Claude, Google Cloud Vision, pypdf, pdfplumber
local_path: /Users/ai_leed/Documents/Projects/Oliver/pdf-accessibility
deploy: docker-compose up (dev) / docker-compose -f docker-compose.prod.yml up -d (prod)
url: TBD
server:
tags: oliver, pdf, accessibility, wcag, ai, php, redis, postgresql
created: 2026-04-14
last_commit: 2026-03-18
commits: 69

Overview

Production-ready PDF accessibility checker that validates documents against WCAG 2.1 Level A and AA. About 95% of coverage is automated, combining traditional PDF analysis with AI checks. Branded for Oliver (Montserrat font, black/#FFC407).

Recent (Feb 2026): API authentication added, production readiness enhancements.

Tech Stack

  • Web UI: Vanilla JS + HTML/CSS (drag-drop, visual inspector)
  • REST API: PHP (api.php) + auth.php (Bearer/X-API-Key)
  • Core engine: Python (enterprise_pdf_checker.py) — 30+ WCAG checks
  • AI Analysis: Anthropic Claude + Google Cloud Vision (image analysis)
  • PDF libs: pypdf + pdfplumber
  • Queue: Redis (pdf:queue)
  • Database: PostgreSQL (job tracking + history)
  • Worker: worker.py (background daemon — Redis queue consumer)
  • Infrastructure: Docker + docker-compose (dev + prod variants)
  • Remediation: pdf_remediation.py (auto-fix output)
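
As a rough illustration of the two auth schemes the REST API accepts (Bearer token or X-API-Key), the sketch below builds a status request from a Python client. It is not taken from the project: the `job_id` query parameter name and the `build_status_request` helper are assumptions, and the base URL simply reuses the local PHP dev server address.

```python
import urllib.parse
import urllib.request

# Assumption: the API is served by the local PHP dev server.
BASE_URL = "http://localhost:8000/api.php"


def build_status_request(job_id, token=None, api_key=None):
    """Build a GET request for api.php?action=status with one auth header."""
    if (token is None) == (api_key is None):
        raise ValueError("pass exactly one of token or api_key")
    # Assumption: the job is identified by a `job_id` query parameter.
    query = urllib.parse.urlencode({"action": "status", "job_id": job_id})
    request = urllib.request.Request(f"{BASE_URL}?{query}", method="GET")
    if token is not None:
        request.add_header("Authorization", f"Bearer {token}")
    else:
        # urllib stores this as "X-api-key"; HTTP header names are case-insensitive.
        request.add_header("X-API-Key", api_key)
    return request
```

Passing the returned object to `urllib.request.urlopen` would send it; the response format is not documented in this note, so parsing is left out.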

Architecture

Web UI / REST API (PHP)
    ↓ auth.php validates Bearer/X-API-Key
api.php → uploads/ → Redis queue (pdf:queue)
    ↓
worker.py (daemon)
    └── EnterprisePDFChecker.check_all() → 30+ WCAG checks
        ├── Traditional: pypdf, pdfplumber
        └── AI: Claude + Google Cloud Vision
    → results/{job_id}.result.json + PostgreSQL
    ↓
Client polls api.php?action=status → fetch results
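
The worker step in the flow above can be sketched as follows. This is a minimal illustration, not the contents of worker.py: the job payload fields (`job_id`, `path`), the `run_checks` callable standing in for `EnterprisePDFChecker.check_all()`, and the status values are all assumptions. An in-memory deque replaces the Redis list `pdf:queue`, and the PostgreSQL write is omitted.

```python
import json
from collections import deque
from pathlib import Path


def process_job(raw_job, run_checks, results_dir):
    """Run the checks for one queued job and write {job_id}.result.json."""
    job = json.loads(raw_job)  # assumed payload: {"job_id": ..., "path": ...}
    try:
        report = {
            "job_id": job["job_id"],
            "status": "completed",
            "results": run_checks(job["path"]),
        }
    except Exception as exc:  # one bad PDF should not stop the daemon
        report = {"job_id": job["job_id"], "status": "failed", "error": str(exc)}
    out_path = Path(results_dir) / f"{job['job_id']}.result.json"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(report, indent=2), encoding="utf-8")
    return out_path


def drain(queue, run_checks, results_dir):
    """Process jobs oldest-first until the queue is empty."""
    written = []
    while queue:
        written.append(process_job(queue.popleft(), run_checks, results_dir))
    return written
```

A long-running daemon would block on the Redis list instead of draining a deque, but the per-job handling would look much the same.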

Dev Commands

# Tests
source venv/bin/activate
pytest tests/ -v                     # 31 tests
pytest tests/ --cov=. --cov-report=html

# Run locally
php -S localhost:8000                 # PHP dev server

# Docker
docker-compose up                                        # Dev
docker-compose -f docker-compose.prod.yml up -d         # Prod
docker-compose exec worker pytest tests/ -v             # Tests in container

# CLI usage
python enterprise_pdf_checker.py doc.pdf --output report.json
python enterprise_pdf_checker.py doc.pdf --quick        # Skip AI checks
python pdf_remediation.py doc.pdf --output fixed.pdf --all
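
For anyone scripting around the checker CLI, the flags shown above could be parsed along these lines. This is a hypothetical argparse sketch that mirrors only the documented options (`--output`, `--quick`); the actual argument handling in enterprise_pdf_checker.py may differ.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented checker flags."""
    parser = argparse.ArgumentParser(prog="enterprise_pdf_checker.py")
    parser.add_argument("pdf", help="path to the PDF to check")
    parser.add_argument("--output", help="write the JSON report to this path")
    parser.add_argument("--quick", action="store_true", help="skip AI checks")
    return parser
```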

Key Files

| File | Purpose |
|------|---------|
| enterprise_pdf_checker.py | Core engine: 30+ WCAG checks |
| api.php | REST API: file handling, job queue |
| auth.php | Authentication: Bearer/X-API-Key |
| worker.py | Background daemon: Redis queue consumer |
| pdf_remediation.py | Auto-fix accessibility issues |

Timeline / Git History

| Date | Change |
|------|--------|
| 2026-03-18 | Fix CP14 heading detection via RoleMap + manual pass support |
| 2026-03-18 | Persist adjusted score to server on Recalculate |
| 2026-03-18 | Address client feedback: WCAG badges, table grouping, history UX |
| 2026-03-16 | PDF report reflects adjusted score + manual pass |
| 2026-03-13 | Move document history to separate history.html page |
| 2026-03-13 | Fix history: read jobs from data.data.jobs |
Sessions

2026-04-14 Project catalogued

Done: Added to Obsidian second brain with full details.


Change Log

| Date | Requested | Changed | Files |
|------|-----------|---------|-------|
| 2026-03-18 | Fix heading detection | CP14 via RoleMap + manual pass | enterprise_pdf_checker.py |
| 2026-03-13 | Separate history page | Move to history.html | history.html, api.php |