pdf-accessibility/requirements-cloudrun.txt
michael 4080638856 Migrate PDF processing from Redis worker to Google Cloud Run
Replace the Redis queue and Python worker daemon with a synchronous HTTP
call to a Cloud Run service. This eliminates Redis and reduces the
infrastructure from 4 containers (web, worker, redis, postgres) to 2
(web, postgres), with Cloud Run handling the processing.

- Add cloudrun_service.py: Flask app wrapping EnterprisePDFChecker with
  POST /check and GET /health endpoints, GCS image upload
- Add Dockerfile.cloudrun + requirements-cloudrun.txt for Cloud Run image
- Add cloudbuild.yaml for Cloud Build with custom Dockerfile
- Rewrite api.php: remove all Redis code, add Cloud Run OIDC auth
  (getCloudRunToken), synchronous processing in handleCheck(), file-based
  rate limiting, GCS redirect in handleImage(), DB helper updateJobInDatabase()
- Update js/upload.js: handle synchronous completed response from Cloud Run,
  increase poll timeout to 15 minutes
- Update js/page-viewer.js: use GCS URLs directly for page images
- Simplify docker-compose.yml and docker-compose.prod.yml: remove worker
  and redis services
- Remove PHP Redis extension from Dockerfile.web
- Set 900s timeouts across nginx, PHP-FPM, gunicorn, curl, and Cloud Run
- Update cleanup.py: remove result_images pattern (now on GCS), add
  rate_limits cleanup
- Update .env.example: replace Redis vars with Cloud Run/GCS config

Cloud Run service deployed to:
  https://pdf-checker-bcb6ipdqka-uc.a.run.app
GCS bucket: gs://optical-pdf-images (7-day lifecycle, public read)
GCP project: optical-414516

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 14:50:38 -06:00


# Cloud Run PDF Accessibility Checker - Python Dependencies

# Core PDF processing
pypdf>=4.0.0
pdfplumber>=0.11.0

# Image processing
Pillow>=10.0.0
pdf2image>=1.16.0

# OCR
pytesseract>=0.3.10

# Scientific computing
numpy>=1.24.0

# NLP and readability
textblob>=0.17.1

# Google Cloud APIs
google-cloud-vision>=3.4.0
google-cloud-documentai>=2.20.0

# Anthropic Claude API
anthropic>=0.18.0

# Additional utilities
python-dotenv>=1.0.0

# Cloud Run specific
flask>=3.0.0
gunicorn>=21.2.0
google-cloud-storage>=2.14.0
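
Every entry above is a lower bound (`>=`) rather than an exact pin, so the
versions that end up in the Cloud Run image depend on when it is built. The
following is a minimal sketch, not part of the repository, of how the floors
could be checked against an environment using only the standard library. The
helper names (parse_requirements, version_key, at_least, unmet) are
hypothetical, and the version comparison is deliberately simplified: it only
understands dotted numeric versions and ignores pre-release or local tags.

```python
import re
from importlib import metadata

# Matches simple "name>=version" lines such as "flask>=3.0.0".
REQ_LINE = re.compile(r"^([A-Za-z0-9_.\-]+)\s*>=\s*([0-9][0-9A-Za-z.]*)$")


def parse_requirements(text):
    """Return {package: minimum_version} for simple `name>=version` lines."""
    floors = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = REQ_LINE.match(line)
        if match:
            floors[match.group(1)] = match.group(2)
    return floors


def version_key(version):
    """Turn '0.3.10' into (0, 3, 10), stopping at the first non-numeric part."""
    parts = []
    for part in version.split("."):
        if not part.isdigit():
            break
        parts.append(int(part))
    return tuple(parts)


def at_least(installed, minimum):
    """Compare two dotted versions, padding the shorter one with zeros."""
    a, b = version_key(installed), version_key(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def unmet(floors):
    """List (name, minimum, installed_or_None) for every floor not satisfied."""
    problems = []
    for name, minimum in floors.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append((name, minimum, None))
            continue
        if not at_least(installed, minimum):
            problems.append((name, minimum, installed))
    return problems
```

A possible use is `unmet(parse_requirements(open("requirements-cloudrun.txt").read()))`
run inside the built image; an empty list would mean every floor is met under
the simplified comparison.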