- Import langdetect with graceful fallback if not installed
- _check_language(): detect the document's actual language by running
langdetect on the text of the first 3 pages; store the result in
self._detected_lang; warn when the declared /Lang tag doesn't match the
detected language; suggest a BCP-47 tag when /Lang is missing
- _check_readability(): skip Flesch Reading Ease / Flesch-Kincaid (English-only
formulas) for non-English documents; long-sentence check remains language-agnostic
- _check_links(): extend unclear-link patterns to Ukrainian, Russian, German,
French, Spanish, and Polish
- requirements-cloudrun.txt: add langdetect>=1.0.9
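
The changes above can be sketched roughly as follows. This is a minimal illustration, not the checker's actual code: the function names (primary_subtag, detect_language, language_warnings, flesch_scores) are hypothetical, and treating an unknown language as English for the readability formulas is an assumption. Only the langdetect calls (detect, DetectorFactory.seed) and the two Flesch formulas are standard.

```python
# Sketch only: helper names are hypothetical, not taken from the checker.
try:
    from langdetect import DetectorFactory, detect
    DetectorFactory.seed = 0  # langdetect is non-deterministic unless seeded
    LANGDETECT_AVAILABLE = True
except ImportError:
    # Graceful fallback: language checks are skipped when langdetect is absent.
    detect = None
    LANGDETECT_AVAILABLE = False


def primary_subtag(tag):
    """Return the lowercase primary subtag of a language tag ('en-US' -> 'en')."""
    return tag.replace("_", "-").split("-")[0].lower()


def detect_language(text):
    """Return a primary language subtag, or None if detection is unavailable."""
    if not LANGDETECT_AVAILABLE or not text.strip():
        return None
    try:
        return primary_subtag(detect(text))
    except Exception:  # langdetect raises on text with no usable features
        return None


def language_warnings(declared_tag, detected_lang):
    """Compare the declared /Lang tag with the detected language."""
    if detected_lang is None:
        return []
    if not declared_tag:
        return [f"/Lang is missing; detected language suggests '{detected_lang}'"]
    if primary_subtag(declared_tag) != detected_lang:
        return [f"/Lang '{declared_tag}' does not match detected '{detected_lang}'"]
    return []


def flesch_scores(words, sentences, syllables, detected_lang):
    """Return Flesch scores for English text, or None for other languages.

    Assumption: an unknown language (None) is treated as English.
    """
    if detected_lang not in (None, "en") or words == 0 or sentences == 0:
        return None
    words_per_sentence = words / sentences
    syllables_per_word = syllables / words
    return {
        "reading_ease": 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word,
        "fk_grade": 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59,
    }
```

Comparing only the primary subtag means a declared "en-US" is accepted for a detected "en", which is one plausible way to avoid false warnings on region-qualified tags.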
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the Redis queue + Python worker daemon with a synchronous HTTP
call to a Cloud Run service, eliminating Redis and simplifying the
infrastructure from 4 containers (web, worker, redis, postgres) to just
web + postgres (with Cloud Run handling processing).
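
For illustration, the synchronous call can be sketched as below. The real caller is api.php (PHP); this Python version only shows the shape of the request. The function name build_check_request, the raw-PDF request body, and the Content-Type are assumptions; the /check path, the bearer-token (OIDC) auth, and the 900s timeout come from this commit.

```python
# Sketch only: the payload format and helper name are assumptions.
import urllib.request

CHECK_TIMEOUT_S = 900  # matches the 900s timeouts set elsewhere in this commit


def build_check_request(base_url, id_token, pdf_bytes):
    """Build an authenticated POST /check request for the Cloud Run service."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/check",
        data=pdf_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {id_token}",  # OIDC identity token
            "Content-Type": "application/pdf",      # assumed payload format
        },
    )


def run_check(base_url, id_token, pdf_bytes):
    """Send the request and block until the service responds or times out."""
    request = build_check_request(base_url, id_token, pdf_bytes)
    with urllib.request.urlopen(request, timeout=CHECK_TIMEOUT_S) as response:
        return response.read()
```

Because the call blocks for the full processing time, every hop between the browser and the service needs a timeout at least as long as the slowest expected check.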
- Add cloudrun_service.py: Flask app wrapping EnterprisePDFChecker with
POST /check and GET /health endpoints, GCS image upload
- Add Dockerfile.cloudrun + requirements-cloudrun.txt for Cloud Run image
- Add cloudbuild.yaml for Cloud Build with custom Dockerfile
- Rewrite api.php: remove all Redis code, add Cloud Run OIDC auth
(getCloudRunToken), synchronous processing in handleCheck(), file-based
rate limiting, GCS redirect in handleImage(), DB helper updateJobInDatabase()
- Update js/upload.js: handle synchronous completed response from Cloud Run,
increase poll timeout to 15 minutes
- Update js/page-viewer.js: use GCS URLs directly for page images
- Simplify docker-compose.yml and docker-compose.prod.yml: remove worker
and redis services
- Remove PHP Redis extension from Dockerfile.web
- Set 900s timeouts across nginx, PHP-FPM, gunicorn, curl, and Cloud Run
- Update cleanup.py: remove result_images pattern (now on GCS), add
rate_limits cleanup
- Update .env.example: replace Redis vars with Cloud Run/GCS config
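
The file-based rate limiting mentioned above replaces the Redis counters. One way such a limiter can work is sketched here in Python for illustration; the real implementation lives in api.php and may differ. The function name allow_request, the one-JSON-file-per-client layout, and the sliding-window policy are all assumptions.

```python
# Sketch only: layout and policy are assumptions, not the api.php logic.
import hashlib
import json
import os
import tempfile
import time


def allow_request(client_id, limit, window_s, directory, now=None):
    """Return True if client_id has made fewer than `limit` requests
    in the last `window_s` seconds, recording this request if allowed."""
    now = time.time() if now is None else now
    # Hash the client id so it is always a safe file name.
    name = hashlib.sha256(client_id.encode("utf-8")).hexdigest() + ".json"
    path = os.path.join(directory, name)

    try:
        with open(path, "r", encoding="utf-8") as fh:
            stamps = json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        stamps = []

    # Keep only timestamps that are still inside the window.
    stamps = [t for t in stamps if now - t < window_s]
    allowed = len(stamps) < limit
    if allowed:
        stamps.append(now)

    # Write to a temp file and rename so readers never see a partial file.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(stamps, fh)
    os.replace(tmp_path, path)
    return allowed
```

This sketch takes no lock, so two concurrent requests from the same client can race and both be admitted; a production version would likely add file locking. Stale files are what a periodic cleanup job would need to remove.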
Cloud Run service deployed to:
https://pdf-checker-bcb6ipdqka-uc.a.run.app
GCS bucket: gs://optical-pdf-images (7-day lifecycle, public read)
GCP project: optical-414516
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>