pdf-accessibility/requirements-cloudrun.txt
Vadym Samoilenko 7fe26e7dc4 Add multilingual PDF support: language detection + language-aware checks
- Import langdetect with graceful fallback if not installed
- _check_language(): detect actual document language via langdetect on first
  3 pages of text; store in self._detected_lang; warn when declared /Lang tag
  doesn't match detected language; suggest correct BCP-47 tag when missing
- _check_readability(): skip Flesch Reading Ease / Flesch-Kincaid (English-only
  formulas) for non-English documents; long-sentence check remains language-agnostic
- _check_links(): extend unclear-link patterns to Ukrainian, Russian, German,
  French, Spanish, and Polish
- requirements-cloudrun.txt: add langdetect>=1.0.9

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 14:52:05 +00:00
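
The language check described in the commit message could look roughly like the sketch below. This is not the checker's actual code, which this file does not show. The helper names `detect_language` and `declared_matches_detected` and the `HAS_LANGDETECT` flag are hypothetical; only the `langdetect` calls (`detect`, `DetectorFactory.seed`, `LangDetectException`) are real library API. It assumes page text has already been extracted into a list of strings.

```python
# Optional dependency: fall back gracefully when langdetect is not installed.
try:
    from langdetect import DetectorFactory, LangDetectException, detect

    # langdetect is non-deterministic by default; a fixed seed gives repeatable results.
    DetectorFactory.seed = 0
    HAS_LANGDETECT = True
except ImportError:
    HAS_LANGDETECT = False


def detect_language(page_texts, max_pages=3):
    """Guess the language code from the first pages of text (hypothetical helper).

    Returns a code such as "en" or "uk", or None when detection is unavailable.
    """
    if not HAS_LANGDETECT:
        return None
    # Join the text of the first few pages, skipping empty or missing pages.
    sample = " ".join(t for t in page_texts[:max_pages] if t).strip()
    if not sample:
        return None
    try:
        return detect(sample)
    except LangDetectException:
        # Raised when the sample has no usable features (e.g. digits only).
        return None


def declared_matches_detected(declared_tag, detected):
    """Compare the primary subtag of a declared /Lang tag with the detected code.

    Returns True or False, or None when either value is missing.
    """
    if not declared_tag or not detected:
        return None
    primary = declared_tag.replace("_", "-").split("-")[0].lower()
    # langdetect can return region-qualified codes such as "zh-cn".
    return primary == detected.split("-")[0].lower()
```

With helpers like these, a declared tag of `en-US` on a document detected as `uk` would produce a mismatch warning, and a caller could skip the English-only Flesch formulas whenever the detected code is not `en`.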

# Cloud Run PDF Accessibility Checker - Python Dependencies
# Core PDF processing
pypdf>=4.0.0
pdfplumber>=0.11.0
# Image processing
Pillow>=10.0.0
pdf2image>=1.16.0
# OCR
pytesseract>=0.3.10
# Scientific computing
numpy>=1.24.0
# NLP and readability
textblob>=0.17.1
# Google Cloud APIs
google-cloud-vision>=3.4.0
google-cloud-documentai>=2.20.0
# Anthropic Claude API
anthropic>=0.18.0
# Additional utilities
python-dotenv>=1.0.0
# Cloud Run specific
flask>=3.0.0
gunicorn>=21.2.0
google-cloud-storage>=2.14.0
# Language detection (multilingual PDF support)
langdetect>=1.0.9