gemini_service.py: if the primary model (gemini-3.1-pro-preview) is
unavailable or returns a permission error, all three call sites now
automatically retry with gemini-3-flash-preview before propagating failure.
cloudrun.yaml: new Cloud Run service definition that ensures stable
WebSocket operation — 10-minute request timeout (vs 60s default),
2 vCPU / 4Gi RAM for PDF rasterisation, min 1 warm instance to prevent
cold-start disconnects, and GEMINI_API_KEY sourced from Secret Manager
so the service can actually reach the Gemini API.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>