video-accessibility/docs/project/infrastructure.md
Vadym Samoilenko a3b300b76a docs: add canonical documentation + audit cleanup
- AGENTS.md: canonical project entry point (Quick Nav, pipeline, constraints)
- docs/: complete docs tree — architecture, API spec, DB schema, infra,
  runbook, requirements, tech stack, principles, reference ADRs, guides,
  tasks backlog, testing strategy
- tests/README.md: test commands, structure, known gaps
- README.md / CLAUDE.md / DEPLOYMENT.md: updated with canonical doc links
- .archive/: backup of pre-documentation-pipeline originals
- backend/uv.lock: uv dependency lockfile
- Delete committed __pycache__ .pyc files (should have been gitignored)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-29 14:22:51 +01:00

# Infrastructure — Accessible Video Processing Platform
<!-- SCOPE: infrastructure | owner: ln-115 | generated: 2026-04-29 -->
## Server Inventory
| Server | Role | Resources | Location |
|--------|------|-----------|---------|
| optical-web-1 | Production host | 32GB RAM, 8 CPU | GCP VM |

**Domain:** `ai-sandbox.oliver.solutions`

**SSL:** Wildcard certificate covering `*.ai-sandbox.oliver.solutions`

---
## URL Map
| Endpoint | URL | Served by |
|----------|-----|---------|
| Frontend SPA | `https://ai-sandbox.oliver.solutions/video-accessibility/` | Apache → /var/www/html/video-accessibility |
| Backend API | `https://ai-sandbox.oliver.solutions/video-accessibility-back/` | Apache → localhost:8000 |
| Backend health | `https://ai-sandbox.oliver.solutions/video-accessibility-back/health` | FastAPI |
| Backend docs | `https://ai-sandbox.oliver.solutions/video-accessibility-back/docs` | FastAPI (Swagger) |
| Prometheus metrics | `localhost:8001` | Prometheus client (internal only) |
| WebSocket | `wss://ai-sandbox.oliver.solutions/video-accessibility-back/api/v1/ws/` | Apache mod_proxy_wstunnel |
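
The Maintenance section asks that every URL in the URL Map resolves. A minimal sketch of such a check, using only the Python standard library, might look like the following. The helper names (`build_urls`, `check`) are hypothetical and not part of the codebase, and the sketch covers only the public HTTPS endpoints, not the WebSocket or the internal metrics port.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

BASE = "https://ai-sandbox.oliver.solutions"

# Paths copied from the URL Map table (public HTTPS endpoints only).
URL_MAP = {
    "frontend": "/video-accessibility/",
    "backend_health": "/video-accessibility-back/health",
    "backend_docs": "/video-accessibility-back/docs",
}


def build_urls(base: str = BASE) -> dict[str, str]:
    """Join the base domain with each path from the table."""
    return {name: urljoin(base, path) for name, path in URL_MAP.items()}


def check(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with a 2xx status."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError, HTTPError (4xx/5xx) and timeouts all derive from OSError.
        return False


# Example use (performs real network requests):
#   for name, url in build_urls().items():
#       print(name, "ok" if check(url) else "FAIL", url)
```
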
---
## Docker Compose Services
| Service | Image / Dockerfile | Port (internal) | Port (host) | Depends on |
|---------|-------|----------------|------------|-----------|
| api | backend/Dockerfile | 8000 | 8000 | mongodb, redis |
| worker | backend/Dockerfile (celery cmd) | — | — | mongodb, redis |
| mongodb | mongo:7.0 | 27017 | 27017 | — |
| redis | redis:7.2 | 6379 | 6379 | — |

**Deploy path:** `/opt/video-accessibility/`

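
One way to keep the host-port column above in step with `docker-compose.prod.yml` is to compare it against the parsed compose file. The sketch below assumes the file has already been loaded into a dict (for example with PyYAML's `yaml.safe_load`, which is an assumption about tooling). `host_ports` and `mismatches` are hypothetical helper names. Port ranges such as `8000-8010:8000-8010` are not handled.

```python
# Host ports as documented in the Docker Compose Services table.
EXPECTED_HOST_PORTS = {
    "api": {8000},
    "worker": set(),
    "mongodb": {27017},
    "redis": {6379},
}


def host_ports(compose: dict) -> dict[str, set[int]]:
    """Collect the published host ports for each service."""
    result: dict[str, set[int]] = {}
    for name, svc in compose.get("services", {}).items():
        ports: set[int] = set()
        for entry in svc.get("ports", []):
            if isinstance(entry, dict):
                # Long syntax: {"target": 8000, "published": 8000}
                if "published" in entry:
                    ports.add(int(entry["published"]))
                continue
            # Short syntax: "HOST:CONTAINER" or "IP:HOST:CONTAINER", optional "/proto".
            parts = str(entry).split("/")[0].split(":")
            if len(parts) >= 2:
                ports.add(int(parts[-2]))
        result[name] = ports
    return result


def mismatches(compose: dict, expected=EXPECTED_HOST_PORTS) -> dict:
    """Return {service: (documented, actual)} for every service that differs."""
    actual = host_ports(compose)
    return {
        name: (want, actual.get(name))
        for name, want in expected.items()
        if actual.get(name) != want
    }
```
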
---
## Apache Configuration Requirements
| Module | Required for |
|--------|-------------|
| mod_rewrite | SPA routing (all paths → index.html) |
| mod_proxy | API reverse proxy |
| mod_proxy_http | HTTP proxying |
| mod_proxy_wstunnel | WebSocket proxying |
| mod_headers | CORS + security headers |

**Config snippet location:** `APACHE_DEPLOYMENT.md` (archived) and, on the server, `/etc/apache2/sites-available/ai-sandbox.oliver.solutions-ssl.conf`.

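
To confirm that the required modules are loaded, the module listing printed by `apache2ctl -M` can be compared against the table above. This is a sketch only. It assumes a Debian-style install where the control script is named `apache2ctl`, and the function names are hypothetical. Apache reports `mod_rewrite` as `rewrite_module`, and likewise for the others.

```python
import subprocess

# Module identifiers as printed by `apache2ctl -M` for the table above.
REQUIRED_MODULES = {
    "rewrite_module",
    "proxy_module",
    "proxy_http_module",
    "proxy_wstunnel_module",
    "headers_module",
}


def missing_modules(listing: str, required=REQUIRED_MODULES) -> set[str]:
    """Return the required modules that do not appear in the listing."""
    loaded = {
        line.split()[0]
        for line in listing.splitlines()
        if "_module" in line
    }
    return set(required) - loaded


def loaded_listing() -> str:
    """Run `apache2ctl -M` on the host (assumed command name) and return its output."""
    done = subprocess.run(
        ["apache2ctl", "-M"], capture_output=True, text=True, check=True
    )
    return done.stdout


# Example use on the server: print(missing_modules(loaded_listing()))
```
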
---
## GCS Layout
**Bucket:** `accessible-video` (GCP project: `optical-414516`)

| Path pattern | Contents |
|-------------|---------|
| `{jobId}/source.mp4` | Original uploaded video |
| `{jobId}/en/captions.vtt` | English closed captions |
| `{jobId}/en/ad.vtt` | English audio description VTT |
| `{jobId}/en/ad.mp3` | English audio description audio |
| `{jobId}/{lang}/captions.vtt` | Translated captions (e.g., `fr/`, `de/`) |
| `{jobId}/{lang}/ad.vtt` | Translated audio description VTT |
| `{jobId}/{lang}/ad.mp3` | Translated audio description audio |
| `{jobId}/accessible.mp4` | Final accessible video (burned-in captions + AD audio) |

**Signed URL expiry:** 24 h (V4 signing). Generate signed URLs on demand; they must not be cached or stored in the database.

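
The path patterns and the 24-hour expiry can be illustrated with a short sketch. The path helpers are hypothetical names that mirror the table. `signed_url` uses `Blob.generate_signed_url` from the `google-cloud-storage` client library and assumes service-account credentials are available to the process. It is not necessarily how the backend does it.

```python
from datetime import timedelta

BUCKET = "accessible-video"

# File names per language folder, as listed in the GCS Layout table.
ARTIFACTS = {
    "captions": "captions.vtt",
    "ad_vtt": "ad.vtt",
    "ad_audio": "ad.mp3",
}


def source_path(job_id: str) -> str:
    return f"{job_id}/source.mp4"


def accessible_path(job_id: str) -> str:
    return f"{job_id}/accessible.mp4"


def language_path(job_id: str, lang: str, artifact: str) -> str:
    """Object name for a per-language artifact, e.g. job/fr/ad.mp3."""
    return f"{job_id}/{lang}/{ARTIFACTS[artifact]}"


def signed_url(object_name: str, hours: int = 24) -> str:
    """Create a fresh V4 signed GET URL; call per request, never persist the result."""
    # Imported lazily so the path helpers work without the client library installed.
    from google.cloud import storage

    blob = storage.Client().bucket(BUCKET).blob(object_name)
    return blob.generate_signed_url(
        version="v4",
        expiration=timedelta(hours=hours),
        method="GET",
    )
```
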
---
## External Service Dependencies
| Service | Region / Endpoint | Rate limits / Quotas |
|---------|-----------------|-------------------|
| MongoDB Atlas | Cloud (Atlas cluster) | M10+ tier recommended |
| GCS | us-central1 | Standard storage class |
| Gemini 2.5 Pro | `generativelanguage.googleapis.com` | Per project quota |
| Google Cloud TTS | `texttospeech.googleapis.com` | 1M chars/month free tier |
| Google Cloud Translate | `translate.googleapis.com` | 500k chars/month free tier |
| ElevenLabs | `api.elevenlabs.io` | Subscription-dependent |
| SendGrid | `api.sendgrid.com` | 100 emails/day free tier |
| Microsoft Entra ID | `login.microsoftonline.com` | Tenant-configured |
| GCP Secret Manager | `secretmanager.googleapis.com` | 10k ops/month free |
| Sentry | `sentry.io` | Project DSN |
---
## Network Ports
| Port | Service | Exposed to |
|------|---------|-----------|
| 443 | Apache HTTPS | Public |
| 80 | Apache HTTP (→ 443 redirect) | Public |
| 8000 | FastAPI | localhost only |
| 8001 | Prometheus metrics | localhost only |
| 27017 | MongoDB | Docker network only |
| 6379 | Redis | Docker network only |
---
## Secret Management
**Production:** GCP Secret Manager. Secrets fetched at startup via `core/secrets_config.py`.

**Local:** `.env.local` (gitignored).

**Template:** `.env.prod.example` (checked in, no real values).

| Secret | Where used |
|--------|-----------|
| `JWT_SECRET_KEY` | Access token signing |
| `JWT_REFRESH_SECRET_KEY` | Refresh token signing |
| `GEMINI_API_KEY` | Gemini API |
| `ELEVENLABS_API_KEY` | ElevenLabs TTS |
| `SENDGRID_API_KEY` | Email delivery |
| `GCS_BUCKET_NAME` | File storage |
| `GOOGLE_CLOUD_PROJECT` | GCP project ID |
| `MONGODB_URI` | Atlas connection string |
| `REDIS_URL` | Redis connection |
| `SENTRY_DSN` | Error tracking |
| `DEFAULT_ADMIN_PASSWORD` | Seed script (must not have fallback value) |
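
As an illustration of a fail-fast startup check, the sketch below verifies that every secret in the table is present and non-empty, so that nothing (including `DEFAULT_ADMIN_PASSWORD`) silently falls back to a default. It is not the contents of `core/secrets_config.py`, and all function names here are hypothetical. `fetch_secret` uses the `google-cloud-secret-manager` client and assumes the secret IDs match the variable names.

```python
import os

# Names copied from the secrets table above.
REQUIRED_SECRETS = (
    "JWT_SECRET_KEY",
    "JWT_REFRESH_SECRET_KEY",
    "GEMINI_API_KEY",
    "ELEVENLABS_API_KEY",
    "SENDGRID_API_KEY",
    "GCS_BUCKET_NAME",
    "GOOGLE_CLOUD_PROJECT",
    "MONGODB_URI",
    "REDIS_URL",
    "SENTRY_DSN",
    "DEFAULT_ADMIN_PASSWORD",
)


def missing_secrets(env) -> list[str]:
    """Return the names that are absent or empty in the given mapping."""
    return [name for name in REQUIRED_SECRETS if not env.get(name)]


def require_secrets(env=None) -> None:
    """Raise instead of continuing with a fallback value."""
    env = os.environ if env is None else env
    missing = missing_secrets(env)
    if missing:
        raise RuntimeError(f"Missing secrets: {', '.join(missing)}")


def fetch_secret(project_id: str, secret_id: str, version: str = "latest") -> str:
    """Read one secret value from GCP Secret Manager."""
    # Imported lazily so the validation helpers work without the client library.
    from google.cloud import secretmanager

    client = secretmanager.SecretManagerServiceClient()
    name = f"projects/{project_id}/secrets/{secret_id}/versions/{version}"
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("utf-8")
```
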
---
## GCP Service Account IAM Roles
| Role | Purpose |
|------|---------|
| Storage Admin | GCS read/write + signed URL generation |
| AI Platform User | Gemini API access |
| Cloud Translation User | Translate API access |
| Cloud Text-to-Speech User | TTS API access |
| Secret Manager Secret Accessor | Read secrets at runtime |

**Credentials file:** `./secrets/gcp-credentials.json` (mounted into the Docker containers; file mode `600`).

---
## Maintenance
**Update triggers:** Server migration, new external service, GCS bucket rename, secret rotation.

**Verification:** Every URL in the URL Map resolves, Docker service ports match `docker-compose.prod.yml`, and the GCS bucket name matches the `GCS_BUCKET_NAME` env var.
<!-- END SCOPE: infrastructure -->