michael 792e1323e9 docs: add backend architecture documentation for campaign reuse

Comprehensive technical reference covering system architecture, Celery
queue topology, Cloud Run video offloading, resilience patterns, and
configuration. Includes 6 Mermaid diagrams.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-14 09:40:39 -06:00

34 KiB

Raw Permalink Blame History

Backend Architecture: Pet Love Song Generator

Executive Summary

The Pet Love Song Generator is a campaign microsite that lets users submit their pet's name, photo, and a music vibe preference to receive an AI-generated love song rendered as a vinyl-record music video. The backend receives form submissions, dispatches them to an external AI music generation API (Sonauto), processes webhook callbacks, downloads the generated audio, and creates a composited video — all asynchronously.

The key architectural decision is offloading CPU-intensive video rendering to Google Cloud Run. The main backend handles API serving, database operations, and task orchestration, while each video render gets its own ephemeral Cloud Run container instance. This separation means that a burst of 100 simultaneous video renders doesn't compete with API request handling or webhook processing on the main server. Cloud Run's auto-scaling matches the bursty traffic pattern of a marketing campaign (peaks during social shares, lulls overnight), and its scale-to-zero capability keeps costs proportional to actual usage.

The system was designed to handle campaign-scale bursts — hundreds of concurrent submissions arriving within minutes of a social media push — while remaining cost-efficient during quiet periods. Resilience patterns (retry strategies, two-layer timeouts, fail-fast startup checks) ensure that transient failures in external APIs or infrastructure don't silently lose user submissions.

System Architecture Overview

The application runs as six Docker Compose services on a single host, plus a Cloud Run satellite service for video rendering:

Service	Role	Image
web	FastAPI application server	Custom Dockerfile
celery_worker_default	Default task worker (API calls, audio downloads, DB ops)	Custom Dockerfile
celery_worker_video	Video dispatch worker (HTTP calls to Cloud Run)	Custom Dockerfile
celery_beat	Periodic task scheduler	Custom Dockerfile
db	PostgreSQL 15 database	`postgres:15`
redis	Message broker + result backend + cache	`redis:7-alpine`

The Cloud Run video service runs independently, sharing only the GCS bucket as a data plane.

graph LR
    subgraph "User"
        Browser[Browser / PHP Frontend]
    end

    subgraph "Docker Compose Host"
        FastAPI[FastAPI<br/>web service]
        WorkerDefault[Celery Worker<br/>default queue<br/>concurrency=10]
        WorkerVideo[Celery Worker<br/>video queue<br/>concurrency=100]
        Beat[Celery Beat<br/>scheduler]
        Redis[(Redis 7)]
        Postgres[(PostgreSQL 15)]
    end

    subgraph "Google Cloud"
        CloudRun[Cloud Run<br/>Video Service]
        GCS[(GCS Bucket<br/>vday2026)]
    end

    subgraph "External"
        Sonauto[Sonauto API<br/>Music Generation]
    end

    Browser -->|POST /api/submissions| FastAPI
    Browser -->|GET /api/submissions/.../status| FastAPI
    FastAPI --> Postgres
    FastAPI --> Redis
    FastAPI -->|upload photo| GCS

    Redis --> WorkerDefault
    Redis --> WorkerVideo
    Beat --> Redis

    WorkerDefault -->|send prompt| Sonauto
    WorkerDefault -->|download audio| GCS
    Sonauto -->|webhook callback| FastAPI

    WorkerVideo -->|POST /generate| CloudRun
    CloudRun -->|read audio, photo| GCS
    CloudRun -->|write video| GCS

    Browser -->|stream video| GCS

Request Lifecycle

A submission moves through three distinct phases: Submission, Generation, and Delivery.

Phase 1: Submission

User fills out the form (pet name, owner name, pet type, music vibe, optional photo)
Browser sends POST /api/submissions with form data and base64-encoded cropped image
FastAPI validates input, runs server-side profanity check on pet/owner names
If a photo is provided, it's decoded from base64 and uploaded to GCS (uploads/{session_id}.jpg)
A Submission record is created in PostgreSQL with status pending
A send_to_sonauto Celery task is dispatched immediately
Response returns session_id to the browser for polling

Phase 2: Generation

The send_to_sonauto task constructs a prompt from pet-specific templates and sends it to the Sonauto API
The submission status moves to processing and a per-submission timeout task is scheduled
Sonauto generates the song asynchronously and sends webhook callbacks:
- GENERATING_STREAMING_READY — the audio stream is available for early playback (the frontend can start playing before the full song is done)
- SUCCESS — full generation complete, triggers the post-webhook processing chain
- FAILURE — marks the submission as failed
On SUCCESS, a Celery chain is triggered: fetch_generation_details → download_audio → create_video
The chain fetches song details from the Sonauto API, downloads the MP3 to GCS, then dispatches video creation
Video creation sends a request to Cloud Run, which renders the vinyl record animation using FFmpeg and uploads the result to GCS

Phase 3: Delivery

The frontend polls GET /api/submissions/{session_id}/status on an interval
Once status is success, the frontend redirects to the results page
The results page fetches GET /api/results/{session_id} which returns the video URL, lyrics, record image, and metadata
The video is served directly from GCS public URLs; downloads use signed URLs with Content-Disposition: attachment

Streaming-ready optimization: When the GENERATING_STREAMING_READY webhook arrives, the status endpoint includes streaming_ready: true and the Sonauto task_id. The frontend can begin audio playback via Sonauto's streaming endpoint while the full song and video are still being generated. This reduces perceived wait time significantly.

sequenceDiagram
    participant User as Browser
    participant API as FastAPI
    participant GCS as GCS Bucket
    participant DB as PostgreSQL
    participant Redis as Redis
    participant Worker as Celery Worker<br/>(default)
    participant Sonauto as Sonauto API
    participant VideoW as Celery Worker<br/>(video)
    participant CR as Cloud Run

    User->>API: POST /api/submissions
    API->>GCS: Upload pet photo
    API->>DB: INSERT submission (pending)
    API->>Redis: Enqueue send_to_sonauto
    API-->>User: { session_id }

    Worker->>Redis: Dequeue task
    Worker->>Sonauto: POST /generations/v3
    Worker->>DB: UPDATE (processing)
    Worker->>Redis: Schedule timeout check

    Note over Sonauto: AI generates song...

    Sonauto-->>API: Webhook (STREAMING_READY)
    API->>DB: SET streaming_ready_at

    User->>API: GET /status (polling)
    API-->>User: { streaming_ready: true }
    Note over User: Begin audio playback

    Sonauto-->>API: Webhook (SUCCESS)
    API->>DB: SET received_from_LLM
    API->>Redis: Enqueue chain

    Worker->>Sonauto: GET /generations/{task_id}
    Worker->>DB: Store lyrics, song URL
    Worker->>GCS: Download & upload audio
    Worker->>DB: SET generated_song_path

    Worker->>Redis: Pass to video queue
    VideoW->>CR: POST /generate
    CR->>GCS: Download audio + photo
    Note over CR: FFmpeg renders video (~10s)
    CR->>GCS: Upload video + composite
    CR-->>VideoW: { video_blob_path }
    VideoW->>DB: SET status=success

    User->>API: GET /status (polling)
    API-->>User: { status: success }
    User->>API: GET /results/{session_id}
    API-->>User: { video_url, lyrics, ... }
    User->>GCS: Stream video

Celery Task Chain Architecture

After the Sonauto webhook signals SUCCESS, a three-task Celery chain processes the result. Each task receives a dict from its predecessor and returns a dict consumed by the next.

Chain: `fetch_generation_details → download_audio → create_video`

Task 1 — fetch_generation_details

Calls GET /generations/{task_id} on the Sonauto API
Extracts song URL, lyrics, and status from the response
Stores the full API response, lyrics, and song URL in the database
Returns: { session_id, song_url, lyrics }

Task 2 — download_audio

Downloads the MP3 from the Sonauto CDN URL
Uploads it to GCS at audio/{session_id}.mp3
Updates the database with the blob path
Returns: { session_id, audio_blob_path, lyrics }

Task 3 — create_video

Routes to the video queue (separate worker pool)
If CLOUD_RUN_VIDEO_URL is configured: dispatches to Cloud Run via HTTP POST
Otherwise: falls back to local FFmpeg rendering (development mode)
Marks the submission as success or fail based on the outcome
Returns: { session_id, video_blob_path }

Retry Configuration

All three chain tasks use the same retry pattern:

@shared_task(
    autoretry_for=(requests.RequestException,),
    retry_backoff=True,          # Exponential backoff
    retry_backoff_max=60,        # Cap at 60 seconds
    retry_kwargs={"max_retries": 3},
)

Additionally, send_to_sonauto tracks retries at the application level via a retry_count column. If the count reaches MAX_RETRIES (default: 3), the submission is marked as fail even before Celery exhausts its own retry budget — preventing indefinite requeuing.

graph LR
    Webhook[Webhook<br/>SUCCESS] --> Fetch[fetch_generation<br/>_details]
    Fetch --> Download[download_audio]
    Download --> Video[create_video]
    Video --> Success[status = success]

    Fetch -->|RequestException<br/>3 retries| Fetch
    Download -->|RequestException<br/>3 retries| Download
    Video -->|RequestException<br/>2 retries| Video

    Fetch -->|max retries exceeded| Fail[status = fail]
    Download -->|max retries exceeded| Fail
    Video -->|max retries exceeded| Fail

Queue Topology & Concurrency Model

This is the most critical load-handling pattern in the system. Two separate Celery queues with independent worker pools prevent video dispatch tasks from starving API-calling tasks during traffic bursts.

Why Queue Separation Matters

Video generation takes approximately 10 seconds per submission. During a traffic burst — say, 200 submissions arriving within a few minutes after a social media push — the system needs to:

Continue accepting and acknowledging Sonauto API calls (send_to_sonauto)
Continue processing webhook callbacks (the chain's first two tasks)
Dispatch up to 100 video renders in parallel to Cloud Run

Without queue separation, all these tasks would compete for the same worker slots. If the default worker had 10 concurrency slots, only 10 video dispatches could run at once — and while those slots are occupied waiting for Cloud Run HTTP responses (~10s each), no API calls or audio downloads could proceed. The entire pipeline would bottleneck.

Queue Configuration

"celery" queue (default) — concurrency: 10

send_to_sonauto — outbound API calls to Sonauto
fetch_generation_details — fetch song metadata from Sonauto
download_audio — download MP3 from CDN, upload to GCS
check_submission_timeout — per-submission timeout checks
check_timeouts — periodic safety-net sweep
cleanup_old_files — daily file retention cleanup

These tasks involve API calls, file downloads, and database operations. Concurrency of 10 is sufficient because each task completes in seconds, and the external APIs have their own rate limits.

"video" queue — concurrency: 100

create_video — HTTP dispatch to Cloud Run (or local FFmpeg fallback)

This worker does almost no local work. It sends an HTTP POST to Cloud Run and waits for the response. The actual CPU work happens on Cloud Run. With 100 concurrency, the system can have 100 Cloud Run instances rendering videos simultaneously. The high concurrency is safe because the worker is purely I/O-bound — it's just holding open HTTP connections.

Worker-Level Settings

# celery_app.py
"task_acks_late": True,
"worker_prefetch_multiplier": 1,
"task_routes": {
    "tasks.workers.create_video": {"queue": "video"},
},
"task_default_queue": "celery",

worker_prefetch_multiplier=1: Each worker only grabs one task at a time from the broker. This ensures fair distribution across workers — without it, a single worker could prefetch a batch of tasks and create an uneven load.
task_acks_late=True: Tasks are acknowledged only after completion, not when received. If a worker crashes mid-task, the message returns to the queue and another worker picks it up. This is the crash-recovery mechanism.

graph TB
    Redis[(Redis Broker)]

    Redis --> CeleryQ["celery" queue]
    Redis --> VideoQ["video" queue]

    CeleryQ --> DefaultPool[Default Worker Pool<br/>concurrency = 10<br/>API calls, downloads, DB ops]
    VideoQ --> VideoPool[Video Worker Pool<br/>concurrency = 100<br/>HTTP dispatch only]

    DefaultPool -->|API calls| Sonauto[Sonauto API]
    DefaultPool -->|upload/download| GCS[(GCS)]
    DefaultPool -->|read/write| DB[(PostgreSQL)]

    VideoPool -->|POST /generate| CloudRun[Cloud Run<br/>auto-scaling<br/>video instances]
    CloudRun -->|read/write| GCS

Cloud Run Video Offloading

Why Cloud Run?

FFmpeg video rendering is CPU-intensive. Each video takes approximately 10 seconds to render: generating ~45 frames for one rotation cycle, compositing four image layers per frame (background, pet photo, vinyl record, needle), then encoding to H.264. On the main server, running even 5 concurrent FFmpeg processes would consume significant CPU and memory, competing with FastAPI request handling and Celery task execution.

Cloud Run solves this in three ways:

Horizontal auto-scaling: Each video render request gets its own container instance. Cloud Run provisions instances on demand, so 100 simultaneous renders don't impact each other. There's no shared CPU to contend for.
Isolation from the main backend: The Celery video worker's job is reduced to a single HTTP POST — it doesn't consume CPU for rendering. The main server remains responsive for API requests, webhook processing, and database operations regardless of video rendering load.
Scale-to-zero economics: Campaign traffic follows a burst pattern (social shares trigger waves of submissions, then activity drops). Cloud Run scales to zero when idle, so there's no cost during lulls. This is far more cost-efficient than provisioning dedicated compute for peak load.

Data Flow

GCS serves as the shared data plane between the main backend and Cloud Run. There is no direct file transfer between services:

The Celery worker sends { session_id, audio_blob_path, photo_blob_path } to Cloud Run
Cloud Run downloads the audio and photo from GCS
Cloud Run runs FFmpeg to generate the video (720x1280px MP4, 15fps, vinyl record animation)
Cloud Run uploads the video and composite record image to GCS
Cloud Run returns { video_blob_path } to the Celery worker
The Celery worker updates the database with the video path and marks the submission as success

Configuration

CLOUD_RUN_VIDEO_URL=https://video-generator-xxx.a.run.app
VIDEO_SERVICE_API_KEY=shared-secret-for-auth
CLOUD_RUN_VIDEO_TIMEOUT=300  # 5 minutes safety ceiling

The 300-second timeout is a safety ceiling — videos normally render in ~10 seconds, but the timeout accounts for cold starts, GCS download time, and edge cases with very long audio tracks.

Local Development Fallback

When CLOUD_RUN_VIDEO_URL is not set, the create_video task falls back to local FFmpeg rendering. This allows development and testing without Cloud Run infrastructure. The local path downloads files from GCS to temp files, runs the video generator, uploads results, and cleans up.

Resilience & Failure Handling

8.1 Retry Strategy

All API-calling tasks use Celery's built-in exponential backoff retry:

Trigger: autoretry_for=(requests.RequestException,) — any HTTP error triggers a retry
Backoff: Exponential with retry_backoff=True, capped at retry_backoff_max=60 seconds
Max retries: 3 for API/download tasks, 2 for video creation
Application-level tracking: The send_to_sonauto task also increments a retry_count column on the submission. If this count reaches MAX_RETRIES (configurable, default 3), the submission is marked as fail independently of Celery's retry mechanism. This prevents scenarios where Celery's retry counter resets (e.g., after a worker restart) from causing infinite retries.

8.2 Two-Layer Timeout Strategy

Submissions that are sent to Sonauto but never receive a webhook callback need to be detected and marked as failed. The system uses two complementary timeout mechanisms:

Layer 1 — Event-based per-submission countdown (primary)

When send_to_sonauto successfully calls the Sonauto API, it immediately schedules a check_submission_timeout task with a countdown equal to WEBHOOK_TIMEOUT_MINUTES (default: 10 minutes):

check_submission_timeout.apply_async(
    args=[session_id],
    countdown=settings.WEBHOOK_TIMEOUT_MINUTES * 60,
)

This task fires at the exact timeout moment for each submission. If the webhook has already arrived, it's a no-op. If the submission is still in processing state, it's marked as fail with LLM_status = "timeout".

Layer 2 — Periodic safety-net sweep (fallback)

Celery Beat runs check_timeouts every 30 minutes. This queries for any submissions where sent_to_LLM is older than the timeout threshold and received_from_LLM is still NULL. This catches edge cases where the per-submission timeout task itself failed to execute (e.g., the worker restarted, Redis lost the message).

graph TD
    Submit[send_to_sonauto<br/>sends to API] --> Schedule[Schedule<br/>check_submission_timeout<br/>countdown = 10 min]

    Schedule --> Timer{10 minutes<br/>elapsed}
    Timer -->|webhook arrived| NoOp[No-op<br/>already processed]
    Timer -->|still processing| MarkFail1[Mark as fail<br/>LLM_status = timeout]

    Beat[Celery Beat<br/>every 30 min] --> Sweep[check_timeouts<br/>query stale submissions]
    Sweep -->|found stale| MarkFail2[Mark as fail<br/>LLM_status = timeout]
    Sweep -->|none found| Clean[No action needed]

    MarkFail1 --> Failed[Submission<br/>status = fail]
    MarkFail2 --> Failed

8.3 Fail-Fast Startup

The FastAPI application verifies GCS connectivity on startup:

@app.on_event("startup")
async def startup_event():
    from app.services import storage
    storage.check_connectivity()  # Raises if bucket inaccessible

If the GCS bucket is inaccessible (wrong credentials, bucket doesn't exist, network issue), the application refuses to start. This prevents a scenario where the app accepts submissions but silently fails to store images, leading to broken submissions that can never produce videos.

8.4 Health Monitoring

The /api/health endpoint performs three checks:

Check	Method	Unhealthy if
Redis	`redis_client.ping()`	Connection refused or timeout
PostgreSQL	`SELECT 1` via SQLAlchemy	Connection refused or query fails
Celery workers	`celery_app.control.inspect().active()`	No active workers detected

If any check fails, the endpoint returns HTTP 503. Load balancers use this to route traffic away from unhealthy instances.

8.5 Content Safety

Two layers of content filtering protect against inappropriate submissions:

Text filtering (client + server):

Client-side: JavaScript profanity check provides instant feedback
Server-side: The contains_profanity() function uses the better-profanity library plus a custom banned-words list. This is the authoritative check — client-side filtering can be bypassed.

Image classification (Gemini AI):

Configurable via IMAGE_SAFETY_METHOD (default: gemini, alternative: nudenet)
Gemini receives the image with a strict classification prompt that only accepts photos with a pet/animal as the primary subject
Fail-closed on content filter: If Gemini's own safety filter blocks the response, the image is rejected
Fail-open on network error: If Gemini is unreachable (network error, auth failure), the image is allowed through — availability is prioritized over a rare edge case

8.6 Rate Limiting

Each user is identified by a cookie_id (generated on first submission, stored in localStorage). The server enforces a maximum of FORM_SUBMIT_RETRY (default: 10) submissions per cookie_id by counting existing submissions in the database. The client also enforces this limit for immediate feedback, but the server-side check is authoritative.

8.7 Database Connection Pooling

engine = create_engine(
    settings.DATABASE_URL,
    pool_pre_ping=True,    # Verify connections before use
    pool_size=10,          # Base pool size
    max_overflow=20,       # Additional connections under load
)

pool_pre_ping=True: Before using a pooled connection, SQLAlchemy sends a lightweight ping. If the connection is stale (database restarted, network blip), it's discarded and a fresh one is created. This prevents "connection reset" errors after idle periods.
pool_size=10: Maintains 10 persistent connections. Sufficient for the FastAPI server and Celery workers under normal load.
max_overflow=20: Up to 20 additional connections can be created during traffic bursts, for a total of 30 concurrent connections. These overflow connections are closed after use.

8.8 Data Retention

A Celery Beat task runs daily at 3:00 AM UTC:

Queries submissions older than FILE_RETENTION_DAYS (default: 30 days)
Deletes associated files from GCS: uploaded photo, generated audio, generated video, and composite record image
Logs the count of deleted files and processed submissions

The database records themselves are not deleted — only the associated files. This preserves the audit trail while reclaiming storage.

Data Model

The system uses a single submissions table. Fields are grouped by purpose:

Identity & Tracking

Column	Type	Purpose
`session_id`	`String(255)` PK	CUID2-generated unique identifier
`cookie_id`	`String(255)` indexed	Client-side user identifier for rate limiting
`created_at`	`DateTime`	Submission timestamp

Form Data

Column	Type	Purpose
`owner_name`	`String(100)`	Pet owner's name
`pet_name`	`String(100)`	Pet's name
`pet_type`	`String(50)`	Pet species (dog, cat, etc.)
`music_vibe`	`String(50)`	Selected music style
`photo_path`	`String(500)`	GCS blob path to uploaded pet photo

Processing State

Column	Type	Purpose
`entry_status`	`String(50)`	Current status: `pending` → `processing` → `success` / `fail`
`retry_count`	`Integer`	Application-level retry counter for API calls

External API Tracking

Column	Type	Purpose
`sent_to_LLM`	`DateTime`	When the prompt was sent to Sonauto
`LLM_task_id`	`String(255)` indexed	Sonauto's task identifier for webhook correlation
`received_from_LLM`	`DateTime`	When the webhook callback was received
`LLM_response`	`Text`	Primary song URL from Sonauto
`LLM_full_response`	`Text`	Full JSON response for debugging
`LLM_status`	`String(50)`	Sonauto-specific status (success, fail, timeout)

Generated Content

Column	Type	Purpose
`lyrics`	`Text`	AI-generated song lyrics
`generated_song_path`	`Text`	GCS blob path to MP3 audio
`generated_video_path`	`String(500)`	GCS blob path to MP4 video

Timing & Streaming

Column	Type	Purpose
`streaming_ready_at`	`DateTime`	When Sonauto's streaming endpoint became available
`video_creation_start`	`DateTime`	When video rendering began
`video_creation_end`	`DateTime`	When video rendering completed

Indexes

ix_submissions_cookie_id on cookie_id — fast rate-limit lookups
ix_submissions_llm_task_id on LLM_task_id — fast webhook correlation by Sonauto task ID

Infrastructure (Docker Compose)

Six services are defined in docker-compose.yml. The four application services (web, two workers, beat) share the same custom Dockerfile. Data stores use official images.

Service Configuration

web — FastAPI application server

Ports: 8000:8000
Depends on: db (healthy), redis (started)
Volumes: GCS credentials (read-only)

celery_worker_default — Default queue worker

Command: celery -A tasks.celery_app worker --concurrency=10 --queues=celery
Handles: API calls, audio downloads, timeout checks, cleanup
Depends on: db (healthy), redis (started)

celery_worker_video — Video queue worker

Command: celery -A tasks.celery_app worker --concurrency=100 --queues=video
Handles: HTTP dispatch to Cloud Run only
Additional env: CLOUD_RUN_VIDEO_URL, VIDEO_SERVICE_API_KEY
Depends on: db (healthy), redis (started)

celery_beat — Periodic task scheduler

Command: celery -A tasks.celery_app beat
Schedules: check_timeouts (every 30 min), cleanup_old_files (daily 3 AM)
Depends on: db (healthy), redis (started)

db — PostgreSQL 15

Health check: pg_isready -U pah -d pah (interval 5s, 5 retries)
Persistent volume: postgres_data

redis — Redis 7 Alpine

Persistent volume: redis_data

Dependency Graph

graph TB
    subgraph "Application Services"
        Web[web<br/>FastAPI :8000]
        WorkerDefault[celery_worker_default<br/>concurrency=10]
        WorkerVideo[celery_worker_video<br/>concurrency=100]
        Beat[celery_beat<br/>scheduler]
    end

    subgraph "Data Stores"
        DB[(db<br/>PostgreSQL 15<br/>:5432)]
        Redis[(redis<br/>Redis 7<br/>:6379)]
    end

    subgraph "External"
        CloudRun[Cloud Run<br/>Video Service]
        GCS[(GCS Bucket)]
    end

    Web -->|depends_on: healthy| DB
    Web -->|depends_on: started| Redis
    WorkerDefault -->|depends_on: healthy| DB
    WorkerDefault -->|depends_on: started| Redis
    WorkerVideo -->|depends_on: healthy| DB
    WorkerVideo -->|depends_on: started| Redis
    Beat -->|depends_on: healthy| DB
    Beat -->|depends_on: started| Redis

    WorkerVideo -.->|HTTP POST| CloudRun
    CloudRun -.->|read/write| GCS
    Web -.->|upload| GCS
    WorkerDefault -.->|upload/download| GCS

All application services receive the same base environment variables (DATABASE_URL, REDIS_URL, CELERY_BROKER_URL, SONAUTO_API_KEY, WEBHOOK_BASE_URL) plus GCS credentials mounted as a read-only volume. The video worker additionally receives CLOUD_RUN_VIDEO_URL and VIDEO_SERVICE_API_KEY.

Monitoring & Observability

Health Endpoint

GET /api/health returns a JSON payload with the status of Redis, PostgreSQL, and Celery workers. Returns 200 when all checks pass, 503 otherwise. Designed for load balancer health checks.

Admin Queue Status

GET /api/admin/queue-status returns:

Count of pending submissions
Count of processing submissions
Sonauto API credit balance (cached in Redis for 15 minutes; fetched live on cache miss)

Admin Data Endpoint

GET /api/admin/data implements DataTables server-side processing (SSP) with search, sorting, and pagination. Supports filtering across owner_name, pet_name, and session_id.

Structured Logging

All components use Python's logging module with a consistent format:

%(asctime)s - %(name)s - %(levelname)s - %(message)s

Key events logged at INFO level:

Submission creation and dispatch
Sonauto API calls (send, fetch, response)
Webhook receipt and chain triggering
GCS uploads/downloads with blob paths and byte counts
Video generation start/completion with timing
Timeout and failure events (WARNING level)

Database Timestamp Audit Trail

Each submission records timestamps at critical pipeline stages:

created_at — when the user submitted the form
sent_to_LLM — when the prompt was sent to Sonauto
streaming_ready_at — when early audio playback became available
received_from_LLM — when the webhook arrived
video_creation_start / video_creation_end — video render duration

This allows post-hoc analysis of pipeline latency at each stage.

Storage Architecture (GCS)

Bucket Structure

All files reside in a single GCS bucket (vday2026) organized by type:

vday2026/
├── uploads/        # Pet photos (JPEG, ~50-200 KB each)
│   └── {session_id}.jpg
├── audio/          # Generated songs (MP3, ~3-5 MB each)
│   └── {session_id}.mp3
├── video/          # Generated videos (MP4, ~5-15 MB each)
│   └── {session_id}.mp4
└── images/         # Composite record images (PNG)
    └── {session_id}.png

Naming Convention

All files use {session_id} as the filename — a CUID2 identifier that is unique, URL-safe, and collision-resistant. This eliminates the need for name deduplication logic.

URL Patterns

Public URLs for streaming/display: https://storage.googleapis.com/vday2026/{folder}/{session_id}.{ext}
Signed download URLs for the download button: Time-limited (60 min), include Content-Disposition: attachment header with a human-readable filename ({pet-name}-love-song.mp4)

Client Architecture

The GCS client uses a lazy-loaded singleton pattern:

_client: Client | None = None
_bucket: Bucket | None = None

def get_client() -> Client:
    global _client
    if _client is None:
        _client = storage.Client.from_service_account_json(...)
    return _client

This avoids repeated authentication handshakes while still supporting both service account credentials (Docker/production) and application default credentials (Cloud Run).

Reusability Guide

This architecture can be adapted for future campaigns with different external APIs and post-processing pipelines. Key adaptation points:

Swap the external API: Replace send_to_sonauto and the webhook handler with calls to any async API that supports webhook callbacks. The chain pattern (fetch details → download artifact → post-process) is generic.
Swap post-processing: Replace the video generation step with any CPU-intensive output creation (image collage, PDF generation, etc.). The Cloud Run offloading pattern works for any workload that benefits from horizontal scaling.
Reuse queue topology: The two-queue model (default + heavy-processing) is reusable whenever you have a mix of fast tasks (API calls, DB ops) and slow I/O-bound dispatch tasks. Adjust concurrency numbers based on the external service's capacity.
Reuse timeout strategy: The two-layer timeout (event-based countdown + periodic sweep) applies to any webhook-dependent workflow. Adjust WEBHOOK_TIMEOUT_MINUTES and the Beat sweep interval to match the external API's expected response time.
Adapt rate limiting: The cookie-based rate limiting is simple but effective for campaign microsites where user accounts don't exist. For authenticated flows, replace cookie_id with user IDs.
Adapt content safety: The pluggable image safety backend (Gemini vs NudeNet) pattern can be extended with additional methods. The profanity filter's custom banned-words list can be swapped per campaign.
Config-driven via env vars: All external URLs, API keys, timeouts, concurrency limits, and retention periods are configurable via environment variables. No code changes needed for infrastructure differences between campaigns.
GCS as the data plane: Using object storage as the shared data plane between services (main backend ↔ Cloud Run) avoids the complexity of direct file transfers, shared volumes, or service-to-service streaming. Any service that can read/write to the bucket can participate in the pipeline.

Appendix: Configuration Reference

All settings are defined in backend/app/config.py using Pydantic Settings. Values are loaded from environment variables, falling back to defaults.

Variable	Default	Description
`DATABASE_URL`	`postgresql://pah:pah_password@localhost:5432/pah`	PostgreSQL connection string
`REDIS_URL`	`redis://localhost:6379/0`	Redis connection URL
`CELERY_BROKER_URL`	`redis://localhost:6379/0`	Celery message broker URL
`CELERY_RESULT_BACKEND`	`redis://localhost:6379/0`	Celery result storage URL
`SONAUTO_API_URL`	`https://api.sonauto.ai/v1`	Sonauto API base URL
`SONAUTO_API_KEY`	(empty)	Sonauto API authentication key
`WEBHOOK_BASE_URL`	`http://localhost:8000`	Base URL for webhook callbacks
`MAX_CONCURRENT_REQUESTS`	`10`	Maximum concurrent API requests
`MAX_RETRIES`	`3`	Application-level retry limit
`FORM_SUBMIT_RETRY`	`10`	Maximum submissions per cookie_id
`MIN_AVAILABLE_CREDITS`	`5000`	Minimum Sonauto credits before alert
`WEBHOOK_TIMEOUT_MINUTES`	`10`	Minutes before a submission is considered timed out
`API_REQUEST_TIMEOUT_SECONDS`	`30`	HTTP timeout for outbound API calls
`IMAGE_SAFETY_METHOD`	`gemini`	Image classification backend (`gemini` or `nudenet`)
`GEMINI_API_KEY`	(empty)	Google Gemini API key for image safety
`CLOUD_RUN_VIDEO_URL`	(empty)	Cloud Run video service URL (empty = local FFmpeg)
`VIDEO_SERVICE_API_KEY`	(empty)	Shared secret for Cloud Run authentication
`CLOUD_RUN_VIDEO_TIMEOUT`	`300`	Cloud Run request timeout in seconds
`GCS_PROJECT_ID`	`holiday-project-india`	Google Cloud project ID
`GCS_BUCKET_NAME`	`vday2026`	GCS bucket name
`GCS_CREDENTIALS_PATH`	`google_cloud_storage.json`	Path to GCS service account JSON
`GCS_UPLOADS_FOLDER`	`uploads`	GCS folder for pet photos
`GCS_AUDIO_FOLDER`	`audio`	GCS folder for generated audio
`GCS_VIDEO_FOLDER`	`video`	GCS folder for generated video
`GCS_IMAGES_FOLDER`	`images`	GCS folder for composite images
`STORAGE_BASE`	`../storage`	Local storage path (dev fallback)
`FILE_RETENTION_DAYS`	`30`	Days before files are cleaned up
`CORS_ORIGINS`	`localhost:8000, :8080`	Allowed CORS origins

34 KiB Raw Permalink Blame History