obsidian/01 Projects/amazon-transcreation/Amazon Transcreation.md
2026-04-29 14:50:31 +01:00


name: Amazon Transcreation
client: Amazon
status: active
server: optical-dev
tech: Next.js 14, FastAPI, Celery, PostgreSQL 16, Redis 7, Claude API, Python 3.12, TypeScript
local_path: /Users/ai_leed/Documents/Projects/Oliver/amazon-transcreation
deploy: ./deploy.sh
url: https://optical-dev.oliver.solutions/amazon-transcreation
tags: project
created: 2026-04-15
port: 8040
db: PostgreSQL 16

Overview

amazon-transcreation is an AI-powered SaaS platform that adapts Amazon marketing copy across 12 European locales using Claude LLM agents. It replaces a manual LibreChat workflow with a structured job wizard, one-click multi-locale parallel processing, real-time monitoring, in-app review, and XLSX export. Built for Amazon via Oliver Agency, the platform runs in production on optical-dev.oliver.solutions. It handles transcreation (cultural and linguistic adaptation) of marketing campaigns in three steps: a deterministic validation pipeline, a single LLM agent call, and a deterministic formatting pipeline.

Tech Stack

  • Frontend: Next.js 14 (SSR/SPA), TypeScript, React 18
  • Backend: FastAPI (uvicorn), Python 3.12, Pydantic (schemas and validation) + SQLAlchemy async (ORM)
  • Database: PostgreSQL 16 (11 tables: jobs, locales, outputs, TM registry, users, etc.)
  • Infrastructure: Docker Compose (dev & prod), Apache reverse proxy (prod), Redis 7 task broker
  • AI/ML: Anthropic Claude API (single-agent V25 prompt, 899-line JSON system message)
  • Key libraries: Celery (async tasks), Alembic (migrations), python-multipart (file upload), openpyxl (XLSX parsing/generation), anthropic SDK

Architecture

Three-tier monorepo: Next.js frontend (Node.js container) → FastAPI backend (Python 3.12 uvicorn) → Celery worker pool (4 concurrent tasks). PostgreSQL 16 holds all relational data (jobs, locale instances, outputs, user accounts, TM registry). Redis 7 serves dual purpose: Celery broker (task queue) and pub/sub for WebSocket progress updates. All AI processing runs asynchronously in isolated Celery tasks—each locale in a job spawns one task, up to 4 run in parallel.
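The fan-out described above (one task per locale, at most four running at once) can be illustrated without Celery. The following is a minimal sketch using a standard-library thread pool as a stand-in for the worker pool. `process_locale`, `run_job`, and the locale codes are hypothetical names chosen for illustration, not identifiers from the repository.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL = 4  # mirrors the "4 concurrent tasks" worker pool size


def process_locale(job_id: str, locale: str) -> dict:
    """Hypothetical stand-in for the per-locale Celery task."""
    return {"job_id": job_id, "locale": locale, "status": "done"}


def run_job(job_id: str, locales: list[str]) -> list[dict]:
    """Submit one task per locale; at most MAX_PARALLEL execute concurrently."""
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
        futures = [pool.submit(process_locale, job_id, loc) for loc in locales]
        # Collect results in submission order, not completion order.
        return [future.result() for future in futures]


if __name__ == "__main__":
    for row in run_job("job-1", ["de-DE", "fr-FR", "it-IT", "es-ES", "nl-NL"]):
        print(row)
```

With five locales and a pool of four, the fifth task waits until a worker is free, which is the queueing behaviour a job with more than four locales would show.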

Pipeline state machine (single-agent default):

  1. INIT — Task starts, locale status → processing
  2. VALIDATE — Deterministic: parse uploaded XLSX, load Translation Memory + reference files (glossary, blacklist, TOV guidelines), build PipelineContext
  3. SINGLE_AGENT — One Claude API call with full V25 system prompt; handles TM matching, ranking, transcreation, compliance in one go (~24 min per locale)
  4. FORMAT — Deterministic: generate output XLSX (Tab 1: transcreation output, Tab 2: linguistic summary with rationales)
  5. DONE/ERROR — Terminal state; update locale_instances.status, write error logs if needed
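
The stage order above can be sketched as a small state machine. This is an illustrative outline only: the `Stage` enum reuses the stage names from the list, while `NEXT_STAGE`, `run_pipeline`, and the handler mapping are assumptions made for the example and may not match how the project's task code is organised.

```python
from collections.abc import Callable
from enum import Enum


class Stage(str, Enum):
    INIT = "INIT"
    VALIDATE = "VALIDATE"
    SINGLE_AGENT = "SINGLE_AGENT"
    FORMAT = "FORMAT"
    DONE = "DONE"
    ERROR = "ERROR"


# Happy-path transitions; DONE and ERROR are terminal and have no successor.
NEXT_STAGE = {
    Stage.INIT: Stage.VALIDATE,
    Stage.VALIDATE: Stage.SINGLE_AGENT,
    Stage.SINGLE_AGENT: Stage.FORMAT,
    Stage.FORMAT: Stage.DONE,
}


def run_pipeline(handlers: dict[Stage, Callable[[], None]]) -> tuple[Stage, list[Stage]]:
    """Run each stage's handler in order; any exception moves the run to ERROR."""
    stage = Stage.INIT
    visited = [stage]
    while stage in NEXT_STAGE:
        try:
            handlers.get(stage, lambda: None)()  # stages without a handler are no-ops
            stage = NEXT_STAGE[stage]
        except Exception:
            stage = Stage.ERROR
        visited.append(stage)
    return stage, visited
```

Keeping the transitions in one table makes it easy to see that every stage either advances or fails into the terminal ERROR state.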

Legacy 6-agent pipeline (feature-flagged via USE_SINGLE_AGENT=false): splits work across TM_RETRIEVE → RANK → TRANSCREATE → COMPLY agents, but single-agent is default and recommended.

Storage: Translation Memory (TM) and reference files live in ./storage/amazon/{tm,ref}/ (git-tracked). At runtime, uploaded source XLSX files are temporarily stored and parsed.

Authentication: JWT tokens (HS256, 8-hour expiry). Users authenticate via Azure AD MSAL SSO (token exchange) or local login. Roles: admin, linguist (reviewer), viewer.

Browser
  ↓ HTTP REST / WebSocket poll
Next.js 14 (SSR, React)
  ↓ API calls
Apache reverse proxy (prod) / dev direct
  ↓
FastAPI Backend (8000 internal, 8040 dev/127.0.0.1:8040 prod)
  ├─ JWT auth, job CRUD, output endpoints
  ├─ Celery task dispatch → Redis broker
  └─ WebSocket heartbeat
       ↓
  Celery Worker Pool (4 concurrent)
       ├─ Import agent_single (main pipeline)
       ├─ Call Claude API (Anthropic SDK)
       ├─ Parse/generate XLSX (openpyxl)
       └─ Write to PostgreSQL 16
            ↓
  PostgreSQL 16 (5432 internal, 5492 dev / 127.0.0.1:5492 prod)
       └─ 11 tables: jobs, locale_instances, outputs, users, etc.

Redis 7 (6379 internal, 6389 dev / 127.0.0.1:6389 prod)
  ├─ Celery task queue
  └─ Pub/sub progress updates

Dev Commands

# Clone and initial setup
git clone git@bitbucket.org:zlalani/amazon-transcreation.git
cd amazon-transcreation
cp .env.example .env
# Edit .env: set ANTHROPIC_API_KEY=sk-ant-... and JWT_SECRET_KEY=random-string

# Start all services (db, redis, backend, celery_worker, frontend)
make up

# Run database migrations (Alembic)
make migrate

# Seed initial data (Amazon client, 3 test users)
make seed

# Access dev servers
# Frontend: http://localhost:3000
# Backend API: http://localhost:8040/api/v1
# API docs: http://localhost:8040/docs

# Other useful commands
make test              # pytest tests/ -v
make shell             # bash inside backend container
make logs              # tail all container logs
make restart           # restart backend + celery after code changes
make db-shell          # psql shell
make redis-cli         # Redis CLI
make build             # rebuild Docker images (after requirements.txt or package.json change)

Deployment

  • Server: optical-dev.oliver.solutions
  • Deploy command: ./deploy.sh (regular update) or ./deploy.sh --init (first-time setup)
  • URL: https://optical-dev.oliver.solutions/amazon-transcreation
  • Port: 8040 (backend API, bound to 127.0.0.1 behind Apache)
  • Service: Managed via Docker Compose, no systemd service
  • Local path: /Users/ai_leed/Documents/Projects/Oliver/amazon-transcreation

Deploy script:

  • ./deploy.sh --init — creates .env, builds images, migrates DB, seeds data, configures Apache
  • ./deploy.sh — pulls latest code, rebuilds images (no-cache), restarts containers
  • ./deploy.sh --rebuild — full clean rebuild of all services

Production notes:

  • All Docker ports bound to 127.0.0.1 only; Apache (host) is the public entry point
  • Backend runs with --workers 4 (vs --reload in dev)
  • Frontend built with NEXT_PUBLIC_BASE_PATH=/amazon-transcreation at deploy time
  • TM and reference files pulled from git; mounted as volumes

Environment Variables

| Variable | Purpose | Example |
|---|---|---|
| DATABASE_URL | PostgreSQL async connection | postgresql+asyncpg://transcreation:transcreation@db:5432/transcreation |
| REDIS_URL | Redis broker for Celery | redis://redis:6379/0 |
| ANTHROPIC_API_KEY | Required: Anthropic Claude API key | sk-ant-... |
| JWT_SECRET_KEY | Required: HMAC signing key for JWT tokens | random string (e.g., openssl rand -hex 32) |
| JWT_ALGORITHM | JWT signing algorithm (default HS256) | HS256 |
| JWT_EXPIRY_HOURS | Access token lifetime | 8 |
| USE_SINGLE_AGENT | Pipeline mode: true (single-agent) or false (legacy 6-agent) | true |
| NEXT_PUBLIC_BASE_PATH | Frontend URL prefix (prod only) | /amazon-transcreation |
| NEXT_PUBLIC_API_BASE_URL | Frontend API base URL | /amazon-transcreation/api/v1 (dev/prod) |

See .env.example for full list.

API Endpoints

Key REST endpoints (all under /api/v1):

| Endpoint | Method | Purpose |
|---|---|---|
| /jobs | POST | Create new transcreation job |
| /jobs/{job_id} | GET | Get job details + locale statuses |
| /jobs/{job_id}/locales/{locale}/output | GET | Download output XLSX |
| /jobs/{job_id}/rerun-locale | POST | Re-run a specific locale |
| /outputs/{output_id} | GET / PATCH | Fetch output, update review status |
| /users/me | GET | Authenticated user profile |
| /users | GET / POST | List users, create user (admin only) |
| /tm-registry | GET | List registered TM files |
| /reference-library | GET | List reference files (glossaries, blacklists, TOV) |

WebSocket: /ws/job/{job_id} — real-time progress updates (progress %, status, error logs).

See docs/project/api_spec.md for full OpenAPI spec.
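
For orientation, the sketch below builds authenticated requests for two of the GET endpoints using only the standard library. It assumes the dev base URL from the Dev Commands section and that the JWT is sent as a `Bearer` token in the `Authorization` header, which this note does not confirm. `build_request`, `get_job`, and `download_output` are hypothetical helper names.

```python
import urllib.request

API_BASE = "http://localhost:8040/api/v1"  # dev backend URL; adjust for prod


def build_request(method: str, path: str, token: str) -> urllib.request.Request:
    """Create (but do not send) a request carrying the JWT as a Bearer token."""
    request = urllib.request.Request(API_BASE + path, method=method)
    request.add_header("Authorization", f"Bearer {token}")  # assumed auth scheme
    return request


def get_job(job_id: str, token: str) -> urllib.request.Request:
    """Request for job details and locale statuses."""
    return build_request("GET", f"/jobs/{job_id}", token)


def download_output(job_id: str, locale: str, token: str) -> urllib.request.Request:
    """Request for the output XLSX of one locale."""
    return build_request("GET", f"/jobs/{job_id}/locales/{locale}/output", token)
```

A request built this way can be sent with `urllib.request.urlopen(get_job("42", token))` once the dev stack is running. Request bodies for the POST endpoints are left out because their schemas are not described here.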

Known Issues

  • None documented in source files. Latest commits (as of 2026-04-29) address:
    • User management UI warnings and ORM refresh on updates
    • Azure AD MSAL SSO integration
    • Markdown table parser losing backtranslations (fixed in commit d5fa4e4)
    • Confidence breakdown field mapping on job cards

Git

  • Remote: git@bitbucket.org:zlalani/amazon-transcreation.git
  • Latest commits: Focus on user management (admin role, viewer role), SSO (Azure AD), and pipeline refinements (single-agent feedback, TM registry w

Sessions

2026-04-16 Fix 500 error when updating users

Asked: Fix 500 error when updating users and resolve accessibility warnings in the user form dialog. Done: Added database refresh after flush to resolve server-generated field issues, wrapped form fields in form element, and added aria-describedby to DialogContent.

2026-04-16 Fix DialogContent missing Description warning and

Asked: Fix DialogContent missing Description warning and password field not in form error. Done: Added DialogDescription to DialogContent and wrapped password field in form element.

2026-04-15 Implement user management with DB registration

Asked: Implement user management with DB registration and role assignment (viewer default, upgradeable to admin/manager). Done: Added user registration logic with role-based access control, deployed to production, and provided migration/seeding commands.

2026-04-15 Implement user registration with DB persistence

Asked: Implement user registration with DB persistence and role management (default viewer, upgradeable to admin/manager). Done: Added viewer role to UserRole enum, updated SSO auto-provisioning, created migration, and deployed with admin seeding script.

2026-04-15 Implement user registration in DB with

Asked: Implement user registration in DB with default viewer role and ability to upgrade to admin/manager via backend. Done: Added viewer role to UserRole enum, created migration, updated SSO auto-provisioning to assign viewer role by default, and verified all changes in git.

2026-04-15 Implement user registration with DB persistence

Asked: Implement user registration with DB persistence and role management (default viewer, upgradeable to admin/manager). Done: Verified user registration logic integrates with existing role enums and confirmed all role references work correctly across jobs page and migration files.

2026-04-15 Set up SSO (SPA) token exchange

Asked: Set up SSO (SPA) token exchange in the browser with Azure AD / MSAL configuration. Done: Added Azure AD environment variables to Docker configuration, updated deploy script, and pushed changes to Git.

2026-04-15 Set up SSO (SPA) token exchange

Asked: Set up SSO (SPA) token exchange in the browser with Azure AD / MSAL configuration. Done: Added Azure AD environment variables to Dockerfile and docker-compose.prod.yml, configured Next.js build args for MSAL.

2026-04-15 Set up SSO (SPA) token exchange

Asked: Set up SSO (SPA) token exchange in the browser with Azure AD/MSAL configuration. Done: Configured Azure AD credentials, updated deployment script, and added SSO setup to compose file and Dockerfile.

2026-04-15 Set up SSO token exchange in

Asked: Set up SSO token exchange in the browser using Azure AD and MSAL. Done: Configured MSAL v5 with Azure AD credentials and removed deprecated storeAuthStateInCookie parameter to resolve type errors.

Change Log

| Date | Requested | Changed | Files |
|---|---|---|---|
| 2026-04-16 | Fix user update errors and form warnings | Added refresh() after flush, wrapped fields in form element, added aria-describedby | backend/app/api/v1/users.py, frontend/src/app/admin/users/page.tsx |
| 2026-04-16 | DialogContent and password field accessibility | Added DialogDescription, wrapped input in form element | DialogContent.tsx, PasswordField.tsx |
| 2026-04-15 | User registration and role management | User model with role enum, migration for viewer role, admin seeding script | User.py, alembic migrations, create_default_admins.py |
| 2026-04-15 | User registration and roles | Added viewer role enum, SSO auto-provision logic, migration for role enum, admin seeding script | user.py, service.py, user.py schema, d3e4f5a6b7c8_add_viewer_role.py |
| 2026-04-15 | User registration with roles | Added viewer role enum value, created migration for viewer role, updated SSO service to default new users to viewer role | user.py, d3e4f5a6b7c8_add_viewer_role.py, service.py, user.py |
| 2026-04-15 | User registration and roles | Database persistence, default viewer role, admin/manager upgrade capability, role enum integration | jobs/[jobId]/page.tsx, migration file, seed script |
| 2026-04-15 | SSO/Azure AD setup | Added NEXT_PUBLIC_AZURE_AD_TENANT_ID, NEXT_PUBLIC_AZURE_AD_CLIENT_ID, SSO_ENABLED env vars to Docker build args, updated deploy script | frontend/Dockerfile, docker-compose.prod.yml, deploy.sh |
| 2026-04-15 | SSO setup with Azure AD | Added ARG and ENV for NEXT_PUBLIC_AZURE_AD_TENANT_ID, NEXT_PUBLIC_AZURE_AD_CLIENT_ID, SSO_ENABLED; updated build args in compose | frontend/Dockerfile, docker-compose.prod.yml |
| 2026-04-15 | SSO Azure AD setup | Added AZURE_TENANT_ID, AZURE_CLIENT_ID, AZURE_REDIRECT_URI, updated deploy script, added setup summary | docker-compose.yml, Dockerfile, deploy.sh |
| 2026-04-15 | SSO/MSAL setup | Azure tenant/client ID configuration, removed storeAuthStateInCookie, token exchange flow | MSAL config file, authentication module |