amazon-transcreation

Author	SHA1	Message	Date
DJP	ff4e7e768e	Add seed script to register existing TM/ref files in database. Scans storage/amazon/tm/ and storage/amazon/ref/, creates DB registry entries for each JSON file so they appear in the TM Registry and Reference Library pages. Extracts channel from TM filenames, locale from ref filenames, counts JSONL segments. Idempotent (skips duplicates). Also added to deploy.sh --init flow. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-14 10:08:23 -04:00
DJP	8d4dc65993	Wire TM Registry and Reference Library to real backend APIs. Both pages were showing hardcoded mock data (PDFs, TMX, DOCX files). Now they: fetch real data from /files/tm and /files/reference endpoints; accept .json/.jsonl uploads (not PDF/TMX); support delete with confirmation; auto-select Amazon as the default client; show proper upload dialogs with locale/channel/file-type selectors. Fixed api.ts functions to pass client_id, channel, file_type as query params (matching backend expectations). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-14 10:05:03 -04:00
DJP	17bf57e865	Add labels to confidence breakdown on job cards. Added "Confidence" header, "X rows" count, and "High/Mod/Low" labels next to each dot so the bar colours have clear meaning at a glance. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 20:49:28 -04:00
DJP	a3f00a5fcd	Map confidence fields in mapJobListResponse. The API was returning confidence_high/moderate/low/total_output_rows but mapJobListResponse was dropping them, so the JobCard never rendered the confidence bar. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 20:32:25 -04:00
DJP	5681ca4acf	Always rebuild frontend with --no-cache in deploy script. Next.js builds inside Docker's multi-stage builder get cached even when source files change, causing stale frontends after deploy. Backend still uses normal caching since Python doesn't have this issue. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 20:18:00 -04:00
DJP	dd59c81603	Add confidence breakdown to dashboard job cards, default client to Amazon. Backend: Added confidence_high/moderate/low/total_output_rows to JobListResponse, computed via a batch query joining output_rows. Frontend JobCard: Shows a stacked progress bar with green/amber/red segments and counts for High/Moderate/Low confidence tiers. Frontend StepConfigure: Auto-selects Amazon as default client when creating a new job (falls back to first client if Amazon not found). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 20:09:29 -04:00
DJP	2c7677b76f	Fix tm_entries_cited type mismatch: accept list or dict. The pipeline stores tm_entries_cited as a list[str] of seg_keys, but the Pydantic response schema expected dict[str, Any], causing a validation error when loading the output preview page. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 19:43:16 -04:00
DJP	ee3de41723	Add all TM channel mappings (UEFA, PrimeDualBenefit, etc.). Only value/mass/onsite/outbound were mapped, so jobs with channel=UEFA got "Unknown channel" and fell back to no TM matches, causing all LOW confidence scores. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 19:20:36 -04:00
DJP	b0055c53ab	Update deploy script: TM files from git, fix orphan containers. Removed old import_reference_files.py step from --init (TM/ref files are now tracked in git, no separate import needed). Added file count verification during --init to confirm TM files arrived. Added --remove-orphans to docker compose commands to prevent stale containers serving old builds. Standard deploy now does compose down before up to ensure clean restart. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 19:11:09 -04:00
DJP	521f0447bd	Add TM and reference data files to git for server deployment. The storage/amazon/ directory (TM files for 12 locales + reference files) was excluded by .gitignore, causing the production server to have no TM data after deployment. Updated .gitignore to track storage/amazon/ so git pull on the server brings in all 153 TM and reference files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 19:09:45 -04:00
DJP	b2a70a3867	feat: show re-run button on completed locales, not just errors. Previously the re-run button only appeared on ERROR status locales. Now it also shows on COMPLETED locales so users can reprocess them after pipeline fixes without creating a new job. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 18:11:18 -04:00
DJP	437b9bd793	fix: add db.refresh after rerun_locale flush to prevent MissingGreenlet. The rerun endpoint returned 500 because Pydantic tried to serialize updated_at from a stale SQLAlchemy instance after flush(). Added db.refresh(instance) to ensure all attributes are loaded. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 17:59:09 -04:00
DJP	e97d4f81b7	fix: improve TM parser EN/TX split and fix report SQL errors. The compact TM format parser was storing the combined EN+TX text in both fields, causing the LLM retrieval agent to fail at matching source lines against TM entries, resulting in all-low confidence tiers. Added _split_en_tx() heuristic that detects the language boundary at the first non-ASCII sentence. Also includes raw _text in LLM prompt for context. Fixed get_jobs_over_time GroupingError by using literal_column for date_trunc, added date filters to status_breakdown, and fixed Decimal serialization in locale stats. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 17:47:53 -04:00
DJP	52bc499272	fix: sidebar highlighting for shared paths and report SQL errors. Fix sidebar nav so Dashboard/Monitoring and Audit Trail/System Logs highlight independently by using useSearchParams to distinguish query-param-based routes. Fix get_jobs_over_time SQL GroupingError by using literal_column for date_trunc interval. Add date filters to status_breakdown query and fix Decimal serialization in locale stats. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 17:25:34 -04:00
DJP	5ef7e588b6	feat: wire analytics to real data and add audit logging across all endpoints. Replace mock chart data on reports page with real backend queries (jobs over time, locale stats, usage stats, quality metrics). Add audit logging to auth (login/login_failed), file management (upload/delete TM and reference files), and feedback submission so the system logs page shows complete activity. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 17:17:14 -04:00
DJP	b11b2df0e2	fix: show real file validation stats instead of hardcoded mock data. StepUpload was showing hardcoded "42 Total Lines, 8 Display Formats" for every file upload. Now: Added POST /jobs/validate-source endpoint that parses xlsx in a temp file and returns real stats (line count, display formats, columns found, warnings) without creating any DB records. Frontend calls validateSource() when user selects a file. Shows spinner during validation, real results after. Blocks "Next" if validation fails. Removed all mock validation data. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 17:03:13 -04:00
DJP	84f37a4649	feat: wire audit trail page to real backend data. Fix API path: frontend now calls /audit/logs (was /audit). Backend eagerly loads User relationship for audit entries. Backend response includes user_name field instead of just user_id. Frontend logs page fetches real data with pagination. Derive INFO/WARN/ERROR levels from action type. Format details JSON into readable descriptions. Add loading state and empty state handling. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 16:59:36 -04:00
DJP	8b07a59da0	fix: persist feedback/comments across page reloads on review page. Feedback was saving to DB but never loaded back on page revisit. Three-point fix. Backend schema: add feedback list to OutputRowResponse. Backend service: eagerly load feedback relationship in preview query. Frontend mapper: map latest feedback entry to OutputRow.feedback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 16:54:49 -04:00
DJP	5e0a148b96	feat: add token usage tracking, feedback highlighting, cost on cards, help page. Wire token usage from LLM agents through pipeline context to DB and frontend. Agents 2 and 4 accumulate input/output tokens and cost into PipelineContext. job_tasks.py saves token totals to locale instance after pipeline completion. Monitoring cards show total tokens and estimated cost instead of broken 0/0. Make feedback highlighting bolder: colored card borders, stronger button states. Add estimated cost display to dashboard job cards. Add Help page with full documentation and link in sidebar navigation. Comprehensive README with ASCII architecture diagrams. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 16:47:36 -04:00
DJP	f2398e04c4	feat: add real-time progress tracking and admin job deletion. Progress tracking: Add progress (0-100%) and current_stage columns to locale_instances. Wire orchestrator on_progress callback to update DB at each pipeline stage. Agent 4 reports batch-level sub-progress (e.g. "Translating batch 2/4"). Frontend reads real progress/stage data instead of hardcoding 50%. Stages: Loading Files → Matching TM → Ranking → Translating (per-batch) → Reviewing → Formatting → Complete. Job deletion: DELETE /jobs/{id} endpoint (admin-only, 403 for non-admins). Cannot delete running jobs (must cancel first). Cascades to locale instances, output rows, source lines. Frontend: Delete button with confirmation on job monitoring page (admin only). Also: compute live duration_seconds from started_at, map pipeline stages to UI status badges. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 16:18:59 -04:00
DJP	7a0971a029	feat: implement real LLM agents 2-5 for live transcreation pipeline. Replace all stub agents with working Claude API-powered agents. Agent 2 (TM Retrieval): LLM semantic matching of source lines against TM entries. Agent 3 (Ranker): Deterministic ranking with confidence tiers (high/moderate/low). Agent 4 (Transcreator): Batched creative transcreation with voice profiles, reference files, backtranslations. Agent 5 (Compliance): Deterministic checks for character limits, blacklist terms, domain substitution. Also fixes TM file loader to handle real compact JSONL format (locale code regex-based parsing), and adds file manifest resolution for reference files (glossary, blacklist, TOV, locale considerations). Verified end-to-end: 53-line de-DE brief produces real German translations with TM matching, confidence-based option counts (1/2/3), backtranslations, and compliance validation. ~$0.49 total cost. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 15:26:41 -04:00
DJP	8ee66cd41c	fix: wire export buttons, feedback saving, and review page to real API. Fix download URL to match backend route (/output/jobs/.../export). Add onClick handlers for download buttons in LocaleInstanceCard and review page. Wire FeedbackButtons to POST /output/feedback with correct schema. Replace mock data in review page with real getOutputPreview API call. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 14:40:18 -04:00
DJP	f271343bc0	feat: wire job wizard and dashboard to real backend API. Job wizard now calls real API: create job → upload source → launch. Dashboard and monitoring pages use live data instead of mock data. Monitoring page polls every 3s while job is active. Backend enriches job responses with client_name, created_by_name, source_line_count from eager-loaded relationships. Frontend response mappers handle backend→frontend type differences (lowercase enum values, field name mapping, computed progress/stage). Source file parser accepts column aliases (Line type, Context notes) with case-insensitive matching for real-world Excel files. Clients list endpoint accessible to all authenticated users. Fixed uploadSource to use PUT, uploadSupplementary per-file. Removed all hardcoded mock data from useJobs hook. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 14:18:47 -04:00
DJP	1d94bfc005	fix: use next/font/local for fonts and basePath for logo URLs. Fonts and logos were not loading on the /amazon-transcreation subpath deployment because CSS @font-face used absolute /fonts/ paths and Image src used bare /amazon-logo.svg; neither respects Next.js basePath. Migrated fonts to next/font/local (bundles into _next/static with correct assetPrefix) and prepend NEXT_PUBLIC_BASE_PATH to logo srcs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 13:25:25 -04:00
DJP	4d4f853792	fix: cast role string to UserRole enum in auth.ts. Fixes TypeScript build error where JWT claims role (string) was assigned to User.role (UserRole enum). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 13:15:31 -04:00
DJP	3fe93c2b22	feat: configure deployment for optical-dev.oliver.solutions/amazon-transcreation. Apache reverse proxy config (replaces nginx; server already runs Apache). Next.js basePath set to /amazon-transcreation for subpath deployment. Frontend on port 3050 (3000 taken), backend on 8040. WebSocket URL auto-detects protocol from page location. Deploy script handles Apache config injection into existing vhost. All Docker ports bound to 127.0.0.1 (Apache handles external access). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 13:05:12 -04:00
DJP	4a5c1c6dfe	feat: add production deployment, fix auth flow, add nginx reverse proxy. deploy.sh: one-command deploy script (--init for first time, bare for updates). docker-compose.prod.yml: production stack with nginx, multi-worker uvicorn, no volume mounts for code. nginx/nginx.conf: reverse proxy with rate limiting, WebSocket support, static asset caching. Fix login to use real backend API instead of mock localStorage tokens. Add auth guard to AppShell (prevents flash-of-content on unauthenticated routes). JWT claims decoded client-side for user info (no extra /me call needed). Switch logo from missing .jpeg to .svg. Frontend API URL defaults to same-origin (works behind nginx without CORS). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 12:53:48 -04:00
DJP	98fa16bfc3	feat: complete Phase 1-2 scaffold (backend, frontend, pipeline skeleton). Full-stack Amazon AI Transcreation Platform with: FastAPI backend (async, PostgreSQL, Redis, Celery) with 11 DB tables; JWT auth (SSO-ready abstract provider pattern); 6-agent pipeline orchestrator with deterministic modules; Next.js 14 frontend with Amazon branding (Ember fonts, orange/dark theme); Job wizard, monitoring HUD, output review, admin screens; 154 TM/reference files imported, 12 locales configured; Docker Compose for all services. Agents 2-5 (TM retrieval, ranker, transcreator, compliance) are stubs pending Phase 3 LLM integration. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 12:31:43 -04:00
Dave Porter	e3c3dccfe9	Initial commit	2026-04-10 15:52:44 +00:00
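
Note on commit e97d4f81b7: the message describes a _split_en_tx() heuristic that finds the language boundary at the first non-ASCII sentence. The repository code is not shown in this log, so the following is only a minimal sketch of that idea under stated assumptions: the function name split_en_tx, the sentence regex, and the sample strings are all hypothetical, and the real parser may differ.

```python
from __future__ import annotations

import re

# Hypothetical sentence splitter: a run of non-terminator characters,
# optional terminators (. ! ?), then trailing whitespace.
_SENTENCE_RE = re.compile(r"[^.!?]+[.!?]*\s*")


def split_en_tx(text: str) -> tuple[str, str]:
    """Split combined EN+TX text at the first sentence with non-ASCII characters.

    Returns (english, translation). If every sentence is pure ASCII, the whole
    text is treated as English and the translation is an empty string.
    """
    sentences = _SENTENCE_RE.findall(text)
    for index, sentence in enumerate(sentences):
        if not sentence.isascii():
            english = "".join(sentences[:index]).strip()
            translation = "".join(sentences[index:]).strip()
            return english, translation
    return text.strip(), ""


if __name__ == "__main__":
    combined = "Shop now. Save big today! Jetzt kaufen für alle. Heute sparen."
    print(split_en_tx(combined))
```

A heuristic of this shape has an obvious limit: a translation whose first sentence happens to contain only ASCII characters is attributed to the English side, so it suits accented or non-Latin target locales better than others.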

29 commits
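
Note on commit ff4e7e768e: the message describes a seed script that scans storage directories, counts JSONL segments, and skips files that are already registered. As a rough illustration of that idempotent pattern only, the sketch below uses an in-memory-capable SQLite table as a stand-in for the project's real database; the table name file_registry, its columns, and the function names are assumptions, not the project's schema.

```python
from __future__ import annotations

import sqlite3
from pathlib import Path


def count_segments(path: Path) -> int:
    """Count non-empty lines, treating each one as a JSONL segment."""
    with path.open(encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


def seed_registry(conn: sqlite3.Connection, root: Path) -> int:
    """Register every .json/.jsonl file under root; return how many were added.

    The path column is the primary key, so INSERT OR IGNORE makes a second
    run a no-op for files that are already registered.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_registry ("
        "path TEXT PRIMARY KEY, segments INTEGER NOT NULL)"
    )
    added = 0
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in {".json", ".jsonl"}:
            continue
        cursor = conn.execute(
            "INSERT OR IGNORE INTO file_registry (path, segments) VALUES (?, ?)",
            (str(path), count_segments(path)),
        )
        added += cursor.rowcount  # 1 when inserted, 0 when skipped as duplicate
    conn.commit()
    return added
```

Calling seed_registry twice against the same directory should add rows on the first call and none on the second, which is the property the commit message calls idempotent.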