4 commits

Author SHA1 Message Date
DJP
e97d4f81b7 fix: improve TM parser EN/TX split and fix report SQL errors
The compact TM format parser was storing the combined EN+TX text in both
fields, so the LLM retrieval agent failed to match source lines against
TM entries and every line landed in the low confidence tier. Added a
_split_en_tx() heuristic that detects the language boundary at the first
sentence containing non-ASCII characters. Also includes raw _text in the
LLM prompt for context.
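
A minimal sketch of the boundary heuristic described above, assuming sentences end in `.`, `!` or `?` followed by whitespace. The function name `split_en_tx` and the sentence regex are illustrative assumptions; the actual `_split_en_tx()` is not shown in this log and may differ.

```python
import re
from typing import Tuple

# Assumed sentence boundary: terminal punctuation followed by whitespace.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_en_tx(combined: str) -> Tuple[str, str]:
    """Split combined text into (EN, TX) at the first sentence
    that contains a non-ASCII character."""
    sentences = [s for s in _SENTENCE_BOUNDARY.split(combined.strip()) if s]
    for i, sentence in enumerate(sentences):
        if not sentence.isascii():
            # Everything before this sentence is treated as EN,
            # this sentence and everything after it as TX.
            return " ".join(sentences[:i]), " ".join(sentences[i:])
    # No non-ASCII sentence found: keep everything as EN, leave TX empty.
    return combined.strip(), ""


en, tx = split_en_tx("Shop now. Jetzt einkaufen für alle.")
```

A heuristic like this cannot find the boundary when the translated text happens to be pure ASCII, which is presumably why the raw text is also passed to the LLM prompt.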

Fixed get_jobs_over_time GroupingError by using literal_column for
date_trunc, added date filters to status_breakdown, and fixed Decimal
serialization in locale stats.
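
The Decimal serialization problem can be reproduced and worked around with the standard library alone. The sketch below is one common approach, not necessarily the fix applied in this commit; the `to_jsonable` helper and the `avg_cost` field are hypothetical.

```python
import json
from decimal import Decimal


def to_jsonable(value):
    """Fallback for json.dumps: convert Decimal to float.
    Hypothetical helper, not taken from the repository."""
    if isinstance(value, Decimal):
        return float(value)
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")


# Numeric database columns typically come back as Decimal, which
# json.dumps rejects unless a default converter is supplied.
row = {"locale": "de-DE", "avg_cost": Decimal("0.49")}  # illustrative values
payload = json.dumps(row, default=to_jsonable)
```

Converting to `float` loses exactness for values that need it; serializing as a string with `str(value)` is the usual alternative for monetary amounts.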

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 17:47:53 -04:00
DJP
7a0971a029 feat: implement real LLM agents 2-5 for live transcreation pipeline
Replace all stub agents with working Claude API-powered agents:
- Agent 2 (TM Retrieval): LLM semantic matching of source lines against TM entries
- Agent 3 (Ranker): Deterministic ranking with confidence tiers (high/moderate/low)
- Agent 4 (Transcreator): Batched creative transcreation with voice profiles, reference files, backtranslations
- Agent 5 (Compliance): Deterministic checks for character limits, blacklist terms, domain substitution
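
As an illustration of what a deterministic compliance pass could look like, the sketch below covers only the character limit and blacklist checks named for Agent 5; domain substitution is left out. `ComplianceResult`, `check_compliance` and their parameters are assumptions, not names from the repository.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ComplianceResult:
    passed: bool
    issues: List[str] = field(default_factory=list)


def check_compliance(text: str, char_limit: int, blacklist: List[str]) -> ComplianceResult:
    """Run rule-based checks on one transcreated line (no LLM involved)."""
    issues: List[str] = []
    if len(text) > char_limit:
        issues.append(f"exceeds character limit: {len(text)} > {char_limit}")
    lowered = text.casefold()
    for term in blacklist:
        # Case-insensitive substring match; a real check might need word boundaries.
        if term.casefold() in lowered:
            issues.append(f"contains blacklisted term: {term}")
    return ComplianceResult(passed=not issues, issues=issues)


ok = check_compliance("Jetzt kaufen", char_limit=20, blacklist=["gratis"])
bad = check_compliance("Gratis Versand für alle", char_limit=10, blacklist=["gratis"])
```

Because the checks are pure functions of the text and the rules, the same input always yields the same verdict, which keeps this stage cheap and reproducible.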

Also fixes TM file loader to handle real compact JSONL format (locale code regex-based parsing),
and adds file manifest resolution for reference files (glossary, blacklist, TOV, locale considerations).

Verified end-to-end: 53-line de-DE brief produces real German translations with TM matching,
confidence-based option counts (1/2/3), backtranslations, and compliance validation. ~$0.49 total cost.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 15:26:41 -04:00
DJP
f271343bc0 feat: wire job wizard and dashboard to real backend API
- Job wizard now calls real API: create job → upload source → launch
- Dashboard and monitoring pages use live data instead of mock data
- Monitoring page polls every 3s while job is active
- Backend enriches job responses with client_name, created_by_name,
  source_line_count from eager-loaded relationships
- Frontend response mappers handle backend→frontend type differences
  (lowercase enum values, field name mapping, computed progress/stage)
- Source file parser accepts column aliases (Line type, Context notes)
  with case-insensitive matching for real-world Excel files
- Clients list endpoint accessible to all authenticated users
- Fixed uploadSource to use PUT, uploadSupplementary per-file
- Removed all hardcoded mock data from useJobs hook
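
The column alias handling in the source file parser might be sketched as follows. The alias table and the canonical names `line_type` and `context_notes` are assumptions for illustration; only the aliases "Line type" and "Context notes" come from the message above.

```python
from typing import Dict, List

# Hypothetical alias table: canonical field name -> accepted header spellings.
COLUMN_ALIASES: Dict[str, List[str]] = {
    "line_type": ["line type", "line_type"],
    "context_notes": ["context notes", "context_notes"],
}


def normalize_headers(headers: List[str]) -> Dict[str, str]:
    """Map each recognized raw header to its canonical field name,
    ignoring case and surrounding whitespace."""
    lookup = {
        alias.casefold(): canonical
        for canonical, aliases in COLUMN_ALIASES.items()
        for alias in aliases
    }
    mapping: Dict[str, str] = {}
    for raw in headers:
        key = raw.strip().casefold()
        if key in lookup:
            mapping[raw] = lookup[key]
    return mapping


mapping = normalize_headers(["LINE TYPE", " Context Notes ", "Other"])
```

Unrecognized headers such as "Other" are simply left out of the mapping, so the caller can decide whether to ignore them or report them.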

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 14:18:47 -04:00
DJP
98fa16bfc3 feat: complete Phase 1-2 scaffold — backend, frontend, pipeline skeleton
Full-stack Amazon AI Transcreation Platform with:
- FastAPI backend (async, PostgreSQL, Redis, Celery) with 11 DB tables
- JWT auth (SSO-ready abstract provider pattern)
- 6-agent pipeline orchestrator with deterministic modules
- Next.js 14 frontend with Amazon branding (Ember fonts, orange/dark theme)
- Job wizard, monitoring HUD, output review, admin screens
- 154 TM/reference files imported, 12 locales configured
- Docker Compose for all services

Agents 2-5 (TM retrieval, ranker, transcreator, compliance) are stubs
pending Phase 3 LLM integration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 12:31:43 -04:00