amazon-transcreation

Author	SHA1	Message	Date
DJP	100eddbc21	Switch LLM calls to streaming + tighten batch sizes The Anthropic SDK refuses non-streaming calls expected to take >10 minutes ("Streaming is required..."). Long-output batches (32k tokens of densely-formatted markdown) hit this on real 172-line briefs. Both LLMClient.create_message and create_message_cached now use the streaming context manager (client.messages.stream(...)) and accumulate text chunks; final usage + stop_reason come from get_final_message(). No timeout on streaming requests. Tightened the batch tier so individual streams stay well under any ceiling and progress / failure recovery is more granular: - ≤50 lines: single call - 51-120: batches of 30 (max_tokens=16k each) - 121+: batches of 25 (max_tokens=16k each) Verified with the 172-line case: 7 batches of 25, 172 drafts produced. Live streaming call confirmed end-to-end (haiku returned, usage and stop_reason populated correctly). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 12:20:16 -04:00
DJP	70cade819c	Source-line batching with prompt caching for arbitrarily large briefs Previously briefs above ~150 source lines hit the Sonnet 4.6 64k output cap and were silently truncated. Now we batch: - ≤70 lines: one LLM call (no change) - 71-150: batches of 50 (2-3 calls) - 151+: batches of 40 (unbounded) Each batch uses Anthropic prompt caching: the V25 system prompt + job parameters + TM entries + reference data + supplementary files form a cached prefix; only the per-batch source lines vary. After the first batch, subsequent batches read the prefix from cache at ~10% input cost, so an N-batch job costs roughly (1 + 0.1*(N-1)) full prompts instead of N. Implementation: - New LLMClient.create_message_cached / acreate_message_cached methods that mark system_prompt and cached_user_content with cache_control: ephemeral. Tracks cache_creation_input_tokens and cache_read_input_tokens in usage and applies the right cost rates (1.25x for writes, 0.1x for reads). - AgentSingle.run() refactored to build the cached static prefix once, then loop over batches sending only the per-batch source lines as the dynamic content. Each batch's parsed rows are appended to context.draft_outputs / ranking_declarations. - Per-batch instructions added to the prompt for batched runs ("This is batch N of M ... output a table for these lines only ... do not repeat prior batches"). Single-call runs (≤70 lines) skip this note. - Linguistic summary: kept from the last batch (batched mode) or the single batch (single mode). - Per-batch logging of input_tokens / cache_read / cache_creation / output_tokens / stop_reason for visibility. Verified end-to-end: N=10/70/100/150/250 produce 1/1/2/3/7 LLM calls with correct draft counts, and live caching reads the cached prefix on the second call within the 5-minute TTL. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 15:02:48 -04:00
DJP	d3f6a57386	Round 2.5 feedback: TM replacements take effect, supplementary files reach LLM, larger briefs fit, free-text channel uploads TM upload-replacement bug (critical): - Uploads were writing to /storage/clients/<uuid>/tm/... but the pipeline reads from /storage/amazon/tm/... — replacements were silently ignored - upload_tm_file now writes to the canonical pipeline path /storage/amazon/tm/<locale>/flat_<channel>_<lc>.json (overwrites in place) - Filename casing is preserved when an existing file is being replaced (the on-disk seeded files use mixed casing: flat_MASS, flat_value, flat_PrimeSpeed); falls back to CHANNEL_FILE_MAP, then user-typed case - Registry upsert by (client_id, locale_code, channel): replaces row in place rather than inserting duplicates - Verified: replacement file at canonical path, registry COUNT=1, no dupes Supplementary files now reach the LLM (critical): - New supplementary_files field on FileManifest - _resolve_file_manifest scans /storage/jobs/<job_id>/supplementary/ and populates the manifest, with per-locale gating by filename prefix (e.g. de-DE_glossary.txt only goes to de-DE; global_brief.txt goes to all) - _format_supplementary_for_prompt reads each file (.txt/.md/.json/.csv/.tsv/.docx) and inlines its text into the LLM user message under a "## SUPPLEMENTARY MATERIAL" header, capped at 40k chars per file - .docx files are extracted via inline zipfile read (no new dependency) New job wizard: - Per-supplementary-file locale dropdown ("Global" or one of 12 locales) - Filename gets prefixed with the locale on upload (de-DE_brief.docx) Admin TM upload: - Channel field is now a free-text input with autocomplete suggestions (datalist of known channels) — lets users add brand-new channels like PrimeCBM that didn't exist before Pipeline scaling: - Bumped dynamic max_tokens tiers: 80+ lines now gets 64k output budget (was 32k); 132-line briefs no longer truncate. Sonnet 4.6 caps at 64k - Added stop_reason logging — "max_tokens" stop now shows up in logs loud and clear rather than silently truncating Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 14:28:20 -04:00
DJP	d5fa4e49f7	Fix markdown table parser losing backtranslations/rationales, add model selection, update help page The V25 table has duplicate column names (Backtranslation x3, Rationale x3). The dict-based parser collapsed these — only the last value survived (Option 3's "N/A"), causing all BT/rationale fields to be "N/A" in the output Excel. Fixed by switching to positional list-based parsing instead of dicts. Also adds per-job model selection (Sonnet 4.6 / Opus 4.6) through the full stack: DB column, API schema, job wizard UI dropdown, pipeline contracts, and LLM client with model-aware cost tracking. Includes Alembic migration. Updated help page and README to reflect single-agent pipeline, multi-TM selection, flat locale grid, model selector, and linguistic summary. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-14 12:40:17 -04:00
DJP	98fa16bfc3	feat: complete Phase 1-2 scaffold — backend, frontend, pipeline skeleton Full-stack Amazon AI Transcreation Platform with: - FastAPI backend (async, PostgreSQL, Redis, Celery) with 11 DB tables - JWT auth (SSO-ready abstract provider pattern) - 6-agent pipeline orchestrator with deterministic modules - Next.js 14 frontend with Amazon branding (Ember fonts, orange/dark theme) - Job wizard, monitoring HUD, output review, admin screens - 154 TM/reference files imported, 12 locales configured - Docker Compose for all services Agents 2-5 (TM retrieval, ranker, transcreator, compliance) are stubs pending Phase 3 LLM integration. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-10 12:31:43 -04:00
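
The batch tiers listed in commit 100eddbc21 (≤50 lines in a single call, 51-120 in batches of 30, 121+ in batches of 25) can be sketched as a small pure function. This is only an illustration of the tiering rule as the commit message states it; the names `batch_size_for` and `split_into_batches` are hypothetical and are not taken from the repository.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def batch_size_for(line_count: int) -> int:
    """Return the batch size for a brief with `line_count` source lines.

    The thresholds follow the 100eddbc21 commit message; the function
    name itself is an assumption made for this sketch.
    """
    if line_count <= 50:
        # One call covers the whole brief (at least 1 to keep range() valid).
        return max(line_count, 1)
    if line_count <= 120:
        return 30
    return 25


def split_into_batches(lines: Sequence[T]) -> List[List[T]]:
    """Split source lines into consecutive batches using the tiers above."""
    size = batch_size_for(len(lines))
    return [list(lines[i:i + size]) for i in range(0, len(lines), size)]


# The 172-line case mentioned in the commit message.
batches = split_into_batches([f"line {n}" for n in range(1, 173)])
print(len(batches), [len(b) for b in batches])
```

For 172 lines this yields 7 batches (six of 25 and one of 22), which matches the "7 batches of 25, 172 drafts produced" check reported in the commit.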

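Commit 70cade819c estimates that an N-batch job costs roughly (1 + 0.1*(N-1)) full prompts once the static prefix is cached. The arithmetic can be sketched as below. The function name and its parameters are assumptions for illustration; the 0.1x read rate and 1.25x write rate are the figures quoted in the commit message, and the default `write_rate=1.0` reproduces the commit's rough formula rather than the exact billing.

```python
def relative_prefix_cost(
    n_batches: int,
    write_rate: float = 1.0,
    read_rate: float = 0.1,
) -> float:
    """Cost of the cached prefix across a job, in units of one full prompt.

    The first batch pays `write_rate` for the prefix; every later batch
    pays `read_rate`. This ignores the per-batch dynamic content and
    assumes every later batch hits the cache (hypothetical helper).
    """
    if n_batches < 1:
        raise ValueError("n_batches must be at least 1")
    return write_rate + read_rate * (n_batches - 1)


# Rough formula from the commit message for a 7-batch job.
print(round(relative_prefix_cost(7), 2))
# Same job, charging the first batch at the 1.25x cache-write rate.
print(round(relative_prefix_cost(7, write_rate=1.25), 2))
```

Under these assumptions a 7-batch job pays for about 1.6 prefixes (1.85 with the write surcharge) instead of 7.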
5 commits
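
The parser bug fixed in commit d5fa4e49f7 (duplicate column names collapsing in a dict) is easy to reproduce in isolation. The sketch below is a minimal illustration, not the repository's parser: the `split_row` helper, the "Option N" column names, and the sample cell values are assumptions, and escaped pipes inside cells are not handled.

```python
from typing import List


def split_row(row: str) -> List[str]:
    """Split one markdown table row into stripped cell values."""
    return [cell.strip() for cell in row.strip().strip("|").split("|")]


header = split_row(
    "| Option 1 | Backtranslation | Rationale "
    "| Option 2 | Backtranslation | Rationale |"
)
cells = split_row(
    "| Hallo Welt | Hello world | Literal | Hi Welt | Hi world | N/A |"
)

# Dict-based parsing: each repeated key overwrites the earlier value,
# so only the last Backtranslation/Rationale survives.
as_dict = dict(zip(header, cells))

# Positional parsing: keep (column name, value) pairs in column order,
# so duplicates are addressed by index instead of by name.
as_pairs = list(zip(header, cells))

print(as_dict["Rationale"])
print(as_pairs[2], as_pairs[5])
```

With the dict, the first option's rationale is lost and only "N/A" remains, which is the symptom the commit describes; the positional list keeps both values.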