TM upload-replacement bug (critical):
- Uploads were writing to /storage/clients/<uuid>/tm/... but the pipeline
reads from /storage/amazon/tm/... — replacements were silently ignored
- upload_tm_file now writes to the canonical pipeline path
/storage/amazon/tm/<locale>/flat_<channel>_<lc>.json (overwrites in place)
- Filename casing is preserved when an existing file is being replaced
(the on-disk seeded files use mixed casing: flat_MASS, flat_value,
flat_PrimeSpeed); falls back to CHANNEL_FILE_MAP, then user-typed case
- Registry upsert by (client_id, locale_code, channel): replaces row in
place rather than inserting duplicates
- Verified: replacement file at canonical path, registry COUNT=1, no dupes
Supplementary files now reach the LLM (critical):
- New supplementary_files field on FileManifest
- _resolve_file_manifest scans /storage/jobs/<job_id>/supplementary/ and
populates the manifest, with per-locale gating by filename prefix
(e.g. de-DE_glossary.txt only goes to de-DE; global_brief.txt goes to all)
- _format_supplementary_for_prompt reads each file (.txt/.md/.json/.csv/.tsv
/.docx) and inlines its text into the LLM user message under a
"## SUPPLEMENTARY MATERIAL" header, capped at 40k chars per file
- .docx files are extracted via inline zipfile read (no new dependency)
New job wizard:
- Per-supplementary-file locale dropdown ("Global" or one of 12 locales)
- Filename gets prefixed with the locale on upload (de-DE_brief.docx)
Admin TM upload:
- Channel field is now a free-text input with autocomplete suggestions
(datalist of known channels) — lets users add brand-new channels like
PrimeCBM that didn't exist before
Pipeline scaling:
- Bumped dynamic max_tokens tiers: 80+ lines now gets 64k output budget
(was 32k); 132-line briefs no longer truncate. Sonnet 4.6 caps at 64k
- Added stop_reason logging — "max_tokens" stop now shows up in logs
loud and clear rather than silently truncating
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>