The agent reported (for an nl-BE job) that glossary and blacklist were
"not provided" and date/percentage formats were "provided but empty".
The files are on disk with real content — the bug was in the loaders,
which expected shapes that didn't match what's actually shipped:
- load_glossary expected a top-level JSON list, but files use
{"locale": "...", "entries": [...]}. RefFileLoadError raised,
silently caught by load_all_reference_files, result became None.
- load_blacklist had the same mismatch, same outcome.
- load_date_pct_formats accepted the dict shape but only knew about
the "date_formats"/"percentage_formats" keys; the files use
"entries", so it returned {"date_formats": [], "percentage_formats": []},
which is exactly what the agent reported.
Fix:
- New _extract_entries() helper that accepts both the wrapper shape
{entries: [...]} and a bare list. load_glossary / load_blacklist
both delegate to it.
- load_date_pct_formats now passes entries through alongside the
legacy date_formats / percentage_formats keys (back-compat).
- load_all_reference_files now logs a warning when a loader raises
RefFileLoadError instead of silently swallowing it — so any future
loader/file-shape drift surfaces in the celery logs.
Verified inside the backend container against nl-BE, de-DE, fr-FR:
- 58 / 68 / 64 glossary entries respectively (was 0)
- 14 / 9 / 4 blacklist entries (was 0)
- 10 / 10 / 10 date/pct entries (was empty)
- locale_considerations and tov_global still load correctly
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bug 1: Empty tm_channels was silently re-defaulted to [campaign channel]
in both agent_single.py and job_tasks.py via `or [channel]`. Python's
`or` treats [] as falsy, so the frontend's empty-list intent was lost.
Fixed by replacing `or` with an explicit `is not None` check at both
sites. Empty list now means "load no TMs"; None still falls back.
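The difference between the two forms can be sketched as follows. The
function name resolve_tm_channels is hypothetical; only the `or` vs
`is not None` distinction comes from the fix described above.

```python
from typing import Optional


def resolve_tm_channels(tm_channels: Optional[list[str]], channel: str) -> list[str]:
    """Fall back to the campaign channel only when no selection was sent."""
    # The buggy form was `tm_channels or [channel]`, which also replaces []
    # because an empty list is falsy.
    return tm_channels if tm_channels is not None else [channel]
```

With this check, resolve_tm_channels([], "MASS") keeps the empty list,
while resolve_tm_channels(None, "MASS") still falls back to ["MASS"].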
Bug 2: Supplementary files dropped by Agent1Validator. The validator
built FileManifest(...) with explicit kwargs but forgot
supplementary_files, so the raw field from _resolve_file_manifest
never reached agent_single.run(). Files were uploaded to disk but
never inlined into the LLM context. Fixed by adding
supplementary_files=raw.get("supplementary_files", []) to the
validator's FileManifest construction.
Bug 3: New TM channels lowercased in StepReview.tsx, breaking
case-sensitive file lookup. On Linux, "flat_primecbmt_nl-be.json"
≠ "flat_PrimeCBMT_nl-be.json", so the file was silently skipped and
zero TM entries loaded. Legacy channels worked only because the
hardcoded CHANNEL_FILE_MAP has lowercase keys mapping to
canonically-cased filenames; auto-discovered channels (PrimeCBM,
PrimeCBMT, etc.) had no such safety net. Two-part fix:
3a. StepReview.tsx no longer lowercases tm_channels — preserves case
end-to-end from registry → frontend → backend → disk.
3b. _resolve_all_tm_paths builds a case-insensitive index of the
locale's TM directory once per call and resolves filenames
against it. Forgives any historical case-drift between registry
and disk.
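A rough sketch of the case-insensitive lookup in 3b, assuming a flat
directory of TM files. The helper names build_case_index and
resolve_tm_file are hypothetical and are not the real
_resolve_all_tm_paths internals.

```python
from pathlib import Path
from typing import Optional


def build_case_index(tm_dir: Path) -> dict[str, Path]:
    """Map lowercased filename -> actual path, scanning the directory once."""
    return {p.name.lower(): p for p in tm_dir.iterdir() if p.is_file()}


def resolve_tm_file(index: dict[str, Path], filename: str) -> Optional[Path]:
    """Look up a filename ignoring case; None when no file matches."""
    return index.get(filename.lower())
```

If two files differ only by case, the later one seen wins in this
sketch; a real implementation may want to detect that collision.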
Verified end-to-end with a standalone test script run inside the
backend container: 8/8 assertions pass covering empty tm_channels,
supplementary file passthrough, exact-case lookups, lowercase
fallback, missing channels, legacy MASS in both cases, and empty
tm_channels yielding no TM paths.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Anthropic SDK refuses non-streaming calls expected to take >10
minutes ("Streaming is required..."). Long-output batches (32k tokens
of densely-formatted markdown) hit this on real 172-line briefs.
Both LLMClient.create_message and create_message_cached now use the
streaming context manager (client.messages.stream(...)) and accumulate
text chunks; final usage + stop_reason come from get_final_message().
No timeout on streaming requests.
Tightened the batch tier so individual streams stay well under any
ceiling and progress / failure recovery is more granular:
- ≤50 lines: single call
- 51-120: batches of 30 (max_tokens=16k each)
- 121+: batches of 25 (max_tokens=16k each)
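The tiers above can be sketched as a small planning function. The name
plan_batches and the (start, end) return shape are assumptions made for
illustration; only the thresholds and batch sizes come from this commit.

```python
def plan_batches(line_count: int) -> list[tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, following the tiers above."""
    if line_count <= 50:
        size = max(line_count, 1)  # single call covers every line
    elif line_count <= 120:
        size = 30
    else:
        size = 25
    return [(start, min(start + size, line_count)) for start in range(0, line_count, size)]
```

For 172 lines this yields seven ranges, the last one shorter than 25.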
Verified with the 172-line case: 7 batches of 25, 172 drafts produced.
Live streaming call confirmed end-to-end (haiku returned, usage and
stop_reason populated correctly).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- "How it works" mentions supplementary files, batching with caching for
large briefs, reviewer comments in export
- Configure step: TM Files now described as "leave empty for no TMs"
(auto-include of campaign channel removed in 2.5) and dynamically
populated from registry
- New "Supplementary files" subsection on the Upload step explaining
the per-file locale dropdown, supported formats, and what gets inlined
into the agent's context
- Monitoring section: dashboard auto-refresh note, plus a new "Inputs
sent to the agent" subsection pointing to the per-job card
- Reviewing section: spell out that reviewer comments appear in the xlsx
export (Reviewer (Name): comment) and the file is regenerated on each
download
- Reference Files / TM section: replacing a TM in place, adding a new
channel via the free-text autocomplete, dynamic registry-driven channel
list
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously briefs above ~150 source lines hit the Sonnet 4.6 64k output
cap and were silently truncated. Now we batch:
- ≤70 lines: one LLM call (no change)
- 71-150: batches of 50 (2-3 calls)
- 151+: batches of 40 (as many calls as needed)
Each batch uses Anthropic prompt caching: the V25 system prompt + job
parameters + TM entries + reference data + supplementary files form a
cached prefix; only the per-batch source lines vary. After the first
batch, subsequent batches read the prefix from cache at ~10% input cost,
so an N-batch job costs roughly (1 + 0.1*(N-1)) full prompts instead
of N.
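The estimate above, written out as a sketch. It covers only the shared
prefix, ignores the per-batch source lines, and, like the estimate
itself, leaves out the cache-write surcharge. The function name is
hypothetical.

```python
def relative_prefix_cost(num_batches: int, read_rate: float = 0.1) -> float:
    """Cost of the shared prefix across a job, in units of one uncached prompt."""
    if num_batches < 1:
        return 0.0
    # First batch pays full price; each later batch pays the cache-read rate.
    return 1.0 + read_rate * (num_batches - 1)
```

For a 7-batch job this gives about 1.6 prompt-equivalents instead of 7.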
Implementation:
- New LLMClient.create_message_cached / acreate_message_cached methods
that mark system_prompt and cached_user_content with cache_control:
ephemeral. Tracks cache_creation_input_tokens and
cache_read_input_tokens in usage and applies the right cost rates
(1.25x for writes, 0.1x for reads).
- AgentSingle.run() refactored to build the cached static prefix once,
then loop over batches sending only the per-batch source lines as the
dynamic content. Each batch's parsed rows are appended to
context.draft_outputs / ranking_declarations.
- Per-batch instructions added to the prompt for batched runs ("This is
batch N of M ... output a table for these lines only ... do not
repeat prior batches"). Single-call runs (≤70 lines) skip this note.
- Linguistic summary: kept from the last batch (batched mode) or the
single batch (single mode).
- Per-batch logging of input_tokens / cache_read / cache_creation /
output_tokens / stop_reason for visibility.
Verified end-to-end: N=10/70/100/150/250 produce 1/1/2/3/7 LLM calls
with correct draft counts, and live caching reads the cached prefix on
the second call within the 5-minute TTL.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TM upload-replacement bug (critical):
- Uploads were writing to /storage/clients/<uuid>/tm/... but the pipeline
reads from /storage/amazon/tm/... — replacements were silently ignored
- upload_tm_file now writes to the canonical pipeline path
/storage/amazon/tm/<locale>/flat_<channel>_<lc>.json (overwrites in place)
- Filename casing is preserved when an existing file is being replaced
(the on-disk seeded files use mixed casing: flat_MASS, flat_value,
flat_PrimeSpeed); falls back to CHANNEL_FILE_MAP, then user-typed case
- Registry upsert by (client_id, locale_code, channel): replaces row in
place rather than inserting duplicates
- Verified: replacement file at canonical path, registry COUNT=1, no dupes
Supplementary files now reach the LLM (critical):
- New supplementary_files field on FileManifest
- _resolve_file_manifest scans /storage/jobs/<job_id>/supplementary/ and
populates the manifest, with per-locale gating by filename prefix
(e.g. de-DE_glossary.txt only goes to de-DE; global_brief.txt goes to all)
- _format_supplementary_for_prompt reads each file (.txt/.md/.json/.csv/.tsv
/.docx) and inlines its text into the LLM user message under a
"## SUPPLEMENTARY MATERIAL" header, capped at 40k chars per file
- .docx files are extracted via inline zipfile read (no new dependency)
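A sketch of the per-locale gating by filename prefix. The regex and the
rule "no locale prefix means global" are assumptions chosen to match
the two examples above; the real _resolve_file_manifest may decide
differently.

```python
import re

# Assumed convention: a locale prefix looks like "de-DE_"; anything else is global.
_LOCALE_PREFIX = re.compile(r"^([a-z]{2}-[A-Z]{2})_")


def files_for_locale(filenames: list[str], locale: str) -> list[str]:
    """Keep files prefixed with this locale plus files with no locale prefix."""
    selected = []
    for name in filenames:
        match = _LOCALE_PREFIX.match(name)
        if match is None or match.group(1) == locale:
            selected.append(name)
    return selected
```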
New job wizard:
- Per-supplementary-file locale dropdown ("Global" or one of 12 locales)
- Filename gets prefixed with the locale on upload (de-DE_brief.docx)
Admin TM upload:
- Channel field is now a free-text input with autocomplete suggestions
(datalist of known channels) — lets users add brand-new channels like
PrimeCBM that didn't exist before
Pipeline scaling:
- Bumped dynamic max_tokens tiers: 80+ lines now gets 64k output budget
(was 32k); 132-line briefs no longer truncate. Sonnet 4.6 caps at 64k
- Added stop_reason logging: a "max_tokens" stop now shows up clearly
in the logs instead of the output being truncated silently
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A1 Export columns shifted (critical):
- V25 LLM occasionally emits 12/13-col tables with Copy Type/Char Limit prefix
- Parser now anchors on "Option 1" header position; robust to any prefix shift
- Verified with 23/23 unit tests covering 11/12/13-col variants
- Source-line block in prompt no longer uses pipe separators (defence in depth)
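The anchoring idea in A1 can be sketched like this. The function names
and the "Source" column in the usage example are hypothetical; only the
"Option 1" anchor and the Copy Type / Char Limit prefix columns come
from the notes above. Escaped pipes inside cells are not handled.

```python
def split_row(line: str) -> list[str]:
    """Split a markdown table row into trimmed cells."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def align_to_option_1(header_line: str, row_line: str) -> list[str]:
    """Drop any prefix columns so the returned cells start at "Option 1"."""
    header = split_row(header_line)
    for i, name in enumerate(header):
        if name.lower().startswith("option 1"):
            # Slice the data row at the same position as the anchor column.
            return split_row(row_line)[i:]
    raise ValueError('no "Option 1" column in header')
```

Because the slice starts at the anchor, extra leading columns shift the
index but not the result.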
A2 Linguistic summary fallback:
- Drop the metadata key/value table fallback on Tab 2
- Show "No linguistic summary was generated" when the agent didn't produce one
A3 Dashboard stuck on "Running":
- useJobs / useJob now poll every 5s while any job/locale is in an active state
- Stops polling once everything is COMPLETED or ERROR
B1 TM auto-config: respect empty selection
- Send no TM files when user unchecks all (was auto-adding campaign channel)
- Backend distinguishes empty list vs missing field
B2 Auto-discover channels from TM registry:
- New GET /api/v1/files/tm/channels endpoint reads distinct channels from registry
- Frontend StepConfigure fetches channels per client; falls back to static list
- Pipeline TM resolution falls back to flat_<Channel>_<lc>.json pattern for any
registered channel (no hardcoded map needed for new channels like PrimeCBM)
B3 Job inputs visible on monitoring:
- New "Inputs sent to the agent" card on /jobs/[id] showing AI model, TM files,
supplementary file list, and context override
- New GET /api/v1/jobs/{id}/supplementary endpoint listing on-disk supplementary files
C1 Context cap (large briefs truncating):
- max_tokens scales with source line count (8k/16k/32k/64k by tier)
- 172-line briefs now have ~64k output budget instead of fixed 16k
D1 Reviewer comments in xlsx export:
- Export endpoint now copies xlsx to temp path on download, queries Feedback
joined with User, and appends "Reviewer (Name): comment" to the rationale
cells of options that have feedback
- Original generated file remains untouched
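A sketch of how the rationale text might be extended in D1. The
function name and the newline separator are assumptions; only the
"Reviewer (Name): comment" format is taken from the notes above.

```python
def append_feedback(rationale: str, feedback: list[tuple[str, str]]) -> str:
    """Append one "Reviewer (Name): comment" line per feedback entry."""
    lines = [rationale] if rationale else []
    lines += [f"Reviewer ({name}): {comment}" for name, comment in feedback]
    return "\n".join(lines)
```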
D2 Hide Clients & Voice from sidebar (page still reachable by URL)
D3 Remove dead notifications + settings icons from header
D4 Cost by Locale table added to Analytics with total + avg cost per brief
Makefile seed target now also runs register_storage_files so TM registry is
populated from disk on first setup (deploy.sh already does this via --init).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add `await db.refresh(user)` after `db.flush()` in create_user and
update_user so server-generated `updated_at` is available before
model_validate (async SQLAlchemy cannot lazy-load expired attributes)
- Add DialogDescription to satisfy Radix UI aria requirement
- Wrap form fields in <form> to resolve browser password-not-in-form warning
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add viewer role to backend enum + Alembic migration
- SSO auto-provisioned users now get viewer (lowest privilege) by default
- Wire admin/users page to real API (replace mock data), with add/edit/deactivate
- Fix frontend UserRole enum to match backend (TM_MANAGER, REVIEWER)
- Replace hardcoded mock user in Sidebar with real auth, filter admin-only nav items, wire logout
- Add seed script to set default admins (daveporter, vadymsamoilenko)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Backend: Azure AD JWKS validator with 24h cache, new POST /api/v1/auth/sso/login
endpoint, sso_login() in AuthService with auto-provisioning, password_hash made
nullable, auth_provider column added, Alembic migration c1d2e3f4a5b6
- Frontend: @azure/msal-browser, msal.ts config singleton, ssoLogin() API function,
login page updated with SSO button and redirect callback handling
- Deploy: frontend Dockerfile and docker-compose.prod.yml updated to bake Azure AD
vars into the image at build time; deploy.sh validates SSO config on init/deploy
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The V25 table has duplicate column names (Backtranslation x3, Rationale x3).
The dict-based parser collapsed these — only the last value survived (Option 3's
"N/A"), causing all BT/rationale fields to be "N/A" in the output Excel.
Fixed by switching to positional list-based parsing instead of dicts.
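The collapse is easy to reproduce in isolation. The header and row
below are made-up illustrative values, not real V25 output.

```python
header = ["Option 1", "Backtranslation", "Rationale",
          "Option 2", "Backtranslation", "Rationale"]
row = ["Hallo", "Hello", "literal", "N/A", "N/A", "N/A"]

# Dict-based parsing: later duplicate keys overwrite earlier ones.
as_dict = dict(zip(header, row))

# Positional parsing: keep (name, value) pairs in column order.
as_pairs = list(zip(header, row))
```

Here as_dict["Backtranslation"] is "N/A", while as_pairs[1] still holds
("Backtranslation", "Hello") for Option 1.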
Also adds per-job model selection (Sonnet 4.6 / Opus 4.6) through the full
stack: DB column, API schema, job wizard UI dropdown, pipeline contracts, and
LLM client with model-aware cost tracking. Includes Alembic migration.
Updated help page and README to reflect single-agent pipeline, multi-TM
selection, flat locale grid, model selector, and linguistic summary.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Four changes from user testing feedback:
1. Merge MAIN/DERIVED locale selectors into single 12-locale grid, auto-classify locale_type
2. Add multi-TM channel selection (checkbox grid, tm_channels JSON column, multi-file resolution)
3. Replace 6-agent pipeline with single V25-based agent (feature-flagged via USE_SINGLE_AGENT)
4. Replace Excel Tab 2 metadata with linguistic summary from agent output
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Scans storage/amazon/tm/ and storage/amazon/ref/, creates DB registry
entries for each JSON file so they appear in the TM Registry and
Reference Library pages. Extracts channel from TM filenames, locale
from ref filenames, counts JSONL segments. Idempotent (skips duplicates).
Also added to deploy.sh --init flow.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Both pages were showing hardcoded mock data (PDFs, TMX, DOCX files).
Now they:
- Fetch real data from /files/tm and /files/reference endpoints
- Accept .json/.jsonl uploads (not PDF/TMX)
- Support delete with confirmation
- Auto-select Amazon as the default client
- Show proper upload dialogs with locale/channel/file-type selectors
- Fixed api.ts functions to pass client_id, channel, file_type as
query params (matching backend expectations)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added "Confidence" header, "X rows" count, and "High/Mod/Low" labels
next to each dot so the bar colours have clear meaning at a glance.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The API was returning confidence_high/moderate/low/total_output_rows but
mapJobListResponse was dropping them, so the JobCard never rendered the
confidence bar.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Next.js builds inside Docker's multi-stage builder get cached even when
source files change, causing stale frontends after deploy. Backend still
uses normal caching since Python doesn't have this issue.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Backend: Added confidence_high/moderate/low/total_output_rows to
JobListResponse, computed via a batch query joining output_rows
- Frontend JobCard: Shows a stacked progress bar with green/amber/red
segments and counts for High/Moderate/Low confidence tiers
- Frontend StepConfigure: Auto-selects Amazon as default client when
creating a new job (falls back to first client if Amazon not found)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The pipeline stores tm_entries_cited as a list[str] of seg_keys, but the
Pydantic response schema expected dict[str, Any], causing a validation
error when loading the output preview page.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Only value/mass/onsite/outbound were mapped, so jobs with channel=UEFA
got "Unknown channel" and fell back to no TM matches, causing all LOW
confidence scores.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Removed old import_reference_files.py step from --init (TM/ref files
are now tracked in git, no separate import needed)
- Added file count verification during --init to confirm TM files arrived
- Added --remove-orphans to docker compose commands to prevent stale
containers serving old builds
- Standard deploy now does compose down before up to ensure clean restart
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The storage/amazon/ directory (TM files for 12 locales + reference files)
was excluded by .gitignore, causing the production server to have no TM
data after deployment. Updated .gitignore to track storage/amazon/ so
git pull on the server brings in all 153 TM and reference files.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously the re-run button only appeared on ERROR status locales.
Now it also shows on COMPLETED locales so users can reprocess them
after pipeline fixes without creating a new job.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The rerun endpoint returned 500 because Pydantic tried to serialize
updated_at from a stale SQLAlchemy instance after flush(). Added
db.refresh(instance) to ensure all attributes are loaded.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The compact TM format parser was storing the combined EN+TX text in both
fields, causing the LLM retrieval agent to fail at matching source lines
against TM entries — resulting in all-low confidence tiers. Added
_split_en_tx() heuristic that detects the language boundary at the first
non-ASCII sentence. Also includes raw _text in LLM prompt for context.
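One possible reading of that heuristic, as a sketch only. The sentence
splitting rule and the fallback are assumptions, and the real
_split_en_tx() may differ. A translation made up entirely of ASCII
sentences would not be detected by this version.

```python
import re


def split_en_tx(text: str) -> tuple[str, str]:
    """Split combined text at the first sentence containing a non-ASCII character."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    for i, sentence in enumerate(sentences):
        if not sentence.isascii():
            return " ".join(sentences[:i]), " ".join(sentences[i:])
    # No non-ASCII sentence found: treat everything as source text.
    return text.strip(), ""
```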
Fixed get_jobs_over_time GroupingError by using literal_column for
date_trunc, added date filters to status_breakdown, and fixed Decimal
serialization in locale stats.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix sidebar nav so Dashboard/Monitoring and Audit Trail/System Logs
highlight independently by using useSearchParams to distinguish
query-param-based routes. Fix get_jobs_over_time SQL GroupingError
by using literal_column for date_trunc interval. Add date filters to
status_breakdown query and fix Decimal serialization in locale stats.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace mock chart data on reports page with real backend queries (jobs over
time, locale stats, usage stats, quality metrics). Add audit logging to auth
(login/login_failed), file management (upload/delete TM and reference files),
and feedback submission so the system logs page shows complete activity.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
StepUpload was showing hardcoded "42 Total Lines, 8 Display Formats"
for every file upload. Now:
- Added POST /jobs/validate-source endpoint that parses xlsx in a
temp file and returns real stats (line count, display formats,
columns found, warnings) without creating any DB records
- Frontend calls validateSource() when user selects a file
- Shows spinner during validation, real results after
- Blocks "Next" if validation fails
- Removed all mock validation data
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix API path: frontend now calls /audit/logs (was /audit)
- Backend eagerly loads User relationship for audit entries
- Backend response includes user_name field instead of just user_id
- Frontend logs page fetches real data with pagination
- Derive INFO/WARN/ERROR levels from action type
- Format details JSON into readable descriptions
- Add loading state and empty state handling
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Feedback was saving to DB but never loaded back on page revisit.
Three-point fix:
- Backend schema: add feedback list to OutputRowResponse
- Backend service: eagerly load feedback relationship in preview query
- Frontend mapper: map latest feedback entry to OutputRow.feedback
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Wire token usage from LLM agents through pipeline context to DB and frontend
- Agents 2 and 4 accumulate input/output tokens and cost into PipelineContext
- job_tasks.py saves token totals to locale instance after pipeline completion
- Monitoring cards show total tokens and estimated cost instead of broken 0/0
- Make feedback highlighting bolder: colored card borders, stronger button states
- Add estimated cost display to dashboard job cards
- Add Help page with full documentation and link in sidebar navigation
- Comprehensive README with ASCII architecture diagrams
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace all stub agents with working Claude API-powered agents:
- Agent 2 (TM Retrieval): LLM semantic matching of source lines against TM entries
- Agent 3 (Ranker): Deterministic ranking with confidence tiers (high/moderate/low)
- Agent 4 (Transcreator): Batched creative transcreation with voice profiles, reference files, backtranslations
- Agent 5 (Compliance): Deterministic checks for character limits, blacklist terms, domain substitution
Also fixes TM file loader to handle real compact JSONL format (locale code regex-based parsing),
and adds file manifest resolution for reference files (glossary, blacklist, TOV, locale considerations).
Verified end-to-end: 53-line de-DE brief produces real German translations with TM matching,
confidence-based option counts (1/2/3), backtranslations, and compliance validation. ~$0.49 total cost.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix download URL to match backend route (/output/jobs/.../export)
- Add onClick handlers for download buttons in LocaleInstanceCard and review page
- Wire FeedbackButtons to POST /output/feedback with correct schema
- Replace mock data in review page with real getOutputPreview API call
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Job wizard now calls real API: create job → upload source → launch
- Dashboard and monitoring pages use live data instead of mock data
- Monitoring page polls every 3s while job is active
- Backend enriches job responses with client_name, created_by_name,
source_line_count from eager-loaded relationships
- Frontend response mappers handle backend→frontend type differences
(lowercase enum values, field name mapping, computed progress/stage)
- Source file parser accepts column aliases (Line type, Context notes)
with case-insensitive matching for real-world Excel files
- Clients list endpoint accessible to all authenticated users
- Fixed uploadSource to use PUT, uploadSupplementary per-file
- Removed all hardcoded mock data from useJobs hook
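The alias matching in the source file parser could look roughly like
this. The alias table and the canonical field names are hypothetical;
only the "Line type" and "Context notes" spellings are from the list
above.

```python
# Hypothetical alias table: canonical field -> accepted header spellings.
COLUMN_ALIASES = {
    "line_type": {"line type", "line_type"},
    "context_notes": {"context notes", "context_notes"},
}


def map_headers(headers: list[str]) -> dict[str, int]:
    """Return canonical field -> column index, matching headers case-insensitively."""
    mapping: dict[str, int] = {}
    for index, header in enumerate(headers):
        key = header.strip().lower()
        for field, aliases in COLUMN_ALIASES.items():
            # First matching column wins if a header appears twice.
            if key in aliases and field not in mapping:
                mapping[field] = index
    return mapping
```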
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fonts and logos were not loading on the /amazon-transcreation subpath
deployment because CSS @font-face used absolute /fonts/ paths and Image
src used bare /amazon-logo.svg — neither respects Next.js basePath.
Migrated fonts to next/font/local (bundles into _next/static with
correct assetPrefix) and prepend NEXT_PUBLIC_BASE_PATH to logo srcs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixes TypeScript build error where JWT claims role (string) was
assigned to User.role (UserRole enum).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Apache reverse proxy config (replaces nginx — server already runs Apache)
- Next.js basePath set to /amazon-transcreation for subpath deployment
- Frontend on port 3050 (3000 taken), backend on 8040
- WebSocket URL auto-detects protocol from page location
- Deploy script handles Apache config injection into existing vhost
- All Docker ports bound to 127.0.0.1 (Apache handles external access)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- deploy.sh: one-command deploy script (--init for first time, bare for updates)
- docker-compose.prod.yml: production stack with nginx, multi-worker uvicorn, no volume mounts for code
- nginx/nginx.conf: reverse proxy with rate limiting, WebSocket support, static asset caching
- Fix login to use real backend API instead of mock localStorage tokens
- Add auth guard to AppShell (prevents flash-of-content on unauthenticated routes)
- JWT claims decoded client-side for user info (no extra /me call needed)
- Switch logo from missing .jpeg to .svg
- Frontend API URL defaults to same-origin (works behind nginx without CORS)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>