amazon-transcreation

History

DJP 100eddbc21 Switch LLM calls to streaming + tighten batch sizes

The Anthropic SDK refuses non-streaming calls expected to take >10 minutes ("Streaming is required..."). Long-output batches (32k tokens of densely-formatted markdown) hit this on real 172-line briefs.

Both LLMClient.create_message and create_message_cached now use the streaming context manager (client.messages.stream(...)) and accumulate text chunks; final usage + stop_reason come from get_final_message(). No timeout on streaming requests.

Tightened the batch tier so individual streams stay well under any ceiling and progress / failure recovery is more granular:

- ≤50 lines: single call
- 51-120: batches of 30 (max_tokens=16k each)
- 121+: batches of 25 (max_tokens=16k each)

Verified with the 172-line case: 7 batches of 25, 172 drafts produced. Live streaming call confirmed end-to-end (haiku returned, usage and stop_reason populated correctly).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-06 12:20:16 -04:00
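A minimal sketch of the two ideas in the commit message above, not the repository's actual code. `plan_batches` and `stream_text` are hypothetical names; the tier thresholds are taken from the message, and the streaming helper assumes a client object exposing `messages.stream(...)` as a context manager with `text_stream` and `get_final_message()`, as the message describes.

```python
from typing import Any, List, Tuple


def plan_batches(n_lines: int) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs following the tiers in the commit message.

    Hypothetical helper: <=50 lines is one call, 51-120 uses batches of 30,
    121+ uses batches of 25.
    """
    if n_lines <= 0:
        return []
    if n_lines <= 50:
        size = n_lines  # single call covers everything
    elif n_lines <= 120:
        size = 30
    else:
        size = 25
    return [(start, min(start + size, n_lines)) for start in range(0, n_lines, size)]


def stream_text(client: Any, **request: Any) -> Tuple[str, Any, Any]:
    """Accumulate streamed text chunks, then read usage and stop_reason.

    `client` is assumed to behave like an Anthropic SDK client; `request`
    is passed through unchanged (model, max_tokens, messages, ...).
    """
    chunks: List[str] = []
    with client.messages.stream(**request) as stream:
        for text in stream.text_stream:
            chunks.append(text)
        # The final message carries the usage and stop_reason fields.
        final = stream.get_final_message()
    return "".join(chunks), final.usage, final.stop_reason
```

With these tiers, a 172-line brief splits into six full batches of 25 plus one batch of 22, matching the "7 batches" figure reported above.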
alembic	Implement user management: viewer role, real API wiring, admin sidebar	2026-04-15 18:37:16 +01:00
app	Switch LLM calls to streaming + tighten batch sizes	2026-05-06 12:20:16 -04:00
tests	feat: complete Phase 1-2 scaffold — backend, frontend, pipeline skeleton	2026-04-10 12:31:43 -04:00
alembic.ini	feat: complete Phase 1-2 scaffold — backend, frontend, pipeline skeleton	2026-04-10 12:31:43 -04:00
Dockerfile	feat: complete Phase 1-2 scaffold — backend, frontend, pipeline skeleton	2026-04-10 12:31:43 -04:00
requirements.txt	feat: complete Phase 1-2 scaffold — backend, frontend, pipeline skeleton	2026-04-10 12:31:43 -04:00