amazon-transcreation/backend/app
DJP 100eddbc21 Switch LLM calls to streaming + tighten batch sizes
The Anthropic SDK refuses non-streaming calls that it expects to take
more than 10 minutes ("Streaming is required..."). Long-output batches
(32k tokens of densely formatted markdown) hit this limit on real
172-line briefs.

Both LLMClient.create_message and create_message_cached now use the
streaming context manager (client.messages.stream(...)) and accumulate
text chunks; final usage + stop_reason come from get_final_message().
No timeout on streaming requests.
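
A minimal sketch of the streaming pattern described above, not the actual LLMClient code. The names `StreamResult` and `stream_message` are hypothetical; the only SDK calls assumed are `client.messages.stream(...)`, the stream's `text_stream` iterator, and `get_final_message()`. The client is passed in so the helper can be exercised without network access.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class StreamResult:
    """Hypothetical container for what a streamed call yields."""
    text: str
    stop_reason: Optional[str]
    input_tokens: int
    output_tokens: int


def stream_message(client: Any, **request: Any) -> StreamResult:
    """Run one streamed Messages call and accumulate the text chunks.

    `client` is expected to look like `anthropic.Anthropic()`; `request`
    carries the usual keyword arguments (model, max_tokens, messages, ...).
    """
    chunks: List[str] = []
    with client.messages.stream(**request) as stream:
        # Text deltas arrive incrementally; collect them in order.
        for text in stream.text_stream:
            chunks.append(text)
        # Usage and stop_reason are only complete on the final message.
        final = stream.get_final_message()
    return StreamResult(
        text="".join(chunks),
        stop_reason=final.stop_reason,
        input_tokens=final.usage.input_tokens,
        output_tokens=final.usage.output_tokens,
    )
```

With the real SDK this would be called roughly as `stream_message(anthropic.Anthropic(), model=..., max_tokens=16000, messages=[...])`; the model name is left elided because the commit does not state it.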

Tightened the batch tiers so each stream stays well under any ceiling
and progress reporting and failure recovery are more granular:

- ≤50 lines:    single call
- 51-120 lines: batches of 30 (max_tokens=16k each)
- 121+ lines:   batches of 25 (max_tokens=16k each)

Verified with the 172-line case: 7 batches of 25, 172 drafts produced.
Live streaming call confirmed end-to-end (haiku returned, usage and
stop_reason populated correctly).
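
The tiers listed above can be sketched as a small planning function. This is an illustration under assumptions, not the pipeline's code: `batch_size_for` and `plan_batches` are hypothetical names, and lines are assumed to be split into consecutive chunks with a shorter final batch.

```python
from typing import List


def batch_size_for(total_lines: int) -> int:
    """Pick the batch size for a brief, following the tiers in this commit."""
    if total_lines <= 50:
        return total_lines  # single call covers the whole brief
    if total_lines <= 120:
        return 30
    return 25


def plan_batches(total_lines: int) -> List[range]:
    """Split line indices 0..total_lines-1 into consecutive batches."""
    if total_lines <= 0:
        return []
    size = batch_size_for(total_lines)
    return [
        range(start, min(start + size, total_lines))
        for start in range(0, total_lines, size)
    ]


if __name__ == "__main__":
    # The 172-line case: six batches of 25 and a final batch of 22.
    print([len(batch) for batch in plan_batches(172)])
```

Under these assumptions the 172-line brief yields 7 batches, matching the verification above: the first six carry 25 lines each and the last carries the remaining 22.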

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 12:20:16 -04:00
..
Name | Last commit | Date
api | Round 2 feedback: parser fix, dynamic max_tokens, polling, TM auto-discovery, reviewer comments in export | 2026-05-04 16:12:47 -04:00
auth | Implement user management: viewer role, real API wiring, admin sidebar | 2026-04-15 18:37:16 +01:00
llm | Switch LLM calls to streaming + tighten batch sizes | 2026-05-06 12:20:16 -04:00
models | Implement user management: viewer role, real API wiring, admin sidebar | 2026-04-15 18:37:16 +01:00
pipeline | Switch LLM calls to streaming + tighten batch sizes | 2026-05-06 12:20:16 -04:00
schemas | Implement user management: viewer role, real API wiring, admin sidebar | 2026-04-15 18:37:16 +01:00
services | Round 2.5 feedback: TM replacements take effect, supplementary files reach LLM, larger briefs fit, free-text channel uploads | 2026-05-05 14:28:20 -04:00
tasks | Round 2.5 feedback: TM replacements take effect, supplementary files reach LLM, larger briefs fit, free-text channel uploads | 2026-05-05 14:28:20 -04:00
ws | feat: complete Phase 1-2 scaffold — backend, frontend, pipeline skeleton | 2026-04-10 12:31:43 -04:00
__init__.py | feat: complete Phase 1-2 scaffold — backend, frontend, pipeline skeleton | 2026-04-10 12:31:43 -04:00
config.py | Add Azure AD MSAL SSO (SPA token exchange) | 2026-04-15 18:08:46 +01:00
dependencies.py | feat: complete Phase 1-2 scaffold — backend, frontend, pipeline skeleton | 2026-04-10 12:31:43 -04:00
main.py | feat: complete Phase 1-2 scaffold — backend, frontend, pipeline skeleton | 2026-04-10 12:31:43 -04:00