Vadym Samoilenko 922ea3c377 Fix model IDs, hangs, deploy script, Docker healthchecks

MODELS (Block B):
- llm_factory.py: replace hardcoded model strings with env vars
  OPENAI_CHAT_MODEL (gpt-5.4-2026-03-05), ANTHROPIC_CHAT_MODEL (claude-sonnet-4-6),
  GEMINI_CHAT_MODEL (gemini-3.1-pro-preview), GEMINI_FLASH_MODEL (gemini-3-flash-preview)
- Fix broken IDs from fc17994: gemini-3-1-pro-preview → gemini-3.1-pro-preview,
  gemini-3-1-flash-live-preview → gemini-3-flash-preview, gpt-5.4 → gpt-5.4-2026-03-05
- Replace gpt-4.1 hardcodes in audio.py + utils.py with OPENAI_LEGACY_MODEL
- Replace hardcoded claude-sonnet-4-6 in studio_generators.py PPTX-from-template
- Replace hardcoded gemini model in gemini_video.py
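
A minimal sketch of what env-driven model selection could look like. The environment variable names and model IDs come from the list above; treating the parenthesised IDs as fallback defaults is an assumption, and the `DEFAULT_MODELS` dict and `resolve_model` helper are hypothetical names, not the actual contents of llm_factory.py.

```python
import os

# Env var names and IDs are from the commit message; using them as
# fallback defaults is an assumption made for this sketch.
DEFAULT_MODELS = {
    "OPENAI_CHAT_MODEL": "gpt-5.4-2026-03-05",
    "ANTHROPIC_CHAT_MODEL": "claude-sonnet-4-6",
    "GEMINI_CHAT_MODEL": "gemini-3.1-pro-preview",
    "GEMINI_FLASH_MODEL": "gemini-3-flash-preview",
}


def resolve_model(env_var, environ=None):
    """Return the model ID for env_var, falling back to the default.

    `environ` is injectable so the lookup can be exercised without
    touching the real process environment.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(env_var, "").strip()
    # An unset or blank variable falls back to the documented default.
    return value or DEFAULT_MODELS[env_var]
```

With this shape, a call site would use `resolve_model("GEMINI_CHAT_MODEL")` instead of a string literal, so a mistyped ID can be corrected through the environment rather than a code change.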

HANGS (Block C):
- llm_factory.py: add timeout=LLM_TIMEOUT_SECONDS to Gemini branches (was missing)
- pipeline_manager.py: wrap aquery in asyncio.wait_for (LLAMA_QUERY_TIMEOUT, 120s)
- chat.py: wrap query_notebook_pipeline in asyncio.wait_for (CHAT_QUERY_TIMEOUT, 130s),
  send {"type":"error"} to client on timeout instead of hanging the WebSocket
- background_tasks.py: on startup mark IN_PROGRESS tasks as FAILED ("orphaned on restart")
- api.ts: add axios timeout 60s (was 0 = infinite)
- queryClient.ts: retry:1 + exponential retryDelay (was retry:3)
- notebooks/[id]/page.tsx: podcast poll only while status=processing (was always 5s)
- docker-compose.yml: healthchecks for all services + depends_on service_healthy conditions
- backend/Dockerfile: add --proxy-headers, --timeout-keep-alive 65, --ws-ping-interval and --ws-ping-timeout
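
The chat.py change above (bound the query with a timeout, then tell the client instead of hanging) can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: `answer_or_error`, the `send` callback, the `"answer"` message type and the `"message"` field are hypothetical; only `asyncio.wait_for`, the 130s value and the `{"type":"error"}` payload come from the commit message.

```python
import asyncio
import json

CHAT_QUERY_TIMEOUT = 130  # seconds, value taken from the commit message


async def answer_or_error(query_coro, send, timeout=CHAT_QUERY_TIMEOUT):
    """Await query_coro with a deadline and report the outcome via send().

    `send` is any async callable that accepts a JSON string, for example
    a WebSocket's text-send method (hypothetical wiring).
    """
    try:
        result = await asyncio.wait_for(query_coro, timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for cancels the inner task on timeout; the client gets an
        # explicit error frame instead of a silent, stuck connection.
        await send(json.dumps({"type": "error", "message": "query timed out"}))
        return None
    await send(json.dumps({"type": "answer", "content": result}))
    return result
```

Keeping the outer chat timeout (130s) slightly above the inner query timeout (120s) presumably lets the inner layer fail first, so the outer one acts only as a backstop.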

DEPLOY (Block D):
- scripts/deploy.sh: idempotent rolling redeploy (git pull → build → migrate → up → health)
- scripts/rollback.sh: revert to any git SHA
- scripts/README.md: usage table
- .dockerignore: root-level (was missing)
- Retire legacy one-shot scripts → Old Readmes/
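
The final "health" step of the redeploy sequence above amounts to polling until the services report healthy or a retry budget runs out. The sketch below shows that idea only; the real deploy.sh is a shell script whose contents are not shown here, and `wait_until_healthy`, `probe`, and the attempt and delay values are all hypothetical.

```python
import time


def wait_until_healthy(probe, attempts=10, delay=3.0, sleep=time.sleep):
    """Return True as soon as probe() is truthy, False once attempts run out.

    `probe` is any zero-argument callable, e.g. one that requests a
    health endpoint; exceptions from it count as "not healthy yet".
    """
    for attempt in range(1, attempts + 1):
        try:
            if probe():
                return True
        except Exception:
            # A refused connection during container start-up is expected.
            pass
        if attempt < attempts:
            sleep(delay)
    return False
```

A deploy step built this way can fail loudly when the function returns False, which is the point at which a rollback would be considered.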

DOCS (Block E): Update CLAUDE.md models table + deploy section with new env vars

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-24 14:25:30 +01:00


scripts/

| Script | Purpose | Run frequency |
| --- | --- | --- |
| `deploy.sh` | Rolling redeploy: git pull → build → migrate DB → up -d → health check | Every push to prod |
| `rollback.sh <sha>` | Revert to a previous commit and rebuild | Emergency only |

Deploy

# SSH into optical-web-1 and run:
ssh michael_clervi@optical-web-1
cd /opt/sandbox-notebookllamalm-nextjs
sudo bash scripts/deploy.sh

# Flags:
#   --no-build       restart containers without rebuilding (for env-only changes)
#   --backend-only   rebuild + restart backend only
#   --frontend-only  rebuild + restart frontend only
#   --branch feat/x  deploy a specific branch

Rollback

# Find the SHA you want:
git log --oneline -10

# Roll back:
sudo bash scripts/rollback.sh abc1234

Historical scripts (do not run)

Old one-shot systemd→Docker migration scripts are in `Old Readmes/migration-2026-03/`. Old pre-Docker systemd scripts are in `Old Readmes/pre-docker-systemd/`.
