Vadym Samoilenko 922ea3c377 Fix model IDs, hangs, deploy script, Docker healthchecks

MODELS (Block B):
- llm_factory.py: replace hardcoded model strings with env vars
  OPENAI_CHAT_MODEL (gpt-5.4-2026-03-05), ANTHROPIC_CHAT_MODEL (claude-sonnet-4-6),
  GEMINI_CHAT_MODEL (gemini-3.1-pro-preview), GEMINI_FLASH_MODEL (gemini-3-flash-preview)
- Fix broken IDs from fc17994: gemini-3-1-pro-preview → gemini-3.1-pro-preview,
  gemini-3-1-flash-live-preview → gemini-3-flash-preview, gpt-5.4 → gpt-5.4-2026-03-05
- Replace gpt-4.1 hardcodes in audio.py + utils.py with OPENAI_LEGACY_MODEL
- Replace hardcoded claude-sonnet-4-6 in studio_generators.py PPTX-from-template
- Replace hardcoded gemini model in gemini_video.py

HANGS (Block C):
- llm_factory.py: add timeout=LLM_TIMEOUT_SECONDS to Gemini branches (was missing)
- pipeline_manager.py: wrap aquery in asyncio.wait_for(timeout=LLAMA_QUERY_TIMEOUT=120s)
- chat.py: wrap query_notebook_pipeline in asyncio.wait_for(CHAT_QUERY_TIMEOUT=130s),
  send {"type":"error"} to client on timeout instead of hanging WS
- background_tasks.py: on startup mark IN_PROGRESS tasks as FAILED ("orphaned on restart")
- api.ts: add axios timeout 60s (was 0 = infinite)
- queryClient.ts: retry:1 + exponential retryDelay (was retry:3)
- notebooks/[id]/page.tsx: podcast poll only while status=processing (was always 5s)
- docker-compose.yml: healthchecks for all services + depends_on service_healthy conditions
- backend/Dockerfile: add --proxy-headers --timeout-keep-alive 65 --ws-ping-interval/timeout

DEPLOY (Block D):
- scripts/deploy.sh: idempotent rolling redeploy (git pull → build → migrate → up → health)
- scripts/rollback.sh: revert to any git SHA
- scripts/README.md: usage table
- .dockerignore: root-level (was missing)
- Retire legacy one-shot scripts → Old Readmes/

DOCS (Block E): Update CLAUDE.md models table + deploy section with new env vars

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-24 14:25:30 +01:00

8.1 KiB


CLAUDE.md — Developer Guide for AI Assistants

Project Overview

Sandbox-NotebookLM is a self-hosted alternative to Google NotebookLM. FastAPI backend + Next.js 15 frontend, deployed via Docker Compose on optical-web-1.

Live: https://ai-sandbox.oliver.solutions/notebookllama/
Backend API: https://ai-sandbox.oliver.solutions/notebookllama-back/api/docs

Architecture

backend/  (FastAPI + Python 3.13, uv package manager)
  src/
    api/
      main.py              — FastAPI app, mounts static files, filters polling logs
      routes/
        auth.py            — signup, login, Microsoft SSO
        notebooks.py       — CRUD, synthesis, podcast, sharing, Studio endpoints
        documents.py       — upload, task status, summaries
        chat.py            — WebSocket chat
        admin.py           — admin dashboard
    notebookllama/
      database.py          — SQLAlchemy models + get_db() / get_db_session()
      studio_generators.py — 7 LLM generators (flashcards, quiz, mindmap, slides, report, infographic, datatable)
      audio.py             — ElevenLabs podcast generation (saves to PODCAST_DATA_DIR)
      background_tasks.py  — threaded task queue
      notebook_synthesis.py
      pipeline_manager.py
      llm_factory.py       — get_llm_by_type(), get_structured_llm()

frontend/  (Next.js 15 App Router, React 19, TypeScript, Tailwind 4)
  src/
    app/
      notebooks/[id]/page.tsx  — main notebook page (~1600 lines, Studio section + modal)
    lib/api.ts                 — axios client + all API calls
    types/index.ts             — TypeScript interfaces
    store/authStore.ts         — Zustand auth (persisted to localStorage as 'auth-storage')

Docker Deployment

Server: optical-web-1 at /opt/sandbox-notebookllamalm-nextjs

# Standard deploy (git pull + build + up + health check)
ssh michael_clervi@optical-web-1
cd /opt/sandbox-notebookllamalm-nextjs
sudo bash scripts/deploy.sh

# Rebuild only backend or frontend
sudo bash scripts/deploy.sh --backend-only
sudo bash scripts/deploy.sh --frontend-only

# Restart without rebuild (env-only change)
sudo bash scripts/deploy.sh --no-build

# Roll back to a specific git SHA
sudo bash scripts/rollback.sh abc1234

# Manual: check logs
docker compose logs backend --tail=50
docker compose logs frontend --tail=50

# Run Python in backend container
docker compose exec backend /app/.venv/bin/python -c "..."

# DB migration (auto-runs in deploy.sh; manual fallback)
docker compose exec backend /app/.venv/bin/python -c \
  "import sys; sys.path.insert(0, '/app/src/notebookllama'); from database import run_studio_migration; run_studio_migration(); print('Done')"

Important:

git pull must NOT use sudo; file operations in /opt/ DO need sudo
Health endpoint: GET /api/health (not /health)
Backend uses venv at /app/.venv/ — always call /app/.venv/bin/python
Frontend env vars (NEXT_PUBLIC_*) are baked into the build — frontend rebuild needed if they change
deploy.sh runs DB migration automatically (idempotent)
Repo is on Bitbucket: git@bitbucket.org:zlalani/sandbox-notebookllamalm-nextjs.git
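
The health-check note above can be sketched as a small polling helper. This is an illustration only, not the logic in deploy.sh: the function names, the retry counts, and the assumption that a healthy backend answers GET /api/health with HTTP 200 are all hypothetical.

```python
import time
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status for url, or 0 if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except OSError:
        # Covers connection errors, timeouts, and urllib's HTTPError (4xx/5xx).
        return 0


def wait_until_healthy(
    base_url: str,
    attempts: int = 10,
    delay: float = 3.0,
    probe: Callable[[str], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll <base_url>/api/health until it returns 200 or attempts run out."""
    # Note the path: /api/health, not /health.
    url = base_url.rstrip("/") + "/api/health"
    for attempt in range(attempts):
        if probe(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Against the local development server this would be called as wait_until_healthy("http://localhost:9000"). The probe and sleep parameters exist only so the loop can be exercised without a running backend.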

Database

PostgreSQL in Docker (sandbox-nextjs-postgres). Credentials in backend/.env:

pgql_user, pgql_psw, pgql_db
Host inside Docker: postgres:5432
Host from outside Docker: localhost:5433

Schema (9 tables plus one PostgreSQL enum):

users — auth_provider, is_admin, is_suspended
notebooks — synthesis_data TEXT, studio_data TEXT, podcast_path
documents — llamacloud_file_id, pipeline_id
notebook_documents — junction table
document_summaries — summary, highlights (JSON), questions (JSON), answers (JSON)
chat_sessions — is_shared, notebook_id
chat_messages — role, content, sources
document_shares — permission_level enum (READ/WRITE/SHARE/ADMIN)
background_tasks — status, task_type
(PostgreSQL enum: permissionlevel)

Adding new columns: Add to model in database.py + add ALTER TABLE ... ADD COLUMN IF NOT EXISTS in run_studio_migration(), then call it in the container.
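
A minimal sketch of the migration half of that workflow, assuming run_studio_migration() simply issues one idempotent statement per column. The real function body is not shown in this guide, so add_column_sql, STUDIO_COLUMNS, and the execute callback are hypothetical names.

```python
import re
from typing import Callable, Iterable, Tuple

# Plain lowercase identifiers only, since the names are interpolated into SQL.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def add_column_sql(table: str, column: str, sql_type: str) -> str:
    """Build an idempotent PostgreSQL ADD COLUMN statement."""
    for name in (table, column):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return f"ALTER TABLE {table} ADD COLUMN IF NOT EXISTS {column} {sql_type}"


# Hypothetical column list; new columns would be appended here.
STUDIO_COLUMNS = [
    ("notebooks", "studio_data", "TEXT"),
]


def run_migration(
    execute: Callable[[str], None],
    columns: Iterable[Tuple[str, str, str]] = STUDIO_COLUMNS,
) -> None:
    """Run every ADD COLUMN statement through the supplied execute callback."""
    for table, column, sql_type in columns:
        execute(add_column_sql(table, column, sql_type))
```

Because each statement uses IF NOT EXISTS, running the migration twice is harmless, which is what lets deploy.sh call it on every deploy.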

Studio Features

Seven output types generated from document summaries:

| Type | Endpoint | Generator |
| --- | --- | --- |
| Flashcards | POST `/notebooks/{id}/studio/flashcards` | `generate_flashcards()` |
| Quiz | POST `/notebooks/{id}/studio/quiz` | `generate_quiz()` |
| Mind Map | POST `/notebooks/{id}/studio/mindmap` | `generate_mindmap()` |
| Slides (PPTX) | POST `/notebooks/{id}/studio/slides` | `generate_slides()` |
| Report (PDF) | POST `/notebooks/{id}/studio/report` | `generate_report()` |
| Infographic | POST `/notebooks/{id}/studio/infographic` | `generate_infographic()` |
| Data Table | POST `/notebooks/{id}/studio/datatable` | `generate_datatable()` |

All results are stored as JSON in notebooks.studio_data TEXT column. Download endpoints:

GET /notebooks/{id}/studio/slides/download → PPTX (python-pptx)
GET /notebooks/{id}/studio/report/download → PDF (weasyprint)
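
Since all seven output types share the single notebooks.studio_data TEXT column, a new result has to be merged into the existing JSON rather than overwrite it. The sketch below assumes the column holds one JSON object keyed by output type; that layout and the function name are assumptions, not confirmed by this guide.

```python
import json
from typing import Optional


def merge_studio_result(studio_data: Optional[str], output_type: str, result: dict) -> str:
    """Return updated JSON text with result stored under output_type.

    studio_data is the current column value (None or empty for a new notebook).
    Other output types already present are left untouched.
    """
    data = json.loads(studio_data) if studio_data else {}
    data[output_type] = result
    return json.dumps(data)
```

For example, merging a quiz into a column that already contains flashcards leaves the flashcards entry in place and adds or replaces only the quiz entry.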

LLM routing: OpenAI/GPT → get_structured_llm() with Pydantic output; Claude/Gemini → manual JSON schema injection via achat().
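
For the Claude/Gemini branch, "manual JSON schema injection" can be pictured roughly as follows. This is a stdlib-only sketch, not the code in studio_generators.py: build_schema_prompt and parse_json_reply are hypothetical helpers, and the actual achat() call is left out.

```python
import json

# Markdown code-fence marker, built from single backticks so this example
# does not contain a literal fence.
FENCE = "`" * 3


def build_schema_prompt(task: str, schema: dict) -> str:
    """Append a JSON Schema to the task so the model replies with matching JSON."""
    return (
        f"{task}\n\n"
        "Respond with a single JSON object that conforms to this JSON Schema, "
        "with no surrounding prose:\n"
        f"{json.dumps(schema, indent=2)}"
    )


def parse_json_reply(reply: str) -> dict:
    """Parse a model reply, tolerating an optional Markdown code fence around it."""
    text = reply.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (which may carry a language tag).
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # Drop the closing fence if present.
        if text.rstrip().endswith(FENCE):
            text = text.rstrip()[: -len(FENCE)]
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data
```

The OpenAI branch does not need this step because get_structured_llm() returns Pydantic-validated output directly.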

AI Models

Model IDs are configurable via env vars in backend/.env (no rebuild needed to change them):

| Alias key | Env var | Default model ID | Provider |
| --- | --- | --- | --- |
| `gpt54-exp` | `OPENAI_CHAT_MODEL` | `gpt-5.4-2026-03-05` | OpenAI (default for new notebooks) |
| `claude46-exp` | `ANTHROPIC_CHAT_MODEL` | `claude-sonnet-4-6` | Anthropic |
| `gemini31-exp` | `GEMINI_CHAT_MODEL` | `gemini-3.1-pro-preview` | Google |
| `gemini31-flash` | `GEMINI_FLASH_MODEL` | `gemini-3-flash-preview` | Google (fastest/cheapest) |
| `gpt4o` | (hardcoded) | `gpt-4o` | OpenAI stable |
| `gpt4` | (hardcoded) | `gpt-4` | OpenAI legacy |
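
The table can be read as a lookup from alias key to model ID, with the environment taking precedence over the default. The sketch below mirrors the table but is not the code in llm_factory.py; resolve_model_id and the two dictionaries are hypothetical.

```python
import os
from typing import Mapping, Optional

# alias -> (env var, default model ID), copied from the table above
_ENV_MODELS = {
    "gpt54-exp": ("OPENAI_CHAT_MODEL", "gpt-5.4-2026-03-05"),
    "claude46-exp": ("ANTHROPIC_CHAT_MODEL", "claude-sonnet-4-6"),
    "gemini31-exp": ("GEMINI_CHAT_MODEL", "gemini-3.1-pro-preview"),
    "gemini31-flash": ("GEMINI_FLASH_MODEL", "gemini-3-flash-preview"),
}

# Aliases with no env override
_FIXED_MODELS = {"gpt4o": "gpt-4o", "gpt4": "gpt-4"}


def resolve_model_id(alias: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the model ID for alias, preferring a non-empty env var value."""
    env = os.environ if env is None else env
    if alias in _FIXED_MODELS:
        return _FIXED_MODELS[alias]
    if alias not in _ENV_MODELS:
        raise KeyError(f"unknown model alias: {alias}")
    var, default = _ENV_MODELS[alias]
    # An unset or empty variable falls back to the default.
    return env.get(var) or default
```

Reading the variable at call time rather than at import time is one way to make an edit to backend/.env take effect after a restart without a rebuild.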

Additional env overrides:

OPENAI_LEGACY_MODEL — model used for podcast script + LlamaCloud query helper (defaults to OPENAI_CHAT_MODEL)
LLM_TIMEOUT_SECONDS — LLM call timeout in seconds (default: 900)
LLAMA_QUERY_TIMEOUT — LlamaCloud aquery timeout in seconds (default: 120)
CHAT_QUERY_TIMEOUT — WebSocket chat query timeout in seconds (default: 130)
TTS_TIMEOUT — ElevenLabs TTS timeout in seconds (default: 300)
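
The two query timeouts are nested: the chat handler's outer limit (130 s) is slightly longer than the inner LlamaCloud limit (120 s), so the inner call normally fails first and the outer one acts as a backstop. A rough sketch of the outer wrapper, assuming asyncio.wait_for is used as the commit notes describe; the helper names and the "message", "answer", and "content" payload fields are hypothetical (only the "error" type is documented).

```python
import asyncio
import os
from typing import Awaitable, Callable, Mapping, Optional


def timeout_from_env(name: str, default: float, env: Optional[Mapping[str, str]] = None) -> float:
    """Read a positive timeout in seconds from the environment, else default."""
    env = os.environ if env is None else env
    try:
        value = float(env.get(name, ""))
    except ValueError:
        return default
    return value if value > 0 else default


async def answer_or_error(query: Callable[[], Awaitable[str]], timeout: float) -> dict:
    """Await query() with a deadline; return an error payload instead of hanging."""
    try:
        answer = await asyncio.wait_for(query(), timeout=timeout)
    except asyncio.TimeoutError:
        # This payload would be sent to the WebSocket client.
        return {"type": "error", "message": f"query timed out after {timeout:g}s"}
    return {"type": "answer", "content": answer}
```

In the chat route this would be called with timeout_from_env("CHAT_QUERY_TIMEOUT", 130.0), and the returned dictionary sent to the client as JSON.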

Frontend Patterns

Auth token: stored in localStorage['auth-storage'] as Zustand state → state.token
API client: frontend/src/lib/api.ts — axios with JWT interceptor + 401 redirect
Data fetching: React Query (useQuery for reads, useMutation for writes)
WebSocket chat: chatAPI.connectWebSocket(notebookId, sessionId) → ws://host/api/chat/ws/{id}?session_id={sid}
Studio mutations: All 7 in studioMutations record inside the notebook page component
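
When smoke-testing the chat endpoint from a script rather than the browser, the WebSocket URL pattern above can be assembled as below. chat_ws_url is a hypothetical helper, and the choice between ws and wss is an assumption that depends on how the host is exposed.

```python
from urllib.parse import quote, urlencode


def chat_ws_url(host: str, notebook_id, session_id, secure: bool = False) -> str:
    """Build ws://host/api/chat/ws/{id}?session_id={sid} with safe escaping."""
    scheme = "wss" if secure else "ws"
    # Escape the ID fully so it cannot introduce extra path segments.
    path = f"/api/chat/ws/{quote(str(notebook_id), safe='')}"
    query = urlencode({"session_id": session_id})
    return f"{scheme}://{host}{path}?{query}"
```

For the local backend, chat_ws_url("localhost:9000", notebook_id, session_id) gives a URL that any WebSocket client can connect to.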

Local Development

# Backend (requires backend/.env with all API keys)
cd backend
uv sync
uv run uvicorn src.api.main:app --host 0.0.0.0 --port 9000 --reload

# Frontend
cd frontend
npm install
npm run dev   # port 4000

Infrastructure (local):

docker compose up -d postgres redis   # starts only DB + cache

Common Issues

| Problem | Fix |
| --- | --- |
| `column notebooks.studio_data does not exist` | Run `run_studio_migration()` in the container |
| Backend 500 on all routes | Container running old image. Rebuild: `docker compose build backend && docker compose up -d backend` |
| `git pull` fails with dubious ownership | `sudo chown -R user:user /opt/sandbox-notebookllamalm-nextjs` |
| `git pull` fails with unstaged changes | `git stash && git pull` |
| Frontend shows old UI after deploy | Frontend container wasn't rebuilt. Run `docker compose build frontend && docker compose up -d frontend` |
| Health check fails in deploy script | Endpoint is `/api/health`, not `/health` |

Knowledge Wiki

A cross-project knowledge base is maintained automatically from all Claude Code sessions.

Index: /Users/aimpress/Library/Mobile Documents/iCloud~md~obsidian/Documents/VadymSamoilenko/wiki/index.md
Query: cd ~/.claude/memory-compiler && uv run python scripts/query.py "your question"
Every session in this project automatically feeds the knowledge base.
