sandbox-notebookllamalm-nextjs/CLAUDE.md
Vadym Samoilenko 922ea3c377 Fix model IDs, hangs, deploy script, Docker healthchecks
MODELS (Block B):
- llm_factory.py: replace hardcoded model strings with env vars
  OPENAI_CHAT_MODEL (gpt-5.4-2026-03-05), ANTHROPIC_CHAT_MODEL (claude-sonnet-4-6),
  GEMINI_CHAT_MODEL (gemini-3.1-pro-preview), GEMINI_FLASH_MODEL (gemini-3-flash-preview)
- Fix broken IDs from fc17994: gemini-3-1-pro-preview → gemini-3.1-pro-preview,
  gemini-3-1-flash-live-preview → gemini-3-flash-preview, gpt-5.4 → gpt-5.4-2026-03-05
- Replace gpt-4.1 hardcodes in audio.py + utils.py with OPENAI_LEGACY_MODEL
- Replace hardcoded claude-sonnet-4-6 in studio_generators.py PPTX-from-template
- Replace hardcoded gemini model in gemini_video.py

HANGS (Block C):
- llm_factory.py: add timeout=LLM_TIMEOUT_SECONDS to Gemini branches (was missing)
- pipeline_manager.py: wrap aquery in asyncio.wait_for(timeout=LLAMA_QUERY_TIMEOUT=120s)
- chat.py: wrap query_notebook_pipeline in asyncio.wait_for(CHAT_QUERY_TIMEOUT=130s),
  send {"type":"error"} to client on timeout instead of hanging WS
- background_tasks.py: on startup mark IN_PROGRESS tasks as FAILED ("orphaned on restart")
- api.ts: add axios timeout 60s (was 0 = infinite)
- queryClient.ts: retry:1 + exponential retryDelay (was retry:3)
- notebooks/[id]/page.tsx: podcast poll only while status=processing (was always 5s)
- docker-compose.yml: healthchecks for all services + depends_on service_healthy conditions
- backend/Dockerfile: add --proxy-headers --timeout-keep-alive 65 --ws-ping-interval/timeout

DEPLOY (Block D):
- scripts/deploy.sh: idempotent rolling redeploy (git pull → build → migrate → up → health)
- scripts/rollback.sh: revert to any git SHA
- scripts/README.md: usage table
- .dockerignore: root-level (was missing)
- Retire legacy one-shot scripts → Old Readmes/

DOCS (Block E): Update CLAUDE.md models table + deploy section with new env vars

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 14:25:30 +01:00

CLAUDE.md — Developer Guide for AI Assistants

Project Overview

Sandbox-NotebookLM is a self-hosted alternative to Google NotebookLM. It pairs a FastAPI backend with a Next.js 15 frontend and is deployed via Docker Compose on optical-web-1.

Live: https://ai-sandbox.oliver.solutions/notebookllama/
Backend API: https://ai-sandbox.oliver.solutions/notebookllama-back/api/docs


Architecture

backend/  (FastAPI + Python 3.13, uv package manager)
  src/
    api/
      main.py              — FastAPI app, mounts static files, filters polling logs
      routes/
        auth.py            — signup, login, Microsoft SSO
        notebooks.py       — CRUD, synthesis, podcast, sharing, Studio endpoints
        documents.py       — upload, task status, summaries
        chat.py            — WebSocket chat
        admin.py           — admin dashboard
    notebookllama/
      database.py          — SQLAlchemy models + get_db() / get_db_session()
      studio_generators.py — 7 LLM generators (flashcards, quiz, mindmap, slides, report, infographic, datatable)
      audio.py             — ElevenLabs podcast generation (saves to PODCAST_DATA_DIR)
      background_tasks.py  — threaded task queue
      notebook_synthesis.py
      pipeline_manager.py
      llm_factory.py       — get_llm_by_type(), get_structured_llm()

frontend/  (Next.js 15 App Router, React 19, TypeScript, Tailwind 4)
  src/
    app/
      notebooks/[id]/page.tsx  — main notebook page (~1600 lines, Studio section + modal)
    lib/api.ts                 — axios client + all API calls
    types/index.ts             — TypeScript interfaces
    store/authStore.ts         — Zustand auth (persisted to localStorage as 'auth-storage')

Docker Deployment

Server: optical-web-1 at /opt/sandbox-notebookllamalm-nextjs

# Standard deploy (git pull + build + DB migration + up + health check)
ssh michael_clervi@optical-web-1
cd /opt/sandbox-notebookllamalm-nextjs
sudo bash scripts/deploy.sh

# Rebuild only backend or frontend
sudo bash scripts/deploy.sh --backend-only
sudo bash scripts/deploy.sh --frontend-only

# Restart without rebuild (env-only change)
sudo bash scripts/deploy.sh --no-build

# Rollback to a specific git SHA (abc1234 is a placeholder)
sudo bash scripts/rollback.sh abc1234

# Manual: check logs
docker compose logs backend --tail=50
docker compose logs frontend --tail=50

# Run Python in backend container
docker compose exec backend /app/.venv/bin/python -c "..."

# DB migration (auto-runs in deploy.sh; manual fallback)
docker compose exec backend /app/.venv/bin/python -c \
  "import sys; sys.path.insert(0, '/app/src/notebookllama'); from database import run_studio_migration; run_studio_migration(); print('Done')"

Important:

  • git pull must NOT use sudo; file operations in /opt/ DO need sudo
  • Health endpoint: GET /api/health (not /health)
  • Backend uses venv at /app/.venv/ — always call /app/.venv/bin/python
  • Frontend env vars (NEXT_PUBLIC_*) are baked into the build — frontend rebuild needed if they change
  • deploy.sh runs DB migration automatically (idempotent)
  • Repo is on Bitbucket: git@bitbucket.org:zlalani/sandbox-notebookllamalm-nextjs.git
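
The health-endpoint note above can be checked from a script. The following is a minimal sketch, not the logic inside deploy.sh: the function name wait_for_health, the retry count, and the delay are assumptions chosen for illustration. It polls GET /api/health and treats anything other than HTTP 200 as "not healthy yet".

```python
import time
import urllib.error
import urllib.request


def wait_for_health(base_url: str, path: str = "/api/health",
                    attempts: int = 5, delay: float = 1.0) -> bool:
    """Poll base_url + path until it answers HTTP 200 or attempts run out.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    url = base_url.rstrip("/") + path
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Service not reachable yet, or the path is wrong
            # (urlopen raises HTTPError for a 404 on /health).
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For local development this could be called as wait_for_health("http://localhost:9000"). Passing path="/health" should return False, which matches the symptom listed under Common Issues.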

Database

PostgreSQL in Docker (sandbox-nextjs-postgres). Credentials in backend/.env:

  • pgql_user, pgql_psw, pgql_db
  • Host inside Docker: postgres:5432
  • Host from outside Docker: localhost:5433

Schema (9 tables plus one PostgreSQL enum):

  • users — auth_provider, is_admin, is_suspended
  • notebooks — synthesis_data TEXT, studio_data TEXT, podcast_path
  • documents — llamacloud_file_id, pipeline_id
  • notebook_documents — junction table
  • document_summaries — summary, highlights (JSON), questions (JSON), answers (JSON)
  • chat_sessions — is_shared, notebook_id
  • chat_messages — role, content, sources
  • document_shares — permission_level enum (READ/WRITE/SHARE/ADMIN)
  • background_tasks — status, task_type
  • (PostgreSQL enum: permissionlevel)

Adding new columns: add the column to the model in database.py, add a matching ALTER TABLE ... ADD COLUMN IF NOT EXISTS statement to run_studio_migration(), then run the migration in the container (deploy.sh also runs it automatically).
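
The idempotent-migration pattern described above can be sketched as follows. This is not the body of run_studio_migration(): the helper names add_column_sql and migration_statements and the STUDIO_COLUMNS list are hypothetical. The sketch only builds the SQL text, and the real function would execute each statement against PostgreSQL.

```python
import re

# Plain SQL identifiers only, so names can be interpolated safely.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def add_column_sql(table: str, column: str, sql_type: str) -> str:
    """Build a PostgreSQL ADD COLUMN statement that is safe to re-run."""
    for name in (table, column):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return f"ALTER TABLE {table} ADD COLUMN IF NOT EXISTS {column} {sql_type}"


# Hypothetical registry: one (table, column, type) entry per added column.
STUDIO_COLUMNS = [("notebooks", "studio_data", "TEXT")]


def migration_statements(columns=STUDIO_COLUMNS) -> list[str]:
    """Return every statement the migration would execute, in order."""
    return [add_column_sql(t, c, ty) for t, c, ty in columns]
```

Because every statement uses IF NOT EXISTS, running the migration on every deploy is harmless once the column is present.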


Studio Features

Seven output types generated from document summaries:

| Type | Endpoint | Generator |
|---|---|---|
| Flashcards | POST /notebooks/{id}/studio/flashcards | generate_flashcards() |
| Quiz | POST /notebooks/{id}/studio/quiz | generate_quiz() |
| Mind Map | POST /notebooks/{id}/studio/mindmap | generate_mindmap() |
| Slides (PPTX) | POST /notebooks/{id}/studio/slides | generate_slides() |
| Report (PDF) | POST /notebooks/{id}/studio/report | generate_report() |
| Infographic | POST /notebooks/{id}/studio/infographic | generate_infographic() |
| Data Table | POST /notebooks/{id}/studio/datatable | generate_datatable() |

All results are stored as JSON in notebooks.studio_data TEXT column. Download endpoints:

  • GET /notebooks/{id}/studio/slides/download → PPTX (python-pptx)
  • GET /notebooks/{id}/studio/report/download → PDF (weasyprint)
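
Since all Studio results share one TEXT column, each generator has to update the stored JSON without clobbering the other outputs. The sketch below shows one way to do that. It assumes the JSON is an object keyed by output type (for example "quiz" or "flashcards"), which this guide does not confirm, and merge_studio_result is a hypothetical helper name.

```python
import json
from typing import Any, Optional


def merge_studio_result(studio_data: Optional[str], kind: str, result: Any) -> str:
    """Return updated JSON text with `result` stored under the key `kind`.

    Assumption: studio_data holds a JSON object keyed by output type.
    An empty or NULL column is treated as an empty object.
    """
    data = json.loads(studio_data) if studio_data else {}
    data[kind] = result  # replaces any earlier result of the same type
    return json.dumps(data, ensure_ascii=False)
```

Calling it twice with different kinds keeps both results, while calling it again with the same kind overwrites only that entry.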

LLM routing: OpenAI/GPT → get_structured_llm() with Pydantic output; Claude/Gemini → manual JSON schema injection via achat().
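
The "manual JSON schema injection" path can be illustrated with a small sketch. It is not the code in studio_generators.py: the names uses_structured_output, schema_prompt, and parse_json_reply are hypothetical, and the prompt wording is an assumption. The idea is to append the JSON Schema to the prompt, then parse the model's reply while tolerating a Markdown code fence around the JSON.

```python
import json
import re

FENCE = "`" * 3  # Markdown code fence marker, built indirectly
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def uses_structured_output(provider: str) -> bool:
    """OpenAI goes through get_structured_llm(); other providers get the schema in the prompt."""
    return provider.lower() == "openai"


def schema_prompt(task: str, schema: dict) -> str:
    """Append a JSON Schema to the task so the model replies with matching JSON."""
    return (
        f"{task}\n\n"
        "Respond with a single JSON object that conforms to this JSON Schema. "
        "Do not add commentary.\n"
        f"{json.dumps(schema, indent=2)}"
    )


def parse_json_reply(text: str) -> dict:
    """Parse a model reply, unwrapping a fenced block if one is present."""
    match = _FENCED.search(text)
    payload = match.group(1) if match else text
    return json.loads(payload.strip())
```

In the Pydantic-based path the schema would typically come from the model class itself. Here it is passed in as a plain dict to keep the sketch dependency-free.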


AI Models

Model IDs are configurable via env vars in backend/.env (no rebuild needed to change them):

| Alias key | Env var | Default model ID | Provider |
|---|---|---|---|
| gpt54-exp | OPENAI_CHAT_MODEL | gpt-5.4-2026-03-05 | OpenAI (default for new notebooks) |
| claude46-exp | ANTHROPIC_CHAT_MODEL | claude-sonnet-4-6 | Anthropic |
| gemini31-exp | GEMINI_CHAT_MODEL | gemini-3.1-pro-preview | Google |
| gemini31-flash | GEMINI_FLASH_MODEL | gemini-3-flash-preview | Google (fastest/cheapest) |
| gpt4o | (hardcoded) | gpt-4o | OpenAI stable |
| gpt4 | (hardcoded) | gpt-4 | OpenAI legacy |
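
The alias-to-model mapping above can be expressed as a lookup with environment overrides. The following is a sketch under assumptions, not the contents of llm_factory.py: resolve_model_id and the two dictionaries are hypothetical names, and reading the environment at call time is an illustrative choice.

```python
import os
from typing import Mapping, Optional

# alias -> (env var, default model ID), taken from the table above
_ENV_MODELS = {
    "gpt54-exp": ("OPENAI_CHAT_MODEL", "gpt-5.4-2026-03-05"),
    "claude46-exp": ("ANTHROPIC_CHAT_MODEL", "claude-sonnet-4-6"),
    "gemini31-exp": ("GEMINI_CHAT_MODEL", "gemini-3.1-pro-preview"),
    "gemini31-flash": ("GEMINI_FLASH_MODEL", "gemini-3-flash-preview"),
}
# aliases whose model ID is hardcoded and cannot be overridden
_FIXED_MODELS = {"gpt4o": "gpt-4o", "gpt4": "gpt-4"}


def resolve_model_id(alias: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Map an alias key to a provider model ID, honouring env overrides."""
    env = os.environ if env is None else env
    if alias in _FIXED_MODELS:
        return _FIXED_MODELS[alias]
    if alias not in _ENV_MODELS:
        raise ValueError(f"unknown model alias: {alias!r}")
    var, default = _ENV_MODELS[alias]
    # An unset or empty variable falls back to the documented default.
    return env.get(var) or default
```

Passing env explicitly makes the lookup easy to test. In the running backend the values would come from backend/.env, so changing a model ID needs a container restart (deploy.sh --no-build) but no image rebuild.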

Additional env overrides:

  • OPENAI_LEGACY_MODEL — model used for podcast script + LlamaCloud query helper (defaults to OPENAI_CHAT_MODEL)
  • LLM_TIMEOUT_SECONDS — LLM call timeout in seconds (default: 900)
  • LLAMA_QUERY_TIMEOUT — LlamaCloud aquery timeout in seconds (default: 120)
  • CHAT_QUERY_TIMEOUT — WebSocket chat query timeout in seconds (default: 130)
  • TTS_TIMEOUT — ElevenLabs TTS timeout in seconds (default: 300)
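
The timeout settings above are typically applied by wrapping the awaited call in asyncio.wait_for. The sketch below shows that pattern under assumptions: env_seconds and query_with_timeout are hypothetical helper names, and the "answer" message shape is invented for illustration. Only the {"type": "error"} reply on timeout comes from the project's commit notes.

```python
import asyncio
import os
from typing import Any, Awaitable, Mapping, Optional


def env_seconds(name: str, default: float,
                env: Optional[Mapping[str, str]] = None) -> float:
    """Read a timeout in seconds from the environment, falling back to a default."""
    env = os.environ if env is None else env
    raw = env.get(name)
    return float(raw) if raw else float(default)


async def query_with_timeout(query: Awaitable[Any], timeout: float) -> dict:
    """Await `query`, returning an error message instead of hanging on timeout."""
    try:
        content = await asyncio.wait_for(query, timeout=timeout)
    except asyncio.TimeoutError:
        # Tell the client what happened rather than leaving the socket open.
        return {"type": "error", "content": f"query timed out after {timeout:g}s"}
    return {"type": "answer", "content": content}  # hypothetical message shape
```

With the documented defaults, the outer chat timeout (130 s) is slightly longer than the inner LlamaCloud timeout (120 s), so the inner call should normally fail first and the outer wrapper acts as a backstop.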

Frontend Patterns

  • Auth token: stored in localStorage['auth-storage'] as Zustand state → state.token
  • API client: frontend/src/lib/api.ts — axios with JWT interceptor + 401 redirect
  • Data fetching: React Query (useQuery for reads, useMutation for writes)
  • WebSocket chat: chatAPI.connectWebSocket(notebookId, sessionId) → ws://host/api/chat/ws/{id}?session_id={sid}
  • Studio mutations: All 7 in studioMutations record inside the notebook page component
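
For backend-side smoke tests it can help to build the same chat WebSocket URL the frontend uses. This is a minimal sketch: chat_ws_url is a hypothetical helper, it assumes {id} in the path is the notebook ID, and the wss switch for TLS deployments is an assumption not stated in this guide.

```python
from urllib.parse import quote, urlencode


def chat_ws_url(host: str, notebook_id, session_id, secure: bool = False) -> str:
    """Build the chat WebSocket URL for a notebook and session.

    Assumption: {id} in /api/chat/ws/{id} is the notebook ID.
    """
    scheme = "wss" if secure else "ws"
    path_id = quote(str(notebook_id), safe="")  # escape anything unsafe in a path segment
    query = urlencode({"session_id": session_id})
    return f"{scheme}://{host}/api/chat/ws/{path_id}?{query}"
```

Against the local backend from the Local Development section, the host would be localhost:9000.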

Local Development

# Backend (requires backend/.env with all API keys)
cd backend
uv sync
uv run uvicorn src.api.main:app --host 0.0.0.0 --port 9000 --reload

# Frontend
cd frontend
npm install
npm run dev   # port 4000

Infrastructure (local):

docker compose up -d postgres redis   # starts only DB + cache

Common Issues

| Problem | Fix |
|---|---|
| column notebooks.studio_data does not exist | Run run_studio_migration() in the container |
| Backend 500 on all routes | Container is running an old image; rebuild: docker compose build backend && docker compose up -d backend |
| git pull fails with dubious ownership | sudo chown -R user:user /opt/sandbox-notebookllamalm-nextjs |
| git pull fails with unstaged changes | git stash && git pull |
| Frontend shows old UI after deploy | Frontend container wasn't rebuilt; run docker compose build frontend && docker compose up -d frontend |
| Health check fails in deploy script | Endpoint is /api/health, not /health |

Knowledge Wiki

A cross-project knowledge base is maintained automatically from all Claude Code sessions.

  • Index: /Users/aimpress/Library/Mobile Documents/iCloud~md~obsidian/Documents/VadymSamoilenko/wiki/index.md
  • Query: cd ~/.claude/memory-compiler && uv run python scripts/query.py "your question"
  • Every session in this project automatically feeds the knowledge base.