MODELS (Block B):
- llm_factory.py: replace hardcoded model strings with env vars
OPENAI_CHAT_MODEL (gpt-5.4-2026-03-05), ANTHROPIC_CHAT_MODEL (claude-sonnet-4-6),
GEMINI_CHAT_MODEL (gemini-3.1-pro-preview), GEMINI_FLASH_MODEL (gemini-3-flash-preview)
- Fix broken IDs from fc17994: gemini-3-1-pro-preview → gemini-3.1-pro-preview,
gemini-3-1-flash-live-preview → gemini-3-flash-preview, gpt-5.4 → gpt-5.4-2026-03-05
- Replace gpt-4.1 hardcodes in audio.py + utils.py with OPENAI_LEGACY_MODEL
- Replace hardcoded claude-sonnet-4-6 in studio_generators.py PPTX-from-template
- Replace hardcoded gemini model in gemini_video.py
HANGS (Block C):
- llm_factory.py: add timeout=LLM_TIMEOUT_SECONDS to Gemini branches (was missing)
- pipeline_manager.py: wrap aquery in asyncio.wait_for(timeout=LLAMA_QUERY_TIMEOUT=120s)
- chat.py: wrap query_notebook_pipeline in asyncio.wait_for(CHAT_QUERY_TIMEOUT=130s),
send {"type":"error"} to client on timeout instead of hanging WS
- background_tasks.py: on startup mark IN_PROGRESS tasks as FAILED ("orphaned on restart")
- api.ts: add axios timeout 60s (was 0 = infinite)
- queryClient.ts: retry:1 + exponential retryDelay (was retry:3)
- notebooks/[id]/page.tsx: podcast poll only while status=processing (was always 5s)
- docker-compose.yml: healthchecks for all services + depends_on service_healthy conditions
- backend/Dockerfile: add --proxy-headers --timeout-keep-alive 65 --ws-ping-interval/timeout
DEPLOY (Block D):
- scripts/deploy.sh: idempotent rolling redeploy (git pull → build → migrate → up → health)
- scripts/rollback.sh: revert to any git SHA
- scripts/README.md: usage table
- .dockerignore: root-level (was missing)
- Retire legacy one-shot scripts → Old Readmes/
DOCS (Block E): Update CLAUDE.md models table + deploy section with new env vars
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CLAUDE.md — Developer Guide for AI Assistants
Project Overview
Sandbox-NotebookLM is a self-hosted alternative to Google NotebookLM. FastAPI backend + Next.js 15 frontend, deployed via Docker Compose on optical-web-1.
Live: https://ai-sandbox.oliver.solutions/notebookllama/
Backend API: https://ai-sandbox.oliver.solutions/notebookllama-back/api/docs
Architecture
backend/ (FastAPI + Python 3.13, uv package manager)
src/
api/
main.py — FastAPI app, mounts static files, filters polling logs
routes/
auth.py — signup, login, Microsoft SSO
notebooks.py — CRUD, synthesis, podcast, sharing, Studio endpoints
documents.py — upload, task status, summaries
chat.py — WebSocket chat
admin.py — admin dashboard
notebookllama/
database.py — SQLAlchemy models + get_db() / get_db_session()
studio_generators.py — 7 LLM generators (flashcards, quiz, mindmap, slides, report, infographic, datatable)
audio.py — ElevenLabs podcast generation (saves to PODCAST_DATA_DIR)
background_tasks.py — threaded task queue
notebook_synthesis.py
pipeline_manager.py
llm_factory.py — get_llm_by_type(), get_structured_llm()
frontend/ (Next.js 15 App Router, React 19, TypeScript, Tailwind 4)
src/
app/
notebooks/[id]/page.tsx — main notebook page (~1600 lines, Studio section + modal)
lib/api.ts — axios client + all API calls
types/index.ts — TypeScript interfaces
store/authStore.ts — Zustand auth (persisted to localStorage as 'auth-storage')
Docker Deployment
Server: optical-web-1 at /opt/sandbox-notebookllamalm-nextjs
# Standard deploy (git pull + build + migrate + up + health check)
ssh michael_clervi@optical-web-1
cd /opt/sandbox-notebookllamalm-nextjs
sudo bash scripts/deploy.sh
# Rebuild only backend or frontend
sudo bash scripts/deploy.sh --backend-only
sudo bash scripts/deploy.sh --frontend-only
# Restart without rebuild (env-only change)
sudo bash scripts/deploy.sh --no-build
# Roll back to a given git SHA (abc1234 is a placeholder)
sudo bash scripts/rollback.sh abc1234
# Manual: check logs
docker compose logs backend --tail=50
docker compose logs frontend --tail=50
# Run Python in backend container
docker compose exec backend /app/.venv/bin/python -c "..."
# DB migration (auto-runs in deploy.sh; manual fallback)
docker compose exec backend /app/.venv/bin/python -c \
"import sys; sys.path.insert(0, '/app/src/notebookllama'); from database import run_studio_migration; run_studio_migration(); print('Done')"
Important:
- git pull must NOT use sudo; file operations in /opt/ DO need sudo
- Health endpoint: GET /api/health (not /health)
- Backend uses venv at /app/.venv/; always call /app/.venv/bin/python
- Frontend env vars (NEXT_PUBLIC_*) are baked into the build; a frontend rebuild is needed if they change
- deploy.sh runs DB migration automatically (idempotent)
- Repo is on Bitbucket: git@bitbucket.org:zlalani/sandbox-notebookllamalm-nextjs.git
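The health-endpoint note above can be exercised with a small poller. This is a minimal sketch, not the logic in deploy.sh: the helper names `http_status` and `wait_until_healthy` are hypothetical, and the URL in the usage note assumes the local dev port 9000 mentioned under Local Development.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 404 on /health).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def wait_until_healthy(
    url: str,
    attempts: int = 10,
    delay: float = 3.0,
    fetch: Callable[[str], int] = http_status,
) -> bool:
    """Poll `url` until it answers 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Usage, assuming a locally running backend: `wait_until_healthy("http://localhost:9000/api/health")`. The `fetch` parameter exists only so the loop can be tested without a network.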
Database
PostgreSQL in Docker (sandbox-nextjs-postgres). Credentials in backend/.env:
- pgql_user, pgql_psw, pgql_db
- Host inside Docker: postgres:5432
- Host from outside Docker: localhost:5433
Schema (10 tables):
- users: auth_provider, is_admin, is_suspended
- notebooks: synthesis_data TEXT, studio_data TEXT, podcast_path
- documents: llamacloud_file_id, pipeline_id
- notebook_documents: junction table
- document_summaries: summary, highlights (JSON), questions (JSON), answers (JSON)
- chat_sessions: is_shared, notebook_id
- chat_messages: role, content, sources
- document_shares: permission_level enum (READ/WRITE/SHARE/ADMIN)
- background_tasks: status, task_type
- (PostgreSQL enum: permissionlevel)
Adding new columns: Add to model in database.py + add ALTER TABLE ... ADD COLUMN IF NOT EXISTS in run_studio_migration(), then call it in the container.
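The idempotent-migration pattern described above can be sketched as follows. This is an illustration only, not the body of run_studio_migration(): `add_column_sql`, `migration_statements`, and `NEW_COLUMNS` are hypothetical names, and the code only builds the SQL strings rather than executing them.

```python
import re

# Only plain lowercase identifiers are accepted, because the names are
# interpolated into SQL text (hypothetical guard, not from the project).
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")

# (table, column, SQL type). notebooks.studio_data TEXT comes from the schema
# above; new columns would be appended to this list.
NEW_COLUMNS = [
    ("notebooks", "studio_data", "TEXT"),
]


def add_column_sql(table: str, column: str, sql_type: str) -> str:
    """Build an ALTER TABLE statement that is safe to run repeatedly."""
    for name in (table, column):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return f"ALTER TABLE {table} ADD COLUMN IF NOT EXISTS {column} {sql_type}"


def migration_statements(columns=NEW_COLUMNS) -> list:
    """Return one idempotent statement per declared column."""
    return [add_column_sql(table, column, sql_type) for table, column, sql_type in columns]
```

Because PostgreSQL skips the ADD COLUMN when the column already exists, running every statement on each deploy is harmless, which is what lets deploy.sh call the migration unconditionally.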
Studio Features
Seven output types generated from document summaries:
| Type | Endpoint | Generator |
|---|---|---|
| Flashcards | POST /notebooks/{id}/studio/flashcards | generate_flashcards() |
| Quiz | POST /notebooks/{id}/studio/quiz | generate_quiz() |
| Mind Map | POST /notebooks/{id}/studio/mindmap | generate_mindmap() |
| Slides (PPTX) | POST /notebooks/{id}/studio/slides | generate_slides() |
| Report (PDF) | POST /notebooks/{id}/studio/report | generate_report() |
| Infographic | POST /notebooks/{id}/studio/infographic | generate_infographic() |
| Data Table | POST /notebooks/{id}/studio/datatable | generate_datatable() |
All results are stored as JSON in notebooks.studio_data TEXT column. Download endpoints:
- GET /notebooks/{id}/studio/slides/download → PPTX (python-pptx)
- GET /notebooks/{id}/studio/report/download → PDF (weasyprint)
LLM routing: OpenAI/GPT → get_structured_llm() with Pydantic output; Claude/Gemini → manual JSON schema injection via achat().
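For the Claude/Gemini branch, "manual JSON schema injection" roughly means putting the schema in the prompt and parsing the reply yourself. The sketch below shows that idea with the standard library only; `build_schema_prompt` and `parse_json_reply` are hypothetical helpers, the prompt wording is an assumption, and the actual call to achat() is left out.

```python
import json

# Three backticks, built indirectly so this example can itself sit in a fence.
FENCE = "`" * 3


def build_schema_prompt(task: str, schema: dict) -> str:
    """Append a JSON Schema to the task so the model knows the expected shape."""
    return (
        f"{task}\n\n"
        "Respond with a single JSON object that matches this JSON Schema. "
        "Do not add commentary.\n"
        f"{json.dumps(schema, indent=2)}"
    )


def parse_json_reply(reply: str) -> dict:
    """Parse a model reply, tolerating an optional Markdown code fence."""
    text = reply.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines)
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data
```

The OpenAI branch does not need this step, since get_structured_llm() returns Pydantic-validated output directly.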
AI Models
Model IDs are configurable via env vars in backend/.env (no rebuild needed to change them):
| Alias key | Env var | Default model ID | Provider |
|---|---|---|---|
| gpt54-exp | OPENAI_CHAT_MODEL | gpt-5.4-2026-03-05 | OpenAI (default for new notebooks) |
| claude46-exp | ANTHROPIC_CHAT_MODEL | claude-sonnet-4-6 | Anthropic |
| gemini31-exp | GEMINI_CHAT_MODEL | gemini-3.1-pro-preview | Google |
| gemini31-flash | GEMINI_FLASH_MODEL | gemini-3-flash-preview | Google (fastest/cheapest) |
| gpt4o | (hardcoded) | gpt-4o | OpenAI stable |
| gpt4 | (hardcoded) | gpt-4 | OpenAI legacy |
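The alias-to-model lookup in the table can be sketched like this. It is an illustration of the env-var-with-default pattern, not the code in llm_factory.py; `MODEL_TABLE` and `resolve_model_id` are hypothetical names.

```python
import os
from typing import Mapping, Optional

# alias -> (env var name or None when hardcoded, default model ID),
# copied from the table above.
MODEL_TABLE = {
    "gpt54-exp": ("OPENAI_CHAT_MODEL", "gpt-5.4-2026-03-05"),
    "claude46-exp": ("ANTHROPIC_CHAT_MODEL", "claude-sonnet-4-6"),
    "gemini31-exp": ("GEMINI_CHAT_MODEL", "gemini-3.1-pro-preview"),
    "gemini31-flash": ("GEMINI_FLASH_MODEL", "gemini-3-flash-preview"),
    "gpt4o": (None, "gpt-4o"),
    "gpt4": (None, "gpt-4"),
}


def resolve_model_id(alias: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the model ID for an alias, preferring the env override if set."""
    env = os.environ if env is None else env
    env_var, default = MODEL_TABLE[alias]
    if env_var is None:
        return default  # hardcoded aliases ignore the environment
    # An unset or empty variable falls back to the default.
    return env.get(env_var) or default
```

Reading the environment at call time rather than at import time is one way to keep the "no rebuild needed" property: a container restart with a new backend/.env is enough.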
Additional env overrides:
- OPENAI_LEGACY_MODEL: model used for podcast script + LlamaCloud query helper (defaults to OPENAI_CHAT_MODEL)
- LLM_TIMEOUT_SECONDS: LLM call timeout in seconds (default: 900)
- LLAMA_QUERY_TIMEOUT: LlamaCloud aquery timeout in seconds (default: 120)
- CHAT_QUERY_TIMEOUT: WebSocket chat query timeout in seconds (default: 130)
- TTS_TIMEOUT: ElevenLabs TTS timeout in seconds (default: 300)
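The timeout overrides above are typically applied by wrapping the awaited call in asyncio.wait_for. A minimal sketch, assuming the timeout is read as a float from the environment; `timeout_from_env` and `answer_or_error` are hypothetical helpers, and the payload fields other than "type": "error" are illustrative.

```python
import asyncio
import os
from typing import Awaitable, Mapping, Optional


def timeout_from_env(
    name: str, default: float, env: Optional[Mapping[str, str]] = None
) -> float:
    """Read a positive timeout in seconds, falling back to `default`."""
    env = os.environ if env is None else env
    try:
        value = float(env.get(name, ""))
    except ValueError:
        return default  # unset or not a number
    return value if value > 0 else default


async def answer_or_error(query: Awaitable[str], timeout: float) -> dict:
    """Await a query, returning an error payload instead of hanging forever."""
    try:
        answer = await asyncio.wait_for(query, timeout=timeout)
    except asyncio.TimeoutError:
        # The WebSocket handler can send this to the client and keep the socket usable.
        return {"type": "error", "message": "Query timed out"}
    return {"type": "answer", "content": answer}
```

Keeping CHAT_QUERY_TIMEOUT (130) slightly above LLAMA_QUERY_TIMEOUT (120) means the inner LlamaCloud timeout normally fires first, so the outer wrapper acts as a backstop.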
Frontend Patterns
- Auth token: stored in localStorage['auth-storage'] as Zustand state → state.token
- API client: frontend/src/lib/api.ts, axios with JWT interceptor + 401 redirect
- Data fetching: React Query (useQuery for reads, useMutation for writes)
- WebSocket chat: chatAPI.connectWebSocket(notebookId, sessionId) → ws://host/api/chat/ws/{id}?session_id={sid}
- Studio mutations: All 7 in studioMutations record inside the notebook page component
Local Development
# Backend (requires backend/.env with all API keys)
cd backend
uv sync
uv run uvicorn src.api.main:app --host 0.0.0.0 --port 9000 --reload
# Frontend
cd frontend
npm install
npm run dev # port 4000
Infrastructure (local):
docker compose up -d postgres redis # starts only DB + cache
Common Issues
| Problem | Fix |
|---|---|
| column notebooks.studio_data does not exist | Run run_studio_migration() in the container |
| Backend 500 on all routes | Container running old image; rebuild: docker compose build backend && docker compose up -d backend |
| git pull fails with dubious ownership | sudo chown -R user:user /opt/sandbox-notebookllamalm-nextjs |
| git pull fails with unstaged changes | git stash && git pull |
| Frontend shows old UI after deploy | Frontend container wasn't rebuilt; run docker compose build frontend && docker compose up -d frontend |
| Health check fails in deploy script | Endpoint is /api/health, not /health |
Knowledge Wiki
A cross-project knowledge base is maintained automatically from all Claude Code sessions.
- Index: /Users/aimpress/Library/Mobile Documents/iCloud~md~obsidian/Documents/VadymSamoilenko/wiki/index.md
- Query: cd ~/.claude/memory-compiler && uv run python scripts/query.py "your question"
- Every session in this project automatically feeds the knowledge base.