sandbox-notebookllamalm-nextjs/CLAUDE.md
Vadym Samoilenko ad8e857cf6 Update README and add CLAUDE.md developer guide
- README: rewrite for Docker-first setup, add Studio features, update schema/endpoints, add troubleshooting for common Docker issues
- CLAUDE.md: new file with architecture overview, deployment commands, DB schema, common issues for AI assistants

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 22:27:11 +00:00


CLAUDE.md — Developer Guide for AI Assistants

Project Overview

Sandbox-NotebookLM is a self-hosted alternative to Google NotebookLM. It pairs a FastAPI backend with a Next.js 15 frontend and is deployed via Docker Compose on optical-web-1.

Live: https://ai-sandbox.oliver.solutions/notebookllama/
Backend API: https://ai-sandbox.oliver.solutions/notebookllama-back/api/docs


Architecture

backend/  (FastAPI + Python 3.13, uv package manager)
  src/
    api/
      main.py              — FastAPI app, mounts static files, filters polling logs
      routes/
        auth.py            — signup, login, Microsoft SSO
        notebooks.py       — CRUD, synthesis, podcast, sharing, Studio endpoints
        documents.py       — upload, task status, summaries
        chat.py            — WebSocket chat
        admin.py           — admin dashboard
    notebookllama/
      database.py          — SQLAlchemy models + get_db() / get_db_session()
      studio_generators.py — 7 LLM generators (flashcards, quiz, mindmap, slides, report, infographic, datatable)
      audio.py             — ElevenLabs podcast generation (saves to PODCAST_DATA_DIR)
      background_tasks.py  — threaded task queue
      notebook_synthesis.py
      pipeline_manager.py
      llm_factory.py       — get_llm_by_type(), get_structured_llm()

frontend/  (Next.js 15 App Router, React 19, TypeScript, Tailwind 4)
  src/
    app/
      notebooks/[id]/page.tsx  — main notebook page (~1600 lines, Studio section + modal)
    lib/api.ts                 — axios client + all API calls
    types/index.ts             — TypeScript interfaces
    store/authStore.ts         — Zustand auth (persisted to localStorage as 'auth-storage')

Docker Deployment

Server: optical-web-1 at /opt/sandbox-notebookllamalm-nextjs

# Full rebuild and deploy
git pull origin main
docker compose build backend   # or 'frontend' or both
docker compose up -d

# Rebuild only backend (faster, no frontend rebuild needed)
docker compose build backend && docker compose up -d backend

# Check logs
docker compose logs backend --tail=50
docker compose logs frontend --tail=50

# Run Python in backend container
docker compose exec backend /app/.venv/bin/python -c "..."

# DB migration (run when new columns added)
docker compose exec backend /app/.venv/bin/python -c \
  "import sys; sys.path.insert(0, '/app/src/notebookllama'); from database import run_studio_migration; run_studio_migration(); print('Done')"

Important:

  • git pull must NOT use sudo; file operations in /opt/ DO need sudo
  • Health endpoint: GET /api/health (not /health)
  • Backend uses venv at /app/.venv/ — always call /app/.venv/bin/python
  • Frontend env vars (NEXT_PUBLIC_*) are baked into the build — frontend rebuild needed if they change
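
The health-endpoint note above matters most in deploy scripts. What follows is a minimal sketch of a post-deploy check, not the project's actual script. It assumes a healthy backend answers GET /api/health with HTTP 200, and the base URL in the usage comment (port 9000, taken from the local development command) is a placeholder because the Docker port mapping is not stated here. All helper names are hypothetical.

```python
import time
import urllib.request


def health_url(base_url: str) -> str:
    """Build the health-check URL; the path is /api/health, not /health."""
    return base_url.rstrip("/") + "/api/health"


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request to url."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def wait_until_healthy(base_url, attempts=10, delay=2.0, fetch=http_status):
    """Poll the health endpoint until it returns 200 or attempts run out."""
    url = health_url(base_url)
    for attempt in range(attempts):
        try:
            if fetch(url) == 200:
                return True
        except OSError:
            # Connection refused, timeout, or an HTTP error status
            # (urllib raises HTTPError, an OSError subclass, for 4xx/5xx).
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Usage (assumed base URL; adjust to the deployed host and port):
# wait_until_healthy("http://localhost:9000", attempts=5)
```

The `fetch` parameter exists only so the polling loop can be exercised without a running backend.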

Database

PostgreSQL in Docker (sandbox-nextjs-postgres). Credentials in backend/.env:

  • pgql_user, pgql_psw, pgql_db
  • Host inside Docker: postgres:5432
  • Host from outside Docker: localhost:5433
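
As an illustration of how these settings combine, here is a sketch that assembles a PostgreSQL connection URL from the three variables and the two host/port pairs listed above. How database.py actually builds its URL is not shown in this guide, so the function name and the `inside_docker` flag are hypothetical.

```python
import os
from urllib.parse import quote_plus


def build_database_url(env=None, inside_docker=True):
    """Assemble a postgresql:// URL from pgql_user, pgql_psw and pgql_db."""
    env = os.environ if env is None else env
    # Percent-encode credentials so characters like '@' or ':' stay valid.
    user = quote_plus(env["pgql_user"])
    password = quote_plus(env["pgql_psw"])
    database = env["pgql_db"]
    # Inside the Compose network the service is postgres:5432;
    # from the host machine it is published on localhost:5433.
    host, port = ("postgres", 5432) if inside_docker else ("localhost", 5433)
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"
```

Passing `inside_docker=False` gives the URL to use from a shell or GUI client on the host.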

Schema (10 tables):

  • users — auth_provider, is_admin, is_suspended
  • notebooks — synthesis_data TEXT, studio_data TEXT, podcast_path
  • documents — llamacloud_file_id, pipeline_id
  • notebook_documents — junction table
  • document_summaries — summary, highlights (JSON), questions (JSON), answers (JSON)
  • chat_sessions — is_shared, notebook_id
  • chat_messages — role, content, sources
  • document_shares — permission_level enum (READ/WRITE/SHARE/ADMIN)
  • background_tasks — status, task_type
  • (PostgreSQL enum: permissionlevel)

Adding new columns: add the column to the model in database.py, add a matching ALTER TABLE ... ADD COLUMN IF NOT EXISTS statement to run_studio_migration(), then run the migration in the container.
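
The idempotent-migration pattern described above can be sketched as follows. The real body of run_studio_migration() is not shown in this guide, so this only illustrates generating the statements. `add_column_sql`, `STUDIO_COLUMNS` and `migration_statements` are hypothetical names, and the single column listed is the one mentioned under Common Issues.

```python
import re

# Plain SQL identifiers only; anything else is rejected before formatting.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def add_column_sql(table: str, column: str, sql_type: str) -> str:
    """Return an ALTER TABLE statement that is safe to run repeatedly."""
    for name in (table, column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return f"ALTER TABLE {table} ADD COLUMN IF NOT EXISTS {column} {sql_type}"


# Hypothetical column list; extend it whenever the model gains a column.
STUDIO_COLUMNS = [("notebooks", "studio_data", "TEXT")]


def migration_statements(columns=STUDIO_COLUMNS):
    """Build one statement per (table, column, type) entry."""
    return [add_column_sql(*entry) for entry in columns]
```

Because PostgreSQL skips the column when it already exists, re-running the migration after every deploy is harmless.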


Studio Features

Seven output types generated from document summaries:

Type           Endpoint                                 Generator
Flashcards     POST /notebooks/{id}/studio/flashcards   generate_flashcards()
Quiz           POST /notebooks/{id}/studio/quiz         generate_quiz()
Mind Map       POST /notebooks/{id}/studio/mindmap      generate_mindmap()
Slides (PPTX)  POST /notebooks/{id}/studio/slides       generate_slides()
Report (PDF)   POST /notebooks/{id}/studio/report       generate_report()
Infographic    POST /notebooks/{id}/studio/infographic  generate_infographic()
Data Table     POST /notebooks/{id}/studio/datatable    generate_datatable()
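
The table follows a regular pattern: the last path segment names the output type, and the generator is generate_<type>(). A small sketch that encodes this mapping is shown here. The helper `studio_endpoint` is hypothetical, and whether these paths sit under an /api prefix (as the health endpoint does) is not stated in the table, so the path is returned exactly as listed.

```python
# Output type -> generator function name, as listed in the table above.
STUDIO_GENERATORS = {
    "flashcards": "generate_flashcards",
    "quiz": "generate_quiz",
    "mindmap": "generate_mindmap",
    "slides": "generate_slides",
    "report": "generate_report",
    "infographic": "generate_infographic",
    "datatable": "generate_datatable",
}


def studio_endpoint(notebook_id, kind: str) -> str:
    """Return the POST path for a Studio output type."""
    if kind not in STUDIO_GENERATORS:
        raise ValueError(f"unknown Studio output type: {kind!r}")
    return f"/notebooks/{notebook_id}/studio/{kind}"
```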

All results are stored as JSON in the notebooks.studio_data TEXT column. Download endpoints:

  • GET /notebooks/{id}/studio/slides/download → PPTX (python-pptx)
  • GET /notebooks/{id}/studio/report/download → PDF (weasyprint)
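
Since studio_data is a TEXT column holding JSON, each generator result has to be merged into the existing document rather than overwriting it. The sketch below assumes the JSON is an object keyed by output type. That layout is a guess for illustration, not something this guide confirms, and `merge_studio_result` is a hypothetical helper.

```python
import json


def merge_studio_result(studio_data, kind, result):
    """Merge one generator result into the serialized studio_data value.

    studio_data is the current column value (None or '' when unset);
    the return value is the new JSON string to store.
    """
    data = json.loads(studio_data) if studio_data else {}
    data[kind] = result  # assumed layout: one key per output type
    return json.dumps(data)
```

With this layout the download endpoints could read the "slides" or "report" key and render it to PPTX or PDF without touching the other outputs.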

LLM routing: OpenAI/GPT → get_structured_llm() with Pydantic output; Claude/Gemini → manual JSON schema injection via achat().
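
A rough sketch of the two routing paths follows, using only the standard library. It does not call get_structured_llm() or achat(), whose signatures are not shown here. It only illustrates the decision and the manual path: appending a JSON schema to the prompt and parsing the reply. The model IDs come from the AI Models table; the function names and the prompt wording are assumptions.

```python
import json

# IDs from the AI Models table that map to OpenAI/GPT models.
STRUCTURED_OUTPUT_IDS = {"gpt5-exp", "openai", "gpt4"}


def uses_structured_output(model_id: str) -> bool:
    """True for OpenAI/GPT IDs; Claude and Gemini take the manual path."""
    return model_id in STRUCTURED_OUTPUT_IDS


def inject_schema(prompt: str, schema: dict) -> str:
    """Append a JSON schema to the prompt (manual path for Claude/Gemini)."""
    return (
        f"{prompt}\n\nRespond with JSON only, matching this JSON schema:\n"
        f"{json.dumps(schema, indent=2)}"
    )


def parse_json_reply(text: str):
    """Parse a model reply as JSON, tolerating a Markdown code fence."""
    fence = "`" * 3
    cleaned = text.strip()
    if cleaned.startswith(fence):
        # Drop the opening fence line (with optional language tag)
        # and everything from the closing fence onward.
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        cleaned = cleaned.rsplit(fence, 1)[0]
    return json.loads(cleaned)
```

Models on the manual path often wrap JSON in a code fence, which is why the parser strips one before decoding.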


AI Models

ID            Provider               Notes
gpt5-exp      OpenAI GPT-5           Default
claude45-exp  Anthropic Claude 4.5
gemini25-exp  Google Gemini 2.5 Pro
openai        GPT-4o
gemini        Gemini 2.0 Flash
gpt4          GPT-4

Frontend Patterns

  • Auth token: stored in localStorage['auth-storage'] as Zustand state → state.token
  • API client: frontend/src/lib/api.ts — axios with JWT interceptor + 401 redirect
  • Data fetching: React Query (useQuery for reads, useMutation for writes)
  • WebSocket chat: chatAPI.connectWebSocket(notebookId, sessionId) → ws://host/api/chat/ws/{id}?session_id={sid}
  • Studio mutations: All 7 in studioMutations record inside the notebook page component
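
When testing the backend from a script rather than the browser, the same conventions can be reproduced outside the frontend. The sketch below assumes the 'auth-storage' value is a JSON object with the token at state.token (as noted above) and that {id} in the WebSocket path is the notebook ID. Both helper names are hypothetical, and the `secure` flag is an addition for hosts served over TLS.

```python
import json
from urllib.parse import urlencode


def token_from_auth_storage(raw: str):
    """Extract state.token from the JSON persisted under 'auth-storage'."""
    return json.loads(raw).get("state", {}).get("token")


def chat_ws_url(host: str, notebook_id, session_id, secure: bool = False) -> str:
    """Build the chat WebSocket URL in the form used by the frontend."""
    scheme = "wss" if secure else "ws"
    query = urlencode({"session_id": session_id})
    return f"{scheme}://{host}/api/chat/ws/{notebook_id}?{query}"
```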

Local Development

# Backend (requires backend/.env with all API keys)
cd backend
uv sync
uv run uvicorn src.api.main:app --host 0.0.0.0 --port 9000 --reload

# Frontend
cd frontend
npm install
npm run dev   # port 4000

Infrastructure (local):

docker compose up -d postgres redis   # starts only DB + cache

Common Issues

Problem                                      Fix
column notebooks.studio_data does not exist  Run run_studio_migration() in the container
Backend 500 on all routes                    Container running old image; rebuild: docker compose build backend && docker compose up -d backend
git pull fails with dubious ownership        sudo chown -R user:user /opt/sandbox-notebookllamalm-nextjs
git pull fails with unstaged changes         git stash && git pull
Frontend shows old UI after deploy           Frontend container wasn't rebuilt; run docker compose build frontend && docker compose up -d frontend
Health check fails in deploy script          Endpoint is /api/health, not /health