🦙 Sandbox-NotebookLM
Enterprise Multi-User AI Document Analysis Platform
A production-ready, open-source alternative to Google's NotebookLM with multi-user support, multiple AI models, document collections, and team collaboration.
Key Features: Multi-document notebooks | 6 AI models | Private/shared chats | Background processing | Complete data isolation
📋 Table of Contents
- What's New
- Prerequisites
- Installation
- Configuration
- First-Time Setup
- Running the Application
- User Guide
- AI Models
- Architecture
- Troubleshooting
- Deployment
✨ What's New
Compared to Original NotebookLlaMa:
- Multi-User Support: Unlimited users with authentication
- Notebook Collections: Group 1-100+ documents together
- 6 AI Models: GPT-5, Claude 4.5, Gemini 2.5 Pro, GPT-4o, Gemini 2.0, GPT-4
- Data Isolation: Per-notebook LlamaCloud pipelines (SECURE!)
- Background Processing: Upload 20 docs, navigate away immediately
- Chat Privacy: Multiple sessions, private by default, optional sharing
- 3-Tier Permissions: Read, Write, Write+Share
- File Support: PDF, Word (.docx), PowerPoint (.pptx)
- Cross-Document Analysis: AI synthesizes insights across ALL documents
- Custom Podcasts: Theme, length (5-30 min), voice selection (8 voices)
- Admin Dashboard: Usage stats, cost tracking, system monitoring
- Professional UI: Custom branding, Montserrat typography, yellow theme
See TRANSFORMATION.md for complete comparison!
🔧 Prerequisites
Required Software
1. Docker Desktop
- Download: https://www.docker.com/products/docker-desktop
- Version: Latest stable
- Purpose: PostgreSQL database, Jaeger tracing, Adminer
- Verify:
docker --version
2. Python 3.13+
- Download: https://www.python.org/downloads/
- Verify:
python3 --version
- Purpose: Application runtime
3. uv Package Manager
- macOS/Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh
- Windows:
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
- Verify:
uv --version
- Purpose: Fast Python package management
Required API Keys
You'll need accounts and API keys from these services:
1. OpenAI (Required for GPT models)
- Sign up: https://platform.openai.com/signup
- Get API key: https://platform.openai.com/api-keys
- Pricing:
- GPT-5: $1.25/1M input, $10/1M output
- GPT-4o: $5/1M input, $15/1M output
- GPT-4: $30/1M input, $60/1M output
2. LlamaCloud (Required for document processing)
- Sign up: https://cloud.llamaindex.ai
- Get API key: Dashboard → Settings → API Keys
- Purpose: Document parsing, extraction, indexing
- Free tier: Available
3. ElevenLabs (Required for podcasts)
- Sign up: https://elevenlabs.io
- Get API key: Settings → API Keys
- Purpose: Text-to-speech for podcast generation
- Free tier: 10,000 characters/month
4. Google AI (Optional - for Gemini models)
- Sign up: https://aistudio.google.com/
- Get API key: https://aistudio.google.com/apikey
- Pricing:
- Gemini 2.5 Pro: $1.25/1M input, $5/1M output
- Gemini 2.0 Flash: $0.075/1M input, $0.30/1M output (cheapest!)
5. Anthropic (Optional - for Claude models)
- Sign up: https://console.anthropic.com/
- Get API key: Account Settings → API Keys
- Pricing:
- Claude Sonnet 4.5: $3/1M input, $15/1M output
- Claude Sonnet 4.0: $3/1M input, $15/1M output
📦 Installation
Step 1: Clone the Repository
git clone https://bitbucket.org/zlalani/sandbox-notebookllamalm.git
cd sandbox-notebookllamalm
Step 2: Install Python Dependencies
# Install all dependencies (including all AI model packages)
uv sync
This installs 40+ packages including:
- Streamlit (web framework)
- SQLAlchemy (database ORM)
- LlamaIndex core
- OpenAI client
- Anthropic client (Claude)
- Google Generative AI (Gemini)
- llama-index-llms-openai (latest: 0.6.1)
- llama-index-llms-anthropic (latest: 0.9.3)
- llama-index-llms-gemini (latest: 0.6.1)
- PostgreSQL driver
- bcrypt (password hashing)
- ElevenLabs client
- And more...
Note: This creates a .venv folder locally. Do NOT commit this to git! It's already in .gitignore.
Step 3: Start Docker Services
# Start PostgreSQL, Jaeger, and Adminer
docker compose up -d
Verify containers are running:
docker ps
You should see 3 containers:
- instrumentation-postgres-1 (port 5432)
- instrumentation-jaeger-1 (port 16686)
- instrumentation-adminer-1 (port 8080)
⚙️ Configuration
Step 4: Create Environment File
Create a .env file in the project root:
touch .env
Step 5: Add API Keys
Edit .env with your favorite editor and add:
# ===== OpenAI (Required) =====
OPENAI_API_KEY="sk-your-openai-key-here"
# ===== LlamaCloud (Required) =====
LLAMACLOUD_API_KEY="llx-your-llamacloud-key-here"
# ===== ElevenLabs (Required for podcasts) =====
ELEVENLABS_API_KEY="sk_your-elevenlabs-key-here"
# ===== Google AI (Optional - for Gemini models) =====
GOOGLE_API_KEY="your-google-api-key-here"
# ===== Anthropic (Optional - for Claude models) =====
ANTHROPIC_API_KEY="sk-ant-your-anthropic-key-here"
# ===== Database Configuration =====
pgql_db=postgres
pgql_user=postgres
pgql_psw=admin
# ===== LlamaCloud IDs (will be generated in next step) =====
EXTRACT_AGENT_ID=""
LLAMACLOUD_PIPELINE_ID=""
Important:
- ✅ Use quotes around API keys
- ❌ Do NOT use quotes around database credentials
- ✅ Google and Anthropic keys are optional (only needed if using those models)
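A quick way to catch a missing key before starting the app is to parse .env and compare it against the required names from the template above. The sketch below is illustrative only: parse_env and missing_required are hypothetical helpers, not part of the repository, and the parser only handles simple KEY=value lines.

```python
# Hypothetical helper, not part of the repository: checks the text of a
# .env file for the keys this README marks as required.
REQUIRED_KEYS = ("OPENAI_API_KEY", "LLAMACLOUD_API_KEY", "ELEVENLABS_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding double or single quotes.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_required(text: str) -> list[str]:
    """Return required keys that are absent or empty."""
    env = parse_env(text)
    return [k for k in REQUIRED_KEYS if not env.get(k)]


sample = '''
OPENAI_API_KEY="sk-your-openai-key-here"
LLAMACLOUD_API_KEY=""
pgql_user=postgres
'''
print(missing_required(sample))  # ['LLAMACLOUD_API_KEY', 'ELEVENLABS_API_KEY']
```

To check a real file, pass the result of open(".env").read() to missing_required.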
Step 6: Generate LlamaCloud Resources
# Create extraction agent
uv run tools/create_llama_extract_agent.py
Copy the EXTRACT_AGENT_ID from output and paste into .env
# Create indexing pipeline (for legacy single-pipeline support)
uv run tools/create_llama_cloud_index.py
Copy the LLAMACLOUD_PIPELINE_ID from output and paste into .env
Note: This legacy pipeline is only used for document extraction. Each notebook creates its own dedicated pipeline for data isolation.
🎬 First-Time Setup
Step 7: Stop Conflicting Services
Important: Stop local PostgreSQL if installed (conflicts with Docker on port 5432):
# macOS (Homebrew)
brew services stop postgresql@14
brew services stop postgresql@15
killall postgres
# Linux
sudo systemctl stop postgresql
# Verify nothing on port 5432 except Docker
lsof -i :5432
Step 8: Initialize Database
# Create all database tables
uv run src/notebookllama/init_database.py
Expected output:
✓ Database connection successful
✓ Database tables created successfully
Tables created:
- users
- documents
- notebooks
- document_summaries
- notebook_documents
- chat_sessions
- chat_messages
- document_shares
- background_tasks
Step 9: Run Database Migration
# Migrate to notebook-first architecture
echo "yes" | uv run src/notebookllama/migrate_to_notebooks.py
This sets up the multi-document notebook structure.
🚀 Running the Application
Quick Start (Automated)
./start.sh
This script automatically:
- Starts Docker services
- Checks database initialization
- Stops conflicting PostgreSQL
- Starts MCP server (background)
- Launches Streamlit app
Manual Start (More Control)
Terminal 1: Start MCP Server
uv run src/notebookllama/server.py
Keep running. You should see:
INFO Starting MCP server 'MCP For NotebookLM'
INFO Uvicorn running on http://127.0.0.1:8000
Terminal 2: Start Streamlit App
streamlit run src/notebookllama/App.py
You should see:
You can now view your Streamlit app in your browser.
Local URL: http://localhost:8501
Verify All Services
# Check Docker containers
docker ps # Should show 3 containers
# Check MCP server
lsof -i :8000 # Should show Python process
# Check Streamlit
lsof -i :8501 # Should show Python process
# Check PostgreSQL (Docker only!)
lsof -i :5432 # Should ONLY show Docker, not local postgres
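The same checks can be scripted for machines where lsof is unavailable. This is a minimal sketch, assuming all services listen on 127.0.0.1 at the ports listed in this README; port_is_open is a hypothetical helper. It only confirms that something accepts connections on each port, so it cannot tell Docker's PostgreSQL apart from a local install.

```python
import socket

# Ports taken from this README; the names are labels for printing only.
SERVICES = {
    "PostgreSQL": 5432,
    "MCP server": 8000,
    "Streamlit": 8501,
    "Jaeger": 16686,
    "Adminer": 8080,
}


def port_is_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if port_is_open("127.0.0.1", port) else "DOWN"
        print(f"{name:<12} :{port:<6} {state}")
```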
👤 User Guide
First-Time User Setup
- Open browser: http://localhost:8501
- Sign up:
- Click "Sign Up" tab
- Email: anything@example.com (doesn't need to be real)
- Username: unique username
- Password: minimum 8 characters
- Click "Sign Up"
- You're in! Dashboard loads automatically
Creating Your First Notebook
Step-by-Step:
1. Create Notebook
- Click "➕ Create New Notebook" (yellow button)
- Name: "Q4 Marketing Analysis"
- Description: "All Q4 marketing reports and presentations"
- Choose AI Model:
- 🚀 GPT-5 (Latest, $1.25/$10 per 1M)
- 🧠 Claude Sonnet 4.5 (Latest, $3/$15 per 1M)
- 💎 Gemini 2.5 Pro (Latest, $1.25/$5 per 1M)
- ⚡ GPT-4o (Stable, $5/$15 per 1M)
- ✨ Gemini 2.0 Flash (Cheapest, $0.075/$0.30 per 1M)
- 🤖 GPT-4 (Original, $30/$60 per 1M)
2. Upload Documents
- Upload 5-10 files (PDF, Word, PowerPoint)
- Click "Create Notebook"
- Files queue for background processing
- Navigate away immediately!
3. Wait for Processing
- Each document: ~30 seconds extraction + 60 seconds indexing
- Check status in Notebook Detail page
- "🔄 Refresh Status" button
4. View Notebook
- See all documents
- View individual summaries per document
- Click "🔄 Generate Cross-Document Analysis"
- Overall synthesis across ALL documents
- Common themes
- Key insights
- Comparative findings
5. Chat with Notebook
- Click "💬 Chat with Notebook"
- Wait for indexing to complete (~60-90 seconds after extraction)
- Create multiple chat sessions:
- "➕ New Chat" for fresh context
- Rename chats: Click ⋮ → ✏️ Rename
- Share chats: Click ⋮ → 🤝 Share
- Delete chats: Click ⋮ → 🗑️ Delete
- Private by default - only you see your chats
- Share specific chats with collaborators
6. Generate Podcast
- Click "🎙️ Generate Podcast"
- Length: 5, 10, 15, 20, 25, or 30 minutes
- Custom theme: "Focus on actionable insights for executives"
- Additional instructions: "Make it conversational, avoid jargon"
- Voice selection: Choose from 8 ElevenLabs voices
- Speaker 1: Brian, Adam, Charlie, Daniel
- Speaker 2: Sarah, Bella, Charlotte, Emily
- Click "Generate in Background"
- Navigate away! Come back in 3-5 minutes
- Listen/download when ready
7. Share Notebook
- Click "📤 Share Notebook"
- See who it's shared with (list of collaborators)
- Add new user:
- Enter email
- Choose permission:
- Read: View and chat only
- Write: View, chat, add documents, generate podcasts
- Write+Share: All of Write + can share with others
- Remove users: Click "Remove" next to their name
- Shared chats automatically visible to collaborators
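The three permission tiers are cumulative, so they can be modelled as an ordered enum. The sketch below is an illustration of that idea, not the repository's actual implementation; the Permission class, the action names, and the can helper are all assumptions.

```python
from enum import IntEnum


class Permission(IntEnum):
    """Sharing tiers from this README, ordered from least to most access."""
    READ = 1         # view and chat only
    WRITE = 2        # plus add documents and generate podcasts
    WRITE_SHARE = 3  # plus share with others


# Minimum tier needed for each action (action names are illustrative).
REQUIRED = {
    "view": Permission.READ,
    "chat": Permission.READ,
    "add_document": Permission.WRITE,
    "generate_podcast": Permission.WRITE,
    "share": Permission.WRITE_SHARE,
}


def can(permission: Permission, action: str) -> bool:
    """True if the given tier allows the action."""
    return permission >= REQUIRED[action]


print(can(Permission.READ, "chat"))          # True
print(can(Permission.WRITE, "share"))        # False
print(can(Permission.WRITE_SHARE, "share"))  # True
```

Ordering the tiers means each check is a single comparison, and adding a new action only requires one more entry in REQUIRED.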
🤖 AI Models
Supported Models (Choose per notebook):
| Model | Provider | Cost (per 1M tokens) | Best For |
|---|---|---|---|
| GPT-5 🚀 | OpenAI | $1.25 in, $10 out | Latest, state-of-the-art reasoning |
| Claude Sonnet 4.5 🧠 | Anthropic | $3 in, $15 out | Analysis, writing, nuanced understanding |
| Gemini 2.5 Pro 💎 | Google | $1.25 in, $5 out | Latest from Google, balanced |
| GPT-4o ⚡ | OpenAI | $5 in, $15 out | Stable, fast, multimodal |
| Gemini 2.0 Flash ✨ | Google | $0.075 in, $0.30 out | Cheapest! 99% savings |
| GPT-4 🤖 | OpenAI | $30 in, $60 out | Original, proven quality |
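To turn the per-1M-token prices into a cost for a specific request, multiply each token count by its rate and divide by one million. A minimal sketch follows; the dictionary keys are hypothetical labels chosen for this example and are not necessarily the identifiers used in llm_factory.py.

```python
# Prices in USD per 1M tokens (input, output), copied from the table above.
PRICES = {
    "gpt-5": (1.25, 10.00),
    "claude-sonnet-4.5": (3.00, 15.00),
    "gemini-2.5-pro": (1.25, 5.00),
    "gpt-4o": (5.00, 15.00),
    "gemini-2.0-flash": (0.075, 0.30),
    "gpt-4": (30.00, 60.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request at the listed rates."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Example: a prompt of 20,000 tokens with a 1,000-token answer.
for model in PRICES:
    print(f"{model:<18} ${estimate_cost(model, 20_000, 1_000):.4f}")
```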
Model Selection
Choose when creating notebook - can't be changed later!
Each notebook uses its selected model for:
- Chat responses
- Cross-document synthesis
- Podcast script generation
- Q&A generation
Model shown in:
- Notebook detail header: "🚀 Using OpenAI GPT-5"
- Chat page: "🚀 Using OpenAI GPT-5 | Chatting across 5 documents"
- Chat responses: "🚀 Answered by OpenAI GPT-5"
🗄️ Architecture
Application Structure
Sandbox-NotebookLM/
├── src/notebookllama/
│ ├── App.py # Main entry (dashboard)
│ ├── auth.py # Authentication
│ ├── database.py # SQLAlchemy models (9 tables)
│ ├── llm_factory.py # AI model factory (6 models)
│ ├── notebook_manager.py # Notebook operations
│ ├── document_manager.py # Document operations
│ ├── pipeline_manager.py # LlamaCloud pipeline isolation
│ ├── notebook_synthesis.py # Cross-document analysis
│ ├── background_tasks.py # Async processing queue
│ ├── audio.py # Podcast generation
│ ├── styles.py # UI styling
│ ├── server.py # MCP server (for legacy chat)
│ ├── utils.py # LlamaCloud utilities
│ ├── workflow.py # LlamaIndex workflow
│ └── pages/
│ ├── 1_My_Notebooks.py # Create/manage notebooks
│ ├── 2_Notebook_Detail.py # View notebook
│ ├── 3_Notebook_Chat.py # Multi-session chat
│ ├── 4_Shared_Notebooks.py # Collaboration
│ └── 5_Admin_Dashboard.py # Admin monitoring
├── cleanup_llamacloud.py # One-time cleanup utility
├── compose.yaml # Docker configuration
├── pyproject.toml # Dependencies
├── .env # API keys (DO NOT COMMIT!)
└── README.md # This file
Technology Stack
- Frontend: Streamlit 1.46+
- Backend: Python 3.13, FastMCP, LlamaIndex
- Database: PostgreSQL 18 with SQLAlchemy ORM
- AI Models:
- OpenAI GPT-5, GPT-4o, GPT-4
- Anthropic Claude Sonnet 4.5, 4.0
- Google Gemini 2.5 Pro, 2.0 Flash
- Document Processing: LlamaCloud (Parse, Extract, Index)
- Audio: ElevenLabs (8 voices)
- Observability: Jaeger, OpenTelemetry
Database Schema (9 Tables)
users (authentication)
├── notebooks (collections with AI model choice)
│ ├── pipeline_id (dedicated LlamaCloud pipeline)
│ ├── model_type (gpt5, claude, gemini, etc.)
│ ├── notebook_documents (many-to-many)
│ │ └── documents (PDF, Word, PowerPoint)
│ │ └── document_summaries (AI analysis)
│ ├── chat_sessions (with privacy + sharing)
│ │ ├── is_shared (private by default)
│ │ └── chat_messages (conversation history)
│ └── document_shares (3-tier permissions)
└── background_tasks (async processing queue)
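The privacy rule implied by is_shared can be sketched with plain dataclasses: the owner always sees a chat, and collaborators see it only once it is shared. This is an illustration under assumptions, not the SQLAlchemy models in database.py; every field name except is_shared is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ChatSession:
    id: int
    notebook_id: int
    owner_id: int
    title: str
    is_shared: bool = False  # private by default, as in the schema above


@dataclass
class Notebook:
    id: int
    owner_id: int
    collaborator_ids: set[int] = field(default_factory=set)


def visible_sessions(user_id, notebook, sessions):
    """Sessions in this notebook that user_id may see."""
    has_access = user_id == notebook.owner_id or user_id in notebook.collaborator_ids
    if not has_access:
        return []
    return [
        s for s in sessions
        if s.notebook_id == notebook.id
        and (s.owner_id == user_id or s.is_shared)
    ]


nb = Notebook(id=1, owner_id=10, collaborator_ids={20})
chats = [
    ChatSession(1, 1, 10, "Draft questions"),
    ChatSession(2, 1, 10, "Team review", is_shared=True),
]
print([s.title for s in visible_sessions(20, nb, chats)])  # ['Team review']
```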
Data Flow
User creates notebook (selects AI model: GPT-5, Claude, or Gemini)
↓
Dedicated LlamaCloud pipeline created for this notebook
↓
User uploads 10 PDFs + 5 Word docs
↓
Files queue for background processing
↓
Each file: Upload → Parse → Extract → Index (90 sec total)
↓
Summaries saved to database
↓
User chats → Queries ONLY this notebook's pipeline
↓
Selected AI model (GPT-5/Claude/Gemini) generates response
↓
Sources from this notebook's documents only (isolated!)
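The isolation step in this flow comes down to keying every index and query by notebook. The toy below shows that shape with an in-memory dictionary; it makes no LlamaCloud calls, and PipelineRegistry with its methods is a hypothetical stand-in rather than the code in pipeline_manager.py.

```python
class PipelineRegistry:
    """Toy stand-in for per-notebook pipelines (in-memory only)."""

    def __init__(self):
        self._pipelines: dict[int, list[str]] = {}

    def create_notebook(self, notebook_id: int) -> None:
        # One dedicated "pipeline" per notebook.
        self._pipelines[notebook_id] = []

    def index(self, notebook_id: int, text: str) -> None:
        self._pipelines[notebook_id].append(text)

    def query(self, notebook_id: int, term: str) -> list[str]:
        # Only this notebook's documents are ever searched.
        docs = self._pipelines[notebook_id]
        return [t for t in docs if term.lower() in t.lower()]


registry = PipelineRegistry()
registry.create_notebook(1)
registry.create_notebook(2)
registry.index(1, "Q4 marketing budget rose 12%")
registry.index(2, "Engineering budget is flat")
print(registry.query(1, "budget"))  # ['Q4 marketing budget rose 12%']
```

Because a query never touches another notebook's list, a document uploaded to notebook 2 cannot appear as a source in notebook 1.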
🐛 Troubleshooting
Common Issues
1. Database Connection Failed
Error: role "postgres" does not exist
Solution:
# Stop local PostgreSQL
brew services stop postgresql@14
killall postgres
# Restart Docker with fresh volume
docker compose down -v
docker compose up -d
sleep 5
# Reinitialize
uv run src/notebookllama/init_database.py
echo "yes" | uv run src/notebookllama/migrate_to_notebooks.py
2. MCP Server Crashes
Error: Chat doesn't work, 500 errors
Solution:
# Check if running
lsof -i :8000
# Restart if needed
killall python
uv run src/notebookllama/server.py > server.log 2>&1 &
# Check logs
tail -f server.log
3. Empty Chat Responses
Error: "Sorry, I couldn't find an answer"
Causes:
- Documents still indexing (wait 60-90 seconds after extraction)
- No documents in notebook (upload files first)
- Pipeline not created (old notebook, recreate it)
Solution:
- Wait for "📊 X documents completed extraction, now indexing..." to clear
- Click "🔄 Check if ready now"
- Look for background task completion in Admin Dashboard
4. Model Not Working
Error: "Unknown model" or "Model is not found"
Solutions:
- Update packages:
uv sync --reinstall
- Check API key: Verify in .env for that provider
- Try stable model: Use GPT-4o, Claude 4.0, or Gemini 2.0 Flash
- Restart Streamlit: Kill and restart to load new packages
5. Port Already in Use
Error: "Address already in use"
Solution:
# Find and kill processes
lsof -i :5432 # PostgreSQL
lsof -i :8000 # MCP Server
lsof -i :8501 # Streamlit
# Kill specific PID
kill -9 <PID>
# Or kill all Python processes
killall python
6. Import Errors After Package Updates
Error: cannot import name...
Solution:
# Restart Streamlit to load new packages
killall python
streamlit run src/notebookllama/App.py
7. LlamaCloud Quota Exceeded
Error: Too many pipelines or files
Solution:
# One-time cleanup of orphaned resources
python cleanup_llamacloud.py
# Type "DELETE ALL" to confirm
🚢 Deployment
Environment Variables for Production
# Required
OPENAI_API_KEY="sk-..."
LLAMACLOUD_API_KEY="llx-..."
ELEVENLABS_API_KEY="sk_..."
EXTRACT_AGENT_ID="..."
LLAMACLOUD_PIPELINE_ID="..."
# Optional (for additional models)
GOOGLE_API_KEY="..."
ANTHROPIC_API_KEY="sk-ant-..."
# Database (use managed PostgreSQL in production)
DATABASE_URL="postgresql://user:pass@host:5432/dbname"
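When assembling DATABASE_URL from separate credentials, characters such as @ or / in a password must be percent-encoded or the URL will not parse correctly. A minimal sketch, assuming the postgresql:// form shown above; build_database_url is a hypothetical helper, not part of the repository.

```python
from urllib.parse import quote


def build_database_url(user, password, host, port, dbname):
    """Assemble a postgresql:// URL, percent-encoding user and password."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{dbname}"
    )


# Local Docker defaults from this README.
print(build_database_url("postgres", "admin", "localhost", 5432, "postgres"))
# postgresql://postgres:admin@localhost:5432/postgres

# A password containing reserved characters (example values only).
print(build_database_url("app", "p@ss/word", "db.example.com", 5432, "notebooks"))
# postgresql://app:p%40ss%2Fword@db.example.com:5432/notebooks
```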
For Production Deployment
1. Use Managed PostgreSQL
- AWS RDS, Google Cloud SQL, Azure Database
- Enable backups
- Set up read replicas
2. Environment Security
- Use secrets management (AWS Secrets Manager, etc.)
- Rotate API keys regularly
- Never commit .env to git
3. Scaling
- Run multiple Streamlit instances behind load balancer
- Separate MCP server instances
- Redis for session caching
- Dedicated workers for background tasks
4. Monitoring
- Use Admin Dashboard
- Set up alerts for failed tasks
- Monitor API usage and costs
- Track background task queue length
📊 Features Overview
Core Features
- ✅ Multi-user authentication (bcrypt passwords)
- ✅ Multi-document notebooks (1-100+ docs)
- ✅ 6 AI models (GPT-5, Claude 4.5, Gemini 2.5 Pro, etc.)
- ✅ 3 file types (PDF, Word, PowerPoint)
- ✅ Background processing (documents & podcasts)
- ✅ Per-notebook pipeline isolation (SECURE!)
- ✅ Multiple chat sessions per notebook
- ✅ Private chats with optional sharing
- ✅ 3-tier permissions (Read, Write, Write+Share)
- ✅ Cross-document synthesis
- ✅ Custom podcast generation
- ✅ Admin dashboard
- ✅ Automatic cleanup
Advanced Features
- ✅ Chat privacy (private by default)
- ✅ Share specific chats with team
- ✅ Rename/delete chat sessions
- ✅ Model attribution in responses
- ✅ Voice selection for podcasts
- ✅ Custom podcast themes/instructions
- ✅ Podcast length control (5-30 min)
- ✅ Status tracking for all async tasks
- ✅ Cost tracking per feature
- ✅ User analytics
- ✅ Background task monitoring
💰 Cost Estimates
Per Document (20-page PDF):
- LlamaCloud: ~$0.10 (parsing + extraction)
- AI Model (varies):
- Gemini 2.0 Flash: ~$0.02 (cheapest!)
- GPT-5: ~$0.20
- Claude 4.5: ~$0.30
- GPT-4: ~$0.50 (most expensive)
Per Podcast (10 minutes):
- Script generation (model-dependent): $0.05-$0.20
- ElevenLabs TTS: ~$0.30
- Total: $0.35-$0.50
Per Chat Message:
- Gemini 2.0 Flash: ~$0.001 (cheapest!)
- GPT-5/Gemini 2.5 Pro: ~$0.005
- Claude 4.5: ~$0.01
- GPT-4: ~$0.03
Monthly Estimate (100 documents, 1000 chats, 10 podcasts):
- With Gemini 2.0 Flash: ~$20/month
- With GPT-5: ~$50/month
- With Claude 4.5: ~$80/month
- With GPT-4: ~$150/month
Savings: Gemini 2.0 Flash is 87-93% cheaper than GPT-4!
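The per-unit figures can be combined to sanity-check a monthly budget for your own workload. The sketch below uses only the per-document, per-chat, and per-podcast numbers listed in this section; the function name and dictionary keys are hypothetical. Note that the itemized totals it produces come out lower than the rounded monthly estimates above, which presumably include headroom that is not broken down here.

```python
# Per-unit figures in USD, copied from this section.
LLAMACLOUD_PER_DOC = 0.10
MODEL_PER_DOC = {"gemini-2.0-flash": 0.02, "gpt-5": 0.20, "claude-4.5": 0.30, "gpt-4": 0.50}
MODEL_PER_CHAT = {"gemini-2.0-flash": 0.001, "gpt-5": 0.005, "claude-4.5": 0.01, "gpt-4": 0.03}
PODCAST_RANGE = (0.35, 0.50)  # low and high total per 10-minute podcast


def monthly_range(model, docs=100, chats=1000, podcasts=10):
    """Return (low, high) monthly cost for the given workload."""
    base = docs * (LLAMACLOUD_PER_DOC + MODEL_PER_DOC[model]) + chats * MODEL_PER_CHAT[model]
    low, high = (base + podcasts * p for p in PODCAST_RANGE)
    return round(low, 2), round(high, 2)


for model in MODEL_PER_DOC:
    low, high = monthly_range(model)
    print(f"{model:<18} ${low:.2f} to ${high:.2f}")
```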
🔒 Security Features
- ✅ Per-notebook LlamaCloud pipelines (complete data isolation)
- ✅ Private chats by default
- ✅ Granular sharing permissions
- ✅ Password hashing with bcrypt (12 rounds)
- ✅ SQL injection protection (SQLAlchemy ORM)
- ✅ Session-based authentication
- ✅ Access control on all operations
- ✅ Admin-only dashboard access
📚 Documentation
- README.md (this file) - Complete guide
- TRANSFORMATION.md - Before/after comparison, all features
- ENTERPRISE_SETUP.md - Enterprise features (in OLD-readme/)
- IMPLEMENTATION_SUMMARY.md - Technical details (in OLD-readme/)
- SIMPLIFIED_PLAN.md - Architecture decisions (in OLD-readme/)
🛠️ Maintenance
Clean Up Orphaned LlamaCloud Resources
One-time cleanup (if you have old test pipelines):
python cleanup_llamacloud.py
Going forward: Automatic! Deleting notebooks cleans up all cloud resources.
Check Background Tasks
Admin Dashboard (user #1 or email contains "admin"):
- View all background tasks
- See failed tasks
- Monitor processing status
- Track costs
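The kind of summary the dashboard presents can be approximated by counting tasks per status and collecting the failed ones. This is a sketch under assumptions: the Task fields and the status names ("pending", "processing", "completed", "failed") are illustrative and may not match the background_tasks table.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Task:
    id: int
    kind: str    # e.g. "document" or "podcast" (illustrative values)
    status: str  # "pending", "processing", "completed", or "failed"


def summarize(tasks):
    """Count tasks per status and list the ids of failed ones."""
    counts = Counter(t.status for t in tasks)
    failed = [t.id for t in tasks if t.status == "failed"]
    return dict(counts), failed


tasks = [
    Task(1, "document", "completed"),
    Task(2, "document", "failed"),
    Task(3, "podcast", "processing"),
    Task(4, "document", "completed"),
]
counts, failed = summarize(tasks)
print(counts)  # {'completed': 2, 'failed': 1, 'processing': 1}
print(failed)  # [2]
```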
Database Backup
# Backup database
docker exec instrumentation-postgres-1 pg_dump -U postgres postgres > backup.sql
# Restore database
docker exec -i instrumentation-postgres-1 psql -U postgres postgres < backup.sql
🎓 Quick Reference
Service URLs
- App: http://localhost:8501
- Jaeger Tracing: http://localhost:16686
- Database Admin: http://localhost:8080
- System: PostgreSQL
- Server: postgres
- Username: postgres
- Password: admin
- Database: postgres
Useful Commands
# Start everything
./start.sh
# Restart MCP server
killall python && uv run src/notebookllama/server.py &
# Restart Streamlit
killall python && streamlit run src/notebookllama/App.py
# Check database
PGPASSWORD=admin psql -h localhost -U postgres -d postgres -c "SELECT * FROM notebooks;"
# Clean up cloud resources
python cleanup_llamacloud.py
# View logs
tail -f server.log
# Reset database (CAUTION!)
docker compose down -v
docker compose up -d
uv run src/notebookllama/init_database.py
echo "yes" | uv run src/notebookllama/migrate_to_notebooks.py
🤝 Contributing
Areas for improvement:
- Additional AI models (Llama 3, Mistral, etc.)
- Real-time collaboration
- Export notebooks (PDF, Word)
- Mobile app
- API access with tokens
- Webhook notifications
- Email notifications
- Advanced search
📝 License
MIT License - See LICENSE file
🙏 Acknowledgments
Built with:
Original project: run-llama/notebookllama
Enhanced by: Claude Code (Anthropic)
📞 Support
- Repository: https://bitbucket.org/zlalani/sandbox-notebookllamalm
- Issues: Create an issue in repository
- Documentation: See TRANSFORMATION.md for complete feature list
🎯 Quick Start Checklist
- Docker installed and running
- Python 3.13+ installed
- uv package manager installed
- Repository cloned
- uv sync completed
- .env file created with all required API keys
- LlamaCloud extraction agent created
- LlamaCloud pipeline created
- Docker services started (docker compose up -d)
- Local PostgreSQL stopped
- Database initialized
- Database migrated
- MCP server running
- Streamlit app running
- Browser open to http://localhost:8501
- Account created
- First notebook created
- Documents uploaded
- Chat tested
- Podcast generated
- Sharing tested
Made with ❤️ using Claude Code
Version: 2.0.0 Enterprise Edition
Last Updated: October 2, 2025
Lines of Code: 8,600+
Development Time: 17+ hours
Total Cost: $78.39
🦙 Enjoy your Sandbox-NotebookLM! ✨