
🦙 NotebookLlaMa - Enterprise Multi-User NotebookLM Clone

A production-ready, open-source alternative to Google's NotebookLM with multi-user support, document collections, AI-powered chat, and podcast generation.


🌟 Features

Core Capabilities

  • 📓 Multi-Document Notebooks - Organize 1-100+ PDFs into collections
  • 💬 Intelligent Chat - Ask questions across ALL documents in a notebook
  • 🎙️ Podcast Generation - AI-generated audio conversations from your content
  • 🤝 Team Collaboration - Share notebooks with colleagues
  • 🔐 Enterprise Security - User authentication, data isolation, access controls
  • 📊 Observability - Full tracing with Jaeger and OpenTelemetry

What Makes This Special

  • Notebook-First Design - Documents are organized into collections (like Google NotebookLM)
  • Multi-Document Intelligence - Chat queries search across ALL documents simultaneously
  • Source Attribution - See which document each answer came from
  • True Multi-Tenancy - Complete data isolation between users
  • Production Ready - Database-backed, scalable architecture

📋 Table of Contents

  1. Prerequisites
  2. Installation
  3. Configuration
  4. First-Time Setup
  5. Running the Application
  6. User Guide
  7. Architecture
  8. Troubleshooting
  9. Deployment

🔧 Prerequisites

Required Software

  1. Docker Desktop

  2. Python 3.13+

  3. uv Package Manager

    • Install on macOS/Linux:
      curl -LsSf https://astral.sh/uv/install.sh | sh
      
    • Install on Windows:
      powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
      
    • Purpose: Fast Python package management

Required API Keys

You'll need accounts and API keys from:

  1. OpenAI (for GPT-4 chat and responses)

  2. LlamaCloud (for document parsing and indexing)

  3. ElevenLabs (for podcast voice generation)


📦 Installation

Step 1: Clone the Repository

git clone https://bitbucket.org/zlalani/sandbox-notebookllamalm.git notebookllama
cd notebookllama

Step 2: Install Python Dependencies

# Install all dependencies
uv sync

This installs:

  • Streamlit (web UI)
  • SQLAlchemy (database ORM)
  • LlamaIndex (AI workflows)
  • OpenAI, ElevenLabs clients
  • PostgreSQL driver
  • And 25+ other packages

Step 3: Start Docker Services

# Start PostgreSQL, Jaeger, and Adminer
docker compose up -d

This starts:

  • PostgreSQL on port 5432 (database)
  • Jaeger on port 16686 (tracing UI)
  • Adminer on port 8080 (database admin)

Verify Docker is running:

docker ps

You should see 3 containers: instrumentation-postgres-1, instrumentation-jaeger-1, instrumentation-adminer-1


⚙️ Configuration

Step 4: Set Up Environment Variables

Create your .env file:

# Create an empty .env file (touch leaves an existing file's contents unchanged)
touch .env

Edit .env with your favorite editor and add:

# ===== API Keys =====
OPENAI_API_KEY="sk-your-openai-api-key-here"
LLAMACLOUD_API_KEY="llx-your-llamacloud-api-key-here"
ELEVENLABS_API_KEY="sk_your-elevenlabs-api-key-here"

# ===== Database Configuration =====
pgql_db=postgres
pgql_user=postgres
pgql_psw=admin

# ===== LlamaCloud IDs (will be generated) =====
EXTRACT_AGENT_ID=""
LLAMACLOUD_PIPELINE_ID=""

Important:

  • Do NOT use quotes around database credentials
  • DO use quotes around API keys
  • Keep pgql_psw=admin (matches Docker setup)

Step 5: Create LlamaCloud Resources

Run these scripts to set up LlamaCloud extraction and indexing:

# Create extraction agent
uv run tools/create_llama_extract_agent.py

This will output an EXTRACT_AGENT_ID. Copy it to your .env file.

# Create indexing pipeline
uv run tools/create_llama_cloud_index.py

This will output a LLAMACLOUD_PIPELINE_ID. Copy it to your .env file.

Your .env should now look like:

OPENAI_API_KEY="sk-..."
LLAMACLOUD_API_KEY="llx-..."
ELEVENLABS_API_KEY="sk_..."
pgql_db=postgres
pgql_user=postgres
pgql_psw=admin
EXTRACT_AGENT_ID="cb7cdd30-81ea-4917-acd6-3bb505149289"
LLAMACLOUD_PIPELINE_ID="884e242c-86dd-4824-8347-e6dfb91d98dc"
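
Before moving on, it can help to confirm that every key listed above is present and non-empty. The following is a minimal sketch, not part of the repository: `parse_env` and `missing_keys` are hypothetical helpers, and the parser only understands simple `KEY=value` lines with optional double quotes.

```python
from pathlib import Path

# Keys this README asks you to set in .env.
REQUIRED_KEYS = [
    "OPENAI_API_KEY",
    "LLAMACLOUD_API_KEY",
    "ELEVENLABS_API_KEY",
    "pgql_db",
    "pgql_user",
    "pgql_psw",
    "EXTRACT_AGENT_ID",
    "LLAMACLOUD_PIPELINE_ID",
]


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Strip one pair of surrounding double quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        values[key.strip()] = value
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    for key in missing_keys(parse_env(text)):
        print(f"Missing or empty: {key}")
```

Run it from the repository root. It prints nothing when all eight keys have values.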

🎬 First-Time Setup

Step 6: Stop Any Conflicting Services

Important: Stop local PostgreSQL if you have it installed:

# macOS (Homebrew)
brew services stop postgresql@14
brew services stop postgresql@15
killall postgres

# Linux (systemd)
sudo systemctl stop postgresql

# Windows
# Stop PostgreSQL service from Services panel

Why? Local PostgreSQL conflicts with the Docker PostgreSQL on port 5432.

Step 7: Initialize the Database

# Create all database tables
uv run src/notebookllama/init_database.py

You should see:

✓ Database connection successful
✓ Database tables created successfully

Tables created:
  - users
  - documents
  - notebooks
  - document_summaries
  - notebook_documents
  - chat_sessions
  - chat_messages
  - document_shares

If you see errors, check:

  • Docker is running: docker ps
  • PostgreSQL container is healthy: docker logs instrumentation-postgres-1
  • No local PostgreSQL is running: lsof -i :5432 (should only show Docker)

Step 8: Run Database Migration

# Migrate schema to notebook-first architecture
echo "yes" | uv run src/notebookllama/migrate_to_notebooks.py

This sets up the multi-document notebook structure.


🚀 Running the Application

Automated Start

# Use the automated startup script
./start.sh

This script:

  1. Starts Docker services
  2. Checks database is initialized
  3. Stops conflicting PostgreSQL
  4. Starts MCP server
  5. Launches Streamlit app

Manual Start (More Control)

Terminal 1: Start MCP Server

uv run src/notebookllama/server.py

Keep this running. You should see:

INFO Starting MCP server 'MCP For NotebookLM'...
INFO Uvicorn running on http://127.0.0.1:8000

Terminal 2: Start Streamlit App

uv run streamlit run src/notebookllama/App.py

You should see:

You can now view your Streamlit app in your browser.
Local URL: http://localhost:8501

Verify Everything is Running

# Check all services
docker ps                    # Should show 3 containers
lsof -i :8000               # MCP server
lsof -i :8501               # Streamlit app
lsof -i :5432               # PostgreSQL (Docker only!)
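
On systems without lsof (Windows, for example), the same check can be approximated from Python. This is a rough sketch under stated assumptions: the port numbers come from this README, while `port_is_open` and `check_services` are hypothetical helper names and not part of the repository.

```python
import socket

# Ports taken from this README; the service labels are for display only.
SERVICES = {
    "PostgreSQL": 5432,
    "MCP server": 8000,
    "Adminer": 8080,
    "Streamlit": 8501,
    "Jaeger UI": 16686,
}


def port_is_open(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def check_services(services: dict[str, int] = SERVICES) -> dict[str, bool]:
    """Map each service name to whether its port is reachable."""
    return {name: port_is_open(port) for name, port in services.items()}


if __name__ == "__main__":
    for name, is_up in check_services().items():
        print(f"{name:12} {'up' if is_up else 'DOWN'}")
```

An open port only shows that something is listening. It cannot tell Docker's PostgreSQL from a local install, so lsof -i :5432 is still the right tool for that check.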

👤 User Guide

First-Time User Setup

  1. Open your browser to http://localhost:8501

  2. Create an account:

    • Click "Sign Up" tab
    • Enter email (any format, doesn't need to be real)
    • Enter username (unique)
    • Enter password (minimum 8 characters)
    • Click "Sign Up"
  3. You're in! You'll see the dashboard.

Creating Your First Notebook

Workflow:

Create Notebook → Upload PDFs → AI Processes → View Summary → Chat → Generate Podcast → Share

Step-by-Step:

1. Create Notebook

  • Click "Create New Notebook" (green button)
  • Name: "Q4 Marketing Analysis"
  • Description: "All marketing reports and research for Q4 2024"
  • Upload documents now? Upload 2-5 PDFs
  • Click "Create Notebook"

2. Wait for Processing

  • Each document takes 30-60 seconds
  • Progress bar shows status
  • Documents are processed sequentially

3. View Your Notebook

  • Automatically redirected to notebook detail
  • See all uploaded documents
  • View combined summaries from ALL documents
  • Browse highlights and Q&A

4. Chat with Your Notebook

  • Click "💬 Chat with Notebook"
  • Ask: "What are the main themes across all documents?"
  • AI searches ALL documents in your notebook
  • Responses show which document each answer came from

5. Generate a Podcast

  • In Notebook Detail, click "🎙️ Generate Podcast"
  • Click "Generate Now"
  • Wait 3-5 minutes
  • Listen to AI-generated 10-15 minute discussion
  • Download and share

6. Share with Team

  • Click "📤 Share Notebook"
  • Enter colleague's email
  • Choose permission: Read or Write
  • Click "Share"
  • They receive access to entire notebook
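
The Read and Write permissions can be pictured with a small access check. This is an illustrative sketch only: the `Permission`, `Notebook`, `can_read`, and `can_write` names are hypothetical, and it assumes that Write implies Read and that the owner always has full access. The actual rules live in the application code.

```python
from dataclasses import dataclass, field
from enum import Enum


class Permission(Enum):
    READ = "read"
    WRITE = "write"


@dataclass
class Notebook:
    owner: str
    # Maps a collaborator's email to the permission they were granted.
    shares: dict[str, Permission] = field(default_factory=dict)


def can_read(notebook: Notebook, user: str) -> bool:
    """The owner and any collaborator (Read or Write) may view."""
    return user == notebook.owner or user in notebook.shares


def can_write(notebook: Notebook, user: str) -> bool:
    """The owner and collaborators with Write permission may modify."""
    return user == notebook.owner or notebook.shares.get(user) is Permission.WRITE
```

A user who is neither the owner nor listed in `shares` fails both checks, which matches the private-by-default behavior described in this guide.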

Adding More Documents Later

  1. Open any notebook
  2. Click "Add Documents"
  3. Upload more PDFs
  4. They're processed and added to the collection
  5. Summaries and chat are automatically updated

Managing Notebooks

  • Edit: Change name/description
  • Delete: Removes notebook and all data
  • Remove Document: Take a document out of notebook
  • View Shared: See notebooks others shared with you

🏗️ Architecture

Application Structure

NotebookLlaMa/
├── src/notebookllama/
│   ├── App.py                      # Main entry point (dashboard)
│   ├── auth.py                     # Authentication system
│   ├── database.py                 # SQLAlchemy ORM models
│   ├── notebook_manager.py         # Notebook CRUD operations
│   ├── document_manager.py         # Document CRUD operations
│   ├── workflow.py                 # LlamaIndex workflow
│   ├── utils.py                    # LlamaCloud API calls
│   ├── audio.py                    # Podcast generation
│   ├── server.py                   # MCP server (for chat)
│   ├── init_database.py            # Database initialization
│   ├── migrate_to_notebooks.py     # Database migration
│   └── pages/
│       ├── 1_My_Notebooks.py       # List and create notebooks
│       ├── 2_Notebook_Detail.py    # View/manage notebook
│       ├── 3_Notebook_Chat.py      # Chat interface
│       ├── 4_Shared_Notebooks.py   # Shared notebooks view
│       └── 5_Observability_Dashboard.py  # Performance monitoring
├── compose.yaml                    # Docker services configuration
├── pyproject.toml                  # Python dependencies
├── start.sh                        # Automated startup script
├── README.md                       # This file
├── ENTERPRISE_SETUP.md             # Detailed setup guide
├── SIMPLIFIED_PLAN.md              # Architecture overview
├── IMPLEMENTATION_SUMMARY.md       # Technical details
└── CURRENT_STATUS.md               # Known issues

Technology Stack

  • Frontend: Streamlit (Python web framework)
  • Backend: FastMCP, LlamaIndex
  • Database: PostgreSQL with SQLAlchemy ORM
  • AI Services:
    • OpenAI GPT-4 (chat, structured responses)
    • LlamaCloud (document parsing, extraction, indexing)
    • ElevenLabs (text-to-speech for podcasts)
  • Observability: Jaeger, OpenTelemetry
  • Authentication: bcrypt password hashing

Database Schema

users
└── notebooks (collections of documents)
    ├── notebook_documents (junction table)
    │   └── documents (PDF files)
    │       └── document_summaries (AI analysis)
    ├── chat_sessions
    │   └── chat_messages
    └── document_shares (sharing permissions)

8 Tables Total:

  • users - User accounts
  • notebooks - Document collections
  • documents - Uploaded PDFs
  • notebook_documents - Links documents to notebooks
  • document_summaries - AI-generated summaries, Q&A, highlights
  • chat_sessions - Conversation sessions
  • chat_messages - Individual messages
  • document_shares - Sharing and permissions
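
To make the role of the notebook_documents junction table concrete, here is a small sketch of the many-to-many link between notebooks and documents. It swaps in SQLite from the Python standard library for PostgreSQL and SQLAlchemy so it runs without any services. Only the table names come from this README; the columns are hypothetical and heavily trimmed.

```python
import sqlite3

# Hypothetical, trimmed-down columns; only the table names come from this README.
SCHEMA = """
CREATE TABLE notebooks (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE documents (id INTEGER PRIMARY KEY, filename TEXT NOT NULL);
CREATE TABLE notebook_documents (
    notebook_id INTEGER NOT NULL REFERENCES notebooks(id),
    document_id INTEGER NOT NULL REFERENCES documents(id),
    PRIMARY KEY (notebook_id, document_id)
);
"""


def documents_in_notebook(conn: sqlite3.Connection, notebook_id: int) -> list[str]:
    """List filenames linked to a notebook through the junction table."""
    rows = conn.execute(
        """
        SELECT d.filename
        FROM documents AS d
        JOIN notebook_documents AS nd ON nd.document_id = d.id
        WHERE nd.notebook_id = ?
        ORDER BY d.filename
        """,
        (notebook_id,),
    )
    return [row[0] for row in rows]


def demo() -> list[str]:
    """Create one notebook with two linked documents and one unlinked one."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO notebooks VALUES (1, 'Q4 Marketing Analysis')")
    conn.executemany(
        "INSERT INTO documents VALUES (?, ?)",
        [(1, "report.pdf"), (2, "survey.pdf"), (3, "unrelated.pdf")],
    )
    conn.executemany(
        "INSERT INTO notebook_documents VALUES (?, ?)", [(1, 1), (1, 2)]
    )
    return documents_in_notebook(conn, 1)
```

Because the link lives in its own table, removing a document from a notebook only deletes a notebook_documents row. In this sketch the document row itself is untouched.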

Data Flow

User uploads PDF
    ↓
Document saved to database
    ↓
Sent to LlamaCloud for parsing
    ↓
Sent to LlamaExtract for analysis
    ↓
Summary/Q&A/Highlights generated
    ↓
Saved to document_summaries table
    ↓
Added to LlamaCloud index
    ↓
Available for chat
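
The flow above is a linear sequence of stages, which can be sketched as a list of functions applied in order. Everything here is a placeholder: the stage names paraphrase the diagram, `make_stage` and `process` are hypothetical, and no LlamaCloud or database call is made.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Document:
    filename: str
    # Names of the stages this document has completed, in order.
    completed: list[str] = field(default_factory=list)


def make_stage(name: str) -> Callable[[Document], Document]:
    """Build a placeholder stage that only records its own name."""
    def stage(doc: Document) -> Document:
        doc.completed.append(name)
        return doc
    return stage


# Stage order mirrors the data flow diagram above.
PIPELINE = [
    make_stage("save_to_database"),
    make_stage("parse_with_llamacloud"),
    make_stage("analyze_with_llamaextract"),
    make_stage("save_summary"),
    make_stage("add_to_index"),
]


def process(doc: Document) -> Document:
    """Run every stage in order; an exception in any stage stops the run."""
    for stage in PIPELINE:
        doc = stage(doc)
    return doc
```

Recording completed stages on the document shows how far a failed upload got. A document that stops before save_summary would, for instance, have been parsed but have no summary yet.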

🐛 Troubleshooting

Common Issues

1. Database Connection Failed

Symptom: role "postgres" does not exist or connection errors

Solution:

# Stop local PostgreSQL
brew services stop postgresql@14
killall postgres

# Restart Docker with fresh database
docker compose down -v
docker compose up -d

# Wait 5 seconds, then reinitialize
sleep 5
uv run src/notebookllama/init_database.py
echo "yes" | uv run src/notebookllama/migrate_to_notebooks.py
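
A fixed five-second sleep may be too short on a slow machine. One alternative is to poll the port until it accepts connections. The helper below is a hedged sketch: `wait_for_port` is a hypothetical name, and an open port does not guarantee that PostgreSQL has finished initializing, so rerun init_database.py if it still fails.

```python
import socket
import time


def wait_for_port(
    port: int,
    host: str = "127.0.0.1",
    timeout: float = 30.0,
    interval: float = 0.5,
) -> bool:
    """Poll host:port until it accepts a TCP connection or timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


if __name__ == "__main__":
    # 5432 is the PostgreSQL port used throughout this README.
    ready = wait_for_port(5432, timeout=5.0)
    print("PostgreSQL port is accepting connections" if ready else "Timed out")
```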

2. MCP Server Not Responding

Symptom: Chat doesn't work, 500 errors

Solution:

# Check if server is running
lsof -i :8000

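# If not running or crashed, stop any stale server process and restart it
# (pkill -f targets only the MCP server; killall python would also stop Streamlit)
pkill -f src/notebookllama/server.py
uv run src/notebookllama/server.py > server.log 2>&1 &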

# Check logs
tail -f server.log

3. Document Processing Fails

Symptom: "Error processing document" or uploads fail

Check:

# Verify API keys are set
grep OPENAI_API_KEY .env
grep LLAMACLOUD_API_KEY .env
grep ELEVENLABS_API_KEY .env

# Verify LlamaCloud IDs exist
grep EXTRACT_AGENT_ID .env
grep LLAMACLOUD_PIPELINE_ID .env

# Re-run LlamaCloud setup if needed
uv run tools/create_llama_extract_agent.py
uv run tools/create_llama_cloud_index.py

4. Port Already in Use

Symptom: "Address already in use" errors

Solution:

# Port 5432 (PostgreSQL)
lsof -i :5432
killall postgres  # Kill local postgres

# Port 8000 (MCP Server)
lsof -i :8000
kill -9 <PID>

# Port 8501 (Streamlit)
lsof -i :8501
kill -9 <PID>

5. Summaries Not Saving

Symptom: Documents show "Processing... Summary not yet available"

Cause: MCP server crashed during processing (known bug in MCP library)

Solution:

  • The newest code bypasses MCP for document processing
  • Summaries should now save reliably
  • If old documents have no summaries, re-upload them

6. Import Errors

Symptom: cannot import name 'core' from 'llama_index'

Solution:

# Reinstall/upgrade packages
uv sync --reinstall

🚢 Deployment

For Production Use

Environment Setup

  1. Use Managed PostgreSQL

    • AWS RDS, Google Cloud SQL, or Azure Database
    • Enable automated backups
    • Set up read replicas for scaling
  2. Environment Variables

    • Use secrets management (AWS Secrets Manager, etc.)
    • Never commit API keys to git
    • Rotate keys regularly
  3. Security Hardening

    • Enable HTTPS/TLS
    • Set up firewall rules
    • Implement rate limiting
    • Add CSRF protection
    • Set session timeout (30 minutes recommended)

Scaling Considerations

Single Server (< 50 users):

# Run everything on one machine
docker compose up -d
uv run src/notebookllama/server.py &
uv run streamlit run src/notebookllama/App.py

Multi-Server (50-1000 users):

  • Load balancer (nginx/HAProxy)
  • Multiple Streamlit instances
  • Separate MCP server instances
  • Managed PostgreSQL
  • Redis for session caching

Enterprise (1000+ users):

  • Kubernetes deployment
  • Auto-scaling groups
  • CDN for static assets
  • Separate database for each service
  • Message queue (RabbitMQ/Kafka) for async tasks
  • Dedicated job workers for document processing

Monitoring

# Access Jaeger UI
http://localhost:16686

# Access database admin
http://localhost:8080

Set up alerts for:

  • API rate limits
  • Database connection pool exhaustion
  • Disk space (for podcasts and uploads)
  • Processing failures

📊 Usage Metrics

Performance

  • Document Processing: 30-60 seconds per PDF
  • Chat Response: 3-5 seconds per query
  • Podcast Generation: 3-5 minutes for 10-minute audio
  • Page Load: < 1 second with caching

Resource Requirements

Minimum:

  • 4 GB RAM
  • 2 CPU cores
  • 20 GB disk space

Recommended:

  • 8 GB RAM
  • 4 CPU cores
  • 100 GB disk space (for documents and podcasts)

API Usage Estimates

Per Document (assuming 20-page PDF):

  • LlamaCloud: ~$0.10 (parsing + extraction)
  • OpenAI: ~$0.50 (summary + Q&A generation)
  • Total: ~$0.60 per document

Per Podcast (10 minutes):

  • OpenAI: ~$0.20 (conversation script)
  • ElevenLabs: ~$0.30 (voice generation)
  • Total: ~$0.50 per podcast

Per Chat Message:

  • OpenAI: ~$0.01 per query
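
The per-unit figures above combine into a simple budget estimate. This sketch uses only the approximate numbers from this section; the function names are hypothetical, and real costs depend on document length and current provider pricing.

```python
# Per-unit estimates from this README, in US cents to avoid float rounding.
COST_PER_DOCUMENT_CENTS = 60      # ~$0.10 LlamaCloud + ~$0.50 OpenAI
COST_PER_PODCAST_CENTS = 50       # ~$0.20 OpenAI + ~$0.30 ElevenLabs
COST_PER_CHAT_MESSAGE_CENTS = 1   # ~$0.01 OpenAI


def estimate_cost_cents(documents: int, podcasts: int, chat_messages: int) -> int:
    """Estimate total API spend in cents for the given usage."""
    return (
        documents * COST_PER_DOCUMENT_CENTS
        + podcasts * COST_PER_PODCAST_CENTS
        + chat_messages * COST_PER_CHAT_MESSAGE_CENTS
    )


def format_usd(cents: int) -> str:
    """Format a non-negative cent amount as dollars, e.g. 65 as $0.65."""
    return f"${cents // 100}.{cents % 100:02d}"
```

For example, 10 documents, 2 podcasts, and 100 chat messages come to 600 + 100 + 100 = 800 cents, which `format_usd` renders as $8.00.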

🔒 Security

Current Implementation

  • Password hashing with bcrypt (salt rounds: 12)
  • SQL injection protection via SQLAlchemy ORM
  • Session-based authentication
  • Per-user data isolation
  • Document access controls
  • Granular sharing permissions

Recommended for Production

  • HTTPS/TLS encryption
  • Rate limiting per user
  • Input validation and sanitization
  • CSRF tokens
  • Session expiration (30-60 minutes)
  • Two-factor authentication (2FA)
  • Audit logging
  • IP allowlisting
  • API key rotation

📚 Documentation Files

  • README.md (this file) - Main documentation
  • ENTERPRISE_SETUP.md - Detailed enterprise features guide
  • SIMPLIFIED_PLAN.md - Architecture and design decisions
  • IMPLEMENTATION_SUMMARY.md - Technical implementation details
  • CURRENT_STATUS.md - Known issues and limitations
  • FINAL_README.md - Quick reference guide

🤝 Contributing

We welcome contributions! Areas for improvement:

  • Notebook-level synthesis (consolidate summaries across docs)
  • Cross-document Q&A generation
  • Podcast length controls
  • Background job queue for processing
  • Advanced search across all notebooks
  • Export functionality (PDF, Word)
  • Mobile-responsive UI
  • API access with tokens

📝 License

MIT License - See LICENSE file for details


🙏 Acknowledgments

Built with:

Original project: run-llama/notebookllama


📞 Support

  • Issues: Create an issue in the repository
  • Questions: Check documentation files first
  • Bugs: Include error logs and steps to reproduce

🎓 Quick Reference

Useful Commands

# Start everything
./start.sh

# Restart MCP server
pkill -f src/notebookllama/server.py; uv run src/notebookllama/server.py &

# Check database
PGPASSWORD=admin psql -h localhost -U postgres -d postgres -c "SELECT * FROM notebooks;"

# View logs
tail -f server.log

# Reset database (CAUTION: deletes all data!)
docker compose down -v
docker compose up -d
uv run src/notebookllama/init_database.py
echo "yes" | uv run src/notebookllama/migrate_to_notebooks.py

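Service URLs

  • Streamlit app: http://localhost:8501
  • MCP server: http://127.0.0.1:8000
  • Jaeger UI: http://localhost:16686
  • Adminer (database admin): http://localhost:8080
  • PostgreSQL: localhost:5432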

Made with ❤️ using Claude Code

Last Updated: October 1, 2025