Vadym Samoilenko 44a512c41f Phase 1 Complete: Dual-bot architecture, knowledge base, access control
- Remove notebook mode, add RAG + Personal Assistant dual-bot setup
- Add knowledge base management (upload, URL scraping, document processing)
- Add user feature access control (allowed_features, features_override)
- Update admin dashboard with knowledge base tab
- Redesign login page, sidebar, and profile
- Add Celery tasks for async document processing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 21:26:40 +00:00

🚀 Enterprise AI Hub "Nexus"

Unified corporate AI platform with RAG, Executive Assistant, and Notebook modes

🎯 Overview

Nexus is an enterprise-grade AI platform that combines multiple AI capabilities into a unified interface:

  • Mode A — RAG Chat: Query your corporate SharePoint knowledge base with cited answers
  • Mode B — AI Assistant: Productivity tools (summarize, translate, extract action items)
  • Mode C — Notebook: Isolated document analysis via file uploads
  • Admin Dashboard: User management, LLM config, SharePoint source management

Current Status: MVP running; 🔄 SharePoint RAG pipeline in progress (Weeks 1-3 of 6 complete)


Features

🔐 Authentication

  • Microsoft Entra ID (Azure AD) OAuth integration
  • Dev mode login (3 roles: Super Admin, Content Manager, User)
  • JWT tokens with auto-refresh
  • Role-based access control (RBAC)
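
The role check behind RBAC can be sketched as a simple privilege comparison. This is an illustrative sketch only: the `Role` enum, its ordering, and the `content_manager` / `user` role strings are assumptions (only `super_admin` appears in this README), and the real logic lives in `app/core/auth.py`.

```python
from enum import IntEnum


class Role(IntEnum):
    """The three dev-login roles; the privilege ordering is an assumption."""
    USER = 1
    CONTENT_MANAGER = 2
    SUPER_ADMIN = 3


def parse_role(value: str) -> Role:
    """Map a role string such as 'super_admin' to a Role member."""
    return Role[value.upper()]


def has_access(user_role: Role, required: Role) -> bool:
    """Return True when the user's role is at least as privileged as required."""
    return user_role >= required
```

For example, `has_access(parse_role("super_admin"), Role.CONTENT_MANAGER)` is true, while a plain user is rejected from Super Admin routes.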

💬 RAG Chat (Mode A)

  • Real-time SSE streaming responses
  • Markdown rendering with syntax highlighting
  • Source citations with clickable links
  • Conversation management
  • Stop generation mid-stream
  • SharePoint document search (in progress — Week 4+)
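
For orientation, SSE streaming boils down to writing `data: ...` frames separated by blank lines. The sketch below shows that framing only; the JSON shape (`type`, `content`) and the `sse_events` helper are hypothetical and may not match what `chat.py` actually emits.

```python
import json
from typing import Iterable, Iterator


def sse_events(tokens: Iterable[str]) -> Iterator[str]:
    """Wrap each LLM token in a Server-Sent Events frame.

    Every frame is 'data: <json>' followed by a blank line, which is
    what the SSE wire format requires to delimit events.
    """
    for token in tokens:
        payload = json.dumps({"type": "token", "content": token})
        yield f"data: {payload}\n\n"
    # Hypothetical end-of-stream marker so the client knows to stop reading.
    yield "data: " + json.dumps({"type": "done"}) + "\n\n"
```

In FastAPI, a generator like this would typically be returned through a `StreamingResponse` with media type `text/event-stream`.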

🎯 AI Assistant (Mode B)

  • Summarize Meeting — Extract key points, decisions, action items
  • Translate Document — 8 languages supported
  • Extract Action Items — Identify tasks from text
  • Copy results to clipboard
  • LLM routing (Claude Sonnet, Gemini Flash, GPT)
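
LLM routing can be pictured as a lookup from task to provider. The table below is a sketch under assumptions: the README only states that Claude Sonnet handles writing and Gemini Flash handles summarization, so the task names, the mapping of translation and extraction to Claude, the fallback, and the model labels are all placeholders rather than the contents of `app/core/llm.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    provider: str
    model: str  # placeholder label, not a real API model identifier


# Assumed task names; only the summarize -> Gemini Flash and
# writing -> Claude Sonnet pairings come from the README.
ROUTES = {
    "summarize": Route("google", "gemini-flash"),
    "translate": Route("anthropic", "claude-sonnet"),
    "extract_action_items": Route("anthropic", "claude-sonnet"),
}

# Hypothetical fallback for tasks without an explicit route.
DEFAULT_ROUTE = Route("openai", "gpt")


def pick_route(task: str) -> Route:
    """Return the configured route for a task, or the fallback."""
    return ROUTES.get(task, DEFAULT_ROUTE)
```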

📒 Notebook Mode (Mode C)

  • File upload (PDF, DOCX, XLSX, TXT)
  • Isolated session per user
  • Pin sessions for long-term storage
  • Proxy to NotebookLlama backend

👑 Admin Dashboard (Super Admin)

  • User management table with role assignment
  • LLM provider configuration per mode
  • Analytics dashboard
  • SharePoint source management (in progress — Week 4)

🔄 SharePoint Integration (in progress)

  • Week 1 — Database models: SharePointSource, SharePointDocument, SharePointWebhook, SyncJob
  • Week 2 — Microsoft Graph API client: site/drive discovery, delta sync, file download, webhook management
  • Week 3 — Document processing pipeline: PDF/DOCX/XLSX extraction → chunking → OpenAI embeddings → Qdrant
  • Week 3 — Celery tasks: sync_sharepoint_source, process_single_document, renew_expiring_webhooks
  • Week 4 — REST API endpoints for source CRUD + manual sync trigger + webhook receiver
  • Week 5 — Celery services in Docker Compose + end-to-end webhook flow
  • Week 6 — Real SharePoint testing + unit tests + deployment
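
The chunking step of the Week 3 pipeline can be illustrated with a fixed-size sliding window. This is a minimal sketch, not the code in `document_processor.py`: the `chunk_text` name, the character-based sizing, and the default `chunk_size` and `overlap` values are assumptions.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping character windows.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk would then be embedded and upserted into Qdrant together with its document metadata.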

📱 Responsive Design

  • Desktop-optimized layout
  • Mobile-friendly sidebar drawer
  • Dark/Light mode support

🏗️ Architecture

Technology Stack

| Layer | Technology |
|-------|------------|
| Frontend | Next.js 14 (App Router), TypeScript, Tailwind CSS, Shadcn/UI, Zustand |
| Backend | FastAPI (Python 3.11), SQLAlchemy (async), JWT |
| Task Queue | Celery + Redis (SharePoint sync, session cleanup) |
| Databases | PostgreSQL 16, Redis 7, Qdrant (vector store) |
| LLM — RAG | GPT-4 / OpenAI text-embedding-3-large |
| LLM — Assistant | Claude Sonnet (writing), Gemini Flash (summarization) |
| SharePoint | Microsoft Graph API (Client Credentials), MSAL |

System Diagram

┌─────────────────┐
│   Next.js       │
│   Frontend      │◄──── User
│   :3000         │
└────────┬────────┘
         │ HTTP / SSE
         ▼
┌─────────────────┐      ┌──────────────┐
│   FastAPI       │      │  PostgreSQL  │
│   Backend       │◄────►│  :5432       │
│   :8000         │      └──────────────┘
└────┬───┬────────┘
     │   │
     │   ├──────►┌──────────────┐
     │   │       │    Redis     │
     │   │       │    :6379     │
     │   │       └──────────────┘
     │   │
     │   └──────►┌──────────────┐
     │           │   Qdrant     │
     │           │   :6333      │
     │           └──────────────┘
     │
     ▼ (Celery tasks)
┌──────────────────────────┐
│  Microsoft Graph API     │
│  SharePoint Documents    │
│  → Text Extract          │
│  → Chunk + Embed         │
│  → Qdrant Upsert         │
└──────────────────────────┘

Backend File Structure (current)

backend/
├── celery_app.py                         ✅ Celery config (Week 3)
├── app/
│   ├── api/v1/endpoints/
│   │   ├── auth.py                       ✅ Entra ID + dev login
│   │   ├── chat.py                       ✅ RAG streaming
│   │   ├── assistant.py                  ✅ Summarize / Translate / Extract
│   │   ├── notebook.py                   ✅ File upload + sessions
│   │   ├── admin.py                      ✅ User + LLM management
│   │   └── sharepoint.py                 🔜 Week 4
│   ├── core/
│   │   ├── auth.py                       ✅
│   │   ├── llm.py                        ✅ LLM router (multi-provider)
│   │   ├── sharepoint_client.py          ✅ Graph API client (Week 2)
│   │   └── document_processor.py        ✅ PDF/DOCX/XLSX pipeline (Week 3)
│   ├── models/
│   │   ├── user.py                       ✅
│   │   ├── conversation.py               ✅
│   │   └── sharepoint.py                 ✅ 4 models (Week 1)
│   ├── tasks/
│   │   ├── __init__.py                   ✅
│   │   └── sharepoint_sync.py            ✅ 4 Celery tasks (Week 3)
│   ├── rag/
│   │   ├── retriever.py                  ✅ Qdrant search
│   │   └── embeddings.py                 ✅ OpenAI embeddings
│   └── config.py                         ✅
├── alembic/versions/
│   ├── 001_initial_schema.py
│   ├── 002_conversations_messages.py
│   ├── 003_notebook_sessions.py
│   ├── 004_update_notebook_fields.py
│   └── 005_add_sharepoint_tables.py      ✅ Week 1

🚀 Quick Start

Prerequisites

  • Docker & Docker Compose
  • Node.js 18+
  • Git

1. Clone & Configure

git clone <repository-url>
cd enterprise-ai-hub-nexus
cp .env.example .env   # Root .env is loaded by Docker Compose; edit with real API keys

2. Start Backend Services (Docker)

docker-compose up -d

# Verify all containers are running
docker ps --filter "name=nexus" --format "table {{.Names}}\t{{.Status}}"

# Expected:
# nexus-postgres   Up (healthy)
# nexus-redis      Up (healthy)
# nexus-qdrant     Up
# nexus-backend    Up

3. Start Frontend

cd frontend
npm install
npm run dev

4. Access Application

Frontend:          http://localhost:3000
Backend API docs:  http://localhost:8000/docs
Qdrant Dashboard:  http://localhost:6333/dashboard

5. Login


💻 Development

Frontend

cd frontend
npm install
npm run dev        # Hot reload dev server
npm run type-check
npm run lint
npm run build

Backend

# Logs
docker logs -f nexus-backend

# Restart
docker restart nexus-backend

# Shell
docker exec -it nexus-backend bash

# Run migrations
docker exec nexus-backend python -m alembic upgrade head

Environment Variables

.env (root, loaded by Docker Compose):

# Database
POSTGRES_DB=nexus_db
POSTGRES_USER=nexus_user
POSTGRES_PASSWORD=changeme

# JWT
JWT_SECRET=your-secret-min-32-chars
JWT_ALGORITHM=HS256

# LLM Providers
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=AIza...

# Microsoft Entra ID (SharePoint + Auth)
ENTRA_CLIENT_ID=<app-id>
ENTRA_CLIENT_SECRET=<secret>
ENTRA_TENANT_ID=<tenant-id>

# SharePoint
SHAREPOINT_WEBHOOK_BASE_URL=https://your-domain.com
SHAREPOINT_TENANT_DOMAIN=company.sharepoint.com

# Infra
REDIS_URL=redis://nexus-redis:6379/0
QDRANT_URL=http://nexus-qdrant:6333

frontend/.env.local:

NEXT_PUBLIC_API_URL=http://localhost:8000/api/v1
NEXT_PUBLIC_APP_NAME=Nexus AI Hub
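
A sketch of how the backend might read and validate a few of these variables at startup. The `Settings` dataclass and `load_settings` helper are hypothetical (the real settings live in `app/config.py`), and the 32-character check is inferred from the `your-secret-min-32-chars` placeholder above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    jwt_secret: str
    jwt_algorithm: str
    redis_url: str
    qdrant_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on a weak secret."""
    secret = env.get("JWT_SECRET", "")
    if len(secret) < 32:
        raise ValueError("JWT_SECRET must be at least 32 characters")
    return Settings(
        jwt_secret=secret,
        jwt_algorithm=env.get("JWT_ALGORITHM", "HS256"),
        # Defaults mirror the Docker Compose service names shown above.
        redis_url=env.get("REDIS_URL", "redis://nexus-redis:6379/0"),
        qdrant_url=env.get("QDRANT_URL", "http://nexus-qdrant:6333"),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests.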

🔄 SharePoint Integration

The SharePoint RAG pipeline syncs documents from Microsoft SharePoint into Qdrant for semantic search.

Current Progress (Weeks 1-3 of 6)

| Week | Focus | Status |
|------|-------|--------|
| 1 | DB models + Alembic migration | Complete |
| 2 | Microsoft Graph API client | Complete |
| 3 | Document processing + Celery tasks | Complete |
| 4 | REST API endpoints + webhook receiver | 🔜 Next |
| 5 | Docker Compose Celery services + E2E test | Pending |
| 6 | Real SharePoint testing + unit tests | Pending |

Azure AD Requirements

Your Azure AD app needs Application permissions (not Delegated):

  • Sites.Read.All
  • Files.Read.All

Grant admin consent after adding permissions.
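
Application permissions are used with the OAuth 2.0 client credentials flow, where the app authenticates as itself rather than on behalf of a user. The project uses MSAL for this; the sketch below only builds the equivalent raw token request against the Microsoft identity platform v2.0 endpoint so the moving parts are visible. `build_token_request` is a hypothetical helper, and nothing is sent over the network.

```python
from urllib.parse import urlencode

# The '.default' scope requests all Application permissions already
# granted (with admin consent) to the app registration.
GRAPH_SCOPE = "https://graph.microsoft.com/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> tuple[str, bytes]:
    """Return the (url, form-encoded body) for a client credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": GRAPH_SCOPE,
    }).encode("ascii")
    return url, body
```

The three arguments correspond to `ENTRA_TENANT_ID`, `ENTRA_CLIENT_ID`, and `ENTRA_CLIENT_SECRET` from the root `.env`.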

Run Celery Worker (manual, outside Docker)

cd backend
celery -A celery_app worker --loglevel=info --queues=sharepoint
celery -A celery_app beat --loglevel=info

Supported Document Types

PDF, DOCX, DOC, XLSX, XLS, TXT

Qdrant Collection

  • Name: sharepoint_docs
  • Embedding model: text-embedding-3-large (3072 dimensions)
  • Indexed fields: sharepoint_id, source_id, department_id, region_code, is_active, file_type
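
As a rough sketch, the collection settings above translate into Qdrant REST request bodies like the following. The `Cosine` distance and the per-field schema types are assumptions (the README does not state them), and the helper names are hypothetical.

```python
COLLECTION = "sharepoint_docs"
VECTOR_SIZE = 3072  # output dimension of text-embedding-3-large

# Assumed payload index types for the indexed fields listed above.
INDEXED_FIELDS = {
    "sharepoint_id": "keyword",
    "source_id": "keyword",
    "department_id": "keyword",
    "region_code": "keyword",
    "is_active": "bool",
    "file_type": "keyword",
}


def create_collection_body() -> dict:
    """Body for PUT /collections/sharepoint_docs."""
    return {"vectors": {"size": VECTOR_SIZE, "distance": "Cosine"}}


def index_requests() -> list[tuple[str, dict]]:
    """(path, body) pairs for PUT /collections/sharepoint_docs/index, one per field."""
    path = f"/collections/{COLLECTION}/index"
    return [
        (path, {"field_name": name, "field_schema": schema})
        for name, schema in INDEXED_FIELDS.items()
    ]
```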

🧪 Testing

Quick API Tests

# Get dev auth token
TOKEN=$(curl -s -X POST http://localhost:8000/api/v1/auth/login/dev \
  -H 'Content-Type: application/json' \
  -d '{"email":"admin@nexus.dev","role":"super_admin"}' | \
  python3 -c "import sys, json; print(json.load(sys.stdin)['access_token'])")

# Test assistant summarize
curl -X POST http://localhost:8000/api/v1/assistant/summarize \
  -H "Authorization: Bearer $TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"text":"Meeting notes: Discussed Q1 goals. Budget approved."}'

# Test admin users list
curl -X GET http://localhost:8000/api/v1/admin/users \
  -H "Authorization: Bearer $TOKEN"

# Test conversation creation
curl -X POST http://localhost:8000/api/v1/chat/conversations \
  -H "Authorization: Bearer $TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"mode":"rag","title":"Test Chat"}'
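
The same smoke tests can be scripted from Python with only the standard library. The sketch below mirrors the dev login and summarize calls above; the helper names are hypothetical, and the requests are only built here, not sent.

```python
import json
from urllib.request import Request

API_BASE = "http://localhost:8000/api/v1"


def dev_login_request(email: str = "admin@nexus.dev", role: str = "super_admin") -> Request:
    """Build the POST request for the dev login endpoint."""
    body = json.dumps({"email": email, "role": role}).encode("utf-8")
    return Request(
        f"{API_BASE}/auth/login/dev",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize_request(token: str, text: str) -> Request:
    """Build an authenticated POST request for the assistant summarize endpoint."""
    body = json.dumps({"text": text}).encode("utf-8")
    return Request(
        f"{API_BASE}/assistant/summarize",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

With the backend running, each request can be sent with `urllib.request.urlopen(req)` and the JSON response read from the returned object.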

Verify SharePoint Code Loads

docker exec nexus-backend python3 -c "
import sys; sys.path.insert(0, '/app')
from app.core.sharepoint_client import SharePointGraphClient
from app.core.document_processor import DocumentProcessor
from app.tasks.sharepoint_sync import sync_sharepoint_source
print('All SharePoint imports OK')
"

Database Check

docker exec nexus-postgres psql -U nexus_user -d nexus_db -c "\dt"
# Should include: sharepoint_sources, sharepoint_documents, sharepoint_webhooks, sync_jobs

🐛 Troubleshooting

Backend won't start

docker logs nexus-backend
docker restart nexus-backend

Frontend errors

cd frontend && rm -rf .next && npm run dev

Port conflicts

lsof -i :3000  # Frontend
lsof -i :8000  # Backend
lsof -i :5432  # PostgreSQL

📊 Project Statistics

| Metric | Value |
|--------|-------|
| Frontend lines | ~8,000 (TypeScript/TSX) |
| Backend lines | ~6,500 (Python) |
| React components | 28 |
| API endpoints | 20+ |
| Database tables | 9 (3 core + 2 notebook + 4 sharepoint) |
| Alembic migrations | 5 |

🎯 What's Next

Week 4 (immediate)

  1. Create backend/app/api/v1/endpoints/sharepoint.py
    • POST /api/v1/sharepoint/sources — add SharePoint source
    • GET /api/v1/sharepoint/sources — list sources
    • POST /api/v1/sharepoint/sources/{id}/sync — trigger manual sync
    • POST /api/v1/sharepoint/webhooks/receive — receive Graph notifications
  2. Register router in main.py
  3. Migration 006_add_delta_link.py — add delta_link column to sharepoint_sources
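
One detail worth planning for in the webhook receiver: when a subscription is created, Microsoft Graph first calls the endpoint with a `validationToken` query parameter and expects it echoed back as plain text with a 200 response; later change notifications carry a `clientState` value that should be checked before any work is queued. A framework-agnostic sketch of that logic follows. The function name, return shape, and `expected_client_state` parameter are assumptions, since this endpoint is not yet implemented.

```python
import hmac
from typing import Any, Optional


def handle_graph_webhook(
    query: dict[str, str],
    body: Optional[dict[str, Any]],
    expected_client_state: str,
) -> tuple[int, str, list[dict[str, Any]]]:
    """Return (status_code, response_text, accepted_notifications)."""
    # Subscription validation handshake: echo the token as text/plain.
    token = query.get("validationToken")
    if token is not None:
        return 200, token, []

    # Change notifications: keep only entries whose clientState matches ours.
    accepted = []
    for note in (body or {}).get("value", []):
        state = str(note.get("clientState", ""))
        if hmac.compare_digest(state.encode("utf-8"), expected_client_state.encode("utf-8")):
            accepted.append(note)

    # Acknowledge quickly; heavy processing belongs in a Celery task.
    return 202, "", accepted
```

Accepted notifications would then be handed to a task such as `sync_sharepoint_source` rather than processed inside the request.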

Week 5

  1. Add celery-worker and celery-beat services to docker-compose.yml
  2. End-to-end webhook flow test

Week 6

  1. Real SharePoint connection test with actual credentials
  2. Unit tests for document processor and Graph client
  3. Deployment documentation

Built with ❤️ using Claude Sonnet 4.6