No description
Find a file
2026-03-31 22:18:18 +01:00
backend feat: purge orphaned Qdrant vectors button 2026-03-31 22:11:13 +01:00
deploy refactor: replace Scrapling with Firecrawl API for URL scraping 2026-03-31 20:12:45 +01:00
docs docs: rewrite README + add PDF user guide and technical documentation 2026-03-05 23:01:47 +00:00
frontend fix: user guide section 7.5-7.6 match actual UI button names 2026-03-31 22:18:18 +01:00
.env.example feat: add MASTER_API_KEY to code-interpreter service for admin dashboard 2026-03-30 21:31:20 +01:00
.gitignore fix: add frontend/public/ to git (was ignored by root gitignore 'public' rule) 2026-03-05 18:14:31 +00:00
cloudbuild.yaml Configure Cloud Run worker: VPC connector, env vars, Secret Manager 2026-03-05 00:08:31 +00:00
concept.md Phase 1 Complete: Environment Setup 2026-02-12 17:31:54 +00:00
CONTEXT_HANDOVER.md Initial commit: Phases 1-5 Complete + Frontend Setup 2026-02-12 19:10:28 +00:00
CONTEXT_HANDOVER_BACKEND_COMPLETE.md Initial commit: Phases 1-5 Complete + Frontend Setup 2026-02-12 19:10:28 +00:00
CONTEXT_HANDOVER_DEV_LOGIN.md Phase 6 Complete: Assistant Mode, Admin Dashboard, and Final Polish 2026-02-12 20:30:27 +00:00
CONTEXT_HANDOVER_FRONTEND_MVP.md Phase 6 Complete: Assistant Mode, Admin Dashboard, and Final Polish 2026-02-12 20:30:27 +00:00
CONTEXT_HANDOVER_PHASE4.md Initial commit: Phases 1-5 Complete + Frontend Setup 2026-02-12 19:10:28 +00:00
CONTEXT_HANDOVER_PHASE5.md Initial commit: Phases 1-5 Complete + Frontend Setup 2026-02-12 19:10:28 +00:00
CONTEXT_HANDOVER_PHASE6_2_AUTH_UI.md Initial commit: Phases 1-5 Complete + Frontend Setup 2026-02-12 19:10:28 +00:00
CONTEXT_HANDOVER_PHASE6_3_RAG_CHAT.md Phase 6 Complete: Assistant Mode, Admin Dashboard, and Final Polish 2026-02-12 20:30:27 +00:00
CONTEXT_HANDOVER_PHASE6_4_NOTEBOOK.md Phase 6 Complete: Assistant Mode, Admin Dashboard, and Final Polish 2026-02-12 20:30:27 +00:00
CONTEXT_HANDOVER_PHASE6_COMPLETE.md Backend fixes: Enable Assistant & Admin endpoints, fix model issues 2026-02-12 21:57:02 +00:00
CONTEXT_HANDOVER_PHASE6_SETUP.md Initial commit: Phases 1-5 Complete + Frontend Setup 2026-02-12 19:10:28 +00:00
CONTEXT_HANDOVER_SESSION_2026_02_17.md Week 2-3 Complete: SharePoint Graph Client + Document Processing Pipeline 2026-02-20 10:41:02 +00:00
deploy.sh fix: correct stale port 8100 reference in deploy.sh note 2026-03-30 20:42:39 +01:00
docker-compose.prod.yml fix: SharePoint sync reliability — celery beat, token refresh, scopes 2026-03-31 21:20:45 +01:00
docker-compose.yml Phase 1 Complete: Dual-bot architecture, knowledge base, access control 2026-03-04 21:26:40 +00:00
HANDOVER_WEEK_1_COMPLETE.md Week 2-3 Complete: SharePoint Graph Client + Document Processing Pipeline 2026-02-20 10:41:02 +00:00
HANDOVER_WEEK_2_3_COMPLETE.md Week 2-3 Complete: SharePoint Graph Client + Document Processing Pipeline 2026-02-20 10:41:02 +00:00
implementation_plan.md Phase 1 Complete: Environment Setup 2026-02-12 17:31:54 +00:00
README.md docs: rewrite README + add PDF user guide and technical documentation 2026-03-05 23:01:47 +00:00
SETUP_COMPLETE.md Week 2-3 Complete: SharePoint Graph Client + Document Processing Pipeline 2026-02-20 10:41:02 +00:00
technical_spec.md Phase 1 Complete: Environment Setup 2026-02-12 17:31:54 +00:00

Enterprise AI Hub Nexus

Secure AI platform for knowledge management, RAG chat, and Microsoft 365 productivity — built for OLIVER Agency.

Status Frontend Backend Auth


Table of Contents


Overview

Nexus is an enterprise AI platform that provides:

  • RAG Chat — natural language questions answered from the company knowledge base, with source citations
  • Personal Assistant — read emails, calendar, OneDrive files and SharePoint content via Microsoft Graph (read-only)
  • Knowledge Base Management — admin panel to upload, manage and re-index documents
  • Multi-language support — ask in any language; responses are in the same language
  • Department & Region scoping — content filtered per team and location

The platform runs in production on a GCE VM (optical-web-1) with Apache as a reverse proxy, Docker Compose for backend services, and a Google Cloud Run microservice for heavy document processing.


Features

Authentication

  • Microsoft Entra ID (Azure AD) — PKCE SPA flow, no client_secret
  • JWT tokens (HS256, 8-hour lifetime)
  • Role-based access: super_admin, content_manager, user
  • Microsoft 365 consent flow for Personal Assistant mode

RAG Chat

  • Multi-query expansion — 3 search variants (translated + UK English + US English terminology)
  • Parallel Qdrant vector search across all variants
  • LLM reranking (Claude Haiku, 010 score) from up to 60 candidates → top 5
  • Contextual chunk embedding ([Document Title]\n\nchunk text)
  • Document summary vectors (AI-generated 34 sentence summary per document)
  • Source citations with SharePoint links
  • SSE streaming responses

Personal Assistant (M365)

  • Read emails, summarise threads
  • Read calendar events
  • List and search OneDrive files
  • Search SharePoint document libraries
  • Agentic tool-calling loop (parallel tool execution, up to 5 rounds)

Knowledge Base

  • Upload: PDF, DOCX, DOC, XLSX, XLS, PPTX, PPT, TXT, CSV
  • Web page scraping (URL → index)
  • SharePoint document library browser and import
  • SHA-256 deduplication (identical files skipped)
  • 4-concurrent upload queue
  • Per-document reprocess button (failed or 0-chunk documents)
  • Re-index All — re-embeds all documents with current pipeline
  • Stop Re-index — cancels pending reindex tasks
  • Bulk delete with checkboxes
  • Stats: total docs, completed, failed, vectors in Qdrant
  • Sortable table, limit 1000 documents

Admin Panel

  • User management (invite, role, department, region)
  • Department management
  • LLM provider API key configuration
  • Analytics dashboard
  • SharePoint source configuration

Architecture

User Browser (Next.js SPA)
        │  HTTPS / SSE
        ▼
Apache 2.4 (reverse proxy + static files)
        │
        ├─── /nexus/*  ──→  /var/www/html/enterprise-ai-hub-nexus/  (static export)
        │
        └─── /api/v1/* ──→  FastAPI :8000 (Docker)
                                │
                    ┌───────────┼───────────┐
                    ▼           ▼           ▼
                Qdrant       PostgreSQL   Redis
               :6333          :5432       :6379
               (vectors)    (metadata)  (Celery)
                    │
                    └── Cloud Run: Doc Processor (extract + chunk)
                    └── Azure AD / MS Graph (auth + M365 tools)

GCE VM optical-web-1 hosts: Apache, FastAPI, Qdrant, PostgreSQL, Redis, Celery worker/beat — all in Docker Compose.

Google Cloud Run (nexus-processor, europe-west1) handles CPU-intensive document extraction and chunking, called via HTTPS from the backend VM.


Technology Stack

Layer Technology Details
Frontend Next.js 14 App Router, static export, basePath: /nexus
Frontend React 18 + TypeScript 5
Frontend Tailwind CSS + shadcn/ui
Frontend Zustand useAuthStore, useChatStore
Backend FastAPI 0.115+ Python 3.11+
Backend SQLAlchemy 2.x (async) asyncpg driver
Backend Alembic 14 migrations
Backend Celery + Redis Background tasks and scheduler
AI — RAG OpenAI GPT-5 (gpt-5.2) Streaming answers
AI — Assistant Anthropic Claude Sonnet (claude-sonnet-4-6) Agentic tool loop
AI — Reranking Anthropic Claude Haiku (claude-haiku-4-5) Rerank, summaries, query expansion
AI — Summary Google Gemini (gemini-3.1-pro-preview) Summarisation, planning
AI — Embeddings OpenAI text-embedding-3-large 3072 dimensions
Vector DB Qdrant 1.12.x Self-hosted on GCE VM
Relational DB PostgreSQL 15
Cloud Google Cloud Run Document processor microservice
Infrastructure GCE VM n2d-standard-4 Backend + all services
Auth Azure AD / Entra ID PKCE SPA flow
Auth Microsoft Graph API v1.0 User profile + M365 tools
Web server Apache 2.4 Reverse proxy + static files
Containers Docker Compose docker-compose.prod.yml

Repository Structure

enterprise-ai-hub-nexus/
│
├── backend/
│   ├── app/
│   │   ├── api/v1/endpoints/
│   │   │   ├── auth.py              # PKCE login → JWT
│   │   │   ├── chat.py              # SSE streaming RAG + assistant
│   │   │   ├── knowledge.py         # Document upload, list, delete, reindex
│   │   │   ├── users.py             # User CRUD (super_admin)
│   │   │   ├── departments.py       # Department management
│   │   │   └── config.py            # LLM key configuration
│   │   ├── core/
│   │   │   ├── document_processor.py   # Extract, chunk, embed, upsert to Qdrant
│   │   │   ├── llm.py                  # LLMFactory — multi-provider, streaming, tool loop
│   │   │   ├── cloud_run_client.py     # HTTP client for Cloud Run processor
│   │   │   └── web_scraper.py          # URL → text via trafilatura
│   │   ├── rag/
│   │   │   └── retriever.py         # Multi-query expansion, parallel search, LLM rerank
│   │   ├── tools/                   # Personal assistant tools (email, calendar, files)
│   │   ├── models/                  # SQLAlchemy ORM models
│   │   ├── schemas/                 # Pydantic request/response schemas
│   │   ├── config.py                # pydantic-settings (env vars)
│   │   └── database.py              # Async engine + AsyncSessionLocal
│   ├── alembic/versions/            # 14 migration files
│   ├── cloud_run_processor/         # Cloud Run microservice (extract + chunk only)
│   ├── Dockerfile
│   └── requirements.txt
│
├── frontend/
│   ├── app/
│   │   ├── admin/page.tsx           # Admin dashboard
│   │   ├── auth/callback/page.tsx   # OAuth callback handler
│   │   └── chat/page.tsx            # Main chat UI
│   ├── components/
│   │   ├── admin/                   # KnowledgeUploader, SharePointBrowser, UsersTab, etc.
│   │   ├── auth/protected-route.tsx # Auth guard with hydration tracking
│   │   └── chat/chat-interface.tsx  # SSE stream consumer, citations
│   ├── lib/
│   │   ├── api-client.ts            # Typed API client with JWT auto-attach
│   │   └── microsoft-oauth.ts       # PKCE flow + MS token exchange
│   ├── store/                       # useAuthStore, useChatStore (Zustand)
│   └── types/                       # TypeScript types
│
├── docs/
│   ├── 01_Enterprise_AI_Hub_Nexus_User_Guide.pdf
│   ├── 02_Enterprise_AI_Hub_Nexus_Technical_Documentation.pdf
│   └── OLIVER_BRAND_ADAPTATION.md
│
├── docker-compose.prod.yml          # Production services
├── docker-compose.local.yml         # Local dev (db :5433, redis :6380, backend :1222)
└── deploy.sh                        # Full deploy script

RAG Pipeline

The retrieval pipeline lives in backend/app/rag/retriever.py.

User Query
    │
    ▼
Query Expansion (Claude Haiku)
  → Variant 1: normalised / translated
  → Variant 2: UK English (annual leave, holiday, redundancy...)
  → Variant 3: US English (vacation, PTO, layoff...)
    │
    ▼
Parallel Embed (text-embedding-3-large, asyncio.gather)
    │
    ▼
Parallel Qdrant Search (top_k=10 per variant, filters: is_active, department, region)
    │
    ▼
Merge + Dedup by point ID (highest score kept) → up to 60 candidates
    │
    ▼
LLM Reranking (Claude Haiku, score 010 per chunk) → top 5
    │
    ▼
LLM Answer (GPT-5, streaming SSE) + source citations

Key design decisions:

  • Multi-query expansion bridges UK/US terminology differences in HR documents
  • Reranking replaces binary yes/no grading with continuous relevance scores
  • Each chunk stored with original text; contextualised version used only for embedding
  • Document summary vectors improve topic-level discovery

Document Processing

backend/app/core/document_processor.py — two-phase design:

Phase 1 — extract_and_chunk (runs on Cloud Run, no Qdrant/OpenAI needed):

  • PDF (text-based): MarkItDown
  • PDF (scanned): LlamaParse (cloud OCR), falls back to MarkItDown
  • DOCX, XLSX, PPTX: MarkItDown
  • TXT/CSV: direct UTF-8 decode
  • Chunk size: 1000 chars, overlap: 200 chars

Phase 2 — embed_and_upsert (runs on backend VM):

  • Delete existing vectors for document (re-ingestion safe)
  • Contextualise chunks: [Document Title]\n\nchunk text
  • Embed in parallel batches of 100 (asyncio.gather)
  • Generate document summary (Claude Haiku, 34 sentences)
  • Upsert all points + summary vector to Qdrant
  • Update DB status: COMPLETED / FAILED

Background processing — all document operations use FastAPI BackgroundTasks with an independent DB session to avoid StaleDataError from long-lived HTTP sessions.


Authentication

PKCE OAuth 2.0 flow — no client_secret:

  1. Browser generates code_verifier + code_challenge (S256)
  2. Redirect to Azure AD with code_challenge
  3. Azure AD returns auth_code to /auth/callback
  4. Browser exchanges auth_code + code_verifier → MS access_token (direct to Microsoft)
  5. Browser POSTs ms_access_token to POST /api/v1/auth/login
  6. Backend calls MS Graph /me to validate token and get user profile
  7. Backend returns signed app JWT
  8. All subsequent API calls use Authorization: Bearer <JWT>

For Personal Assistant mode, the MS access_token is passed in the request body as graph_token and used for Graph API calls on-demand.


Local Development

Prerequisites

  • Docker & Docker Compose
  • Node.js 18+
  • Python 3.11+ (optional, for running backend outside Docker)

1. Clone

git clone git@bitbucket.org:zlalani/enterprise-ai-hub-nexus.git
cd enterprise-ai-hub-nexus

2. Configure environment

cp backend/.env.example backend/.env
# Edit backend/.env with real API keys (see Environment Variables section)

3. Start backend services

docker-compose -f docker-compose.local.yml up -d
# Backend: http://localhost:1222
# PostgreSQL: localhost:5433
# Redis: localhost:6380
# Qdrant: http://localhost:6333

4. Apply migrations

docker exec backend alembic upgrade head

5. Start frontend

cd frontend
npm install
npm run dev
# http://localhost:3000/nexus

6. API docs

http://localhost:1222/docs

Production Deployment

Production runs on GCE VM optical-web-1. Deploy with:

# On the server
cd /opt/enterprise-ai-hub-nexus
./deploy.sh

deploy.sh performs:

  1. git pull origin main
  2. cd frontend && npm ci && npm run build
  3. rsync out/ /var/www/html/enterprise-ai-hub-nexus/
  4. docker-compose -f docker-compose.prod.yml up -d --build backend celery-worker
  5. docker exec backend alembic upgrade head

Docker Compose services (prod)

Service Image Port Description
backend ./Dockerfile 8000 FastAPI + uvicorn
celery-worker ./Dockerfile Celery worker (healthcheck: inspect ping)
celery-beat ./Dockerfile Celery scheduler
redis redis:7-alpine 6379 Broker + cache
qdrant qdrant/qdrant:v1.12.1 6333, 6334 Vector DB

Cloud Run Processor

Document processor deployed separately:

cd backend/cloud_run_processor
gcloud run deploy nexus-processor \
  --region europe-west1 \
  --timeout 900 \
  --memory 2Gi \
  --no-allow-unauthenticated

URL: https://nexus-processor-818629422283.europe-west1.run.app

The backend VM service account needs roles/run.invoker on this service.


Environment Variables

Variable Required Description
DATABASE_URL Yes postgresql+asyncpg://user:pass@host/db
REDIS_URL Yes redis://redis:6379/0
SECRET_KEY Yes JWT signing secret (32+ random bytes)
AZURE_CLIENT_ID Yes Azure AD app client ID
AZURE_TENANT_ID Yes Azure AD tenant ID
OPENAI_API_KEY Yes OpenAI key (RAG + embeddings)
ANTHROPIC_API_KEY Yes Anthropic key (Claude Sonnet + Haiku)
GOOGLE_API_KEY No Google Gemini key (summary / planning)
QDRANT_URL Yes http://qdrant:6333
CLOUD_RUN_PROCESSOR_URL No Cloud Run URL; if empty uses local processor
LLAMAPARSE_API_KEY No LlamaParse key for scanned PDF OCR
UPLOAD_DIR No File storage directory (default: /tmp/uploads)
CHUNK_SIZE No Chunk size in chars (default: 1000)
CHUNK_OVERLAP No Chunk overlap in chars (default: 200)
ACCESS_TOKEN_EXPIRE_MINUTES No JWT lifetime (default: 480 = 8 hours)

Note: Azure AD credentials (client ID, tenant ID) are also hardcoded as defaults in frontend/lib/microsoft-oauth.ts and next.config.mjs since the static frontend has no runtime env.


Database Migrations

# Apply all pending migrations
docker exec backend alembic upgrade head

# Check current revision
docker exec backend alembic current

# Generate new migration
docker exec backend alembic revision --autogenerate -m "description"

# Downgrade one step
docker exec backend alembic downgrade -1

Known gotcha: Do NOT use sa.Enum(create_type=False) with asyncpg — it is silently ignored. Let sa.Enum() inside op.create_table() handle type creation naturally. Do not add explicit CREATE TYPE SQL in the same migration.


API Overview

Method Path Description
POST /api/v1/auth/login MS access_token → app JWT
POST /api/v1/chat/stream SSE streaming chat (RAG / assistant / general)
GET /api/v1/admin/knowledge/documents List documents (max 1000)
POST /api/v1/admin/knowledge/upload Upload document (202, async processing)
POST /api/v1/admin/knowledge/scrape Scrape URL and index
POST /api/v1/admin/knowledge/stats Doc counts + Qdrant vector count
POST /api/v1/admin/knowledge/documents/{id}/reprocess Re-queue single document
POST /api/v1/admin/knowledge/documents/bulk-delete Delete multiple docs + vectors
POST /api/v1/admin/knowledge/reindex-all Re-queue all completed docs
POST /api/v1/admin/knowledge/reindex-stop Cancel pending reindex tasks
DELETE /api/v1/admin/knowledge/documents/{id} Delete doc + Qdrant vectors
GET /api/v1/admin/users List users (super_admin only)
GET /api/v1/admin/departments List departments

Full interactive docs at /docs (Swagger UI).


Troubleshooting

Backend won't start:

docker logs backend
docker-compose -f docker-compose.prod.yml restart backend

Reindex failing with AttributeError: Ensure you are on the latest commit — an early self.embeddings typo was fixed in document_processor.py.

Documents stuck in Processing: File may be too large for Cloud Run timeout (900s). Check Cloud Run logs. For very large files, consider increasing --timeout or splitting the file.

Qdrant vector count shows 0: Ensure Qdrant is running and QDRANT_URL is correct. The stats endpoint uses count() (compatible with Qdrant 1.12.x).

Azure AD login fails: Ensure the Azure AD app is configured as SPA platform (not Web). No client_secret should be needed. Check that redirect URIs include your frontend callback URL.

Frontend shows old version after deploy: Clear browser cache or do a hard refresh (Cmd+Shift+R). Apache serves static files without cache-busting headers by default.

Personal Assistant not working: Click "Connect Microsoft 365" in the sidebar and accept the consent dialog. The MS access_token is required for Graph API calls.


Documentation

Full documentation is in the docs/ directory:

  • docs/01_Enterprise_AI_Hub_Nexus_User_Guide.pdf — end-user guide
  • docs/02_Enterprise_AI_Hub_Nexus_Technical_Documentation.pdf — developer reference with architecture diagrams, RAG pipeline, Cloud Run setup, API reference

Built for OLIVER Agency — March 2026