Oliver-ai-bot_2.0/technical_spec.md
SamoilenkoVadym 67fe791702 Phase 1 Complete: Environment Setup
Infrastructure Setup:
- Initialized Git repository
- Created project directory structure (backend, frontend, infrastructure)
- Configured Docker Compose with 4 services:
  * PostgreSQL 16 (database)
  * Redis 7 (cache & task queue)
  * Qdrant (vector database)
  * Backend (placeholder)

Configuration:
- Created comprehensive .env.example with all required environment variables
- Added .gitignore for Python and Node.js
- Created backend Dockerfile (placeholder for Phase 2)
- Added healthchecks for all services

Documentation:
- Created README.md with quick start guide and project overview
- Documented implementation plan (6 phases)
- Included concept and technical specifications

Verification:
- docker-compose build:  Successful
- All services started and passed health checks:
  * PostgreSQL: Accepting connections on port 5432
  * Redis: Responding to ping on port 6379
  * Qdrant: API healthy on port 6333

Next Phase: Backend Core (FastAPI, Entra ID Auth, RBAC, Alembic migrations)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-12 17:31:54 +00:00

9.5 KiB

Perfect! I've completed the architectural interview. Here's the comprehensive specification based on your answers:

🎯 Enterprise AI Hub "Nexus" - Final Architecture Specification

  1. Core Technology Stack (Confirmed) Frontend: Next.js 14 (App Router), TypeScript, Shadcn/UI, Tailwind CSS, Zustand, TanStack Query Backend: FastAPI (Python 3.11+), LangChain/LangGraph Task Queue: Celery + Redis (exponential backoff retries, 3 attempts max) Auth: Microsoft Entra ID (OAuth2/OIDC) → Backend JWT (15min) + Refresh Tokens (7d) Databases: PostgreSQL (relational data) Qdrant (single collection sharepoint_docs, metadata-filtered) Redis (sessions, cache, task queue) Embeddings: OpenAI text-embedding-3-large (3072 dim)
  2. Database Schema Design Users & RBAC

users ├─ id (uuid, PK) ├─ entra_id (string, unique) -- Azure AD object ID ├─ email (string, unique) ├─ display_name (string) ├─ role (enum: 'super_admin', 'content_manager', 'user') ├─ department_id (FK → departments.id, nullable) ├─ country_id (FK → countries.id, nullable) ├─ created_at (timestamp) └─ last_login_at (timestamp)

departments ├─ id (uuid, PK) ├─ name (string, unique) └─ created_at (timestamp)

countries ├─ id (uuid, PK) ├─ code (string, unique) -- ISO 3166-1 alpha-2 ├─ name (string) └─ created_at (timestamp) Conversations & Messages

conversations ├─ id (uuid, PK) ├─ user_id (FK → users.id) ├─ mode (enum: 'rag', 'assistant', 'notebook') ├─ title (string, nullable) -- Auto-generated or user-set ├─ created_at (timestamp) └─ updated_at (timestamp)

messages ├─ id (uuid, PK) ├─ conversation_id (FK → conversations.id, ON DELETE CASCADE) ├─ role (enum: 'user', 'assistant', 'system') ├─ content (text) ├─ input_tokens (int, nullable) ├─ output_tokens (int, nullable) ├─ llm_provider (string, nullable) -- 'openai', 'azure', 'anthropic' ├─ llm_model (string, nullable) -- 'gpt-4-turbo', 'claude-sonnet-4.5' ├─ created_at (timestamp) └─ index: (conversation_id, created_at) Feedback System

feedback ├─ id (uuid, PK) ├─ message_id (FK → messages.id) ├─ conversation_id (FK → conversations.id) ├─ user_id (FK → users.id) ├─ rating (enum: 'positive', 'negative') ├─ user_comment (text, nullable) ├─ retrieved_sources (jsonb, nullable) -- [{doc_id, title, url, score}] ├─ status (enum: 'pending', 'reviewed', 'resolved') ├─ reviewed_by (FK → users.id, nullable) ├─ reviewed_at (timestamp, nullable) └─ created_at (timestamp) SharePoint Sync State

sharepoint_sync_state ├─ id (uuid, PK) ├─ library_url (string, unique) -- Full SharePoint library URL ├─ delta_token (string, nullable) -- MS Graph delta link ├─ last_sync_at (timestamp) ├─ status (enum: 'idle', 'syncing', 'error') └─ error_message (text, nullable)

sharepoint_documents ├─ id (uuid, PK) ├─ sharepoint_id (string, unique) -- MS Graph item ID ├─ file_name (string) ├─ file_url (string) -- Deep link to SharePoint ├─ library_url (string) ├─ department_id (FK → departments.id, nullable) ├─ country_id (FK → countries.id, nullable) ├─ file_type (string) -- pdf, docx, xlsx ├─ file_size (bigint) -- bytes ├─ last_modified (timestamp) -- From SharePoint ├─ qdrant_point_ids (jsonb) -- Array of vector IDs in Qdrant ├─ ingested_at (timestamp) ├─ is_active (boolean, default true) └─ index: (sharepoint_id), (is_active) Notebook Mode (NotebookLlama)

notebook_sessions ├─ id (uuid, PK) ├─ user_id (FK → users.id) ├─ title (string, nullable) ├─ conversation_id (FK → conversations.id) ├─ is_pinned (boolean, default false) ├─ expires_at (timestamp) -- 24h default, null if pinned ├─ created_at (timestamp) └─ index: (user_id, expires_at)

uploaded_files ├─ id (uuid, PK) ├─ session_id (FK → notebook_sessions.id, ON DELETE CASCADE) ├─ file_name (string) ├─ file_size (bigint) ├─ file_type (string) ├─ storage_path (string) -- Local filesystem or S3 ├─ notebookllama_file_id (string, nullable) └─ uploaded_at (timestamp) Admin Configuration

llm_providers ├─ id (uuid, PK) ├─ mode (enum: 'rag', 'assistant', 'notebook') ├─ provider (enum: 'openai', 'azure', 'anthropic') ├─ model_name (string) -- 'gpt-4-turbo', 'claude-sonnet-4.5' ├─ api_key_env_var (string) -- Reference to env var name ├─ is_active (boolean, default true) └─ updated_at (timestamp)

system_prompts ├─ id (uuid, PK) ├─ mode (enum: 'rag', 'assistant', 'notebook') ├─ prompt_text (text) ├─ is_active (boolean, default true) ├─ created_by (FK → users.id) └─ created_at (timestamp) 3. Qdrant Vector Store Structure Collection: sharepoint_docs

Vector Dimension: 3072 (text-embedding-3-large) Distance Metric: Cosine Payload Schema:

{ "sharepoint_id": "unique_doc_id", "file_name": "Policy_2024.pdf", "file_url": "https://sharepoint.com/...", "chunk_index": 0, "total_chunks": 12, "text": "Full chunk text...", "department_id": "uuid", "country_id": "uuid", "file_type": "pdf", "last_modified": "2024-11-15T10:00:00Z", "is_active": true } Indexes: Create payload indexes on department_id, country_id, is_active, file_type 4. Integration Patterns SharePoint Ingestion Pipeline

Celery Scheduled Task (hourly) ├─> Fetch MS Graph Delta API ├─> Identify: New, Modified, Deleted files ├─> For each changed document: │ ├─> Download file content │ ├─> Extract text (PyPDF2, python-docx, etc.) │ ├─> Chunk text (LangChain RecursiveCharacterTextSplitter) │ ├─> Generate embeddings (OpenAI API) │ ├─> Upsert to Qdrant with metadata │ └─> Update sharepoint_documents table └─> Update sharepoint_sync_state with new delta_token NotebookLlama Proxy Flow

Frontend Upload Request ├─> POST /api/notebook/sessions/{id}/upload ├─> FastAPI validates file (size, type) ├─> Save to local storage /tmp/uploads/{session_id}/ ├─> Forward to NotebookLlama POST /upload ├─> Store uploaded_files record └─> Return file_id to frontend

Chat Query ├─> POST /api/notebook/sessions/{id}/chat (SSE stream) ├─> Fetch session uploaded_files from PostgreSQL ├─> Forward to NotebookLlama POST /chat with context └─> Stream response token-by-token to frontend RAG Citation Format

User Query: "What is the vacation policy?"

LLM Response: "According to company policy [1], employees receive 20 days of annual leave [2]. This applies to all full-time staff [1].

Sources: [1] HR_Policy_2024.pdf - https://sharepoint.com/sites/hr/HR_Policy_2024.pdf [2] Benefits_Guide.docx - https://sharepoint.com/sites/hr/Benefits_Guide.docx" 5. Authentication Flow

  1. User → Frontend: Click "Login with Microsoft"
  2. Frontend → Entra ID: Redirect to OAuth consent page
  3. Entra ID → Frontend: Return authorization code
  4. Frontend → Backend: POST /api/auth/login { code }
  5. Backend: a. Exchange code for Entra ID access token b. Fetch user profile (email, name, Entra object ID) c. Check if user exists in PostgreSQL
    • If new: Auto-create user with role='user'
    • If existing: Update last_login_at d. Generate Backend JWT (exp: 15min) e. Generate Refresh Token (exp: 7 days), store in Redis f. Return { access_token, refresh_token, user }
  6. Frontend: Store tokens in memory (access) + httpOnly cookie (refresh)
  7. All API calls: Send access_token in Authorization: Bearer header
  8. Token refresh: POST /api/auth/refresh { refresh_token }
  9. API Streaming (Server-Sent Events) Endpoint: GET /api/chat/stream?conversation_id=xxx

FastAPI

async def stream_chat(conversation_id: str): async def event_generator(): async for token in llm_client.stream(messages): yield f"data: {json.dumps({'token': token})}\n\n" yield f"data: {json.dumps({'done': true})}\n\n"

return StreamingResponse(
    event_generator(),
    media_type="text/event-stream"
)

// Frontend const eventSource = new EventSource(/api/chat/stream?conversation_id=${id}); eventSource.onmessage = (event) => { const data = JSON.parse(event.data); if (data.done) { eventSource.close(); } else { appendToken(data.token); } }; 7. Security Checklist No user passwords (Entra ID SSO only) Backend validates all JWT signatures Refresh tokens stored in Redis with TTL API keys in environment variables, never in database CORS configured to frontend domain only Rate limiting deferred to Phase 2 SQL injection protection via ORM (SQLAlchemy parameterized queries) File upload validation (type, size limits) Celery task retry limits to prevent runaway jobs 8. Deployment Readiness Environment Variables: .env.example template with all required keys Database Migrations: Alembic for PostgreSQL schema versioning Celery Workers: Separate processes for task execution Redis: Single instance for sessions + task queue (consider Redis Cluster for scale) Qdrant: Docker deployment or Qdrant Cloud Secrets: .env for local dev, environment variables in production Next Steps You're now ready to proceed with implementation. The architecture is:

Scalable: Celery for async tasks, Redis caching, Qdrant vector search Secure: Entra ID SSO, JWT auth, no stored credentials Observable: Per-message token tracking, feedback system, admin dashboards Maintainable: Clear separation of concerns, dynamic configuration, auto-retry logic Would you like me to start creating the database migration scripts, API scaffolding, or any other specific component?