# Netflix GraphRAG Marketing Chatbot

An AI-powered knowledge assistant that answers questions about Netflix marketing materials, specifically the GPD Key Art Playbook and related design guidelines. The system combines traditional vector search (RAG) with a Neo4j knowledge graph (GraphRAG) to deliver contextual, cross-document answers with source citations and relevant document images.

## How It Works

The chatbot uses a **dual retrieval** approach:

1. **Vector Search**: Documents are parsed, semantically chunked, and embedded with OpenAI embeddings. User queries are matched against these chunks via similarity search.
2. **GraphRAG**: Entities and relationships are extracted from document chunks into a Neo4j knowledge graph. Community detection (Louvain clustering) groups related entities, and each community receives an AI-generated summary. At query time, relevant communities are retrieved alongside vector results.

A custom **ReAct agent** (built on LlamaIndex Workflows) orchestrates both retrieval tools, deciding which to call based on the query, then synthesizes a unified response.

```
User Query → ReAct Agent → [Vector Tool | GraphRAG Tool] → Response Synthesis → Answer + Sources + Images
```

### Why GraphRAG?

Standard RAG retrieves isolated text chunks.
GraphRAG adds:

- **Cross-document connections**: Links entities that appear across different documents
- **Community context**: Provides broader topical summaries, not just individual chunks
- **Semantic relationships**: Understands how concepts relate beyond keyword overlap

## Architecture

```
┌─────────────────────┐    HTTP/JSON     ┌──────────────────────────────────┐
│ React Frontend      │ ◄──────────────► │  Flask + Hypercorn (async ASGI)  │
│ (Vite, TailwindCSS) │                  │                                  │
└─────────────────────┘                  │  ┌────────────────────────────┐  │
                                         │  │ ReAct Agent (Workflow)     │  │
                                         │  │ ├─ Vector Query Tool       │  │
                                         │  │ └─ GraphRAG Query Tool     │  │
                                         │  └────────────────────────────┘  │
                                         └──────┬──────────┬────────────────┘
                                                │          │
                                  ┌─────────────┤          ├─────────────┐
                                  ▼             ▼          ▼             ▼
                             ┌──────────┐ ┌──────────┐ ┌───────┐  ┌───────────┐
                             │ MongoDB  │ │  Neo4j   │ │OpenAI │  │LlamaCloud │
                             │(sessions,│ │(knowledge│ │(LLM,  │  │(LlamaParse│
                             │ convos)  │ │  graph)  │ │ embed)│  │ doc parse)│
                             └──────────┘ └──────────┘ └───────┘  └───────────┘
```

## Prerequisites

- **Python 3.10+**
- **Node.js 18+**
- **Docker** (for MongoDB and Neo4j)
- **API Keys**: OpenAI, Anthropic (optional), LlamaCloud (for document parsing)

## Quick Start

### 1. Environment Configuration

Create a `.env` file in the project root:

```env
PRODUCTION=false
OPENAI_API_KEY=your-openai-key
ANTHROPIC_API_KEY=your-anthropic-key
LLAMA_CLOUD_API_KEY=your-llamacloud-key
```

### 2. Start Databases

```bash
# MongoDB (conversation storage)
cd db && docker-compose up -d && cd ..

# Neo4j (knowledge graph)
cd neo4j && docker-compose -f docker-compose-neo4j.yml up -d && cd ..
```

Neo4j Browser is available at http://localhost:7474.

### 3. Backend

```bash
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python main.py
```

The backend starts on **http://localhost:6175**. On first run, it will:

1. Connect to MongoDB and initialize the schema
2. Parse documents from `supporting_files/files_for_rag_store/` via LlamaParse
3. Build the vector index and persist it to `index_storage/`
4. Extract entities/relationships and populate the Neo4j knowledge graph
5. Run community detection and generate community summaries

Subsequent starts load the persisted index from `index_storage/` (much faster).

### 4. Frontend

```bash
cd chat-interface
npm install
npm run dev
```

The frontend starts on **http://localhost:5173** and connects to the backend at the URL specified in `chat-interface/.env.development`.

## Development Mode

With `PRODUCTION=false` in `.env`:

- **Authentication is bypassed**: no Microsoft login required; the system uses `dev_user@local` automatically
- **Hot reload** is enabled on the backend (Hypercorn reloader)
- The frontend works even **without a running backend** (mock responses for UI development)

The frontend has its own env files:

- `chat-interface/.env.development`: sets `VITE_BACKEND_URL` for local dev (default: `http://localhost:6175`)
- `chat-interface/.env.production`: points to the production backend

## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/chat` | Send a chat message. Body: `{ message, sessionId }`. Header: `X-MS-USERNAME` |
| `GET` | `/conversations` | List conversations for the authenticated user |
| `GET` | `/conversations//messages` | Retrieve messages for a conversation |
| `DELETE` | `/conversations/` | Delete a conversation |
| `GET` | `/images/` | Serve an extracted document image |
| `GET` | `/status` | System initialization status |

### Example Request

```bash
curl -X POST http://localhost:6175/chat \
  -H "Content-Type: application/json" \
  -H "X-MS-USERNAME: dev_user@local" \
  -d '{"message": "What are the key art guidelines?", "sessionId": "test-session-1"}'
```

## Project Structure

```
├── main.py                    # Flask app init, Hypercorn config, startup sequence
├── config.py                  # All configuration (API keys, models, paths, timeouts)
├── routes.py                  # Flask route handlers (chat, conversations, images)
├── ai_core.py                 # ReAct agent workflow, vector index, document processing
├── graph_rag_integration.py   # GraphRAG classes (extractor, store, query engine)
├── graphRAG.py                # Standalone GraphRAG (for testing)
├── shared_state.py            # Global state management for agent/index/graph components
├── session_manager.py         # Session → user/conversation mapping
├── mongodb_utils.py           # MongoDB CRUD operations
├── document_generator.py      # DOCX brief export from conversations
├── json_utils.py              # Custom JSON serializer for LlamaIndex objects
├── utils.py                   # Logging utilities
├── .env                       # Environment variables (API keys, mode)
├── requirements.txt           # Python dependencies
├── index_storage/             # Persisted vector index (auto-generated)
├── supporting_files/          # Source documents for the knowledge base
│   └── files_for_rag_store/   # Documents ingested by the RAG pipeline
├── uploads/images/            # Extracted document page images
├── db/                        # MongoDB docker-compose
├── neo4j/                     # Neo4j docker-compose and data volumes
└── chat-interface/            # React frontend
    ├── src/App.jsx            # Main chat UI component
    ├── src/auth.js            # MSAL authentication with dev bypass
    └── src/components/        # UI components (ConversationManager, ThemeToggle)
```

## Reindexing Documents

To rebuild the index after changing source documents:

```bash
rm -rf index_storage/
python main.py
```

This re-parses all documents, rebuilds the vector index, and regenerates the knowledge graph.

## Tech Stack

| Layer | Technology |
|-------|-----------|
| LLM | OpenAI GPT-4.1 (`gpt-4.1`) |
| Embeddings | OpenAI `text-embedding-3-small` |
| RAG Framework | LlamaIndex |
| Document Parsing | LlamaParse (LlamaCloud) |
| Knowledge Graph | Neo4j + NetworkX (community detection) |
| Backend | Python, Flask, Hypercorn |
| Conversation DB | MongoDB (via PyMongo) |
| Frontend | React 18, Vite, Tailwind CSS |
| Auth | Microsoft MSAL (Azure AD) |

## Troubleshooting

| Issue | Fix |
|-------|-----|
| Login screen appears in dev mode | Verify `PRODUCTION=false` in `.env` |
| MongoDB connection error | Ensure MongoDB container is running: `docker ps` |
| Neo4j connection error | Check Neo4j container and verify credentials in `config.py` match `docker-compose-neo4j.yml` |
| Frontend can't reach backend | Check `VITE_BACKEND_URL` in `chat-interface/.env.development` matches the backend port |
| CORS errors | Verify the frontend origin is listed in `CORS_ALLOWED_ORIGINS` in `config.py` |
| Slow first startup | Expected: LlamaParse document processing and graph construction take time. Subsequent starts use the cached index |
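
When working through the connection issues above, it can help to call the backend directly from Python instead of going through the frontend. The following is a minimal sketch and not part of the repository: it mirrors the `curl` example from the API Endpoints section using only the standard library. The helper names `build_chat_request` and `fetch_json` are hypothetical, and the assumption that `/chat` and `/status` reply with JSON is not confirmed by this README.

```python
import json
import urllib.request

# Backend address taken from the Quick Start section.
BACKEND_URL = "http://localhost:6175"


def build_chat_request(message, session_id, username="dev_user@local",
                       base_url=BACKEND_URL):
    """Build (but do not send) a POST /chat request like the curl example."""
    payload = json.dumps({"message": message, "sessionId": session_id})
    return urllib.request.Request(
        url=f"{base_url}/chat",
        data=payload.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-MS-USERNAME": username,
        },
    )


def fetch_json(request, timeout=30):
    """Send a request and decode the body, assuming the backend returns JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (requires a running backend, so it is left commented out):
# print(fetch_json(urllib.request.Request(f"{BACKEND_URL}/status")))
# print(fetch_json(build_chat_request("What are the key art guidelines?",
#                                     "test-session-1")))
```

If the `/status` call fails to connect, the problem is likely on the backend or Docker side rather than in `VITE_BACKEND_URL` or the CORS configuration.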