5 KiB
CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Repository Overview
GraphRAG-enhanced Netflix marketing chatbot. Combines LlamaIndex vector search with a Neo4j knowledge graph (GraphRAG) to answer queries about Netflix marketing materials (GPD Key Art Playbook). Flask/Hypercorn async backend, React frontend, MongoDB for conversation history.
Development Commands
Backend
# Setup
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt
# Run backend (Hypercorn ASGI server on port 6175)
python main.py
Frontend
cd chat-interface
npm install
npm run dev # Vite dev server (port 5173)
npm run build # Production build
npm run lint # ESLint
Databases (Docker)
# MongoDB (conversation storage)
cd db && docker-compose up -d
# Neo4j (knowledge graph)
cd neo4j && docker-compose -f docker-compose-neo4j.yml up -d
# Neo4j Browser: http://localhost:7474
Reindex Documents
rm -rf index_storage/
python main.py # Rebuilds index on startup
Test GraphRAG standalone
python graphRAG.py
Architecture
Request Flow
User → React Frontend → Flask Routes (routes.py) → ReActAgent2 (ai_core.py)
→ Tool Selection → [Vector Query Tool | GraphRAG Query Tool]
→ Response Synthesis → JSON API → Frontend
Key Backend Files
main.py: Flask app init, Hypercorn config, startup sequence (MongoDB init → AI index init)config.py: All configuration — API keys, model params, paths, timeouts. Loads from.envroutes.py: All Flask route handlers.register_routes(app)pattern. Main endpoint:POST /chatai_core.py(~79KB): Core AI logic. Contains:ReActAgent2(Workflow): Custom LlamaIndex Workflow-based ReAct agent with 4 steps:new_user_msg→prepare_chat_history→handle_llm_input→handle_tool_callsinitialize_global_index(): Startup function that builds/loads vector index and GraphRAG components- Document parsing via LlamaParse, semantic chunking, vector index creation
graph_rag_integration.py: GraphRAG integration layer.GraphRAGExtractor,GraphRAGStore,GraphRAGQueryEngineclasses. Community detection via NetworkX/Louvain. Creates hybrid vector+graph retrieval toolsgraphRAG.py: Standalone GraphRAG implementation (original, used for testing). Duplicates some classes fromgraph_rag_integration.pyshared_state.py: Module-level globals for the agent, index, and GraphRAG components. Setter/getter functions ensure cross-module consistencysession_manager.py: Maps frontend session IDs to MongoDB user/conversation records. In-memory cache + DB persistencemongodb_utils.py: All MongoDB CRUD operations. Connection:mongodb://netflix:netflix@localhost:27017, DB:netflix_chatbotjson_utils.py: Custom Flask JSON provider for serializing LlamaIndex objectsdocument_generator.py: Creates DOCX brief exports from conversations
Frontend (chat-interface/)
- React 18 + Vite + Tailwind CSS
src/App.jsx: Main chat UI component, API calls, message renderingsrc/auth.js: MSAL authentication with dev bypasssrc/components/ConversationManager.jsx: Conversation CRUD- Backend URL configured via
VITE_BACKEND_URLenv var
Shared State Pattern
Global AI state lives in shared_state.py and is imported by reference across modules. The set_global_agent(), set_global_index(), and set_graphrag_components() functions must be used to update state (not direct assignment from other modules).
Dual Retrieval System
The ReAct agent selects between two tools:
- Vector Query Tool: LlamaIndex
VectorStoreIndexwith OpenAI embeddings (text-embedding-3-small), semantic chunking, metadata-filtered retrieval - GraphRAG Query Tool: Neo4j property graph with entity extraction, community detection (Louvain clustering), and community-summarized retrieval
Development Mode
Set PRODUCTION=false in .env to:
- Bypass MSAL authentication (uses
dev_user@local) - Enable Hypercorn hot-reload
- Bind to
localhostinstead of0.0.0.0 - Frontend works without backend (mock responses)
Frontend dev mode requires chat-interface/.env.development with:
VITE_NODE_ENV=development
VITE_MODE=development
API Endpoints
POST /chat— Send message. Body:{message, sessionId}. Header:X-MS-USERNAMEGET /conversations— List user conversationsGET /conversations/<id>/messages— Get conversation messagesDELETE /conversations/<id>— Delete conversationGET /images/<filename>— Serve extracted document imagesGET /status— System initialization status
Key Dependencies
- LLM: OpenAI
gpt-4.1(configurable inconfig.py) - Embeddings: OpenAI
text-embedding-3-small - Document Parsing: LlamaParse (requires
LLAMA_CLOUD_API_KEY) - Graph DB: Neo4j (bolt://localhost:7687)
- Conversation DB: MongoDB (localhost:27017)
- ASGI Server: Hypercorn (async Flask support)