michael 594f749d4c Initial commit: HP Marketing Materials GraphRAG Chatbot

Full-stack GraphRAG chatbot for HP marketing materials with:
- Python/Flask backend with custom ReAct agent (LlamaIndex)
- Neo4j knowledge graph + vector search hybrid retrieval
- LlamaParse multimodal document processing (text + images)
- React/Vite frontend with conversation management
- MongoDB conversation persistence
- MSAL authentication support

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-23 08:37:58 -06:00


CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

HP Marketing Materials Chatbot: a GraphRAG (Graph Retrieval-Augmented Generation) system that combines vector search with knowledge graph capabilities to answer questions about HP marketing materials and brand guidelines. It processes multimodal documents (text + images) and answers queries through a custom ReAct agent.

Development Commands

Backend

pip install -r requirements.txt
python main.py                    # Starts Hypercorn ASGI server on localhost:8746

Frontend

cd chat-interface
npm install
npm run dev       # Vite dev server
npm run build     # Production build to dist/
npm run lint      # ESLint

Required Services

Neo4j: Port 7688, credentials neo4j/hp-graphrag-2024 (HP-dedicated instance; port 7687 is a separate Netflix project)
MongoDB: URI mongodb://hp:hp@localhost:27017/?authSource=hp_chatbot, database hp_chatbot

Environment Variables

Backend requires .env at project root with: OPENAI_API_KEY, LLAMA_CLOUD_API_KEY, NEO4J_URL, NEO4J_USERNAME, NEO4J_PASSWORD, PORT (default 8746). Frontend uses chat-interface/.env with: VITE_BACKEND_URL, VITE_APP_BASE_URL.
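
As a minimal sketch of how the backend variables above could be validated at startup: the helper name `load_backend_config` and its return shape are hypothetical, and the sketch assumes the values are already in the process environment (for example, loaded from `.env` by some loader). It is not the project's actual configuration code.

```python
import os

# Variable names are taken from this file; everything else is illustrative.
REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "LLAMA_CLOUD_API_KEY",
    "NEO4J_URL",
    "NEO4J_USERNAME",
    "NEO4J_PASSWORD",
)
DEFAULT_PORT = 8746  # default port stated in this file


def load_backend_config(env=None):
    """Return backend settings as a dict, raising if a required variable is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    config = {name: env[name] for name in REQUIRED_VARS}
    config["PORT"] = int(env.get("PORT", DEFAULT_PORT))
    return config
```

Failing fast on a missing key tends to be easier to debug than a later authentication error from Neo4j or an LLM provider.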

Architecture

Request Flow

Frontend (App.jsx) sends POST to /chat with {message, sessionId}
routes.py:chat() maps session to conversation via session_manager.py and MongoDB
The global ReActAgent2 (from shared_state.py) processes the query
Agent uses two tools: vector search (answer_questions_from_hp_marketing_materials) and GraphRAG hybrid search (answer_questions_with_graphrag)
Response includes text, sources, reasoning steps, and image references
Images are served via /images/<filename> from uploads/images/
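
The first step of the flow above can be sketched from a Python client. The body fields `message` and `sessionId`, the `X-MS-USERNAME` header, and the `dev_user@local` fallback come from this file; the helper name `build_chat_request` and the JSON content type are assumptions, and the response field names are not specified here, so the sketch stops at building the request.

```python
import json
import urllib.request

BACKEND_URL = "http://localhost:8746"  # local backend address from this file


def build_chat_request(message, session_id, username="dev_user@local"):
    """Build (but do not send) a POST /chat request with the documented body."""
    body = json.dumps({"message": message, "sessionId": session_id}).encode("utf-8")
    return urllib.request.Request(
        BACKEND_URL + "/chat",
        data=body,
        method="POST",
        # Assumption: the backend expects a JSON body and reads the username header.
        headers={"Content-Type": "application/json", "X-MS-USERNAME": username},
    )
```

With the backend running, passing the result to `urllib.request.urlopen` would send it.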

Shared State Pattern (Critical)

All modules access the AI agent, vector index, and GraphRAG components through shared_state.py — a module with global variables and setter/getter functions. This avoids circular imports and ensures all modules reference the same instances. Never import these globals directly from ai_core.py; always use shared_state.

Key globals: global_workflow_agent, global_index, global_graph_store, global_graphrag_query_engine
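
A minimal sketch of the pattern, assuming accessor names such as `set_workflow_agent` and `get_workflow_agent` (hypothetical; the real function names in shared_state.py may differ). Only `global_workflow_agent` and `global_index` are names taken from this file.

```python
# Sketch of a shared-state module: one module owns the globals,
# and every other module goes through the accessor functions.
global_workflow_agent = None
global_index = None


def set_workflow_agent(agent):
    """Store the agent instance created during startup."""
    global global_workflow_agent
    global_workflow_agent = agent


def get_workflow_agent():
    """Return the shared agent, failing loudly if startup has not run yet."""
    if global_workflow_agent is None:
        raise RuntimeError("Agent not initialized")
    return global_workflow_agent
```

The accessors matter because `from module import name` copies the binding at import time; a module that imported the global directly before startup would keep seeing `None` after the agent is assigned.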

ReAct Agent (`ai_core.py`)

ReActAgent2 is a custom LlamaIndex Workflow subclass implementing a ReAct loop:

Steps: new_user_msg → prepare_chat_history → handle_llm_input → (tool calls via handle_tool_calls → loop back) → StopEvent
Has a simple_run() method monkey-patched onto the agent at initialization time (replaces the default workflow run)
Includes regex-based cleaning of LLM "thinking" artifacts from final responses
Timeouts: AGENT_TIMEOUT (600s overall), LLM_TIMEOUT (300s per call), TOOL_EXECUTION_TIMEOUT (300s per tool)
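
One way the per-call timeouts above could be enforced is with `asyncio.wait_for`. This is a sketch under assumptions: the helper `call_with_timeout` is hypothetical, and returning an error string (rather than raising) is an illustrative choice, not necessarily what ai_core.py does.

```python
import asyncio

# Values as listed in this file, in seconds.
AGENT_TIMEOUT = 600           # overall agent run
LLM_TIMEOUT = 300             # per LLM call
TOOL_EXECUTION_TIMEOUT = 300  # per tool call


async def call_with_timeout(coro, timeout, label):
    """Await coro; on timeout, return a readable message instead of raising."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout)
    except asyncio.TimeoutError:
        return f"{label} timed out after {timeout}s"
```

Returning a message lets the ReAct loop feed the failure back to the LLM as an observation; raising instead would abort the whole run.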

GraphRAG System (`graph_rag_integration.py`)

Three main classes:

GraphRAGExtractor: LlamaIndex TransformComponent that extracts entity-relation triplets from text nodes using LLM
GraphRAGStore: Wraps Neo4jPropertyGraphStore, adds community detection (tries graspologic → python-louvain → NetworkX fallback), caches community summaries to index_storage/graphrag_cache/ as pickle files
GraphRAGQueryEngine: Combines vector retrieval with community-based graph retrieval, returning both contexts for synthesis
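
The pickle caching of community summaries can be sketched as a load-or-build helper. The directory `index_storage/graphrag_cache/` comes from this file; the function name, the cache filename, and the shape of the summaries are assumptions for illustration.

```python
import pickle
from pathlib import Path

CACHE_DIR = Path("index_storage/graphrag_cache")  # directory named in this file


def load_or_build_summaries(build_fn, cache_dir=CACHE_DIR,
                            filename="community_summaries.pkl"):
    """Return cached summaries if present; otherwise build them and write the cache."""
    cache_path = Path(cache_dir) / filename  # filename is hypothetical
    if cache_path.exists():
        with cache_path.open("rb") as fh:
            return pickle.load(fh)
    summaries = build_fn()  # the expensive step (LLM calls in the real system)
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as fh:
        pickle.dump(summaries, fh)
    return summaries
```

Pickle files should only be loaded from trusted locations, since unpickling can execute arbitrary code.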

Startup Sequence (`main.py`)

MongoDB initialization (init_mongodb.py)
initialize_global_index() in ai_core.py:
- Configures LLM (chatgpt-4o-latest) and embeddings (text-embedding-3-small)
- Loads existing vector index from index_storage/hp_docs_index/ or builds new from supporting_files/files_for_rag_store/
- Connects to Neo4j, creates/loads GraphRAG components
- Builds communities (from cache or fresh)
- Creates ReActAgent2 and stores in shared_state

Frontend (React + Vite)

Single-page app in chat-interface/, main component is App.jsx
Auth via MSAL (auth.js), username sent as X-MS-USERNAME header
Dev mode uses fallback dev_user@local username
Conversation sidebar with auto-width resizing
Markdown rendering via showdown, image viewer with pagination
Styling: TailwindCSS + Shadcn/ui + Radix tooltips

JSON Serialization

json_utils.py provides CustomJSONEncoder and CustomJSONProvider that handle LlamaIndex types (ToolOutput, ReasoningSteps, ChatMessage, etc.), BSON ObjectId, and datetime. Flask is configured to use this provider globally.
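
For illustration, a custom encoder of this kind can be sketched with the standard library alone. `SketchJSONEncoder` is a hypothetical stand-in, not the project's CustomJSONEncoder; it handles only datetime explicitly and assumes that stringifying is acceptable for ID-like objects such as ObjectId.

```python
import json
from datetime import datetime


class SketchJSONEncoder(json.JSONEncoder):
    """Fallback encoder: ISO strings for datetimes, str() for other unknown objects."""

    def default(self, obj):
        # default() is only called for objects json cannot serialize natively.
        if isinstance(obj, datetime):
            return obj.isoformat()
        # Assumption: a string form is good enough for ID-like objects.
        return str(obj)
```

It can be used directly with `json.dumps(payload, cls=SketchJSONEncoder)`; wiring an encoder into Flask is omitted here.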

Document Processing Pipeline

Upload → LlamaParse (dual: text + images) → Semantic splitting (SemanticSplitterNodeParser) → Page-based image assignment to chunks → Dual indexing (vector store + Neo4j knowledge graph) → Community detection and caching
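
The page-based image assignment step can be sketched as a simple join on page number. The dictionary shapes below (`page`, `filename`, `text`, `images`) and the function name are assumptions; the actual pipeline works on LlamaIndex nodes and their metadata.

```python
from collections import defaultdict


def assign_images_to_chunks(chunks, images):
    """Attach to each chunk the filenames of images that share its page number."""
    images_by_page = defaultdict(list)
    for image in images:
        images_by_page[image["page"]].append(image["filename"])
    # Build new dicts so the input chunks are left unmodified.
    return [
        {**chunk, "images": list(images_by_page.get(chunk["page"], []))}
        for chunk in chunks
    ]
```

A chunk on a page with no extracted images simply gets an empty list.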

API Endpoints

Method	Path	Purpose
POST	`/chat`	Main chat endpoint
GET	`/status`	System status (always returns initialized=true)
GET	`/images/<filename>`	Serve document images
GET	`/list-images`	List available images
GET	`/conversations`	List user conversations
GET	`/conversations/<id>/messages`	Get conversation messages
POST	`/conversations/new`	Create new conversation
DELETE	`/conversations/<id>`	Delete conversation (soft by default)
POST	`/reset`	Reset global agent memory
POST	`/download-brief`	Generate Word doc from markdown
POST	`/capture-screenshot`	Manual LlamaParse image capture (dev only)
GET	`/debug-status`	Debug endpoint (dev only)
POST	`/reinitialize`	Force agent reinit (dev only)

Key Conventions

Use log_structured(level, message, data_dict) from utils.py for all logging — it handles safe serialization of LlamaIndex objects
Session management maps frontend sessionId to MongoDB conversation_id via session_manager.py with an in-memory cache backed by MongoDB
The agent is a single global instance — all users share the same agent, but conversation history is loaded per-request from MongoDB
Community summaries use gpt-4o-mini for cost efficiency; main agent uses chatgpt-4o-latest
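
The session mapping convention above (in-memory cache backed by a persistent store) can be sketched as follows. `SessionMap` is hypothetical, a plain dict stands in for the MongoDB collection, and the use of a random hex ID for new conversations is an assumption.

```python
import uuid


class SessionMap:
    """Map a frontend sessionId to a conversation_id, caching lookups in memory."""

    def __init__(self, store):
        self._store = store  # stand-in for the MongoDB-backed mapping (hypothetical)
        self._cache = {}

    def conversation_for(self, session_id):
        # Fast path: already resolved in this process.
        if session_id in self._cache:
            return self._cache[session_id]
        # Slow path: consult the backing store, creating a mapping if none exists.
        conversation_id = self._store.get(session_id)
        if conversation_id is None:
            conversation_id = uuid.uuid4().hex
            self._store[session_id] = conversation_id
        self._cache[session_id] = conversation_id
        return conversation_id
```

Because the store is the source of truth, a restarted process with an empty cache still resolves an existing session to the same conversation.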

Deployment

Backend: Set PRODUCTION=true env var, deploys to https://ai-sandbox.oliver.solutions/hp_chatbot_back
Frontend: npm run build, deploy dist/ to /hp_chatbot/ path at https://ai-sandbox.oliver.solutions/hp_chatbot/
CORS origins configured in config.py:CORS_ALLOWED_ORIGINS

Testing

No formal test suite. Manual testing: start backend (python main.py), start frontend (npm run dev), test chat + image responses + document citations.
