HP Marketing Materials Chatbot
A GraphRAG (Graph Retrieval-Augmented Generation) chatbot that answers questions about HP marketing materials and brand guidelines. Combines vector search with a Neo4j knowledge graph for more comprehensive retrieval, and processes multimodal documents (text + images) using LlamaParse.
Features
- Hybrid retrieval: Vector search + knowledge graph community detection for richer context
- Multimodal document processing: Extracts text and page images from PDFs via LlamaParse
- Custom ReAct agent: LlamaIndex-based workflow with tool use, reasoning steps, and source citations
- Conversation persistence: MongoDB-backed chat history with multi-conversation support
- Image references: Responses include relevant document page screenshots
- Brief export: Download conversation summaries as Word documents
Prerequisites
- Python 3.10+
- Node.js 18+
- Neo4j (dedicated instance on port 7688)
- MongoDB (with authentication configured)
- API Keys: OpenAI (OPENAI_API_KEY), LlamaCloud (LLAMA_CLOUD_API_KEY)
Quick Start
1. Backend Setup
# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
# Install dependencies
pip install -r requirements.txt
# Create .env file at project root
cat > .env << 'EOF'
OPENAI_API_KEY=your_openai_key
LLAMA_CLOUD_API_KEY=your_llama_cloud_key
NEO4J_URL=bolt://localhost:7688
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=hp-graphrag-2024
PORT=8746
PRODUCTION=false
LOG_LEVEL=INFO
EOF
# Start the server
python main.py
The backend runs on http://localhost:8746. On first startup it will:
- Initialize MongoDB collections and indexes
- Load or build the vector index from supporting_files/files_for_rag_store/
- Connect to Neo4j and build/load the knowledge graph
- Build community summaries (cached to index_storage/graphrag_cache/)
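Because first startup can take a while, a client may want to wait until the backend answers before sending chat requests. The following is a minimal sketch, not part of the repository: `backend_is_up` and `wait_until` are hypothetical helpers, and the only assumption about GET /status is that it returns HTTP 200 once the server is reachable (its response body is not inspected, since its shape is not documented here).

```python
import time
import urllib.error
import urllib.request


def backend_is_up(base_url, session_id="demo-session"):
    """Return True if GET /status answers with HTTP 200 (hypothetical helper)."""
    url = f"{base_url.rstrip('/')}/status?sessionId={session_id}"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx answer: treat as "not up yet".
        return False


def wait_until(probe, timeout_s=600.0, interval_s=5.0):
    """Call probe() repeatedly until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example usage (assumes the backend from step 1 is starting on port 8746):
# ready = wait_until(lambda: backend_is_up("http://localhost:8746"))
```

The probe is passed in as a callable so the waiting logic can be reused for other checks, such as a database port.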
2. Frontend Setup
cd chat-interface
# Install dependencies
npm install
# Create .env file
cat > .env << 'EOF'
VITE_BACKEND_URL=http://localhost:8746
VITE_APP_BASE_URL=/
EOF
# Start dev server
npm run dev
The frontend runs on http://localhost:5173.
3. Database Setup
Neo4j:
- Run a Neo4j instance on port 7688 (port 7687 is reserved for a separate project)
- Credentials: neo4j/hp-graphrag-2024
- The application auto-populates the graph on first index build
MongoDB:
- Create a user hp with password hp and authSource=hp_chatbot
- Database: hp_chatbot
- Collections (users, conversations, messages) are auto-created by init_mongodb.py on startup
Example MongoDB user setup:
use hp_chatbot
db.createUser({
user: "hp",
pwd: "hp",
roles: [{ role: "readWrite", db: "hp_chatbot" }]
})
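For reference, the credentials above translate into a connection string along these lines. This is a sketch only: `build_mongo_uri` is a hypothetical helper, and the host `localhost` and port `27017` (MongoDB's default) are assumptions, since this README does not state where MongoDB runs or how mongodb_utils.py builds its connection.

```python
from urllib.parse import quote_plus


def build_mongo_uri(user, password, host="localhost", port=27017,
                    database="hp_chatbot", auth_source="hp_chatbot"):
    """Build a MongoDB connection URI with percent-encoded credentials.

    host and port are assumed defaults, not values taken from the project.
    """
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}?authSource={quote_plus(auth_source)}"
    )


# Example with the development credentials from this section:
uri = build_mongo_uri("hp", "hp")
```

Percent-encoding matters once the password contains characters such as `@` or `:`, which would otherwise break URI parsing.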
Project Structure
├── main.py                      # Entry point, Hypercorn ASGI server
├── config.py                    # Centralized configuration
├── ai_core.py                   # ReAct agent, document processing, index init
├── graph_rag_integration.py     # GraphRAG: extraction, community detection, query engine
├── routes.py                    # Flask API endpoints
├── shared_state.py              # Global state for agent/index/graph (cross-module)
├── session_manager.py           # Session-to-conversation mapping
├── mongodb_utils.py             # MongoDB CRUD operations
├── json_utils.py                # Custom JSON serialization for LlamaIndex types
├── document_generator.py        # Markdown-to-Word document conversion
├── utils.py                     # Logging and file utilities
├── init_mongodb.py              # Database initialization script
├── requirements.txt             # Python dependencies
├── .env                         # Environment variables (not committed)
├── supporting_files/
│   └── files_for_rag_store/     # HP marketing documents for indexing
├── uploads/
│   └── images/                  # Extracted document page images
├── index_storage/
│   ├── hp_docs_index/           # Persisted vector index
│   └── graphrag_cache/          # Cached community summaries (pickle)
└── chat-interface/              # React frontend
    ├── src/
    │   ├── App.jsx              # Main chat interface component
    │   ├── auth.js              # MSAL authentication
    │   ├── components/
    │   │   ├── ChatInterface.jsx
    │   │   ├── ConversationManager.jsx
    │   │   └── ThemeToggle.jsx
    │   └── lib/utils.js
    ├── package.json
    └── dist/                    # Production build output
Architecture Overview
┌─────────────┐     POST /chat      ┌──────────────┐
│ React UI    │ ──────────────────► │ Flask/       │
│ (App.jsx)   │ ◄────────────────── │ Hypercorn    │
│             │   JSON response     │ (routes.py)  │
└─────────────┘                     └──────┬───────┘
                                           │
                                    ┌──────▼───────┐
                                    │ Session Mgr  │──── MongoDB
                                    │              │     (conversations,
                                    └──────┬───────┘      messages, users)
                                           │
                                    ┌──────▼───────┐
                                    │ ReActAgent2  │
                                    │ (ai_core.py) │
                                    └──────┬───────┘
                                           │
                              ┌────────────┼────────────┐
                              │                         │
                       ┌──────▼──────┐           ┌──────▼───────┐
                       │ Vector      │           │ GraphRAG     │
                       │ Query Tool  │           │ Query Tool   │
                       │             │           │              │
                       │ LlamaIndex  │           │ Vector +     │
                       │ VectorStore │           │ Community    │
                       │ Index       │           │ Retrieval    │
                       └─────────────┘           └──────┬───────┘
                                                        │
                                                 ┌──────▼───────┐
                                                 │ Neo4j        │
                                                 │ Knowledge    │
                                                 │ Graph        │
                                                 └──────────────┘
API Reference
| Method | Endpoint | Description |
|---|---|---|
| POST | /chat | Send a chat message. Body: {message, sessionId} |
| GET | /status?sessionId= | Check system initialization status |
| GET | /conversations | List the user's conversations (requires X-MS-USERNAME header) |
| POST | /conversations/new | Create a new conversation |
| GET | /conversations/:id/messages | Get messages for a conversation |
| DELETE | /conversations/:id | Soft-delete a conversation |
| POST | /reset | Reset global agent memory. Body: {sessionId} |
| GET | /images/:filename | Serve a document page image |
| GET | /list-images | List all available images |
| POST | /download-brief | Generate Word doc. Body: {brief_content, sessionId} |
Authentication: The frontend sends the MSAL username via X-MS-USERNAME header. In development mode (PRODUCTION=false), a default dev_user@local is used.
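As an illustration of the request shape, the sketch below builds a POST /chat request with the documented body fields and the X-MS-USERNAME header. `build_chat_request` is a hypothetical helper written for this README, not code from the repository, and the base URL assumes the local development backend. The response format is not documented here, so the sketch stops at constructing the request.

```python
import json
import urllib.request


def build_chat_request(base_url, message, session_id, username=None):
    """Build (but do not send) a POST /chat request (hypothetical helper)."""
    body = json.dumps({"message": message, "sessionId": session_id}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if username:
        # In production the frontend supplies the MSAL username here.
        headers["X-MS-USERNAME"] = username
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/chat", data=body, headers=headers, method="POST"
    )


req = build_chat_request(
    "http://localhost:8746", "Which logo colors are allowed?", "session-1",
    username="someone@example.com",
)
# To send it against a running backend:
# with urllib.request.urlopen(req, timeout=600) as resp:
#     print(resp.read().decode("utf-8"))
```

The long client timeout in the commented call mirrors the 600s agent timeout listed under Configuration; a shorter value may cut off slow answers.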
Configuration
All configuration is centralized in config.py. Key settings:
| Setting | Default | Description |
|---|---|---|
| LLM_MODEL | chatgpt-4o-latest | Main LLM for the ReAct agent |
| EMBEDDING_MODEL | text-embedding-3-small | Embedding model for vector index |
| LLM_TEMPERATURE | 0.3 | LLM temperature |
| SIMILARITY_TOP_K | 10 | Number of vector results to retrieve |
| AGENT_TIMEOUT | 600s | Overall agent workflow timeout |
| LLM_TIMEOUT | 300s | Per-LLM-call timeout |
| SERVER_PORT | 8746 | Backend server port |
Community summaries use gpt-4o-mini for cost efficiency (configured in graph_rag_integration.py).
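One way such a centralized settings module could be organized is sketched below. This is an assumption about structure, not the contents of config.py: the `Settings` class and `load_settings` function are hypothetical, the defaults are copied from the table above, and only PORT and PRODUCTION (both present in the .env example) are read from the environment.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Defaults mirror the configuration table; the class itself is illustrative."""
    llm_model: str = "chatgpt-4o-latest"
    embedding_model: str = "text-embedding-3-small"
    llm_temperature: float = 0.3
    similarity_top_k: int = 10
    agent_timeout: float = 600.0   # seconds
    llm_timeout: float = 300.0     # seconds
    server_port: int = 8746
    production: bool = False


def load_settings(env=None):
    """Apply PORT and PRODUCTION overrides from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    return Settings(
        server_port=int(env.get("PORT", "8746")),
        production=env.get("PRODUCTION", "false").strip().lower() == "true",
    )


defaults = load_settings({})
overridden = load_settings({"PORT": "9000", "PRODUCTION": "true"})
```

Passing the environment as a mapping keeps the loader easy to exercise without touching the real process environment.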
Adding Documents
Place HP marketing documents (PDF, DOCX, PPTX, TXT) in supporting_files/files_for_rag_store/. On the next startup with no existing index, the system will:
- Parse documents with LlamaParse (text + image extraction)
- Split into semantic chunks
- Build a vector index (persisted to index_storage/hp_docs_index/)
- Extract knowledge graph triplets and store in Neo4j
- Run community detection and cache summaries
To force a full reindex, delete index_storage/hp_docs_index/ and clear the Neo4j database before restarting.
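Before restarting, it can help to confirm which files would be picked up and whether an index already exists. The sketch below is illustrative only: `find_documents` and `needs_index_build` are hypothetical helpers, recursive directory scanning is an assumption, and treating a missing or empty index directory as "no existing index" is a guess at how the check might work rather than the logic in ai_core.py.

```python
from pathlib import Path

# File types listed in this section.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".pptx", ".txt"}


def find_documents(root):
    """Return supported documents under root, sorted by path (recursion is assumed)."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def needs_index_build(index_dir):
    """Return True when index_dir is missing or empty (assumed meaning of 'no index')."""
    index_dir = Path(index_dir)
    return not index_dir.is_dir() or not any(index_dir.iterdir())


# Example usage from the project root:
# print(find_documents("supporting_files/files_for_rag_store"))
# print(needs_index_build("index_storage/hp_docs_index"))
```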
Deployment
Backend
- Set the PRODUCTION=true environment variable
- Server binds to 0.0.0.0 in production mode
- Configure CORS_ALLOWED_ORIGINS in config.py
- Production URL: https://ai-sandbox.oliver.solutions/hp_chatbot_back
Frontend
cd chat-interface
npm run build
- Deploy dist/ contents to the /hp_chatbot/ path
- Ensure proper MIME types for .js files on the web server
- Configure SPA routing (see web.config or .htaccess)
- Production URL: https://ai-sandbox.oliver.solutions/hp_chatbot/
Troubleshooting
| Issue | Solution |
|---|---|
| Backend won't start | Check that Neo4j and MongoDB are running. Verify OPENAI_API_KEY is set in .env |
| "Agent unavailable" errors | Check startup logs for LLM API test failure. The /reinitialize endpoint (dev only) can force re-init |
| No images in responses | Verify LLAMA_CLOUD_API_KEY is set. Check that uploads/images/ contains extracted images |
| CORS errors | Add the frontend origin to CORS_ALLOWED_ORIGINS in config.py |
| Slow first startup | Initial document processing and graph building can take significant time depending on document volume |
| Neo4j connection refused | Ensure Neo4j is on port 7688 (not 7687, which is a different project) |
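For the connection-related rows above, a quick TCP check can tell whether anything is listening on the expected ports before digging into logs. This is a small sketch with hypothetical helpers (`port_is_open`, `check_services`); it assumes both databases run on localhost and that MongoDB uses its default port 27017, which this README does not state. An open port only shows that something is listening, not that credentials are correct.

```python
import socket


def port_is_open(host, port, timeout_s=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def check_services(targets):
    """Map each service name to whether its (host, port) accepts connections."""
    return {name: port_is_open(host, port) for name, (host, port) in targets.items()}


# Example usage (hosts and the MongoDB port are assumptions):
# print(check_services({
#     "Neo4j": ("localhost", 7688),
#     "MongoDB": ("localhost", 27017),
#     "Backend": ("localhost", 8746),
# }))
```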