Replace asyncio.ensure_future() with a daemon thread for GraphRAG
initialization. The Neo4j driver and NetworkX calls are synchronous
and were blocking the shared event loop, starving Hypercorn of the
loop time it needs to accept connections.
A separate thread with its own event loop isolates the blocking work
so the server accepts connections immediately after Phase 1 completes.
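
A minimal sketch of the thread-plus-private-loop pattern described above,
not the actual code in this commit. The names start_graphrag_thread and
blocking_graphrag_init are hypothetical, the plain dict stands in for
shared_state, and time.sleep stands in for the synchronous Neo4j and
NetworkX work.

```python
import asyncio
import threading
import time


def blocking_graphrag_init() -> str:
    # Hypothetical stand-in for the synchronous Neo4j / NetworkX calls.
    time.sleep(0.2)
    return "graph ready"


def start_graphrag_thread(state: dict) -> threading.Thread:
    """Run GraphRAG init on a daemon thread that owns its own event loop."""

    async def init() -> None:
        state["graphrag_initializing"] = True
        try:
            # Blocking here only stalls this thread's loop, not the server's.
            state["result"] = blocking_graphrag_init()
            state["graphrag_ready"] = True
        except Exception as exc:
            state["graphrag_error"] = str(exc)
        finally:
            state["graphrag_initializing"] = False

    def runner() -> None:
        # asyncio.run() creates a fresh event loop in this thread and
        # closes it when init() returns.
        asyncio.run(init())

    # daemon=True so a still-running init never keeps the process alive.
    thread = threading.Thread(target=runner, name="graphrag-init", daemon=True)
    thread.start()
    return thread
```

The caller returns as soon as the thread is started, so the server's own
loop is free to accept connections while the background work runs.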
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Cache extracted triples to disk (neo4j_triples.pickle) so Neo4j can be
  repopulated without expensive LLM re-extraction on cold starts
- Split initialization into two phases: fast vector-only (~1-2 min) and
  background GraphRAG, so the server serves requests while GraphRAG loads
- Add GraphRAG status flags to shared_state for monitoring readiness
- Update /status endpoint to expose graphrag_ready/initializing/error
- Restructure main.py to use a single event loop for background task support
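
The triple cache in the first bullet could look roughly like this. It is a
sketch under assumptions, not the code in this commit: the function name
load_or_extract_triples, the (subject, relation, object) tuple shape, and
the extract callback are all hypothetical.

```python
import pickle
from pathlib import Path
from typing import Callable

# Assumed shape of one triple: (subject, relation, object).
Triple = tuple[str, str, str]


def load_or_extract_triples(
    cache_path: Path,
    extract: Callable[[], list[Triple]],
) -> list[Triple]:
    """Return cached triples if present, else extract once and cache them."""
    if cache_path.exists():
        # Cold start with a warm cache: skip the LLM extraction entirely.
        with cache_path.open("rb") as fh:
            return pickle.load(fh)

    triples = extract()  # the expensive LLM-backed step

    # Write to a temp file first, then rename, so a crash mid-write
    # never leaves a truncated cache behind.
    tmp_path = cache_path.with_suffix(cache_path.suffix + ".tmp")
    with tmp_path.open("wb") as fh:
        pickle.dump(triples, fh)
    tmp_path.replace(cache_path)
    return triples
```

Pickle files should only be loaded from a trusted location, since
unpickling can execute arbitrary code.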
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>