CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
Contract Analysis Tool v2.0 is a production-ready Retrieval-Augmented Generation (RAG) application for contract analysis and document Q&A. It consists of a FastAPI backend and a React frontend.
Architecture
Stack:
- Backend: FastAPI + MongoDB + Redis + ChromaDB
- Frontend: React + Vite + Tailwind CSS
- AI/ML: OpenAI GPT-4, LlamaIndex, ChromaDB for vector storage
- Authentication: JWT-based with role-based access control
Data Flow:
React Frontend → FastAPI Backend → MongoDB + ChromaDB → OpenAI API
↓
Redis Cache
Development Commands
Backend (FastAPI)
Start development server:
cd backend
source venv/bin/activate # On Windows: venv\Scripts\activate
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
Install dependencies:
cd backend
pip install -r requirements.txt
Database setup:
- MongoDB runs on port 27017
- Redis runs on port 6379
- Application auto-creates collections/indexes on startup
Initialize default users:
curl -X POST http://localhost:8000/api/v1/auth/init-users
Health check:
curl http://localhost:8000/health
Frontend (React)
Start development server:
cd frontend
npm run dev
Build for production:
cd frontend
npm run build
Lint code:
cd frontend
npm run lint
Install dependencies:
cd frontend
npm install
Docker Development
Start all services:
cd backend
docker-compose up -d
Databases only (MongoDB and Redis, with the backend run outside Docker):
cd backend
docker-compose up -d mongo redis
Project Structure
Backend (/backend)
- app/main.py - FastAPI application entry point
- app/config/settings.py - Environment configuration and database settings
- app/api/v1/ - API endpoints (auth, documents, indices, chat, admin)
- app/models/ - MongoDB data models (user, document, index, chat)
- app/services/ - Business logic (document_processor, rag_service)
- app/core/ - Core utilities (auth, security, cache)
- app/utils/ - Helper utilities (file_utils)
Frontend (/frontend)
- src/App.jsx - Main React application with routing
- src/pages/ - Page components (Dashboard, DocumentManager, ChatInterface, AdminPanel)
- src/components/ - Reusable UI components organized by feature
- src/services/ - API service layer (authService, documentService, chatService, indexService)
- src/context/ - React context providers (AuthContext)
- src/utils/ - Frontend utilities and constants
Key Features & Workflows
Authentication System
- JWT-based authentication with role-based access (admin/user)
- Default users: admin@oliver.agency/admin123, user@oliver.agency/user123
- Protected routes with automatic token refresh
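To show what HS256 signing with a role claim involves, here is a minimal standard-library sketch. It is illustrative only: the backend most likely relies on a JWT library, and the function names (`create_token`, `verify_token`), the `sub` and `role` claim names, and the secret value are assumptions rather than code from this repository.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = "your-super-secret-jwt-key"  # placeholder, mirrors JWT_SECRET_KEY
EXPIRE_MINUTES = 30                   # mirrors JWT_EXPIRE_MINUTES


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def create_token(email, role, secret=SECRET, now=None):
    """Build a signed token carrying the user's email and role (hypothetical claims)."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": email, "role": role, "exp": now + EXPIRE_MINUTES * 60}
    parts = [
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    ]
    signing_input = ".".join(parts)
    return f"{signing_input}.{_sign(signing_input, secret)}"


def verify_token(token, secret=SECRET, now=None):
    """Return the payload if the signature matches and the token has not expired."""
    try:
        header_b64, payload_b64, signature = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    now = int(time.time()) if now is None else now
    if payload["exp"] <= now:
        raise ValueError("token expired")
    return payload
```

A protected route would call something like `verify_token` and then compare `payload["role"]` against the role the endpoint requires.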
Document Processing Pipeline
- Upload → Document uploaded via React frontend
- Process → Backend processes with LlamaIndex (PDF parsing, chunking)
- Index → Embeddings stored in ChromaDB, metadata in MongoDB
- Query → Natural language queries via RAG system
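The chunking step in the pipeline above can be pictured with a small sketch. In the application LlamaIndex handles parsing and chunking; the word-based splitter below, including the `chunk_words` name and its default sizes, is a hypothetical stand-in that only shows the idea of overlapping chunks.

```python
def chunk_words(text, chunk_size=200, overlap=20):
    """Split text into chunks of `chunk_size` words; neighbours share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once this chunk has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk.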
Index Management
- Users can create document indices for organizing documents
- Role-based access control for index management
- ChromaDB handles vector storage, MongoDB stores metadata
Chat System
- Context-Aware Conversations: AI remembers previous 10 messages within 24-hour window
- Real-time document Q&A using RAG with source citations
- Proper message ordering - chronological display with correct timestamps
- Conversation continuity - responses reference previous context when relevant
- Configurable top-k results for query precision (3, 5, 10, 15)
- Smart caching - context-dependent responses aren't cached, simple queries are
- Session statistics - track response times, cache hit rates, message counts
Environment Configuration
Backend (.env)
# Database
MONGODB_URL=mongodb://localhost:27017
DATABASE_NAME=contract_analysis
# Redis
REDIS_URL=redis://localhost:6379
# Authentication
JWT_SECRET_KEY=your-super-secret-jwt-key
JWT_ALGORITHM=HS256
JWT_EXPIRE_MINUTES=30
# OpenAI
OPENAI_API_KEY=your-openai-api-key
LLAMAPARSE_API_KEY=your-llamaparse-api-key
# Application
DEBUG=false
CORS_ORIGINS=["http://localhost:3000"]
UPLOAD_DIR=./uploads
INDICES_DIR=./indices
# Cache
CACHE_ENABLED=true
CACHE_TTL=3600
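As a rough illustration of how these variables might be read, the sketch below parses a subset of them with the standard library. The real loader lives in app/config/settings.py and may look quite different; the `Settings` dataclass and `load_settings` function here are assumptions. Note that CORS_ORIGINS is written as a JSON list, so it needs JSON parsing rather than a plain string read.

```python
import json
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    mongodb_url: str
    database_name: str
    redis_url: str
    jwt_expire_minutes: int
    debug: bool
    cors_origins: list
    cache_enabled: bool
    cache_ttl: int


def _as_bool(value: str) -> bool:
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (defaults to os.environ), using the defaults shown above."""
    env = os.environ if env is None else env
    return Settings(
        mongodb_url=env.get("MONGODB_URL", "mongodb://localhost:27017"),
        database_name=env.get("DATABASE_NAME", "contract_analysis"),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379"),
        jwt_expire_minutes=int(env.get("JWT_EXPIRE_MINUTES", "30")),
        debug=_as_bool(env.get("DEBUG", "false")),
        cors_origins=json.loads(env.get("CORS_ORIGINS", '["http://localhost:3000"]')),
        cache_enabled=_as_bool(env.get("CACHE_ENABLED", "true")),
        cache_ttl=int(env.get("CACHE_TTL", "3600")),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise without touching the process environment.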
Frontend (.env)
VITE_API_URL=http://localhost:8000
VITE_APP_NAME=Contract Analysis Tool
API Endpoints
Authentication:
- POST /api/v1/auth/login - User login
- POST /api/v1/auth/init-users - Initialize default users
Documents:
- POST /api/v1/documents/upload - Upload documents to index
- GET /api/v1/documents/{index_id} - List documents in index
Indices:
- POST /api/v1/indices/create - Create new document index
- GET /api/v1/indices/ - List user's indices
Chat:
- POST /api/v1/chat/query - Query documents with natural language
Admin:
- GET /api/v1/admin/stats - System statistics (admin only)
- POST /api/v1/admin/documents/upload-single - Upload single document
- POST /api/v1/admin/documents/upload-multiple - Upload multiple documents
- GET /api/v1/admin/documents/{index_id} - Get index documents
- POST /api/v1/admin/documents/{document_id}/reprocess - Reprocess document
- DELETE /api/v1/admin/documents/{document_id} - Delete document
- GET /api/v1/admin/indices - Get all indices
- POST /api/v1/admin/indices/create - Create new index
- POST /api/v1/admin/chat/query - RAG query interface
Development Notes
Database Connections
- MongoDB connection pooling handled automatically
- Redis connection with fallback if unavailable
- ChromaDB indices stored in ./indices directory
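One possible reading of "fallback if unavailable" is that a cache failure is treated as a cache miss, so requests still succeed without Redis. The wrapper below sketches that idea; `SafeCache` and the `get`/`set` backend interface are hypothetical, and real code would catch the Redis client's own connection errors rather than every `Exception`.

```python
class SafeCache:
    """Wrap a cache backend so that backend failures degrade to cache misses."""

    def __init__(self, backend):
        # `backend` is any object with get(key) and set(key, value) methods.
        self._backend = backend

    def get(self, key):
        try:
            return self._backend.get(key)
        except Exception:
            # Cache unavailable: behave as if the key was never cached.
            return None

    def set(self, key, value) -> bool:
        try:
            self._backend.set(key, value)
            return True
        except Exception:
            # Cache unavailable: skip caching, let the request continue.
            return False
```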
File Handling
- Uploads stored in ./uploads/{index_id}/ directory structure
- Supported formats: PDF, DOCX, DOC, TXT, CSV, JSON, HTML, MD, RTF
- 50MB file size limit (configurable)
- Automatic file naming for batch uploads
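The rules above could be enforced with checks along these lines. This is a sketch, not the project's file_utils code: `validate_upload` and `upload_path` are hypothetical names, and the limit is assumed to mean 50 * 1024 * 1024 bytes.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {
    ".pdf", ".docx", ".doc", ".txt", ".csv", ".json", ".html", ".md", ".rtf",
}
MAX_FILE_SIZE = 50 * 1024 * 1024  # assumed interpretation of the 50MB limit


def validate_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file type or size is not allowed."""
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_FILE_SIZE:
        raise ValueError("file exceeds the 50MB limit")


def upload_path(upload_dir: str, index_id: str, filename: str) -> Path:
    """Build ./uploads/{index_id}/{name}, keeping only the final path component."""
    # Dropping directory parts stops names like "../../x.pdf" escaping the index folder.
    safe_name = Path(filename.replace("\\", "/")).name
    if safe_name in {"", ".", ".."}:
        raise ValueError("invalid file name")
    return Path(upload_dir) / index_id / safe_name
```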
Caching Strategy
- Redis caches API responses for performance
- TTL configurable via CACHE_TTL environment variable
- Cache keys include user context for security
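Including user context in the key means one user's cached answer is never served to another. A minimal sketch of such a key follows; the `chat:` prefix, the `make_cache_key` name, and the choice of fields are assumptions, not the project's actual key format.

```python
import hashlib
import json


def make_cache_key(user_id: str, index_id: str, query: str, top_k: int) -> str:
    """Derive a per-user cache key from the index, normalized query, and top-k."""
    # Collapse case and whitespace so trivially different queries share an entry.
    normalized = " ".join(query.lower().split())
    payload = json.dumps([index_id, normalized, top_k])
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"chat:{user_id}:{digest}"
```

Hashing keeps keys a fixed length regardless of how long the query is, and the readable `user_id` segment makes it possible to invalidate one user's entries by prefix.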
Document Processing
- Async processing with database status tracking
- Processing states: pending → processing → completed/failed
- Embedding states: pending → processing → completed/failed
- Automatic retry capability for failed documents
- Chunk count and vector ID tracking in MongoDB
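The state progression above can be expressed as a small transition table. The sketch below is an assumption about how it could be modelled: the `Status` enum and `advance` helper are hypothetical, and the failed → pending edge is a guess at how a retry re-queues a document.

```python
from enum import Enum


class Status(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed moves: pending → processing → completed/failed.
ALLOWED_TRANSITIONS = {
    Status.PENDING: {Status.PROCESSING},
    Status.PROCESSING: {Status.COMPLETED, Status.FAILED},
    Status.FAILED: {Status.PENDING},  # assumed: a retry puts the document back in the queue
    Status.COMPLETED: set(),
}


def advance(current: Status, new: Status) -> Status:
    """Return the new status, or raise if the move is not allowed."""
    if new not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new
```

The same table would apply to both the processing status and the embedding status, since they share the same set of states.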
Vector Storage
- ChromaDB persistent storage in ./indices/chroma_db/
- Collections named index_{index_id} for organization
- Metadata includes document_id, chunk_index, index_id
- Configurable similarity search with top-k results
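To make the naming scheme, metadata shape, and top-k idea concrete, here is a brute-force sketch in plain Python. ChromaDB performs the actual search with its own index and distance metric; the helper names and the use of cosine similarity here are illustrative assumptions.

```python
import math


def collection_name(index_id: str) -> str:
    return f"index_{index_id}"


def chunk_metadata(document_id: str, chunk_index: int, index_id: str) -> dict:
    return {"document_id": document_id, "chunk_index": chunk_index, "index_id": index_id}


def cosine_similarity(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_embedding, chunks, k=5):
    """Return the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_embedding, chunk["embedding"]),
        reverse=True,
    )
    return ranked[:k]
```

A smaller k gives the model fewer but more relevant chunks; a larger k trades precision for coverage.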
Chat Context System
- Context Window: 24-hour rolling window with max 10 previous messages
- Smart Context: AI uses conversation history for continuity and follow-up questions
- Context Caching: Responses with context aren't cached (dynamic), simple queries are cached
- Database Storage: All messages stored with proper timestamps and context metadata
- Context Display: Frontend shows when context is used and how many previous messages
- Session Management: Track conversation statistics and context usage
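The context rules above can be sketched as a filter over stored messages. This is an assumption-laden illustration: `select_context` and `should_cache_response` are hypothetical names, messages are modelled as dicts with a `created_at` field, and "10 previous messages" is read as ten individual messages rather than ten question/answer pairs.

```python
from datetime import datetime, timedelta, timezone

MAX_CONTEXT_MESSAGES = 10
CONTEXT_WINDOW = timedelta(hours=24)


def select_context(messages, now=None):
    """Return up to 10 messages from the last 24 hours, oldest first."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - CONTEXT_WINDOW
    recent = sorted(
        (m for m in messages if m["created_at"] >= cutoff),
        key=lambda m: m["created_at"],
    )
    # Keep only the most recent entries when more than the limit qualify.
    return recent[-MAX_CONTEXT_MESSAGES:]


def should_cache_response(context) -> bool:
    """Responses that used conversation context are dynamic, so they are not cached."""
    return not context
```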
Message Ordering & Timestamps
- Chronological Order: Messages displayed in proper time sequence (oldest → newest)
- Accurate Timestamps: Server-side timestamp generation with UTC storage
- Separate Timestamps: User and assistant messages have distinct timestamps
- Proper Database Storage: created_at, user_timestamp, and assistant_timestamp fields
- Frontend Display: Localized timestamp formatting with date and time
- Context Indicators: Visual indicators show when AI used previous conversation context
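A stored chat record with the three timestamp fields might look like the sketch below. Only `created_at`, `user_timestamp`, and `assistant_timestamp` come from this document; the `question` and `answer` fields and the helper names are hypothetical.

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Timezone-aware UTC timestamp, generated server-side."""
    return datetime.now(timezone.utc)


def new_chat_record(question: str, user_timestamp=None) -> dict:
    ts = user_timestamp or utc_now()
    return {
        "question": question,
        "answer": None,
        "created_at": ts,
        "user_timestamp": ts,
        "assistant_timestamp": None,
    }


def attach_answer(record: dict, answer: str, assistant_timestamp=None) -> dict:
    """Return a copy of the record with the answer and its own timestamp."""
    return {
        **record,
        "answer": answer,
        "assistant_timestamp": assistant_timestamp or utc_now(),
    }


def chronological(records):
    """Order records oldest → newest for display."""
    return sorted(records, key=lambda r: r["created_at"])
```

Storing aware UTC datetimes leaves localization entirely to the frontend.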
Error Handling & Validation
- Collection Validation: Check ChromaDB collection exists before querying
- Document Status Check: Verify documents are fully processed before chat
- Graceful Degradation: Fallback responses when context generation fails
- User-Friendly Errors: Clear, actionable error messages with next steps
- Progress Tracking: Real-time status updates during document processing
Progress Visualization
- Upload Progress: Real-time progress bars during file uploads
- Processing Status: Visual indicators for document processing stages
- Embedding Progress: Separate progress tracking for text processing and embedding
- Success States: Clear visual feedback when operations complete
- Status Dashboard: Comprehensive view of document processing pipeline
Security Features
- JWT token validation on protected routes
- Input validation with Pydantic schemas
- CORS configuration for frontend integration
- File upload validation and sanitization
Testing
Backend Testing
cd backend
# API documentation available at http://localhost:8000/docs
# Manual testing via Swagger UI
Frontend Testing
- React components use modern hooks patterns
- Error boundaries for graceful error handling
- Loading states for better UX
Migration Context
This application was migrated from PHP/Python to FastAPI/React. The migration maintained:
- Complete feature parity with the original application
- All document processing capabilities
- ChromaDB indices compatibility
- Enhanced performance and security
- Modern, responsive UI
The MIGRATION_PLAN.md file contains detailed information about the migration process and architecture decisions.