# Migration Plan: PHP/Python → FastAPI/React

## Overview

This document outlines the complete migration strategy for transforming the current PHP/Python hybrid RAG application into a modern FastAPI backend with React frontend architecture.

## Current Application Analysis

### Existing Features

- **User Authentication**: Role-based access (admin/user) with SQLite storage
- **Document Management**: File uploads, processing, and indexing
- **RAG System**: LlamaIndex + ChromaDB for document retrieval
- **Contract Analysis**: GPT-4 powered contract field extraction
- **Chat Interface**: Natural language document Q&A
- **Caching System**: Response caching for performance
- **Index Management**: User-specific access control to document indices

### Current Architecture

```
PHP Frontend (Web UI) → Python Backend (Processing) → OpenAI API
         ↓                        ↓
     SQLite DB             ChromaDB Vectors
```

## New Architecture

### Target Architecture

```
React Frontend → FastAPI Backend → MongoDB + ChromaDB → OpenAI API
                       ↓
                  Redis Cache
```

## Project Structure

### Backend (FastAPI)

```
backend/
├── app/
│   ├── __init__.py
│   ├── main.py                    # FastAPI app entry point
│   ├── config/
│   │   ├── __init__.py
│   │   ├── settings.py            # Environment configuration
│   │   └── database.py            # MongoDB connection
│   ├── models/
│   │   ├── __init__.py
│   │   ├── user.py                # User data models
│   │   ├── document.py            # Document models
│   │   ├── index.py               # Index models
│   │   └── chat.py                # Chat/query models
│   ├── schemas/
│   │   ├── __init__.py
│   │   ├── user.py                # Pydantic schemas
│   │   ├── document.py
│   │   ├── index.py
│   │   └── chat.py
│   ├── api/
│   │   ├── __init__.py
│   │   ├── deps.py                # Dependencies
│   │   └── v1/
│   │       ├── __init__.py
│   │       ├── auth.py            # Authentication endpoints
│   │       ├── documents.py       # Document management
│   │       ├── indices.py         # Index management
│   │       ├── chat.py            # Chat/query endpoints
│   │       └── admin.py           # Admin endpoints
│   ├── core/
│   │   ├── __init__.py
│   │   ├── auth.py                # JWT authentication
│   │   ├── security.py            # Security utilities
│   │   └── cache.py               # Redis caching
│   ├── services/
│   │   ├── __init__.py
│   │   ├── document_processor.py  # Document processing service
│   │   ├── rag_service.py         # RAG retrieval service
│   │   ├── index_service.py       # Index management service
│   │   └── openai_service.py      # OpenAI integration
│   ├── utils/
│   │   ├── __init__.py
│   │   ├── file_utils.py          # File handling utilities
│   │   └── llama_utils.py         # LlamaIndex utilities
│   └── middleware/
│       ├── __init__.py
│       ├── cors.py                # CORS middleware
│       └── logging.py             # Request logging
├── requirements.txt
├── Dockerfile
├── docker-compose.yml
└── .env.example
```

### Frontend (React)

```
frontend/
├── public/
│   ├── index.html
│   └── favicon.ico
├── src/
│   ├── components/
│   │   ├── common/
│   │   │   ├── Header.jsx
│   │   │   ├── Sidebar.jsx
│   │   │   ├── Layout.jsx
│   │   │   └── LoadingSpinner.jsx
│   │   ├── auth/
│   │   │   ├── LoginForm.jsx
│   │   │   └── ProtectedRoute.jsx
│   │   ├── documents/
│   │   │   ├── DocumentUpload.jsx
│   │   │   ├── DocumentList.jsx
│   │   │   └── DocumentViewer.jsx
│   │   ├── chat/
│   │   │   ├── ChatInterface.jsx
│   │   │   ├── MessageList.jsx
│   │   │   └── MessageInput.jsx
│   │   ├── indices/
│   │   │   ├── IndexList.jsx
│   │   │   ├── IndexManager.jsx
│   │   │   └── CreateIndex.jsx
│   │   └── admin/
│   │       ├── UserManagement.jsx
│   │       └── SystemMonitor.jsx
│   ├── hooks/
│   │   ├── useAuth.js
│   │   ├── useDocuments.js
│   │   ├── useChat.js
│   │   └── useIndices.js
│   ├── services/
│   │   ├── api.js                 # Axios configuration
│   │   ├── authService.js         # Authentication API calls
│   │   ├── documentService.js     # Document API calls
│   │   ├── chatService.js         # Chat API calls
│   │   └── indexService.js        # Index API calls
│   ├── context/
│   │   ├── AuthContext.js
│   │   └── ThemeContext.js
│   ├── utils/
│   │   ├── constants.js
│   │   ├── helpers.js
│   │   └── validation.js
│   ├── styles/
│   │   ├── globals.css
│   │   └── components/
│   ├── App.jsx
│   ├── index.js
│   └── routes.js
├── package.json
├── tailwind.config.js
├── vite.config.js
└── .env.example
```

## Technology Stack

### Backend

- **FastAPI**: Modern, fast web framework for Python APIs
- **MongoDB**: Document database for user data, metadata
- **ChromaDB**: Vector database for document embeddings (kept from current)
- **Redis**: Caching layer for improved performance
- **Pydantic**: Data validation and serialization
- **JWT**: Token-based authentication
- **LlamaIndex**: RAG framework (kept from current)
- **OpenAI**: GPT-4 for analysis and embeddings

### Frontend

- **React 18**: Modern React with hooks
- **Vite**: Fast build tool and dev server
- **Tailwind CSS**: Utility-first CSS framework
- **Axios**: HTTP client for API calls
- **React Router**: Client-side routing
- **React Hook Form**: Form handling
- **Zustand**: State management
- **React Query**: Server state management

## Migration Strategy

### Phase 1: Backend Foundation

1. Set up FastAPI project structure
2. Configure MongoDB connection
3. Implement user authentication with JWT
4. Create data models and schemas
5. Set up Redis caching

### Phase 2: Core Services

1. Port document processing pipeline
2. Implement RAG service with LlamaIndex
3. Create OpenAI integration service
4. Implement index management
5. Set up file upload handling

### Phase 3: API Endpoints

1. Authentication endpoints
2. Document management endpoints
3. Chat/query endpoints
4. Index management endpoints
5. Admin endpoints

### Phase 4: Frontend Development

1. Set up React project with Vite
2. Create authentication flow
3. Build document management interface
4. Implement chat interface
5. Create admin dashboard

### Phase 5: Integration & Testing

1. Connect frontend to backend APIs
2. Implement proper error handling
3. Add loading states and UX improvements
4. Performance optimization
5. Security hardening

### Phase 6: Deployment

1. Docker containerization
2. Environment configuration
3. Production deployment setup
4. Monitoring and logging

## Data Migration

### User Data

- Migrate from SQLite to MongoDB
- Transform user authentication to JWT
- Preserve user roles and permissions

### Document Indices

- Keep existing ChromaDB indices
- Update index metadata in MongoDB
- Maintain document access permissions

### Configuration

- Environment variables migration
- API key management
- Cache configuration

## Key Improvements

### Performance

- Async/await throughout backend
- Redis caching for API responses
- Optimized database queries
- React Query for client-side caching

### Security

- JWT-based authentication
- Input validation with Pydantic
- CORS configuration
- Rate limiting

### Scalability

- Microservice-ready architecture
- Database connection pooling
- Horizontal scaling support
- Load balancing ready

### Developer Experience

- Type hints throughout Python code
- API documentation with FastAPI
- Modern React patterns
- Hot reloading in development

### User Experience

- Modern, responsive UI
- Real-time updates
- Better error handling
- Improved performance

## Implementation Timeline

1. **Week 1**: Backend foundation and authentication
2. **Week 2**: Core services and API endpoints
3. **Week 3**: Frontend setup and basic components
4. **Week 4**: Integration and testing
5. **Week 5**: Deployment and optimization

## File Deletion Strategy

Files will be deleted progressively as new implementations are completed:

1. **Phase 1**: Remove PHP authentication files after JWT implementation
2. **Phase 2**: Remove PHP API files after FastAPI endpoints
3. **Phase 3**: Remove Python processing scripts after service implementation
4. **Phase 4**: Remove remaining PHP files after frontend completion
5. **Phase 5**: Clean up temporary files and documentation

## Environment Configuration

### Backend (.env)

```
# Database
MONGODB_URL=mongodb://localhost:27017
DATABASE_NAME=contract_analysis

# Redis
REDIS_URL=redis://localhost:6379

# Authentication
JWT_SECRET_KEY=your-secret-key
JWT_ALGORITHM=HS256
JWT_EXPIRE_MINUTES=30

# OpenAI
OPENAI_API_KEY=your-openai-key
LLAMAPARSE_API_KEY=your-llamaparse-key

# Application
DEBUG=false
CORS_ORIGINS=["http://localhost:3000"]
```

### Frontend (.env)

```
VITE_API_URL=http://localhost:8000
VITE_APP_NAME=Contract Analysis Tool
```

## Success Criteria

- [ ] Complete feature parity with current application
- [ ] Improved performance (faster load times, better caching)
- [ ] Modern, responsive UI
- [ ] Scalable architecture
- [ ] Comprehensive API documentation
- [ ] Security improvements
- [ ] Easy deployment and maintenance

This migration will modernize the application while maintaining all existing functionality and improving performance, security, and maintainability.
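
### Example: Loading the Backend Configuration

As a starting point for `app/config/settings.py`, the backend variables listed under Environment Configuration could be loaded roughly as follows. This is a minimal standard-library sketch, not the project's actual implementation: the `Settings` class and `load_settings` function are hypothetical names, the choice of which variables are required is an assumption, and a real FastAPI project might prefer a dedicated settings library instead.

```python
import json
import os
from dataclasses import dataclass, field
from typing import List, Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the backend variables from the .env example."""

    mongodb_url: str
    database_name: str
    redis_url: str
    jwt_secret_key: str
    openai_api_key: str
    jwt_algorithm: str = "HS256"
    jwt_expire_minutes: int = 30
    debug: bool = False
    cors_origins: List[str] = field(default_factory=list)


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from an environment mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return Settings(
        # Assumed required: a missing key raises KeyError at startup.
        mongodb_url=env["MONGODB_URL"],
        database_name=env["DATABASE_NAME"],
        redis_url=env["REDIS_URL"],
        jwt_secret_key=env["JWT_SECRET_KEY"],
        openai_api_key=env["OPENAI_API_KEY"],
        # Optional values fall back to the defaults shown in the .env example.
        jwt_algorithm=env.get("JWT_ALGORITHM", "HS256"),
        jwt_expire_minutes=int(env.get("JWT_EXPIRE_MINUTES", "30")),
        debug=env.get("DEBUG", "false").strip().lower() in {"1", "true", "yes"},
        # CORS_ORIGINS is written as a JSON array in the .env example.
        cors_origins=json.loads(env.get("CORS_ORIGINS", "[]")),
    )
```

Passing an explicit mapping, for example `load_settings({"MONGODB_URL": "mongodb://localhost:27017", ...})`, keeps the loader easy to unit-test without touching the process environment. Failing fast on missing required keys surfaces configuration mistakes at startup rather than on the first request.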