Migration Plan: PHP/Python → FastAPI/React
Overview
This document outlines the strategy for migrating the current PHP/Python hybrid RAG application to a FastAPI backend with a React frontend.
Current Application Analysis
Existing Features
- User Authentication: Role-based access (admin/user) with SQLite storage
- Document Management: File uploads, processing, and indexing
- RAG System: LlamaIndex + ChromaDB for document retrieval
- Contract Analysis: GPT-4-powered contract field extraction
- Chat Interface: Natural language document Q&A
- Caching System: Response caching for performance
- Index Management: User-specific access control to document indices
Current Architecture
PHP Frontend (Web UI) → Python Backend (Processing) → OpenAI API
          ↓                         ↓
      SQLite DB              ChromaDB Vectors
New Architecture
Target Architecture
React Frontend → FastAPI Backend → MongoDB + ChromaDB → OpenAI API
                        ↓
                   Redis Cache
Project Structure
Backend (FastAPI)
backend/
├── app/
│ ├── __init__.py
│ ├── main.py # FastAPI app entry point
│ ├── config/
│ │ ├── __init__.py
│ │ ├── settings.py # Environment configuration
│ │ └── database.py # MongoDB connection
│ ├── models/
│ │ ├── __init__.py
│ │ ├── user.py # User data models
│ │ ├── document.py # Document models
│ │ ├── index.py # Index models
│ │ └── chat.py # Chat/query models
│ ├── schemas/
│ │ ├── __init__.py
│ │ ├── user.py # Pydantic schemas
│ │ ├── document.py
│ │ ├── index.py
│ │ └── chat.py
│ ├── api/
│ │ ├── __init__.py
│ │ ├── deps.py # Dependencies
│ │ └── v1/
│ │     ├── __init__.py
│ │     ├── auth.py # Authentication endpoints
│ │     ├── documents.py # Document management
│ │     ├── indices.py # Index management
│ │     ├── chat.py # Chat/query endpoints
│ │     └── admin.py # Admin endpoints
│ ├── core/
│ │ ├── __init__.py
│ │ ├── auth.py # JWT authentication
│ │ ├── security.py # Security utilities
│ │ └── cache.py # Redis caching
│ ├── services/
│ │ ├── __init__.py
│ │ ├── document_processor.py # Document processing service
│ │ ├── rag_service.py # RAG retrieval service
│ │ ├── index_service.py # Index management service
│ │ └── openai_service.py # OpenAI integration
│ ├── utils/
│ │ ├── __init__.py
│ │ ├── file_utils.py # File handling utilities
│ │ └── llama_utils.py # LlamaIndex utilities
│ └── middleware/
│     ├── __init__.py
│     ├── cors.py # CORS middleware
│     └── logging.py # Request logging
├── requirements.txt
├── Dockerfile
├── docker-compose.yml
└── .env.example
Frontend (React)
frontend/
├── public/
│ ├── index.html
│ └── favicon.ico
├── src/
│ ├── components/
│ │ ├── common/
│ │ │ ├── Header.jsx
│ │ │ ├── Sidebar.jsx
│ │ │ ├── Layout.jsx
│ │ │ └── LoadingSpinner.jsx
│ │ ├── auth/
│ │ │ ├── LoginForm.jsx
│ │ │ └── ProtectedRoute.jsx
│ │ ├── documents/
│ │ │ ├── DocumentUpload.jsx
│ │ │ ├── DocumentList.jsx
│ │ │ └── DocumentViewer.jsx
│ │ ├── chat/
│ │ │ ├── ChatInterface.jsx
│ │ │ ├── MessageList.jsx
│ │ │ └── MessageInput.jsx
│ │ ├── indices/
│ │ │ ├── IndexList.jsx
│ │ │ ├── IndexManager.jsx
│ │ │ └── CreateIndex.jsx
│ │ └── admin/
│ │     ├── UserManagement.jsx
│ │     └── SystemMonitor.jsx
│ ├── hooks/
│ │ ├── useAuth.js
│ │ ├── useDocuments.js
│ │ ├── useChat.js
│ │ └── useIndices.js
│ ├── services/
│ │ ├── api.js # Axios configuration
│ │ ├── authService.js # Authentication API calls
│ │ ├── documentService.js # Document API calls
│ │ ├── chatService.js # Chat API calls
│ │ └── indexService.js # Index API calls
│ ├── context/
│ │ ├── AuthContext.js
│ │ └── ThemeContext.js
│ ├── utils/
│ │ ├── constants.js
│ │ ├── helpers.js
│ │ └── validation.js
│ ├── styles/
│ │ ├── globals.css
│ │ └── components/
│ ├── App.jsx
│ ├── index.js
│ └── routes.js
├── package.json
├── tailwind.config.js
├── vite.config.js
└── .env.example
Technology Stack
Backend
- FastAPI: Modern, fast web framework for Python APIs
- MongoDB: Document database for user data and metadata
- ChromaDB: Vector database for document embeddings (kept from current)
- Redis: Caching layer for improved performance
- Pydantic: Data validation and serialization
- JWT: Token-based authentication
- LlamaIndex: RAG framework (kept from current)
- OpenAI: GPT-4 for analysis, and the OpenAI API for embeddings
Frontend
- React 18: Modern React with hooks
- Vite: Fast build tool and dev server
- Tailwind CSS: Utility-first CSS framework
- Axios: HTTP client for API calls
- React Router: Client-side routing
- React Hook Form: Form handling
- Zustand: State management
- React Query: Server state management
Migration Strategy
Phase 1: Backend Foundation
- Set up FastAPI project structure
- Configure MongoDB connection
- Implement user authentication with JWT
- Create data models and schemas
- Set up Redis caching
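The JWT step in this phase can be illustrated with a minimal, standard-library sketch of HS256 signing and verification. The function names `create_token` and `verify_token` and the `role` claim are assumptions made for illustration. A real backend would more likely rely on an established JWT library than on hand-rolled code like this.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    """HMAC-SHA256 over the 'header.payload' string (the HS256 algorithm)."""
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def create_token(subject: str, role: str, secret: str,
                 expire_minutes: int = 30, now: Optional[float] = None) -> str:
    """Build a signed token carrying the user id, role (assumed claim), and expiry."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "role": role, "exp": issued + expire_minutes * 60}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Return the payload if signature and expiry are valid, else raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    # compare_digest avoids leaking how many leading bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if payload["exp"] <= current:
        raise ValueError("token expired")
    return payload
```

The `now` parameter exists only to make expiry testable. The 30-minute default mirrors `JWT_EXPIRE_MINUTES` in the environment configuration.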
Phase 2: Core Services
- Port document processing pipeline
- Implement RAG service with LlamaIndex
- Create OpenAI integration service
- Implement index management
- Set up file upload handling
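For the file upload step, one possible approach is to validate each upload and derive its storage name from the content hash, so the client-supplied filename never reaches the filesystem. The allowed extensions, the size limit, and the name `storage_name` are assumptions for illustration, not values taken from the current application.

```python
import hashlib
from pathlib import PurePosixPath

# Both limits are assumptions; adjust to what the document pipeline accepts.
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".txt"}
MAX_UPLOAD_BYTES = 20 * 1024 * 1024


def storage_name(original_name: str, content: bytes) -> str:
    """Validate an upload and return a content-addressed file name."""
    # Keep only the last path component so "../../x.pdf" cannot escape the upload dir.
    base = PurePosixPath(original_name.replace("\\", "/")).name
    suffix = PurePosixPath(base).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {suffix or '(none)'}")
    if len(content) > MAX_UPLOAD_BYTES:
        raise ValueError("file too large")
    # Identical content maps to the same name, which deduplicates re-uploads.
    return hashlib.sha256(content).hexdigest() + suffix
```

The original filename can still be kept as display metadata on the document record.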
Phase 3: API Endpoints
- Authentication endpoints
- Document management endpoints
- Chat/query endpoints
- Index management endpoints
- Admin endpoints
Phase 4: Frontend Development
- Set up React project with Vite
- Create authentication flow
- Build document management interface
- Implement chat interface
- Create admin dashboard
Phase 5: Integration & Testing
- Connect frontend to backend APIs
- Implement proper error handling
- Add loading states and UX improvements
- Performance optimization
- Security hardening
Phase 6: Deployment
- Docker containerization
- Environment configuration
- Production deployment setup
- Monitoring and logging
Data Migration
User Data
- Migrate from SQLite to MongoDB
- Transform user authentication to JWT
- Preserve user roles and permissions
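The user migration could be sketched as a function that reads the legacy SQLite rows and yields MongoDB-style documents. The table and column names (`users`, `id`, `username`, `password_hash`, `role`) are assumptions, since this plan does not describe the existing schema. The sketch rejects unknown roles rather than silently rewriting them, so that roles are preserved exactly.

```python
import sqlite3
from typing import Iterator

VALID_ROLES = {"admin", "user"}  # the two roles named in this plan


def users_as_documents(conn: sqlite3.Connection) -> Iterator[dict]:
    """Yield one MongoDB-style document per row of the (assumed) legacy users table."""
    rows = conn.execute(
        "SELECT id, username, password_hash, role FROM users ORDER BY id"
    )
    for legacy_id, username, password_hash, role in rows:
        if role not in VALID_ROLES:
            # Fail loudly: a migration should not guess at permissions.
            raise ValueError(f"user {username!r} has unknown role {role!r}")
        yield {
            "legacy_id": legacy_id,          # kept so old references can be traced
            "username": username,
            "password_hash": password_hash,  # copied as-is, never re-hashed here
            "role": role,
        }
```

The resulting documents could then be written in one batch, for example with pymongo's `insert_many`.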
Document Indices
- Keep existing ChromaDB indices
- Update index metadata in MongoDB
- Maintain document access permissions
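One way to model the index metadata and its access rule is sketched below. The field names (`chroma_collection`, `allowed_user_ids`) and the rule that admins can see every index are assumptions. The current application's permission model may differ and should be checked before porting.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class User:
    user_id: str
    role: str  # "admin" or "user"


@dataclass
class IndexMetadata:
    """Metadata record that would live in MongoDB; vectors stay in ChromaDB."""
    name: str
    chroma_collection: str  # hypothetical pointer to the existing collection
    allowed_user_ids: Set[str] = field(default_factory=set)


def can_access(user: User, index: IndexMetadata) -> bool:
    """Assumed rule: admins see every index, users only those granted to them."""
    return user.role == "admin" or user.user_id in index.allowed_user_ids


def visible_indices(user: User, indices: Iterable[IndexMetadata]) -> List[str]:
    """Names of the indices this user may query, in the order given."""
    return [ix.name for ix in indices if can_access(user, ix)]
```

Keeping the check in one function means the chat and document endpoints can share it, for example through a FastAPI dependency.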
Configuration
- Environment variables migration
- API key management
- Cache configuration
Key Improvements
Performance
- Async/await throughout backend
- Redis caching for API responses
- Optimized database queries
- React Query for client-side caching
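The backend caching idea can be sketched as an async get-or-compute helper. Here an in-memory `TTLCache` stands in for Redis so the example runs without a server. The key layout, the class, and the helper names are assumptions. Including the user id in the key is one way to keep cached answers from crossing the per-user index permissions.

```python
import asyncio
import hashlib
import json
import time
from typing import Any, Awaitable, Callable, Dict, Optional, Tuple


def cache_key(user_id: str, index_name: str, question: str) -> str:
    """Stable key; case and whitespace are normalized so trivial variants share an entry."""
    normalized = " ".join(question.lower().split())
    raw = json.dumps([user_id, index_name, normalized])
    return "rag:" + hashlib.sha256(raw.encode()).hexdigest()


class TTLCache:
    """In-memory stand-in for Redis: entries expire ttl_seconds after being set."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if expires_at <= self._clock():
            del self._store[key]  # lazily evict the expired entry
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (self._clock() + self._ttl, value)


async def cached_answer(cache: TTLCache, key: str,
                        compute: Callable[[], Awaitable[Any]]) -> Any:
    """Return the cached value, or await compute() once and store its result."""
    hit = cache.get(key)
    if hit is not None:
        return hit
    value = await compute()
    cache.set(key, value)
    return value
```

With a real Redis client the same shape applies, with `get` and `set` replaced by the client's calls and the TTL passed as an expiry.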
Security
- JWT-based authentication
- Input validation with Pydantic
- CORS configuration
- Rate limiting
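Rate limiting is listed without a specific algorithm, so the following is only one option: a sliding-window limiter keyed by client id. The class name and the limits are hypothetical. In a multi-process deployment the counters would need to live in shared storage such as Redis rather than in process memory.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and hits[0] <= now - self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False  # rejected requests are not recorded
        hits.append(now)
        return True
```

An endpoint would call `allow()` with the authenticated user id or client IP and answer with HTTP 429 when it returns `False`.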
Scalability
- Microservice-ready architecture
- Database connection pooling
- Horizontal scaling support
- Load balancing ready
Developer Experience
- Type hints throughout Python code
- API documentation with FastAPI
- Modern React patterns
- Hot reloading in development
User Experience
- Modern, responsive UI
- Real-time updates
- Better error handling
- Improved performance
Implementation Timeline
- Week 1: Backend foundation and authentication
- Week 2: Core services and API endpoints
- Week 3: Frontend setup and basic components
- Week 4: Integration and testing
- Week 5: Deployment and optimization
File Deletion Strategy
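Legacy files will be deleted progressively, each group only after its replacement is complete. The phases below are deletion stages and are numbered independently of the migration phases above: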
- Phase 1: Remove PHP authentication files after JWT implementation
- Phase 2: Remove PHP API files after FastAPI endpoints
- Phase 3: Remove Python processing scripts after service implementation
- Phase 4: Remove remaining PHP files after frontend completion
- Phase 5: Clean up temporary files and outdated documentation
Environment Configuration
Backend (.env)
# Database
MONGODB_URL=mongodb://localhost:27017
DATABASE_NAME=contract_analysis
# Redis
REDIS_URL=redis://localhost:6379
# Authentication
JWT_SECRET_KEY=your-secret-key
JWT_ALGORITHM=HS256
JWT_EXPIRE_MINUTES=30
# OpenAI and LlamaParse
OPENAI_API_KEY=your-openai-key
LLAMAPARSE_API_KEY=your-llamaparse-key
# Application
DEBUG=false
CORS_ORIGINS=["http://localhost:3000"]
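A rough sketch of how `settings.py` might turn these variables into typed values, using only the standard library. The `Settings` class and `load_settings` function are hypothetical, and only a subset of the variables is covered. In a FastAPI project a Pydantic-based settings class is a common alternative that performs the same conversions.

```python
import json
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class Settings:
    mongodb_url: str
    database_name: str
    redis_url: str
    jwt_secret_key: str
    jwt_algorithm: str
    jwt_expire_minutes: int
    debug: bool
    cors_origins: List[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read and type-convert the backend variables; the JWT secret has no default."""
    return Settings(
        mongodb_url=env.get("MONGODB_URL", "mongodb://localhost:27017"),
        database_name=env.get("DATABASE_NAME", "contract_analysis"),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379"),
        # Raises KeyError when unset, so the app cannot start with a missing secret.
        jwt_secret_key=env["JWT_SECRET_KEY"],
        jwt_algorithm=env.get("JWT_ALGORITHM", "HS256"),
        jwt_expire_minutes=int(env.get("JWT_EXPIRE_MINUTES", "30")),
        debug=env.get("DEBUG", "false").strip().lower() in ("1", "true", "yes"),
        # CORS_ORIGINS is written as a JSON array in the .env file.
        cors_origins=json.loads(env.get("CORS_ORIGINS", "[]")),
    )
```

Passing the environment as a mapping keeps the loader easy to test with a plain dictionary.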
Frontend (.env)
VITE_API_URL=http://localhost:8000
VITE_APP_NAME=Contract Analysis Tool
Success Criteria
- Complete feature parity with current application
- Improved performance (faster load times, better caching)
- Modern, responsive UI
- Scalable architecture
- Comprehensive API documentation
- Security improvements
- Easy deployment and maintenance
This migration will modernize the application while maintaining all existing functionality and improving performance, security, and maintainability.