# Migration Plan: PHP/Python → FastAPI/React

## Overview

This document outlines the complete migration strategy for transforming the current PHP/Python hybrid RAG application into a modern FastAPI backend with React frontend architecture.

## Current Application Analysis

### Existing Features

- **User Authentication**: Role-based access (admin/user) with SQLite storage
- **Document Management**: File uploads, processing, and indexing
- **RAG System**: LlamaIndex + ChromaDB for document retrieval
- **Contract Analysis**: GPT-4 powered contract field extraction
- **Chat Interface**: Natural language document Q&A
- **Caching System**: Response caching for performance
- **Index Management**: User-specific access control to document indices

### Current Architecture

```
PHP Frontend (Web UI) → Python Backend (Processing) → OpenAI API
         ↓                         ↓
     SQLite DB              ChromaDB Vectors
```

## New Architecture

### Target Architecture

```
React Frontend → FastAPI Backend → MongoDB + ChromaDB → OpenAI API
                        ↓
                   Redis Cache
```

## Project Structure

### Backend (FastAPI)

```
backend/
├── app/
│   ├── __init__.py
│   ├── main.py                   # FastAPI app entry point
│   ├── config/
│   │   ├── __init__.py
│   │   ├── settings.py           # Environment configuration
│   │   └── database.py           # MongoDB connection
│   ├── models/
│   │   ├── __init__.py
│   │   ├── user.py               # User data models
│   │   ├── document.py           # Document models
│   │   ├── index.py              # Index models
│   │   └── chat.py               # Chat/query models
│   ├── schemas/
│   │   ├── __init__.py
│   │   ├── user.py               # Pydantic schemas
│   │   ├── document.py
│   │   ├── index.py
│   │   └── chat.py
│   ├── api/
│   │   ├── __init__.py
│   │   ├── deps.py               # Dependencies
│   │   └── v1/
│   │       ├── __init__.py
│   │       ├── auth.py           # Authentication endpoints
│   │       ├── documents.py      # Document management
│   │       ├── indices.py        # Index management
│   │       ├── chat.py           # Chat/query endpoints
│   │       └── admin.py          # Admin endpoints
│   ├── core/
│   │   ├── __init__.py
│   │   ├── auth.py               # JWT authentication
│   │   ├── security.py           # Security utilities
│   │   └── cache.py              # Redis caching
│   ├── services/
│   │   ├── __init__.py
│   │   ├── document_processor.py # Document processing service
│   │   ├── rag_service.py        # RAG retrieval service
│   │   ├── index_service.py      # Index management service
│   │   └── openai_service.py     # OpenAI integration
│   ├── utils/
│   │   ├── __init__.py
│   │   ├── file_utils.py         # File handling utilities
│   │   └── llama_utils.py        # LlamaIndex utilities
│   └── middleware/
│       ├── __init__.py
│       ├── cors.py               # CORS middleware
│       └── logging.py            # Request logging
├── requirements.txt
├── Dockerfile
├── docker-compose.yml
└── .env.example
```

### Frontend (React)

```
frontend/
├── public/
│   ├── index.html
│   └── favicon.ico
├── src/
│   ├── components/
│   │   ├── common/
│   │   │   ├── Header.jsx
│   │   │   ├── Sidebar.jsx
│   │   │   ├── Layout.jsx
│   │   │   └── LoadingSpinner.jsx
│   │   ├── auth/
│   │   │   ├── LoginForm.jsx
│   │   │   └── ProtectedRoute.jsx
│   │   ├── documents/
│   │   │   ├── DocumentUpload.jsx
│   │   │   ├── DocumentList.jsx
│   │   │   └── DocumentViewer.jsx
│   │   ├── chat/
│   │   │   ├── ChatInterface.jsx
│   │   │   ├── MessageList.jsx
│   │   │   └── MessageInput.jsx
│   │   ├── indices/
│   │   │   ├── IndexList.jsx
│   │   │   ├── IndexManager.jsx
│   │   │   └── CreateIndex.jsx
│   │   └── admin/
│   │       ├── UserManagement.jsx
│   │       └── SystemMonitor.jsx
│   ├── hooks/
│   │   ├── useAuth.js
│   │   ├── useDocuments.js
│   │   ├── useChat.js
│   │   └── useIndices.js
│   ├── services/
│   │   ├── api.js                # Axios configuration
│   │   ├── authService.js        # Authentication API calls
│   │   ├── documentService.js    # Document API calls
│   │   ├── chatService.js        # Chat API calls
│   │   └── indexService.js       # Index API calls
│   ├── context/
│   │   ├── AuthContext.js
│   │   └── ThemeContext.js
│   ├── utils/
│   │   ├── constants.js
│   │   ├── helpers.js
│   │   └── validation.js
│   ├── styles/
│   │   ├── globals.css
│   │   └── components/
│   ├── App.jsx
│   ├── index.js
│   └── routes.js
├── package.json
├── tailwind.config.js
├── vite.config.js
└── .env.example
```

## Technology Stack

### Backend

- **FastAPI**: Modern, fast web framework for Python APIs
- **MongoDB**: Document database for user data and metadata
- **ChromaDB**: Vector database for document embeddings (kept from the current application)
- **Redis**: Caching layer for improved performance
- **Pydantic**: Data validation and serialization
- **JWT**: Token-based authentication
- **LlamaIndex**: RAG framework (kept from the current application)
- **OpenAI**: GPT-4 for analysis and embeddings

### Frontend

- **React 18**: Modern React with hooks
- **Vite**: Fast build tool and dev server
- **Tailwind CSS**: Utility-first CSS framework
- **Axios**: HTTP client for API calls
- **React Router**: Client-side routing
- **React Hook Form**: Form handling
- **Zustand**: State management
- **React Query**: Server state management

## Migration Strategy

### Phase 1: Backend Foundation

1. Set up FastAPI project structure
2. Configure MongoDB connection
3. Implement user authentication with JWT
4. Create data models and schemas
5. Set up Redis caching

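To make step 3 concrete, here is a minimal sketch of issuing and verifying an HS256 token with only the standard library. The function names `create_access_token` and `decode_access_token`, and the `sub`/`role` claim layout, are assumptions for illustration and are not taken from the existing codebase. In practice `app/core/auth.py` would more likely wrap an established JWT library; the sketch only shows what the signature and expiry checks have to do.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the padding that was stripped."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def create_access_token(subject: str, role: str, secret: str,
                        expire_minutes: int = 30,
                        now: Optional[float] = None) -> str:
    """Build an HS256-signed token carrying the user id and role (hypothetical claims)."""
    issued = int(now if now is not None else time.time())
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "role": role, "exp": issued + expire_minutes * 60}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def decode_access_token(token: str, secret: str,
                        now: Optional[float] = None) -> dict:
    """Return the payload if signature, algorithm, and expiry check out; else raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("invalid signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["exp"] <= (now if now is not None else time.time()):
        raise ValueError("token expired")
    return payload
```

The default of 30 minutes mirrors `JWT_EXPIRE_MINUTES` in the backend `.env` sample. The `now` parameter exists only so that expiry can be tested deterministically.
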
### Phase 2: Core Services

1. Port document processing pipeline
2. Implement RAG service with LlamaIndex
3. Create OpenAI integration service
4. Implement index management
5. Set up file upload handling

### Phase 3: API Endpoints

1. Authentication endpoints
2. Document management endpoints
3. Chat/query endpoints
4. Index management endpoints
5. Admin endpoints

### Phase 4: Frontend Development

1. Set up React project with Vite
2. Create authentication flow
3. Build document management interface
4. Implement chat interface
5. Create admin dashboard

### Phase 5: Integration & Testing

1. Connect frontend to backend APIs
2. Implement proper error handling
3. Add loading states and UX improvements
4. Performance optimization
5. Security hardening

### Phase 6: Deployment

1. Docker containerization
2. Environment configuration
3. Production deployment setup
4. Monitoring and logging

## Data Migration

### User Data

- Migrate from SQLite to MongoDB
- Transform user authentication to JWT
- Preserve user roles and permissions

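The user migration can be sketched as a read-and-transform pass over the legacy table. The table name `users` and the columns `id`, `username`, `password_hash`, and `role` are assumptions, since the plan does not describe the actual SQLite schema; adjust the query to whatever the PHP application really stores. The sketch stops at producing documents and leaves the MongoDB write to the caller.

```python
import sqlite3
from typing import Iterator

VALID_ROLES = ("admin", "user")  # the two roles named in the feature list


def iter_user_documents(conn: sqlite3.Connection) -> Iterator[dict]:
    """Yield one MongoDB-style document per row of the (assumed) legacy users table."""
    conn.row_factory = sqlite3.Row  # lets us address columns by name
    rows = conn.execute(
        "SELECT id, username, password_hash, role FROM users ORDER BY id"
    )
    for row in rows:
        if row["role"] not in VALID_ROLES:
            # Fail loudly instead of silently downgrading or upgrading a user.
            raise ValueError(f"unknown role {row['role']!r} for user {row['username']!r}")
        yield {
            "legacy_id": row["id"],            # kept so records can be traced back
            "username": row["username"],
            "password_hash": row["password_hash"],
            "role": row["role"],
        }


if __name__ == "__main__":
    # Demo against an in-memory database with made-up rows.
    demo = sqlite3.connect(":memory:")
    demo.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, password_hash TEXT, role TEXT)"
    )
    demo.execute("INSERT INTO users VALUES (1, 'alice', 'hash-a', 'admin')")
    for document in iter_user_documents(demo):
        print(document)
```

The resulting list could then be written in one call with pymongo's `insert_many` on the target collection. Carrying `legacy_id` along makes it possible to verify the migration row by row before the SQLite file is retired.
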
### Document Indices

- Keep existing ChromaDB indices
- Update index metadata in MongoDB
- Maintain document access permissions

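One way to picture the index metadata is a small record that points at the untouched ChromaDB collection and lists who may query it. The field names below (`chroma_collection`, `allowed_user_ids`) and the rule that admins can see every index are assumptions for illustration; the plan only states that access is user-specific and role-based.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class IndexMetadata:
    """Shape of one index metadata document (hypothetical field names)."""
    name: str                       # logical index name shown in the UI
    chroma_collection: str          # existing ChromaDB collection, left as-is
    allowed_user_ids: FrozenSet[str] = field(default_factory=frozenset)


def can_access(user_id: str, role: str, index: IndexMetadata) -> bool:
    """Admins see everything (assumed); other users need an explicit grant."""
    return role == "admin" or user_id in index.allowed_user_ids


def visible_indices(user_id: str, role: str,
                    indices: Iterable[IndexMetadata]) -> List[str]:
    """Names of the indices this user may query, in the order given."""
    return [index.name for index in indices if can_access(user_id, role, index)]
```

Keeping the check in one function means the chat, document, and index endpoints can all call the same rule before touching ChromaDB.
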
### Configuration

- Environment variables migration
- API key management
- Cache configuration

## Key Improvements

### Performance

- Async/await throughout backend
- Redis caching for API responses
- Optimized database queries
- React Query for client-side caching

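The response caching can be sketched as a cache-aside helper around the RAG call. `InMemoryTTLCache` is a stand-in written for this example so the code runs without a Redis server; a real `app/core/cache.py` would talk to Redis instead. The key layout, the `chat:` prefix, and the 300-second default TTL are likewise assumptions.

```python
import asyncio
import hashlib
import json
import time
from typing import Any, Awaitable, Callable, Dict, Optional, Tuple


class InMemoryTTLCache:
    """Stand-in for Redis: stores strings together with an expiry timestamp."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[float, str]] = {}

    async def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None or entry[0] <= time.monotonic():
            self._items.pop(key, None)  # drop expired entries lazily
            return None
        return entry[1]

    async def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._items[key] = (time.monotonic() + ttl_seconds, value)


def make_cache_key(user_id: str, index_name: str, question: str) -> str:
    """Hash the normalized query; the user id keeps answers from leaking across users."""
    raw = json.dumps([user_id, index_name, question.strip().lower()])
    return "chat:" + hashlib.sha256(raw.encode()).hexdigest()


async def cached_answer(cache: InMemoryTTLCache, user_id: str, index_name: str,
                        question: str,
                        compute: Callable[[str], Awaitable[Any]],
                        ttl_seconds: float = 300.0) -> Any:
    """Return a cached answer if present, otherwise compute, store, and return it."""
    key = make_cache_key(user_id, index_name, question)
    hit = await cache.get(key)
    if hit is not None:
        return json.loads(hit)
    result = await compute(question)
    await cache.set(key, json.dumps(result), ttl_seconds)
    return result
```

Including the user id in the key trades some hit rate for safety: two users with different index permissions never share a cached answer.
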
### Security

- JWT-based authentication
- Input validation with Pydantic
- CORS configuration
- Rate limiting

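The plan does not say how rate limiting will be implemented. A token bucket is one common choice, sketched below under that assumption. The class names and the per-key design are illustrative, and a multi-process deployment would need shared state (for example in Redis) instead of an in-process dictionary.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `refill_per_second`."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)
        self._updated = clock()

    def allow(self) -> bool:
        """Consume one token if available; return whether the request may proceed."""
        now = self._clock()
        elapsed = now - self._updated
        self._updated = now
        # Refill for the elapsed time, never exceeding the bucket capacity.
        self._tokens = min(float(self.capacity),
                           self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


class RateLimiter:
    """Keeps one bucket per client key, such as a user id or IP address."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._capacity = capacity
        self._refill = refill_per_second
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, key: str) -> bool:
        bucket = self._buckets.get(key)
        if bucket is None:
            bucket = TokenBucket(self._capacity, self._refill, self._clock)
            self._buckets[key] = bucket
        return bucket.allow()
```

In a FastAPI application this check would typically sit in a dependency or middleware that returns HTTP 429 when `allow` is false. The injectable `clock` is there so the limiter can be tested without sleeping.
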
### Scalability

- Microservice-ready architecture
- Database connection pooling
- Horizontal scaling support
- Load balancing ready

### Developer Experience

- Type hints throughout Python code
- API documentation with FastAPI
- Modern React patterns
- Hot reloading in development

### User Experience

- Modern, responsive UI
- Real-time updates
- Better error handling
- Improved performance

## Implementation Timeline

1. **Week 1**: Backend foundation and authentication
2. **Week 2**: Core services and API endpoints
3. **Week 3**: Frontend setup and basic components
4. **Week 4**: Integration and testing
5. **Week 5**: Deployment and optimization

## File Deletion Strategy

Files will be deleted progressively as new implementations are completed:

1. **Phase 1**: Remove PHP authentication files after JWT implementation
2. **Phase 2**: Remove PHP API files after FastAPI endpoints
3. **Phase 3**: Remove Python processing scripts after service implementation
4. **Phase 4**: Remove remaining PHP files after frontend completion
5. **Phase 5**: Clean up temporary files and documentation

## Environment Configuration

### Backend (.env)

```
# Database
MONGODB_URL=mongodb://localhost:27017
DATABASE_NAME=contract_analysis

# Redis
REDIS_URL=redis://localhost:6379

# Authentication
JWT_SECRET_KEY=your-secret-key
JWT_ALGORITHM=HS256
JWT_EXPIRE_MINUTES=30

# OpenAI
OPENAI_API_KEY=your-openai-key
LLAMAPARSE_API_KEY=your-llamaparse-key

# Application
DEBUG=false
CORS_ORIGINS=["http://localhost:3000"]
```

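A rough sketch of how `app/config/settings.py` might read these variables follows. It uses only the standard library so it stays self-contained; a FastAPI project would more commonly use a Pydantic-based settings class. The `Settings` and `load_settings` names, and the choice of which variables are required versus defaulted, are assumptions.

```python
import json
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class Settings:
    mongodb_url: str
    database_name: str
    redis_url: str
    jwt_secret_key: str
    jwt_algorithm: str
    jwt_expire_minutes: int
    openai_api_key: str
    llamaparse_api_key: str
    debug: bool
    cors_origins: List[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on missing secrets."""

    def required(name: str) -> str:
        value = env.get(name)
        if not value:
            raise RuntimeError(f"missing required environment variable: {name}")
        return value

    return Settings(
        mongodb_url=env.get("MONGODB_URL", "mongodb://localhost:27017"),
        database_name=env.get("DATABASE_NAME", "contract_analysis"),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379"),
        jwt_secret_key=required("JWT_SECRET_KEY"),      # secrets get no default
        jwt_algorithm=env.get("JWT_ALGORITHM", "HS256"),
        jwt_expire_minutes=int(env.get("JWT_EXPIRE_MINUTES", "30")),
        openai_api_key=required("OPENAI_API_KEY"),
        llamaparse_api_key=required("LLAMAPARSE_API_KEY"),
        debug=env.get("DEBUG", "false").strip().lower() in ("1", "true", "yes"),
        # CORS_ORIGINS is written as a JSON list in the sample above.
        cors_origins=json.loads(env.get("CORS_ORIGINS", "[]")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests, and refusing to default the secret values keeps a misconfigured deployment from starting with placeholder keys.
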
### Frontend (.env)

```
VITE_API_URL=http://localhost:8000
VITE_APP_NAME=Contract Analysis Tool
```

## Success Criteria

- [ ] Complete feature parity with current application
- [ ] Improved performance (faster load times, better caching)
- [ ] Modern, responsive UI
- [ ] Scalable architecture
- [ ] Comprehensive API documentation
- [ ] Security improvements
- [ ] Easy deployment and maintenance

This migration will modernize the application while maintaining all existing functionality and improving performance, security, and maintainability.