From ba0391bbc6e06049aeef9aa9b3c2c93acc074de2 Mon Sep 17 00:00:00 2001 From: michael Date: Thu, 18 Sep 2025 14:31:35 -0500 Subject: [PATCH] updated readme --- README.md | 269 +++++++++++++++++++++++++++++++++++++----------------- 1 file changed, 185 insertions(+), 84 deletions(-) diff --git a/README.md b/README.md index 61311a2..6ef94d8 100644 --- a/README.md +++ b/README.md @@ -1,142 +1,243 @@ # Video Query Tool -This application processes videos using Google's Gemini AI model, allowing users to: +A full-stack web application that processes videos using Google's Gemini AI model, allowing users to upload videos and receive AI-generated content based on customizable prompts. The application features Azure AD B2C authentication, chunked file uploads for large videos, PDF generation with Mermaid diagram support, and comprehensive usage tracking. -1. Upload videos (MP4, AVI, MOV, etc.) -2. Choose from preset processing modes or use custom prompts -3. Get AI-generated markdown content based on the video content +## Features -## Important Notes +### Core Functionality +- **Video Processing**: Upload and analyze videos using Google Gemini 2.5 Pro AI model +- **Multiple Processing Modes**: + - Meeting Summary + - Process/Tool Documentation + - Process Documentation with Mermaid Charts + - Custom Prompts +- **Large File Support**: Chunked upload system supporting files up to 5GB +- **PDF Generation**: Convert results to PDF with embedded Mermaid diagrams +- **Authentication**: Azure AD B2C integration with both popup and redirect flows -- **Video Length Limitation**: The Gemini AI model can only process videos up to 55 minutes in length. -- **File Size**: The application supports uploads up to 5GB. +### Technical Features +- **Drag & Drop Upload**: Modern file upload interface with progress tracking +- **Real-time Processing**: Live status updates during video analysis +- **Error Handling**: Comprehensive error handling and user feedback +- **Usage Analytics**: Automated tracking via webhook integration +- **Production Ready**: Systemd service configuration and deployment scripts + +## Limitations + +- **Video Length**: Gemini AI processes videos up to 55 minutes maximum +- **File Size**: Application supports uploads up to 5GB +- **Supported Formats**: MP4, AVI, MOV, WMV, MKV, WEBM ## Project Structure ``` video_query/ -├── backend/ # Flask/Hypercorn server -│ ├── app.py # Main Flask application -│ ├── video_processor.py # Video processing logic -│ └── run.py # Hypercorn server script -└── frontend/ # React frontend - ├── public/ # Static assets - └── src/ # React source code +├── backend/ # Flask/Hypercorn API server +│ ├── app.py # Main Flask application with PDF generation +│ ├── video_processor.py # Gemini API integration and video processing +│ ├── auth.py # Azure AD B2C authentication handlers +│ ├── chunked_upload.py # Chunked file upload Blueprint +│ ├── run.py # Hypercorn production server +│ ├── requirements.txt # Python dependencies +│ └── test_*.py # API testing utilities +├── frontend/ # React SPA +│ ├── src/ +│ │ ├── components/ # React components +│ │ │ ├── VideoUpload.js # Drag & drop file upload +│ │ │ ├── PromptSelector.js # Mode selection and prompt editing +│ │ │ ├── ResultDisplay.js # Results with PDF generation +│ │ │ ├── AuthenticatedContent.js # Main application interface +│ │ │ └── Login.js # Authentication interface +│ │ ├── auth/ # Authentication utilities +│ │ │ ├── authConfig.js # Azure AD B2C configuration +│ │ │ ├── AuthProvider.js # MSAL React provider +│ │ │ └── authApiClient.js # Authenticated API client +│ │ └── utils/ +│ │ └── chunkedUploader.js # Large file upload handler +│ ├── package.json # Node.js dependencies +│ └── build/ # Production build output +├── DEPLOYMENT.md # Production deployment instructions +├── LOG_EXTRACTION_README.md # Usage analytics documentation +├── restart.sh # Development restart script +├── quick_extract.sh # Log extraction utility +├── extract_user_logs*.sh # Advanced log processing +└── requirements.txt # Root Python dependencies (legacy) ``` +## Dependencies + +### Backend Dependencies +- **Flask 3.1.0**: Web framework +- **google-generativeai 0.8.5**: Gemini AI API client +- **Hypercorn 0.17.3**: ASGI production server +- **python-jose**: JWT token validation for Azure AD +- **flask-cors 5.0.1**: Cross-origin resource sharing +- **pdfkit 1.0.0**: PDF generation from HTML +- **cairosvg 2.8.0**: SVG to PNG conversion for diagrams +- **Pillow 11.2.1**: Image processing +- **python-dotenv 1.1.0**: Environment variable management + +### Frontend Dependencies +- **React 18.2.0**: UI framework +- **@azure/msal-react 3.0.12**: Microsoft Authentication Library +- **axios 1.6.0**: HTTP client +- **bootstrap 5.3.2**: UI components and styling +- **mermaid 11.6.0**: Diagram generation +- **react-dropzone 14.2.3**: File upload interface +- **showdown 2.1.0**: Markdown to HTML conversion + ## Setup Instructions +### Prerequisites +- Python 3.8+ +- Node.js 16+ +- Google Cloud API key with Gemini access +- Azure AD B2C tenant (for authentication) +- wkhtmltopdf (for PDF generation) + ### Backend Setup -1. Create and activate a virtual environment: - ``` +1. **Create and activate virtual environment**: + ```bash python -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate ``` -2. Install backend dependencies: - ``` - pip install -r requirements.txt +2. **Install dependencies**: + ```bash + pip install -r backend/requirements.txt ``` -3. Set your Google API key: - ``` - export GOOGLE_API_KEY=your_api_key_here +3. **Set up environment variables**: + ```bash + export GOOGLE_API_KEY="your_gemini_api_key_here" ``` -4. Run the development server: +4. **Install system dependencies for PDF generation**: + ```bash + # Ubuntu/Debian: + sudo apt-get install wkhtmltopdf python3-cairo libcairo2-dev + + # macOS: + brew install cairo wkhtmltopdf ``` + +5. **Start development server**: + ```bash cd backend - python run.py + python run.py --host 0.0.0.0 --port 5010 ``` ### Frontend Setup -1. Install Node.js dependencies: - ``` +1. **Install Node.js dependencies**: + ```bash cd frontend npm install ``` -2. Start the development server: - ``` +2. **Configure authentication** (edit `src/auth/authConfig.js`): + - Update Azure AD B2C tenant ID + - Update client ID + - Update redirect URIs + +3. **Start development server**: + ```bash npm start ``` -## Deployment +## Production Deployment -### Backend Deployment with Systemd +### System Requirements +- Ubuntu/CentOS server +- Apache/Nginx web server +- Python 3.8+ with virtual environment +- wkhtmltopdf system package +- Node.js for building frontend -1. Update the systemd service file (`backend/video-query.service`): - - Update paths to match your server - - Add your GOOGLE_API_KEY - - Place in `/etc/systemd/system/` +### Backend Deployment -2. Enable and start the service: +1. **Install system packages**: + ```bash + sudo apt-get update + sudo apt-get install -y wkhtmltopdf python3-cairo python3-pil libcairo2-dev ``` + +2. **Create production service** (see `DEPLOYMENT.md` for systemd configuration): + ```bash sudo systemctl enable video-query sudo systemctl start video-query ``` -3. Check the service status: - ``` - sudo systemctl status video-query - ``` +### Frontend Deployment -### Frontend Deployment with Apache - -1. Build the React frontend: - ``` +1. **Build for production**: + ```bash cd frontend - npm run build + PUBLIC_URL=/video_query npm run build ``` -2. Copy the build directory to your Apache document root: - ``` +2. **Deploy to web server**: + ```bash cp -r build/* /var/www/html/video-query/ ``` -3. Configure Apache to serve the React app, adding the following to your Apache configuration: - ``` - - ServerName yourdomain.com - DocumentRoot /var/www/html/video-query - - - AllowOverride All - Require all granted - - # Redirect all requests to index.html for React routing - RewriteEngine On - RewriteBase / - RewriteRule ^index\.html$ - [L] - RewriteCond %{REQUEST_FILENAME} !-f - RewriteCond %{REQUEST_FILENAME} !-d - RewriteRule . /index.html [L] - - - # Proxy API requests to the backend - ProxyPass /api http://localhost:5010/api - ProxyPassReverse /api http://localhost:5010/api - - ``` - -4. Restart Apache: - ``` - sudo systemctl restart apache2 - ``` - ## API Reference -The backend API exposes a single endpoint: +### Authentication Endpoints +- **GET /api/auth-test**: Verify authentication status -- **POST /api/process**: Processes an uploaded video with the specified prompt - - Form parameters: - - `video`: The video file - - `prompt`: The prompt text to process the video with - - Returns: - - Success: `{ "success": true, "content": "markdown content..." }` - - Error: `{ "success": false, "message": "error message..." }` +### Video Processing Endpoints +- **POST /api/process**: Main video processing endpoint + - Accepts both direct uploads and chunked upload references + - Form data: `video` file, `prompt` text + - JSON data: `file_path`, `filename`, `prompt` (for chunked uploads) + +### Chunked Upload Endpoints +- **POST /api/init-upload**: Initialize chunked upload session +- **POST /api/upload-chunk/**: Upload file chunk +- **POST /api/complete-upload/**: Mark upload complete +- **POST /api/cancel-upload/**: Cancel upload + +### PDF Generation Endpoints +- **POST /api/generate-pdf**: Generate PDF from HTML with Mermaid diagrams + - JSON data: `html`, `textDiagrams`, `svgDiagrams`, `diagramPngs` + +## Usage Analytics + +The application includes built-in usage tracking that sends data to a webhook endpoint for analytics purposes. This tracks: +- User email addresses +- Processing timestamps +- Prompts used +- Model information + +Log extraction utilities are provided in `extract_user_logs*.sh` scripts. + +## Configuration Files + +### Key Configuration Files +- **CLAUDE.md**: Development guidelines and build commands +- **.gitignore**: Comprehensive exclusion patterns +- **backend/requirements.txt**: Production Python dependencies +- **frontend/package.json**: Node.js dependencies and build scripts + +### Environment Variables +- `GOOGLE_API_KEY`: Required for Gemini API access +- Various Azure AD B2C configuration in frontend auth config + +## Development Utilities + +- **restart.sh**: Quick development environment restart +- **backend/test_*.py**: API testing and validation scripts +- **backend/run.py**: Production server with optimized settings for large uploads + +## Security Features + +- Azure AD B2C integration with JWT validation +- CORS protection with specific origin allowlisting +- Secure file upload validation +- Temporary file cleanup +- Token expiration handling ## License