updated readme

2025-09-18 14:31:35 -05:00 · ba0391bbc6
commit ba0391bbc6
parent 3008d8f8fc
1 changed file with 185 additions and 84 deletions
--- a/README.md
+++ b/README.md
@@ -1,142 +1,243 @@
 # Video Query Tool

-This application processes videos using Google's Gemini AI model, allowing users to:
+A full-stack web application that processes videos using Google's Gemini AI model, allowing users to upload videos and receive AI-generated content based on customizable prompts. The application features Azure AD B2C authentication, chunked file uploads for large videos, PDF generation with Mermaid diagram support, and comprehensive usage tracking.

-1. Upload videos (MP4, AVI, MOV, etc.)
-2. Choose from preset processing modes or use custom prompts
-3. Get AI-generated markdown content based on the video content
+## Features

-## Important Notes
+### Core Functionality
+- **Video Processing**: Upload and analyze videos using Google Gemini 2.5 Pro AI model
+- **Multiple Processing Modes**:
+  - Meeting Summary
+  - Process/Tool Documentation
+  - Process Documentation with Mermaid Charts
+  - Custom Prompts
+- **Large File Support**: Chunked upload system supporting files up to 5GB
+- **PDF Generation**: Convert results to PDF with embedded Mermaid diagrams
+- **Authentication**: Azure AD B2C integration with both popup and redirect flows

- **Video Length Limitation**: The Gemini AI model can only process videos up to 55 minutes in length.
- **File Size**: The application supports uploads up to 5GB.
+### Technical Features
+- **Drag & Drop Upload**: Modern file upload interface with progress tracking
+- **Real-time Processing**: Live status updates during video analysis
+- **Error Handling**: Comprehensive error handling and user feedback
+- **Usage Analytics**: Automated tracking via webhook integration
+- **Production Ready**: Systemd service configuration and deployment scripts
+
+## Limitations
+
+- **Video Length**: The Gemini model can only process videos up to 55 minutes long
+- **File Size**: Application supports uploads up to 5GB
+- **Supported Formats**: MP4, AVI, MOV, WMV, MKV, WEBM
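
The format and size limits above could be checked before an upload starts. The following is a minimal sketch, not code from this repository: `check_upload` is a hypothetical helper, and reading "5GB" as 5 GiB is an assumption. The 55-minute length limit is not covered, since it needs the video's duration rather than its name and size.

```python
from pathlib import Path
from typing import List

# Formats and size limit taken from the Limitations list.
ALLOWED_EXTENSIONS = {".mp4", ".avi", ".mov", ".wmv", ".mkv", ".webm"}
# Assumption: "5GB" is read as 5 GiB; the README does not say which unit is meant.
MAX_UPLOAD_BYTES = 5 * 1024**3


def check_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file exceeds the 5GB upload limit")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem with a file at once.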

 ## Project Structure

 ```
 video_query/
-├── backend/             # Flask/Hypercorn server
-│   ├── app.py           # Main Flask application
-│   ├── video_processor.py # Video processing logic
-│   └── run.py           # Hypercorn server script
-└── frontend/            # React frontend
-    ├── public/          # Static assets
-    └── src/             # React source code
+├── backend/                    # Flask/Hypercorn API server
+│   ├── app.py                 # Main Flask application with PDF generation
+│   ├── video_processor.py     # Gemini API integration and video processing
+│   ├── auth.py                # Azure AD B2C authentication handlers
+│   ├── chunked_upload.py      # Chunked file upload Blueprint
+│   ├── run.py                 # Hypercorn production server
+│   ├── requirements.txt       # Python dependencies
+│   └── test_*.py              # API testing utilities
+├── frontend/                   # React SPA
+│   ├── src/
+│   │   ├── components/        # React components
+│   │   │   ├── VideoUpload.js    # Drag & drop file upload
+│   │   │   ├── PromptSelector.js # Mode selection and prompt editing
+│   │   │   ├── ResultDisplay.js  # Results with PDF generation
+│   │   │   ├── AuthenticatedContent.js # Main application interface
+│   │   │   └── Login.js         # Authentication interface
+│   │   ├── auth/              # Authentication utilities
+│   │   │   ├── authConfig.js     # Azure AD B2C configuration
+│   │   │   ├── AuthProvider.js   # MSAL React provider
+│   │   │   └── authApiClient.js  # Authenticated API client
+│   │   └── utils/
+│   │       └── chunkedUploader.js # Large file upload handler
+│   ├── package.json           # Node.js dependencies
+│   └── build/                 # Production build output
+├── DEPLOYMENT.md              # Production deployment instructions
+├── LOG_EXTRACTION_README.md   # Usage analytics documentation
+├── restart.sh                 # Development restart script
+├── quick_extract.sh           # Log extraction utility
+├── extract_user_logs*.sh      # Advanced log processing
+└── requirements.txt           # Root Python dependencies (legacy)
 ```

+## Dependencies
+
+### Backend Dependencies
+- **Flask 3.1.0**: Web framework
+- **google-generativeai 0.8.5**: Gemini AI API client
+- **Hypercorn 0.17.3**: ASGI production server
+- **python-jose**: JWT token validation for Azure AD
+- **flask-cors 5.0.1**: Cross-origin resource sharing
+- **pdfkit 1.0.0**: PDF generation from HTML
+- **cairosvg 2.8.0**: SVG to PNG conversion for diagrams
+- **Pillow 11.2.1**: Image processing
+- **python-dotenv 1.1.0**: Environment variable management
+
+### Frontend Dependencies
+- **React 18.2.0**: UI framework
+- **@azure/msal-react 3.0.12**: Microsoft Authentication Library
+- **axios 1.6.0**: HTTP client
+- **bootstrap 5.3.2**: UI components and styling
+- **mermaid 11.6.0**: Diagram generation
+- **react-dropzone 14.2.3**: File upload interface
+- **showdown 2.1.0**: Markdown to HTML conversion
+
 ## Setup Instructions

+### Prerequisites
+- Python 3.8+
+- Node.js 16+
+- Google Cloud API key with Gemini access
+- Azure AD B2C tenant (for authentication)
+- wkhtmltopdf (for PDF generation)
+
 ### Backend Setup

-1. Create and activate a virtual environment:
-   ```
+1. **Create and activate virtual environment**:
+   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

-2. Install backend dependencies:
-   ```
-   pip install -r requirements.txt
+2. **Install dependencies**:
+   ```bash
+   pip install -r backend/requirements.txt
   ```

-3. Set your Google API key:
-   ```
-   export GOOGLE_API_KEY=your_api_key_here
+3. **Set up environment variables**:
+   ```bash
+   export GOOGLE_API_KEY="your_gemini_api_key_here"
   ```

-4. Run the development server:
+4. **Install system dependencies for PDF generation**:
+   ```bash
+   # Ubuntu/Debian:
+   sudo apt-get install wkhtmltopdf python3-cairo libcairo2-dev
+
+   # macOS:
+   brew install cairo wkhtmltopdf
   ```
+
+5. **Start development server**:
+   ```bash
   cd backend
-   python run.py
+   python run.py --host 0.0.0.0 --port 5010
   ```

 ### Frontend Setup

-1. Install Node.js dependencies:
-   ```
+1. **Install Node.js dependencies**:
+   ```bash
   cd frontend
   npm install
   ```

-2. Start the development server:
-   ```
+2. **Configure authentication** (edit `src/auth/authConfig.js`):
+   - Update Azure AD B2C tenant ID
+   - Update client ID
+   - Update redirect URIs
+
+3. **Start development server**:
+   ```bash
   npm start
   ```

-## Deployment
+## Production Deployment

-### Backend Deployment with Systemd
+### System Requirements
+- Ubuntu/CentOS server
+- Apache/Nginx web server
+- Python 3.8+ with virtual environment
+- wkhtmltopdf system package
+- Node.js for building frontend

-1. Update the systemd service file (`backend/video-query.service`):
-   - Update paths to match your server
-   - Add your GOOGLE_API_KEY
-   - Place in `/etc/systemd/system/`
+### Backend Deployment

-2. Enable and start the service:
+1. **Install system packages**:
+   ```bash
+   sudo apt-get update
+   sudo apt-get install -y wkhtmltopdf python3-cairo python3-pil libcairo2-dev
   ```
+
+2. **Create production service** (see `DEPLOYMENT.md` for systemd configuration):
+   ```bash
   sudo systemctl enable video-query
   sudo systemctl start video-query
   ```

-3. Check the service status:
-   ```
-   sudo systemctl status video-query
-   ```
+### Frontend Deployment

-### Frontend Deployment with Apache
-
-1. Build the React frontend:
-   ```
+1. **Build for production**:
+   ```bash
   cd frontend
-   npm run build
+   PUBLIC_URL=/video_query npm run build
   ```

-2. Copy the build directory to your Apache document root:
-   ```
+2. **Deploy to web server**:
+   ```bash
   cp -r build/* /var/www/html/video-query/
   ```

-3. Configure Apache to serve the React app, adding the following to your Apache configuration:
-   ```
-   <VirtualHost *:80>
-     ServerName yourdomain.com
-     DocumentRoot /var/www/html/video-query
-     
-     <Directory "/var/www/html/video-query">
-       AllowOverride All
-       Require all granted
-       
-       # Redirect all requests to index.html for React routing
-       RewriteEngine On
-       RewriteBase /
-       RewriteRule ^index\.html$ - [L]
-       RewriteCond %{REQUEST_FILENAME} !-f
-       RewriteCond %{REQUEST_FILENAME} !-d
-       RewriteRule . /index.html [L]
-     </Directory>
-     
-     # Proxy API requests to the backend
-     ProxyPass /api http://localhost:5010/api
-     ProxyPassReverse /api http://localhost:5010/api
-   </VirtualHost>
-   ```
-
-4. Restart Apache:
-   ```
-   sudo systemctl restart apache2
-   ```
-
 ## API Reference

-The backend API exposes a single endpoint:
+### Authentication Endpoints
+- **GET /api/auth-test**: Verify authentication status

- **POST /api/process**: Processes an uploaded video with the specified prompt
-  - Form parameters:
-    - `video`: The video file
-    - `prompt`: The prompt text to process the video with
-  - Returns:
-    - Success: `{ "success": true, "content": "markdown content..." }`
-    - Error: `{ "success": false, "message": "error message..." }`
+### Video Processing Endpoints
+- **POST /api/process**: Main video processing endpoint
+  - Accepts both direct uploads and chunked upload references
+  - Form data: `video` file, `prompt` text
+  - JSON data: `file_path`, `filename`, `prompt` (for chunked uploads)
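
For the JSON variant of `/api/process`, a request might be assembled as in the sketch below. The field names come from the list above; the base URL reuses port 5010 from the setup instructions, and the bearer-token `Authorization` header is an assumption based on the JWT validation mentioned elsewhere in this README. `build_process_request` is a hypothetical helper, and the request is only built, not sent.

```python
import json
import urllib.request

# Assumption: the backend listens locally on port 5010, as in the setup steps.
BASE_URL = "http://localhost:5010"


def build_process_request(file_path, filename, prompt, token):
    """Build (but do not send) a JSON request for a previously uploaded file."""
    payload = {"file_path": file_path, "filename": filename, "prompt": prompt}
    return urllib.request.Request(
        f"{BASE_URL}/api/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the API expects a bearer token in this header.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the response format is not described in this section, so it is left out here.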
+
+### Chunked Upload Endpoints
+- **POST /api/init-upload**: Initialize chunked upload session
+- **POST /api/upload-chunk/<upload_id>**: Upload file chunk
+- **POST /api/complete-upload/<upload_id>**: Mark upload complete
+- **POST /api/cancel-upload/<upload_id>**: Cancel upload
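
The request and response bodies of these endpoints are not documented here, so the sketch below only illustrates the order of calls and how a client might split a file into byte ranges. `chunk_ranges` and `upload_plan` are hypothetical helpers, the chunk size is arbitrary, and it is an assumption that `upload_id` is obtained from the `init-upload` response.

```python
from typing import Iterator, List, Tuple

API_PREFIX = "/api"  # matches the endpoint paths listed in this section


def chunk_ranges(total_size: int, chunk_size: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (index, start, end) byte ranges; end is exclusive."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for index, start in enumerate(range(0, total_size, chunk_size)):
        yield index, start, min(start + chunk_size, total_size)


def upload_plan(upload_id: str, total_size: int, chunk_size: int) -> List[Tuple[str, str]]:
    """List the (method, path) calls a client would make, in order."""
    plan = [("POST", f"{API_PREFIX}/init-upload")]
    for _ in chunk_ranges(total_size, chunk_size):
        plan.append(("POST", f"{API_PREFIX}/upload-chunk/{upload_id}"))
    plan.append(("POST", f"{API_PREFIX}/complete-upload/{upload_id}"))
    return plan
```

On failure, a client would call `cancel-upload` with the same `upload_id` instead of `complete-upload`.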
+
+### PDF Generation Endpoints
+- **POST /api/generate-pdf**: Generate PDF from HTML with Mermaid diagrams
+  - JSON data: `html`, `textDiagrams`, `svgDiagrams`, `diagramPngs`
+
+## Usage Analytics
+
+The application includes built-in usage tracking that sends data to a webhook endpoint for analytics purposes. This tracks:
+- User email addresses
+- Processing timestamps
+- Prompts used
+- Model information
+
+Log extraction utilities are provided in `extract_user_logs*.sh` scripts.
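
A usage event carrying the four tracked fields might look like the sketch below. The JSON key names and the `build_usage_event` helper are assumptions for illustration; the actual webhook payload format is not shown in this README.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def build_usage_event(email: str, prompt: str, model: str,
                      when: Optional[datetime] = None) -> str:
    """Serialize one usage event as JSON; key names are hypothetical."""
    when = when or datetime.now(timezone.utc)
    event = {
        "email": email,
        "timestamp": when.isoformat(),  # ISO 8601 with UTC offset
        "prompt": prompt,
        "model": model,
    }
    return json.dumps(event)
```

Accepting `when` as a parameter keeps the function deterministic under test while defaulting to the current UTC time in normal use.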
+
+## Configuration Files
+
+### Key Configuration Files
+- **CLAUDE.md**: Development guidelines and build commands
+- **.gitignore**: Comprehensive exclusion patterns
+- **backend/requirements.txt**: Production Python dependencies
+- **frontend/package.json**: Node.js dependencies and build scripts
+
+### Environment Variables
+- `GOOGLE_API_KEY`: Required for Gemini API access
+- Azure AD B2C settings (tenant ID, client ID, redirect URIs) are configured in `frontend/src/auth/authConfig.js` rather than through environment variables
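
Because `GOOGLE_API_KEY` is required, a backend could fail fast at startup when it is missing. This is a minimal sketch under that assumption; `require_google_api_key` is a hypothetical helper and not necessarily how `app.py` or `video_processor.py` reads the key.

```python
import os
from typing import Mapping


def require_google_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key, or raise if it is missing or blank."""
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GOOGLE_API_KEY is not set; export it before starting the backend"
        )
    return key
```

Taking the environment as a parameter makes the check easy to exercise with a plain dictionary.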
+
+## Development Utilities
+
+- **restart.sh**: Quick development environment restart
+- **backend/test_*.py**: API testing and validation scripts
+- **backend/run.py**: Production server with optimized settings for large uploads
+
+## Security Features
+
+- Azure AD B2C integration with JWT validation
+- CORS protection with specific origin allowlisting
+- Secure file upload validation
+- Temporary file cleanup
+- Token expiration handling

 ## License