diff --git a/README.md b/README.md
index 61311a2..6ef94d8 100644
--- a/README.md
+++ b/README.md
@@ -1,142 +1,243 @@
# Video Query Tool
-This application processes videos using Google's Gemini AI model, allowing users to:
+A full-stack web application that processes videos using Google's Gemini AI model, allowing users to upload videos and receive AI-generated content based on customizable prompts. The application features Azure AD B2C authentication, chunked file uploads for large videos, PDF generation with Mermaid diagram support, and comprehensive usage tracking.
-1. Upload videos (MP4, AVI, MOV, etc.)
-2. Choose from preset processing modes or use custom prompts
-3. Get AI-generated markdown content based on the video content
+## Features
-## Important Notes
+### Core Functionality
+- **Video Processing**: Upload and analyze videos using Google's Gemini 2.5 Pro model
+- **Multiple Processing Modes**:
+ - Meeting Summary
+ - Process/Tool Documentation
+ - Process Documentation with Mermaid Charts
+ - Custom Prompts
+- **Large File Support**: Chunked upload system supporting files up to 5GB
+- **PDF Generation**: Convert results to PDF with embedded Mermaid diagrams
+- **Authentication**: Azure AD B2C integration with both popup and redirect flows
-- **Video Length Limitation**: The Gemini AI model can only process videos up to 55 minutes in length.
-- **File Size**: The application supports uploads up to 5GB.
+### Technical Features
+- **Drag & Drop Upload**: Modern file upload interface with progress tracking
+- **Real-time Processing**: Live status updates during video analysis
+- **Error Handling**: Comprehensive error handling and user feedback
+- **Usage Analytics**: Automated tracking via webhook integration
+- **Production Ready**: Systemd service configuration and deployment scripts
+
+## Limitations
+
+- **Video Length**: The Gemini model can process videos of at most 55 minutes
+- **File Size**: The application accepts uploads of up to 5GB
+- **Supported Formats**: MP4, AVI, MOV, WMV, MKV, WEBM
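
The format and size limits above can be checked before an upload starts. The following is a minimal sketch, not the application's own validation code: `check_upload`, `SUPPORTED_EXTENSIONS`, and `MAX_UPLOAD_BYTES` are hypothetical names, and reading "5GB" as 5 GiB is an assumption. The 55-minute length limit is not covered, because checking it requires reading the video's metadata.

```python
from pathlib import Path
from typing import List

# Formats taken from the Limitations list above.
SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".wmv", ".mkv", ".webm"}
# Assumption: "5GB" is read as 5 GiB; adjust if the server uses decimal gigabytes.
MAX_UPLOAD_BYTES = 5 * 1024**3


def check_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file is larger than {MAX_UPLOAD_BYTES} bytes")
    return problems


print(check_upload("demo.MP4", 1024))
print(check_upload("notes.txt", 10))
```

Returning a list rather than raising on the first failure lets a caller report every problem with a file at once.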
## Project Structure
```
video_query/
-├── backend/ # Flask/Hypercorn server
-│ ├── app.py # Main Flask application
-│ ├── video_processor.py # Video processing logic
-│ └── run.py # Hypercorn server script
-└── frontend/ # React frontend
- ├── public/ # Static assets
- └── src/ # React source code
+├── backend/ # Flask/Hypercorn API server
+│ ├── app.py # Main Flask application with PDF generation
+│ ├── video_processor.py # Gemini API integration and video processing
+│ ├── auth.py # Azure AD B2C authentication handlers
+│ ├── chunked_upload.py # Chunked file upload Blueprint
+│ ├── run.py # Hypercorn production server
+│ ├── requirements.txt # Python dependencies
+│ └── test_*.py # API testing utilities
+├── frontend/ # React SPA
+│ ├── src/
+│ │ ├── components/ # React components
+│ │ │ ├── VideoUpload.js # Drag & drop file upload
+│ │ │ ├── PromptSelector.js # Mode selection and prompt editing
+│ │ │ ├── ResultDisplay.js # Results with PDF generation
+│ │ │ ├── AuthenticatedContent.js # Main application interface
+│ │ │ └── Login.js # Authentication interface
+│ │ ├── auth/ # Authentication utilities
+│ │ │ ├── authConfig.js # Azure AD B2C configuration
+│ │ │ ├── AuthProvider.js # MSAL React provider
+│ │ │ └── authApiClient.js # Authenticated API client
+│ │ └── utils/
+│ │ └── chunkedUploader.js # Large file upload handler
+│ ├── package.json # Node.js dependencies
+│ └── build/ # Production build output
+├── DEPLOYMENT.md # Production deployment instructions
+├── LOG_EXTRACTION_README.md # Usage analytics documentation
+├── restart.sh # Development restart script
+├── quick_extract.sh # Log extraction utility
+├── extract_user_logs*.sh # Advanced log processing
+└── requirements.txt # Root Python dependencies (legacy)
```
+## Dependencies
+
+### Backend Dependencies
+- **Flask 3.1.0**: Web framework
+- **google-generativeai 0.8.5**: Gemini AI API client
+- **Hypercorn 0.17.3**: ASGI production server
+- **python-jose**: JWT token validation for Azure AD
+- **flask-cors 5.0.1**: Cross-origin resource sharing
+- **pdfkit 1.0.0**: PDF generation from HTML
+- **cairosvg 2.8.0**: SVG to PNG conversion for diagrams
+- **Pillow 11.2.1**: Image processing
+- **python-dotenv 1.1.0**: Environment variable management
+
+### Frontend Dependencies
+- **React 18.2.0**: UI framework
+- **@azure/msal-react 3.0.12**: Microsoft Authentication Library
+- **axios 1.6.0**: HTTP client
+- **bootstrap 5.3.2**: UI components and styling
+- **mermaid 11.6.0**: Diagram generation
+- **react-dropzone 14.2.3**: File upload interface
+- **showdown 2.1.0**: Markdown to HTML conversion
+
## Setup Instructions
+### Prerequisites
+- Python 3.8+
+- Node.js 16+
+- Google Cloud API key with Gemini access
+- Azure AD B2C tenant (for authentication)
+- wkhtmltopdf (for PDF generation)
+
### Backend Setup
-1. Create and activate a virtual environment:
- ```
+1. **Create and activate virtual environment**:
+ ```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
-2. Install backend dependencies:
- ```
- pip install -r requirements.txt
+2. **Install dependencies**:
+ ```bash
+ pip install -r backend/requirements.txt
```
-3. Set your Google API key:
- ```
- export GOOGLE_API_KEY=your_api_key_here
+3. **Set up environment variables**:
+ ```bash
+ export GOOGLE_API_KEY="your_gemini_api_key_here"
```
-4. Run the development server:
+4. **Install system dependencies for PDF generation**:
+ ```bash
+ # Ubuntu/Debian:
+ sudo apt-get install wkhtmltopdf python3-cairo libcairo2-dev
+
+ # macOS:
+ brew install cairo wkhtmltopdf
```
+
+5. **Start development server**:
+ ```bash
cd backend
- python run.py
+ python run.py --host 0.0.0.0 --port 5010
```
### Frontend Setup
-1. Install Node.js dependencies:
- ```
+1. **Install Node.js dependencies**:
+ ```bash
cd frontend
npm install
```
-2. Start the development server:
- ```
+2. **Configure authentication** (edit `src/auth/authConfig.js`):
+ - Update Azure AD B2C tenant ID
+ - Update client ID
+ - Update redirect URIs
+
+3. **Start development server**:
+ ```bash
npm start
```
-## Deployment
+## Production Deployment
-### Backend Deployment with Systemd
+### System Requirements
+- Ubuntu/CentOS server
+- Apache/Nginx web server
+- Python 3.8+ with virtual environment
+- wkhtmltopdf system package
+- Node.js for building frontend
-1. Update the systemd service file (`backend/video-query.service`):
- - Update paths to match your server
- - Add your GOOGLE_API_KEY
- - Place in `/etc/systemd/system/`
+### Backend Deployment
-2. Enable and start the service:
+1. **Install system packages**:
+ ```bash
+ sudo apt-get update
+ sudo apt-get install -y wkhtmltopdf python3-cairo python3-pil libcairo2-dev
```
+
+2. **Enable and start the production service** (see `DEPLOYMENT.md` for the systemd configuration):
+ ```bash
sudo systemctl enable video-query
sudo systemctl start video-query
```
-3. Check the service status:
- ```
- sudo systemctl status video-query
- ```
+### Frontend Deployment
-### Frontend Deployment with Apache
-
-1. Build the React frontend:
- ```
+1. **Build for production**:
+ ```bash
cd frontend
- npm run build
+ PUBLIC_URL=/video_query npm run build
```
-2. Copy the build directory to your Apache document root:
- ```
+2. **Deploy to web server**:
+ ```bash
cp -r build/* /var/www/html/video-query/
```
-3. Configure Apache to serve the React app, adding the following to your Apache configuration:
- ```
-
- ServerName yourdomain.com
- DocumentRoot /var/www/html/video-query
-
-
- AllowOverride All
- Require all granted
-
- # Redirect all requests to index.html for React routing
- RewriteEngine On
- RewriteBase /
- RewriteRule ^index\.html$ - [L]
- RewriteCond %{REQUEST_FILENAME} !-f
- RewriteCond %{REQUEST_FILENAME} !-d
- RewriteRule . /index.html [L]
-
-
- # Proxy API requests to the backend
- ProxyPass /api http://localhost:5010/api
- ProxyPassReverse /api http://localhost:5010/api
-
- ```
-
-4. Restart Apache:
- ```
- sudo systemctl restart apache2
- ```
-
## API Reference
-The backend API exposes a single endpoint:
+### Authentication Endpoints
+- **GET /api/auth-test**: Verify authentication status
-- **POST /api/process**: Processes an uploaded video with the specified prompt
- - Form parameters:
- - `video`: The video file
- - `prompt`: The prompt text to process the video with
- - Returns:
- - Success: `{ "success": true, "content": "markdown content..." }`
- - Error: `{ "success": false, "message": "error message..." }`
+### Video Processing Endpoints
+- **POST /api/process**: Main video processing endpoint
+ - Accepts both direct uploads and chunked upload references
+ - Form data: `video` file, `prompt` text
+ - JSON data: `file_path`, `filename`, `prompt` (for chunked uploads)
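
As an illustration of the JSON variant of `POST /api/process`, the sketch below builds the request without sending it, using only the standard library. Several details are assumptions rather than documented behavior: the base URL (port 5010 is taken from the development command in Backend Setup), the bearer-token header, the example `file_path`, and the helper name `build_chunked_process_request`.

```python
import json
import urllib.request

# Assumption: the backend is reachable at this address.
API_BASE = "http://localhost:5010"


def build_chunked_process_request(file_path, filename, prompt, token=None):
    """Build (but do not send) the JSON variant of POST /api/process."""
    body = json.dumps(
        {"file_path": file_path, "filename": filename, "prompt": prompt}
    ).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the API expects the Azure AD B2C token as a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{API_BASE}/api/process", data=body, headers=headers, method="POST"
    )


request = build_chunked_process_request(
    file_path="/tmp/uploads/demo.mp4",  # hypothetical server-side path
    filename="demo.mp4",
    prompt="Summarize this meeting.",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the resulting object to `urllib.request.urlopen` would send it. The form-data variant carries the `video` file and `prompt` text as multipart fields instead of a JSON body.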
+
+### Chunked Upload Endpoints
+- **POST /api/init-upload**: Initialize chunked upload session
+- **POST /api/upload-chunk/**: Upload file chunk
+- **POST /api/complete-upload/**: Mark upload complete
+- **POST /api/cancel-upload/**: Cancel upload
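
The request and response shapes of these endpoints are not described here, so the sketch below covers only the client-side step that any chunked upload needs: splitting a file into fixed-size pieces. `iter_chunks` is a hypothetical helper and the 5 MiB chunk size is an assumption, not a value taken from `chunkedUploader.js` or `chunked_upload.py`.

```python
import io
from typing import BinaryIO, Iterator, Tuple

# Assumption: 5 MiB per chunk; the real chunk size is not stated in this README.
CHUNK_SIZE = 5 * 1024 * 1024


def iter_chunks(
    stream: BinaryIO, chunk_size: int = CHUNK_SIZE
) -> Iterator[Tuple[int, bytes]]:
    """Yield (chunk_index, chunk_bytes) pairs until the stream is exhausted."""
    index = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield index, chunk
        index += 1


# Small in-memory stand-in for a video file: 10 bytes split into 4-byte chunks.
demo = io.BytesIO(b"0123456789")
for index, chunk in iter_chunks(demo, chunk_size=4):
    print(index, len(chunk))
```

Reading one chunk at a time keeps memory use bounded by the chunk size, which matters for files approaching the 5GB limit.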
+
+### PDF Generation Endpoints
+- **POST /api/generate-pdf**: Generate PDF from HTML with Mermaid diagrams
+ - JSON data: `html`, `textDiagrams`, `svgDiagrams`, `diagramPngs`
+
+## Usage Analytics
+
+The application sends usage data to a webhook endpoint for analytics. The tracked data includes:
+- User email addresses
+- Processing timestamps
+- Prompts used
+- Model information
+
+Log extraction utilities are provided in `extract_user_logs*.sh` scripts.
+
+## Configuration Files
+
+### Key Configuration Files
+- **CLAUDE.md**: Development guidelines and build commands
+- **.gitignore**: Comprehensive exclusion patterns
+- **backend/requirements.txt**: Production Python dependencies
+- **frontend/package.json**: Node.js dependencies and build scripts
+
+### Environment Variables
+- `GOOGLE_API_KEY`: Required for Gemini API access
+- Azure AD B2C settings (tenant ID, client ID, redirect URIs) are not environment variables; they are set in `frontend/src/auth/authConfig.js`
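
A missing `GOOGLE_API_KEY` is easier to diagnose at startup than on the first Gemini request. The following is a minimal sketch of a fail-fast check. `require_env` is a hypothetical helper, not a function from the backend, and the error message is illustrative.

```python
import os
from typing import Mapping


def require_env(name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Return the value of a required environment variable or raise a clear error."""
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the backend")
    return value


# Demonstration with an explicit mapping instead of the real environment.
print(require_env("GOOGLE_API_KEY", {"GOOGLE_API_KEY": "example-key"}))
```

In a startup script, the call would be `require_env("GOOGLE_API_KEY")`, which reads from the real process environment.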
+
+## Development Utilities
+
+- **restart.sh**: Quick development environment restart
+- **backend/test_*.py**: API testing and validation scripts
+- **backend/run.py**: Production server with optimized settings for large uploads
+
+## Security Features
+
+- Azure AD B2C integration with JWT validation
+- CORS protection with specific origin allowlisting
+- Secure file upload validation
+- Temporary file cleanup
+- Token expiration handling
## License