updated readme
parent 3008d8f8fc
commit ba0391bbc6
1 changed files with 185 additions and 84 deletions
README.md
@@ -1,142 +1,243 @@
# Video Query Tool

This application processes videos using Google's Gemini AI model, allowing users to:
A full-stack web application that processes videos using Google's Gemini AI model, allowing users to upload videos and receive AI-generated content based on customizable prompts. The application features Azure AD B2C authentication, chunked file uploads for large videos, PDF generation with Mermaid diagram support, and comprehensive usage tracking.

1. Upload videos (MP4, AVI, MOV, etc.)
2. Choose from preset processing modes or use custom prompts
3. Get AI-generated markdown content based on the video content
## Features

## Important Notes
### Core Functionality
- **Video Processing**: Upload and analyze videos using Google Gemini 2.5 Pro AI model
- **Multiple Processing Modes**:
  - Meeting Summary
  - Process/Tool Documentation
  - Process Documentation with Mermaid Charts
  - Custom Prompts
- **Large File Support**: Chunked upload system supporting files up to 5GB
- **PDF Generation**: Convert results to PDF with embedded Mermaid diagrams
- **Authentication**: Azure AD B2C integration with both popup and redirect flows

- **Video Length Limitation**: The Gemini AI model can only process videos up to 55 minutes in length.
- **File Size**: The application supports uploads up to 5GB.
### Technical Features
- **Drag & Drop Upload**: Modern file upload interface with progress tracking
- **Real-time Processing**: Live status updates during video analysis
- **Error Handling**: Comprehensive error handling and user feedback
- **Usage Analytics**: Automated tracking via webhook integration
- **Production Ready**: Systemd service configuration and deployment scripts

## Limitations

- **Video Length**: Gemini AI processes videos up to 55 minutes maximum
- **File Size**: Application supports uploads up to 5GB
- **Supported Formats**: MP4, AVI, MOV, WMV, MKV, WEBM
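
The limits listed above can be checked on the client or server before an upload starts. The following is a minimal sketch, not the application's actual validation code: the function name `validate_upload` is hypothetical, and reading "5GB" as 5 × 1024³ bytes is an assumption.

```python
from pathlib import Path

# Limits taken from the list above; treating "5GB" as 5 GiB is an assumption.
SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".wmv", ".mkv", ".webm"}
MAX_FILE_SIZE_BYTES = 5 * 1024**3
MAX_DURATION_MINUTES = 55


def validate_upload(filename, size_bytes, duration_minutes=None):
    """Return a list of problems; an empty list means the file passes."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file exceeds the 5GB upload limit")
    # Duration is optional because it may not be known before the file is probed.
    if duration_minutes is not None and duration_minutes > MAX_DURATION_MINUTES:
        problems.append("video exceeds the 55 minute processing limit")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem to the user at once.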

## Project Structure

```
video_query/
├── backend/                    # Flask/Hypercorn server
│   ├── app.py                  # Main Flask application
│   ├── video_processor.py      # Video processing logic
│   └── run.py                  # Hypercorn server script
└── frontend/                   # React frontend
    ├── public/                 # Static assets
    └── src/                    # React source code
├── backend/                    # Flask/Hypercorn API server
│   ├── app.py                  # Main Flask application with PDF generation
│   ├── video_processor.py      # Gemini API integration and video processing
│   ├── auth.py                 # Azure AD B2C authentication handlers
│   ├── chunked_upload.py       # Chunked file upload Blueprint
│   ├── run.py                  # Hypercorn production server
│   ├── requirements.txt        # Python dependencies
│   └── test_*.py               # API testing utilities
├── frontend/                   # React SPA
│   ├── src/
│   │   ├── components/         # React components
│   │   │   ├── VideoUpload.js  # Drag & drop file upload
│   │   │   ├── PromptSelector.js # Mode selection and prompt editing
│   │   │   ├── ResultDisplay.js # Results with PDF generation
│   │   │   ├── AuthenticatedContent.js # Main application interface
│   │   │   └── Login.js        # Authentication interface
│   │   ├── auth/               # Authentication utilities
│   │   │   ├── authConfig.js   # Azure AD B2C configuration
│   │   │   ├── AuthProvider.js # MSAL React provider
│   │   │   └── authApiClient.js # Authenticated API client
│   │   └── utils/
│   │       └── chunkedUploader.js # Large file upload handler
│   ├── package.json            # Node.js dependencies
│   └── build/                  # Production build output
├── DEPLOYMENT.md               # Production deployment instructions
├── LOG_EXTRACTION_README.md    # Usage analytics documentation
├── restart.sh                  # Development restart script
├── quick_extract.sh            # Log extraction utility
├── extract_user_logs*.sh       # Advanced log processing
└── requirements.txt            # Root Python dependencies (legacy)
```

## Dependencies

### Backend Dependencies
- **Flask 3.1.0**: Web framework
- **google-generativeai 0.8.5**: Gemini AI API client
- **Hypercorn 0.17.3**: ASGI production server
- **python-jose**: JWT token validation for Azure AD
- **flask-cors 5.0.1**: Cross-origin resource sharing
- **pdfkit 1.0.0**: PDF generation from HTML
- **cairosvg 2.8.0**: SVG to PNG conversion for diagrams
- **Pillow 11.2.1**: Image processing
- **python-dotenv 1.1.0**: Environment variable management

### Frontend Dependencies
- **React 18.2.0**: UI framework
- **@azure/msal-react 3.0.12**: Microsoft Authentication Library
- **axios 1.6.0**: HTTP client
- **bootstrap 5.3.2**: UI components and styling
- **mermaid 11.6.0**: Diagram generation
- **react-dropzone 14.2.3**: File upload interface
- **showdown 2.1.0**: Markdown to HTML conversion

## Setup Instructions

### Prerequisites
- Python 3.8+
- Node.js 16+
- Google Cloud API key with Gemini access
- Azure AD B2C tenant (for authentication)
- wkhtmltopdf (for PDF generation)

### Backend Setup

1. Create and activate a virtual environment:
```
1. **Create and activate virtual environment**:
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```

2. Install backend dependencies:
```
pip install -r requirements.txt
2. **Install dependencies**:
```bash
pip install -r backend/requirements.txt
```

3. Set your Google API key:
```
export GOOGLE_API_KEY=your_api_key_here
3. **Set up environment variables**:
```bash
export GOOGLE_API_KEY="your_gemini_api_key_here"
```

4. Run the development server:
4. **Install system dependencies for PDF generation**:
```bash
# Ubuntu/Debian:
sudo apt-get install wkhtmltopdf python3-cairo libcairo2-dev

# macOS:
brew install cairo wkhtmltopdf
```

5. **Start development server**:
```bash
cd backend
python run.py
python run.py --host 0.0.0.0 --port 5010
```

### Frontend Setup

1. Install Node.js dependencies:
```
1. **Install Node.js dependencies**:
```bash
cd frontend
npm install
```

2. Start the development server:
```
2. **Configure authentication** (edit `src/auth/authConfig.js`):
   - Update Azure AD B2C tenant ID
   - Update client ID
   - Update redirect URIs

3. **Start development server**:
```bash
npm start
```

## Deployment
## Production Deployment

### Backend Deployment with Systemd
### System Requirements
- Ubuntu/CentOS server
- Apache/Nginx web server
- Python 3.8+ with virtual environment
- wkhtmltopdf system package
- Node.js for building frontend

1. Update the systemd service file (`backend/video-query.service`):
   - Update paths to match your server
   - Add your GOOGLE_API_KEY
   - Place in `/etc/systemd/system/`
### Backend Deployment

2. Enable and start the service:
1. **Install system packages**:
```bash
sudo apt-get update
sudo apt-get install -y wkhtmltopdf python3-cairo python3-pil libcairo2-dev
```

2. **Create production service** (see `DEPLOYMENT.md` for systemd configuration):
```bash
sudo systemctl enable video-query
sudo systemctl start video-query
```

3. Check the service status:
```
sudo systemctl status video-query
```
### Frontend Deployment

### Frontend Deployment with Apache

1. Build the React frontend:
```
1. **Build for production**:
```bash
cd frontend
npm run build
PUBLIC_URL=/video_query npm run build
```

2. Copy the build directory to your Apache document root:
```
2. **Deploy to web server**:
```bash
cp -r build/* /var/www/html/video-query/
```

3. Configure Apache to serve the React app, adding the following to your Apache configuration:
```
<VirtualHost *:80>
    ServerName yourdomain.com
    DocumentRoot /var/www/html/video-query

    <Directory "/var/www/html/video-query">
        AllowOverride All
        Require all granted

        # Redirect all requests to index.html for React routing
        RewriteEngine On
        RewriteBase /
        RewriteRule ^index\.html$ - [L]
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_FILENAME} !-d
        RewriteRule . /index.html [L]
    </Directory>

    # Proxy API requests to the backend
    ProxyPass /api http://localhost:5010/api
    ProxyPassReverse /api http://localhost:5010/api
</VirtualHost>
```

4. Restart Apache:
```
sudo systemctl restart apache2
```

## API Reference

The backend API exposes a single endpoint:
### Authentication Endpoints
- **GET /api/auth-test**: Verify authentication status

- **POST /api/process**: Processes an uploaded video with the specified prompt
  - Form parameters:
    - `video`: The video file
    - `prompt`: The prompt text to process the video with
  - Returns:
    - Success: `{ "success": true, "content": "markdown content..." }`
    - Error: `{ "success": false, "message": "error message..." }`
### Video Processing Endpoints
- **POST /api/process**: Main video processing endpoint
  - Accepts both direct uploads and chunked upload references
  - Form data: `video` file, `prompt` text
  - JSON data: `file_path`, `filename`, `prompt` (for chunked uploads)
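
For the chunked-upload variant of `POST /api/process`, the JSON body carries the three fields named above. The sketch below builds such a request with the standard library and does not send it. The helper name `build_chunked_process_request`, the base URL (taken from the port used elsewhere in this README), and the `Authorization: Bearer` header are assumptions, not confirmed details of the backend.

```python
import json
from urllib.request import Request

# Assumed base URL; port 5010 is the one used in the run and proxy examples.
API_BASE = "http://localhost:5010"


def build_chunked_process_request(file_path, filename, prompt, access_token=None):
    """Build (but do not send) a JSON request that references a chunked upload."""
    body = json.dumps(
        {"file_path": file_path, "filename": filename, "prompt": prompt}
    ).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if access_token:
        # Assumption: the API expects an Azure AD B2C token as a bearer token.
        headers["Authorization"] = f"Bearer {access_token}"
    return Request(f"{API_BASE}/api/process", data=body, headers=headers, method="POST")
```

Passing the returned object to `urllib.request.urlopen` would send it; where the `file_path` value comes from (presumably the upload completion step) is not specified here.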

### Chunked Upload Endpoints
- **POST /api/init-upload**: Initialize chunked upload session
- **POST /api/upload-chunk/<upload_id>**: Upload file chunk
- **POST /api/complete-upload/<upload_id>**: Mark upload complete
- **POST /api/cancel-upload/<upload_id>**: Cancel upload
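
The endpoints above imply a three-phase flow: initialize, upload each chunk, then complete. A minimal sketch of the client-side bookkeeping follows. The 5 MiB chunk size and the helper names `plan_chunks` and `upload_steps` are assumptions; the real chunk size and request fields are defined by `chunked_upload.py` and `chunkedUploader.js`, which are not shown here.

```python
import math

# Assumed chunk size; the value the application uses is not stated in this README.
CHUNK_SIZE = 5 * 1024 * 1024


def plan_chunks(total_size, chunk_size=CHUNK_SIZE):
    """Yield (index, start, end) byte ranges covering a file of total_size bytes."""
    for index in range(math.ceil(total_size / chunk_size)):
        start = index * chunk_size
        yield index, start, min(start + chunk_size, total_size)


def upload_steps(upload_id, total_size, chunk_size=CHUNK_SIZE):
    """List the endpoint calls, in order, for one successful upload session."""
    steps = ["POST /api/init-upload"]
    steps += [
        f"POST /api/upload-chunk/{upload_id}"
        for _ in plan_chunks(total_size, chunk_size)
    ]
    steps.append(f"POST /api/complete-upload/{upload_id}")
    return steps
```

On failure, a client would call `POST /api/cancel-upload/<upload_id>` instead of the completion step.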

### PDF Generation Endpoints
- **POST /api/generate-pdf**: Generate PDF from HTML with Mermaid diagrams
  - JSON data: `html`, `textDiagrams`, `svgDiagrams`, `diagramPngs`
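
A request body for `POST /api/generate-pdf` could be assembled as in the sketch below. Only the four key names come from this README; treating the three diagram fields as lists of strings is an assumption, and `build_pdf_payload` is a hypothetical helper.

```python
import json


def build_pdf_payload(html, text_diagrams=(), svg_diagrams=(), diagram_pngs=()):
    """Serialize the four documented fields into a JSON request body."""
    return json.dumps(
        {
            "html": html,
            # Assumption: each diagram field is a list of strings.
            "textDiagrams": list(text_diagrams),
            "svgDiagrams": list(svg_diagrams),
            "diagramPngs": list(diagram_pngs),
        }
    )
```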

## Usage Analytics

The application includes built-in usage tracking that sends data to a webhook endpoint for analytics purposes. This tracks:
- User email addresses
- Processing timestamps
- Prompts used
- Model information

Log extraction utilities are provided in `extract_user_logs*.sh` scripts.
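
The four tracked fields in this section map naturally onto a small event record. The sketch below shows one possible shape; the key names, the ISO 8601 timestamp format, and the helper `build_usage_event` are assumptions, since the webhook's actual schema is not documented here.

```python
from datetime import datetime, timezone


def build_usage_event(email, prompt, model, when=None):
    """Return a dict with the four tracked fields, ready to serialize as JSON."""
    # Default to the current UTC time when no timestamp is supplied.
    timestamp = (when or datetime.now(timezone.utc)).isoformat()
    return {
        "email": email,
        "timestamp": timestamp,
        "prompt": prompt,
        "model": model,
    }
```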

## Configuration Files

### Key Configuration Files
- **CLAUDE.md**: Development guidelines and build commands
- **.gitignore**: Comprehensive exclusion patterns
- **backend/requirements.txt**: Production Python dependencies
- **frontend/package.json**: Node.js dependencies and build scripts

### Environment Variables
- `GOOGLE_API_KEY`: Required for Gemini API access
- Various Azure AD B2C configuration in frontend auth config
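
Because `GOOGLE_API_KEY` is required, it helps to fail fast at startup when it is missing. The following is a minimal sketch of such a check, not the backend's actual code; `require_api_key` is a hypothetical helper.

```python
import os


def require_api_key(environ=os.environ):
    """Return the Gemini API key, or raise if it is unset or blank."""
    api_key = environ.get("GOOGLE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "GOOGLE_API_KEY is not set; export it before starting the backend"
        )
    return api_key
```

Accepting the environment as a parameter keeps the check easy to test without touching the real process environment.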

## Development Utilities

- **restart.sh**: Quick development environment restart
- **backend/test_*.py**: API testing and validation scripts
- **backend/run.py**: Production server with optimized settings for large uploads

## Security Features

- Azure AD B2C integration with JWT validation
- CORS protection with specific origin allowlisting
- Secure file upload validation
- Temporary file cleanup
- Token expiration handling
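
As an illustration of token expiration handling, the sketch below reads the `exp` claim from a JWT using only the standard library. It deliberately does not verify the signature, so it is suitable only for deciding when to refresh a token, never for authorization; full validation would go through a JWT library such as the python-jose dependency listed earlier. The helper name `token_is_expired` is hypothetical.

```python
import base64
import json
import time


def token_is_expired(jwt_token, now=None, leeway_seconds=0):
    """Check the `exp` claim WITHOUT verifying the signature."""
    try:
        payload_segment = jwt_token.split(".")[1]
        # JWT segments are base64url without padding; restore it before decoding.
        padded = payload_segment + "=" * (-len(payload_segment) % 4)
        claims = json.loads(base64.urlsafe_b64decode(padded))
        expires_at = float(claims["exp"])
    except (IndexError, KeyError, TypeError, ValueError):
        return True  # Treat malformed tokens as expired.
    current = time.time() if now is None else now
    return current >= expires_at + leeway_seconds
```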

## License