solventum-image-metadata/DOCKER.md
SamoilenkoVadym acc071927e Add Docker support with complete deployment setup
Features:
- Docker mode detection via DOCKER_MODE env var
- Persistent volumes for uploads, database, and output
- Health checks and auto-restart
- Complete docker-compose.yml configuration
- Helper script (docker-run.sh) for easy management
- Comprehensive DOCKER.md documentation

Changes:
- web_app.py: Auto-detect Docker mode, use persistent dirs
- src/database.py: Auto-detect database path based on environment
- Dockerfile: Multi-stage build with all dependencies (ExifTool, Tesseract, Poppler, FFmpeg)
- docker-compose.yml: Production-ready configuration
- docker-run.sh: Management script (build, start, stop, logs, etc.)
- DOCKER.md: Complete deployment and troubleshooting guide
- README.md: Added Docker quick start section
- .gitignore: Added Docker-related entries

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 13:07:15 +00:00


# Docker Deployment Guide
A complete guide to deploying the Oliver Metadata Tool with Docker.
## Prerequisites
- Docker 20.10+ installed
- Docker Compose 2.0+ installed
- 2GB+ available disk space
- Network access for pulling base images
## Quick Start
### 1. Build and Start
```bash
# Using docker-compose directly
docker-compose up -d
# Or using the helper script
./docker-run.sh build
./docker-run.sh start
```
### 2. Access Application
Open your browser at **http://localhost:5001**.
Default credentials:
- Username: `tester`
- Password: `oliveradmin`
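The container may need a few seconds after `docker-compose up -d` before the login page responds. Below is a minimal sketch of a readiness check, assuming `curl` is installed on the host and that `/login` returns a success status once the app is up. `wait_for_url` is a hypothetical helper and is not part of `docker-run.sh`.

```bash
#!/usr/bin/env bash
# wait_for_url URL [TIMEOUT] [INTERVAL]
# Poll URL until it answers successfully or TIMEOUT seconds have passed.
# Hypothetical helper for illustration; not part of docker-run.sh.
wait_for_url() {
  local url="$1" timeout="${2:-60}" interval="${3:-2}" elapsed=0
  # -f: treat HTTP errors (>= 400) as failure; -s: silent; -o: discard the body
  until curl -fs -o /dev/null "$url"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Timed out after ${timeout}s waiting for $url" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "Ready: $url"
}

# Example: wait_for_url http://localhost:5001/login 90
```

Polling the HTTP endpoint rather than sleeping for a fixed time keeps start-up scripts fast on quick machines and tolerant on slow ones.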
### 3. View Logs
```bash
# Using docker-compose
docker-compose logs -f
# Or using the helper script
./docker-run.sh logs
```
## Configuration
### Environment Variables
Create a `.env` file in the project root. The file itself is optional, but `OPENAI_API_KEY` must be set for AI metadata generation:
```env
# Required for AI metadata generation
OPENAI_API_KEY=your-openai-api-key-here
# Optional: AI Configuration
AI_MODEL=gpt-4o-mini
MAX_TOKENS=500
TEMPERATURE=0.5
# Optional: Microsoft SSO
AZURE_CLIENT_ID=your-azure-client-id
AZURE_CLIENT_SECRET=your-azure-client-secret
AZURE_TENANT_ID=your-azure-tenant-id
REDIRECT_URI=http://localhost:5001/auth/callback
# Optional: Flask secret key
SECRET_KEY=your-secret-key-here
```
### Docker Compose Configuration
The `docker-compose.yml` file includes:
- **Port mapping**: `5001:5001`
- **Persistent volumes**:
  - `uploads:/app/uploads` - Temporary file uploads
  - `database:/app/data` - SQLite database
  - `output:/app/output` - Processed files, backups, reports
- **Auto-restart**: Container restarts unless explicitly stopped
- **Health checks**: Every 30 seconds
## Management Commands
### Using docker-run.sh Script
```bash
# Build image
./docker-run.sh build
# Start application
./docker-run.sh start
# Stop application
./docker-run.sh stop
# Restart application
./docker-run.sh restart
# View logs
./docker-run.sh logs
# Show status
./docker-run.sh status
# Clean up (removes data!)
./docker-run.sh clean
```
### Using Docker Compose Directly
```bash
# Build image
docker-compose build
# Start in background
docker-compose up -d
# Start with logs
docker-compose up
# Stop
docker-compose down
# Restart
docker-compose restart
# View logs
docker-compose logs -f
# Check status
docker-compose ps
# Remove containers and volumes (deletes data!)
docker-compose down -v
```
## Data Persistence
### Volumes
Three Docker volumes persist data between container restarts:
1. **uploads** - `/app/uploads`
   - Temporary file uploads during processing
   - Cleared when files are downloaded
2. **database** - `/app/data`
   - SQLite database (`oliver_metadata.db`)
   - User accounts, sessions, audit logs
3. **output** - `/app/output`
   - Processed files
   - Backups
   - Reports
   - Templates
### Backup Data
```bash
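# Back up the database from the running container
# (for a guaranteed-consistent SQLite copy, stop the app first and use the volume backup below)
docker-compose exec oliver-metadata tar -czf /tmp/backup.tar.gz /app/data
docker cp oliver-metadata-tool:/tmp/backup.tar.gz "./backup-$(date +%Y%m%d).tar.gz"
# Or back up the entire volume (quotes keep paths with spaces intact)
docker run --rm -v oliver-metadata_database:/data -v "$(pwd)":/backup alpine tar -czf /backup/database-backup.tar.gz -C /data .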
```
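Dated backups accumulate over time. The following is a minimal sketch of a rotation step, assuming backups sit in one directory and follow the `backup-YYYYMMDD.tar.gz` naming used above, so that lexical order matches chronological order. `prune_backups` is a hypothetical helper, not something shipped with the project.

```bash
#!/usr/bin/env bash
# prune_backups DIR [KEEP]
# Delete all but the newest KEEP files named backup-*.tar.gz in DIR.
# Hypothetical helper; relies on the date in the file name for ordering.
prune_backups() {
  local dir="$1" keep="${2:-7}"
  printf '%s\n' "$dir"/backup-*.tar.gz |
    sort -r |                      # newest first
    tail -n +"$((keep + 1))" |     # skip the KEEP newest entries
    while IFS= read -r old; do
      [ -e "$old" ] || continue    # glob matched nothing
      rm -f -- "$old"
      echo "removed $old"
    done
}

# Example: prune_backups . 7
```

Run it after each backup, for example from the same cron job, so the directory never holds more than `KEEP` archives.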
### Restore Data
```bash
# Stop container
docker-compose down
# Remove old volume
docker volume rm oliver-metadata_database
# Recreate volume and restore (docker run creates the named volume if it is missing)
docker run --rm -v oliver-metadata_database:/data -v "$(pwd)":/backup alpine tar -xzf /backup/database-backup.tar.gz -C /data
# Start container
docker-compose up -d
```
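Because the restore procedure removes the old volume before extracting, it is worth confirming that the archive is readable first. Here is a minimal sketch, assuming a gzip-compressed tar as produced by the backup commands above. `verify_backup` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# verify_backup FILE
# Succeed only if FILE exists, is non-empty, and lists cleanly as a tar.gz.
# Hypothetical helper; it reads the archive but extracts nothing.
verify_backup() {
  local archive="$1" listing
  if [ ! -s "$archive" ]; then
    echo "missing or empty: $archive" >&2
    return 1
  fi
  # tar -tzf lists contents without extracting and fails on a corrupt archive
  if ! listing=$(tar -tzf "$archive" 2>/dev/null); then
    echo "not a valid tar.gz: $archive" >&2
    return 1
  fi
  if [ -z "$listing" ]; then
    echo "archive has no entries: $archive" >&2
    return 1
  fi
  echo "ok: $archive"
}

# Example: verify_backup database-backup.tar.gz && docker volume rm oliver-metadata_database
```

Chaining the check with `&&` means the volume is removed only when the archive passed verification.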
## Troubleshooting
### Container won't start
```bash
# Check logs
docker-compose logs
# Check if port is in use
lsof -i :5001
# Rebuild image
docker-compose build --no-cache
```
### Permission issues
```bash
# Check volume permissions
docker-compose exec oliver-metadata ls -la /app/uploads /app/data /app/output
# Fix permissions (if needed)
docker-compose exec oliver-metadata chown -R root:root /app/uploads /app/data /app/output
```
### Database locked errors
```bash
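# Stop container (this releases any lock held by the app process)
docker-compose down
# Last resort: start with a fresh database.
# WARNING: this deletes all user accounts, sessions, and audit logs;
# back up the volume first (see "Backup Data").
docker volume rm oliver-metadata_database
docker-compose up -d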
```
### ExifTool not found
ExifTool is installed in the Docker image. Verify:
```bash
docker-compose exec oliver-metadata exiftool -ver
```
The command should print version 12.15 or later.
### Memory issues
Increase Docker memory allocation:
- Docker Desktop → Settings → Resources → Memory
- Recommended: 2GB minimum, 4GB+ for large batches
## Production Deployment
### Security Recommendations
1. **Change default credentials**
   - Create new users via web interface
   - Disable or remove the default `tester` account
2. **Use environment variables**
   - Never commit `.env` to git
   - Use secrets management (Docker secrets, Kubernetes secrets)
3. **Enable HTTPS**
   - Use reverse proxy (nginx, Traefik, Caddy)
   - Terminate SSL at proxy level
4. **Set custom secret key**
   - Generate a value with `openssl rand -hex 32` and paste the output into `.env`; the file is not run by a shell, so `$(...)` is not expanded there
   ```env
   SECRET_KEY=paste-generated-value-here
   ```
5. **Limit file upload size**
   - Default: 500MB
   - Adjust via nginx/proxy if needed
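The secret key recommendation can be scripted. A `.env` file is not executed by a shell, so the value has to be generated first and written in as a literal. Below is a minimal sketch, assuming `openssl` is available on the host. `set_env_var` is a hypothetical helper, not part of the project.

```bash
#!/usr/bin/env bash
# set_env_var FILE KEY VALUE
# Write KEY=VALUE to FILE, replacing any existing line that assigns KEY.
# Hypothetical helper for illustration.
set_env_var() {
  local file="$1" key="$2" value="$3" tmp
  tmp=$(mktemp)
  # Copy every line except an existing assignment of KEY
  # (grep exits non-zero when nothing is left, which is fine here)
  if [ -f "$file" ]; then
    grep -v "^${key}=" "$file" > "$tmp" || true
  fi
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
  chmod 600 "$file"   # the file holds secrets; restrict it to the owner
}

# Example: set_env_var .env SECRET_KEY "$(openssl rand -hex 32)"
```

Replacing the line rather than appending keeps the file free of duplicate `SECRET_KEY` entries when the key is rotated.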
### Reverse Proxy Example (nginx)
```nginx
server {
    listen 80;
    server_name metadata.example.com;
    # nginx rejects request bodies larger than 1 MB by default;
    # raise the limit to match the app's 500MB upload default
    client_max_body_size 500m;
    location / {
        proxy_pass http://localhost:5001;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # Increase timeouts for large file uploads
        proxy_read_timeout 300;
        proxy_connect_timeout 300;
        proxy_send_timeout 300;
    }
}
```
### Resource Limits
Add to `docker-compose.yml`:
```yaml
services:
  oliver-metadata:
    # ... existing config ...
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 4G
        reservations:
          cpus: '1.0'
          memory: 2G
## System Requirements
### Container Resources
- **CPU**: 1-2 cores (AI generation can use more)
- **Memory**: 2GB minimum, 4GB recommended
- **Disk**: 5GB+ (depends on file volume)
### Host Requirements
- **OS**: Linux, macOS, Windows with WSL2
- **Docker**: 20.10+
- **Architecture**: x86_64/amd64 (ARM64 may work but untested)
## Updates
### Update to latest version
```bash
# Pull latest code
git pull origin main
# Rebuild image
docker-compose build
# Restart containers
docker-compose up -d
```
### Update Python dependencies
```bash
# Rebuild without cache
docker-compose build --no-cache
# Restart
docker-compose up -d
```
## Monitoring
### Health Checks
Built-in health check runs every 30 seconds:
```bash
# Check health status
docker ps
# View health check logs
docker inspect oliver-metadata-tool | jq '.[0].State.Health'
```
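If `jq` is not installed, the same status can be read with Docker's built-in `--format` templates. This is a minimal sketch suitable for a cron job or deployment script, assuming the container name `oliver-metadata-tool` used above. `container_health` and `require_healthy` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# container_health NAME
# Print the health status Docker reports: starting, healthy, or unhealthy.
container_health() {
  docker inspect --format '{{.State.Health.Status}}' "$1"
}

# require_healthy NAME
# Return non-zero (with a message on stderr) unless NAME is healthy.
# A missing container or one without a health check is reported as "unknown".
require_healthy() {
  local name="$1" status
  status=$(container_health "$name" 2>/dev/null) || status="unknown"
  if [ "$status" != "healthy" ]; then
    echo "$name is $status" >&2
    return 1
  fi
}

# Example: require_healthy oliver-metadata-tool || ./docker-run.sh restart
```

The non-zero exit status makes the check easy to chain with alerting or a restart command.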
### Resource Usage
```bash
# Real-time stats
docker stats oliver-metadata-tool
# Container info
docker inspect oliver-metadata-tool
```
## Support
For issues or questions:
1. Check logs: `docker-compose logs -f`
2. Verify configuration: `docker-compose config`
3. Test connection: `curl http://localhost:5001/login`
4. Open GitHub issue with logs and configuration
## FAQ
**Q: Can I change the port?**
A: Yes, edit `docker-compose.yml` port mapping: `"8080:5001"`
**Q: Does this work on ARM (Apple Silicon)?**
A: Should work but untested. Try building with `--platform linux/arm64`
**Q: How do I use my own database?**
A: Mount external database file as volume: `./my-db.db:/app/data/oliver_metadata.db`
**Q: Can I run multiple instances?**
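A: Yes, change the port mapping and container name in `docker-compose.yml` for each instance, and start each one under its own project name (`docker-compose -p <name> up -d`), since Compose groups containers and default-named volumes by project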
**Q: Does it support S3 storage?**
A: Not yet, but you can mount S3 as volume using FUSE/s3fs