Oliver Metadata Tool v3.1 Enterprise Edition
Universal metadata creation and management tool for all file types. Create, import, and manage metadata from multiple sources with an intuitive web interface, user authentication, and AI-powered metadata generation.
Developer: Vadym Samoilenko
License: Corporate License - Oliver Marketing
Version: 3.1 (Enterprise Edition)
Features
Multiple Metadata Sources
- 📂 File Import: Import metadata from CSV, Excel, or JSON with smart column mapping and sheet selection
- 🤖 AI Generation: OpenAI-powered intelligent metadata generation
- ✏️ Manual Entry: Direct editing with real-time validation
- 📋 Templates: Reusable metadata templates with variables
Enterprise Features
- 🔐 Authentication: Local user authentication + Microsoft SSO support
- 👥 User Management: SQLite database for users and sessions
- 📊 Audit Logging: Track all user actions and metadata changes
- 🔍 AI Usage Tracking: Monitor OpenAI token usage and costs
File Support
- 300+ File Formats via ExifTool integration
- PDF Files: Full metadata support (title, subject, keywords, author, copyright)
- Images: JPEG, PNG, GIF, HEIC, TIFF, RAW formats
- Office Documents: Word, Excel, PowerPoint
- Video Files: MP4, MOV, AVI, MKV
- Unicode Support: Full support for Chinese, Japanese, Korean characters
Advanced Capabilities
- Smart Field Mapping: Auto-detect columns with fuzzy matching
- Batch Processing: Process multiple files with selective updates
- Custom Metadata Fields: Add unlimited custom fields
- CSV Export: Export metadata and processing results
- Template Variables: {filename}, {date}, {user}, custom variables
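The template variables listed above could be expanded roughly as follows. This is a minimal sketch, not the tool's actual template engine: the `render_template` helper, its parameters, and the choice to leave unknown placeholders untouched are all assumptions made for illustration. Only the `{filename}`, `{date}`, and `{user}` placeholders come from this README.

```python
from datetime import date
from pathlib import Path


class _KeepUnknown(dict):
    """Leave unrecognised placeholders untouched instead of raising KeyError."""

    def __missing__(self, key):
        return "{" + key + "}"


def render_template(text, file_path, user, today=None, custom=None):
    """Expand {filename}, {date}, {user} and any custom variables in text.

    Hypothetical helper; names and behaviour are assumptions.
    """
    values = _KeepUnknown(
        filename=Path(file_path).stem,  # assumption: file name without extension
        date=(today or date.today()).isoformat(),
        user=user,
    )
    values.update(custom or {})
    return text.format_map(values)


title = render_template(
    "{filename} - prepared by {user} on {date} for {client}",
    "reports/q1_summary.pdf",
    user="tester",
    today=date(2026, 1, 15),
    custom={"client": "Oliver"},
)
print(title)
```

Passing `today` explicitly keeps the output reproducible. Without it, the helper falls back to the current date.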
Requirements
System Dependencies
- Python 3.8+
- ExifTool 12.15+ (required for 300+ format support)
- Tesseract OCR (optional - for image text extraction)
- Poppler (optional - for PDF content extraction)
Python Dependencies
All listed in requirements.txt:
- Flask 2.3.0+ (Web framework)
- pandas, openpyxl (Excel/CSV processing)
- PyExifTool 0.5.6+ (Metadata operations)
- openai 1.0.0+ (AI generation)
- tiktoken 0.5.0+ (Token counting)
- tenacity 8.2.0+ (Retry logic)
- msal (Microsoft SSO - optional)
Installation
1. Install System Dependencies
macOS:
brew install exiftool tesseract tesseract-lang poppler
Linux (Ubuntu/Debian):
sudo apt-get install libimage-exiftool-perl tesseract-ocr tesseract-ocr-chi-sim tesseract-ocr-chi-tra tesseract-ocr-jpn tesseract-ocr-kor poppler-utils
Windows:
# Download ExifTool from https://exiftool.org/ or install it with Chocolatey:
choco install exiftool tesseract
Verify ExifTool Installation:
exiftool -ver
# Should show version 12.15 or higher
See docs/EXIFTOOL_SETUP.md for detailed setup instructions.
2. Create Virtual Environment
python3 -m venv venv_local
source venv_local/bin/activate # On Windows: venv_local\Scripts\activate
3. Install Python Dependencies
pip install -r requirements.txt
4. Configure Environment Variables
Create a .env file in the project root:
# Required: OpenAI API Key (for AI metadata generation)
OPENAI_API_KEY=your-openai-api-key-here
# Optional: Microsoft SSO (for enterprise authentication)
# AZURE_CLIENT_ID=your-azure-client-id
# AZURE_CLIENT_SECRET=your-azure-client-secret
# AZURE_TENANT_ID=your-azure-tenant-id
# REDIRECT_URI=http://localhost:5001/auth/callback
# Optional: Flask secret key (auto-generated if not set)
# SECRET_KEY=your-secret-key-here
# Optional: AI settings (defaults shown)
# AI_MODEL=gpt-4o-mini
# MAX_TOKENS=500
# TEMPERATURE=0.5
# API_TIMEOUT=30
# API_MAX_RETRIES=3
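For orientation, the optional AI settings above could be read with their documented defaults along these lines. This is a sketch only: the `load_ai_settings` function is hypothetical, it is not taken from `src/config.py`, and it assumes the `.env` values have already been loaded into the process environment.

```python
import os


def load_ai_settings(env=None):
    """Read the optional AI settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "model": env.get("AI_MODEL", "gpt-4o-mini"),
        "max_tokens": int(env.get("MAX_TOKENS", "500")),
        "temperature": float(env.get("TEMPERATURE", "0.5")),
        "timeout": int(env.get("API_TIMEOUT", "30")),
        "max_retries": int(env.get("API_MAX_RETRIES", "3")),
    }


# Override a single value and keep the defaults for the rest.
settings = load_ai_settings({"MAX_TOKENS": "800"})
print(settings)
```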
5. Initialize Database
The database will be created automatically on first run. To manually initialize:
python -c "from src.database import Database; db = Database(); print('Database initialized')"
Docker Deployment (Recommended)
Quick Start with Docker
# Build and start
docker-compose up -d
# Or use the helper script
./docker-run.sh build
./docker-run.sh start
# Access at http://localhost:5001
Benefits:
- ✅ No manual dependency installation
- ✅ Consistent environment across systems
- ✅ Persistent data storage via volumes
- ✅ Easy updates and rollbacks
- ✅ Production-ready configuration
See DOCKER.md for complete Docker deployment guide.
Usage
Starting the Web Application
Local Development:
python web_app.py
Docker:
docker-compose up -d
The application will:
- ✅ Check for ExifTool availability
- ✅ Initialize SQLite database (users, sessions, audit_log)
- ✅ Start Flask server on http://localhost:5001
- 🌐 Open browser automatically (local mode only)
Login
Test Account:
- Username: tester
- Password: oliveradmin
Microsoft SSO (if configured):
- Click "Sign in with Microsoft" button
- Authenticate via Azure AD
- Users auto-created on first login
Using Metadata Sources
1. Import from File
- Select "Import from File (CSV/Excel/JSON)" from metadata source dropdown (default)
- Click "Choose File" and select your metadata file
- Configure mapping modal:
- For Excel files: Select sheet name
- Map columns: Filename (required), Title, Description, Keywords
- Auto-detection suggests best matches
- Preview first 3 rows
- Confirm mapping
- Upload files to process - tool matches files by filename
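The column auto-detection in the mapping step can be pictured with the standard library's `difflib`. This is an illustrative sketch, not the code in `field_mapper.py`: the `FIELD_ALIASES` table, the `suggest_mapping` function, and the 0.6 similarity cutoff are all assumptions.

```python
from difflib import get_close_matches

# Hypothetical alias table; the tool's real candidate names are not documented here.
FIELD_ALIASES = {
    "filename": ["filename", "file name", "file", "name"],
    "title": ["title", "headline"],
    "description": ["description", "desc", "summary", "subject"],
    "keywords": ["keywords", "tags", "key words"],
}


def suggest_mapping(columns, cutoff=0.6):
    """Suggest one source column per metadata field using fuzzy string matching."""
    # Normalise headers so "File_Name" and "file name" compare equal.
    normalised = {col.strip().lower().replace("_", " "): col for col in columns}
    mapping = {}
    for field, aliases in FIELD_ALIASES.items():
        for alias in aliases:
            hit = get_close_matches(alias, list(normalised), n=1, cutoff=cutoff)
            if hit:
                mapping[field] = normalised[hit[0]]
                break  # first alias that matches wins
    return mapping


suggested = suggest_mapping(["File_Name", "Titel", "Descriptions", "Tags"])
print(suggested)
```

A suggestion like this is only a starting point, which is why the modal still asks for confirmation before any files are matched.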
2. AI Generation
- Select "AI Generation" from metadata source dropdown
- Upload files
- AI generates metadata (10-30 seconds per file)
- Review and edit generated metadata
- Save changes
3. Manual Entry
- Select "Manual Entry"
- Upload files
- Fill in metadata fields manually
- Save changes
4. Templates
- Create template with variables
- Select template from dropdown
- Apply to selected files
- Review and save
Batch Operations
- Upload multiple files
- Use checkboxes to select files
- "Select All" / "Deselect All" buttons
- Edit metadata individually
- Click "Update Selected Files" to save all at once
- Export results to CSV
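The CSV export step might look like the sketch below, which uses only the standard `csv` module. The column names and sample rows are hypothetical, since the real export layout is not documented in this README.

```python
import csv
import io

# Hypothetical result rows for illustration.
results = [
    {"filename": "brochure.pdf", "title": "Spring Brochure", "keywords": "spring, print", "status": "updated"},
    {"filename": "海报.jpg", "title": "活动海报", "keywords": "poster", "status": "skipped"},
]


def export_results(rows):
    """Serialise processing results to CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=["filename", "title", "keywords", "status"])
    writer.writeheader()
    writer.writerows(rows)  # values containing commas are quoted automatically
    return buffer.getvalue()


csv_text = export_results(results)
print(csv_text)
```

When writing such output to disk, opening the file with `newline=""` and an explicit UTF-8 encoding helps keep CJK values intact.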
Configuration
Database Schema
Users Table:
- id, username, password_hash, email, full_name
- auth_method (local/sso)
- created_at, last_login, is_active
Sessions Table:
- session_id, user_id, created_at, expires_at
- ip_address, user_agent
Audit Log Table:
- id, user_id, action, details, timestamp
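As a rough picture of the three tables, the sketch below creates them in an in-memory SQLite database. The column names follow the lists above, but every type, default, and constraint is an assumption. The actual DDL lives in `src/database.py` and may differ.

```python
import sqlite3

# Column names follow the README; types and constraints are assumptions.
SCHEMA = """
CREATE TABLE users (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    username      TEXT NOT NULL UNIQUE,
    password_hash TEXT,
    email         TEXT,
    full_name     TEXT,
    auth_method   TEXT NOT NULL DEFAULT 'local' CHECK (auth_method IN ('local', 'sso')),
    created_at    TEXT DEFAULT CURRENT_TIMESTAMP,
    last_login    TEXT,
    is_active     INTEGER NOT NULL DEFAULT 1
);
CREATE TABLE sessions (
    session_id TEXT PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(id),
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    expires_at TEXT NOT NULL,
    ip_address TEXT,
    user_agent TEXT
);
CREATE TABLE audit_log (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id   INTEGER REFERENCES users(id),
    action    TEXT NOT NULL,
    details   TEXT,
    timestamp TEXT DEFAULT CURRENT_TIMESTAMP
);
"""

conn = sqlite3.connect(":memory:")  # in-memory database, nothing touches disk
conn.executescript(SCHEMA)
conn.execute("INSERT INTO users (username, auth_method) VALUES (?, ?)", ("tester", "local"))
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    )
]
print(tables)
```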
AI Usage Tracking
Every AI metadata generation is logged with:
- User ID
- Timestamp
- Tokens used (prompt + completion)
- Cost estimate (based on gpt-4o-mini pricing)
View logs in database:
SELECT * FROM audit_log WHERE action = 'ai_generation' ORDER BY timestamp DESC;
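The cost estimate is simple arithmetic over the two token counts. The sketch below shows one way to compute it. The per-million-token rates are placeholder assumptions rather than values taken from this project, and `estimate_cost` is a hypothetical helper, so check the current OpenAI price list before relying on the numbers.

```python
from decimal import Decimal

# Placeholder USD rates per one million tokens (assumptions for illustration only).
PROMPT_RATE_PER_M = Decimal("0.15")
COMPLETION_RATE_PER_M = Decimal("0.60")


def estimate_cost(prompt_tokens, completion_tokens):
    """Return the estimated USD cost of one generation call."""
    million = Decimal(1_000_000)
    cost = (
        prompt_tokens * PROMPT_RATE_PER_M
        + completion_tokens * COMPLETION_RATE_PER_M
    ) / million
    return cost.quantize(Decimal("0.000001"))  # round to a millionth of a dollar


print(estimate_cost(prompt_tokens=1200, completion_tokens=300))
```

`Decimal` is used instead of `float` so that many small per-call amounts can be summed without accumulating rounding error.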
User Management
Create New User:
from src.database import Database
db = Database()
db.create_user(
    username='newuser',
    password='password123',
    email='user@example.com',
    full_name='New User',
    auth_method='local'
)
List All Users:
from src.database import Database
db = Database()
users = db.get_all_users()
for user in users:
    print(f"{user['username']} - Last login: {user['last_login']}")
Architecture
File Structure
oliver-metadata-tool/
├── web_app.py                    # Flask web application (main entry point)
├── requirements.txt              # Python dependencies
├── .env                          # Environment configuration
├── oliver_metadata.db            # SQLite database (auto-created)
├── src/
│   ├── config.py                 # Configuration management
│   ├── database.py               # Database operations
│   ├── auth.py                   # Authentication logic
│   ├── metadata_analyzer.py      # AI metadata generation
│   ├── metadata_importer.py      # Import from files
│   ├── template_manager.py       # Template system
│   ├── field_mapper.py           # Column mapping
│   ├── excel_metadata_lookup.py  # Excel lookup
│   ├── extractors/
│   │   ├── pdf_extractor.py
│   │   ├── image_extractor.py
│   │   ├── office_extractor.py
│   │   ├── video_extractor.py
│   │   └── exiftool_extractor.py
│   └── updaters/
│       ├── pdf_updater.py
│       ├── image_updater.py
│       ├── office_updater.py
│       ├── video_updater.py
│       └── exiftool_updater.py
├── templates/
│   ├── index.html                # Main UI
│   └── login.html                # Login page
└── docs/
    └── EXIFTOOL_SETUP.md         # ExifTool setup guide
Technology Stack
- Backend: Flask (Python)
- Database: SQLite
- Frontend: HTML5, CSS3, JavaScript (Vanilla)
- Design: Montserrat font, Dark & Gold theme
- Authentication: Flask-Session, werkzeug.security, MSAL
- AI: OpenAI API (gpt-4o-mini)
- Metadata: PyExifTool, pypdf, python-docx, openpyxl
API Endpoints
Authentication
- GET /login - Login page
- POST /login - Authenticate user
- GET /logout - Destroy session
- GET /login/microsoft - Microsoft SSO redirect
- GET /auth/callback - SSO callback
File Operations
- POST /upload - Upload files and generate metadata
- POST /update-manual - Update file metadata manually
- GET /download/<filename> - Download processed file
Metadata Sources
- POST /upload-excel - Upload Excel file for mapping
- POST /preview-excel-sheet - Preview Excel sheet structure
- POST /configure-excel-mapping - Configure Excel column mapping
- POST /import-metadata - Upload import file for mapping
- POST /configure-import-mapping - Configure import column mapping
Templates
- GET /templates/list - List all templates
- POST /templates/save - Save new template
- POST /templates/load - Load template by name
- DELETE /templates/delete - Delete template
- POST /templates/apply - Apply template to files
- POST /templates/preview - Preview template output
Security & Privacy
Authentication
- Passwords hashed with werkzeug.security (pbkdf2:sha256)
- Session tokens: 32-byte cryptographically secure random strings
- Sessions expire after 24 hours
- Microsoft SSO via OAuth2 + Azure AD
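The sketch below illustrates the first three points using only the standard library. It is not the code in `src/auth.py`: the README states that the tool hashes passwords with `werkzeug.security`, whereas this example swaps in `hashlib.pbkdf2_hmac` to stay dependency-free. The function names, salt length, and iteration count are assumptions.

```python
import hashlib
import hmac
import secrets
from datetime import datetime, timedelta, timezone

SESSION_LIFETIME = timedelta(hours=24)


def hash_password(password, salt=None, iterations=100_000):
    """Derive a PBKDF2-HMAC-SHA256 hash; returns (salt, iterations, digest).

    The iteration count is an assumption; tune it to current guidance.
    """
    salt = salt or secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password, salt, iterations, expected):
    """Re-derive the hash and compare it in constant time."""
    _, _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


def new_session(now=None):
    """Create a token from 32 random bytes together with its expiry time."""
    now = now or datetime.now(timezone.utc)
    return secrets.token_hex(32), now + SESSION_LIFETIME


salt, rounds, stored = hash_password("example-password")
token, expires_at = new_session()
print(verify_password("example-password", salt, rounds, stored), len(token))
```

`secrets.token_hex(32)` encodes 32 random bytes as 64 hexadecimal characters, and a session is treated as expired once the current UTC time passes `expires_at`.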
Data Protection
- All credentials stored in .env (excluded from git)
- Database file excluded from git
- API keys never logged or exposed to frontend
- Audit trail for all user actions
Production Recommendations
- HTTPS: Use SSL/TLS certificates in production
- Database: Migrate to PostgreSQL for better concurrency
- Rate Limiting: Add rate limits to prevent abuse
- CSRF Protection: Enable Flask-WTF for form security
- Error Tracking: Integrate Sentry or similar service
- Backups: Regular database backups
- Monitoring: Track AI token usage for cost management
Troubleshooting
Common Issues
ExifTool not found:
# Verify installation
exiftool -ver
# macOS: Reinstall with Homebrew
brew reinstall exiftool
# Linux: Reinstall with apt
sudo apt-get install --reinstall libimage-exiftool-perl
Database locked error:
# Stop all instances
lsof -ti:5001 | xargs kill -9
# Restart application
python web_app.py
OpenAI API errors:
- Check API key in .env file
- Verify API key is valid at https://platform.openai.com/api-keys
- Check token usage limits on OpenAI dashboard
Import failed - column not found:
- Use the mapping modal to manually select columns
- Check that your file has headers in the first row
- Verify file encoding is UTF-8
Development
Running Tests
# Unit tests (if implemented)
pytest tests/
# Import smoke test (checks that the core modules load)
python -c "from src.database import Database; from src.config import Config; print('✅ All imports successful')"
Git Workflow
# Check status
git status
# Add changes
git add .
# Commit with message
git commit -m "Your commit message"
# Push to remote
git push origin main
License & Credits
License: Corporate License - Oliver Marketing
All rights reserved. Unauthorized copying, distribution, or modification is prohibited.
Developer: Vadym Samoilenko
Company: Oliver Marketing
Version: 3.1 Enterprise Edition
Release Date: January 2026
Third-Party Software:
- ExifTool by Phil Harvey (Perl Artistic License)
- Flask by Pallets (BSD License)
- OpenAI API (Commercial License)
- PyExifTool (LGPL License)
Support
For issues, questions, or feature requests:
- Internal Support: Contact IT department
- Developer: Vadym Samoilenko
- Documentation: See docs/ folder
Changelog
v3.1 (January 2026) - Enterprise Edition
- ✅ User authentication (local + Microsoft SSO)
- ✅ SQLite database with audit logging
- ✅ Unified import from file (CSV/Excel/JSON) with smart column mapping
- ✅ Excel sheet selection and preview
- ✅ Custom metadata fields support
- ✅ AI usage tracking and cost monitoring
- ✅ Dark & Gold UI redesign
- ✅ Template variables and preview
- ✅ Batch selection and CSV export
- ✅ Consolidated metadata sources (removed redundant Excel Lookup)
v3.0 (January 2026)
- ✅ ExifTool integration (300+ formats)
- ✅ Multiple metadata sources (Import, AI, Manual)
- ✅ Field mapping with fuzzy matching
- ✅ Metadata templates system
- ✅ Rebranded to Oliver Metadata Tool
v2.x (Prior)
- Basic Excel lookup functionality
- Multi-format file support
- Web interface