SamoilenkoVadym 804c8acbbb v3.1 Enterprise Edition: Excel/Import mapping, UI fixes, documentation update
Features:
- Smart column mapping for Excel and Import files (CSV/Excel/JSON)
- Modal dialogs for configuring sheet and column mappings
- Auto-detection of common column names (filename, title, description, keywords)
- Preview of first 3 rows before confirming mapping
- Case-insensitive filename matching without extension

UI Improvements:
- Fixed output folder selection (now uses text input instead of folder browser)
- Removed non-functional Reset button from metadata editor
- Clear button for output folder path

Documentation:
- Updated README.md with v3.1 Enterprise Edition information
- Developer: Vadym Samoilenko
- License: Corporate License - Oliver Marketing
- Added AI usage tracking and logging documentation
- Complete installation guide with all dependencies
- API endpoint documentation
- Security and privacy section
- Troubleshooting guide

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 17:06:18 +00:00
docs              | Phase 1.4: ExifTool integration for enhanced metadata support                  | 2026-01-25 15:26:01 +00:00
src               | Phase 4 Complete: Authentication, Database, and Microsoft SSO                  | 2026-01-25 15:57:47 +00:00
templates         | v3.1 Enterprise Edition: Excel/Import mapping, UI fixes, documentation update  | 2026-01-25 17:06:18 +00:00
.gitignore        | v3.1 Enterprise Edition: Excel/Import mapping, UI fixes, documentation update  | 2026-01-25 17:06:18 +00:00
README.md         | v3.1 Enterprise Edition: Excel/Import mapping, UI fixes, documentation update  | 2026-01-25 17:06:18 +00:00
requirements.txt  | Phase 4 Complete: Authentication, Database, and Microsoft SSO                  | 2026-01-25 15:57:47 +00:00
run_gui.py        | Initial commit: Universal metadata tool with Excel-based lookup                | 2026-01-25 14:23:42 +00:00
web_app.py        | v3.1 Enterprise Edition: Excel/Import mapping, UI fixes, documentation update  | 2026-01-25 17:06:18 +00:00

Oliver Metadata Tool v3.1 Enterprise Edition

Universal metadata creation and management tool for all file types. Create, import, and manage metadata from multiple sources with an intuitive web interface, user authentication, and AI-powered metadata generation.

Developer: Vadym Samoilenko
License: Corporate License - Oliver Marketing
Version: 3.1 (Enterprise Edition)


Features

Multiple Metadata Sources

  • 📊 Excel Lookup: Configure custom Excel files with column mapping
  • 🤖 AI Generation: OpenAI-powered intelligent metadata generation
  • ✏️ Manual Entry: Direct editing with real-time validation
  • 📂 File Import: Import from CSV, Excel, or JSON with custom mapping
  • 📋 Templates: Reusable metadata templates with variables

Enterprise Features

  • 🔐 Authentication: Local user authentication + Microsoft SSO support
  • 👥 User Management: SQLite database for users and sessions
  • 📊 Audit Logging: Track all user actions and metadata changes
  • 🔍 AI Usage Tracking: Monitor OpenAI token usage and costs

File Support

  • 300+ File Formats via ExifTool integration
  • PDF Files: Full metadata support (title, subject, keywords, author, copyright)
  • Images: JPEG, PNG, GIF, HEIC, TIFF, RAW formats
  • Office Documents: Word, Excel, PowerPoint
  • Video Files: MP4, MOV, AVI, MKV
  • Unicode Support: Full support for Chinese, Japanese, Korean characters

Advanced Capabilities

  • Smart Field Mapping: Auto-detect columns with fuzzy matching
  • Batch Processing: Process multiple files with selective updates
  • Custom Metadata Fields: Add unlimited custom fields
  • CSV Export: Export metadata and processing results
  • Template Variables: {filename}, {date}, {user}, custom variables

Requirements

System Dependencies

  • Python 3.8+
  • ExifTool 12.15+ (required for 300+ format support)
  • Tesseract OCR (optional - for image text extraction)
  • Poppler (optional - for PDF content extraction)

Python Dependencies

All listed in requirements.txt:

  • Flask 2.3.0+ (Web framework)
  • pandas, openpyxl (Excel/CSV processing)
  • PyExifTool 0.5.6+ (Metadata operations)
  • openai 1.0.0+ (AI generation)
  • tiktoken 0.5.0+ (Token counting)
  • tenacity 8.2.0+ (Retry logic)
  • msal (Microsoft SSO - optional)

Installation

1. Install System Dependencies

macOS:

brew install exiftool tesseract tesseract-lang poppler

Linux (Ubuntu/Debian):

sudo apt-get install libimage-exiftool-perl tesseract-ocr tesseract-ocr-chi-sim tesseract-ocr-chi-tra tesseract-ocr-jpn tesseract-ocr-kor poppler-utils

Windows:

# Option 1: download ExifTool from https://exiftool.org/
# Option 2: install ExifTool and Tesseract with Chocolatey
choco install exiftool tesseract

Verify ExifTool Installation:

exiftool -ver
# Should show version 12.15 or higher

See docs/EXIFTOOL_SETUP.md for detailed setup instructions.

2. Create Virtual Environment

python3 -m venv venv_local
source venv_local/bin/activate  # On Windows: venv_local\Scripts\activate

3. Install Python Dependencies

pip install -r requirements.txt

4. Configure Environment Variables

Create a .env file in the project root:

# Required: OpenAI API Key (for AI metadata generation)
OPENAI_API_KEY=your-openai-api-key-here

# Optional: Microsoft SSO (for enterprise authentication)
# AZURE_CLIENT_ID=your-azure-client-id
# AZURE_CLIENT_SECRET=your-azure-client-secret
# AZURE_TENANT_ID=your-azure-tenant-id
# REDIRECT_URI=http://localhost:5001/auth/callback

# Optional: Flask secret key (auto-generated if not set)
# SECRET_KEY=your-secret-key-here

# Optional: AI settings (defaults shown)
# AI_MODEL=gpt-4o-mini
# MAX_TOKENS=500
# TEMPERATURE=0.5
# API_TIMEOUT=30
# API_MAX_RETRIES=3
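
The optional AI settings above fall back to defaults when unset. A minimal sketch of that behaviour is shown below; `load_ai_settings` and the returned key names are hypothetical and are not taken from `src/config.py`, whose actual implementation may differ.

```python
import os

# Defaults mirror the commented-out values in the .env example.
DEFAULTS = {
    "AI_MODEL": "gpt-4o-mini",
    "MAX_TOKENS": "500",
    "TEMPERATURE": "0.5",
    "API_TIMEOUT": "30",
    "API_MAX_RETRIES": "3",
}


def load_ai_settings(env=None):
    """Hypothetical helper: read AI settings, falling back to the defaults."""
    env = os.environ if env is None else env
    raw = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    # Environment values are always strings, so convert the numeric ones.
    return {
        "model": raw["AI_MODEL"],
        "max_tokens": int(raw["MAX_TOKENS"]),
        "temperature": float(raw["TEMPERATURE"]),
        "timeout": int(raw["API_TIMEOUT"]),
        "max_retries": int(raw["API_MAX_RETRIES"]),
    }
```

Calling `load_ai_settings()` with no argument reads from the process environment; passing a plain dict makes the function easy to exercise without touching real environment variables.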

5. Initialize Database

The database will be created automatically on first run. To manually initialize:

python -c "from src.database import Database; db = Database(); print('Database initialized')"

Usage

Starting the Web Application

python web_app.py

The application will:

  1. Check for ExifTool availability
  2. Initialize SQLite database (users, sessions, audit_log)
  3. Start Flask server on http://localhost:5001
  4. Open the browser automatically

Login

Test Account:

  • Username: tester
  • Password: oliveradmin

Microsoft SSO (if configured):

  • Click "Sign in with Microsoft" button
  • Authenticate via Azure AD
  • Users auto-created on first login

Using Metadata Sources

1. Excel Lookup

  1. Click "Upload Excel File"
  2. Configure mapping modal:
    • Select sheet name
    • Map columns: Filename (required), Title, Description, Keywords
    • Preview first 3 rows
  3. Confirm mapping
  4. Upload files to process
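
The mapping modal pre-selects columns whose headers resemble the expected field names. One way such auto-detection could work is sketched below with the standard library's `difflib`; the alias lists, the `suggest_mapping` name, and the 0.8 cutoff are illustrative assumptions, not the logic in `src/field_mapper.py`.

```python
import difflib

# Hypothetical aliases for each target field; the real tool may use others.
TARGET_FIELDS = {
    "filename": ["filename", "file name", "file", "name"],
    "title": ["title", "headline"],
    "description": ["description", "desc", "summary"],
    "keywords": ["keywords", "tags"],
}


def suggest_mapping(columns, cutoff=0.8):
    """Suggest a {field: column} mapping using case-insensitive fuzzy matching."""
    # Map normalized header -> original header so the original can be returned.
    normalized = {col.strip().lower(): col for col in columns}
    mapping = {}
    for field, aliases in TARGET_FIELDS.items():
        for alias in aliases:
            match = difflib.get_close_matches(
                alias, list(normalized), n=1, cutoff=cutoff
            )
            if match:
                mapping[field] = normalized[match[0]]
                break  # first alias that matches wins
    return mapping
```

Fields with no sufficiently similar header are simply left out of the result, which corresponds to the user having to pick that column manually in the modal.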

2. AI Generation

  1. Select "AI Generation" from metadata source dropdown
  2. Upload files
  3. AI generates metadata (10-30 seconds per file)
  4. Review and edit generated metadata
  5. Save changes

3. Manual Entry

  1. Select "Manual Entry"
  2. Upload files
  3. Fill in metadata fields manually
  4. Save changes

4. Import from File

  1. Click "Import from File"
  2. Upload CSV/Excel/JSON file
  3. Configure column mapping (same as Excel)
  4. Upload files to match metadata
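
Uploaded files are matched to imported rows by filename, ignoring case and the file extension. A rough sketch of that matching rule follows; the helper names and the `filename` column key are assumptions for illustration only.

```python
from pathlib import PurePath


def normalize_name(name):
    """Lower-case the name and drop its final extension."""
    return PurePath(name.strip()).stem.lower()


def build_lookup(rows, filename_column="filename"):
    """Index imported rows by normalized filename (later rows overwrite earlier)."""
    return {normalize_name(row[filename_column]): row for row in rows}


def find_metadata(lookup, uploaded_filename):
    """Return the row matching the uploaded file, or None if there is none."""
    return lookup.get(normalize_name(uploaded_filename))
```

Note that only the last suffix is removed, so `archive.tar.gz` normalizes to `archive.tar`; whether the tool treats multi-part extensions differently is not stated here.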

5. Templates

  1. Create template with variables
  2. Select template from dropdown
  3. Apply to selected files
  4. Review and save
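
Template variables such as {filename}, {date}, and {user} are replaced when a template is applied. The sketch below shows one possible substitution scheme using `str.format_map`; the `render_template` function, the ISO date format, and the choice to leave unknown variables untouched are assumptions rather than the behaviour of `src/template_manager.py`.

```python
from datetime import date


class _KeepUnknown(dict):
    """Leave placeholders for undefined variables in the output unchanged."""

    def __missing__(self, key):
        return "{" + key + "}"


def render_template(template, filename, user, extra=None, today=None):
    """Fill each field of a template dict with the built-in and custom variables."""
    values = _KeepUnknown(
        filename=filename,
        date=(today or date.today()).isoformat(),  # assumed date format
        user=user,
    )
    values.update(extra or {})  # custom variables
    return {field: text.format_map(values) for field, text in template.items()}
```

For example, a template field `"{filename} ({date})"` applied to a file named `brochure` would yield the filename followed by the current date in parentheses, and a custom variable like `{project}` is filled only if it is passed in `extra`.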

Batch Operations

  1. Upload multiple files
  2. Use checkboxes to select files
  3. Or use the "Select All" / "Deselect All" buttons
  4. Edit metadata individually
  5. Click "Update Selected Files" to save all at once
  6. Export results to CSV

Configuration

Database Schema

Users Table:

  • id, username, password_hash, email, full_name
  • auth_method (local/sso)
  • created_at, last_login, is_active

Sessions Table:

  • session_id, user_id, created_at, expires_at
  • ip_address, user_agent

Audit Log Table:

  • id, user_id, action, details, timestamp

AI Usage Tracking

Every AI metadata generation is logged with:

  • User ID
  • Timestamp
  • Tokens used (prompt + completion)
  • Cost estimate (based on gpt-4o-mini pricing)
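
The cost estimate can be derived from the two token counts and per-token prices. The following is a minimal sketch; `estimate_cost` is a hypothetical helper, and the prices are passed in as parameters because current gpt-4o-mini rates are not stated in this document and should be taken from OpenAI's pricing page.

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  input_price_per_million, output_price_per_million):
    """Estimate the cost of one request from token counts.

    Prices are expressed per one million tokens; the caller supplies them.
    """
    total = (prompt_tokens * input_price_per_million
             + completion_tokens * output_price_per_million)
    return total / 1_000_000


# Placeholder prices for illustration only, not actual OpenAI pricing.
example_cost = estimate_cost(
    prompt_tokens=1000,
    completion_tokens=500,
    input_price_per_million=1.0,
    output_price_per_million=2.0,
)
```

Keeping the prices outside the function means a pricing change only requires updating configuration, not the logging code.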

View logs in database:

SELECT * FROM audit_log WHERE action = 'ai_generation' ORDER BY timestamp DESC;
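
The same query can be run from Python with the standard `sqlite3` module. This is a sketch only: it assumes the audit_log columns listed under Database Schema, and `recent_ai_generations` is a hypothetical helper that is not part of `src/database.py`.

```python
import sqlite3


def recent_ai_generations(conn, limit=20):
    """Return the most recent AI generation audit entries as dicts."""
    conn.row_factory = sqlite3.Row  # rows become addressable by column name
    cursor = conn.execute(
        "SELECT id, user_id, details, timestamp FROM audit_log "
        "WHERE action = ? ORDER BY timestamp DESC LIMIT ?",
        ("ai_generation", limit),
    )
    return [dict(row) for row in cursor.fetchall()]
```

In practice the connection would be opened with `sqlite3.connect("oliver_metadata.db")`, the database file named in the File Structure section.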

User Management

Create New User:

from src.database import Database
db = Database()
db.create_user(
    username='newuser',
    password='password123',
    email='user@example.com',
    full_name='New User',
    auth_method='local'
)

List All Users:

users = db.get_all_users()
for user in users:
    print(f"{user['username']} - Last login: {user['last_login']}")

Architecture

File Structure

oliver-metadata-tool/
├── web_app.py              # Flask web application (main entry point)
├── requirements.txt        # Python dependencies
├── .env                    # Environment configuration
├── oliver_metadata.db      # SQLite database (auto-created)
├── src/
│   ├── config.py           # Configuration management
│   ├── database.py         # Database operations
│   ├── auth.py             # Authentication logic
│   ├── metadata_analyzer.py    # AI metadata generation
│   ├── metadata_importer.py    # Import from files
│   ├── template_manager.py     # Template system
│   ├── field_mapper.py         # Column mapping
│   ├── excel_metadata_lookup.py # Excel lookup
│   ├── extractors/
│   │   ├── pdf_extractor.py
│   │   ├── image_extractor.py
│   │   ├── office_extractor.py
│   │   ├── video_extractor.py
│   │   └── exiftool_extractor.py
│   └── updaters/
│       ├── pdf_updater.py
│       ├── image_updater.py
│       ├── office_updater.py
│       ├── video_updater.py
│       └── exiftool_updater.py
├── templates/
│   ├── index.html          # Main UI
│   └── login.html          # Login page
└── docs/
    └── EXIFTOOL_SETUP.md   # ExifTool setup guide

Technology Stack

  • Backend: Flask (Python)
  • Database: SQLite
  • Frontend: HTML5, CSS3, JavaScript (Vanilla)
  • Design: Montserrat font, Dark & Gold theme
  • Authentication: Flask-Session, werkzeug.security, MSAL
  • AI: OpenAI API (gpt-4o-mini)
  • Metadata: PyExifTool, pypdf, python-docx, openpyxl

API Endpoints

Authentication

  • GET /login - Login page
  • POST /login - Authenticate user
  • GET /logout - Destroy session
  • GET /login/microsoft - Microsoft SSO redirect
  • GET /auth/callback - SSO callback

File Operations

  • POST /upload - Upload files and generate metadata
  • POST /update-manual - Update file metadata manually
  • GET /download/<filename> - Download processed file

Metadata Sources

  • POST /upload-excel - Upload Excel file for mapping
  • POST /preview-excel-sheet - Preview Excel sheet structure
  • POST /configure-excel-mapping - Configure Excel column mapping
  • POST /import-metadata - Upload import file for mapping
  • POST /configure-import-mapping - Configure import column mapping

Templates

  • GET /templates/list - List all templates
  • POST /templates/save - Save new template
  • POST /templates/load - Load template by name
  • DELETE /templates/delete - Delete template
  • POST /templates/apply - Apply template to files
  • POST /templates/preview - Preview template output

Security & Privacy

Authentication

  • Passwords hashed with werkzeug.security (pbkdf2:sha256)
  • Session tokens: 32-byte cryptographically secure random strings
  • Sessions expire after 24 hours
  • Microsoft SSO via OAuth2 + Azure AD
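
The session rules above (32 random bytes, 24-hour lifetime) could be implemented along the following lines. This is an illustrative sketch using the standard `secrets` module; the function names and the hex encoding of the token are assumptions, and the actual code in `src/auth.py` may differ.

```python
import secrets
from datetime import datetime, timedelta, timezone

SESSION_LIFETIME = timedelta(hours=24)


def new_session(user_id, now=None):
    """Create a session record with a cryptographically secure token."""
    now = now or datetime.now(timezone.utc)
    return {
        "session_id": secrets.token_hex(32),  # 32 random bytes -> 64 hex chars
        "user_id": user_id,
        "created_at": now,
        "expires_at": now + SESSION_LIFETIME,
    }


def is_expired(session, now=None):
    """Return True once the session has reached its expiry time."""
    now = now or datetime.now(timezone.utc)
    return now >= session["expires_at"]
```

Using timezone-aware UTC timestamps avoids ambiguity when the server's local time zone changes, for example across daylight saving transitions.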

Data Protection

  • All credentials stored in .env (excluded from git)
  • Database file excluded from git
  • API keys never logged or exposed to frontend
  • Audit trail for all user actions

Production Recommendations

  1. HTTPS: Use SSL/TLS certificates in production
  2. Database: Migrate to PostgreSQL for better concurrency
  3. Rate Limiting: Add rate limits to prevent abuse
  4. CSRF Protection: Enable Flask-WTF for form security
  5. Error Tracking: Integrate Sentry or similar service
  6. Backups: Regular database backups
  7. Monitoring: Track AI token usage for cost management

Troubleshooting

Common Issues

ExifTool not found:

# Verify installation
exiftool -ver

# macOS: Reinstall with Homebrew
brew reinstall exiftool

# Linux: Reinstall with apt
sudo apt-get install --reinstall libimage-exiftool-perl

Database locked error:

# Stop all instances
lsof -ti:5001 | xargs kill -9

# Restart application
python web_app.py

OpenAI API errors:

Import failed - column not found:

  • Use the mapping modal to manually select columns
  • Check that your file has headers in the first row
  • Verify file encoding is UTF-8
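
The last two checks can be done quickly outside the application for CSV files. The snippet below is a small diagnostic sketch, not part of the tool; `read_csv_headers` is a hypothetical name.

```python
import csv
import io


def read_csv_headers(raw_bytes):
    """Decode a CSV file as UTF-8 and return its non-empty first-row headers."""
    try:
        # utf-8-sig also strips a leading byte-order mark if one is present.
        text = raw_bytes.decode("utf-8-sig")
    except UnicodeDecodeError as exc:
        raise ValueError(f"File is not valid UTF-8: {exc}") from exc
    reader = csv.reader(io.StringIO(text, newline=""))
    headers = next(reader, [])  # empty list if the file has no rows
    return [h.strip() for h in headers if h.strip()]
```

Reading the file with `open(path, "rb").read()` and passing the bytes to this function shows which header names are available for mapping; an empty result suggests the first row is blank or the headers are missing.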

Development

Running Tests

# Unit tests (if implemented)
pytest tests/

# Manual integration test
python -c "from src.database import Database; from src.config import Config; print('✅ All imports successful')"

Git Workflow

# Check status
git status

# Add changes
git add .

# Commit with message
git commit -m "Your commit message"

# Push to remote
git push origin main

License & Credits

License: Corporate License - Oliver Marketing
All rights reserved. Unauthorized copying, distribution, or modification is prohibited.

Developer: Vadym Samoilenko
Company: Oliver Marketing
Version: 3.1 Enterprise Edition
Release Date: January 2026

Third-Party Software:

  • ExifTool by Phil Harvey (Perl Artistic License)
  • Flask by Pallets (BSD License)
  • OpenAI API (Commercial License)
  • PyExifTool (LGPL License)

Support

For issues, questions, or feature requests:

  • Internal Support: Contact IT department
  • Developer: Vadym Samoilenko
  • Documentation: See docs/ folder

Changelog

v3.1 (January 2026) - Enterprise Edition

  • User authentication (local + Microsoft SSO)
  • SQLite database with audit logging
  • Smart column mapping for Excel/CSV import
  • Custom metadata fields support
  • AI usage tracking and cost monitoring
  • Dark & Gold UI redesign
  • Template variables and preview
  • Batch selection and CSV export

v3.0 (January 2026)

  • ExifTool integration (300+ formats)
  • Multiple metadata sources (Excel, AI, Manual, Import)
  • Field mapping with fuzzy matching
  • Metadata templates system
  • Rebranded to Oliver Metadata Tool

v2.x (Prior)

  • Basic Excel lookup functionality
  • Multi-format file support
  • Web and GUI interfaces