Commit graph

44 commits

Author SHA1 Message Date
SamoilenkoVadym
f5aa512629 Handle redirect on login page and clear stuck MSAL state 2026-02-06 23:55:12 +00:00
SamoilenkoVadym
60b9329da2 Fix MSAL initialization and interaction_in_progress error 2026-02-06 23:53:00 +00:00
SamoilenkoVadym
25c5d1ba11 Complete SPA OAuth flow - login button uses MSAL.js
- Login button now uses MSAL.js loginRedirect() for PKCE
- oauth_callback uses MSAL.js handleRedirectPromise() to complete token exchange
- PKCE flow is now entirely in browser (SPA compatible)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 23:50:23 +00:00
SamoilenkoVadym
1fb5072a19 Simplify OAuth callback with direct token exchange and debug output 2026-02-06 23:45:34 +00:00
SamoilenkoVadym
2e64ae9d15 Add SPA-compatible OAuth flow with MSAL.js
- Render oauth_callback.html with MSAL.js for browser token exchange
- Add /auth/token endpoint to receive token from JavaScript
- Token exchange happens in browser (cross-origin) for SPA compatibility

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 23:40:42 +00:00
SamoilenkoVadym
497ab446ad Handle OAuth callback on root path /
- Check for OAuth code in query params on main page
- Process SSO login without requiring /auth/callback route
- Redirect to clean URL after successful login

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 23:27:55 +00:00
SamoilenkoVadym
992787bef1 Add debug logging to auth callback 2026-02-06 23:25:42 +00:00
SamoilenkoVadym
14ea29c5cb Add PKCE support for Azure AD public client SSO
- Use initiate_auth_code_flow for PKCE (required by Azure AD for public clients)
- Store auth flow in session for token exchange
- Fix AADSTS9002325 error

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 23:21:20 +00:00
SamoilenkoVadym
614322e135 Fix URL prefix to use single path /solventum-image-metadata
- Remove -back suffix, use single path for monolithic Flask app
- All routes now use /solventum-image-metadata/ prefix

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 23:12:10 +00:00
SamoilenkoVadym
1da1cd6220 Add gunicorn for production WSGI server
- Replace Flask development server with gunicorn
- 2 workers, 120s timeout for file processing

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 22:55:46 +00:00
SamoilenkoVadym
a1ddf28108 Fix redirect URLs for reverse proxy
- Add URL_PREFIX for all redirect URLs
- Redirects now go to /solventum-image-metadata-back/login instead of /login

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 22:32:54 +00:00
SamoilenkoVadym
26095be769 Remove git pull from deploy script
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 22:26:12 +00:00
SamoilenkoVadym
189cb3dab3 Add deployment script and configure reverse proxy with Azure SSO
- Add deploy.sh for idempotent Docker deployments
- Configure API_BASE for /solventum-image-metadata-back/ reverse proxy
- Enable Azure AD SSO with public client flow (no secret required)
- Remove hardcoded tester user for production security
- Add ProxyFix middleware for reverse proxy header handling

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 16:37:19 +00:00
SamoilenkoVadym
e6e3037459 Add GPT-5 model support with enhanced API logging
- Add GPT-5, GPT-5-mini, and GPT-5-nano to valid models list
- Add model validation with automatic fallback to gpt-4o-mini
- Update _is_new_model() to recognize GPT-5 as new generation
- Add detailed API response logging (model used, tokens, content preview)
- Add empty content detection with helpful error messages
- Fix API parameter selection for GPT-5 models

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 14:33:55 +00:00
SamoilenkoVadym
7a6a95c179 Improve AI response parsing for better metadata extraction
Enhanced JSON and text parsing:
- Smarter JSON extraction from responses with extra text
- Find JSON object {...} anywhere in response text
- Improved markdown code block removal
- Validate parsed JSON has meaningful content
- Better text parsing fallback with multiple strategies
- Added debug logging for raw AI responses
- Handle edge cases like empty titles or malformed JSON

Fixes issue where AI-generated metadata was not displayed correctly.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 14:12:10 +00:00
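The parsing strategy listed in the commit above (strip markdown code fences, find a JSON object anywhere in the response, reject objects with no meaningful content) could look roughly like this. This is a sketch under assumptions, not the repository's code: the function name extract_json_object and the "at least one non-empty value" test are hypothetical.

```python
import json
import re
from typing import Optional


def extract_json_object(text: str) -> Optional[dict]:
    """Return the first non-empty JSON object found in text, or None."""
    # Remove code-fence markers (three backticks plus an optional language tag)
    # while keeping the fenced content itself.
    cleaned = re.sub(r"`{3}[a-zA-Z]*", "", text)
    decoder = json.JSONDecoder()
    # Try to decode a JSON value starting at every opening brace.
    for match in re.finditer(r"\{", cleaned):
        try:
            obj, _ = decoder.raw_decode(cleaned, match.start())
        except json.JSONDecodeError:
            continue
        # Assumed meaning of "meaningful content": a dict with a non-empty value.
        if isinstance(obj, dict) and any(obj.values()):
            return obj
    return None
```

When this returns None, a caller would fall through to the text-parsing fallback the commit mentions.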
SamoilenkoVadym
0b9a29b0c4 Fix temperature parameter for new models and add file cleanup
Fixed two critical issues:

1. OpenAI API temperature parameter:
   - New models (gpt-5-mini, gpt-4o, o1, o3) only support default temperature=1
   - Modified _get_api_params() to exclude temperature for new models
   - Older models still use custom temperature setting
   - Fixes 400 error: 'temperature' unsupported value

2. File cleanup to prevent disk space issues:
   - Added cleanup_session_files() to remove files when session ends
   - Added cleanup_old_files() to remove files older than 24 hours
   - Cleanup runs automatically on app startup in Docker mode
   - Cleanup runs on logout to free up space immediately
   - Added /cleanup-session endpoint for manual cleanup
   - Files no longer accumulate in Docker volume

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 14:10:00 +00:00
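As an illustration of the 24-hour cleanup described in the commit above, here is a minimal sketch. Only the function name cleanup_old_files and the 24-hour window come from the commit message; the signature, the use of file modification time, and the return value are assumptions.

```python
import os
import time


def cleanup_old_files(directory: str, max_age_hours: float = 24.0) -> int:
    """Delete regular files in directory older than max_age_hours; return how many were removed."""
    cutoff = time.time() - max_age_hours * 3600
    # Snapshot the directory first so entries are not removed mid-iteration.
    with os.scandir(directory) as entries:
        candidates = [e for e in entries if e.is_file()]
    removed = 0
    for entry in candidates:
        # Age is judged by last-modified time (an assumption of this sketch).
        if entry.stat().st_mtime < cutoff:
            os.remove(entry.path)
            removed += 1
    return removed
```

The sketch does not descend into subdirectories, so per-session folders would need their own pass.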
SamoilenkoVadym
5a951a64f2 Fix OpenAI API compatibility and add animated progress for AI generation
Fixed two critical issues:

1. OpenAI API compatibility for newer models:
   - Added _get_token_param() method to detect model type
   - Newer models (gpt-5-mini, gpt-4o, o1, o3) use max_completion_tokens
   - Older models (gpt-3.5-turbo) use max_tokens
   - Fixes 400 error: 'max_tokens' not supported parameter

2. Progress bar for AI generation:
   - Added startProgressAnimation() and stopProgressAnimation()
   - Animated progress bar shows activity during AI processing
   - Progress slowly increments to 90% to indicate work in progress
   - Stops animation when processing completes or errors occur

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 14:06:39 +00:00
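The model-dependent parameter choice described in the commit above can be sketched as follows. The model prefixes are taken from the commit message as written; the function signature and the prefix-matching rule are assumptions, and the repository's _get_token_param() may work differently.

```python
# Prefixes the commit message lists as "newer" models.
NEW_MODEL_PREFIXES = ("gpt-5", "gpt-4o", "o1", "o3")


def get_token_param(model: str, limit: int) -> dict:
    """Return the token-limit keyword argument appropriate for the given model."""
    if model.lower().startswith(NEW_MODEL_PREFIXES):
        return {"max_completion_tokens": limit}
    return {"max_tokens": limit}
```

A caller would merge the returned dictionary into the request keyword arguments, for example `params.update(get_token_param(model, 500))`.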
SamoilenkoVadym
5b5056764c Configure Docker to load environment variables from .env file
Modified docker-compose.yml to support .env file:
- Added env_file directive to load .env automatically
- Simplified environment section (only DOCKER_MODE required)
- Enables AI metadata generation when OPENAI_API_KEY is set in .env
- All optional configs now loaded from .env file

Users can now create .env file with their OpenAI API key and other settings.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 14:02:42 +00:00
SamoilenkoVadym
cba7275764 Fix KeyError: change 'path' to 'filepath' in download_selected_files
Fixed critical bug in download_selected_files function:
- Session stores files with 'filepath' key, not 'path'
- Changed file_info['path'] to file_info['filepath']
- Added extensive logging to catch future issues

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 14:01:01 +00:00
SamoilenkoVadym
808e33cd2e Fix variable name bug in downloadSelectedFiles function
Fixed critical bug preventing download functionality:
- Changed currentSessionId to sessionId to match actual variable name
- Function was silently failing with 'No active session' error

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 13:57:56 +00:00
SamoilenkoVadym
ec3a2e2ffe Change Download All to Download Selected Files functionality
Modified download feature to work with selected files instead of all files:
- Button now shows 'Download Selected Files (N) as ZIP' with count
- New endpoint /download-selected accepts POST with file_indices
- Frontend sends only selected file indices to backend
- Button text updates dynamically when selection changes
- All files selected by default as before
- Users can select/deselect files before downloading

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 13:56:13 +00:00
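A minimal sketch of the server-side step behind the /download-selected endpoint described in the commit above: building a ZIP archive from only the selected indices. The 'filepath' key matches the session layout mentioned in commit cba7275764; the function name build_selected_zip and the choice to ignore out-of-range indices are assumptions.

```python
import io
import os
import zipfile


def build_selected_zip(files: list[dict], file_indices: list[int]) -> bytes:
    """Return ZIP bytes containing only the session files at the given indices."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for index in file_indices:
            # Silently skip indices that do not refer to a session file.
            if 0 <= index < len(files):
                path = files[index]["filepath"]
                # Store under the bare file name, not the server-side path.
                archive.write(path, arcname=os.path.basename(path))
    return buffer.getvalue()
```

Two selected files with the same base name would produce duplicate archive entries, so a real implementation may need to rename on collision.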
SamoilenkoVadym
25bec09f79 Fix Download All button placement and visibility
Improved Download All Files button display:
- Changed insertion point from message div to after fileList element
- Added prominent styling with border, background, and shadow
- Increased button size and padding for better visibility
- Added hover effect for better UX

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 13:54:01 +00:00
SamoilenkoVadym
74639b949a Add Download All Files feature with ZIP archive support
Added functionality to download all processed files at once in a ZIP archive:
- New endpoint /download-all/<session_id> in web_app.py
- Creates timestamped ZIP archive with all files from session
- Download All button appears after successful file updates
- Button shows at bottom of results with clear styling
- Added zipfile and datetime imports

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 13:51:37 +00:00
SamoilenkoVadym
ac9249b13a Add download buttons to file update success messages
Modified updateAllFiles() function to display download buttons for each updated file.
Success messages now include download links with improved UX using flexbox layout.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 13:48:18 +00:00
SamoilenkoVadym
f5d77b8b39 Fix output directory issue in Docker mode
Problem:
- Users entered local folder paths that Docker container cannot access
- Files were not saved to user's folders
- output_dir check (os.path.isdir) always failed for host paths

Solution:
1. Backend (web_app.py):
   - Only use output_dir in non-Docker mode
   - In Docker mode, always update files in-place
   - Users download files via browser instead

2. Frontend (templates/index.html):
   - Hide output_dir field in Docker mode
   - Show info message: files available for download
   - Safe JS check for outputDir element

3. Template rendering:
   - Pass docker_mode flag to template
   - Conditional display of output directory section

Result:
- Docker mode: Files updated in-place, downloadable via browser
- Local mode: output_dir still works for direct folder saving
- No more confusion about folder paths

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 13:44:13 +00:00
SamoilenkoVadym
32eec7381f Remove deprecated version field from docker-compose.yml
- Removed 'version: 3.8' as it's obsolete in Docker Compose 5.0+
- Tested successfully with docker-compose 5.0.2
- All functionality working correctly

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 13:17:56 +00:00
SamoilenkoVadym
acc071927e Add Docker support with complete deployment setup
Features:
- Docker mode detection via DOCKER_MODE env var
- Persistent volumes for uploads, database, and output
- Health checks and auto-restart
- Complete docker-compose.yml configuration
- Helper script (docker-run.sh) for easy management
- Comprehensive DOCKER.md documentation

Changes:
- web_app.py: Auto-detect Docker mode, use persistent dirs
- src/database.py: Auto-detect database path based on environment
- Dockerfile: Multi-stage build with all dependencies (ExifTool, Tesseract, Poppler, FFmpeg)
- docker-compose.yml: Production-ready configuration
- docker-run.sh: Management script (build, start, stop, logs, etc.)
- DOCKER.md: Complete deployment and troubleshooting guide
- README.md: Added Docker quick start section
- .gitignore: Added Docker-related entries

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 13:07:15 +00:00
SamoilenkoVadym
007597c88a Remove GUI version, fix import metadata bugs, add .env.example
Major Changes:
- Removed GUI version (run_gui.py, src/gui_app.py) - Web-only application
- Fixed duplicate JavaScript variable declaration (importSessionId)
- Fixed metadata import endpoint to use session data instead of Excel lookup
- Added .env.example with all configuration options

Bug Fixes:
- Fixed /update endpoint to use suggested_metadata from session
- Fixed JavaScript updateAllFiles() to send session_id and file_index
- Updated README.md to reflect web-only interface

Dependencies:
- Updated requirements.txt to use minimum version constraints (>=)

Configuration:
- Added comprehensive .env.example with all environment variables
- Documented OpenAI API, Microsoft SSO, and optional tool paths

Testing:
- Verified import metadata workflow end-to-end
- Confirmed file upload and metadata update functionality

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 12:14:18 +00:00
SamoilenkoVadym
706e394a98 Remove Browse/Clear buttons, add folder path copy instructions
- Removed non-functional Browse button for folder selection
- Removed Clear button
- Removed selectOutputFolder() and clearOutputFolder() JavaScript functions
- Added detailed instructions for copying folder paths on Mac and Windows
- Simplified UI to single text input field with helpful hints

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 18:02:43 +00:00
SamoilenkoVadym
e8dd5f8193 Remove remaining Excel-specific validation code
- Removed validation check for 'excel' metadata source (no longer exists)
- Removed excelSessionId references from file upload handler
- Cleaned up duplicate validation logic
- Import now handles all file types (CSV/Excel/JSON) via importSessionId

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 17:15:30 +00:00
SamoilenkoVadym
e2f1867509 Consolidate Excel Lookup and Import into unified Import from File feature
- Removed separate Excel Lookup option from metadata source dropdown
- Consolidated into single 'Import from File (CSV/Excel/JSON)' option
- Removed duplicate Excel Mapping Modal and related functions
- Enhanced Import modal to handle Excel sheet selection
- Updated footer to show v3.1 and simplified metadata sources
- Updated README to reflect consolidated import functionality
- Removed redundant Excel-specific code for cleaner codebase

Benefits:
- Simpler user interface (3 metadata sources instead of 4)
- Unified mapping interface for all file types
- Less code duplication and easier maintenance
- Better UX with consistent workflow

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 17:14:48 +00:00
SamoilenkoVadym
804c8acbbb v3.1 Enterprise Edition: Excel/Import mapping, UI fixes, documentation update
Features:
- Smart column mapping for Excel and Import files (CSV/Excel/JSON)
- Modal dialogs for configuring sheet and column mappings
- Auto-detection of common column names (filename, title, description, keywords)
- Preview of first 3 rows before confirming mapping
- Case-insensitive filename matching without extension

UI Improvements:
- Fixed output folder selection (now uses text input instead of folder browser)
- Removed non-functional Reset button from metadata editor
- Clear button for output folder path

Documentation:
- Updated README.md with v3.1 Enterprise Edition information
- Developer: Vadym Samoilenko
- License: Corporate License - Oliver Marketing
- Added AI usage tracking and logging documentation
- Complete installation guide with all dependencies
- API endpoint documentation
- Security and privacy section
- Troubleshooting guide

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 17:06:18 +00:00
SamoilenkoVadym
e9784d7da8 Phase 4 Complete: Authentication, Database, and Microsoft SSO
This commit implements a complete authentication system with local users,
session management, and Microsoft SSO support for enterprise environments.

New Files Created:
- src/database.py: SQLite database management with users, sessions, audit_log
- src/auth.py: Authentication module with login, SSO, and session management
- templates/login.html: Modern login page with SSO button

Database Schema:
- users table: username, password_hash, email, full_name, auth_method
- sessions table: session management with expiration
- audit_log table: user activity tracking
- Indexes for performance optimization

Authentication Features:
- Local authentication with test user (tester/oliveradmin)
- Password hashing with Werkzeug
- Session management with 24-hour expiration
- @login_required decorator for route protection
- Automatic session cleanup

Microsoft SSO Integration:
- MSAL library integration for Azure AD
- OAuth2 authorization code flow
- Microsoft Graph API user info retrieval
- Automatic user creation/update from SSO
- CSRF protection with state parameter
- Graceful fallback when SSO not configured

Security Improvements:
- All routes protected with @login_required
- Session-based authentication with database storage
- IP address and user agent logging
- Audit trail for user actions
- Secure session token generation

Configuration:
- Environment variables for Azure AD (AZURE_CLIENT_ID, AZURE_CLIENT_SECRET, AZURE_TENANT_ID)
- SECRET_KEY for Flask session encryption
- Optional MSAL dependency (SSO works only if configured)

Dependencies Added:
- Werkzeug>=3.0.0 for password hashing
- msal>=1.20.0 for Microsoft SSO (optional)

Test Credentials:
- Username: tester
- Password: oliveradmin

Phase 4 Status: Complete
Next Phase: Phase 5 (Modern UI Overhaul) for v3.1 release

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 15:57:47 +00:00
SamoilenkoVadym
f99aa118bf Phase 3 Complete: Batch Selection, CSV Export, and Metadata Templates
This commit completes Phase 3 implementation with advanced batch processing
and metadata template system.

Changes:
- Added batch file selection with checkboxes
- Implemented select all/deselect all functionality
- Updated batch processing to handle only selected files
- Added CSV export for processing results
- Created template_manager.py with variable substitution system
- Added template endpoints (list, save, load, delete, apply, preview)
- Integrated template UI with modal dialog for creation
- Template variables: {filename}, {date}, {datetime}, {user}, {year}, {month}, {day}

Phase 3 Status: Complete
Next Phase: Phase 4 (Authentication + SSO) for v3.1 release

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 15:52:05 +00:00
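The template variables listed in the commit above suggest a simple placeholder substitution. The sketch below shows one way it might work. The variable names come from the commit message, while the function name render_template, the date formats, and the decision to leave unknown placeholders untouched are assumptions.

```python
import re
from datetime import datetime


def render_template(template: str, filename: str, user: str, now: datetime) -> str:
    """Substitute {filename}, {date}, {datetime}, {user}, {year}, {month}, {day} in template."""
    values = {
        "filename": filename,
        "date": now.strftime("%Y-%m-%d"),            # assumed format
        "datetime": now.strftime("%Y-%m-%d %H:%M:%S"),  # assumed format
        "user": user,
        "year": now.strftime("%Y"),
        "month": now.strftime("%m"),
        "day": now.strftime("%d"),
    }
    # Replace known placeholders; unknown ones are returned unchanged.
    return re.sub(r"\{(\w+)\}", lambda m: values.get(m.group(1), m.group(0)), template)
```

Passing the timestamp in explicitly keeps the function deterministic, which makes template previews easy to test.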
SamoilenkoVadym
61210a5e3d Phase 3.1: Field mapping foundation with auto-detection
Created comprehensive FieldMapper module (400+ lines):
- Fuzzy field matching with SequenceMatcher (60% similarity threshold)
- 10+ aliases per standard field (title, subject, keywords, description)
- Auto-mapping with confidence scores (0.0 to 1.0)
- Mapping suggestions with alternatives (top 2 per field)
- Exact match detection (score 1.0) and substring bonuses (0.85)
- Preset save/load/delete for reusable mappings
- Mapping validation (duplicate targets, coverage stats)
- Unmapped field detection and coverage percentage

FieldMapper features:
- auto_map(): Generate mapping from source fields
- suggest_mapping(): Get best match + alternatives for each field
- validate_mapping(): Check for conflicts and warnings
- apply_mapping(): Transform data using field mapping
- get_mapping_coverage(): Calculate mapping completeness
- Preset management: save, load, list, delete

MetadataImporter enhancements:
- preview_file_structure(): Preview columns and suggest mappings
- import_with_mapping(): Import with custom field mapping
- Integration with FieldMapper for smart detection
- Sample row preview (5 rows) before import

Web API additions:
- /preview-import endpoint: Preview file structure and field suggestions
- Returns: columns, sample rows, mapping suggestions with confidence
- Supports CSV, Excel, JSON format detection

Field mapping workflow:
1. User uploads import file for preview
2. System analyzes columns and suggests mappings
3. User reviews/adjusts mappings (confidence scores shown)
4. User confirms and imports with mapping
5. Optional: Save mapping as preset for reuse

Technical highlights:
- SequenceMatcher from difflib for fuzzy string matching
- Normalize field names (lowercase, underscores)
- Multiple alias sets per target field
- Confidence-based ranking of matches
- Preset persistence via JSON file

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 15:45:11 +00:00
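The scoring rules described in the commit above (exact match 1.0, substring bonus 0.85, otherwise a SequenceMatcher ratio, with a 60% acceptance threshold) can be sketched as below. This is an assumption-laden illustration, not the FieldMapper module: the alias table is a short hypothetical stand-in, and the helper names normalize, score, and the return shape of auto_map are invented for the example.

```python
import re
from difflib import SequenceMatcher

# Hypothetical alias table, much shorter than the 10+ aliases per field in the commit.
ALIASES = {
    "title": ["title", "name", "headline"],
    "subject": ["subject", "description", "summary"],
    "keywords": ["keywords", "tags"],
}


def normalize(name: str) -> str:
    """Lowercase a field name and collapse non-alphanumeric runs to underscores."""
    return re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")


def score(source: str, alias: str) -> float:
    """Confidence in [0.0, 1.0] that a source column corresponds to an alias."""
    s, a = normalize(source), normalize(alias)
    if s == a:
        return 1.0
    if s and a and (a in s or s in a):
        return 0.85
    return SequenceMatcher(None, s, a).ratio()


def auto_map(source_fields: list[str], threshold: float = 0.6) -> dict:
    """Map each source field to (target, confidence) when the best score meets the threshold."""
    mapping = {}
    for field in source_fields:
        best_target, best_score = None, 0.0
        for target, aliases in ALIASES.items():
            for alias in aliases:
                current = score(field, alias)
                if current > best_score:
                    best_target, best_score = target, current
        if best_target is not None and best_score >= threshold:
            mapping[field] = (best_target, round(best_score, 2))
    return mapping
```

This sketch allows several source columns to map to the same target; the commit's validate_mapping() step is described as the place where such duplicate targets are flagged.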
SamoilenkoVadym
03079080d8 Phase 2.4: Metadata import from external files (CSV, Excel, JSON)
Created comprehensive metadata_importer.py module:
- CSV import with multiple encoding support (UTF-8, Latin1, ISO-8859-1, CP1252)
- Excel import (.xlsx, .xls) with sheet selection
- JSON import (object and array formats)
- Intelligent column detection for filename, title, subject, keywords
- Fuzzy column matching (case-insensitive, multiple aliases)
- Metadata normalization to standard format
- Import validation with statistics
- File lookup by filename stem (case-insensitive)

Web interface enhancements:
- /import-metadata endpoint for file uploads
- Import section UI (appears when Import source selected)
- Real-time import statistics display (records, title/subject/keywords counts)
- Import session management with unique session IDs
- Visual feedback (active state, success/error messages)
- Validation: requires import file before processing with import source

Import workflow:
1. User selects "Import from File" metadata source
2. Import section appears with file chooser
3. User uploads CSV/Excel/JSON with metadata
4. System validates and shows statistics
5. User uploads files to process
6. System matches files to imported metadata by filename

Supported import formats:
- CSV: filename, title, subject/description, keywords columns
- Excel: Any sheet with filename and metadata columns
- JSON: {filename: {metadata}} or [{filename, metadata}] formats

Technical features:
- Pandas DataFrame parsing for CSV/Excel
- Flexible column name detection (10+ aliases per field)
- NaN/null value handling
- List/array keyword support
- Unicode filename support

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 15:39:27 +00:00
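Two ideas from the commit above, trying several encodings when reading a CSV and matching files by case-insensitive filename stem, are sketched here with the standard library only. The commit states that the module parses with pandas, so this is a simplified stand-in; the function names, the fixed "filename" column, and the encoding order are assumptions.

```python
import csv
import io
from pathlib import Path
from typing import Optional

# Assumed order. latin-1 accepts any byte sequence, so it must come last.
ENCODINGS = ("utf-8", "cp1252", "latin-1")


def decode_bytes(raw: bytes) -> str:
    """Decode raw bytes with the first encoding that succeeds."""
    for encoding in ENCODINGS:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    raise ValueError("could not decode input")  # unreachable while latin-1 is listed


def load_metadata(raw: bytes) -> dict:
    """Index CSV rows by lowercase filename stem (extension dropped)."""
    reader = csv.DictReader(io.StringIO(decode_bytes(raw), newline=""))
    index = {}
    for row in reader:
        name = (row.get("filename") or "").strip()
        if name:
            index[Path(name).stem.lower()] = row
    return index


def lookup(index: dict, path: str) -> Optional[dict]:
    """Find the metadata row for an uploaded file, ignoring case and extension."""
    return index.get(Path(path).stem.lower())
```

With this index, an upload named report_01.pdf matches a CSV row whose filename column reads Report_01.PDF.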
SamoilenkoVadym
1bf2483f2d Phase 2.3: AI metadata generation with production-ready features
Enhanced metadata_analyzer.py with production-ready capabilities:
- Token counting with tiktoken for accurate OpenAI usage tracking
- Exponential backoff retry logic with tenacity library
- Intelligent content truncation based on token limits (not characters)
- Configurable timeout and max retries from Config
- Graceful fallback when tiktoken/tenacity unavailable
- Enhanced error reporting with _ai_error and _tokens_used metadata

Integrated AI generation in web interface:
- AI analyzer lazy initialization in web_app.py
- Real content extraction and AI analysis in upload endpoint
- Error handling for insufficient content or API failures
- Token usage logging for monitoring and optimization

UI improvements for AI experience:
- Special loading message for AI processing (10-30s per file)
- Display token usage for AI-generated metadata
- Show AI errors prominently with helpful messages
- Filter internal metadata fields (_tokens_used, _ai_error) from forms

Dependencies leveraged:
- tiktoken: Proper OpenAI token counting (10x more accurate)
- tenacity: Exponential backoff retry (3 attempts, 2-10s delays)
- openai: Production timeout support (30s default)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 15:36:48 +00:00
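The retry behaviour summarised in the commit above (3 attempts, delays between 2 and 10 seconds, exponential backoff) is implemented there with the tenacity library. The sketch below is a standard-library stand-in that shows the same idea without tenacity. The function name call_with_backoff, its parameters, and the exact delay formula are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    attempts: int = 3,
    min_delay: float = 2.0,
    max_delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on any exception with exponentially growing delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            # Delay doubles each attempt, clamped to [min_delay, max_delay].
            sleep(min(max_delay, max(min_delay, 2.0 ** attempt)))
    raise RuntimeError("unreachable")
```

The sleep parameter is injectable so the backoff schedule can be checked in tests without waiting. A production version would normally retry only on transient errors such as timeouts or rate limits.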
SamoilenkoVadym
fa2b4da2f7 Phase 2.1 & 2.2: Manual metadata editing and multiple sources
Implemented manual metadata editing UI:
- Added editable input fields for title (200 chars), subject (300 chars), keywords (500 chars)
- Character counters with warning/danger indicators at 90%/100%
- Real-time validation with visual feedback
- Save and Reset buttons for each file
- Individual file metadata updates via /update-manual endpoint

Implemented multiple metadata sources:
- Added metadata source selector dropdown (Excel, Manual, AI, Import)
- Modified /upload endpoint to handle different metadata sources
- Excel lookup: existing functionality (fastest)
- Manual entry: empty fields for user input
- AI generation: placeholder for Phase 2.3
- Import: placeholder for Phase 2.4

Technical improvements:
- Session-based metadata storage for persistence
- Graceful success/error feedback with visual indicators
- Sanitized metadata input with length limits
- Backup creation before updates

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 15:34:05 +00:00
SamoilenkoVadym
ae19179752 Phase 1.4: ExifTool integration for enhanced metadata support
Added ExifTool integration to support 300+ file formats with improved
performance and unified API for metadata operations.

Changes:
- Added PyExifTool>=0.5.6 to requirements.txt
- Created comprehensive ExifTool setup guide (docs/EXIFTOOL_SETUP.md)
- Created ExifToolExtractor for reading metadata from images/video/PDF
- Created ExifToolUpdater for writing metadata to images/video/PDF
- Updated README with ExifTool installation instructions

ExifTool Benefits:
- Unified API for images, videos, PDFs (vs 5+ separate libraries)
- Support for 300+ formats (HEIC, RAW, MKV, and more)
- 10-60x faster batch operations with stay_open mode
- Better PDF metadata writing (current pypdf is read-only)
- Battle-tested tool with 20+ years of development

Architecture:
- Hybrid approach: ExifTool for images/video/PDF, Python libs for Office
- Graceful fallback if ExifTool not installed
- Automatic detection on startup with helpful messages
- Tag mapping from ExifTool tags to standard fields (title/subject/keywords)

Implementation follows existing extractor/updater patterns for consistency.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 15:26:01 +00:00
SamoilenkoVadym
f4e1017964 Phase 1.3: Improve startup error handling and dependency checks
Added comprehensive startup checks in web_app.py:
- Check for Excel file existence with helpful error message
- Validate OpenAI API key availability (optional)
- Check ExifTool installation (optional)
- Display available metadata sources based on configuration
- Updated branding in startup messages

Benefits:
- Users see clear error messages for missing dependencies
- Easy troubleshooting of configuration issues
- Graceful degradation when optional features unavailable

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 15:17:03 +00:00
SamoilenkoVadym
c1f403cd83 Phase 1.2: Add AI dependencies for production-ready metadata generation
- Added tiktoken>=0.5.0 for proper token counting
- Added tenacity>=8.2.0 for retry logic with exponential backoff
- Added openai>=1.0.0 for AI metadata generation
- Updated Flask version specification

These dependencies enable:
- Accurate token usage tracking for OpenAI API calls
- Automatic retry on API failures with smart backoff
- Production-ready AI integration

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 15:16:00 +00:00
SamoilenkoVadym
7db62e06da Phase 1.1: Rebrand to Oliver Metadata Tool v3.0
- Updated application name to "Oliver Metadata Tool"
- Updated version to 3.0.0
- Added App Info constants to config.py (APP_NAME, APP_VERSION, APP_DESCRIPTION)
- Updated web interface (title, header, footer)
- Updated README with new branding and description
- Added AI configuration settings to config.py
- Added ExifTool check method to config.py

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 15:15:26 +00:00
SamoilenkoVadym
2082ea7ce7 Initial commit: Universal metadata tool with Excel-based lookup
- Added Flask web interface for batch metadata processing
- Added Excel-based metadata lookup (Celum ID mapping)
- Dual-sheet support: DSB (primary) and Medsurg (fallback)
- Unicode support for CJK scripts in the CGA region (Chinese, Japanese, Korean)
- Multi-format support: PDF, images, Office docs, video
- OCR with multi-language support (Tesseract)
- Filename matching without extension (case-insensitive)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 14:23:42 +00:00
Dave Porter
dbee1be64d Initial commit 2026-01-22 19:57:36 +00:00