Enhanced metadata_analyzer.py with production-ready capabilities:
- Token counting with tiktoken for accurate OpenAI usage tracking
- Exponential backoff retry logic with tenacity library
- Intelligent content truncation based on token limits (not characters)
- Configurable timeout and max retries from Config
- Graceful fallback when tiktoken/tenacity are unavailable
- Enhanced error reporting with _ai_error and _tokens_used metadata
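
A minimal sketch of the token counting and token-based truncation described above, with the fallback path for a missing tiktoken. The helper names, the `cl100k_base` encoding choice, and the 4-characters-per-token estimate are assumptions for illustration, not the actual contents of metadata_analyzer.py.

```python
# Sketch only: helper names, encoding choice, and fallback heuristic are assumptions.
try:
    import tiktoken

    _ENCODING = tiktoken.get_encoding("cl100k_base")
except Exception:  # tiktoken missing, or its encoding data could not be loaded
    _ENCODING = None

CHARS_PER_TOKEN = 4  # rough estimate used only on the fallback path


def count_tokens(text: str) -> int:
    """Exact token count with tiktoken, else a character-based estimate."""
    if _ENCODING is not None:
        # disallowed_special=() treats special-token strings as ordinary text.
        return len(_ENCODING.encode(text, disallowed_special=()))
    return len(text) // CHARS_PER_TOKEN


def truncate_to_tokens(text: str, max_tokens: int) -> str:
    """Cut text by token count rather than by a fixed character count."""
    if _ENCODING is not None:
        tokens = _ENCODING.encode(text, disallowed_special=())
        if len(tokens) <= max_tokens:
            return text
        return _ENCODING.decode(tokens[:max_tokens])
    return text[: max_tokens * CHARS_PER_TOKEN]
```

Either path returns short input unchanged, so callers do not need to know whether tiktoken was importable.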
Integrated AI generation in web interface:
- AI analyzer lazy initialization in web_app.py
- Real content extraction and AI analysis in upload endpoint
- Error handling for insufficient content or API failures
- Token usage logging for monitoring and optimization
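
The lazy initialization and insufficient-content handling listed above could look roughly like this. `MetadataAnalyzer`, `get_analyzer`, `analyze_upload`, and the `MIN_CONTENT_CHARS` threshold are hypothetical stand-ins. Only the `_ai_error` and `_tokens_used` keys come from the change itself.

```python
from functools import lru_cache

MIN_CONTENT_CHARS = 50  # hypothetical threshold for "insufficient content"


class MetadataAnalyzer:
    """Stand-in for the real analyzer class; its actual name and API may differ."""

    def analyze(self, text: str) -> dict:
        # A real implementation would call the AI service here.
        return {"title": text[:40], "_tokens_used": 0}


@lru_cache(maxsize=1)
def get_analyzer() -> MetadataAnalyzer:
    """Build the analyzer on first use and reuse the same instance afterwards."""
    return MetadataAnalyzer()


def analyze_upload(text: str) -> dict:
    """Skip the AI call when there is too little text to analyze."""
    if len(text.strip()) < MIN_CONTENT_CHARS:
        return {"_ai_error": "Insufficient content for AI analysis"}
    return get_analyzer().analyze(text)
```

Deferring construction this way means the web app can start without an API key and only pays the setup cost on the first AI request.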
UI improvements for AI experience:
- Special loading message for AI processing (10-30s per file)
- Display token usage for AI-generated metadata
- Show AI errors prominently with helpful messages
- Filter internal metadata fields (_tokens_used, _ai_error) from forms
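
As an illustration of the field filtering above, a sketch that keeps `_tokens_used` and `_ai_error` out of the editable form. The function name and sample values are made up for the example.

```python
INTERNAL_FIELDS = {"_tokens_used", "_ai_error"}


def visible_fields(metadata: dict) -> dict:
    """Return only the fields that should be rendered as form inputs."""
    return {key: value for key, value in metadata.items() if key not in INTERNAL_FIELDS}


metadata = {"title": "Annual report", "_tokens_used": 412, "_ai_error": None}
print(visible_fields(metadata))  # {'title': 'Annual report'}
```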
Dependencies leveraged:
- tiktoken: Proper OpenAI token counting (estimated at roughly 10x more accurate than character-based estimates)
- tenacity: Exponential backoff retry (3 attempts, 2-10s delays)
- openai: Production timeout support (30s default)
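
The retry policy listed for tenacity (3 attempts, 2-10s delays) can be approximated with the standard library, for example as the fallback when tenacity is unavailable. This is a hand-rolled sketch, not tenacity's API. `retry_with_backoff` and its parameters are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 3,
    min_delay: float = 2.0,
    max_delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on any exception with exponentially growing delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Delay doubles after each failure (2s, 4s, 8s, ...) up to max_delay.
            sleep(min(max_delay, min_delay * 2 ** (attempt - 1)))
    raise RuntimeError("max_attempts must be at least 1")
```

With tenacity installed, a comparable policy is usually expressed with `stop_after_attempt(3)` and `wait_exponential(min=2, max=10)` on the `retry` decorator. The injectable `sleep` argument here exists only so the delays can be checked without waiting.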
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Added comprehensive startup checks in web_app.py:
- Check for Excel file existence with helpful error message
- Validate OpenAI API key availability (optional)
- Check ExifTool installation (optional)
- Display available metadata sources based on configuration
- Updated branding in startup messages
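
A rough sketch of what these startup checks might look like. The function name, the message wording, and the `OPENAI_API_KEY` environment variable are assumptions. The real checks in web_app.py and config.py may be organized differently.

```python
import os
import shutil
from pathlib import Path


def run_startup_checks(excel_path: str) -> dict:
    """Report which metadata sources are usable; only the Excel file is treated as required."""
    status = {
        "excel": Path(excel_path).is_file(),
        "openai": bool(os.environ.get("OPENAI_API_KEY")),
        "exiftool": shutil.which("exiftool") is not None,
    }
    if not status["excel"]:
        print(f"ERROR: Excel file not found: {excel_path}")
    if not status["openai"]:
        print("NOTE: OPENAI_API_KEY is not set; AI generation is disabled.")
    if not status["exiftool"]:
        print("NOTE: ExifTool not found on PATH; embedded metadata is unavailable.")
    return status
```

Returning the status dictionary lets the caller decide which metadata sources to advertise instead of exiting on the first missing optional dependency.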
Benefits:
- Users see clear error messages for missing dependencies
- Easy troubleshooting of configuration issues
- Graceful degradation when optional features are unavailable
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Updated application name to "Oliver Metadata Tool"
- Updated version to 3.0.0
- Added App Info constants to config.py (APP_NAME, APP_VERSION, APP_DESCRIPTION)
- Updated web interface (title, header, footer)
- Updated README with new branding and description
- Added AI configuration settings to config.py
- Added ExifTool check method to config.py
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Added Flask web interface for batch metadata processing
- Added Excel-based metadata lookup (Celum ID mapping)
- Dual-sheet support: DSB (primary) and Medsurg (fallback)
- Unicode/CJK character support for CGA region (Chinese, Japanese, Korean)
- Multi-format support: PDF, images, Office docs, video
- OCR with multi-language support (Tesseract)
- Filename matching without extension (case-insensitive)
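
The lookup rules above (match on filename without extension, case-insensitive, primary sheet before fallback) can be sketched with plain dictionaries standing in for the two Excel sheets. The function names and the sample Celum IDs are illustrative only.

```python
from pathlib import PurePath
from typing import Optional


def normalize(filename: str) -> str:
    """Strip the extension and fold case so 'Scan_01.PDF' matches 'scan_01'."""
    return PurePath(filename).stem.casefold()


def lookup(filename: str, primary: dict, fallback: dict) -> Optional[dict]:
    """Search the primary sheet first, then the fallback sheet."""
    key = normalize(filename)
    for sheet in (primary, fallback):
        if key in sheet:
            return sheet[key]
    return None


# Each sheet is modeled as {normalized filename: metadata row}.
dsb = {normalize("Brochure_EN.pdf"): {"celum_id": "1001"}}
medsurg = {normalize("手術ガイド.docx"): {"celum_id": "2002"}}

print(lookup("BROCHURE_en.png", dsb, medsurg))   # {'celum_id': '1001'}
print(lookup("手術ガイド.pdf", dsb, medsurg))      # {'celum_id': '2002'}
```

`str.casefold` leaves CJK characters unchanged, so the same normalization works for Chinese, Japanese, and Korean filenames.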
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>