solventum-image-metadata/requirements.txt
SamoilenkoVadym ae19179752 Phase 1.4: ExifTool integration for enhanced metadata support
Added ExifTool integration to support 300+ file formats, with improved
performance and a unified API for metadata operations.

Changes:
- Added PyExifTool>=0.5.6 to requirements.txt
- Created comprehensive ExifTool setup guide (docs/EXIFTOOL_SETUP.md)
- Created ExifToolExtractor for reading metadata from images/video/PDF
- Created ExifToolUpdater for writing metadata to images/video/PDF
- Updated README with ExifTool installation instructions

ExifTool Benefits:
- Unified API for images, videos, PDFs (vs 5+ separate libraries)
- Support for 300+ formats (HEIC, RAW, MKV, and more)
- 10-60x faster batch operations with stay_open mode
- Better PDF metadata writing (the current pypdf-based code path is read-only)
- Battle-tested tool with 20+ years of development

Architecture:
- Hybrid approach: ExifTool for images/video/PDF, Python libs for Office
- Graceful fallback if ExifTool not installed
- Automatic detection on startup with helpful messages
- Tag mapping from ExifTool tags to standard fields (title/subject/keywords)
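
The detection, fallback, and tag-mapping points above can be illustrated with a minimal sketch. It is not the project's actual ExifToolExtractor code: the function names, the backend labels, and the choice of which tags feed which field are all assumptions made for illustration. Only `shutil.which` and the ExifTool tag names (as they appear with group prefixes) are taken as given.

```python
import shutil
from typing import Any, Dict, Optional

# Hypothetical mapping from ExifTool "Group:Tag" names to the standard fields
# named in the commit message. Earlier entries take priority over later ones.
# Which tags the project really maps is an assumption.
TAG_MAP = {
    "XMP:Title": "title",
    "PDF:Title": "title",
    "PDF:Subject": "subject",
    "XMP:Description": "subject",
    "IPTC:Keywords": "keywords",
    "XMP:Subject": "keywords",
    "PDF:Keywords": "keywords",
}


def find_exiftool() -> Optional[str]:
    """Return the path of the exiftool executable, or None if it is not on PATH."""
    return shutil.which("exiftool")


def choose_backend(exiftool_path: Optional[str]) -> str:
    """Pick a backend label; fall back to pure-Python libraries without ExifTool."""
    if exiftool_path:
        return "exiftool"
    # Helpful startup message instead of a hard failure (graceful fallback).
    print("ExifTool not found on PATH; using Python libraries instead. "
          "See docs/EXIFTOOL_SETUP.md.")
    return "python-libs"


def map_tags(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a raw ExifTool tag dictionary to title/subject/keywords."""
    fields: Dict[str, Any] = {}
    for tag, field in TAG_MAP.items():
        value = raw.get(tag)
        if value in (None, "", []):
            continue  # tag absent or empty
        if field == "keywords" and not isinstance(value, list):
            value = [value]  # a single keyword may arrive as a bare string
        fields.setdefault(field, value)  # first matching tag wins
    return fields


if __name__ == "__main__":
    print(choose_backend(find_exiftool()))
    print(map_tags({"XMP:Title": "Sample", "IPTC:Keywords": "demo"}))
```

Keeping detection separate from backend selection means the fallback path can be exercised in tests without ExifTool installed.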

Implementation follows existing extractor/updater patterns for consistency.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 15:26:01 +00:00


# Core Libraries
python-magic==0.4.27
python-dotenv==1.0.1
tqdm==4.66.1

# Excel Processing
pandas>=2.0.0
openpyxl>=3.1.0

# PDF Processing
pypdf==4.0.1
pdfplumber==0.11.0
PyPDF2==3.0.1

# Image Processing
Pillow==10.2.0
pytesseract==0.3.10
pdf2image==1.17.0
piexif==1.1.3
iptcinfo3==2.1.4

# Office Documents
python-docx==1.1.0
python-pptx==0.6.23

# Video Processing
mutagen==1.47.0
ffmpeg-python==0.2.0
pymediainfo==6.1.0

# AI & Metadata Generation
openai>=1.0.0
tiktoken>=0.5.0
tenacity>=8.2.0

# ExifTool Integration (optional but recommended)
PyExifTool>=0.5.6

# Web Framework
Flask>=3.0.0