Added ExifTool integration to support 300+ file formats with improved performance and unified API for metadata operations.

Changes:
- Added PyExifTool>=0.5.6 to requirements.txt
- Created comprehensive ExifTool setup guide (docs/EXIFTOOL_SETUP.md)
- Created ExifToolExtractor for reading metadata from images/video/PDF
- Created ExifToolUpdater for writing metadata to images/video/PDF
- Updated README with ExifTool installation instructions

ExifTool Benefits:
- Unified API for images, videos, PDFs (vs 5+ separate libraries)
- Support for 300+ formats (HEIC, RAW, MKV, and more)
- 10-60x faster batch operations with stay_open mode
- Better PDF metadata writing (current pypdf is read-only)
- Battle-tested tool with 20+ years of development

Architecture:
- Hybrid approach: ExifTool for images/video/PDF, Python libs for Office
- Graceful fallback if ExifTool not installed
- Automatic detection on startup with helpful messages
- Tag mapping from ExifTool tags to standard fields (title/subject/keywords)

Implementation follows existing extractor/updater patterns for consistency.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
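The hybrid routing and tag mapping described under Architecture could look roughly like the sketch below. It is illustrative only: the extension sets, the `choose_backend` and `to_standard_fields` helpers, and the specific tag-to-field assignments are assumptions, not taken from the actual ExifToolExtractor. The tag names are group-prefixed names of the kind ExifTool reports, but which of them the project maps to title/subject/keywords is not stated in the commit.

```python
from pathlib import Path

# Hypothetical extension sets; the commit does not list the exact formats.
EXIFTOOL_EXTENSIONS = {".jpg", ".jpeg", ".png", ".heic", ".mp4", ".mkv", ".pdf"}
OFFICE_EXTENSIONS = {".docx", ".pptx", ".xlsx"}

# Assumed mapping from group-prefixed ExifTool tags to the standard fields.
TAG_TO_FIELD = {
    "XMP:Title": "title",
    "IPTC:ObjectName": "title",
    "PDF:Title": "title",
    "XMP:Description": "subject",
    "IPTC:Caption-Abstract": "subject",
    "PDF:Subject": "subject",
    "XMP:Subject": "keywords",
    "IPTC:Keywords": "keywords",
    "PDF:Keywords": "keywords",
}


def choose_backend(path, exiftool_found):
    """Pick a backend: ExifTool for images/video/PDF, Python libs otherwise."""
    ext = Path(path).suffix.lower()
    if ext in OFFICE_EXTENSIONS:
        return "python-libs"
    if ext in EXIFTOOL_EXTENSIONS and exiftool_found:
        return "exiftool"
    # Fallback when ExifTool is missing or the format is not recognized.
    return "python-libs"


def to_standard_fields(raw):
    """Collapse raw ExifTool tags into title/subject/keywords.

    The first mapped tag encountered for a field wins; later ones are ignored.
    """
    fields = {}
    for tag, value in raw.items():
        field = TAG_TO_FIELD.get(tag)
        if field is None or field in fields:
            continue
        # Keywords are normalized to a list, since a single keyword
        # may arrive as a plain string.
        if field == "keywords" and not isinstance(value, list):
            value = [value]
        fields[field] = value
    return fields
```

Keeping the mapping in a plain dictionary makes it easy to extend per format without touching the extraction logic.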
40 lines
598 B
Text
# Core Libraries
python-magic==0.4.27
python-dotenv==1.0.1
tqdm==4.66.1

# Excel Processing
pandas>=2.0.0
openpyxl>=3.1.0

# PDF Processing
pypdf==4.0.1
pdfplumber==0.11.0
PyPDF2==3.0.1

# Image Processing
Pillow==10.2.0
pytesseract==0.3.10
pdf2image==1.17.0
piexif==1.1.3
iptcinfo3==2.1.4

# Office Documents
python-docx==1.1.0
python-pptx==0.6.23

# Video Processing
mutagen==1.47.0
ffmpeg-python==0.2.0
pymediainfo==6.1.0

# AI & Metadata Generation
openai>=1.0.0
tiktoken>=0.5.0
tenacity>=8.2.0

# ExifTool Integration (optional but recommended)
PyExifTool>=0.5.6

# Web Framework
Flask>=3.0.0
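
Because PyExifTool is listed as optional, code that uses it presumably needs to check for both the Python binding and the `exiftool` binary before calling it. A minimal sketch of such a guard follows; the `exiftool_available` and `read_batch` names are hypothetical and not the project's actual API. `ExifToolHelper` and its `get_metadata` method come from PyExifTool, which keeps a single exiftool process open for the whole batch.

```python
import importlib.util
import shutil


def exiftool_available():
    """Return True only if both the PyExifTool binding and the binary exist."""
    has_binding = importlib.util.find_spec("exiftool") is not None
    has_binary = shutil.which("exiftool") is not None
    return has_binding and has_binary


def read_batch(paths):
    """Read metadata for many files in one ExifTool session.

    Returns None when ExifTool is unavailable, so the caller can fall back
    to the per-format Python libraries instead.
    """
    if not exiftool_available():
        return None
    # Imported lazily so that the module still loads without PyExifTool.
    from exiftool import ExifToolHelper

    with ExifToolHelper() as et:
        # One list entry (a dict of tags) per input file.
        return et.get_metadata(paths)
```

Returning `None` rather than raising keeps the fallback decision in the caller, which fits the graceful-fallback behavior described in the commit message.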