Added ExifTool integration to support 300+ file formats with improved performance and unified API for metadata operations.

Changes:
- Added PyExifTool>=0.5.6 to requirements.txt
- Created comprehensive ExifTool setup guide (docs/EXIFTOOL_SETUP.md)
- Created ExifToolExtractor for reading metadata from images/video/PDF
- Created ExifToolUpdater for writing metadata to images/video/PDF
- Updated README with ExifTool installation instructions

ExifTool Benefits:
- Unified API for images, videos, PDFs (vs 5+ separate libraries)
- Support for 300+ formats (HEIC, RAW, MKV, and more)
- 10-60x faster batch operations with stay_open mode
- Better PDF metadata writing (current pypdf is read-only)
- Battle-tested tool with 20+ years of development

Architecture:
- Hybrid approach: ExifTool for images/video/PDF, Python libs for Office
- Graceful fallback if ExifTool not installed
- Automatic detection on startup with helpful messages
- Tag mapping from ExifTool tags to standard fields (title/subject/keywords)

Implementation follows existing extractor/updater patterns for consistency.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

# ExifTool Setup Guide

ExifTool is a powerful command-line application for reading, writing, and editing metadata in a wide variety of files. Oliver Metadata Tool uses ExifTool to provide enhanced metadata support for 300+ file formats.

## Why ExifTool?

- **Unified API**: Single tool handles images, videos, PDFs, and more
- **300+ formats**: Support for virtually all media file types
- **Better performance**: Optimized batch operations (10-60x faster than spawning a new process per file)
- **Battle-tested**: 20+ years of development and widespread use
- **PDF writing support**: Can write PDF metadata (the tool's existing pypdf path is read-only)

## Installation

### macOS

```bash
brew install exiftool
```

Verify installation:
```bash
exiftool -ver
# Should show version 12.15 or higher
```

### Linux (Ubuntu/Debian)

```bash
sudo apt-get update
sudo apt-get install libimage-exiftool-perl
```

Verify installation:
```bash
exiftool -ver
```

### Linux (Fedora/RHEL/CentOS)

```bash
sudo yum install perl-Image-ExifTool
```

### Windows

**Option 1: Chocolatey**
```powershell
choco install exiftool
```

**Option 2: Manual installation**
1. Download from https://exiftool.org/
2. Extract the `.zip` file
3. Rename `exiftool(-k).exe` to `exiftool.exe`
4. Add the directory to your PATH

Verify installation:
```powershell
exiftool -ver
```

## Verification

After installation, verify ExifTool is accessible:

```bash
# Check version
exiftool -ver

# Check location
which exiftool  # macOS/Linux
where exiftool  # Windows

# Test with a file
exiftool your-image.jpg
```

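The same check can be scripted from Python. The following is a minimal sketch, not code from Oliver Metadata Tool: `parse_version`, `version_ok`, and `installed_version` are hypothetical helper names, and the 12.15 minimum is taken from the note in the macOS instructions above.

```python
import shutil
import subprocess
from typing import Optional, Tuple

# Assumption: minimum version taken from the macOS verification note.
MIN_VERSION = (12, 15)


def parse_version(text: str) -> Tuple[int, int]:
    """Turn the output of `exiftool -ver`, e.g. '12.76', into (12, 76)."""
    major, _, minor = text.strip().partition(".")
    return int(major), int(minor or 0)


def version_ok(text: str, minimum: Tuple[int, int] = MIN_VERSION) -> bool:
    """Return True if the reported version is at least `minimum`."""
    return parse_version(text) >= minimum


def installed_version() -> Optional[str]:
    """Return the version reported by ExifTool, or None if it is not on PATH."""
    path = shutil.which("exiftool")
    if path is None:
        return None
    result = subprocess.run([path, "-ver"], capture_output=True, text=True)
    return result.stdout.strip() or None
```

Calling `installed_version()` and passing a non-`None` result to `version_ok()` gives a single yes/no answer that a startup check could log.
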
## What Oliver Metadata Tool Uses ExifTool For

### Supported Operations

1. **Images (JPEG, PNG, GIF, TIFF, HEIC, RAW formats)**
   - Read/write Title, Description, Keywords
   - Access EXIF, IPTC, XMP metadata
   - Support for camera metadata

2. **Videos (MP4, MOV, AVI, MKV)**
   - Read/write Title, Description, Keywords (ExifTool can write MP4 and MOV; AVI and MKV are read-only)
   - QuickTime metadata support
   - Unified API across formats

3. **PDFs**
   - Read/write PDF metadata fields
   - Better than pypdf for metadata writing
   - Preserves document structure

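Because Title, Description, and Keywords live under different tag names in EXIF, IPTC, XMP, QuickTime, and PDF metadata, some mapping from ExifTool tags to these standard fields is needed. The sketch below shows one way this could look. The tag names are real ExifTool tags, but the priority order, `FIELD_TAGS`, and `to_standard_fields` are assumptions made for illustration and may not match the tool's actual mapping.

```python
from typing import Any, Dict, List

# Assumed priority order per field; the tool's real mapping may differ.
FIELD_TAGS: Dict[str, List[str]] = {
    "title": ["XMP:Title", "IPTC:ObjectName", "PDF:Title", "QuickTime:Title"],
    "description": [
        "XMP:Description",
        "EXIF:ImageDescription",
        "IPTC:Caption-Abstract",
        "PDF:Subject",
    ],
    "keywords": ["XMP:Subject", "IPTC:Keywords", "PDF:Keywords"],
}


def to_standard_fields(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Pick the first non-empty candidate tag for each standard field."""
    fields: Dict[str, Any] = {}
    for field, candidates in FIELD_TAGS.items():
        for tag in candidates:
            value = raw.get(tag)
            if value not in (None, "", []):
                fields[field] = value
                break

    # ExifTool returns a single keyword as a string and several as a list;
    # normalise to a list so callers see one shape.
    keywords = fields.get("keywords")
    if isinstance(keywords, str):
        fields["keywords"] = [k.strip() for k in keywords.split(",") if k.strip()]
    return fields
```

The input dictionary has the shape of one entry returned by `get_metadata`, with keys in `Group:Tag` form.
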
### Format Coverage

ExifTool adds support for formats beyond those covered by the Python libraries:

- **Images**: HEIC, CR2, NEF, ARW, DNG (RAW formats)
- **Video**: MKV, WebM, FLV, WMV (extended video formats)
- **Audio**: MP3, FLAC, WAV, OGG (audio files)
- **Documents**: EPUB, MOBI (ebook formats)
- **3D/CAD**: STL, DWG, DXF
- And 250+ more formats

## PyExifTool Python Wrapper

Oliver Metadata Tool uses the PyExifTool library to interact with ExifTool from Python:

```python
from exiftool import ExifToolHelper

# Read metadata
with ExifToolHelper() as et:
    metadata = et.get_metadata(["image.jpg"])
    print(metadata[0])

# Write metadata
with ExifToolHelper() as et:
    et.set_tags(
        ["image.jpg"],
        tags={"EXIF:ImageDescription": "New Title"},
        params=["-overwrite_original"]
    )
```

### Batch Mode Performance

PyExifTool uses ExifTool's `-stay_open` mode, which keeps one ExifTool process running for multiple operations:

- **Single file operations**: ~50-100ms overhead
- **Batch operations (100 files)**: 10-60x faster than spawning new processes
- **Memory efficient**: One process handles all operations

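For large file lists, one possible pattern is to keep a single `ExifToolHelper` session open and feed it files in fixed-size batches, so that a failure affects only one batch and progress can be reported between batches. This is a sketch under assumptions: `chunked`, `write_description`, and the batch size of 100 are illustrative names and values, not part of Oliver Metadata Tool or PyExifTool.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def write_description(files: Sequence[str], description: str, batch_size: int = 100) -> None:
    """Write one description to many files using a single ExifTool process."""
    # Imported lazily so `chunked` stays usable without PyExifTool installed.
    from exiftool import ExifToolHelper

    with ExifToolHelper() as et:  # one -stay_open process for all batches
        for batch in chunked(files, batch_size):
            et.set_tags(
                batch,
                tags={"EXIF:ImageDescription": description},
                params=["-overwrite_original"],
            )
```
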
## Troubleshooting

### ExifTool not found

**Error:** `ExifTool not found` or `exiftool command not available`

**Solution:**
1. Install ExifTool using the instructions above
2. Restart your terminal/command prompt
3. Verify with `exiftool -ver`
4. If still not found, check your PATH environment variable

### Permission denied

**Error:** `Permission denied when executing exiftool`

**Solution (macOS/Linux):**
```bash
chmod +x /path/to/exiftool
```

### PyExifTool import error

**Error:** `ModuleNotFoundError: No module named 'exiftool'`

**Solution:**
```bash
# Quote the requirement so the shell does not treat ">" as a redirect
pip install "PyExifTool>=0.5.6"
```

### Encoding issues with Unicode filenames

ExifTool handles Unicode filenames natively. If you encounter issues:

1. Ensure your terminal supports UTF-8
2. Use the PyExifTool wrapper (handles encoding automatically)
3. Check that the file system supports Unicode filenames

## Performance Tips

### Use batch mode for multiple files

```python
from exiftool import ExifToolHelper

# Good: Process multiple files in one batch
with ExifToolHelper() as et:
    et.set_tags(
        ["file1.jpg", "file2.jpg", "file3.jpg"],
        tags={"EXIF:ImageDescription": "Title"},
        params=["-overwrite_original"]
    )

# Avoid: Processing files one at a time (starts a new ExifTool process per file)
for file in files:
    with ExifToolHelper() as et:
        et.set_tags([file], tags={...})
```

### Use specific tag names

```python
from exiftool import ExifToolHelper

with ExifToolHelper() as et:
    # Good: Specific tag queries
    et.get_tags(["image.jpg"], tags=["EXIF:ImageDescription", "XMP:Title"])

    # Slower: Extract all tags
    et.get_metadata(["image.jpg"])  # Returns 100+ tags
```

### Skip unnecessary tags with -fast

For read-only operations where you only need basic metadata:

```python
et.execute("-fast", "-json", "image.jpg")
```

## Integration with Oliver Metadata Tool

Oliver Metadata Tool automatically detects ExifTool and uses it when available:

1. **On startup**: Checks for ExifTool installation
2. **Hybrid approach**: Uses ExifTool for images/video/PDF, Python libraries for Office docs
3. **Graceful fallback**: Falls back to pure Python if ExifTool is unavailable

### Check ExifTool status

```python
from src.config import Config

if Config.check_exiftool():
    print("ExifTool available")
else:
    print("Using Python libraries")
```

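The hybrid approach and fallback described in this section amount to a routing decision per file. The sketch below illustrates that decision and is not the tool's implementation: `choose_backend`, `exiftool_on_path`, and both extension sets are assumptions, and the Office extensions in particular are guesses at what "Office docs" covers.

```python
import shutil
from pathlib import Path

# Assumed extension sets, for illustration only.
EXIFTOOL_EXTENSIONS = {
    ".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff", ".heic",
    ".mp4", ".mov", ".avi", ".mkv", ".pdf",
}
OFFICE_EXTENSIONS = {".docx", ".xlsx", ".pptx"}


def exiftool_on_path() -> bool:
    """Return True if an `exiftool` executable is found on PATH."""
    return shutil.which("exiftool") is not None


def choose_backend(path: str, exiftool_available: bool) -> str:
    """Return 'exiftool' or 'python' for the given file path."""
    ext = Path(path).suffix.lower()
    if ext in OFFICE_EXTENSIONS:
        return "python"  # Office documents always use the Python libraries
    if ext in EXIFTOOL_EXTENSIONS and exiftool_available:
        return "exiftool"
    return "python"  # graceful fallback when ExifTool is missing
```

Passing the availability flag in as an argument, rather than checking PATH inside `choose_backend`, keeps the routing logic easy to test without ExifTool installed.
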
## References

- [ExifTool Official Website](https://exiftool.org/)
- [ExifTool Documentation](https://exiftool.org/exiftool_pod.html)
- [PyExifTool GitHub](https://github.com/sylikc/pyexiftool)
- [PyExifTool Documentation](https://sylikc.github.io/pyexiftool/)
- [Supported File Types](https://exiftool.org/#supported)
- [Tag Names Reference](https://exiftool.org/TagNames/)

## License

ExifTool is free software licensed under the Perl Artistic License or GPL version 1 or later.