Fixed issue where only 1 of 3 master asset IDs was being added to the FERRERO.MASTERASSETIDS tabular field. The bug was caused by calling _add_master_asset_id_field() before _add_master_asset_ids_field(), which created the field with a single value and blocked the multi-value method from adding all IDs.

Changes:
- metadata_extractor_mvp.py: Prioritize master_opentext_ids parameter using if/elif logic to prevent single-ID method from blocking multi-ID
- a2_to_a3_upload_polling.py: Load multiple master assets in PPR mode
- filename_parser.py: Parse multiple tracking IDs (e.g., ID1+ID2+ID3)
- query_db.py: Fix .env loading path
- Added documentation and test files for multiple master asset IDs

Tested in PPR with 3 tracking IDs (BqB8vo+SfUQ7m+laRJo0) - all 3 master asset IDs now correctly appear in the metadata structure.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

# Option 1: Multiple Tracking IDs in Filename - Implementation Guide

## Overview

Allow a single derivative/localized asset to reference multiple master assets by including multiple tracking IDs in the filename.

**Example Filename:**
```
1234568_ROC_STRANGER-THINGS_SND_6S_16x9_REF_DE_de_BqB8vo+SfUQ7m+laRJo0.jpg
                                                  ^^^^^^^^^^^^^^^^^^^^
                                                  Multiple tracking IDs
```

**Delimiter:** Use `+` to separate multiple tracking IDs. `,` or `~` would also work; `_` is already the field separator in the filename structure, so it is not a good choice.

---

## Changes Required

### 1️⃣ Filename Parser (`scripts/shared/filename_parser.py`)

**Current Code (line ~182):**
```python
# Tracking ID: 6 alphanumeric, optionally with -N suffix
elif re.match(r'^[a-zA-Z0-9]{6}(-N)?$', part):
    tracking = part
    tracking_mode = 'full'
    base_tracking_id = tracking

    if tracking.endswith('-N'):
        tracking_mode = 'folder_only'
        base_tracking_id = tracking[:-2]  # Strip -N suffix

    parsed['tracking_id'] = base_tracking_id
    parsed['tracking_mode'] = tracking_mode
    parsed['tracking_id_with_suffix'] = tracking
    logger.debug("Found tracking ID: {}".format(tracking))
    index += 1
```

**Modified Code:**
```python
# Tracking ID(s): 6 alphanumeric, optionally with -N suffix
# Supports multiple IDs separated by + (e.g., "BqB8vo+SfUQ7m+laRJo0")
elif re.match(r'^[a-zA-Z0-9]{6}(-N)?(\+[a-zA-Z0-9]{6}(-N)?)*$', part):
    tracking_ids = []
    tracking_modes = []
    tracking_ids_with_suffix = []

    # Split by + delimiter to get all tracking IDs
    id_parts = part.split('+')

    for tracking in id_parts:
        tracking_mode = 'full'
        base_tracking_id = tracking

        if tracking.endswith('-N'):
            tracking_mode = 'folder_only'
            base_tracking_id = tracking[:-2]  # Strip -N suffix
            logger.info("Detected folder-only tracking ID: {} (base: {})".format(tracking, base_tracking_id))

        tracking_ids.append(base_tracking_id)
        tracking_modes.append(tracking_mode)
        tracking_ids_with_suffix.append(tracking)

    # Store primary (first) tracking ID for backward compatibility
    parsed['tracking_id'] = tracking_ids[0]
    parsed['tracking_mode'] = tracking_modes[0]
    parsed['tracking_id_with_suffix'] = tracking_ids_with_suffix[0]

    # Store all tracking IDs for multi-master support
    parsed['tracking_ids'] = tracking_ids
    parsed['tracking_modes'] = tracking_modes
    parsed['tracking_ids_with_suffix'] = tracking_ids_with_suffix
    parsed['has_multiple_masters'] = len(tracking_ids) > 1

    logger.debug("Found {} tracking ID(s): {}".format(len(tracking_ids), ', '.join(tracking_ids)))
    index += 1
```

**Key Changes:**
- Updated regex to match multiple IDs: `^[a-zA-Z0-9]{6}(-N)?(\+[a-zA-Z0-9]{6}(-N)?)*$`
- Split on `+` delimiter
- Store primary (first) ID in the existing keys for backward compatibility
- Add new fields: `tracking_ids`, `tracking_modes`, `tracking_ids_with_suffix`, `has_multiple_masters`
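
As a rough, standalone illustration of the parsing change, the sketch below wraps the same regex and split logic in a hypothetical `parse_tracking_segment` helper. The helper name and the "return a dict or `None`" shape are assumptions made for the example; in the real parser this logic runs inline inside the `elif` branch and writes into `parsed`.

```python
import re

# Same pattern as the modified parser: one or more 6-char IDs joined by '+',
# each optionally carrying a -N suffix.
TRACKING_SEGMENT_RE = re.compile(r'^[a-zA-Z0-9]{6}(-N)?(\+[a-zA-Z0-9]{6}(-N)?)*$')


def parse_tracking_segment(part):
    """Hypothetical helper: return tracking fields for one filename segment.

    Returns None when the segment is not a tracking-ID segment.
    """
    if not TRACKING_SEGMENT_RE.match(part):
        return None

    ids, modes, with_suffix = [], [], []
    for tracking in part.split('+'):
        folder_only = tracking.endswith('-N')
        ids.append(tracking[:-2] if folder_only else tracking)
        modes.append('folder_only' if folder_only else 'full')
        with_suffix.append(tracking)

    return {
        # Primary (first) ID keeps the pre-existing keys populated.
        'tracking_id': ids[0],
        'tracking_mode': modes[0],
        'tracking_id_with_suffix': with_suffix[0],
        # New multi-master keys.
        'tracking_ids': ids,
        'tracking_modes': modes,
        'tracking_ids_with_suffix': with_suffix,
        'has_multiple_masters': len(ids) > 1,
    }
```

A single-ID segment goes through the same path and simply yields one-element lists, which is what keeps existing filenames working unchanged.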

---

### 2️⃣ A2→A3 Upload Script (`scripts/a2_to_a3_upload_polling.py`)

**Current Code (line ~97):**
```python
# 2. Load master metadata from database
master_asset = db.get_master_asset(tracking_id)

if not master_asset:
    raise ValueError("No master asset for tracking ID: {}".format(tracking_id))
```

**Modified Code:**
```python
# 2. Load master metadata from database (support multiple tracking IDs)
tracking_ids = parsed.get('tracking_ids', [tracking_id])  # Get all tracking IDs or fallback to single
has_multiple_masters = parsed.get('has_multiple_masters', False)

# Load all master assets
master_assets = []
master_opentext_ids = []

if has_multiple_masters:
    logger.info("Multiple master assets detected: {}".format(', '.join(tracking_ids)))

    for tid in tracking_ids:
        master = db.get_master_asset(tid)
        if not master:
            logger.warning("Master asset not found for tracking ID: {}".format(tid))
            continue
        master_assets.append(master)
        master_opentext_ids.append(master['opentext_id'])

    if not master_assets:
        raise ValueError("No master assets found for tracking IDs: {}".format(', '.join(tracking_ids)))

    # Use first master for metadata inheritance (could enhance this later)
    master_asset = master_assets[0]
    logger.info("Using primary master {} for metadata, linking all {} masters".format(
        tracking_ids[0], len(master_assets)))
else:
    # Single master (backward compatible)
    master_asset = db.get_master_asset(tracking_id)
    if not master_asset:
        raise ValueError("No master asset for tracking ID: {}".format(tracking_id))
    master_opentext_ids = [master_asset['opentext_id']]
```

**Current Code (line ~194):**
```python
asset_rep = mvp_extractor.build_mvp_asset_representation(
    master_metadata=master_asset['full_metadata'],
    clean_filename=clean_filename,
    parsed_filename=parsed,
    box_metadata=box_metadata,
    tracking_mode=tracking_mode,
    master_opentext_id=master_asset['opentext_id']  # Single ID
)
```

**Modified Code:**
```python
# Pass all master opentext IDs (support multiple)
asset_rep = mvp_extractor.build_mvp_asset_representation(
    master_metadata=master_asset['full_metadata'],
    clean_filename=clean_filename,
    parsed_filename=parsed,
    box_metadata=box_metadata,
    tracking_mode=tracking_mode,
    master_opentext_id=master_asset['opentext_id'],  # Primary for ARTESIA.FIELD.ASSET_ID
    master_opentext_ids=master_opentext_ids  # All IDs for MASTERASSETIDS field
)
```

**Key Changes:**
- Extract multiple tracking IDs from parsed data
- Look up all master assets in database
- Collect all master opentext_ids
- Pass list to metadata extractor
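
The lookup steps above can be sketched as one small function. This is only an illustration under assumptions: `collect_master_ids` is a hypothetical name, and `get_master_asset` stands in for `db.get_master_asset`, assumed to return a dict with an `opentext_id` key or a falsy value when nothing is found. Because `tracking_ids` falls back to `[tracking_id]`, the single-master and multi-master cases share the same loop here.

```python
def collect_master_ids(parsed, tracking_id, get_master_asset):
    """Hypothetical helper: return (primary_master, all_opentext_ids).

    `get_master_asset` is any callable mapping a tracking ID to a master
    record (a dict with an 'opentext_id' key) or to None.
    """
    # Older parser output has no 'tracking_ids' key, so fall back to the single ID.
    tracking_ids = parsed.get('tracking_ids', [tracking_id])

    # Keep only the tracking IDs that resolve to a master record.
    masters = [m for m in (get_master_asset(tid) for tid in tracking_ids) if m]
    if not masters:
        raise ValueError("No master assets found for tracking IDs: {}".format(
            ', '.join(tracking_ids)))

    # The first master that was found drives metadata inheritance.
    return masters[0], [m['opentext_id'] for m in masters]
```

Note that the "primary" here is the first master that actually resolved, which may not correspond to the first tracking ID in the filename if that one is missing from the database.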

---

### 3️⃣ Metadata Extractor (`scripts/shared/metadata_extractor_mvp.py`)

**Current Method Signature (line ~97):**
```python
def build_mvp_asset_representation(self, master_metadata, clean_filename,
                                   parsed_filename, box_metadata=None,
                                   tracking_mode='full', master_opentext_id=None):
```

**Modified Method Signature:**
```python
def build_mvp_asset_representation(self, master_metadata, clean_filename,
                                   parsed_filename, box_metadata=None,
                                   tracking_mode='full', master_opentext_id=None,
                                   master_opentext_ids=None):
```

**Current Code (line ~139):**
```python
if master_opentext_id:
    mvp_fields = self._add_master_asset_id_field(mvp_fields, master_opentext_id)
    logger.info("Added Master Asset ID field: {}".format(master_opentext_id))
```

**Modified Code:**
```python
# Add Master Asset ID field(s) if provided (derivative tracking).
# Prefer the list: _add_master_asset_id_field() creates the field with a single
# value, and _add_master_asset_ids_field() skips when the field already exists,
# so calling the single-ID method first would leave only 1 of N IDs.
if master_opentext_ids:
    # Add MASTERASSETIDS tabular field with all master IDs (support multiple)
    mvp_fields = self._add_master_asset_ids_field(mvp_fields, master_opentext_ids)
    logger.info("Added MASTERASSETIDS field with {} value(s)".format(len(master_opentext_ids)))
elif master_opentext_id:
    mvp_fields = self._add_master_asset_id_field(mvp_fields, master_opentext_id)
    logger.info("Added Master Asset ID field: {}".format(master_opentext_id))
```

**New Method (add after `_add_master_asset_id_field`):**
```python
def _add_master_asset_ids_field(self, mvp_fields, master_opentext_ids):
    """
    Add FERRERO.MASTERASSETIDS tabular field with multiple master asset IDs
    Supports Many-to-Many relationship between derivatives and masters

    Args:
        mvp_fields: List of MVP fields
        master_opentext_ids: List of DAM Asset IDs of master assets

    Returns:
        Updated mvp_fields list with FERRERO.MASTERASSETIDS
    """
    if not master_opentext_ids:
        logger.info("No master_opentext_ids provided - skipping FERRERO.MASTERASSETIDS field")
        return mvp_fields

    # Check if field already exists
    for field in mvp_fields:
        if self._get_field_id(field) == 'FERRERO.MASTERASSETIDS':
            logger.info("FERRERO.MASTERASSETIDS already present")
            return mvp_fields

    # Build values array with all master asset IDs
    values = []
    for master_id in master_opentext_ids:
        values.append({
            'cascading_domain_value': False,
            'domain_value': True,
            'is_locked': False,
            'value': {
                'field_value': {
                    'type': 'string',
                    'value': master_id
                },
                'type': 'com.artesia.metadata.DomainValue'
            }
        })

    # Create tabular field
    new_field = {
        'id': 'FERRERO.MASTERASSETIDS',
        'parent_table_id': 'FERRERO.TABULAR.FIELD.MASTERASSETIDS',
        'type': 'com.artesia.metadata.MetadataTableField',
        'values': values
    }

    mvp_fields.append(new_field)
    logger.info("Added FERRERO.MASTERASSETIDS field with {} master asset ID(s): {}".format(
        len(values), ', '.join(master_opentext_ids[:3]) + ('...' if len(master_opentext_ids) > 3 else '')))

    return mvp_fields
```

**Key Changes:**
- Add `master_opentext_ids` parameter (list)
- New method `_add_master_asset_ids_field` that accepts a list
- Builds `values` array with all master IDs
- Backward compatible (still uses single `master_opentext_id` for ARTESIA.FIELD.ASSET_ID)
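
The commit note for this change reports that only 1 of 3 IDs was written when the single-ID method ran before the multi-ID method, because the field then already existed. The sketch below reproduces that "first writer wins" effect in isolation and shows the list-first ordering that avoids it. It is a simplified stand-in, not the extractor itself: `add_master_asset_ids` and `add_master_links` are hypothetical names, fields are assumed to be plain dicts with an `id` key, and the single-ID path is modelled as writing a one-element list.

```python
FIELD_ID = 'FERRERO.MASTERASSETIDS'


def _domain_value(master_id):
    """One row of the tabular field, in the shape used by the new method."""
    return {
        'cascading_domain_value': False,
        'domain_value': True,
        'is_locked': False,
        'value': {
            'field_value': {'type': 'string', 'value': master_id},
            'type': 'com.artesia.metadata.DomainValue',
        },
    }


def add_master_asset_ids(fields, master_ids):
    """Append the tabular field unless one is already present (first writer wins)."""
    if not master_ids or any(f.get('id') == FIELD_ID for f in fields):
        return fields
    fields.append({
        'id': FIELD_ID,
        'parent_table_id': 'FERRERO.TABULAR.FIELD.MASTERASSETIDS',
        'type': 'com.artesia.metadata.MetadataTableField',
        'values': [_domain_value(mid) for mid in master_ids],
    })
    return fields


def add_master_links(fields, master_opentext_id=None, master_opentext_ids=None):
    """Prefer the list; fall back to the single ID only when no list is given."""
    if master_opentext_ids:
        return add_master_asset_ids(fields, master_opentext_ids)
    if master_opentext_id:
        return add_master_asset_ids(fields, [master_opentext_id])
    return fields
```

Calling `add_master_asset_ids` twice, first with one ID and then with three, leaves a single value in the field, which is the behaviour the ordering is meant to avoid.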

---

## Testing Examples

### Single Master (Backward Compatible)
**Filename:** `1234568_ROC_STRANGER-THINGS_SND_6S_16x9_REF_DE_de_BqB8vo.jpg`

**Parsed:**
```python
{
    'tracking_id': 'BqB8vo',
    'tracking_ids': ['BqB8vo'],
    'has_multiple_masters': False
}
```

**Result:** Single ID in MASTERASSETIDS field (current behavior)

---

### Multiple Masters (New Feature)
**Filename:** `1234568_ROC_STRANGER-THINGS_SND_6S_16x9_REF_DE_de_BqB8vo+SfUQ7m+laRJo0.jpg`

**Parsed:**
```python
{
    'tracking_id': 'BqB8vo',  # Primary (for backward compatibility)
    'tracking_ids': ['BqB8vo', 'SfUQ7m', 'laRJo0'],
    'has_multiple_masters': True
}
```

**Database Lookups:**
- BqB8vo → fc5c389776516bb58044c7d4bf479da458599baf
- SfUQ7m → ad3948d72ea8550a338a600ae87a1bdd1968b066
- laRJo0 → 020d76f957ec9f4ec0b18035a2d012cd3fd376c2

**Result:** 3 IDs in MASTERASSETIDS field values array

---

## Migration Path

1. **Phase 1 - Implement Code** (No Breaking Changes)
   - Add changes to all 3 files
   - Test with single tracking ID (should work exactly as before)
   - Backward compatible with existing filenames

2. **Phase 2 - Test Multiple IDs**
   - Create test file with multiple tracking IDs
   - Upload to PPR with `--dryrun`
   - Verify 3 values in MASTERASSETIDS field

3. **Phase 3 - Agency Tool Integration**
   - Agency tool generates filenames with `+` delimiter
   - Agency tool uses multiple tracking IDs when needed
   - Most files will still have single tracking ID (normal case)

4. **Phase 4 - Production Deployment**
   - Enable in PROD after testing in PPR
   - Update field in PROD DAM schema first
   - Deploy code changes
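
For the Phase 2 check ("verify 3 values in MASTERASSETIDS field"), a small helper along these lines could pull the stored IDs back out of a field list. It is a sketch under assumptions: `master_asset_id_values` is a hypothetical name, and it expects fields to be plain dicts shaped like the one built by `_add_master_asset_ids_field`. How the field list is obtained from a dry run is not covered here.

```python
def master_asset_id_values(mvp_fields):
    """Hypothetical check helper: IDs stored in FERRERO.MASTERASSETIDS, or [] if absent."""
    for field in mvp_fields:
        if field.get('id') == 'FERRERO.MASTERASSETIDS':
            # Each row nests the ID under value -> field_value -> value.
            return [row['value']['field_value']['value']
                    for row in field.get('values', [])]
    return []
```

In a test, the length and order of the returned list can then be compared against the tracking IDs in the filename.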

---

## Alternative Delimiters

If `+` causes issues, the alternatives are:

| Delimiter | Example | Notes |
|-----------|---------|-------|
| `+` | `BqB8vo+SfUQ7m` | ✅ Recommended (clear separator) |
| `,` | `BqB8vo,SfUQ7m` | ⚠️ Might conflict with CSV exports |
| `_` | `BqB8vo_SfUQ7m` | ⚠️ Already used in filename structure |
| `~` | `BqB8vo~SfUQ7m` | ✅ Alternative if + causes issues |
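
If the delimiter might change later, one option is to build the tracking-ID pattern from a single setting instead of hard-coding `\+`. The sketch below assumes a hypothetical `build_tracking_regex` helper; it is not part of the current parser. `re.escape` is used so that characters with special meaning in a regex, such as `+`, are matched literally.

```python
import re


def build_tracking_regex(delimiter='+'):
    """Hypothetical helper: compile the multi-ID pattern for a given delimiter."""
    one_id = r'[a-zA-Z0-9]{6}(?:-N)?'   # 6 alphanumerics, optional -N suffix
    sep = re.escape(delimiter)          # '+' must be escaped; '~' and ',' are harmless
    return re.compile(r'^{0}(?:{1}{0})*$'.format(one_id, sep))
```

The matching `split` call would need to use the same delimiter value so that the pattern and the splitting stay in step.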

---

## Error Handling

**What happens if one tracking ID is not found?**

```python
# Option A: Skip missing masters (log warning)
for tid in tracking_ids:
    master = db.get_master_asset(tid)
    if not master:
        logger.warning("Master asset not found for tracking ID: {}".format(tid))
        continue  # Skip this one, continue with others

# Option B: Fail entire upload (strict)
for tid in tracking_ids:
    master = db.get_master_asset(tid)
    if not master:
        raise ValueError("Master asset not found for tracking ID: {}".format(tid))
```

**Recommendation:** Use Option A (skip missing). The derivative still uploads with whichever master links were found, and the modified upload script above only raises when none of the tracking IDs resolve.
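
If both behaviours are wanted, for example lenient in normal runs and strict in a validation run, the two options could sit behind one flag. This is a sketch with assumed names: `resolve_masters` and its `strict` parameter are hypothetical, and `lookup` stands in for `db.get_master_asset`.

```python
def resolve_masters(lookup, tracking_ids, strict=False):
    """Hypothetical helper: return (found_masters, missing_tracking_ids).

    strict=False behaves like Option A (collect and skip missing IDs);
    strict=True behaves like Option B (raise on the first missing ID).
    """
    found, missing = [], []
    for tid in tracking_ids:
        master = lookup(tid)
        if master:
            found.append(master)
        elif strict:
            raise ValueError("Master asset not found for tracking ID: {}".format(tid))
        else:
            missing.append(tid)
    return found, missing
```

Returning the missing IDs alongside the found masters lets the caller log them in one place or surface them in an upload report.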

---

## Summary

**Files to Modify:**
1. `scripts/shared/filename_parser.py` - Parse multiple tracking IDs
2. `scripts/a2_to_a3_upload_polling.py` - Look up multiple masters
3. `scripts/shared/metadata_extractor_mvp.py` - Add all IDs to field

**Backward Compatible:** ✅ Yes - existing single-ID filenames work exactly as before

**Ready to Implement:** This document provides all code changes needed.