ferrero-opentext/Python-Version/OPTION1_MULTIPLE_TRACKING_IDS.md
nickviljoen 444ac7ac6d Fix: PPR multiple master asset IDs now correctly populate MASTERASSETIDS field
Fixed issue where only 1 of 3 master asset IDs was being added to the
FERRERO.MASTERASSETIDS tabular field. The bug was caused by calling
_add_master_asset_id_field() before _add_master_asset_ids_field(),
which created the field with a single value and blocked the multi-value
method from adding all IDs.

Changes:
- metadata_extractor_mvp.py: Prioritize master_opentext_ids parameter
  using if/elif logic to prevent single-ID method from blocking multi-ID
- a2_to_a3_upload_polling.py: Load multiple master assets in PPR mode
- filename_parser.py: Parse multiple tracking IDs (e.g., ID1+ID2+ID3)
- query_db.py: Fix .env loading path
- Added documentation and test files for multiple master asset IDs

Tested in PPR with 3 tracking IDs (BqB8vo+SfUQ7m+laRJo0) - all 3 master
asset IDs now correctly appear in the metadata structure.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-03 21:02:09 +02:00

# Option 1: Multiple Tracking IDs in Filename - Implementation Guide
## Overview
Allow a single derivative/localized asset to reference multiple master assets by including multiple tracking IDs in the filename.
**Example Filename:**
```
1234568_ROC_STRANGER-THINGS_SND_6S_16x9_REF_DE_de_BqB8vo+SfUQ7m+laRJo0.jpg
                                                  ^^^^^^^^^^^^^^^^^^^^
                                                  Multiple tracking IDs
```
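**Delimiter:** Use `+` to separate multiple tracking IDs. `,` and `~` are possible alternatives; `_` is already the filename's part separator, so it is a poor fit (see Alternative Delimiters).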
---
## Changes Required
### 1️⃣ Filename Parser (`scripts/shared/filename_parser.py`)
**Current Code (line ~182):**
```python
# Tracking ID: 6 alphanumeric, optionally with -N suffix
elif re.match(r'^[a-zA-Z0-9]{6}(-N)?$', part):
    tracking = part
    tracking_mode = 'full'
    base_tracking_id = tracking
    if tracking.endswith('-N'):
        tracking_mode = 'folder_only'
        base_tracking_id = tracking[:-2]  # Strip -N suffix
    parsed['tracking_id'] = base_tracking_id
    parsed['tracking_mode'] = tracking_mode
    parsed['tracking_id_with_suffix'] = tracking
    logger.debug("Found tracking ID: {}".format(tracking))
    index += 1
```
**Modified Code:**
```python
# Tracking ID(s): 6 alphanumeric, optionally with -N suffix
# Supports multiple IDs separated by + (e.g., "BqB8vo+SfUQ7m+laRJo0")
elif re.match(r'^[a-zA-Z0-9]{6}(-N)?(\+[a-zA-Z0-9]{6}(-N)?)*$', part):
    tracking_ids = []
    tracking_modes = []
    tracking_ids_with_suffix = []

    # Split by + delimiter to get all tracking IDs
    id_parts = part.split('+')
    for tracking in id_parts:
        tracking_mode = 'full'
        base_tracking_id = tracking
        if tracking.endswith('-N'):
            tracking_mode = 'folder_only'
            base_tracking_id = tracking[:-2]  # Strip -N suffix
            logger.info("Detected folder-only tracking ID: {} (base: {})".format(tracking, base_tracking_id))
        tracking_ids.append(base_tracking_id)
        tracking_modes.append(tracking_mode)
        tracking_ids_with_suffix.append(tracking)

    # Store primary (first) tracking ID for backward compatibility
    parsed['tracking_id'] = tracking_ids[0]
    parsed['tracking_mode'] = tracking_modes[0]
    parsed['tracking_id_with_suffix'] = tracking_ids_with_suffix[0]

    # Store all tracking IDs for multi-master support
    parsed['tracking_ids'] = tracking_ids
    parsed['tracking_modes'] = tracking_modes
    parsed['tracking_ids_with_suffix'] = tracking_ids_with_suffix
    parsed['has_multiple_masters'] = len(tracking_ids) > 1
    logger.debug("Found {} tracking ID(s): {}".format(len(tracking_ids), ', '.join(tracking_ids)))
    index += 1
```
**Key Changes:**
- Updated regex to match multiple IDs: `^[a-zA-Z0-9]{6}(-N)?(\+[a-zA-Z0-9]{6}(-N)?)*$`
- Split on `+` delimiter
- Store primary ID for backward compatibility
- Add new fields: `tracking_ids`, `tracking_modes`, `tracking_ids_with_suffix`, `has_multiple_masters`
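
As a quick check of the pattern above, the following standalone sketch applies the same regex and split logic to a single filename part. The function name `parse_tracking_part` is hypothetical and not part of `filename_parser.py`; it only mirrors the fields described above.

```python
import re

# Same pattern as in the modified parser: one or more 6-character IDs,
# each optionally suffixed with -N, joined by '+'.
TRACKING_PATTERN = re.compile(r'^[a-zA-Z0-9]{6}(-N)?(\+[a-zA-Z0-9]{6}(-N)?)*$')


def parse_tracking_part(part):
    """Hypothetical helper: return the tracking fields for one filename part, or None."""
    if not TRACKING_PATTERN.match(part):
        return None
    ids_with_suffix = part.split('+')
    ids = [t[:-2] if t.endswith('-N') else t for t in ids_with_suffix]
    modes = ['folder_only' if t.endswith('-N') else 'full' for t in ids_with_suffix]
    return {
        # Primary (first) ID, kept for backward compatibility
        'tracking_id': ids[0],
        'tracking_mode': modes[0],
        'tracking_id_with_suffix': ids_with_suffix[0],
        # Full lists for multi-master support
        'tracking_ids': ids,
        'tracking_modes': modes,
        'tracking_ids_with_suffix': ids_with_suffix,
        'has_multiple_masters': len(ids) > 1,
    }


print(parse_tracking_part('BqB8vo+SfUQ7m-N+laRJo0'))
```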
---
### 2️⃣ A2→A3 Upload Script (`scripts/a2_to_a3_upload_polling.py`)
**Current Code (line ~97):**
```python
# 2. Load master metadata from database
master_asset = db.get_master_asset(tracking_id)
if not master_asset:
    raise ValueError("No master asset for tracking ID: {}".format(tracking_id))
```
**Modified Code:**
```python
# 2. Load master metadata from database (support multiple tracking IDs)
tracking_ids = parsed.get('tracking_ids', [tracking_id])  # All tracking IDs, or fall back to the single one
has_multiple_masters = parsed.get('has_multiple_masters', False)

# Load all master assets
master_assets = []
master_opentext_ids = []
found_tracking_ids = []  # Tracking IDs that resolved, in the same order as master_assets

if has_multiple_masters:
    logger.info("Multiple master assets detected: {}".format(', '.join(tracking_ids)))
    for tid in tracking_ids:
        master = db.get_master_asset(tid)
        if not master:
            logger.warning("Master asset not found for tracking ID: {}".format(tid))
            continue
        master_assets.append(master)
        master_opentext_ids.append(master['opentext_id'])
        found_tracking_ids.append(tid)

    if not master_assets:
        raise ValueError("No master assets found for tracking IDs: {}".format(', '.join(tracking_ids)))

    # Use the first master that was found for metadata inheritance (could enhance this later).
    # Log found_tracking_ids[0] rather than tracking_ids[0]: the first ID may have been skipped.
    master_asset = master_assets[0]
    logger.info("Using primary master {} for metadata, linking all {} masters".format(
        found_tracking_ids[0], len(master_assets)))
else:
    # Single master (backward compatible)
    master_asset = db.get_master_asset(tracking_id)
    if not master_asset:
        raise ValueError("No master asset for tracking ID: {}".format(tracking_id))
    master_opentext_ids = [master_asset['opentext_id']]
```
**Current Code (line ~194):**
```python
asset_rep = mvp_extractor.build_mvp_asset_representation(
    master_metadata=master_asset['full_metadata'],
    clean_filename=clean_filename,
    parsed_filename=parsed,
    box_metadata=box_metadata,
    tracking_mode=tracking_mode,
    master_opentext_id=master_asset['opentext_id']  # Single ID
)
```
**Modified Code:**
```python
# Pass all master opentext IDs (support multiple)
asset_rep = mvp_extractor.build_mvp_asset_representation(
    master_metadata=master_asset['full_metadata'],
    clean_filename=clean_filename,
    parsed_filename=parsed,
    box_metadata=box_metadata,
    tracking_mode=tracking_mode,
    master_opentext_id=master_asset['opentext_id'],  # Primary for ARTESIA.FIELD.ASSET_ID
    master_opentext_ids=master_opentext_ids          # All IDs for MASTERASSETIDS field
)
```
**Key Changes:**
- Extract multiple tracking IDs from parsed data
- Look up all master assets in database
- Collect all master opentext_ids
- Pass list to metadata extractor
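
To exercise this lookup flow without a real database, a small stand-in can be used. In the sketch below, `FakeDB` and `load_masters` are hypothetical names; only `get_master_asset` and the `opentext_id` key come from the code above, and the stored IDs are placeholders.

```python
class FakeDB:
    """Hypothetical in-memory stand-in for the real database wrapper."""

    def __init__(self, rows):
        self._rows = rows

    def get_master_asset(self, tracking_id):
        # Assumed contract: a dict for a known tracking ID, None otherwise.
        return self._rows.get(tracking_id)


def load_masters(db, parsed, tracking_id):
    """Return (primary master, all master opentext IDs) for a parsed filename."""
    # Old-style parsed dicts have no 'tracking_ids'; fall back to the single ID.
    tracking_ids = parsed.get('tracking_ids', [tracking_id])
    masters = [m for m in (db.get_master_asset(t) for t in tracking_ids) if m]
    if not masters:
        raise ValueError("No master assets found for tracking IDs: {}".format(
            ', '.join(tracking_ids)))
    return masters[0], [m['opentext_id'] for m in masters]


db = FakeDB({
    'BqB8vo': {'opentext_id': 'id-1', 'full_metadata': {}},
    'laRJo0': {'opentext_id': 'id-3', 'full_metadata': {}},
})
parsed = {'tracking_id': 'BqB8vo', 'tracking_ids': ['BqB8vo', 'SfUQ7m', 'laRJo0']}

# 'SfUQ7m' is deliberately missing, so only two master IDs are collected.
primary, all_ids = load_masters(db, parsed, 'BqB8vo')
print(primary['opentext_id'], all_ids)
```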
---
### 3️⃣ Metadata Extractor (`scripts/shared/metadata_extractor_mvp.py`)
**Current Method Signature (line ~97):**
```python
def build_mvp_asset_representation(self, master_metadata, clean_filename,
                                   parsed_filename, box_metadata=None,
                                   tracking_mode='full', master_opentext_id=None):
```
**Modified Method Signature:**
```python
def build_mvp_asset_representation(self, master_metadata, clean_filename,
                                   parsed_filename, box_metadata=None,
                                   tracking_mode='full', master_opentext_id=None,
                                   master_opentext_ids=None):
```
**Current Code (line ~139):**
```python
if master_opentext_id:
    mvp_fields = self._add_master_asset_id_field(mvp_fields, master_opentext_id)
    logger.info("Added Master Asset ID field: {}".format(master_opentext_id))
```
**Modified Code:**
```python
# Add Master Asset ID field(s) if provided (derivative tracking).
# Prefer the multi-ID list: per the fix described in the commit message, calling
# _add_master_asset_id_field() first creates the field with a single value and
# blocks _add_master_asset_ids_field() from adding the remaining IDs.
if master_opentext_ids:
    mvp_fields = self._add_master_asset_ids_field(mvp_fields, master_opentext_ids)
    logger.info("Added MASTERASSETIDS field with {} value(s)".format(len(master_opentext_ids)))
elif master_opentext_id:
    # Fallback for callers that pass only a single master ID
    mvp_fields = self._add_master_asset_id_field(mvp_fields, master_opentext_id)
    logger.info("Added Master Asset ID field: {}".format(master_opentext_id))
```
**New Method (add after `_add_master_asset_id_field`):**
```python
def _add_master_asset_ids_field(self, mvp_fields, master_opentext_ids):
    """
    Add FERRERO.MASTERASSETIDS tabular field with multiple master asset IDs.

    Supports Many-to-Many relationship between derivatives and masters.

    Args:
        mvp_fields: List of MVP fields
        master_opentext_ids: List of DAM Asset IDs of master assets

    Returns:
        Updated mvp_fields list with FERRERO.MASTERASSETIDS
    """
    if not master_opentext_ids:
        logger.info("No master_opentext_ids provided - skipping FERRERO.MASTERASSETIDS field")
        return mvp_fields

    # Check if field already exists
    for field in mvp_fields:
        if self._get_field_id(field) == 'FERRERO.MASTERASSETIDS':
            logger.info("FERRERO.MASTERASSETIDS already present")
            return mvp_fields

    # Build values array with all master asset IDs
    values = []
    for master_id in master_opentext_ids:
        values.append({
            'cascading_domain_value': False,
            'domain_value': True,
            'is_locked': False,
            'value': {
                'field_value': {
                    'type': 'string',
                    'value': master_id
                },
                'type': 'com.artesia.metadata.DomainValue'
            }
        })

    # Create tabular field
    new_field = {
        'id': 'FERRERO.MASTERASSETIDS',
        'parent_table_id': 'FERRERO.TABULAR.FIELD.MASTERASSETIDS',
        'type': 'com.artesia.metadata.MetadataTableField',
        'values': values
    }
    mvp_fields.append(new_field)

    # Log at most the first three IDs to keep the message short
    logger.info("Added FERRERO.MASTERASSETIDS field with {} master asset ID(s): {}".format(
        len(values), ', '.join(master_opentext_ids[:3]) + ('...' if len(master_opentext_ids) > 3 else '')))
    return mvp_fields
```
**Key Changes:**
- Add `master_opentext_ids` parameter (list)
- New method `_add_master_asset_ids_field` that accepts a list
- Builds `values` array with all master IDs
- Backward compatible (still uses single `master_opentext_id` for ARTESIA.FIELD.ASSET_ID)
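
The "already present" guard in the new method is what makes call order matter: if the field exists before the list is processed, nothing more is added. The sketch below reproduces that effect in isolation. `add_master_asset_ids_field` and `make_value` are hypothetical free functions, fields are assumed to be plain dictionaries with an `id` key (the real code goes through `self._get_field_id`), and the pre-created single-value field is an assumption about what the single-ID method leaves behind, based on the commit description.

```python
FIELD_ID = 'FERRERO.MASTERASSETIDS'


def make_value(master_id):
    """Wrap one master asset ID in the value structure used above."""
    return {
        'cascading_domain_value': False,
        'domain_value': True,
        'is_locked': False,
        'value': {
            'field_value': {'type': 'string', 'value': master_id},
            'type': 'com.artesia.metadata.DomainValue',
        },
    }


def add_master_asset_ids_field(fields, master_ids):
    """Hypothetical free-function version of _add_master_asset_ids_field."""
    if not master_ids:
        return fields
    if any(f.get('id') == FIELD_ID for f in fields):
        return fields  # Field already present: nothing is added.
    fields.append({
        'id': FIELD_ID,
        'parent_table_id': 'FERRERO.TABULAR.FIELD.MASTERASSETIDS',
        'type': 'com.artesia.metadata.MetadataTableField',
        'values': [make_value(m) for m in master_ids],
    })
    return fields


ids = ['id-1', 'id-2', 'id-3']  # placeholder IDs

# Multi-ID list handled first: all three values are written.
fresh = add_master_asset_ids_field([], ids)
print(len(fresh[0]['values']))

# Field pre-created with one value (assumed effect of the single-ID method):
# the guard returns early and the other two IDs are never added.
blocked = add_master_asset_ids_field(
    [{'id': FIELD_ID, 'values': [make_value('id-1')]}], ids)
print(len(blocked[0]['values']))
```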
---
## Testing Examples
### Single Master (Backward Compatible)
**Filename:** `1234568_ROC_STRANGER-THINGS_SND_6S_16x9_REF_DE_de_BqB8vo.jpg`
**Parsed:**
```python
{
    'tracking_id': 'BqB8vo',
    'tracking_ids': ['BqB8vo'],
    'has_multiple_masters': False
}
```
**Result:** Single ID in MASTERASSETIDS field (current behavior)
---
### Multiple Masters (New Feature)
**Filename:** `1234568_ROC_STRANGER-THINGS_SND_6S_16x9_REF_DE_de_BqB8vo+SfUQ7m+laRJo0.jpg`
**Parsed:**
```python
{
    'tracking_id': 'BqB8vo',  # Primary (for backward compatibility)
    'tracking_ids': ['BqB8vo', 'SfUQ7m', 'laRJo0'],
    'has_multiple_masters': True
}
```
**Database Lookups:**
- BqB8vo → fc5c389776516bb58044c7d4bf479da458599baf
- SfUQ7m → ad3948d72ea8550a338a600ae87a1bdd1968b066
- laRJo0 → 020d76f957ec9f4ec0b18035a2d012cd3fd376c2
**Result:** 3 IDs in MASTERASSETIDS field values array
---
## Migration Path
1. **Phase 1 - Implement Code** (No Breaking Changes)
   - Add changes to all 3 files
   - Test with single tracking ID (should work exactly as before)
   - Backward compatible with existing filenames
2. **Phase 2 - Test Multiple IDs**
   - Create test file with multiple tracking IDs
   - Upload to PPR with `--dryrun`
   - Verify 3 values in MASTERASSETIDS field
3. **Phase 3 - Agency Tool Integration**
   - Agency tool generates filenames with `+` delimiter
   - Agency tool uses multiple tracking IDs when needed
   - Most files will still have single tracking ID (normal case)
4. **Phase 4 - Production Deployment**
   - Enable in PROD after testing in PPR
   - Update field in PROD DAM schema first
   - Deploy code changes
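
The Phase 2 check ("verify 3 values in MASTERASSETIDS") can be automated against a list of metadata fields. This is a sketch only: `master_ids_in` and `wrap` are hypothetical helpers, and the fields are assumed to be plain dictionaries shaped like the ones built by `_add_master_asset_ids_field`. How the fields are captured from a `--dryrun` is not covered here.

```python
def master_ids_in(mvp_fields):
    """Hypothetical helper: list the IDs stored in FERRERO.MASTERASSETIDS."""
    for field in mvp_fields:
        if field.get('id') == 'FERRERO.MASTERASSETIDS':
            return [v['value']['field_value']['value'] for v in field.get('values', [])]
    return []  # Field not present


def wrap(master_id):
    # Minimal value entry; only the keys read by master_ids_in are included.
    return {'value': {'field_value': {'type': 'string', 'value': master_id}}}


sample_fields = [
    {'id': 'SOME.OTHER.FIELD', 'value': 'ignored'},  # placeholder field
    {'id': 'FERRERO.MASTERASSETIDS',
     'values': [wrap('id-1'), wrap('id-2'), wrap('id-3')]},
]

found = master_ids_in(sample_fields)
assert len(found) == 3, "expected 3 master asset IDs, got {}".format(len(found))
print(found)
```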
---
## Alternative Delimiters
If `+` causes issues, alternatives:
| Delimiter | Example | Notes |
|-----------|---------|-------|
| `+` | `BqB8vo+SfUQ7m` | ✅ Recommended (clear separator) |
| `,` | `BqB8vo,SfUQ7m` | ⚠️ Might conflict with CSV exports |
| `_` | `BqB8vo_SfUQ7m` | ⚠️ Already used in filename structure |
| `~` | `BqB8vo~SfUQ7m` | ✅ Alternative if + causes issues |
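
If the delimiter may change later, the pattern can be generated from a single constant rather than hard-coded. A minimal sketch, assuming a hypothetical `build_tracking_pattern` helper; `re.escape` keeps characters such as `+` literal inside the pattern.

```python
import re


def build_tracking_pattern(delimiter='+'):
    """Hypothetical helper: compile the multi-ID pattern for a given delimiter."""
    one_id = r'[a-zA-Z0-9]{6}(?:-N)?'
    return re.compile(r'^{id}(?:{sep}{id})*$'.format(
        id=one_id, sep=re.escape(delimiter)))


for delim in ('+', '~'):
    pattern = build_tracking_pattern(delim)
    part = delim.join(['BqB8vo', 'SfUQ7m'])
    # The same constant drives both the match and the split.
    print(delim, bool(pattern.match(part)), part.split(delim))
```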
---
## Error Handling
**What happens if one tracking ID is not found?**
```python
# Option A: Skip missing masters (log warning)
for tid in tracking_ids:
    master = db.get_master_asset(tid)
    if not master:
        logger.warning("Master asset not found for tracking ID: {}".format(tid))
        continue  # Skip this one, continue with others

# Option B: Fail entire upload (strict)
for tid in tracking_ids:
    master = db.get_master_asset(tid)
    if not master:
        raise ValueError("Master asset not found for tracking ID: {}".format(tid))
```
**Recommendation:** Use Option A (skip missing). The derivative still uploads with links to the masters that were found; the upload fails only when none of the tracking IDs resolves.
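
Both options can live behind one switch, so the behaviour becomes a configuration choice rather than a code change. In this sketch, `resolve_masters` and its `strict` parameter are hypothetical; `lookup` stands in for `db.get_master_asset`.

```python
import logging

logger = logging.getLogger(__name__)


def resolve_masters(tracking_ids, lookup, strict=False):
    """Return (found masters, missing tracking IDs).

    strict=False follows Option A (skip and warn); strict=True follows Option B.
    """
    found, missing = [], []
    for tid in tracking_ids:
        master = lookup(tid)
        if master:
            found.append(master)
            continue
        if strict:
            raise ValueError("Master asset not found for tracking ID: {}".format(tid))
        logger.warning("Master asset not found for tracking ID: %s", tid)
        missing.append(tid)
    if not found:
        # Even in lenient mode, an upload with no master at all is rejected.
        raise ValueError("No master assets found for tracking IDs: {}".format(
            ', '.join(tracking_ids)))
    return found, missing


known = {'BqB8vo': {'opentext_id': 'id-1'}}  # placeholder data
found, missing = resolve_masters(['BqB8vo', 'SfUQ7m'], known.get)
print(len(found), missing)
```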
---
## Summary
**Files to Modify:**
1. `scripts/shared/filename_parser.py` - Parse multiple tracking IDs
2. `scripts/a2_to_a3_upload_polling.py` - Look up multiple masters
3. `scripts/shared/metadata_extractor_mvp.py` - Add all IDs to field
**Backward Compatible:** ✅ Yes - existing single-ID filenames work exactly as before
**Ready to Implement:** This document provides all code changes needed.