ferrero-opentext/Python-Version/OPTION1_MULTIPLE_TRACKING_IDS.md
nickviljoen 444ac7ac6d Fix: PPR multiple master asset IDs now correctly populate MASTERASSETIDS field
Fixed issue where only 1 of 3 master asset IDs was being added to the
FERRERO.MASTERASSETIDS tabular field. The bug was caused by calling
_add_master_asset_id_field() before _add_master_asset_ids_field(),
which created the field with a single value and blocked the multi-value
method from adding all IDs.

Changes:
- metadata_extractor_mvp.py: Prioritize master_opentext_ids parameter
  using if/elif logic to prevent single-ID method from blocking multi-ID
- a2_to_a3_upload_polling.py: Load multiple master assets in PPR mode
- filename_parser.py: Parse multiple tracking IDs (e.g., ID1+ID2+ID3)
- query_db.py: Fix .env loading path
- Added documentation and test files for multiple master asset IDs

Tested in PPR with 3 tracking IDs (BqB8vo+SfUQ7m+laRJo0) - all 3 master
asset IDs now correctly appear in the metadata structure.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-03 21:02:09 +02:00


Option 1: Multiple Tracking IDs in Filename - Implementation Guide

Overview

Allow a single derivative/localized asset to reference multiple master assets by including multiple tracking IDs in the filename.

Example Filename:

1234568_ROC_STRANGER-THINGS_SND_6S_16x9_REF_DE_de_BqB8vo+SfUQ7m+laRJo0.jpg
                                                          ^^^^^^^^^^^^^^^^^
                                                          Multiple tracking IDs

Delimiter: Use + to separate multiple tracking IDs (see Alternative Delimiters for other candidates; _ is already used as the filename field separator)


Changes Required

1. Filename Parser (scripts/shared/filename_parser.py)

Current Code (line ~182):

# Tracking ID: 6 alphanumeric, optionally with -N suffix
elif re.match(r'^[a-zA-Z0-9]{6}(-N)?$', part):
    tracking = part
    tracking_mode = 'full'
    base_tracking_id = tracking

    if tracking.endswith('-N'):
        tracking_mode = 'folder_only'
        base_tracking_id = tracking[:-2]  # Strip -N suffix

    parsed['tracking_id'] = base_tracking_id
    parsed['tracking_mode'] = tracking_mode
    parsed['tracking_id_with_suffix'] = tracking
    logger.debug("Found tracking ID: {}".format(tracking))
    index += 1

Modified Code:

# Tracking ID(s): 6 alphanumeric, optionally with -N suffix
# Supports multiple IDs separated by + (e.g., "BqB8vo+SfUQ7m+laRJo0")
elif re.match(r'^[a-zA-Z0-9]{6}(-N)?(\+[a-zA-Z0-9]{6}(-N)?)*$', part):
    tracking_ids = []
    tracking_modes = []
    tracking_ids_with_suffix = []

    # Split by + delimiter to get all tracking IDs
    id_parts = part.split('+')

    for tracking in id_parts:
        tracking_mode = 'full'
        base_tracking_id = tracking

        if tracking.endswith('-N'):
            tracking_mode = 'folder_only'
            base_tracking_id = tracking[:-2]  # Strip -N suffix
            logger.info("Detected folder-only tracking ID: {} (base: {})".format(tracking, base_tracking_id))

        tracking_ids.append(base_tracking_id)
        tracking_modes.append(tracking_mode)
        tracking_ids_with_suffix.append(tracking)

    # Store primary (first) tracking ID for backward compatibility
    parsed['tracking_id'] = tracking_ids[0]
    parsed['tracking_mode'] = tracking_modes[0]
    parsed['tracking_id_with_suffix'] = tracking_ids_with_suffix[0]

    # Store all tracking IDs for multi-master support
    parsed['tracking_ids'] = tracking_ids
    parsed['tracking_modes'] = tracking_modes
    parsed['tracking_ids_with_suffix'] = tracking_ids_with_suffix
    parsed['has_multiple_masters'] = len(tracking_ids) > 1

    logger.debug("Found {} tracking ID(s): {}".format(len(tracking_ids), ', '.join(tracking_ids)))
    index += 1

Key Changes:

  • Updated regex to match multiple IDs: ^[a-zA-Z0-9]{6}(-N)?(\+[a-zA-Z0-9]{6}(-N)?)*$
  • Split on + delimiter
  • Store primary ID for backward compatibility
  • Add new fields: tracking_ids, has_multiple_masters
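
For quick testing outside the parser, the branch above can be sketched as a standalone function. This is a minimal sketch, not the parser itself: the function name parse_tracking_part and the constant TRACKING_PART_RE are hypothetical, and the surrounding loop (parsed, index, logger) is left out.

```python
import re

# Same pattern as the modified parser branch: one or more 6-character IDs,
# each optionally suffixed with -N, joined by "+".
TRACKING_PART_RE = re.compile(r'^[a-zA-Z0-9]{6}(-N)?(\+[a-zA-Z0-9]{6}(-N)?)*$')


def parse_tracking_part(part):
    """Return the tracking fields for one filename part, or None if it does not match."""
    if not TRACKING_PART_RE.match(part):
        return None

    ids, modes, with_suffix = [], [], []
    for tracking in part.split('+'):
        folder_only = tracking.endswith('-N')
        ids.append(tracking[:-2] if folder_only else tracking)  # strip -N suffix
        modes.append('folder_only' if folder_only else 'full')
        with_suffix.append(tracking)

    return {
        # Primary (first) ID, kept for backward compatibility
        'tracking_id': ids[0],
        'tracking_mode': modes[0],
        'tracking_id_with_suffix': with_suffix[0],
        # All IDs, for multi-master support
        'tracking_ids': ids,
        'tracking_modes': modes,
        'tracking_ids_with_suffix': with_suffix,
        'has_multiple_masters': len(ids) > 1,
    }


single = parse_tracking_part('BqB8vo')
multi = parse_tracking_part('BqB8vo+SfUQ7m-N+laRJo0')
```

Here single carries one ID with has_multiple_masters set to False, while multi carries three IDs, the second in folder_only mode. A part with a trailing or doubled + (for example BqB8vo+) does not match the pattern and returns None.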

2. A2→A3 Upload Script (scripts/a2_to_a3_upload_polling.py)

Current Code (line ~97):

# 2. Load master metadata from database
master_asset = db.get_master_asset(tracking_id)

if not master_asset:
    raise ValueError("No master asset for tracking ID: {}".format(tracking_id))

Modified Code:

# 2. Load master metadata from database (support multiple tracking IDs)
tracking_ids = parsed.get('tracking_ids', [tracking_id])  # Get all tracking IDs or fallback to single
has_multiple_masters = parsed.get('has_multiple_masters', False)

# Load all master assets
master_assets = []
master_opentext_ids = []

if has_multiple_masters:
    logger.info("Multiple master assets detected: {}".format(', '.join(tracking_ids)))

    found_tracking_ids = []  # Tracking IDs that actually resolved to a master

    for tid in tracking_ids:
        master = db.get_master_asset(tid)
        if not master:
            logger.warning("Master asset not found for tracking ID: {}".format(tid))
            continue
        master_assets.append(master)
        master_opentext_ids.append(master['opentext_id'])
        found_tracking_ids.append(tid)

    if not master_assets:
        raise ValueError("No master assets found for tracking IDs: {}".format(', '.join(tracking_ids)))

    # Use the first master that was found for metadata inheritance (could enhance this later).
    # This is not necessarily tracking_ids[0]: that ID may have been skipped as missing.
    master_asset = master_assets[0]
    logger.info("Using primary master {} for metadata, linking all {} masters".format(
        found_tracking_ids[0], len(master_assets)))
else:
    # Single master (backward compatible)
    master_asset = db.get_master_asset(tracking_id)
    if not master_asset:
        raise ValueError("No master asset for tracking ID: {}".format(tracking_id))
    master_opentext_ids = [master_asset['opentext_id']]

Current Code (line ~194):

asset_rep = mvp_extractor.build_mvp_asset_representation(
    master_metadata=master_asset['full_metadata'],
    clean_filename=clean_filename,
    parsed_filename=parsed,
    box_metadata=box_metadata,
    tracking_mode=tracking_mode,
    master_opentext_id=master_asset['opentext_id']  # Single ID
)

Modified Code:

# Pass all master opentext IDs (support multiple)
asset_rep = mvp_extractor.build_mvp_asset_representation(
    master_metadata=master_asset['full_metadata'],
    clean_filename=clean_filename,
    parsed_filename=parsed,
    box_metadata=box_metadata,
    tracking_mode=tracking_mode,
    master_opentext_id=master_asset['opentext_id'],  # Primary for ARTESIA.FIELD.ASSET_ID
    master_opentext_ids=master_opentext_ids  # All IDs for MASTERASSETIDS field
)

Key Changes:

  • Extract multiple tracking IDs from parsed data
  • Look up all master assets in database
  • Collect all master opentext_ids
  • Pass list to metadata extractor

3. Metadata Extractor (scripts/shared/metadata_extractor_mvp.py)

Current Method Signature (line ~97):

def build_mvp_asset_representation(self, master_metadata, clean_filename,
                                    parsed_filename, box_metadata=None,
                                    tracking_mode='full', master_opentext_id=None):

Modified Method Signature:

def build_mvp_asset_representation(self, master_metadata, clean_filename,
                                    parsed_filename, box_metadata=None,
                                    tracking_mode='full', master_opentext_id=None,
                                    master_opentext_ids=None):

Current Code (line ~139):

if master_opentext_id:
    mvp_fields = self._add_master_asset_id_field(mvp_fields, master_opentext_id)
    logger.info("Added Master Asset ID field: {}".format(master_opentext_id))

Modified Code:

# Add Master Asset ID field(s) if provided (derivative tracking).
# Prefer the list: calling the single-ID method first creates MASTERASSETIDS
# with one value, and the "already present" check then blocks the full list.
if master_opentext_ids:
    mvp_fields = self._add_master_asset_ids_field(mvp_fields, master_opentext_ids)
    logger.info("Added MASTERASSETIDS field with {} value(s)".format(len(master_opentext_ids)))
elif master_opentext_id:
    mvp_fields = self._add_master_asset_id_field(mvp_fields, master_opentext_id)
    logger.info("Added Master Asset ID field: {}".format(master_opentext_id))

New Method (add after _add_master_asset_id_field):

def _add_master_asset_ids_field(self, mvp_fields, master_opentext_ids):
    """
    Add FERRERO.MASTERASSETIDS tabular field with multiple master asset IDs
    Supports Many-to-Many relationship between derivatives and masters

    Args:
        mvp_fields: List of MVP fields
        master_opentext_ids: List of DAM Asset IDs of master assets

    Returns:
        Updated mvp_fields list with FERRERO.MASTERASSETIDS
    """
    if not master_opentext_ids or len(master_opentext_ids) == 0:
        logger.info("No master_opentext_ids provided - skipping FERRERO.MASTERASSETIDS field")
        return mvp_fields

    # Check if field already exists
    for field in mvp_fields:
        if self._get_field_id(field) == 'FERRERO.MASTERASSETIDS':
            logger.info("FERRERO.MASTERASSETIDS already present")
            return mvp_fields

    # Build values array with all master asset IDs
    values = []
    for master_id in master_opentext_ids:
        values.append({
            'cascading_domain_value': False,
            'domain_value': True,
            'is_locked': False,
            'value': {
                'field_value': {
                    'type': 'string',
                    'value': master_id
                },
                'type': 'com.artesia.metadata.DomainValue'
            }
        })

    # Create tabular field
    new_field = {
        'id': 'FERRERO.MASTERASSETIDS',
        'parent_table_id': 'FERRERO.TABULAR.FIELD.MASTERASSETIDS',
        'type': 'com.artesia.metadata.MetadataTableField',
        'values': values
    }

    mvp_fields.append(new_field)
    logger.info("Added FERRERO.MASTERASSETIDS field with {} master asset ID(s): {}".format(
        len(values), ', '.join(master_opentext_ids[:3]) + ('...' if len(master_opentext_ids) > 3 else '')))

    return mvp_fields

Key Changes:

  • Add master_opentext_ids parameter (list)
  • New method _add_master_asset_ids_field that accepts a list
  • Builds values array with all master IDs
  • Backward compatible (callers that pass only master_opentext_id still take the single-ID path)
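
The commit message at the top describes why call order matters: once the field exists with one value, the "already present" check stops the multi-value method from adding the rest. The sketch below reproduces that effect with simplified field dicts. The helper names add_ids_field and add_master_fields are hypothetical stand-ins, and it assumes, as the commit message states, that the single-ID method writes into the same FERRERO.MASTERASSETIDS field.

```python
FIELD_ID = 'FERRERO.MASTERASSETIDS'


def add_ids_field(fields, master_ids):
    """Simplified stand-in for _add_master_asset_ids_field: append once, never overwrite."""
    if not master_ids or any(f['id'] == FIELD_ID for f in fields):
        return fields  # nothing to add, or the field is already present
    fields.append({'id': FIELD_ID, 'values': list(master_ids)})
    return fields


def add_master_fields(fields, master_opentext_id=None, master_opentext_ids=None):
    """Prefer the list; fall back to the single ID only when no list is given."""
    if master_opentext_ids:
        return add_ids_field(fields, master_opentext_ids)
    elif master_opentext_id:
        return add_ids_field(fields, [master_opentext_id])
    return fields


# Buggy order: the single ID is written first, so the full list is blocked.
blocked = add_ids_field(add_ids_field([], ['id-a']), ['id-a', 'id-b', 'id-c'])

# if/elif order: the list wins and all three IDs are stored.
fixed = add_master_fields([], 'id-a', ['id-a', 'id-b', 'id-c'])
```

In the blocked case the field ends up with one value; with the if/elif prioritization it holds all three.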

Testing Examples

Single Master (Backward Compatible)

Filename: 1234568_ROC_STRANGER-THINGS_SND_6S_16x9_REF_DE_de_BqB8vo.jpg

Parsed:

{
    'tracking_id': 'BqB8vo',
    'tracking_ids': ['BqB8vo'],
    'has_multiple_masters': False
}

Result: Single ID in MASTERASSETIDS field (current behavior)


Multiple Masters (New Feature)

Filename: 1234568_ROC_STRANGER-THINGS_SND_6S_16x9_REF_DE_de_BqB8vo+SfUQ7m+laRJo0.jpg

Parsed:

{
    'tracking_id': 'BqB8vo',  # Primary (for backward compatibility)
    'tracking_ids': ['BqB8vo', 'SfUQ7m', 'laRJo0'],
    'has_multiple_masters': True
}

Database Lookups:

  • BqB8vo → fc5c389776516bb58044c7d4bf479da458599baf
  • SfUQ7m → ad3948d72ea8550a338a600ae87a1bdd1968b066
  • laRJo0 → 020d76f957ec9f4ec0b18035a2d012cd3fd376c2

Result: 3 IDs in MASTERASSETIDS field values array


Migration Path

  1. Phase 1 - Implement Code (No Breaking Changes)

    • Add changes to all 3 files
    • Test with single tracking ID (should work exactly as before)
    • Backward compatible with existing filenames
  2. Phase 2 - Test Multiple IDs

    • Create test file with multiple tracking IDs
    • Upload to PPR with --dryrun
    • Verify 3 values in MASTERASSETIDS field
  3. Phase 3 - Agency Tool Integration

    • Agency tool generates filenames with + delimiter
    • Agency tool uses multiple tracking IDs when needed
    • Most files will still have single tracking ID (normal case)
  4. Phase 4 - Production Deployment

    • Enable in PROD only after testing in PPR
    • Update the field in the PROD DAM schema first
    • Then deploy the code changes
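
The Phase 2 check ("verify 3 values in MASTERASSETIDS field") could be scripted along these lines. This is a sketch under assumptions: the helper master_asset_id_values is hypothetical, and it treats the field list as plain dicts with a top-level 'id' key, whereas the real extractor reads IDs through _get_field_id. The sample field mirrors the structure built in section 3 and the IDs listed under Database Lookups.

```python
FIELD_ID = 'FERRERO.MASTERASSETIDS'

EXPECTED = [
    'fc5c389776516bb58044c7d4bf479da458599baf',
    'ad3948d72ea8550a338a600ae87a1bdd1968b066',
    '020d76f957ec9f4ec0b18035a2d012cd3fd376c2',
]


def make_value(master_id):
    """Build one entry of the tabular field's values array (shape from section 3)."""
    return {
        'cascading_domain_value': False,
        'domain_value': True,
        'is_locked': False,
        'value': {
            'field_value': {'type': 'string', 'value': master_id},
            'type': 'com.artesia.metadata.DomainValue',
        },
    }


def master_asset_id_values(fields):
    """Return the plain ID strings stored in FERRERO.MASTERASSETIDS, or [] if absent."""
    for field in fields:
        if field.get('id') == FIELD_ID:
            return [v['value']['field_value']['value'] for v in field.get('values', [])]
    return []


sample_fields = [{
    'id': FIELD_ID,
    'parent_table_id': 'FERRERO.TABULAR.FIELD.MASTERASSETIDS',
    'type': 'com.artesia.metadata.MetadataTableField',
    'values': [make_value(master_id) for master_id in EXPECTED],
}]

found_ids = master_asset_id_values(sample_fields)
```

Comparing found_ids against the expected list catches both a missing field and the one-of-three case described in the commit message.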

Alternative Delimiters

If + causes issues, alternatives:

| Delimiter | Example | Notes |
|-----------|---------|-------|
| + | BqB8vo+SfUQ7m | Recommended (clear separator) |
| , | BqB8vo,SfUQ7m | ⚠️ Might conflict with CSV exports |
| _ | BqB8vo_SfUQ7m | ⚠️ Already used in filename structure |
| ~ | BqB8vo~SfUQ7m | Alternative if + causes issues |
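
If the delimiter does change, the pattern can be built from it rather than hard-coded. The sketch below is illustrative only: tracking_regex and find_tracking_part are hypothetical names, and it assumes the parser splits the filename stem on _ and that the tracking part comes last. It also shows why _ is a poor choice under that assumption.

```python
import re

# One tracking ID: 6 alphanumeric characters, optionally followed by -N
ID = r'[a-zA-Z0-9]{6}(?:-N)?'


def tracking_regex(delimiter):
    """Build the multi-ID pattern for a delimiter (escaped, since + is a regex metacharacter)."""
    d = re.escape(delimiter)
    return re.compile(r'^{id}(?:{d}{id})*$'.format(id=ID, d=d))


def find_tracking_part(stem, delimiter):
    """Split the stem on '_' (assumed parser behavior) and return the last part if it matches."""
    last = stem.split('_')[-1]
    return last if tracking_regex(delimiter).match(last) else None


PREFIX = '1234568_ROC_STRANGER-THINGS_SND_6S_16x9_REF_DE_de_'

plus = find_tracking_part(PREFIX + 'BqB8vo+SfUQ7m', '+')    # both IDs survive
tilde = find_tracking_part(PREFIX + 'BqB8vo~SfUQ7m', '~')   # both IDs survive
under = find_tracking_part(PREFIX + 'BqB8vo_SfUQ7m', '_')   # split on '_' drops the first ID
```

With + or ~ the whole ID group arrives as one part. With _ the stem split separates the IDs before the pattern is applied, so only SfUQ7m is seen as the tracking part.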

Error Handling

What happens if one tracking ID is not found?

# Option A: Skip missing masters (log warning)
for tid in tracking_ids:
    master = db.get_master_asset(tid)
    if not master:
        logger.warning("Master asset not found for tracking ID: {}".format(tid))
        continue  # Skip this one, continue with others

# Option B: Fail entire upload (strict)
for tid in tracking_ids:
    master = db.get_master_asset(tid)
    if not master:
        raise ValueError("Master asset not found for tracking ID: {}".format(tid))

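Recommendation: Use Option A (skip missing) - the derivative still uploads with the master links that were found. As in the upload script in section 2, the upload should still fail if none of the tracking IDs resolves to a master.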
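
Both options can sit behind one flag so the choice stays a configuration decision. This is a sketch only: resolve_masters and its strict parameter are hypothetical, and get_master_asset stands for any callable that behaves like db.get_master_asset (returns a falsy value when the ID is unknown).

```python
import logging

logger = logging.getLogger(__name__)


def resolve_masters(get_master_asset, tracking_ids, strict=False):
    """Return (found, missing): found is a list of (tracking_id, master) pairs in input order."""
    found, missing = [], []
    for tid in tracking_ids:
        master = get_master_asset(tid)
        if master:
            found.append((tid, master))
        else:
            logger.warning("Master asset not found for tracking ID: {}".format(tid))
            missing.append(tid)

    # Option B (strict): any missing ID fails the upload.
    # Option A (default): fail only when nothing at all was found.
    if missing and (strict or not found):
        raise ValueError("Master asset not found for tracking ID(s): {}".format(', '.join(missing)))
    return found, missing


# Hypothetical in-memory stand-in for the database; SfUQ7m is deliberately absent
FAKE_DB = {
    'BqB8vo': {'opentext_id': 'id-a'},
    'laRJo0': {'opentext_id': 'id-c'},
}

found, missing = resolve_masters(FAKE_DB.get, ['BqB8vo', 'SfUQ7m', 'laRJo0'])
```

Returning the missing IDs alongside the found ones lets the caller report them, for example in the upload log, without failing the upload.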


Summary

Files to Modify:

  1. scripts/shared/filename_parser.py - Parse multiple tracking IDs
  2. scripts/a2_to_a3_upload_polling.py - Look up multiple masters
  3. scripts/shared/metadata_extractor_mvp.py - Add all IDs to field

Backward Compatible: Yes - existing single-ID filenames work exactly as before

Ready to Implement: This document provides all code changes needed.