Upload from Box Workflow - Implementation Status

Date: October 29, 2025
Session: Asset Upload (A2→A3) - Phase 1 Complete
Status: Core Components Built | UI Integration Pending


Executive Summary

Successfully implemented the core backend components for the "Upload from Box" workflow. This feature allows users to:

  1. Load files from a Box folder (by Folder ID)
  2. Parse V2 naming convention filenames
  3. Extract tracking IDs and load master metadata from PostgreSQL
  4. Merge master metadata with filename-derived data (filename wins)
  5. Prepare assets for upload to DAM

Phase 1 (Core Components): COMPLETE
Phase 2 (UI Integration): READY TO START


What Was Built

1. DAM Lookup Domains Documentation

File: ECOMMERCE_ALLOWED_FIELDS.md
Purpose: Complete reference of all 182 DAM lookup domains and their allowed values

Key Information:

  • All available field IDs
  • Field datatypes
  • Allowed values count
  • Example values

Usage: Reference this when building metadata or validating field values


2. FilenameParser Class

File: src/FilenameParser.php
Purpose: Parse and validate V2 naming convention filenames

V2 Naming Convention:

[OMG_JOB_NUMBER]_[BRAND_CODE]_[COUNTRY_CODE]_[LANGUAGE_CODE]_[SUBJECT_TITLE]_[ASSET_TYPE]_[SPOT_VERSION]_[SECONDS]S_[ASPECT_RATIO]_[TRACKING_ID]

Example:

Input:  1234567_RAF_CH_de_TEST_FILE_OLV_001_15S_16x9_a7K9mP.mp4
Output: RAF_CH_de_TEST_FILE_OLV_001_15S_16x9.mp4 (OMG Job & Tracking ID stripped)

Key Methods:

  • parseFilename($filename) - Parse into components
  • validateStructure($filename) - Strict validation
  • stripUploadComponents($filename) - Remove job number & tracking ID
  • getCleanFilename($filename) - Get upload-ready filename
  • extractTrackingId($filename) - Get tracking ID only

Features:

  • Strict validation (rejects non-compliant filenames)
  • Detailed error messages for each component
  • Warnings for non-critical issues
  • Handles multi-word subject titles
  • MST (Master) file detection
  • Comprehensive test suite (8/8 tests passing)

Test File: test_filename_parser.php - Run to verify functionality
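
The parsing approach described above can be illustrated with a minimal sketch. It is not the real FilenameParser: the function name `sketchParseV2Filename` is hypothetical, and it checks structure only, because the per-component validation rules are not listed in this document. It reads the fixed components from both ends of the name so that a multi-word subject title is whatever remains in the middle.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: split a V2 filename into its components.
 * Structural checks only; the real parser's rules may be stricter.
 */
function sketchParseV2Filename(string $filename): array
{
    $stem   = pathinfo($filename, PATHINFO_FILENAME); // drop the extension
    $parts  = explode('_', $stem);
    $errors = [];

    // 4 leading components + at least 1 subject word + 5 trailing components.
    if (count($parts) < 10) {
        return [
            'is_valid' => false,
            'validation_errors' => ['Expected at least 10 underscore-separated components'],
        ];
    }

    // Fixed fields are taken from both ends; the subject title, which may
    // itself contain underscores, is everything left in between.
    [$job, $brand, $country, $language] = array_slice($parts, 0, 4);
    [$assetType, $spot, $seconds, $ratio, $trackingId] = array_slice($parts, -5);
    $subject = implode('_', array_slice($parts, 4, -5));

    // The convention writes the duration as "<digits>S", e.g. "15S".
    if (!preg_match('/^(\d+)S$/', $seconds, $m)) {
        $errors[] = "Seconds component '$seconds' must look like 15S";
    }

    return [
        'omg_job_number'    => $job,
        'brand_code'        => $brand,
        'country_code'      => $country,
        'language_code'     => $language,
        'subject_title'     => $subject,
        'asset_type'        => $assetType,
        'spot_version'      => $spot,
        'seconds'           => $m[1] ?? null,
        'aspect_ratio'      => $ratio,
        'tracking_id'       => $trackingId,
        'is_valid'          => $errors === [],
        'validation_errors' => $errors,
    ];
}
```

With the example filename from this section, the sketch yields `TEST_FILE` as the subject title and `a7K9mP` as the tracking ID.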


3. MetadataMerger Class

File: src/MetadataMerger.php
Purpose: Merge master metadata with filename-derived data

Merge Strategy:

  • Priority: Filename always wins (as per requirements)
  • Master Fields: Locked (read-only in UI)
  • Derived Fields: Editable (from filename parsing)

Derived (Editable) Fields:

  • FERRERO.FIELD.MKTG.ASSET TYPE - From filename asset_type
  • MAIN_LANGUAGES - From filename language_code
  • ARTESIA.FIELD.ASSET NAME - From filename
  • FERRERO.FIELD.SUB BRAND - From filename brand_code
  • FERRERO.FIELD.STATE - Default: "Local"
  • FERRERO.FIELD.FISCAL YEAR - Default: "2025/2026"

Key Methods:

  • mergeMetadata($masterMetadata, $parsedFilename) - Merge data sources
  • buildAssetRepresentation($mergedMetadata) - Create API upload JSON
  • identifyEditableFields($mergedMetadata) - List editable field IDs
  • getConflicts($mergedMetadata) - Track filename vs master conflicts
  • formatForDisplay($mergedMetadata) - Human-readable output

Conflict Tracking:

  • Logs all cases where filename data overrides master data
  • Useful for debugging and validation
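
As a rough illustration of the "filename wins" strategy with conflict tracking, the sketch below merges two flat maps of field ID to value. It is an assumption-laden simplification, not the MetadataMerger implementation: the function name `sketchMergeFilenameWins`, the flat map shape, and the sample values `it` and `Global` are all hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of a "filename wins" merge over flat
 * field-ID => value maps. The real metadata structure may be nested.
 */
function sketchMergeFilenameWins(array $master, array $derived): array
{
    $fields    = [];
    $conflicts = [];

    // Master values come first and are locked (read-only in the UI).
    foreach ($master as $fieldId => $value) {
        $fields[$fieldId] = ['value' => $value, 'source' => 'master', 'editable' => false];
    }

    // Filename-derived values overwrite master values and stay editable.
    foreach ($derived as $fieldId => $value) {
        if (array_key_exists($fieldId, $master) && $master[$fieldId] !== $value) {
            // Record the override so it can be logged or shown later.
            $conflicts[] = [
                'field'    => $fieldId,
                'master'   => $master[$fieldId],
                'filename' => $value,
            ];
        }
        $fields[$fieldId] = ['value' => $value, 'source' => 'filename', 'editable' => true];
    }

    return ['fields' => $fields, 'conflicts' => $conflicts];
}

// Illustrative values only.
$merged = sketchMergeFilenameWins(
    ['MAIN_LANGUAGES' => 'it', 'FERRERO.FIELD.STATE' => 'Global'],
    ['MAIN_LANGUAGES' => 'de']
);
```

Here `MAIN_LANGUAGES` ends up as `de` with one recorded conflict, while `FERRERO.FIELD.STATE` keeps its master value and stays locked.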

4. BoxFileRetriever Class

File: src/BoxFileRetriever.php
Purpose: List and download files from Box folders

Key Methods:

  • listFilesInFolder($folderId) - Get all files in Box folder
  • listFilesWithTrackingIDs($folderId) - Include tracking ID parsing
  • getFileMetadata($fileId) - Get specific file details
  • downloadFile($fileId, $filename) - Download to temp directory
  • extractTrackingId($filename) - Parse tracking ID from filename
  • testConnection() - Verify Box access

Features:

  • JWT authentication (production-ready; no manual token renewal)
  • Automatic temp directory management
  • File filtering (only files, not folders)
  • Tracking ID extraction
  • Box URL generation

Temp Directory: /tmp/ferrero_box_downloads/
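
The file filtering and URL generation listed above might look roughly like the following. This is a sketch, not the BoxFileRetriever code: it assumes the input is an already decoded Box "list items in folder" response (an `entries` array whose items carry `type`, `id` and `name`), and the function name `sketchFilesFromBoxItems` is hypothetical. It also assumes the tracking ID is simply the last underscore-separated token, without validating its format.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: reduce a decoded Box folder-items response to file rows.
 */
function sketchFilesFromBoxItems(array $itemsResponse): array
{
    $files = [];

    foreach ($itemsResponse['entries'] ?? [] as $entry) {
        if (($entry['type'] ?? '') !== 'file') {
            continue; // skip folders and web links
        }

        // Assumption: the tracking ID is the last "_"-separated token of the stem.
        $stem       = pathinfo($entry['name'], PATHINFO_FILENAME);
        $pos        = strrpos($stem, '_');
        $trackingId = $pos === false ? null : substr($stem, $pos + 1);

        $files[] = [
            'id'              => $entry['id'],
            'name'            => $entry['name'],
            'tracking_id'     => $trackingId,
            'has_tracking_id' => $trackingId !== null && $trackingId !== '',
            'box_url'         => 'https://app.box.com/file/' . $entry['id'],
        ];
    }

    return ['success' => true, 'file_count' => count($files), 'files' => $files];
}
```

Keeping this step free of network calls makes it easy to cover with a unit test, which is one of the gaps noted in the testing checklist.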


File Structure

Ferrero-Opentext/
├── src/
│   ├── FilenameParser.php          ✅ NEW - V2 naming parser
│   ├── MetadataMerger.php          ✅ NEW - Merge master + filename
│   ├── BoxFileRetriever.php        ✅ NEW - Box folder file listing
│   ├── BoxClient.php               ✅ EXISTING - Box JWT auth
│   ├── DatabaseClient.php          ✅ EXISTING - PostgreSQL access
│   ├── AssetUploaderSimple.php     ✅ EXISTING - Upload to DAM
│   └── [other existing files]
│
├── ECOMMERCE_ALLOWED_FIELDS.md     ✅ NEW - Field documentation
├── DAM_LOOKUPDOMAINS_RAW.json      ✅ NEW - Raw API response (15MB)
├── test_filename_parser.php        ✅ NEW - Parser test suite
├── fetch_lookupdomains.php         ✅ NEW - Lookup fetch script
│
├── workflow_v3.php                 ⏳ NEEDS UPDATE - Add "Upload from Box" tab
└── UPLOAD_FROM_BOX_STATUS.md       ✅ THIS FILE

How It Works - Complete Flow

Step 1: User Provides Box Folder ID

Input: Box Folder ID (e.g., 348304357505)
Action: Paste into interface

Step 2: List Files from Box

$retriever = new BoxFileRetriever();
$result = $retriever->listFilesWithTrackingIDs($folderId);

// Returns:
[
    'success' => true,
    'file_count' => 5,
    'files' => [
        [
            'id' => '2029961099212',
            'name' => '1234567_RAF_CH_de_TEST_OLV_001_15S_16x9_a7K9mP.mp4',
            'tracking_id' => 'a7K9mP',
            'has_tracking_id' => true,
            'size' => 5242880,
            'box_url' => 'https://app.box.com/file/2029961099212'
        ],
        // ... more files
    ]
]

Step 3: Parse Each Filename

$parser = new FilenameParser();
$parsed = $parser->parseFilename($filename);

// Returns:
[
    'omg_job_number' => '1234567',
    'brand_code' => 'RAF',
    'country_code' => 'CH',
    'language_code' => 'de',
    'subject_title' => 'TEST',
    'asset_type' => 'OLV',
    'spot_version' => '001',
    'seconds' => '15',
    'aspect_ratio' => '16x9',
    'tracking_id' => 'a7K9mP',
    'is_valid' => true,
    'validation_errors' => []
]

Step 4: Load Master Metadata from Database

$db = DatabaseClient::getInstance();
$masterAsset = $db->lookupByTrackingId('a7K9mP');

// Returns master asset metadata from PostgreSQL:
[
    'tracking_id' => 'a7K9mP',
    'opentext_id' => '0008a50461e6a554...',
    'upload_directory' => 'ea0dbf86e13e3634...',
    'metadata' => [ /* Full DAM metadata, decoded from JSON */ ]
]

Step 5: Merge Metadata

$merger = new MetadataMerger();
$merged = $merger->mergeMetadata($masterAsset, $parsed);

// Filename data overrides master data
// Tracks source of each field (master/filename/default)
// Identifies editable vs locked fields

Step 6: Build Asset Representation

$assetRepresentation = $merger->buildAssetRepresentation($merged);

// Creates proper structure for DAM API:
[
    'asset_resource' => [
        'asset' => [
            'metadata' => [ /* Merged metadata */ ],
            'metadata_model_id' => 'ECOMMERCE',
            'security_policy_list' => [['id' => 1594]]
        ]
    ]
]
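
For reference, the wrapper structure above can be produced by a few lines of code. The sketch below is not the actual `buildAssetRepresentation` method: the function name and its default parameters are hypothetical, and the inner metadata array is passed through untouched because its DAM-specific shape is not described here.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: wrap merged metadata in the envelope shown above.
 * The defaults mirror the values listed under "DAM Configuration".
 */
function sketchBuildAssetRepresentation(
    array $mergedMetadata,
    string $metadataModelId = 'ECOMMERCE',
    int $securityPolicyId = 1594
): array {
    return [
        'asset_resource' => [
            'asset' => [
                'metadata'             => $mergedMetadata,
                'metadata_model_id'    => $metadataModelId,
                'security_policy_list' => [['id' => $securityPolicyId]],
            ],
        ],
    ];
}

// Placeholder metadata; JSON_THROW_ON_ERROR surfaces encoding problems early.
$json = json_encode(
    sketchBuildAssetRepresentation(['example_field' => 'example value']),
    JSON_THROW_ON_ERROR | JSON_PRETTY_PRINT
);
```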

Step 7: Strip Filename Components

$cleanFilename = $parser->stripUploadComponents($filename);

// Input:  1234567_RAF_CH_de_TEST_OLV_001_15S_16x9_a7K9mP.mp4
// Output: RAF_CH_de_TEST_OLV_001_15S_16x9.mp4
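
One way to implement this stripping step is to drop the first and last underscore-separated tokens and re-attach the extension. The sketch below assumes exactly that; `sketchStripUploadComponents` is a hypothetical name, and the real `stripUploadComponents` method may validate the name first.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: remove the leading OMG job number and the trailing
 * tracking ID from a V2 filename, keeping the extension.
 */
function sketchStripUploadComponents(string $filename): string
{
    $ext   = pathinfo($filename, PATHINFO_EXTENSION);
    $stem  = pathinfo($filename, PATHINFO_FILENAME);
    $parts = explode('_', $stem);

    // Need at least job number + one middle token + tracking ID.
    if (count($parts) < 3) {
        throw new InvalidArgumentException("Cannot strip components from '$filename'");
    }

    $clean = implode('_', array_slice($parts, 1, -1));

    return $ext === '' ? $clean : "$clean.$ext";
}

echo sketchStripUploadComponents('1234567_RAF_CH_de_TEST_OLV_001_15S_16x9_a7K9mP.mp4'), PHP_EOL;
// prints: RAF_CH_de_TEST_OLV_001_15S_16x9.mp4
```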

Step 8: Upload to DAM

$uploader = new AssetUploaderSimple();
$uploadResult = $uploader->uploadAsset(
    $cleanFilename,
    $localFilePath,
    $assetRepresentation,
    $masterAsset['upload_directory']
);

// Uploads to correct folder with proper metadata

Next Steps - Phase 2: UI Integration

Task 5: Add "Upload from Box" Tab

File to Edit: workflow_v3.php

Location: After the "Upload" tab (around lines 800-900)

UI Components Needed:

A. Box Folder Input Section

<div class="form-group">
    <label>Box Folder ID</label>
    <input type="text" id="box-folder-id" placeholder="e.g., 348304357505">
    <button id="load-box-files" class="btn btn-primary">Load Files</button>
</div>
<div id="box-status"></div>

B. File List Table

<table id="box-files-table">
    <thead>
        <tr>
            <th><input type="checkbox" id="select-all"></th>
            <th>Filename</th>
            <th>Tracking ID</th>
            <th>Master Asset</th>
            <th>Valid</th>
            <th>Size</th>
        </tr>
    </thead>
    <tbody id="box-files-list">
        <!-- Populated by AJAX -->
    </tbody>
</table>

C. Metadata Preview Section

<div id="metadata-preview">
    <h4>Metadata for Selected File</h4>

    <div class="metadata-section">
        <h5>🔒 Master Fields (Locked)</h5>
        <div id="locked-fields">
            <!-- Grayed out, read-only fields -->
        </div>
    </div>

    <div class="metadata-section">
        <h5>✏️ Derived Fields (Editable)</h5>
        <div id="editable-fields">
            <!-- Editable input fields -->
        </div>
    </div>

    <div id="validation-warnings"></div>
</div>

D. Upload Controls

<div id="upload-controls">
    <button id="upload-selected" class="btn btn-success">Upload Selected Assets</button>
    <div id="upload-progress"></div>
</div>

Task 6: Implement AJAX Endpoints

File to Edit: workflow_v3.php (around lines 100-200, where the other AJAX handlers are)

Endpoints to Add:

1. Load Box Files

case 'box_list_files':
    $folderId = $_POST['folder_id'] ?? '';
    $retriever = new BoxFileRetriever();
    $result = $retriever->listFilesWithTrackingIDs($folderId);
    echo json_encode($result);
    break;

2. Parse Filename

case 'parse_filename':
    $filename = $_POST['filename'] ?? '';
    $parser = new FilenameParser();
    $parsed = $parser->parseFilename($filename);
    echo json_encode($parsed);
    break;

3. Load Master Metadata

case 'load_master_metadata':
    $trackingId = $_POST['tracking_id'] ?? '';
    $db = DatabaseClient::getInstance();
    $masterAsset = $db->lookupByTrackingId($trackingId);
    echo json_encode($masterAsset);
    break;

4. Merge Metadata

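case 'merge_metadata':
    // A missing or malformed field decodes to null instead of raising a notice
    $masterMetadata = json_decode($_POST['master_metadata'] ?? '', true);
    $parsedFilename = json_decode($_POST['parsed_filename'] ?? '', true);

    if (!is_array($masterMetadata) || !is_array($parsedFilename)) {
        echo json_encode(['error' => 'master_metadata and parsed_filename must be JSON objects']);
        break;
    }

    $merger = new MetadataMerger();
    $merged = $merger->mergeMetadata($masterMetadata, $parsedFilename);
    $editableFields = $merger->identifyEditableFields($merged);

    echo json_encode([
        'merged' => $merged,
        'editable_fields' => $editableFields
    ]);
    break;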

5. Upload from Box

case 'upload_from_box':
    // 1. Get Box file
    // 2. Download to temp
    // 3. Parse filename
    // 4. Load master metadata
    // 5. Merge metadata
    // 6. Build asset representation
    // 7. Upload to DAM
    // 8. Clean up temp file
    // 9. Return result
    break;

Task 7: Build Metadata Edit UI

JavaScript Functions Needed:

// Load files from Box
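function loadBoxFiles(folderId) {
    $.ajax({
        url: 'workflow_v3.php',
        method: 'POST',
        dataType: 'json', // parse the response as JSON regardless of Content-Type
        data: { action: 'box_list_files', folder_id: folderId },
        success: function(response) {
            // The result carries a success flag (see Step 2)
            if (!response.success) {
                $('#box-status').text('Could not load files from Box.');
                return;
            }
            displayBoxFiles(response.files);
        },
        error: function() {
            $('#box-status').text('Request to load Box files failed.');
        }
    });
}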

// Display files in table
function displayBoxFiles(files) {
    files.forEach(file => {
        // Parse filename
        // Validate
        // Show in table with checkbox
    });
}

// Load metadata for selected file
function loadFileMetadata(filename, trackingId) {
    // Parse filename
    // Load master metadata
    // Merge
    // Display in preview
}

// Display metadata with locked/editable sections
function displayMetadata(merged, editableFields) {
    // Clear containers
    // Populate locked fields (read-only)
    // Populate editable fields (input elements)
}

// Upload selected files
function uploadSelectedFiles() {
    var selectedFiles = getSelectedFiles();
    selectedFiles.forEach(file => {
        uploadSingleFile(file);
    });
}

Task 8: Implement Upload Processing

Steps:

  1. Download file from Box to temp
  2. Parse filename and validate
  3. Load master metadata from DB
  4. Merge metadata with filename data
  5. Get user edits from UI
  6. Build final asset representation
  7. Strip filename components
  8. Upload to DAM using AssetUploaderSimple
  9. Update campaign status to A3 (only once all files have uploaded)
  10. Clean up temp files
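
The control flow of these steps, in particular the temp-file clean-up and the "update status only if everything succeeded" rule, can be sketched as follows. This is a simplified illustration rather than the planned implementation: `sketchProcessBatch` is a hypothetical name, and the `$download` and `$upload` callables are stand-ins for the Box download (step 1) and for steps 2-8 respectively.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of the per-batch control flow.
 * $download(string $file): string  returns a temp file path.
 * $upload(string $file, string $tempPath): void  throws on failure.
 */
function sketchProcessBatch(array $files, callable $download, callable $upload): array
{
    $results = [];

    foreach ($files as $file) {
        $tempPath = null;
        try {
            $tempPath = $download($file);   // step 1
            $upload($file, $tempPath);      // steps 2-8, collapsed for brevity
            $results[$file] = ['success' => true];
        } catch (Throwable $e) {
            // Record the failure and continue with the remaining files.
            $results[$file] = ['success' => false, 'error' => $e->getMessage()];
        } finally {
            // Step 10: remove the temp file whether or not the upload worked.
            if ($tempPath !== null && is_file($tempPath)) {
                unlink($tempPath);
            }
        }
    }

    // Step 9: only signal a status update when every file succeeded.
    $allOk = $results !== []
        && !in_array(false, array_column($results, 'success'), true);

    return ['results' => $results, 'update_status' => $allOk];
}
```

Putting the clean-up in `finally` means a failed upload cannot leave files behind in the temp directory.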

Testing Checklist

Unit Tests

  • FilenameParser - All 8 tests passing
  • MetadataMerger - Need test cases
  • BoxFileRetriever - Need test cases

Integration Tests

  • Load files from Box folder
  • Parse V2 filenames correctly
  • Extract tracking IDs
  • Load master metadata from DB
  • Merge metadata (filename wins)
  • Build asset representation
  • Upload to DAM successfully
  • Update campaign status to A3

UI Tests

  • Box Folder ID input validation
  • File list display
  • Filename validation display
  • Metadata preview (locked vs editable)
  • Field editing
  • Upload progress display
  • Error handling

Known Limitations & Considerations

1. Box API Rate Limits

  • Access tokens obtained via JWT are valid for 60 minutes, so a long-running batch can outlive its token
  • File downloads have no explicit timeout
  • Consider batch processing for large folders

2. Filename Validation

  • Strict validation will reject non-compliant filenames
  • User must fix filenames before upload
  • Consider adding "force upload" option for special cases

3. Metadata Conflicts

  • Filename always wins (as designed)
  • Conflicts are logged but not shown to user
  • Consider adding conflict warning UI

4. Database Lookups

  • Tracking IDs must exist in PostgreSQL
  • No fallback if tracking ID not found
  • Consider adding "create new master" option

5. Upload Folder Extraction

  • Relies on tracking ID → master asset → upload_directory
  • If upload_directory is NULL, upload will fail
  • Need error handling for this case
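
A small pre-upload check could cover both the missing tracking ID and the NULL upload_directory cases. The sketch below is one possible shape for it, under two assumptions that this document does not confirm: that the database lookup yields `null` when no row matches, and that the row is an array with an `upload_directory` key. The function name `sketchCheckMasterAsset` is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical pre-upload check. $masterAsset is the row returned by the
 * tracking-ID lookup, or null when nothing was found (an assumption).
 */
function sketchCheckMasterAsset(?array $masterAsset, string $trackingId): array
{
    if ($masterAsset === null) {
        return [
            'ok'    => false,
            'error' => "Tracking ID '$trackingId' not found in master_assets",
        ];
    }

    $dir = $masterAsset['upload_directory'] ?? null;
    if ($dir === null || $dir === '') {
        return [
            'ok'    => false,
            'error' => "Master asset '$trackingId' has no upload_directory",
        ];
    }

    return ['ok' => true, 'upload_directory' => $dir];
}
```

Running this before the download step avoids fetching a file from Box that could never be uploaded.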

Configuration Requirements

PostgreSQL Database

Host: localhost
Port: 5433
Database: ferrero_tracking
Table: master_assets

Required Fields:
- tracking_id (primary key)
- opentext_id (DAM asset ID)
- upload_directory (target folder ID)
- metadata (JSON - full DAM metadata)

Box Configuration

File: Box-config.json or 43984435_n1izyn3l_config.json

Auth Method: JWT (RSA-256)
Root Folder: 348304357505

DAM Configuration

Environment: Production
Base URL: https://ppr.dam.ferrero.com/otmmapi
Metadata Model: ECOMMERCE
Security Policy: 1594

Code Quality Notes

Well Implemented

  • Strict filename validation with detailed errors
  • Filename always wins (requirement met)
  • Editable field identification
  • Conflict tracking
  • Comprehensive parsing
  • JWT authentication (production-ready)
  • Test suite for parser

Could Enhance

  • Add test suites for Merger and Retriever
  • Batch upload optimization
  • Progress tracking for large uploads
  • Error recovery strategies
  • Conflict resolution UI
  • Manual metadata override option

Performance Considerations

Expected Performance

  • Box file listing: 1-3 seconds
  • Filename parsing: <1ms per file
  • DB metadata lookup: <100ms per tracking ID
  • Metadata merge: <10ms per file
  • File download from Box: 5-30 seconds (depending on size)
  • DAM upload: 3-10 seconds per file

Optimization Opportunities

  • Cache parsed filenames
  • Batch DB lookups
  • Parallel Box downloads
  • Streaming uploads (for large files)

Error Handling Strategy

Validation Errors

  • Invalid filename structure → Show errors, block upload
  • Missing tracking ID → Show error, allow manual entry
  • Tracking ID not in DB → Error, cannot proceed

Runtime Errors

  • Box connection failure → Retry with exponential backoff
  • Download failure → Log, skip file, continue with others
  • Upload failure → Log, mark file for retry, don't update status
  • DB connection failure → Fatal error, stop process
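
The "retry with exponential backoff" rule for Box connection failures can be sketched generically. The helper below is an illustration only: its name, default attempt count and base delay are assumptions, and it retries on any exception, whereas real code would probably retry only on connection errors.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: run $operation, retrying with exponentially growing
 * delays (base, 2x base, 4x base, ...) until it succeeds or attempts run out.
 * $sleep is injectable so the delay can be replaced in tests.
 */
function sketchRetryWithBackoff(
    callable $operation,
    int $maxAttempts = 4,
    int $baseDelayMs = 500,
    ?callable $sleep = null
) {
    $sleep ??= static function (int $ms): void {
        usleep($ms * 1000);
    };

    for ($attempt = 1; ; $attempt++) {
        try {
            return $operation();
        } catch (Throwable $e) {
            if ($attempt >= $maxAttempts) {
                throw $e; // give up and surface the last error
            }
            // Delay doubles after every failed attempt.
            $sleep($baseDelayMs * 2 ** ($attempt - 1));
        }
    }
}
```

A call such as `sketchRetryWithBackoff(fn () => $retriever->testConnection())` would wrap the connection check, assuming `testConnection()` throws on failure, which this document does not state.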

User Experience

  • Clear error messages
  • Validation feedback in real-time
  • Progress indicators
  • Retry options
  • Detailed logs for debugging

Documentation Status

Completed

  • Lookup domains documentation
  • V2 naming convention reference
  • FilenameParser API documentation
  • MetadataMerger API documentation
  • BoxFileRetriever API documentation
  • Complete workflow documentation
  • This status document

Pending

  • UI component documentation
  • AJAX endpoint documentation
  • Testing procedures
  • Deployment guide
  • User manual

Next Session Quick Start

To Continue Implementation:

  1. Read this file (UPLOAD_FROM_BOX_STATUS.md) for full context

  2. Test existing components:

    cd /Users/daveporter/Desktop/CODING-2024/Ferrero-Opentext
    php test_filename_parser.php
    
  3. Start UI integration:

    • Edit workflow_v3.php
    • Add "Upload from Box" tab (see Task 5 above)
    • Implement AJAX endpoints (see Task 6 above)

  4. Reference files:

    • ECOMMERCE_ALLOWED_FIELDS.md - Field reference
    • downloads/asset_representation MVP.json - Metadata structure
    • PROJECT_STATUS_2025-10-29.md - Overall project status

  5. Key Classes to Use:

    require_once 'src/FilenameParser.php';
    require_once 'src/MetadataMerger.php';
    require_once 'src/BoxFileRetriever.php';
    require_once 'src/DatabaseClient.php';
    require_once 'src/AssetUploaderSimple.php';
    

Success Criteria

Phase 1 (Core Components) COMPLETE

  • Lookup domains documentation exported
  • FilenameParser implemented and tested
  • MetadataMerger implemented
  • BoxFileRetriever implemented
  • All integration points identified

Phase 2 (UI Integration) READY TO START

  • "Upload from Box" tab added to workflow
  • All AJAX endpoints implemented
  • File list display working
  • Metadata preview/edit UI working
  • Upload processing complete
  • Status update to A3 working

Phase 3 (Testing & Polish) PENDING

  • Integration tests passing
  • Error handling verified
  • Performance acceptable
  • User experience polished
  • Documentation complete

Status: Ready for Phase 2 Implementation
Estimated Time to Complete Phase 2: 4-6 hours
Overall Progress: 40% Complete (Core backend done, UI pending)


End of Status Report