# Upload from Box Workflow - Implementation Status
**Date:** October 29, 2025

**Session:** Asset Upload (A2→A3) - Phase 1 Complete

**Status:** Core Components Built ✅ | UI Integration Pending ⏳

---
## Executive Summary
The core backend components for the "Upload from Box" workflow are now implemented. This feature allows users to:
1. Load files from a Box folder (by Folder ID)
2. Parse V2 naming convention filenames
3. Extract tracking IDs and load master metadata from PostgreSQL
4. Merge master metadata with filename-derived data (filename wins)
5. Prepare assets for upload to DAM

**Phase 1 (Core Components):** ✅ COMPLETE

**Phase 2 (UI Integration):** ⏳ READY TO START

---
## What Was Built
### 1. DAM Lookup Domains Documentation ✅
**File:** `ECOMMERCE_ALLOWED_FIELDS.md`
**Purpose:** Complete reference of all 182 DAM lookup domains and their allowed values
**Key Information:**
- All available field IDs
- Field datatypes
- Allowed values count
- Example values

**Usage:** Reference this when building metadata or validating field values.

---
### 2. FilenameParser Class ✅
**File:** `src/FilenameParser.php`
**Purpose:** Parse and validate V2 naming convention filenames
**V2 Naming Convention:**
```
[OMG_JOB_NUMBER]_[BRAND_CODE]_[COUNTRY_CODE]_[LANGUAGE_CODE]_[SUBJECT_TITLE]_[ASSET_TYPE]_[SPOT_VERSION]_[SECONDS]S_[ASPECT_RATIO]_[TRACKING_ID]
```
**Example:**
```
Input: 1234567_RAF_CH_de_TEST_FILE_OLV_001_15S_16x9_a7K9mP.mp4
Output: RAF_CH_de_TEST_FILE_OLV_001_15S_16x9.mp4 (OMG Job & Tracking ID stripped)
```
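
One way to picture the parse and strip steps is a single regular expression with named groups. The sketch below is not the `FilenameParser` implementation; `sketchParseV2` and `sketchCleanFilename` are hypothetical names, and the character classes (uppercase brand and asset codes, a three-digit spot version, a six-character tracking ID) are assumptions inferred from the example above. The subject title is matched greedily, so multi-word titles such as `TEST_FILE` stay in one piece because the fixed tail of the pattern anchors everything after it.

```php
<?php
// Hypothetical sketch, not the FilenameParser implementation.
// Assumed rules: numeric job number, uppercase brand/asset codes,
// 2-letter country and language codes, 3-digit spot version,
// 6-character alphanumeric tracking ID.
const SKETCH_V2_PATTERN = '/^(?<omg_job_number>\d+)_(?<brand_code>[A-Z0-9]+)_(?<country_code>[A-Z]{2})'
    . '_(?<language_code>[a-z]{2})_(?<subject_title>.+)_(?<asset_type>[A-Z0-9]+)'
    . '_(?<spot_version>\d{3})_(?<seconds>\d+)S_(?<aspect_ratio>\d+x\d+)'
    . '_(?<tracking_id>[A-Za-z0-9]{6})\.(?<extension>[A-Za-z0-9]+)$/';

// Returns the named components, or null when the structure does not match.
function sketchParseV2(string $filename): ?array
{
    if (preg_match(SKETCH_V2_PATTERN, $filename, $m) !== 1) {
        return null;
    }
    // preg_match returns numeric and named keys; keep only the named ones.
    return array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);
}

// Rebuilds the filename without the job number and tracking ID.
function sketchCleanFilename(string $filename): ?string
{
    $p = sketchParseV2($filename);
    if ($p === null) {
        return null;
    }
    return implode('_', [
        $p['brand_code'], $p['country_code'], $p['language_code'],
        $p['subject_title'], $p['asset_type'], $p['spot_version'],
        $p['seconds'] . 'S', $p['aspect_ratio'],
    ]) . '.' . $p['extension'];
}

$name = '1234567_RAF_CH_de_TEST_FILE_OLV_001_15S_16x9_a7K9mP.mp4';
echo sketchCleanFilename($name), PHP_EOL; // RAF_CH_de_TEST_FILE_OLV_001_15S_16x9.mp4
```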
**Key Methods:**
- `parseFilename($filename)` - Parse into components
- `validateStructure($filename)` - Strict validation
- `stripUploadComponents($filename)` - Remove job number & tracking ID
- `getCleanFilename($filename)` - Get upload-ready filename
- `extractTrackingId($filename)` - Get tracking ID only

**Features:**
- Strict validation (rejects non-compliant filenames)
- Detailed error messages for each component
- Warnings for non-critical issues
- Handles multi-word subject titles
- MST (Master) file detection
- Comprehensive test suite (8/8 tests passing)

**Test File:** `test_filename_parser.php` - Run to verify functionality

---
### 3. MetadataMerger Class ✅
**File:** `src/MetadataMerger.php`
**Purpose:** Merge master metadata with filename-derived data
**Merge Strategy:**
- **Priority:** Filename always wins (as per requirements)
- **Master Fields:** Locked (read-only in UI)
- **Derived Fields:** Editable (from filename parsing)

**Derived (Editable) Fields:**
- `FERRERO.FIELD.MKTG.ASSET TYPE` - From filename asset_type
- `MAIN_LANGUAGES` - From filename language_code
- `ARTESIA.FIELD.ASSET NAME` - From filename
- `FERRERO.FIELD.SUB BRAND` - From filename brand_code
- `FERRERO.FIELD.STATE` - Default: "Local"
- `FERRERO.FIELD.FISCAL YEAR` - Default: "2025/2026"

**Key Methods:**
- `mergeMetadata($masterMetadata, $parsedFilename)` - Merge data sources
- `buildAssetRepresentation($mergedMetadata)` - Create API upload JSON
- `identifyEditableFields($mergedMetadata)` - List editable field IDs
- `getConflicts($mergedMetadata)` - Track filename vs master conflicts
- `formatForDisplay($mergedMetadata)` - Human-readable output

**Conflict Tracking:**
- Logs all cases where filename data overrides master data
- Useful for debugging and validation
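
The "filename wins" rule and its conflict log can be illustrated with a small standalone function. This is a sketch under stated assumptions, not the `MetadataMerger` code: `sketchMergeFilenameWins` is a hypothetical name, metadata is simplified to a flat `field ID => value` array, and `EXAMPLE.MASTER.FIELD` is a made-up field ID used only to show a locked master value.

```php
<?php
// Hypothetical sketch of a "filename wins" merge; not the MetadataMerger code.
// Both inputs are assumed to be flat arrays keyed by field ID.
function sketchMergeFilenameWins(array $master, array $derived): array
{
    $fields = $master;
    $sources = array_fill_keys(array_keys($master), 'master');
    $conflicts = [];

    foreach ($derived as $fieldId => $value) {
        if (array_key_exists($fieldId, $master) && $master[$fieldId] !== $value) {
            // Record the override so it can be logged or shown to the user.
            $conflicts[$fieldId] = ['master' => $master[$fieldId], 'filename' => $value];
        }
        $fields[$fieldId] = $value;      // the filename-derived value always wins
        $sources[$fieldId] = 'filename'; // and the field becomes editable
    }

    return ['fields' => $fields, 'sources' => $sources, 'conflicts' => $conflicts];
}

$result = sketchMergeFilenameWins(
    ['MAIN_LANGUAGES' => 'fr', 'EXAMPLE.MASTER.FIELD' => 'kept as-is'],
    ['MAIN_LANGUAGES' => 'de', 'FERRERO.FIELD.MKTG.ASSET TYPE' => 'OLV']
);
print_r($result['conflicts']);
```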
---
### 4. BoxFileRetriever Class ✅
**File:** `src/BoxFileRetriever.php`
**Purpose:** List and download files from Box folders
**Key Methods:**
- `listFilesInFolder($folderId)` - Get all files in Box folder
- `listFilesWithTrackingIDs($folderId)` - Include tracking ID parsing
- `getFileMetadata($fileId)` - Get specific file details
- `downloadFile($fileId, $filename)` - Download to temp directory
- `extractTrackingId($filename)` - Parse tracking ID from filename
- `testConnection()` - Verify Box access

**Features:**
- JWT authentication (production-ready; no manually regenerated developer tokens)
- Automatic temp directory management
- File filtering (only files, not folders)
- Tracking ID extraction
- Box URL generation

**Temp Directory:** `/tmp/ferrero_box_downloads/`

---
## File Structure
```
Ferrero-Opentext/
├── src/
│ ├── FilenameParser.php ✅ NEW - V2 naming parser
│ ├── MetadataMerger.php ✅ NEW - Merge master + filename
│ ├── BoxFileRetriever.php ✅ NEW - Box folder file listing
│ ├── BoxClient.php ✅ EXISTING - Box JWT auth
│ ├── DatabaseClient.php ✅ EXISTING - PostgreSQL access
│ ├── AssetUploaderSimple.php ✅ EXISTING - Upload to DAM
│ └── [other existing files]
├── ECOMMERCE_ALLOWED_FIELDS.md ✅ NEW - Field documentation
├── DAM_LOOKUPDOMAINS_RAW.json ✅ NEW - Raw API response (15MB)
├── test_filename_parser.php ✅ NEW - Parser test suite
├── fetch_lookupdomains.php ✅ NEW - Lookup fetch script
├── workflow_v3.php ⏳ NEEDS UPDATE - Add "Upload from Box" tab
└── UPLOAD_FROM_BOX_STATUS.md ✅ THIS FILE
```
---
## How It Works - Complete Flow
### Step 1: User Provides Box Folder ID
**Input:** Box Folder ID (e.g., 348304357505)
**Action:** Paste into interface
### Step 2: List Files from Box
```php
$retriever = new BoxFileRetriever();
$result = $retriever->listFilesWithTrackingIDs($folderId);
// Returns:
[
'success' => true,
'file_count' => 5,
'files' => [
[
'id' => '2029961099212',
'name' => '1234567_RAF_CH_de_TEST_OLV_001_15S_16x9_a7K9mP.mp4',
'tracking_id' => 'a7K9mP',
'has_tracking_id' => true,
'size' => 5242880,
'box_url' => 'https://app.box.com/file/2029961099212'
],
// ... more files
]
]
```
### Step 3: Parse Each Filename
```php
$parser = new FilenameParser();
$parsed = $parser->parseFilename($filename);
// Returns:
[
'omg_job_number' => '1234567',
'brand_code' => 'RAF',
'country_code' => 'CH',
'language_code' => 'de',
'subject_title' => 'TEST',
'asset_type' => 'OLV',
'spot_version' => '001',
'seconds' => '15',
'aspect_ratio' => '16x9',
'tracking_id' => 'a7K9mP',
'is_valid' => true,
'validation_errors' => []
]
```
### Step 4: Load Master Metadata from Database
```php
$db = DatabaseClient::getInstance();
$masterAsset = $db->lookupByTrackingId('a7K9mP');
// Returns master asset metadata from PostgreSQL:
[
'tracking_id' => 'a7K9mP',
'opentext_id' => '0008a50461e6a554...',
'upload_directory' => 'ea0dbf86e13e3634...',
'metadata' => [ /* Full DAM metadata JSON */ ]
]
```
### Step 5: Merge Metadata
```php
$merger = new MetadataMerger();
$merged = $merger->mergeMetadata($masterAsset, $parsed);
// Filename data overrides master data
// Tracks source of each field (master/filename/default)
// Identifies editable vs locked fields
```
### Step 6: Build Asset Representation
```php
$assetRepresentation = $merger->buildAssetRepresentation($merged);
// Creates proper structure for DAM API:
[
'asset_resource' => [
'asset' => [
'metadata' => [ /* Merged metadata */ ],
'metadata_model_id' => 'ECOMMERCE',
'security_policy_list' => [['id' => 1594]]
]
]
]
```
### Step 7: Strip Filename Components
```php
$cleanFilename = $parser->stripUploadComponents($filename);
// Input: 1234567_RAF_CH_de_TEST_OLV_001_15S_16x9_a7K9mP.mp4
// Output: RAF_CH_de_TEST_OLV_001_15S_16x9.mp4
```
### Step 8: Upload to DAM
```php
$uploader = new AssetUploaderSimple();
$uploadResult = $uploader->uploadAsset(
$cleanFilename,
$localFilePath,
$assetRepresentation,
$masterAsset['upload_directory']
);
// Uploads to correct folder with proper metadata
```
---
## Next Steps - Phase 2: UI Integration
### Task 5: Add "Upload from Box" Tab ⏳
**File to Edit:** `workflow_v3.php`
**Location:** After the "Upload" tab (around lines 800-900)
**UI Components Needed:**
#### A. Box Folder Input Section
```html
<div class="form-group">
<label>Box Folder ID</label>
<input type="text" id="box-folder-id" placeholder="e.g., 348304357505">
<button id="load-box-files" class="btn btn-primary">Load Files</button>
</div>
<div id="box-status"></div>
```
#### B. File List Table
```html
<table id="box-files-table">
<thead>
<tr>
<th><input type="checkbox" id="select-all"></th>
<th>Filename</th>
<th>Tracking ID</th>
<th>Master Asset</th>
<th>Valid</th>
<th>Size</th>
</tr>
</thead>
<tbody id="box-files-list">
<!-- Populated by AJAX -->
</tbody>
</table>
```
#### C. Metadata Preview Section
```html
<div id="metadata-preview">
<h4>Metadata for Selected File</h4>
<div class="metadata-section">
<h5>🔒 Master Fields (Locked)</h5>
<div id="locked-fields">
<!-- Grayed out, read-only fields -->
</div>
</div>
<div class="metadata-section">
<h5>✏️ Derived Fields (Editable)</h5>
<div id="editable-fields">
<!-- Editable input fields -->
</div>
</div>
<div id="validation-warnings"></div>
</div>
```
#### D. Upload Controls
```html
<div id="upload-controls">
<button id="upload-selected" class="btn btn-success">Upload Selected Assets</button>
<div id="upload-progress"></div>
</div>
```
---
### Task 6: Implement AJAX Endpoints ⏳
**File to Edit:** `workflow_v3.php` (around lines 100-200, where the other AJAX handlers are)
**Endpoints to Add:**
#### 1. Load Box Files
```php
case 'box_list_files':
$folderId = $_POST['folder_id'] ?? '';
$retriever = new BoxFileRetriever();
$result = $retriever->listFilesWithTrackingIDs($folderId);
echo json_encode($result);
break;
```
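
The endpoint above passes the posted value straight to the retriever. A small guard, sketched here, could reject empty or malformed input before any Box call is made. `sketchValidateFolderId` is a hypothetical helper, and the digits-only rule is an assumption based on the example folder ID in this document; the `success` key mirrors the result shape shown in Step 2. In the `box_list_files` case it would run first, and the handler would echo the failure array as JSON instead of calling `listFilesWithTrackingIDs()`.

```php
<?php
// Hypothetical helper: assumes Box folder IDs are digit-only strings.
function sketchValidateFolderId($raw): array
{
    // Non-string input (for example an array posted as folder_id[]) is rejected.
    $folderId = is_string($raw) ? trim($raw) : '';
    if ($folderId === '' || !ctype_digit($folderId)) {
        return ['success' => false, 'error' => 'Box Folder ID must contain digits only'];
    }
    return ['success' => true, 'folder_id' => $folderId];
}

var_dump(sketchValidateFolderId(' 348304357505 ')['success']); // bool(true)
var_dump(sketchValidateFolderId('not-a-folder')['success']);   // bool(false)
```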
#### 2. Parse Filename
```php
case 'parse_filename':
$filename = $_POST['filename'] ?? '';
$parser = new FilenameParser();
$parsed = $parser->parseFilename($filename);
echo json_encode($parsed);
break;
```
#### 3. Load Master Metadata
```php
case 'load_master_metadata':
$trackingId = $_POST['tracking_id'] ?? '';
$db = DatabaseClient::getInstance();
$masterAsset = $db->lookupByTrackingId($trackingId);
echo json_encode($masterAsset);
break;
```
#### 4. Merge Metadata
```php
case 'merge_metadata':
$masterMetadata = json_decode($_POST['master_metadata'], true);
$parsedFilename = json_decode($_POST['parsed_filename'], true);
$merger = new MetadataMerger();
$merged = $merger->mergeMetadata($masterMetadata, $parsedFilename);
$editableFields = $merger->identifyEditableFields($merged);
echo json_encode([
'merged' => $merged,
'editable_fields' => $editableFields
]);
break;
```
#### 5. Upload from Box
```php
case 'upload_from_box':
// 1. Get Box file
// 2. Download to temp
// 3. Parse filename
// 4. Load master metadata
// 5. Merge metadata
// 6. Build asset representation
// 7. Upload to DAM
// 8. Clean up temp file
// 9. Return result
break;
```
---
### Task 7: Build Metadata Edit UI ⏳
**JavaScript Functions Needed:**
```javascript
// Load files from Box
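function loadBoxFiles(folderId) {
    $.ajax({
        url: 'workflow_v3.php',
        method: 'POST',
        dataType: 'json', // parse the body as JSON even without a JSON Content-Type
        data: { action: 'box_list_files', folder_id: folderId },
        success: function(response) {
            if (response.success) {
                displayBoxFiles(response.files);
            } else {
                $('#box-status').text('Could not load files from Box');
            }
        }
    });
}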
// Display files in table
function displayBoxFiles(files) {
files.forEach(file => {
// Parse filename
// Validate
// Show in table with checkbox
});
}
// Load metadata for selected file
function loadFileMetadata(filename, trackingId) {
// Parse filename
// Load master metadata
// Merge
// Display in preview
}
// Display metadata with locked/editable sections
function displayMetadata(merged, editableFields) {
// Clear containers
// Populate locked fields (read-only)
// Populate editable fields (input elements)
}
// Upload selected files
function uploadSelectedFiles() {
var selectedFiles = getSelectedFiles();
selectedFiles.forEach(file => {
uploadSingleFile(file);
});
}
```
---
### Task 8: Implement Upload Processing ⏳
**Steps:**
1. Download file from Box to temp
2. Parse filename and validate
3. Load master metadata from DB
4. Merge metadata with filename data
5. Get user edits from UI
6. Build final asset representation
7. Strip filename components
8. Upload to DAM using AssetUploaderSimple
9. Update campaign status to A3 (once all files are uploaded)
10. Clean up temp files
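
The download, upload, and cleanup steps above can be sketched with a `try`/`finally` block so the temp file is removed even when an intermediate step fails. This is an illustrative outline only: `sketchProcessFile` is a hypothetical name, and the `$download` and `$upload` callables are stand-ins whose signatures are assumptions, not the real `BoxFileRetriever` or `AssetUploaderSimple` calls. Each failure is returned per file rather than thrown, which lets a batch continue with the remaining files.

```php
<?php
// Hypothetical orchestration sketch. $download and $upload stand in for the
// Box download and DAM upload calls; their signatures are assumptions.
function sketchProcessFile(string $name, callable $download, callable $upload): array
{
    $tempPath = null;
    try {
        $tempPath = $download($name);        // step 1: fetch to a temp file
        $uploadResult = $upload($tempPath);  // steps 2-8 collapsed into one call
        return ['file' => $name, 'success' => true, 'result' => $uploadResult];
    } catch (Throwable $e) {
        // Report the failure for this file so the batch can continue.
        return ['file' => $name, 'success' => false, 'error' => $e->getMessage()];
    } finally {
        // step 10: runs on success and on failure
        if ($tempPath !== null && is_file($tempPath)) {
            unlink($tempPath);
        }
    }
}

// Demo with fake download/upload callables.
$seen = [];
$download = function (string $name): string {
    $path = tempnam(sys_get_temp_dir(), 'box_');
    file_put_contents($path, 'demo bytes');
    return $path;
};
$ok = sketchProcessFile('a.mp4', $download, function (string $path) use (&$seen) {
    $seen[] = $path;
    return 'uploaded';
});
$failed = sketchProcessFile('b.mp4', $download, function (string $path) use (&$seen) {
    $seen[] = $path;
    throw new RuntimeException('DAM rejected upload');
});
```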
---
## Testing Checklist
### Unit Tests
- [x] FilenameParser - All 8 tests passing
- [ ] MetadataMerger - Need test cases
- [ ] BoxFileRetriever - Need test cases
### Integration Tests
- [ ] Load files from Box folder
- [ ] Parse V2 filenames correctly
- [ ] Extract tracking IDs
- [ ] Load master metadata from DB
- [ ] Merge metadata (filename wins)
- [ ] Build asset representation
- [ ] Upload to DAM successfully
- [ ] Update campaign status to A3
### UI Tests
- [ ] Box Folder ID input validation
- [ ] File list display
- [ ] Filename validation display
- [ ] Metadata preview (locked vs editable)
- [ ] Field editing
- [ ] Upload progress display
- [ ] Error handling
---
## Known Limitations & Considerations
### 1. Box API Rate Limits
- Access tokens issued via JWT auth are valid for 60 minutes
- File downloads have no explicit timeout
- Consider batch processing for large folders
### 2. Filename Validation
- Strict validation will reject non-compliant filenames
- User must fix filenames before upload
- Consider adding "force upload" option for special cases
### 3. Metadata Conflicts
- Filename always wins (as designed)
- Conflicts are logged but not shown to user
- Consider adding conflict warning UI
### 4. Database Lookups
- Tracking IDs must exist in PostgreSQL
- No fallback if tracking ID not found
- Consider adding "create new master" option
### 5. Upload Folder Extraction
- Relies on tracking ID → master asset → upload_directory
- If upload_directory is NULL, upload will fail
- Need error handling for this case
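
A minimal sketch of the missing guard for sections 4 and 5 is shown below. It assumes the lookup yields `null` when the tracking ID is unknown (the actual return value of `lookupByTrackingId()` is not documented here) and that the result uses the `upload_directory` key shown in Step 4. `sketchResolveUploadDirectory` is a hypothetical name.

```php
<?php
// Hypothetical guard; assumes a null lookup result means "tracking ID not found".
function sketchResolveUploadDirectory(?array $masterAsset, string $trackingId): array
{
    if ($masterAsset === null) {
        return ['ok' => false, 'error' => "Tracking ID {$trackingId} not found in master_assets"];
    }
    $dir = $masterAsset['upload_directory'] ?? null;
    if (!is_string($dir) || $dir === '') {
        // Covers both a NULL column and a missing key.
        return ['ok' => false, 'error' => "Master asset {$trackingId} has no upload_directory"];
    }
    return ['ok' => true, 'upload_directory' => $dir];
}

$check = sketchResolveUploadDirectory(['tracking_id' => 'a7K9mP', 'upload_directory' => null], 'a7K9mP');
echo $check['error'], PHP_EOL; // Master asset a7K9mP has no upload_directory
```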
---
## Configuration Requirements
### PostgreSQL Database
```
Host: localhost
Port: 5433
Database: ferrero_tracking
Table: master_assets
Required Fields:
- tracking_id (primary key)
- opentext_id (DAM asset ID)
- upload_directory (target folder ID)
- metadata (JSON - full DAM metadata)
```
### Box Configuration
```
File: Box-config.json or 43984435_n1izyn3l_config.json
Auth Method: JWT (RSA-256)
Root Folder: 348304357505
```
### DAM Configuration
```
Environment: Production
Base URL: https://ppr.dam.ferrero.com/otmmapi
Metadata Model: ECOMMERCE
Security Policy: 1594
```
---
## Code Quality Notes
### Well Implemented ✅
- Strict filename validation with detailed errors
- Filename always wins (requirement met)
- Editable field identification
- Conflict tracking
- Comprehensive parsing
- JWT authentication (production-ready)
- Test suite for parser
### Could Enhance
- Add test suites for Merger and Retriever
- Batch upload optimization
- Progress tracking for large uploads
- Error recovery strategies
- Conflict resolution UI
- Manual metadata override option
---
## Performance Considerations
### Expected Performance
- Box file listing: 1-3 seconds
- Filename parsing: <1ms per file
- DB metadata lookup: <100ms per tracking ID
- Metadata merge: <10ms per file
- File download from Box: 5-30 seconds (depending on size)
- DAM upload: 3-10 seconds per file
### Optimization Opportunities
- Cache parsed filenames
- Batch DB lookups
- Parallel Box downloads
- Streaming uploads (for large files)
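
As an illustration of the "batch DB lookups" idea, the sketch below builds one parameterized query for a whole folder's tracking IDs instead of one query per file. The table and column names come from the configuration section of this document; `sketchBuildBatchLookup` is a hypothetical helper, and whether `DatabaseClient` exposes a PDO handle to run the query is an assumption. With PDO, the result could be passed to `prepare()` and `execute()`.

```php
<?php
// Hypothetical helper: builds a single parameterized IN (...) query.
function sketchBuildBatchLookup(array $trackingIds): array
{
    // Duplicate IDs are collapsed so each master asset is fetched once.
    $ids = array_values(array_unique($trackingIds));
    if ($ids === []) {
        return ['sql' => null, 'params' => []];
    }
    $placeholders = implode(', ', array_fill(0, count($ids), '?'));
    $sql = 'SELECT tracking_id, opentext_id, upload_directory, metadata '
         . "FROM master_assets WHERE tracking_id IN ({$placeholders})";
    return ['sql' => $sql, 'params' => $ids];
}

$query = sketchBuildBatchLookup(['a7K9mP', 'b2X4qR', 'a7K9mP']);
echo $query['sql'], PHP_EOL;
```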
---
## Error Handling Strategy
### Validation Errors
- Invalid filename structure → Show errors, block upload
- Missing tracking ID → Show error, allow manual entry
- Tracking ID not in DB → Error, cannot proceed
### Runtime Errors
- Box connection failure → Retry with exponential backoff
- Download failure → Log, skip file, continue with others
- Upload failure → Log, mark file for retry, don't update status
- DB connection failure → Fatal error, stop process
### User Experience
- Clear error messages
- Validation feedback in real-time
- Progress indicators
- Retry options
- Detailed logs for debugging
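
The exponential backoff retry named under Runtime Errors could look roughly like the sketch below. The helper name, the attempt limit, and the base delay are illustrative assumptions, not values from the project. The sleep function is injectable so the behaviour can be checked without real waiting.

```php
<?php
// Hypothetical retry helper; attempt count and delays are illustrative values.
function sketchRetryWithBackoff(callable $operation, int $maxAttempts = 4, int $baseDelayMs = 200, ?callable $sleep = null)
{
    // Default to a real sleep; callers may inject a fake one.
    $sleep = $sleep ?? function (int $ms): void {
        usleep($ms * 1000);
    };

    for ($attempt = 1; ; $attempt++) {
        try {
            return $operation($attempt);
        } catch (RuntimeException $e) {
            if ($attempt >= $maxAttempts) {
                throw $e; // give up after the final attempt
            }
            // Delay doubles each time: base, 2x base, 4x base, ...
            $sleep($baseDelayMs * (2 ** ($attempt - 1)));
        }
    }
}

// Demo: the operation fails twice, then succeeds on the third attempt.
$delays = [];
$result = sketchRetryWithBackoff(
    function (int $attempt): string {
        if ($attempt < 3) {
            throw new RuntimeException('Box connection failed');
        }
        return "connected on attempt {$attempt}";
    },
    4,
    200,
    function (int $ms) use (&$delays): void {
        $delays[] = $ms; // record instead of sleeping
    }
);
```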
---
## Documentation Status
### Completed ✅
- [x] Lookup domains documentation
- [x] V2 naming convention reference
- [x] FilenameParser API documentation
- [x] MetadataMerger API documentation
- [x] BoxFileRetriever API documentation
- [x] Complete workflow documentation
- [x] This status document
### Pending ⏳
- [ ] UI component documentation
- [ ] AJAX endpoint documentation
- [ ] Testing procedures
- [ ] Deployment guide
- [ ] User manual
---
## Next Session Quick Start
### To Continue Implementation:
1. **Read this file** (UPLOAD_FROM_BOX_STATUS.md) for full context
2. **Test existing components:**
```bash
cd /Users/daveporter/Desktop/CODING-2024/Ferrero-Opentext
php test_filename_parser.php
```
3. **Start UI integration:**
- Edit `workflow_v3.php`
- Add "Upload from Box" tab (see Task 5 above)
- Implement AJAX endpoints (see Task 6 above)
4. **Reference files:**
- `ECOMMERCE_ALLOWED_FIELDS.md` - Field reference
- `downloads/asset_representation MVP.json` - Metadata structure
- `PROJECT_STATUS_2025-10-29.md` - Overall project status
5. **Key Classes to Use:**
```php
require_once 'src/FilenameParser.php';
require_once 'src/MetadataMerger.php';
require_once 'src/BoxFileRetriever.php';
require_once 'src/DatabaseClient.php';
require_once 'src/AssetUploaderSimple.php';
```
---
## Success Criteria
### Phase 1 (Core Components) ✅ COMPLETE
- [x] Lookup domains documentation exported
- [x] FilenameParser implemented and tested
- [x] MetadataMerger implemented
- [x] BoxFileRetriever implemented
- [x] All integration points identified
### Phase 2 (UI Integration) ⏳ READY TO START
- [ ] "Upload from Box" tab added to workflow
- [ ] All AJAX endpoints implemented
- [ ] File list display working
- [ ] Metadata preview/edit UI working
- [ ] Upload processing complete
- [ ] Status update to A3 working
### Phase 3 (Testing & Polish) ⏳ PENDING
- [ ] Integration tests passing
- [ ] Error handling verified
- [ ] Performance acceptable
- [ ] User experience polished
- [ ] Documentation complete
---
**Status:** Ready for Phase 2 Implementation

**Estimated Time to Complete Phase 2:** 4-6 hours

**Overall Progress:** 40% Complete (Core backend done, UI pending)

---
**End of Status Report**