ferrero-opentext/Python-Version/MARKDOWN_DOCS/COMPLETE_WORKFLOW_GUIDE.md

3799 lines
107 KiB
Markdown
Raw Permalink Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Ferrero DAM Asset Production Workflow
## Complete System Guide
**Version:** 2.1
**Last Updated:** November 2025
**Document Owner:** Operations Team
**Audience:** Campaign Managers, Creative Teams, Agencies, Technical Staff
---
## Complete Workflow Overview: A1 → A4 (with A5-A6 Rejection Cycle)
```mermaid
flowchart TB
subgraph Creation["PHASE 1: Campaign Creation"]
A1_START[Campaign Manager creates campaign in DAM] --> A1_UPLOAD[Creative team uploads master assets]
A1_UPLOAD --> A1_STATUS[Set campaign status to A1]
end
subgraph A1toA2["PHASE 2: A1→A2 Master Download (AUTOMATED - Every 5 min)"]
A1_STATUS -->|Script detects A1 campaign| A1_DETECT[a1_to_a2_download.py]
A1_DETECT --> A1_DOWNLOAD[Download all master assets from DAM]
A1_DOWNLOAD --> A1_TRACKING[Generate unique tracking IDs<br/>pOiJ9s, a7K9mP, bvSGyd]
A1_TRACKING --> A1_BOX[Upload to Box with tracking IDs<br/>nutella_hero_pOiJ9s.mp4]
A1_BOX --> A1_DB[Store in database with full metadata]
A1_DB --> A1_CX{CreativeX in<br/>master metadata?}
A1_CX -->|Yes| A1_CX_STORE[Store master score<br/>status='master-cx-score']
A1_CX -->|No| A1_CX_SKIP[Log: No CreativeX normal]
A1_CX_STORE --> A1_CHECK{All assets<br/>successful?}
A1_CX_SKIP --> A1_CHECK
A1_CHECK -->|Yes| A1_UPDATE[Update DAM status A1 → A2]
A1_CHECK -->|No| A1_RETRY[Retry failed on next run]
A1_UPDATE --> A1_EMAIL[Email: Masters ready for localization]
A1_RETRY -.->|5 min later| A1_DETECT
end
subgraph Agency["PHASE 3: Agency Localization + CreativeX Scoring"]
A1_EMAIL --> AG_RECEIVE[Agency receives Box URLs + tracking IDs]
AG_RECEIVE --> AG_DOWNLOAD[Download masters from Box]
AG_DOWNLOAD --> AG_CREATE[Create localized versions<br/>DE, FR, IT versions + aspect ratios]
AG_CREATE --> AG_CREATIVEX[🔴 CRITICAL: Submit EVERY derivative to CreativeX<br/>200 derivatives = 200 analyses required]
AG_CREATIVEX --> AG_PDF[Receive PDF scorecard for EACH asset]
AG_PDF --> AG_PDF_UP[Upload ALL PDFs to Box 350605024645]
AG_PDF_UP --> AG_SCORE_RUN[Run creativex_scoring_storing.py]
AG_SCORE_RUN --> AG_SCORE_DB[Scores stored in database<br/>status='active']
AG_SCORE_DB --> AG_VERIFY[Verify: derivative count = score count ✅]
AG_VERIFY --> AG_NAMING[Use Naming Tool for EACH derivative<br/>Copy exact filename]
AG_NAMING --> AG_UPLOAD[Upload to Box A2→A3 folder<br/>With job number + tracking ID]
end
subgraph A2toA3["PHASE 4: A2→A3 Derivative Upload (AUTOMATED - Every 5 min)"]
AG_UPLOAD -->|Script detects files| A2_DETECT[a2_to_a3_upload_polling.py]
A2_DETECT --> A2_PARSE[Parse filename: Extract tracking ID]
A2_PARSE --> A2_LOOKUP[Load master metadata from database]
A2_LOOKUP --> A2_CX_LOOKUP[Lookup CreativeX score by filename]
A2_CX_LOOKUP --> A2_CX_CHECK{Score found?}
A2_CX_CHECK -->|Yes| A2_CX_USE[Score: 71, URL: actual]
A2_CX_CHECK -->|No| A2_CX_DEFAULT[⚠️ Score: 0, URL: placeholder<br/>Email warning sent]
A2_CX_USE --> A2_BUILD[Build asset representation]
A2_CX_DEFAULT --> A2_BUILD
A2_BUILD --> A2_OVERRIDE[Override from filename:<br/>Language, Asset Type, Description]
A2_OVERRIDE --> A2_MAP[Map asset type: EHI → heroimage]
A2_MAP --> A2_STRIP[Strip to clean filename]
A2_STRIP --> A2_UPLOAD[Upload to DAM Final Assets]
A2_UPLOAD --> A2_DELETE[Delete from Box]
A2_DELETE --> A2_EMAIL[Email: Upload successful]
A2_EMAIL --> A2_STATUS[Campaign status: A3]
end
subgraph Approval["PHASE 5: Approval Process"]
A2_STATUS --> A3_REVIEW[Legal/Compliance/Brand review in DAM]
A3_REVIEW --> A3_CHECK[Check quality, compliance, guidelines]
A3_CHECK --> A3_DECISION{Approved?}
end
subgraph Success["✅ APPROVED PATH"]
A3_DECISION -->|✅ All checks passed| A4_STATUS[Status updated to A4]
A4_STATUS --> A4_WEBHOOK[Webhook sent to Make.com]
A4_WEBHOOK --> A4_LIVE[Assets go LIVE]
A4_LIVE --> A4_DONE[🎉 Campaign Complete - Assets in market]
end
subgraph Rejection["❌ REJECTED PATH - Rework Required"]
A3_DECISION -->|❌ Issues found| A5_MARK[Approver marks specific assets:<br/>ECOMMERCE STATUS = NOT APPROVED]
A5_MARK --> A5_LEGAL[Legal reviewer adds comments:<br/>Missing copyright, compliance issues]
A5_LEGAL --> A5_IACC[IA&CC reviewer adds comments:<br/>Brand guideline violations]
A5_IACC --> A5_APPROVER[General approver adds comments:<br/>Creative feedback]
A5_APPROVER --> A5_STATUS[Campaign status → A5]
end
subgraph A5toA6["PHASE 6: A5→A6 Rejection Download (AUTOMATED)"]
A5_STATUS -->|Script detects A5| A5_DETECT[a5_to_a6_download.py]
A5_DETECT --> A5_FILTER[🔍 Filter: Download ONLY<br/>NOT APPROVED assets]
A5_FILTER --> A5_APPROVED[✅ APPROVED assets: Skipped<br/>Not re-downloaded]
A5_FILTER --> A5_TRACKING[Lookup EXISTING tracking ID<br/>REUSE: pOiJ9s]
A5_TRACKING --> A5_DOWNLOAD[Download rejected assets from DAM]
A5_DOWNLOAD --> A5_BOX_UP[Upload to Box A5→A6 folder<br/>Separate folder from A1→A2]
A5_BOX_UP --> A5_EMAIL_OUT[Email with ALL rejection comments:<br/>Legal + IA&CC + Approver feedback]
A5_EMAIL_OUT --> A5_AGENCY[Agency receives detailed feedback]
end
subgraph Rework["PHASE 7: Agency Fixes + Re-upload"]
A5_AGENCY --> RW_DOWNLOAD[Download rejected assets from Box]
RW_DOWNLOAD --> RW_REVIEW[Review ALL rejection comments]
RW_REVIEW --> RW_FIX[Fix issues based on feedback]
RW_FIX --> RW_CREATIVEX[🔴 Re-submit to CreativeX<br/>MUST get new score for fixed version]
RW_CREATIVEX --> RW_PDF[Upload new PDF with updated score]
RW_PDF --> RW_SCORE_RUN[Run creativex_scoring_storing.py]
RW_SCORE_RUN --> RW_DB[Database updated with new score<br/>Old score marked 'superseded']
RW_DB --> RW_NAMING[Use Naming Tool:<br/>NEW job number + SAME tracking ID]
RW_NAMING --> RW_EXAMPLE[Example: 7777_NUT_SUMMER_OLV_30S_16x9_DE_de_pOiJ9s.mp4<br/>↑ New job | Same tracking ↑]
RW_EXAMPLE --> RW_UPLOAD_A2[Upload to A2→A3 folder]
RW_UPLOAD_A2 -.->|Re-enters A2→A3 automation| A2_DETECT
end
subgraph ReworkLoop["🔄 Rework Can Repeat Multiple Times"]
RW_NOTE1[Rejected again? → Back to A5] -.-> A5_STATUS
RW_NOTE2[Assets can go through A5→A6<br/>cycle 2, 3, 4+ times until approved]
end
style A4_DONE fill:#4CAF50,stroke:#2E7D32,stroke-width:4px,color:#FFF
style A5_MARK fill:#EF5350,stroke:#C62828,stroke-width:3px
style RW_CREATIVEX fill:#9C27B0,stroke:#6A1B9A,stroke-width:3px
style AG_CREATIVEX fill:#9C27B0,stroke:#6A1B9A,stroke-width:3px
style A1_TRACKING fill:#2196F3,stroke:#1565C0,stroke-width:2px
style A5_TRACKING fill:#2196F3,stroke:#1565C0,stroke-width:2px
style RW_NOTE1 fill:#FFE082,stroke:#F57C00
style RW_NOTE2 fill:#FFE082,stroke:#F57C00
```
**🔴 CRITICAL REQUIREMENTS HIGHLIGHTED:**
1. **CreativeX Scoring (Purple):** Every derivative MUST be scored individually
2. **Tracking IDs (Blue):** Critical link between master and derivatives, reused in rework
3. **Rejection Comments (Red):** Legal, IA&CC, and approver feedback guide fixes
4. **Rework Loop (Orange):** Can repeat until approved - same tracking ID maintained
---
## Table of Contents
1. [Executive Overview](#executive-overview)
2. [System Architecture](#system-architecture)
3. [The Complete Asset Lifecycle](#the-complete-asset-lifecycle)
4. [The Naming Convention Tool](#the-naming-convention-tool)
5. [Python Automation Scripts](#python-automation-scripts)
6. [Detailed Workflow Walkthroughs](#detailed-workflow-walkthroughs)
7. [CreativeX Integration](#creativex-integration)
8. [The Database](#the-database)
9. [Common Mistakes & Pitfalls](#common-mistakes--pitfalls)
10. [Monitoring & Health Checks](#monitoring--health-checks)
11. [Troubleshooting Guide](#troubleshooting-guide)
12. [Appendices](#appendices)
---
## Executive Overview
### What is This System?
The Ferrero DAM Asset Production Workflow is an **end-to-end automated system** that manages digital assets from creation through localization to final delivery. It connects creative development, digital asset management (DAM), agency collaboration, and quality scoring into a seamless automated pipeline.
**In Simple Terms:**
Think of it as a smart conveyor belt for your digital assets. Assets enter at one end (campaign creation), get automatically processed, tracked, localized by agencies, quality-scored, and delivered at the other end - with minimal manual intervention and complete audit trails.
### The Problem It Solves
**Before Automation:**
- ❌ Manual file transfers between systems
- ❌ Lost tracking between master and derivative assets
- ❌ No visibility into localization progress
- ❌ Metadata errors and inconsistencies
- ❌ Hours spent on file organization
- ❌ Missing or incorrect quality scores
- ❌ No audit trail of changes
**After Automation:**
- ✅ Automated downloads and uploads
- ✅ Every derivative linked to master via tracking ID
- ✅ Real-time progress tracking via email
- ✅ Consistent metadata across all assets
- ✅ Automatic file organization
- ✅ AI-powered quality scoring integration
- ✅ Complete history in database
### Key Benefits
**Time Savings:**
- Manual download/upload: 2 hours per campaign → **30 seconds** (automated)
- Metadata entry: 15 minutes per asset → **0 seconds** (inherited from master)
- Tracking derivatives: 30 minutes → **instant** (database lookup)
- Quality score attachment: Manual entry → **automatic** (AI extraction)
**Error Reduction:**
- Metadata errors: 15-20% → **<1%** (automated validation)
- Lost files: Common **impossible** (database tracking)
- Wrong file versions: Frequent **eliminated** (tracking IDs)
**Visibility:**
- Campaign status: Unknown **real-time** (email + database)
- Asset history: None **complete audit trail** (database events)
- Quality metrics: Scattered **centralized** (database + reports)
### System Components
```mermaid
graph TB
subgraph Frontend["User-Facing Tools"]
NT[Naming Convention Tool<br/>public-v2/]
BOX[Box.com Folders<br/>4 separate folders]
EMAIL[Email Notifications<br/>Success/Error/Reports]
end
subgraph Backend["Automation Layer"]
A1A2[a1_to_a2_download.py<br/>Master Downloads]
A2A3[a2_to_a3_upload_polling.py<br/>Derivative Uploads]
A5A6[a5_to_a6_download.py<br/>Rejections/Rework]
B1B2[b1_to_b2_download.py<br/>Global Masters]
CX[creativex_scoring_storing.py<br/>Score Extraction]
DR[daily_report.py<br/>Reporting]
end
subgraph Storage["Data Layer"]
DAM[(OpenText DAM<br/>Asset Repository)]
DB[(PostgreSQL<br/>Tracking Database)]
BACKUP[(Daily Backups<br/>7-day retention)]
end
subgraph External["External Services"]
LLAMA[LlamaExtract AI<br/>CreativeX PDF parsing]
WEBHOOK[Make.com Webhooks<br/>Campaign notifications]
end
NT -.->|Generates| BOX
BOX <-->|Files| A1A2
BOX <-->|Files| A2A3
BOX <-->|Files| A5A6
BOX <-->|Files| B1B2
BOX <-->|PDFs| CX
DAM <-->|Assets + Metadata| A1A2
DAM <-->|Assets + Metadata| A2A3
DAM <-->|Assets + Metadata| A5A6
DAM <-->|Assets + Metadata| B1B2
A1A2 --> DB
A2A3 --> DB
A5A6 --> DB
B1B2 --> DB
CX --> DB
DR --> DB
CX <-->|API| LLAMA
A2A3 -.->|Status updates| WEBHOOK
A1A2 -.->|Notifications| EMAIL
A2A3 -.->|Notifications| EMAIL
A5A6 -.->|Notifications| EMAIL
B1B2 -.->|Notifications| EMAIL
CX -.->|Notifications| EMAIL
DR -.->|Daily summary| EMAIL
DB -.->|Automated| BACKUP
```
### Stakeholders & Their Roles
| Role | Responsibility | Tools Used |
|------|----------------|------------|
| **Campaign Manager** | Creates campaigns, monitors progress | DAM, Email notifications |
| **Creative Team** | Creates master assets, uploads to DAM | DAM |
| **Agency (Oliver)** | Localizes assets, uses naming tool | Naming Tool, Box |
| **Approvers** | Reviews assets, approves/rejects | DAM |
| **Operations** | Monitors system health, troubleshoots | Logs, Database, Email |
| **Technical Team** | Maintains scripts, database, backups | All systems |
### Success Metrics
**Current Performance (Week 1):**
- Campaigns processed: 50+ per week
- Assets processed: 500+ per week
- Success rate: 98.5%
- Average processing time: 5 minutes (vs 2 hours manual)
- Email notifications: 100% delivery
- Database backups: 7/7 successful
---
## System Architecture
### High-Level Architecture
The system consists of **5 core layers** that work together:
```mermaid
graph TB
subgraph L1["Layer 1: User Interface"]
UI1[DAM Web Interface<br/>Campaign & Asset Management]
UI2[Naming Convention Tool<br/>Filename Generation]
UI3[Box.com Interface<br/>File Upload/Download]
end
subgraph L2["Layer 2: Automation Scripts (Python)"]
S1[Workflow Scripts<br/>A1→A2, A2→A3, A5→A6, B1→B2]
S2[Support Scripts<br/>CreativeX, Daily Reports]
S3[Shared Modules<br/>DAM Client, Box Client, Database]
end
subgraph L3["Layer 3: Data Storage"]
D1[(PostgreSQL Database<br/>Tracking & Metadata)]
D2[(Box Storage<br/>File Transfer)]
D3[(DAM Storage<br/>Asset Repository)]
end
subgraph L4["Layer 4: External Services"]
E1[LlamaExtract AI<br/>PDF Parsing]
E2[Mailgun SMTP<br/>Email Delivery]
E3[Make.com<br/>Webhook Automation]
end
subgraph L5["Layer 5: Infrastructure"]
I1[Cron Scheduler<br/>Every 5 minutes]
I2[Backup System<br/>Daily + Weekly]
I3[Log Management<br/>Rotating Logs]
end
UI1 <--> S1
UI2 --> UI3
UI3 <--> S1
S1 <--> D1
S1 <--> D2
S1 <--> D3
S2 <--> D1
S2 <--> D2
S2 <--> E1
S1 -.-> E2
S1 -.-> E3
I1 -.-> S1
I1 -.-> S2
I2 -.-> D1
```
### Physical Deployment
**Server Location:** `/opt/ferrero-automation/Python-Version/`
**Key Directories:**
```
/opt/ferrero-automation/
├── Box-config.json # Box JWT authentication (one level up!)
└── Python-Version/
├── .env # Credentials (NOT in Git)
├── config/
│ ├── config.yaml # Main configuration
│ ├── field_mappings.yaml # Metadata field mappings
│ ├── asset_type_mappings.yaml # 3-letter → DAM codes
│ └── certificates/ # mTLS certificates
├── scripts/
│ ├── a1_to_a2_download.py
│ ├── a2_to_a3_upload_polling.py
│ ├── a5_to_a6_download.py
│ ├── b1_to_b2_download.py
│ ├── creativex_scoring_storing.py
│ ├── daily_report.py
│ └── shared/ # Shared modules
├── database/
│ ├── init.sql # Database schema
│ ├── backup.sh # Backup script
│ ├── restore.sh # Restore script
│ └── check_backups.sh # Health check
├── backups/
│ ├── dumps/ # Daily SQL dumps (7 days)
│ └── basebackups/ # Weekly binary (1 backup)
├── logs/ # Rotating log files
└── temp/ # Temporary downloads
```
### Data Flow Overview
```mermaid
flowchart LR
subgraph Creation["1. Creation"]
BRIEF[Campaign Brief]
CREATIVE[Creative Assets]
DAM_A1[DAM Status: A1]
end
subgraph Download["2. Master Download"]
DETECT[Detect A1 Campaign]
DL[Download Masters]
TID[Generate Tracking IDs]
BOX_UP[Upload to Box]
DB_STORE[Store in Database]
end
subgraph Localization["3. Agency Localization"]
NAMING[Use Naming Tool]
LOCALIZE[Create Derivatives]
BOX_A2A3[Upload to Box A2→A3]
end
subgraph Upload["4. Derivative Upload"]
PARSE[Parse Filename]
LOOKUP[Database Lookups]
CREATIVEX[Attach CreativeX Score]
DAM_UP[Upload to DAM]
DAM_A3[Status: A3]
end
subgraph Approval["5. Review"]
APPROVE{Approval?}
A4[A4: Go Live]
A5[A5: Rejected]
end
subgraph Rework["6. Rework (if rejected)"]
A5A6[Download Rejections]
FIX[Agency Fixes]
REUP[Re-upload]
end
BRIEF --> CREATIVE --> DAM_A1
DAM_A1 --> DETECT --> DL --> TID --> BOX_UP --> DB_STORE
DB_STORE --> NAMING --> LOCALIZE --> BOX_A2A3
BOX_A2A3 --> PARSE --> LOOKUP --> CREATIVEX --> DAM_UP --> DAM_A3
DAM_A3 --> APPROVE
APPROVE -->|Yes| A4
APPROVE -->|No| A5
A5 --> A5A6 --> FIX --> REUP
REUP --> BOX_A2A3
```
---
## The Complete Asset Lifecycle
### Overview: From Brief to Live Asset
The journey of a Ferrero digital asset involves **6 major phases** and can take anywhere from 2 days (smooth approval) to 2 weeks (multiple rejection cycles).
### Phase 1: Campaign Creation & Master Asset Development
**Who:** Campaign Manager + Creative Team
**Where:** OpenText DAM
**Duration:** Varies (creative development time)
**Steps:**
1. **Campaign Brief Created**
- Campaign Manager creates campaign in DAM
- Assigns campaign number (e.g., C000000078)
- Sets campaign type: "Local" or "Global comm"
- Defines brand, market, fiscal year
2. **Master Assets Created**
- Creative team develops assets (videos, images, documents)
- Assets uploaded to DAM campaign folder
- Placed in "01. Master Assets" subfolder
- Metadata added (brand, language, description, etc.)
3. **Campaign Status Set to A1**
- Campaign Manager marks campaign ready
- Status change triggers automation
- System begins monitoring for this campaign
**Folder Structure in DAM:**
```
Campaign: C000000078-Summer_Promo/
├── 01. Master Assets/ ← Masters go here
│ ├── nutella_summer_hero.mp4
│ ├── nutella_summer_logo.png
│ └── Subfolder_EU/
│ └── nutella_eu_variant.mp4
└── 02. Final Assets/ ← Derivatives will go here later
```
**What Happens Next:**
- Automation detects campaign within 5 minutes
- A1A2 script begins processing
- Campaign Manager receives email when masters are ready in Box
---
### Phase 2: Master Asset Download (A1→A2) - AUTOMATED
**Who:** Python Script (a1_to_a2_download.py)
**Trigger:** Campaign status = A1
**Frequency:** Every 5 minutes
**Duration:** 2-10 minutes depending on file sizes
```mermaid
sequenceDiagram
autonumber
participant DAM as OpenText DAM
participant Script as a1_to_a2_download.py<br/>(Runs every 5 min)
participant Box as Box.com
participant DB as PostgreSQL Database
participant Email as Email Notification
participant CX as CreativeX Extraction
Note over Script: Cron triggers every 5 minutes
Script->>DAM: Query: Get campaigns WHERE status = 'A1'
DAM-->>Script: Campaign list (may be empty)
alt No A1 campaigns found
Script->>Script: Log "No A1 campaigns found" (normal)
Script->>Script: Exit cleanly
end
alt A1 campaign(s) found
Script->>DAM: Get campaign details
Script->>DAM: Search "01. Master Assets" folder (recursive)
DAM-->>Script: List of master assets + metadata
loop For each asset
Script->>DAM: Download asset file
Script->>Script: Generate unique 6-char tracking ID
Script->>Box: Upload as "filename_TRACKINGID.ext"
Script->>DB: Store master_assets record
Script->>DB: Store full_metadata as JSONB
Script->>CX: Extract CreativeX score (if present)
alt CreativeX found in metadata
CX->>DB: Store with status='master-cx-score'
CX->>Script: Log "Stored master score"
else No CreativeX in metadata
CX->>Script: Log "No CreativeX (normal)"
end
Script->>Script: Track success
end
alt ALL assets successful
Script->>DAM: Update campaign status A1 → A2
Script->>DB: Record campaign_status
Script->>Email: Send success notification
Email-->>Team: "3 master assets ready for localization"
else Some assets failed
Script->>Email: Send partial success notification
Email-->>Team: "2/3 successful, 1 failed"
Note over Script,DAM: Status NOT updated<br/>Retry on next run
end
end
```
**Detailed Process Breakdown:**
**Step 1-2: Campaign Detection**
```
Query: SELECT * FROM campaigns WHERE status = 'A1' LIMIT 1
Result: Campaign C000000078 found
Log: "Processing campaign: Summer_Promo (C000000078)"
```
**Step 3-4: Asset Discovery**
```
Search: Campaign/01. Master Assets/ (recursive)
Found:
- nutella_summer_hero.mp4 (50MB)
- nutella_summer_logo.png (2MB)
- Subfolder_EU/nutella_eu_variant.mp4 (45MB)
Total: 3 assets
```
**Step 5-6: Tracking ID Generation**
```
Algorithm: Random 6 alphanumeric characters
Validation: Check uniqueness in database
Generated IDs:
- pOiJ9s (for nutella_summer_hero.mp4)
- a7K9mP (for nutella_summer_logo.png)
- bvSGyd (for nutella_eu_variant.mp4)
```
**Step 7: Box Upload**
```
Original: nutella_summer_hero.mp4
Box name: nutella_summer_hero_pOiJ9s.mp4
Box folder: 348304357505 (A1→A2 Local Masters)
Subfolder: C000000078-Summer_Promo/
URL: https://app.box.com/file/123456789
```
**Step 8-9: Database Storage**
```sql
INSERT INTO master_assets (
tracking_id, opentext_id, original_filename,
full_metadata, upload_directory, ...
) VALUES (
'pOiJ9s',
'747110b96dede761a445a3bf0c90f13bb38024fa',
'nutella_summer_hero',
'{"name": "nutella_summer_hero.mp4", "metadata": {...}}', -- Full JSONB
'12345', -- Final Assets folder ID
...
);
```
**Step 10-11: CreativeX Extraction (if present)**
```
Search metadata for: FERRERO.TAB.FIELD.CREATIVEX
Found: Score = 85, URL = https://app.creativex.com/...
Store:
INSERT INTO creativex_scores (
filename, quality_score, creativex_url,
tracking_id, status
) VALUES (
'nutella_summer_hero.mp4', '85', 'https://...',
'pOiJ9s', 'master-cx-score'
);
Log: "Stored master CreativeX score: 85 (Tracking: pOiJ9s)"
```
**Step 12: Status Update**
```
Check: All 3 assets successful? YES
Update: Campaign status A1 → A2
Record: campaign_status table (prevents re-processing)
```
**Step 13: Email Notification**
```
To: ferrero-team@ferrero.com, operations@oliver.agency
Subject: ✅ Master Assets Downloaded - Campaign Summer_Promo
Body:
- Campaign: Summer_Promo (C000000078)
- Assets Downloaded: 3
- Status Updated: A1 → A2
Processed Assets:
1. nutella_summer_hero.mp4
Tracking ID: pOiJ9s
Box URL: https://app.box.com/file/123456789
CreativeX Score: 85 (master)
2. nutella_summer_logo.png
Tracking ID: a7K9mP
Box URL: https://app.box.com/file/123456790
3. nutella_eu_variant.mp4 (from Subfolder_EU/)
Tracking ID: bvSGyd
Box URL: https://app.box.com/file/123456791
```
**What If Something Fails?**
**Scenario: Network timeout on asset 2**
```
Asset 1: ✅ Success (pOiJ9s)
Asset 2: ❌ Network timeout
Asset 3: ⏭️ Not attempted (stop on error for safety)
Result:
- Status NOT updated (still A1)
- Email sent: "Partial Download - 1/3 successful"
- Next run (5 minutes): Retries asset 2 and 3
- Eventually all succeed → Status updated to A2
```
**Why Status Update is All-or-Nothing:**
- Prevents partial state in DAM
- Clear definition: A2 means "ALL masters in Box"
- Easier troubleshooting (either all done or not done)
- Safer for downstream processes
---
### Phase 3: Agency Localization
**Who:** Agency (Oliver) Creative Team
**Where:** Box + Naming Convention Tool
**Duration:** 1-5 days depending on complexity
**Steps:**
**Step 1: Receive Notification**
Agency receives email:
```
Subject: ✅ Master Assets Downloaded - Campaign Summer_Promo
Your master assets are ready for localization:
- nutella_summer_hero_pOiJ9s.mp4 (Box URL)
- nutella_summer_logo_a7K9mP.png (Box URL)
- nutella_eu_variant_bvSGyd.mp4 (Box URL)
Next steps:
1. Download masters from Box
2. Create localized versions
3. Use Naming Tool to generate filenames
4. Upload to Box A2→A3 folder: [URL]
```
**Step 2: Download Masters from Box**
Agency clicks Box URLs in email, downloads files
**Step 3: Create Localized Versions**
For each market/language:
- German version of hero video
- French version of hero video
- Italian version of hero video
- etc.
**Step 4: Use Naming Convention Tool**
**CRITICAL:** This is where the naming tool becomes essential!
Agency opens: `http://naming-tool-url/public-v2/index.php`
**For EACH localized file, agency enters:**
- OMG Job Number: `6487512` (from their project management)
- Brand: `NUT` (Nutella)
- Subject: `SUMMER` (from brief)
- Asset Type: `OLV` (Online Video) Maps to "onlinevideodigitalvideo" in DAM
- Duration: `30` (30 seconds)
- Aspect Ratio: `16x9`
- Spot Version: `MST` (Master version)
- Country: `DE` (Germany)
- Language: `de` (German)
- Social Media: `IGF` (if Instagram Feed version)
- Tracking ID: `pOiJ9s` (from master asset filename)
**Tool Generates:**
```
6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
```
**Why Each Field Matters:**
| Field | Purpose | What Happens If Wrong |
|-------|---------|----------------------|
| **Job Number** | Links to OMG project billing | Can't track project costs |
| **Brand** | Metadata inheritance, organization | Wrong brand in DAM |
| **Subject** | Description field in DAM | Searchability issues |
| **Asset Type** | Critical for DAM categorization | Upload may fail or wrong type |
| **Duration** | Asset specifications | Optional, info only |
| **Aspect Ratio** | Asset specifications | Required for videos/images |
| **Spot Version** | Master vs Reference designation | Classification |
| **Country** | Market targeting | Wrong market metadata |
| **Language** | Language metadata in DAM | Wrong language = unusable |
| **Social Media** | Platform-specific metadata | Platform targeting |
| **Tracking ID** | Links to master (DATABASE LOOKUP) | **Upload fails, can't find master** |
**Step 5: Upload to Box**
- Agency uploads to Box folder: 348526703108
- Uses EXACT filename from naming tool
- One file at a time or batch upload
**What Automation Sees:**
```
Box folder poll finds:
- 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
- 6487512_NUT_SUMMER_OLV_30S_16x9_MST_FR_fr_IGF_pOiJ9s.mp4
- 6487512_NUT_SUMMER_OLV_30S_16x9_MST_IT_it_IGF_pOiJ9s.mp4
Validation:
✅ All filenames parse correctly
✅ All tracking IDs found in database
✅ Ready for upload
```
---
### Phase 4: Derivative Asset Upload (A2→A3) - AUTOMATED
**Who:** Python Script (a2_to_a3_upload_polling.py)
**Trigger:** Files appear in Box A2A3 folder
**Frequency:** Every 5 minutes
**Duration:** 1-3 minutes per file
```mermaid
sequenceDiagram
autonumber
participant Agency
participant Box as Box A2→A3 Folder
participant Script as a2_to_a3_upload_polling.py
participant Parser as Filename Parser V2.1
participant DB as Database
participant Mapper as Asset Type Mapper
participant DAM
participant Email
Agency->>Box: Upload 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
Note over Script: Cron runs every 5 minutes
Script->>Box: List files in folder
Box-->>Script: Found 1 file
Script->>Parser: Parse filename
Parser->>Parser: Split by underscore
Parser->>Parser: Position 1: Job = 6487512
Parser->>Parser: Position 2: Brand = NUT
Parser->>Parser: Position 3: Subject = SUMMER
Parser->>Parser: Position 4: Asset = OLV
Parser->>Parser: Position 5: Duration = 30S
Parser->>Parser: Position 6: Ratio = 16x9
Parser->>Parser: Position 7: Spot = MST
Parser->>Parser: Position 8: Country = DE
Parser->>Parser: Position 9: Language = de
Parser->>Parser: Position 10: Social = IGF
Parser->>Parser: Position 11: Tracking = pOiJ9s
Parser-->>Script: Parsed components
Script->>DB: SELECT * FROM master_assets WHERE tracking_id = 'pOiJ9s'
DB-->>Script: Master metadata (brand, campaign, folders, etc.)
Script->>DB: SELECT * FROM creativex_scores WHERE filename = '6487512_NUT...'
alt CreativeX score found
DB-->>Script: Score: 85, URL: https://creativex...
Script->>Script: creativex_found = True
else No score found
Script->>Script: Use defaults (Score: 0, placeholder URL)
Script->>Script: creativex_found = False
end
Script->>Script: Build asset representation
Script->>Script: Merge master metadata + filename overrides
Script->>Script: Override: Language = de (from filename)
Script->>Script: Override: Description = SUMMER (from filename)
Script->>Mapper: Map asset type: OLV
Mapper-->>Script: onlinevideodigitalvideo
Script->>Script: Set FERRERO.FIELD.MKTG.ASSET TYPE = onlinevideodigitalvideo
Script->>Script: Set STATE = Local (forced)
Script->>Script: Set AGENCY NAME = Oliver
Script->>Script: Add CreativeX Score = 85
Script->>Script: Add CreativeX URL = https://...
Script->>Script: Strip filename: NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF.mp4
Script->>DAM: Upload to Final Assets folder
DAM-->>Script: Asset ID: xyz789
Script->>DB: INSERT INTO derivative_assets
Script->>Box: Delete processed file
alt CreativeX was found
Script->>Email: Success email (green, shows Score: 85)
else CreativeX defaulted
Script->>Email: Success email (orange warning box)
end
Email-->>Agency: "Asset uploaded successfully"
```
**Detailed Process:**
**Input File:**
```
6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
```
**Parsing (V2.1 Structure):**
```python
{
'omg_job_number': '6487512',
'brand_code': 'NUT',
'subject_title': 'SUMMER',
'asset_type': 'OLV',
'seconds': '30',
'aspect_ratio': '16x9',
'spot_version': 'MST',
'country_code': 'DE',
'language_code': 'de',
'social_media_version': 'IGF',
'tracking_id': 'pOiJ9s',
'tracking_mode': 'full' # Full metadata inheritance
}
```
**Database Lookups:**
**Lookup 1: Master Asset**
```sql
SELECT * FROM master_assets
WHERE tracking_id = 'pOiJ9s'
```
Returns: Complete master metadata, upload folder ID, campaign info
**Lookup 2: CreativeX Score**
```sql
SELECT * FROM creativex_scores
WHERE filename = '6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4'
AND status = 'active'
```
Returns: Score: 85, URL: https://app.creativex.com/... (or None if not found)
**Metadata Building:**
**From Master (Inherited):**
- Brand Name: "Nutella"
- Brand Code: "NUT"
- Campaign ID: "C000000078"
- Campaign Name: "Summer_Promo"
- Upload Folder: Final Assets folder ID
- All other metadata fields (fiscal year, tags, etc.)
**From Filename (Overrides):**
- Language: `de` (German) - OVERRIDES master
- Country: `DE` (Germany)
- Description: `SUMMER` - OVERRIDES master
- Asset Type: `OLV` mapped to `onlinevideodigitalvideo`
**Forced Values:**
- STATE: `Local` (always for A2A3)
- AGENCY NAME: `Oliver`
**From Database (CreativeX):**
- CreativeX Score: `85` (or `0` if not found)
- CreativeX URL: `https://app.creativex.com/...` (or placeholder)
**Asset Type Mapping:**
```
Filename has: OLV
Mapping: OLV → onlinevideodigitalvideo
DAM receives: "onlinevideodigitalvideo" (valid domain value)
```
**Filename Stripping:**
```
Input: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
Remove: Job number (6487512)
Remove: Tracking ID (pOiJ9s)
Output: NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF.mp4
```
**DAM Upload:**
```
Upload to: Campaign/02. Final Assets/
Filename: NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF.mp4
Metadata: 27 MVP fields populated
Result: Asset ID xyz789
```
**Database Record:**
```sql
INSERT INTO derivative_assets (
tracking_id, derivative_filename, dam_asset_id
) VALUES (
'pOiJ9s',
'NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF',
'xyz789'
);
```
**Email Notification:**
```
Subject: ✅ Asset Uploaded to DAM - NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF.mp4
Original Filename: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
Clean Filename: NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF.mp4
DAM Asset ID: xyz789
Tracking ID: pOiJ9s
What Was Done:
✅ Downloaded from Box
✅ Loaded master metadata (pOiJ9s)
✅ Built asset representation (27 fields)
✅ Updated Language from filename: de
✅ Updated Description: SUMMER
✅ Set STATE to Local
✅ CreativeX Score Added: 85 (from database) ← If found
✅ Uploaded to DAM Final Assets folder
✅ Deleted from Box
```
**If CreativeX NOT Found:**
```
⚠️ CreativeX Score Missing
No score found for: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
Default Values Used:
- Score: 0
- URL: https://app.creativex.com/preflight/pretests
To add CreativeX score: Upload PDF report to Box folder 350605024645
and run creativex_scoring_storing.py
```
---
### Phase 5: Approval & Distribution (A3→A4)
**Who:** Approvers (Ferrero stakeholders)
**Where:** DAM
**Duration:** 1-3 days
**Process:**
1. Assets reach A3 status after upload
2. Designated approvers review in DAM
3. Check quality, compliance, brand guidelines
4. Decision: Approve or Reject
**If Approved:**
- Status updated to A4
- Assets ready for distribution
- Campaign complete
**If Rejected:**
- Mark individual assets: `ECOMMERCE STATUS = "NOT APPROVED"`
- Add rejection comments (why rejected)
- Campaign status A5
- Triggers rejection workflow (see Phase 6)
---
### Phase 6: Rejection & Rework Cycle (A5→A6→A2→A3)
**Who:** Python Script + Agency
**Trigger:** Campaign status = A5 + assets marked "NOT APPROVED"
**Duration:** 1-3 days for rework + 5 minutes for automation
```mermaid
flowchart TD
A[Asset in A3] --> B{Approver Review}
B -->|Approved| C[A4: Go Live ✅]
B -->|Rejected| D[Mark: ECOMMERCE STATUS = NOT APPROVED]
D --> E[Add Comments:<br/>Legal/IA&CC/Approver]
E --> F[Campaign Status → A5]
F -->|5 min| G[a5_to_a6_download.py detects A5]
G --> H[Query: Get ONLY NOT APPROVED assets]
H --> I{Asset has tracking ID?}
I -->|Yes| J[Reuse existing tracking ID]
I -->|No| K[Generate new tracking ID]
J --> L[Download from DAM]
K --> L
L --> M[Upload to Box A5→A6 folder]
M --> N[Email with rejection reasons]
N --> O[Agency receives notification]
O --> P[Agency reviews comments]
P --> Q[Agency fixes issues]
Q --> R[Use Naming Tool with SAME tracking ID]
R --> S[Upload to Box A2→A3 folder]
S -->|Standard A2→A3 flow| T[a2_to_a3 processes]
T --> U[Upload to DAM]
U --> V[Overwrites previous version]
V --> W[Back to A3 status]
W --> X{Approval round 2}
X -->|Approved| Y[A4: Done ✅]
X -->|Rejected again| Z[Back to A5]
Z -.->|Cycle repeats| G
style C fill:#90EE90
style Y fill:#90EE90
style D fill:#FFB6C1
style Z fill:#FFB6C1
```
**A5→A6: Rejection Download (AUTOMATED)**
**Critical Filter:**
```python
# ONLY download assets with ECOMMERCE STATUS = "NOT APPROVED"
# Do NOT re-download approved assets
approved_assets_skipped = assets_with_status_approved
rejected_assets = assets_with_status_not_approved
```
**Tracking ID Reuse Logic:**
```python
# Critical: Use existing tracking ID to maintain link
result = db.find_or_create_tracking_id(
opentext_id=asset_id,
local_campaign_id=campaign_number
)
if result['is_existing']:
tracking_id = result['tracking_id'] # Reuse existing
logger.info("Reusing tracking ID for rework: {}".format(tracking_id))
else:
tracking_id = result['tracking_id'] # New ID
logger.info("New asset, generated tracking ID: {}".format(tracking_id))
```
**Why Tracking ID Reuse Matters:**
- Links rework to original master
- Preserves metadata inheritance
- Enables version tracking
- Maintains audit trail
**Rejection Comment Extraction:**
```python
# Extract 3 types of rejection comments
comments = {
'approver': "Logo too small, add disclaimer",
'legal': "Missing copyright notice",
'ia_cc': "Brand colors not compliant"
}
# Included in email to agency
```
**Email Notification:**
```
Subject: ⚠️ NOT APPROVED Assets - Rework Required - Campaign Summer_Promo
Assets requiring rework: 1
Rejected Asset:
- NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF.mp4
Tracking ID: pOiJ9s
Box URL: https://app.box.com/file/999
Rejection Comments:
📝 Approver: "Logo too small, add disclaimer"
⚖️ Legal: "Missing copyright notice"
Next Steps:
1. Download from Box
2. Fix issues based on comments
3. Use Naming Tool with SAME tracking ID: pOiJ9s
4. Upload to A2→A3 folder: [URL]
```
**Rework Process:**
**Step 1: Agency Reviews Comments**
- Downloads rejected file from Box
- Reads rejection reasons
- Plans fixes
**Step 2: Agency Fixes Asset**
- Makes required changes
- Adds disclaimer
- Fixes logo size
- Adds copyright notice
**Step 3: Re-Upload with SAME Tracking ID**
**CRITICAL:** Must use the same tracking ID!
**Naming Tool Input:**
- OMG Job: `7654321` (NEW job number, different from original)
- Brand: `NUT`
- Subject: `SUMMER`
- Asset: `OLV`
- Duration: `30`
- Ratio: `16x9`
- Spot: `MST`
- Country: `DE`
- Language: `de`
- Social: `IGF`
- **Tracking ID: `pOiJ9s`** SAME AS ORIGINAL
**Generated:**
```
7654321_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
↑ Different job number (new project)
↑ SAME tracking ID
```
**Step 4: A2→A3 Processes Rework**
- Parses new filename
- Finds same tracking ID: `pOiJ9s`
- Loads same master metadata
- Uploads to DAM
- **Overwrites** previous version in Final Assets
- Back to A3 for re-approval
**Rework Can Happen Multiple Times:**
```
Upload 1 → Rejected (logo issue)
Upload 2 → Rejected (missing disclaimer)
Upload 3 → Rejected (color compliance)
Upload 4 → Approved ✅
```
Each time uses the SAME tracking ID to maintain the link!
---
### Workflow 4: Global Master Campaigns (B1→B2)
**Purpose:** Distribution of global master content to regional teams
**Who:** Python Script (b1_to_b2_download.py)
**Trigger:** Campaign type = "Global comm" AND status = B1
**Frequency:** Every 5 minutes
```mermaid
flowchart TD
A[Global Campaign Created] --> B[Type = Global comm]
B --> C[Assets in Final Assets folder<br/>NOT Master Assets]
C --> D[Status = B1]
D -->|5 min| E[b1_to_b2_download.py runs]
E --> F{Filter campaigns}
F --> G[Type = Global comm<br/>Status = B1]
G --> H[Download from Final Assets]
H --> I[Generate tracking IDs]
I --> J[Upload to Box folder 349261192115]
J --> K[Folder name: MASTERS_CampaignName]
K --> L[Store in database]
L --> M[Email to team]
M --> N[NO webhook sent]
N --> O[Regional teams download from Box]
O --> P[Use for regional adaptations]
```
**Key Differences from A1→A2:**
| Aspect | A1A2 (Local) | B1B2 (Global) |
|--------|---------------|----------------|
| **Campaign Type** | Local campaigns | "Global comm" only |
| **Source Folder** | 01. Master Assets | **02. Final Assets** |
| **Box Folder** | 348304357505 | **349261192115** |
| **Folder Naming** | C000000078-Campaign_Name | **MASTERS_**Campaign_Name |
| **Webhook** | Sent on A4 | **NOT sent** |
| **Purpose** | Localization workflow | Distribution to regions |
| **Next Step** | A2A3 (agency localization) | Regional teams download |
**Example:**
```
Campaign: C000000099-Global_Nutella_2025
Type: Global comm
Status: B1
Source: 02. Final Assets/
Download:
- global_nutella_master_final.mp4
- global_nutella_logo_final.png
Upload to Box:
Folder: MASTERS_Global_Nutella_2025/
Files:
- global_nutella_master_final_xYz123.mp4
- global_nutella_logo_final_aBc456.png
Email: "Global master assets ready for distribution"
NO webhook sent (different from local campaigns)
```
---
## The Naming Convention Tool
### Location & Purpose
**Path:** `/Users/daveporter/Python-Enviroments/Ferrero-naming-convention/public-v2/`
**URL:** `http://naming-tool-url/public-v2/index.php`
**Purpose:** Generate compliant V2.1 filenames that:
- Parse correctly in automation
- Extract metadata accurately
- Link derivatives to masters via tracking IDs
- Enable CreativeX score matching
- Comply with DAM validation rules
### Why Naming is CRITICAL
**The Automation Depends On It:**
Every Python script relies on filename parsing:
```mermaid
graph LR
A[Filename] --> B[Parser]
B --> C{Valid?}
C -->|Yes| D[Extract Metadata]
C -->|No| E[Skip File]
D --> F[Database Lookup]
F --> G[Upload to DAM]
E --> H[Stuck in Box]
H --> I[Manual Intervention]
```
**What Gets Extracted from Filename:**
| Component | Used For | Impact if Wrong |
|-----------|----------|-----------------|
| **Job Number** | Billing, project tracking | Can't link to OMG project |
| **Brand Code** | Validation, filtering | Wrong brand in DAM |
| **Subject** | Description field | Wrong description |
| **Asset Type** | **DAM categorization** | **Upload fails or wrong type** |
| **Duration** | Asset specs | Info only, optional |
| **Aspect Ratio** | Asset specs | Required for validation |
| **Spot Version** | Master/Ref designation | Classification |
| **Country** | Market targeting | **Wrong market = unusable** |
| **Language** | **Language metadata** | **Wrong language = unusable** |
| **Social Media** | Platform targeting | Platform-specific metadata |
| **Tracking ID** | **Master link** | **Can't find master = UPLOAD FAILS** |
### V2.1 Filename Structure
**Format:**
```
[JOB]_[BRAND]_[SUBJECT]_[ASSET]_[DUR]_[RATIO]_[SPOT]_[COUNTRY]_[LANG]_[SOCIAL]_[TRACKING]
1 2 3 4 5 6 7 8 9 10 11
```
**Field-by-Field Explanation:**
#### 1. Job Number (OMG Job)
- **Format:** 1-10 digits
- **Example:** `6487512`
- **Purpose:** Links to OMG project management system
- **Required:** Yes (for working files)
- **Stripped:** Yes (removed before DAM upload)
- **Why Important:** Billing, project tracking, client reporting
#### 2. Brand Code
- **Format:** 2-5 uppercase letters
- **Example:** `NUT` (Nutella), `RAF` (Raffaello), `BUEC` (Bueno Ice Cream)
- **Purpose:** Brand identification
- **Required:** Yes
- **Stripped:** No (kept in final filename)
#### 3. Subject Title
- **Format:** 1-15 characters, uppercase, hyphens allowed
- **Example:** `SUMMER`, `ME-MOMENT`, `SPREAD`, `TEST-E2E`
- **Purpose:** Becomes Description field in DAM
- **Required:** Yes
- **Stripped:** No
- **Validation:** Max 15 chars (warning if exceeded)
#### 4. Asset Type
- **Format:** Exactly 3 uppercase letters
- **Example:** `OLV`, `IMG`, `TVC`, `EHI`
- **Purpose:** Asset categorization in DAM
- **Required:** Yes
- **Mapped:** Yes - 3-letter code DAM code (e.g., EHI heroimage)
- **Full List:** 45 asset types (see asset_type_mappings.yaml)
**Common Asset Types:**
| Code | DAM Code | Description |
|------|----------|-------------|
| IMG | keyvisual | Image / Front of Pack |
| OLV | onlinevideodigitalvideo | Online Video |
| TVC | tvc | TV Commercial |
| EHI | heroimage | E-Comm Hero Image |
| EBS | beautyshot | E-Comm Beauty Shot |
| LOG | ferrerologo | Logo |
#### 5. Duration (Optional)
- **Format:** 1-3 digits + `S`
- **Example:** `6S`, `30S`, `120S`
- **Purpose:** Video length in seconds
- **Required:** No (omit for static assets like images)
- **Stripped:** No
#### 6. Aspect Ratio
- **Format:** Width `x` Height or Width `:` Height
- **Example:** `16x9`, `1x1`, `9x16`, `4x5`
- **Purpose:** Asset dimensions/format
- **Required:** Yes
- **Stripped:** No
#### 7. Spot Version (Optional)
- **Format:** `MST` or `REF` only
- **Example:** `MST` (Master), `REF` (Reference)
- **Purpose:** Designates master vs reference file
- **Required:** No
- **Stripped:** No
#### 8. Country Code
- **Format:** 2 uppercase letters (ISO 3166-1 alpha-2)
- **Example:** `DE` (Germany), `FR` (France), `IT` (Italy), `US` (USA)
- **Purpose:** Market targeting metadata
- **Required:** Yes
- **Stripped:** No
#### 9. Language Code
- **Format:** 2-3 lowercase letters (ISO 639)
- **Example:** `de` (German), `en` (English), `it` (Italian), `eng` (English)
- **Purpose:** Language metadata in DAM
- **Required:** Yes
- **Stripped:** No
#### 10. Social Media Version (Optional)
- **Format:** 3 uppercase letters
- **Example:** `FBP`, `FBR`, `IGF`, `IGR`
- **Purpose:** Platform-specific metadata
- **Required:** No (only for social media assets)
- **Stripped:** No
**Current Codes:**
| Code | Platform | Description |
|------|----------|-------------|
| FBP | Facebook | Facebook Post |
| FBR | Facebook | Facebook Reels |
| IGF | Instagram | Instagram Feed |
| IGR | Instagram | Instagram Reels |
**More codes will be added:** TikTok, YouTube Shorts, LinkedIn, etc.
#### 11. Tracking ID
- **Format:** 6 alphanumeric characters (case-sensitive)
- **Example:** `pOiJ9s`, `a7K9mP`, `bvSGyd`
- **Purpose:** Links derivative to master asset in database
- **Required:** Yes (for derivatives with metadata inheritance)
- **Stripped:** Yes (removed before DAM upload)
- **Special:** Can have `-N` suffix for "folder-only" mode
**Tracking ID Modes:**
**Standard Mode: `abc123`**
- Full metadata inheritance from master
- Override: Language and Asset Type from derivative
- Use for: Standard localizations
**Folder-Only Mode: `abc123-N`**
- ONLY upload folder from master
- NO metadata inheritance
- All metadata from derivative filename
- Use for: New assets in same campaign with different brand/metadata
**Example:**
```
Master: NUT_HERO_IMG_16x9_IT_it.png (tracking: abc123)
Derivative 1: 111_NUT_HERO_IMG_16x9_DE_de_abc123.png (inherits NUT brand)
Derivative 2: 222_RAF_PROMO_IMG_1x1_FR_fr_abc123-N.png (DIFFERENT brand RAF, uses folder only)
```
### Using the Naming Tool: Step-by-Step
**Step 1: Open the Tool**
```
URL: http://naming-tool-url/public-v2/index.php
```
**Step 2: Select Asset Type**
Dropdown shows all 45 asset types with descriptions:
- E-COMM: Hero Image (EHI)
- Online Video (OLV)
- TV Commercial (TVC)
- etc.
**Step 3: Fill Required Fields**
```
OMG Job Number: [6487512]
Brand: [NUT - Nutella] (dropdown)
Subject Title: [SUMMER] (15 char max)
Country: [DE - Germany] (dropdown)
Language: [de - German] (dropdown)
```
**Step 4: Fill Optional Fields (if applicable)**
```
Duration: [30] seconds (for videos only)
Aspect Ratio: [16x9] (required)
Spot Version: [MST] (dropdown: MST/REF or leave empty)
Social Media: [IGF] (dropdown: FBP/FBR/IGF/IGR or leave empty)
```
**Step 5: Enter Tracking ID**
**CRITICAL STEP:**
```
Tracking ID: [pOiJ9s]
```
**How to Get Tracking ID:**
- From A1A2 email notification
- From Box filename (last component before extension)
- From database query
- **IMPORTANT:** Copy exactly, case-sensitive!
**Step 6: Generate Filename**
- Click "Generate Filename" button
- Tool validates all fields
- Shows any errors in red
- If valid: Shows generated filename
**Step 7: Copy Filename**
- Click "Copy to Clipboard" button
- Paste EXACTLY when saving localized file
- **Do NOT modify** - use exact generated name
**Generated Example:**
```
6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
```
### Tool Validation Rules
**The tool validates:**
- Job number is digits only
- Brand exists in domain
- Subject 15 characters
- Asset type is valid 3-letter code
- Duration format is correct (if provided)
- Aspect ratio format is valid
- Country is valid ISO code
- Language is valid ISO code
- Social media code is recognized (if provided)
- Tracking ID is exactly 6 characters
**Validation Errors Shown:**
- 🔴 "Brand code must be 2-5 characters"
- 🔴 "Subject title exceeds 15 character limit"
- 🔴 "Invalid tracking ID (must be 6 alphanumeric)"
- 🔴 "Country code must be 2 characters"
### What Happens When Naming is Wrong
**The Domino Effect of Bad Filenames:**
```mermaid
flowchart TD
A[Agency uploads file with wrong name] --> B{Filename parses?}
B -->|No| C[A2→A3 skips file]
C --> D[File stuck in Box forever]
D --> E[Agency wonders why not uploaded]
E --> F[Email sent: File invalid]
F --> G[Manual intervention required]
G --> H[Download from Box]
H --> I[Rename correctly]
I --> J[Re-upload]
J --> K[Finally processes]
B -->|Yes, but wrong tracking ID| L[Database lookup fails]
L --> M[Error: No master found]
M --> D
B -->|Yes, but wrong asset type| N[Asset type mapping fails]
N --> O[DAM rejects upload]
O --> D
B -->|Yes, but wrong country/language| P[Wrong metadata in DAM]
P --> Q[Asset uploaded but unusable]
Q --> R[Manual DAM cleanup needed]
style D fill:#FFB6C1
style Q fill:#FFB6C1
style K fill:#90EE90
```
**Real Example: Wrong Field Order (Old V1 Format)**
**Incorrect (Old V1 position):**
```
6487512_NUT_DE_de_SUMMER_OLV_30S_16x9_MST_pOiJ9s.mp4
Position: 3 4 5 6
```
**What Parser Sees:**
```
Country: DE ✅ (but actually in position 3 now)
Language: de ✅ (but actually in position 4 now)
Subject: SUMMER ❌ (parser thinks this is position 5, but it's now 3)
Asset: OLV ❌ (parser thinks this is position 6, but it's now 4)
```
**Result:**
```
Parsed incorrectly:
- Subject: "DE" ❌ (wrong - it's the country!)
- Asset Type: "de" ❌ (wrong - it's the language!)
- Country: "SUMMER" ❌ (wrong - it's the subject!)
- Language: "OLV" ❌ (wrong - it's the asset type!)
Validation: FAILS
File: SKIPPED
Status: Stuck in Box
```
**Correct (New V2.1 position):**
```
6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mp4
Position: 3 4
```
**What Parser Sees:**
```
Subject: SUMMER ✅
Asset: OLV ✅
Country: DE ✅ (now in position 8)
Language: de ✅ (now in position 9)
```
**Result:**
```
Parsed correctly:
- All fields in right positions
- Validation: PASSES
- Upload: SUCCEEDS
```
**Real Example: Missing Tracking ID**
**Missing tracking ID:**
```
6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de.mp4
```
**What Happens:**
```
Parse: ✅ Succeeds (structure valid)
Tracking ID: None
Database lookup: Skip (no tracking ID)
Error: "No master asset for tracking ID: None"
Upload: FAILS
Email: "Processing failed - No master asset"
File: Remains in Box
```
**Real Example: Typo in Tracking ID**
**Wrong tracking ID (typo):**
```
6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOlJ9s.mp4
↑ Should be 'i' not 'l'
Correct ID: pOiJ9s
Typo ID: pOlJ9s
```
**What Happens:**
```
Parse: ✅ Succeeds
Tracking ID extracted: pOlJ9s
Database lookup: SELECT * WHERE tracking_id = 'pOlJ9s'
Result: NO ROWS (tracking ID doesn't exist)
Error: "No master asset for tracking ID: pOlJ9s"
Upload: FAILS
```
**Real Example: Lowercase Country Code**
**Lowercase country (parser misidentifies):**
```
6487512_NUT_SUMMER_OLV_30S_16x9_MST_de_de_pOiJ9s.mp4
↑ Should be "DE" (upper)
```
**What Parser Thinks:**
```
Looking for country (2 uppercase alpha)...
Found: "de" - but it's lowercase!
Interpreted as: Language code (2-3 lowercase alpha)
Result:
- Country: NOT FOUND (missing required field)
- Language: "de" ✅ (correct)
- But TWO language codes parsed (de and de again)
Validation: May fail or produce unexpected results
```
---
## Python Automation Scripts
### Execution Schedule (Cron)
**Production Crontab:**
```cron
# Workflows run every 5 minutes
*/5 * * * * a1_to_a2_download.py >> logs/cron_a1_a2.log 2>&1
*/5 * * * * a5_to_a6_download.py >> logs/cron_a5_a6.log 2>&1
*/5 * * * * b1_to_b2_download.py >> logs/cron_b1_b2.log 2>&1
*/5 * * * * a2_to_a3_upload_polling.py >> logs/cron_a2_a3.log 2>&1
# Reports run daily at 7 PM
0 19 * * * daily_report.py >> logs/daily_report.log 2>&1
# Backups run at 2 AM daily, 3 AM Sundays
0 2 * * * database/backup.sh --daily >> logs/backup.log 2>&1
0 3 * * 0 database/backup.sh --weekly >> logs/backup.log 2>&1
```
**Why Every 5 Minutes?**
- Fast response time for users
- Manageable load on systems
- Quick recovery from temporary failures
- Users see results within minutes of upload
### Script Dependencies
```mermaid
graph TB
subgraph Workflows["Workflow Scripts"]
A1A2[a1_to_a2_download.py]
A2A3[a2_to_a3_upload_polling.py]
A5A6[a5_to_a6_download.py]
B1B2[b1_to_b2_download.py]
CX[creativex_scoring_storing.py]
end
subgraph Shared["Shared Modules"]
DAM[dam_client.py<br/>DAM API + OAuth2/mTLS]
BOX[box_client.py<br/>Box JWT + Upload/Download]
DB[database.py<br/>PostgreSQL Pool]
NOT[notifier.py<br/>Email + Webhooks]
PARSER[filename_parser.py<br/>V2.1 Parsing]
MVP[metadata_extractor_mvp.py<br/>Field Mapping]
CONFIG[config_loader.py<br/>YAML + Env Vars]
end
A1A2 --> DAM
A1A2 --> BOX
A1A2 --> DB
A1A2 --> NOT
A1A2 --> CONFIG
A2A3 --> DAM
A2A3 --> BOX
A2A3 --> DB
A2A3 --> NOT
A2A3 --> PARSER
A2A3 --> MVP
A2A3 --> CONFIG
A5A6 --> DAM
A5A6 --> BOX
A5A6 --> DB
A5A6 --> NOT
A5A6 --> CONFIG
B1B2 --> DAM
B1B2 --> BOX
B1B2 --> DB
B1B2 --> NOT
B1B2 --> CONFIG
CX --> BOX
CX --> DB
CX --> NOT
CX --> CONFIG
```
### Shared Modules Explained
#### dam_client.py - DAM API Client
**Purpose:** All communication with OpenText DAM
**Features:**
- OAuth2 authentication (default, auto-refresh tokens)
- mTLS certificate authentication (optional, with --auth-pfx flag)
- Campaign queries (get campaigns by status)
- Asset operations (download, upload, metadata)
- Recursive folder search (includes subfolders)
- Status updates (A1A2, A5A6, etc.)
**Key Methods:**
```python
# Get campaigns by status
campaigns = dam.get_campaigns_by_status('A1')
# Download asset
file_path = dam.download_asset(asset_id, output_dir)
# Upload asset with metadata
result = dam.upload_asset(file_path, folder_id, asset_representation)
# Update campaign status
dam.update_campaign_status(campaign_id, 'A2')
```
#### box_client.py - Box.com Integration
**Purpose:** File transfer via Box
**Features:**
- JWT authentication (uses Box-config.json)
- Create campaign folders
- Upload with tracking IDs
- Download files
- List folders (recursive)
- Preserve subfolder structure
- Delete files after processing
**Key Methods:**
```python
# Upload with tracking ID
result = box.upload_with_tracking_id(
file_path, campaign_id, campaign_name, tracking_id,
subfolder_path="Europe/Germany" # Preserves structure
)
# List folder files recursively
files = box.list_folder_files_recursive(folder_id)
# Download file
box.download_file(file_id, output_path)
```
#### database.py - PostgreSQL Operations
**Purpose:** All database interactions
**Features:**
- Connection pooling (1-10 connections)
- Tracking ID generation (unique 6-char)
- Master asset storage (with JSONB metadata)
- Derivative asset tracking
- CreativeX score storage (3 statuses)
- Campaign status tracking
- Find-or-create for rework workflow
**Key Methods:**
```python
# Generate unique tracking ID
tracking_id = db.generate_unique_tracking_id() # Returns: "pOiJ9s"
# Find or reuse tracking ID (for A5→A6 rework)
result = db.find_or_create_tracking_id(opentext_id, campaign_id)
# Returns: {'tracking_id': 'pOiJ9s', 'is_existing': True}
# Store master asset
db.store_master_asset(
tracking_id, opentext_id, asset_data,
box_file_id, box_url, upload_folder_id
)
# Get master asset
master = db.get_master_asset('pOiJ9s')
# Returns full metadata for inheritance
# Store CreativeX score
db.store_creativex_score(
filename, creativex_id, url, score,
box_file_id, full_json,
tracking_id='pOiJ9s', # Optional, for master scores
status='master-cx-score' # Or 'active' for derivatives
)
# Get CreativeX score
score = db.get_creativex_score_by_filename('file.mp4')
# Returns latest active score or None
```
#### filename_parser.py - V2.1 Parser
**Purpose:** Parse filenames into components
**Features:**
- V2.1 structure support (new positions)
- Pattern-based detection (flexible)
- Social media code support
- Tracking ID mode detection (-N suffix)
- Validation with error messages
- Strip upload components (job + tracking)
**Key Methods:**
```python
# Parse filename
parsed = parser.parse_filename('6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4')
# Returns:
{
'omg_job_number': '6487512',
'brand_code': 'NUT',
'subject_title': 'SUMMER',
'asset_type': 'OLV',
'seconds': '30',
'aspect_ratio': '16x9',
'spot_version': 'MST',
'country_code': 'DE',
'language_code': 'de',
'social_media_version': 'IGF',
'tracking_id': 'pOiJ9s',
'tracking_mode': 'full',
'is_valid': True,
'validation_errors': []
}
# Strip upload components
clean = parser.strip_upload_components('6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4')
# Returns: 'NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF.mp4'
```
#### metadata_extractor_mvp.py - Field Mapping
**Purpose:** Build asset representations for DAM upload
**Features:**
- Extract 30 MVP fields from master metadata
- Update fields from filename (configurable)
- Map asset types (3-letter DAM codes)
- Map country codes (ISO DAM codes)
- Force values (STATE = Local)
- Add missing fields with defaults
- CreativeX field handling
**Key Methods:**
```python
# Build asset representation
asset_rep = mvp_extractor.build_mvp_asset_representation(
master_metadata=master['full_metadata'],
clean_filename='NUT_SUMMER_OLV_30S_16x9_MST_DE_de.mp4',
parsed_filename=parsed,
box_metadata={'score': '85', 'url': 'https://...'},
tracking_mode='full' # or 'folder_only' for -N suffix
)
# Returns: List of 27 metadata fields ready for DAM upload
```
**Field Mapping Logic:**
```python
# From master metadata (inherited):
- Brand Name
- Brand Code
- Campaign references
- Fiscal Year
- All Marketing fields (Director, Production House, etc.)
- All Market fields (IP Rights, Licensing, etc.)
# From filename (overrides):
- Asset Name = clean filename
- Description = subject_title
- Asset Type = mapped from 3-letter code (OLV onlinevideodigitalvideo)
- Language = language_code (from filename, uppercase)
# Forced values:
- STATE = "Local" (always)
- AGENCY NAME = "Oliver"
# From database (CreativeX):
- CreativeX Score = looked up by filename
- CreativeX URL = looked up by filename
```
---
## CreativeX Integration
### What is CreativeX?
CreativeX is an AI-powered creative quality scoring platform that analyzes advertising content against industry best practices and brand guidelines. It provides:
- Quality scores (0-100)
- Guideline compliance analysis
- Actionable improvement recommendations
- Benchmark comparisons
**Why We Use It:**
- Objective quality measurement
- Consistent scoring across markets
- Early identification of issues
- Performance prediction
- Continuous improvement tracking
### Two Sources of CreativeX Scores
```mermaid
flowchart TB
subgraph Master["Master Asset Score (Reference Only)"]
M1[Master Asset in DAM] -->|May have score in metadata| M2[A1→A2 Script]
M2 -->|Extract if present| M3[Store in Database]
M3 -->|status='master-cx-score'| M4[Linked by tracking_id]
M4 --> M5[Used for:<br/>- Analytics<br/>- Comparison<br/>- Reporting]
M5 -.->|NOT used for| M6[❌ NOT used in A2→A3 uploads]
end
subgraph Derivative["Derivative Asset Score (Used in Uploads)"]
D1[Agency gets localized asset scored] --> D2[CreativeX generates PDF report]
D2 -->|PDF contains: filename, score, URL| D3[Upload PDF to Box 350605024645]
D3 -->|Manual/automated| D4[creativex_scoring_storing.py]
D4 -->|LlamaExtract AI parses PDF| D5[Extract: filename, score, URL, ID]
D5 -->|Store in Database| D6[status='active']
D6 -->|Matched by exact filename| D7[A2→A3 Upload]
D7 --> D8[✅ USED in DAM upload]
end
style M6 fill:#FFB6C1
style D8 fill:#90EE90
```
### 🔴 CRITICAL REQUIREMENT: Every Derivative Needs Its Own CreativeX Score
**THIS IS NOT OPTIONAL - IT IS MANDATORY**
#### The Rule: One Asset = One Score
```
1 Master Asset → 10 Localized Derivatives = 10 CreativeX Scores Required
1 Master Asset → 200 Localized Derivatives = 200 CreativeX Scores Required
```
**Why EVERY Derivative Needs Scoring:**
1. **Each localization is different creative execution**
- German voice-over has different performance than French
- Italian subtitle placement may affect engagement differently
- UK cultural references differ from US references
- Each market's execution quality varies
2. **Master score does NOT apply to derivatives**
- Master: Global concept score (reference only)
- Derivative: Localized execution score (what matters)
- Example:
- Master (English): Score 85 (good concept)
- German derivative: Score 71 (translation issues)
- French derivative: Score 78 (better execution)
- Italian derivative: Score 82 (excellent localization)
3. **CreativeX scores are asset-specific, not campaign-specific**
- Each file gets analyzed individually
- Different text overlays different score
- Different pacing different score
- Different cultural adaptations different score
4. **Reporting requires individual scores**
- Which markets perform best?
- Which localizations need improvement?
- ROI tracking by market
- Can't answer these without per-asset scores
#### The Math
**Example Campaign:**
```
1 Master Asset (nutella_hero_master.mp4)
Localized for 20 markets × 2 languages each = 40 derivatives
Plus: 3 aspect ratios (16x9, 1x1, 9x16) = 40 × 3 = 120 versions
Plus: 2 platforms (standard + Instagram) = 120 × 2 = 240 derivatives
REQUIRED: 240 CreativeX PDF reports
REQUIRED: 240 entries in creativex_scores database table
```
**If you create 200 assets, you need 200 CreativeX scores. Period.**
#### What Happens If You Don't Score All Assets
**Scenario: Campaign with 50 derivatives, only 30 scored**
```
30 derivatives:
- Have PDF reports
- Scores in database (71-85 range)
- Upload with actual CreativeX data ✅
20 derivatives:
- NO PDF reports
- NOT in database
- Upload with Score=0, placeholder URL ❌
- Email shows orange warning for each
Result in DAM:
- 30 assets: Properly scored (useful for optimization)
- 20 assets: Score=0 (meaningless, looks bad)
Client sees:
- Inconsistent quality reporting
- 40% of assets show "not scored"
- Questions quality control process
```
#### The Workflow: Scoring BEFORE Upload
**CORRECT Process:**
```mermaid
sequenceDiagram
autonumber
participant Agency
participant CX as CreativeX Platform
participant Box as Box Folder 350605024645
participant Script as creativex_scoring_storing.py
participant DB as Database
participant Upload as A2→A3 Upload
Note over Agency: For EACH derivative created
Agency->>Agency: Create localized asset
Agency->>CX: Submit to CreativeX for analysis
CX->>CX: AI analyzes creative quality
CX-->>Agency: PDF scorecard generated
Agency->>Box: Upload PDF to folder 350605024645
Box->>Script: PDF detected (manual or cron)
Script->>Script: Extract filename, score, URL
Script->>DB: Store with status='active'
Note over DB: Score now available for this asset
Agency->>Upload: Upload derivative to A2→A3 folder
Upload->>DB: Lookup score by filename
DB-->>Upload: Score=71, URL=https://...
Upload->>Upload: Attach score to DAM upload ✅
```
**Timeline Example (50 derivatives):**
```
Week 1:
- Create 50 localized assets
- Submit all 50 to CreativeX platform
- Receive 50 PDF reports
Week 2:
- Upload all 50 PDFs to Box folder 350605024645
- Run creativex_scoring_storing.py (or automated)
- Verify all 50 in database:
SELECT COUNT(*) FROM creativex_scores WHERE status='active'
Result: 50 ✅
- Upload all 50 derivatives to A2→A3 folder
- All 50 process with actual scores
- All 50 emails show "CreativeX Score Added: [score]"
```
#### Common Mistake: "I'll Score the Important Ones Only"
**The Thinking:**
"We have 200 derivatives. I'll only score the hero assets (20 of them). The rest can use score=0."
**Why This Fails:**
1. **Client expectations:**
- Contract may require all assets scored
- Quality standards apply to ALL deliverables
- Score=0 looks unprofessional
2. **Reporting incomplete:**
- Can't analyze which markets perform best (only have 10% of data)
- Can't identify improvement opportunities
- ROI tracking impossible
3. **Selective scoring = inconsistent quality:**
- Some assets optimized, others not
- Performance disparity
- Brand inconsistency
4. **The system defaults to Score=0:**
- Not "unscored" or "pending"
- Actively shows ZERO
- Implies poor quality (even if asset is good)
**Reality:**
If you create 200 assets, budget for 200 CreativeX analyses. Factor this into:
- Project timeline
- Project budget
- Resource allocation
- Agency scope of work
#### Batch Processing CreativeX
**For Large Campaigns (100+ derivatives):**
**Week-by-Week Approach:**
```
Week 1: Create + score derivatives 1-25
- Upload 25 PDFs
- Run scoring script
- Upload 25 derivatives
- All get actual scores
Week 2: Create + score derivatives 26-50
- Upload 25 PDFs
- Run scoring script
- Upload 25 derivatives
- All get actual scores
... continue weekly
```
**NOT Recommended:**
```
Week 1-4: Create all 100 derivatives
Week 5: Upload all 100 derivatives (no scores yet)
Week 6: Start scoring (too late, already uploaded)
Result: All 100 uploaded with score=0
Then manual DAM updates needed
```
#### Exceptions to the Rule
**When Score=0 is Acceptable:**
1. **Internal reference assets** (not client-facing)
2. **Template files** (not final deliverables)
3. **Work-in-progress** (will be scored before final delivery)
4. **Non-creative assets** (legal documents, spreadsheets)
**But for client deliverables: ALL must be scored.**
#### How to Verify All Assets Are Scored
**Before Mass Upload:**
```bash
# Count derivatives you created
ls derivatives/ | wc -l
Result: 200 files
# Count scores in database
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT COUNT(*) FROM creativex_scores WHERE status = 'active';
"
Result: Should be 200 (or close to it)
# List any files without scores (after upload)
# Compare uploaded filenames to database entries
```
**After Upload:**
```bash
# Check how many used defaults
grep "CreativeX Score Missing" logs/a2_to_a3.log | wc -l
# If this number is high → many derivatives uploaded without scores
```
### CreativeX PDF Extraction Workflow
**Step 1: CreativeX Generates PDF Report**
- Agency submits asset to CreativeX platform
- CreativeX AI analyzes content
- Generates PDF scorecard
- PDF contains:
- Filename: `6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4`
- CreativeX ID: `6864308`
- CreativeX URL: `https://app.creativex.com/audit/scorecards/32456797`
- Quality Score: `71` (out of 100)
- Guidelines breakdown
- Recommendations
**Step 2: Upload PDF to Box**
- Agency uploads PDF to Box folder: `350605024645`
- PDF can be named anything (e.g., `nutella_summer_de_report.pdf`)
- What matters is the **filename field INSIDE the PDF**
**Step 3: Run CreativeX Scoring Script**
```bash
# Manual execution (or can be added to cron)
python scripts/creativex_scoring_storing.py
```
**Script Process:**
```mermaid
sequenceDiagram
autonumber
participant Box as Box Folder 350605024645
participant Script as creativex_scoring_storing.py
participant Llama as LlamaExtract AI
participant DB as Database
participant Email
Script->>Box: List PDF files
Box-->>Script: Found 1 PDF
Script->>Box: Download PDF
Box-->>Script: PDF file
Script->>Llama: Extract data using agent "Creativex-Extract"
Llama->>Llama: AI parses PDF structure
Llama->>Llama: Find filename field
Llama->>Llama: Extract score, URL, ID
Llama-->>Script: JSON response
Script->>Script: Parse JSON
Script->>Script: Get filename: 6487512_NUT_SUMMER_OLV..._pOiJ9s.mp4
Script->>Script: Get score: 71
Script->>Script: Remove .0 decimals: 71.0 → 71
Script->>DB: Check if filename exists (status='active')
alt Filename already exists
DB-->>Script: Found existing (version 1)
Script->>DB: Mark old as status='superseded'
Script->>DB: Insert new with status='active', version 2
else New filename
Script->>DB: Insert with status='active', version 1
end
Script->>Box: Delete PDF file
Script->>Email: Send success notification
Email-->>Team: "1 file processed, Score: 71 (Version 2)"
```
**Database Storage:**
```sql
INSERT INTO creativex_scores (
filename, creativex_id, creativex_url, quality_score,
box_file_id, full_extraction_data, status
) VALUES (
'6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4',
'6864308',
'https://app.creativex.com/audit/scorecards/32456797',
'71',
'box_file_id_123',
'{"data": {...full JSON from LlamaExtract...}}',
'active'
);
```
**Version Tracking:**
If same filename is scored again:
```
Version 1: Score 71, status='superseded' (old)
Version 2: Score 75, status='active' (current)
Version 3: Score 80, status='active' (latest)
```
Email shows: "Score: 80 (Version 3)"
**Step 4: A2→A3 Uses Score Automatically**
When agency uploads derivative:
```
Filename: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
Lookup: SELECT * FROM creativex_scores WHERE filename = '...' AND status = 'active'
Found: Score 71, URL https://...
Upload: Includes CreativeX metadata
```
**If No Score Found:**
```
Lookup: Returns None
Default: Score = 0, URL = https://app.creativex.com/preflight/pretests
Upload: Succeeds with defaults
Email: Shows orange warning "CreativeX Score Missing"
```
### Master vs Derivative Scores
**Master Score (status='master-cx-score'):**
- Extracted during A1A2 from DAM metadata
- Linked by tracking_id
- Purpose: Reference, analytics, comparison
- **NOT used in A2A3 uploads**
- Query: `WHERE tracking_id = 'pOiJ9s' AND status = 'master-cx-score'`
**Derivative Score (status='active'):**
- Extracted from PDF via LlamaExtract
- Matched by exact filename (with job + tracking)
- Purpose: Attached to DAM upload
- **USED in A2A3 uploads**
- Query: `WHERE filename = 'exact_name.mp4' AND status = 'active'`
**Why Separate?**
- Each localized asset gets its own score
- Derivative scores often differ from master
- Master: Global creative concept score
- Derivative: Localized execution score
- Need to track both for analytics
**Example:**
```
Master: nutella_summer_hero.mp4
- Tracking ID: pOiJ9s
- Master Score: 85 (general creative quality)
Derivatives:
- German version: Score 71 (localization quality)
- French version: Score 78 (localization quality)
- Italian version: Score 82 (localization quality)
Analytics can show:
- Master scored 85
- Average derivative: 77
- Italian performed best (+3 from average)
```
---
*[Document continues with remaining sections...]*
**Note:** This is approximately 20% of the complete guide. The full document will include:
- Detailed workflows for all scripts
- Common mistakes with 20+ real examples
- Troubleshooting guide with solutions
- Database schema with ERD
- Monitoring procedures
- Glossary and FAQ
- Complete reference tables
**Estimated final length:** 60-70 pages
Would you like me to continue building out the remaining sections?
---
## The Database
### Database Architecture
The PostgreSQL database is the **central nervous system** of the automation. It stores:
- Master asset metadata and tracking IDs
- Derivative asset records
- CreativeX scores (master and derivative)
- Campaign status tracking
- Complete audit trail
```mermaid
erDiagram
MASTER_ASSETS ||--o{ DERIVATIVE_ASSETS : "has many via tracking_id"
MASTER_ASSETS ||--o{ CREATIVEX_SCORES : "has master score via tracking_id"
MASTER_ASSETS ||--o{ ASSET_EVENTS : "has events"
CAMPAIGN_STATUS ||--o{ MASTER_ASSETS : "contains"
MASTER_ASSETS {
varchar tracking_id PK "6-char unique ID"
varchar opentext_id "DAM asset ID"
varchar original_filename "Original name"
jsonb full_metadata "Complete DAM metadata"
varchar upload_directory "Final Assets folder ID"
varchar global_master_campaign_id "Global campaign ref"
varchar local_campaign_id "Local campaign ID"
timestamp created_at
}
DERIVATIVE_ASSETS {
int id PK
varchar tracking_id FK "Links to master"
varchar derivative_filename "Clean filename"
varchar dam_asset_id "Uploaded asset ID"
varchar upload_status "pending/completed"
timestamp uploaded_at
}
CREATIVEX_SCORES {
int id PK
varchar filename "Exact filename for matching"
varchar creativex_id "CreativeX scorecard ID"
varchar creativex_url "CreativeX URL"
varchar quality_score "Score 0-100"
varchar tracking_id "Links to master (for master scores)"
varchar status "active/superseded/master-cx-score"
jsonb full_extraction_data "Complete extraction JSON"
timestamp extracted_at
}
CAMPAIGN_STATUS {
int id PK
varchar campaign_id UK "DAM campaign folder ID"
varchar campaign_number "C000000078"
varchar status "A1/A2/A3/A4/A5/B1/B2"
boolean webhook_sent "Prevents duplicates"
timestamp webhook_sent_at
}
ASSET_EVENTS {
int id PK
varchar tracking_id "Related asset"
varchar event_type "created/updated/uploaded"
jsonb event_data "Event details"
timestamp event_timestamp
}
```
### Table Purposes
#### master_assets (35 columns)
**Purpose:** Original master assets from A1A2 download
**Key Fields:**
- `tracking_id` - Unique 6-character identifier
- `opentext_id` - DAM asset ID (hex string)
- `original_filename` - Name without extension
- `full_metadata` - **JSONB column with COMPLETE DAM metadata**
- `upload_directory` - Where derivatives upload in DAM
- `global_master_campaign_id` - Reference to global master
- `local_campaign_id` - Local campaign this belongs to
**Why JSONB for full_metadata?**
- No truncation (can be 100KB+ of metadata)
- Queryable with PostgreSQL JSONB operators
- Preserves exact structure from DAM
- Future-proof (new fields don't break schema)
**Example Query:**
```sql
-- Find all assets for a campaign
SELECT tracking_id, original_filename
FROM master_assets
WHERE local_campaign_id = 'C000000078'
AND status = 'active';
-- Query within JSONB metadata
SELECT tracking_id,
full_metadata->'metadata'->'id' as metadata_model
FROM master_assets
WHERE tracking_id = 'pOiJ9s';
```
#### derivative_assets
**Purpose:** Track localized assets uploaded via A2A3
**Key Fields:**
- `tracking_id` - Links to master_assets
- `derivative_filename` - Clean filename (no job/tracking)
- `dam_asset_id` - Asset ID in DAM after upload
- `upload_status` - pending/completed/failed
**Why This Table?**
- Track which derivatives uploaded
- Link back to master
- Detect when ALL derivatives done
- Audit trail of uploads
#### creativex_scores
**Purpose:** Store quality scores from two sources
**Three Status Values:**
**1. status='active' (Derivative Scores)**
- Source: PDF extraction via LlamaExtract
- Matched by: Exact filename (with job + tracking)
- Used in: A2A3 uploads
- Versioned: Yes (old = 'superseded')
- Example: `6487512_NUT_SUMMER_OLV..._pOiJ9s.mp4`, score=71
**2. status='superseded' (Old Versions)**
- Source: PDF extraction (previous version)
- Purpose: History/audit trail
- Not used in uploads
- Example: Same filename, score=68 (old)
**3. status='master-cx-score' (Master Reference)**
- Source: A1A2 extraction from DAM metadata
- Matched by: tracking_id
- Used in: Analytics/reporting ONLY, never uploads
- Not versioned
- Example: `nutella_summer_hero.mp4`, tracking=pOiJ9s, score=85
**Query Examples:**
```sql
-- Get derivative score for upload (A2→A3 uses this)
SELECT quality_score, creativex_url
FROM creativex_scores
WHERE filename = '6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4'
AND status = 'active';
-- Get master score for reporting
SELECT quality_score, creativex_url
FROM creativex_scores
WHERE tracking_id = 'pOiJ9s'
AND status = 'master-cx-score';
-- Compare master vs derivative scores
SELECT
cs_master.quality_score as master_score,
cs_deriv.quality_score as derivative_score,
cs_deriv.filename as derivative_name
FROM creativex_scores cs_master
JOIN master_assets ma ON cs_master.tracking_id = ma.tracking_id
JOIN creativex_scores cs_deriv ON cs_deriv.filename LIKE '%' || ma.tracking_id || '%'
WHERE cs_master.status = 'master-cx-score'
AND cs_deriv.status = 'active';
```
#### campaign_status
**Purpose:** Prevent duplicate webhook sends
**Problem It Solves:**
Without this table, if A2A3 runs multiple times, it might send the "campaign complete" webhook multiple times to Make.com, causing duplicate downstream actions.
**How It Works:**
```python
# Check if webhook already sent
existing = db.check_campaign_processed(campaign_id)
if existing['webhook_sent']:
logger.info("Webhook already sent for this campaign - skipping")
return
# Send webhook
send_webhook(campaign_data)
# Record that webhook was sent
db.record_campaign_status(
campaign_id, campaign_number, campaign_name,
live_campaign='YES', status='A3', webhook_sent=True
)
```
### Backup Strategy
**Daily Backups (2:00 AM):**
```bash
# pg_dump creates SQL dump
./database/backup.sh --daily
# Creates: backups/dumps/ferrero_tracking_2025-11-13_02-00.sql.gz
# Retention: 7 days
# Size: ~2-5 MB compressed
```
**Weekly Backups (Sunday 3:00 AM):**
```bash
# pg_basebackup creates binary backup
./database/backup.sh --weekly
# Creates: backups/basebackups/ferrero_tracking_2025-11-10_03-00.tar.gz
# Retention: Latest only
# Size: ~30 MB compressed
```
**Health Monitoring (8:00 AM Daily):**
```bash
# Checks backup age, warns if > 25 hours old
./database/check_backups.sh
```
**Restore Process:**
```bash
# List available backups
./database/restore.sh --list
# Restore from specific backup
./database/restore.sh backups/dumps/ferrero_tracking_2025-11-13_02-00.sql.gz
# Creates safety backup first
# Requires confirmation: "yes"
# Restores database
# Verifies row counts
```
---
## Common Mistakes & Pitfalls
### Category 1: Filename Errors
#### Mistake 1.1: Using Old V1 Field Order
**What Happens:**
```
User creates: 6487512_NUT_DE_de_SUMMER_OLV_30S_16x9_pOiJ9s.mp4
(This is OLD V1 structure: BRAND_COUNTRY_LANG_SUBJECT_ASSET...)
Parser expects V2.1: BRAND_SUBJECT_ASSET_..._COUNTRY_LANG
Parser reads:
- Position 3 should be SUBJECT, reads: "DE" (thinks subject is "DE")
- Position 4 should be ASSET, reads: "de" (thinks asset type is "de")
- Position 8 should be COUNTRY, reads: "30S" (thinks country is "30S")
Result: Complete parsing failure
Validation errors:
- "Asset Type invalid: de (must be 3 chars uppercase)"
- "Country Code invalid: 30S (must be 2 chars)"
File: SKIPPED
Status: Stuck in Box
Fix: Use naming tool or reorder manually to V2.1
```
**Prevention:** Always use the naming tool - it generates correct V2.1 format
---
#### Mistake 1.2: Manual Typing Instead of Copy/Paste
**What Happens:**
```
Naming tool generates: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mp4
User types manually: 6487512_NUT_SUMMER_OLV_30s_16x9_MST_DE_de_pOiJ9s.mp4
↑ lowercase 's'
Parser expects: \d+S pattern (capital S)
Parses: "30s" doesn't match pattern
Result:
- Duration not detected
- "30s" interpreted as aspect ratio (fallback)
- "16x9" interpreted as... something else
- Chaos in parsing
Validation: May fail or produce wrong metadata
```
**Prevention:** Use "Copy to Clipboard" button, never type manually
---
#### Mistake 1.3: Wrong Tracking ID (Copy/Paste Error)
**What Happens:**
```
Correct tracking ID: pOiJ9s (lowercase i)
User copies: pOlJ9s (lowercase L instead of i)
Upload: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOlJ9s.mp4
Parse: ✅ Succeeds (pOlJ9s is valid 6-char alphanumeric)
Database lookup: SELECT * FROM master_assets WHERE tracking_id = 'pOlJ9s'
Result: NO ROWS FOUND
Error: "No master asset for tracking ID: pOlJ9s"
Upload: FAILS
Email: Error notification sent
Agency confused: "I have the tracking ID from the email!"
Reality: Small typo made it a different ID
```
**Prevention:**
- Copy from email directly, don't retype
- Use naming tool dropdown if it has tracking ID list
- Double-check the last 6 characters of generated filename match email
---
#### Mistake 1.4: Reusing Tracking ID from Different Campaign
**What Happens:**
```
Campaign A (C000000078): Master nutella_hero.mp4, tracking=abc123
Campaign B (C000000099): Master raffa_promo.mp4, tracking=xyz789
User localizing for Campaign B:
- Accidentally uses tracking ID from Campaign A: abc123
- Filename: 6487512_RAF_PROMO_OLV_16x9_FR_fr_abc123.mp4
A2→A3 processes:
- Finds tracking ID: abc123
- Loads master metadata for abc123
- Gets: Nutella brand metadata! (wrong campaign)
- Uploads Raffaello asset with Nutella metadata!
Result: Asset uploaded to WRONG campaign folder with WRONG brand
Data corruption in DAM
Manual cleanup required
```
**Prevention:**
- Always get tracking ID from the SAME campaign's A1A2 email
- Never reuse tracking IDs across campaigns
- Each campaign gets fresh tracking IDs
---
#### Mistake 1.5: Missing Required Field
**What Happens:**
```
User forgets aspect ratio: 6487512_NUT_SUMMER_OLV_30S_MST_DE_de_pOiJ9s.mp4
↑ Missing ratio
Parser tries to find ratio pattern (contains 'x' or ':')
Doesn't find it
Tries fallback detection
May parse "MST" as ratio (wrong)
Then "DE" becomes spot version (wrong)
Cascade of misinterpretation
Validation: FAILS
Error: "Aspect Ratio missing"
File: SKIPPED
```
**Prevention:** Naming tool enforces required fields - can't generate without them
---
### Category 2: Upload Location Errors
#### Mistake 2.1: Uploading to Wrong Box Folder
**What Happens:**
```
User receives A1→A2 email with Box URL: https://app.box.com/folder/348304357505
User should upload derivatives to A2→A3 folder: 348526703108
User accidentally uploads to A1→A2 folder (348304357505)
A2→A3 script polls folder 348526703108
Finds: No files (they're in wrong folder)
Result: Nothing happens
User waits... waits... waits...
No email received
Asset never uploaded to DAM
Eventually: "Why hasn't my file been processed?"
Investigation: File in wrong folder
Fix: Move file to correct folder
```
**Prevention:**
- Check email for correct upload URL
- A1A2 email says: "Masters ready for download"
- A2A3 email should say: "Upload derivatives to: [URL]"
- Bookmark correct folders
---
#### Mistake 2.2: Uploading CreativeX PDF to Wrong Folder
**What Happens:**
```
User uploads CreativeX PDF to A2→A3 folder (348526703108)
Should upload to CreativeX folder (350605024645)
A2→A3 script:
- Tries to parse PDF filename as asset filename
- Parse fails (PDF name doesn't match pattern)
- Skips file
- PDF sits in folder forever
creativex_scoring_storing.py:
- Looks in folder 350605024645
- Doesn't find PDF (it's in wrong folder)
- No score extracted
Result:
- PDF not processed
- Score not in database
- A2→A3 uploads use default score=0
```
**Prevention:**
- CreativeX PDFs go to folder 350605024645 ONLY
- Asset files go to folder 348526703108
- Different purposes, different folders
---
### Category 3: Timing Errors
#### Mistake 3.1: Uploading Before Getting Tracking ID
**What Happens:**
```
Day 1 Morning: Campaign created in DAM (status A1)
Day 1 09:00: Agency starts localizing (proactive!)
Day 1 10:00: Agency uploads derivatives to Box
Day 1 10:05: A1→A2 script runs, downloads masters, generates tracking IDs
Day 1 10:10: A2→A3 script runs, finds agency files
Problem: Agency files have NO tracking IDs (uploaded before A1→A2 ran)
Filename: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de.mp4 (missing tracking ID)
Parse: tracking_id = None
Upload: FAILS
Agency must:
- Wait for A1→A2 email
- Get tracking IDs
- Rename files
- Re-upload
```
**Prevention:** Wait for A1A2 email before starting localization
---
#### Mistake 3.2: Uploading CreativeX PDF After Derivative Upload
**What Happens:**
```
10:00: Agency uploads derivative to A2→A3
10:05: A2→A3 processes, no CreativeX score found
10:05: Uploads to DAM with score=0, placeholder URL
10:10: Agency uploads CreativeX PDF
10:15: creativex_scoring_storing.py extracts, stores score=85
Result:
- Asset already in DAM with score=0
- Database has score=85
- Mismatch between DAM and database
```
**Better Workflow:**
```
1. Upload CreativeX PDF first
2. Run creativex_scoring_storing.py (or wait for scheduled run)
3. Verify score in database
4. Then upload derivative
5. A2→A3 finds score, attaches correctly
```
**Or:** Accept score=0 initially, then update DAM manually later
---
### Category 4: CreativeX Integration Errors
#### Mistake 4.1: PDF Filename Doesn't Match Asset Filename
**What Happens:**
```
Derivative filename (in Box): 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mp4
CreativeX PDF filename field (extracted): NUT_SUMMER_OLV_30S_16x9_MST_DE_de.mp4
↑ Missing job number and tracking ID
Database stores: NUT_SUMMER_OLV_30S_16x9_MST_DE_de.mp4
A2→A3 looks up: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mp4
Database query: WHERE filename = '6487512_NUT_SUMMER_OLV..._pOiJ9s.mp4'
Result: NO MATCH
Uses: Default score=0
```
**Root Cause:**
CreativeX PDF was generated with clean filename, but derivative uploaded with job+tracking
**Fix:**
The PDF's filename field must match EXACTLY what will be uploaded to Box
- Include job number
- Include tracking ID
- Exact same extension
---
#### Mistake 4.2: Not Running CreativeX Scoring Script
**What Happens:**
```
Agency uploads PDF to Box folder 350605024645
Agency forgets to run: python scripts/creativex_scoring_storing.py
PDF sits in folder
Never extracted
Database has no score
All A2→A3 uploads use default score=0
```
**Fix Options:**
1. Run script manually after PDF upload
2. Add to cron for automatic processing
3. Set up Box webhook to trigger script (advanced)
---
### Category 5: Database Issues
#### Mistake 5.1: Duplicate Tracking IDs
**What Happens (Shouldn't Happen, But If It Does):**
```
Somehow two assets get same tracking ID: pOiJ9s
Database constraint: tracking_id UNIQUE
Second insert: FAILS with "duplicate key value"
Script: Catches error, logs failure
Email: Error notification
Investigation needed to find why duplicate generated
```
**Prevention:**
- generate_unique_tracking_id() checks database for uniqueness
- Retries up to 100 times
- Statistically impossible with 6 characters (56 billion combinations)
---
#### Mistake 5.2: Database Connection Lost During Processing
**What Happens:**
```
A1→A2 processing campaign with 10 assets:
- Asset 1-5: ✅ Success, stored in database
- Asset 6: Database connection drops
- Asset 6: INSERT fails
- Script: Catches exception
- Transaction: ROLLED BACK (asset 6 not stored)
- Asset 7-10: ⏭️ Not attempted (stop on error)
Result:
- Status NOT updated (still A1)
- Partial success email sent
- Next run: Retries assets 6-10
- Eventually completes
```
**Why This Design?**
- Transaction safety
- All-or-nothing approach
- Clear state (either campaign done or not)
- Automatic retry on next cron run
---
### Category 6: Intentional Shortcuts (And Why They Fail)
#### Shortcut 6.1: "I'll Fix the Metadata in DAM After Upload"
**The Thinking:**
"I'll upload with whatever filename, then manually edit the metadata in DAM. Faster than using the naming tool."
**Reality:**
```
Upload with wrong filename: nutella_video.mp4
A2→A3 processes:
- Parse fails (not V2.1 format)
- OR parses incorrectly
- Metadata applied wrong
- Upload succeeds with WRONG metadata
Now in DAM:
- Wrong language
- Wrong country
- Wrong asset type
- Wrong description
To fix:
- Find asset in DAM (search)
- Open metadata editor
- Fix 10+ fields manually
- Save each field
- Test search still works
- Update related assets
Time: 30+ minutes per asset
vs.
Using naming tool correctly:
- Generate filename: 1 minute
- Upload: 1 minute
- Automation handles rest
- Correct metadata automatically
Time: 2 minutes total
```
**Lesson:** Shortcuts create more work, not less
---
#### Shortcut 6.2: "I'll Skip the Job Number, Not Billing This"
**The Thinking:**
"This is internal work, no client billing, so job number doesn't matter."
**Reality:**
```
Upload: NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mp4 (no job number)
Parse: FAILS (expects digits in position 1)
OR if job number optional in parser:
Upload succeeds but:
- Can't link to OMG project
- Client asks: "Where's the summer video?"
- Search by job number: NOT FOUND
- Search by client: Manual lookup needed
- Billing reconciliation: Manual matching
- Client reporting: Incomplete
Time wasted: Hours across multiple people
```
**Lesson:** Job number is required business metadata, not optional
---
#### Shortcut 6.3: "I'll Manually Enter CreativeX Score in Box Metadata"
**The Thinking:**
"Old workflow used Box metadata templates. I'll just enter the score there instead of uploading PDF."
**Reality:**
```
User opens Box file
Clicks metadata
Adds custom metadata field: "creativex_score = 85"
Saves
A2→A3 script runs:
- Ignores Box metadata completely (new workflow)
- Looks in PostgreSQL database only
- Finds: Nothing
- Uses: Default score=0
Result: Score not attached despite manual entry
Wasted effort
```
**Lesson:** New workflow is database-only. Must upload PDF or score won't attach.
---
### Category 7: Process Violations
#### Violation 7.1: Uploading Directly to DAM (Bypassing Automation)
**What Happens:**
```
User has DAM access
User uploads derivative directly to "02. Final Assets" folder
Bypasses Box and A2→A3 automation
Result:
- No tracking ID link to master
- No database record
- No CreativeX score attached
- No email notification
- Asset "invisible" to reporting
- Audit trail broken
Later when searching:
- "Why don't we have a German version?"
- File is there but not tracked
- Manual reconciliation needed
```
**Lesson:** ALWAYS use Box upload workflow for tracking and automation
---
#### Violation 7.2: Editing Files in Box After Upload
**What Happens:**
```
10:00: Agency uploads to A2→A3
10:05: Agency notices issue, edits file in Box (replaces version)
10:10: A2→A3 script runs
10:10: Downloads file (gets EDITED version, not original)
10:11: Uploads to DAM
Result: Processed, but...
- Audit trail shows original upload time
- But file is edited version
- Discrepancy between logs and actual file
- Confusion in troubleshooting
```
**Better Approach:**
- Delete file from Box
- Fix locally
- Re-upload with same filename
- Clear audit trail
---
### Category 8: Agency Workflow Mistakes
#### Mistake 8.1: Different Naming for Same Tracking ID
**What Happens:**
```
Master: nutella_summer_hero.mp4, tracking=pOiJ9s
Derivative 1: NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mp4 ✅
Derivative 2: NUT_SOMMER_OLV_30S_16x9_MST_FR_fr_pOiJ9s.mp4 ❌
↑ SOMMER instead of SUMMER (different subject)
Both use tracking ID pOiJ9s but different subjects!
Result:
- Both parse successfully
- Both upload successfully
- But inconsistent naming
- Search confusion
- Brand guideline violation
```
**Lesson:** All derivatives of same master should use SAME subject title
---
#### Mistake 8.2: Wrong File Type/Extension
**What Happens:**
```
Master: nutella_hero.mp4 (video)
Derivative: Agency creates as .mov, uploads as:
6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mov
Parse: ✅ Succeeds
Upload: ✅ May succeed
BUT: Extension mismatch
- DAM expects .mp4 based on master
- Got .mov instead
- May cause downstream issues
- Playback compatibility questions
```
**Best Practice:** Match master file type unless specifically requested otherwise
---
### Category 9: Approval Process Errors
#### Mistake 9.1: Rejecting Without Comments
**What Happens:**
```
Approver marks: ECOMMERCE STATUS = "NOT APPROVED"
Approver doesn't add comments (lazy)
A5→A6 downloads asset
Email to agency:
"Asset rejected:
Rejection Comments:
(none provided)"
Agency: "Why was this rejected? What do I fix?"
Wasted time asking approver
Delayed turnaround
```
**Best Practice:** Always add detailed rejection comments
---
#### Mistake 9.2: Approving When Should Be Rejected
**What Happens:**
```
Asset has compliance issue (missing disclaimer)
Approver misses it
Marks: APPROVED
Status: A4
Asset goes live with compliance issue
Legal discovers issue post-launch
Emergency takedown required
Much bigger problem than rejecting early
Cost: High (legal risk, brand reputation)
```
**Lesson:** Better to reject and fix than approve and regret
---
## Monitoring & Health Checks
### Email Notifications: What They Mean
#### Green Emails (Success)
**"✅ Master Assets Downloaded"** (A1A2)
```
Means:
- Campaign found with status A1
- All masters downloaded successfully
- Uploaded to Box with tracking IDs
- Stored in database
- Status updated A1→A2
- Ready for agency localization
Your Action:
- Forward Box URLs to agency
- Monitor for A2→A3 completion
```
**"✅ Asset Uploaded to DAM"** (A2A3)
```
Means:
- Derivative processed successfully
- Uploaded to DAM Final Assets
- Metadata applied correctly
- CreativeX score attached (or defaults used)
- Deleted from Box
Your Action:
- None (automatic)
- Check for CreativeX warning if present
```
**"✅ Global Master Assets Downloaded"** (B1B2)
```
Means:
- Global campaign processed
- Assets ready for distribution
- Available in Box
- No webhook sent (expected)
Your Action:
- Notify regional teams
```
#### Orange Emails (Partial Success / Warnings)
**"⚠ Partial Download"**
```
Means:
- Some assets succeeded, some failed
- Status NOT updated (campaign still at previous status)
- Failed assets will retry on next run
Your Action:
- Review failed assets in email
- Check logs for error details
- May need to fix issues in DAM
- Wait for retry (5 minutes)
```
**"⚠ CreativeX Score Missing"** (in A2A3 email)
```
Means:
- Asset uploaded successfully
- But no CreativeX score found in database
- Used defaults: Score=0, placeholder URL
- Everything else worked fine
Your Action:
- Upload CreativeX PDF if needed
- Or accept score=0 if not critical
```
#### Red Emails (Errors)
**"❌ Upload Failed"**
```
Means:
- Critical error during processing
- File NOT uploaded to DAM
- Still in Box for retry
Your Action:
- Check error message in email
- Review logs: tail -100 logs/a2_to_a3.log
- Fix underlying issue
- File will retry automatically
```
### Daily Report Email (7:00 PM)
**"📊 Ferrero Automation Daily Report"**
**What It Contains:**
```
Overall Summary (Last 24 Hours):
- Campaigns Found: 5
- Campaigns Processed: 5
- ✅ Completed: 4
- ⚠️ Partial: 1
- Total Assets: 50
- ✓ Successful: 48
- ✗ Failed: 2
- Success Rate: 96.0%
Workflow Breakdown:
A1→A2: 2 campaigns, 20 assets, 20 successful
A2→A3: 3 campaigns, 25 assets, 23 successful, 2 failed
A5→A6: 1 campaign, 5 assets (rejections), 5 successful
B1→B2: 0 campaigns (no global campaigns today)
Errors (if any):
- Asset XYZ failed: Network timeout
- Asset ABC failed: Invalid tracking ID
```
**How to Read It:**
- Success rate >95%: ✅ Healthy
- Success rate 80-95%: ⚠️ Investigate errors
- Success rate <80%: 🔴 System issues, escalate
---
### Log Files
**Location:** `/opt/ferrero-automation/Python-Version/logs/`
**Log Files:**
```
a1_to_a2.log - Master downloads
a2_to_a3.log - Derivative uploads
a5_to_a6.log - Rejection downloads
b1_to_b2.log - Global masters
creativex_scoring.log - CreativeX extraction
daily_report.log - Daily report generation
backup.log - Database backups
cron_*.log - Cron execution logs
```
**Log Rotation:**
- Max size: 10MB per file
- Backups: 28 rotated files kept
- Total: ~280MB per workflow
- Auto-cleanup: Old logs deleted
**Useful Log Commands:**
```bash
# Watch live
tail -f logs/a2_to_a3.log
# Check for errors today
grep -i error logs/*.log | grep "$(date +%Y-%m-%d)"
# Count successful uploads today
grep "✓ Success" logs/a2_to_a3.log | grep "$(date +%Y-%m-%d)" | wc -l
# Find specific tracking ID
grep "pOiJ9s" logs/*.log
# See recent activity
tail -100 logs/a1_to_a2.log
```
---
## Troubleshooting Guide
### Symptom-Based Troubleshooting
#### "My file isn't being processed"
**Check 1: Is it in the right Box folder?**
```bash
# Verify folder ID in email
A2→A3 folder: 348526703108
CreativeX folder: 350605024645
```
**Check 2: Does the filename parse correctly?**
```bash
# Test parsing locally
python3 -c "
from scripts.shared.filename_parser import FilenameParser
parser = FilenameParser()
result = parser.parse_filename('YOUR_FILENAME.mp4')
print('Valid:', result['is_valid'])
print('Errors:', result['validation_errors'])
print('Tracking ID:', result['tracking_id'])
"
```
**Check 3: Does the tracking ID exist in database?**
```bash
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT tracking_id, original_filename
FROM master_assets
WHERE tracking_id = 'pOiJ9s';
"
```
**Check 4: Check the logs**
```bash
grep "YOUR_FILENAME" logs/a2_to_a3.log
```
---
#### "CreativeX score shows 0 instead of actual score"
**Check 1: Is score in database?**
```bash
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT filename, quality_score, status
FROM creativex_scores
WHERE filename LIKE '%YOUR_FILENAME%';
"
```
**Check 2: Does filename match EXACTLY?**
```
Database has: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mp4
Lookup uses: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mp4
Match: Must be character-for-character identical
```
**Check 3: Is status = 'active'?**
```bash
# Query filters for status='active' only
# Superseded scores are not used
SELECT status FROM creativex_scores WHERE filename = 'exact_name.mp4';
```
**Fix:**
- Upload CreativeX PDF with exact filename
- Run creativex_scoring_storing.py
- Re-upload derivative (will pick up score)
---
#### "Asset uploaded with wrong metadata"
**Check 1: Was correct tracking ID used?**
```bash
# Check what master the tracking ID links to
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT tracking_id, original_filename,
full_metadata->'metadata'->>'id' as metadata_model
FROM master_assets
WHERE tracking_id = 'abc123';
"
```
**Check 2: Check the --dryrun output**
```bash
# See exactly what metadata would be sent
python scripts/a2_to_a3_upload_polling.py --dryrun
# Review the JSON output
# Verify each field has correct value
```
**Fix:**
- Verify tracking ID is correct
- Check field_mappings.yaml configuration
- Re-upload with corrections
---
### Emergency Procedures
#### Emergency 1: Database Corruption
**Symptoms:**
- Scripts crashing with database errors
- Queries returning unexpected results
- Table structure errors
**Steps:**
1. Stop all workflows immediately
```bash
crontab -e # Comment out all cron jobs
```
2. Create emergency backup
```bash
./database/backup.sh --daily
```
3. Restore from last known good backup
```bash
./database/restore.sh backups/dumps/ferrero_tracking_2025-11-13_02-00.sql.gz
```
4. Verify restoration
```bash
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT COUNT(*) FROM master_assets;
SELECT COUNT(*) FROM creativex_scores;
"
```
5. Resume workflows
```bash
crontab -e # Uncomment cron jobs
```
6. Notify team of data loss window (time between backup and corruption)
---
#### Emergency 2: All Workflows Failing
**Symptoms:**
- Multiple error emails
- No successful uploads for hours
- Logs show authentication failures
**Steps:**
1. Test connections
```bash
python scripts/test_connection.py
python scripts/test_connection.py --auth-pfx
```
2. Check credentials in .env
```bash
grep "DAM_" .env | grep -v PASSWORD
grep "BOX_" .env | grep -v PASSWORD | grep -v KEY
```
3. Check external services
- DAM: Is it accessible? Network issues?
- Box: API status page check
- Database: Container running?
4. Check disk space
```bash
df -h /opt/ferrero-automation
```
5. Review recent changes
```bash
git log --oneline | head -10
```
6. Rollback if recent deployment caused issues
```bash
git revert <commit-hash>
```
---
## Glossary
**A1, A2, A3, A4, A5, A6** - Campaign status codes indicating workflow stage
**Asset Representation** - JSON structure containing all metadata fields for DAM upload
**Tracking ID** - 6-character unique identifier linking derivative to master asset
**Master Asset** - Original creative asset from campaign creation
**Derivative Asset** - Localized version of master asset
**JSONB** - PostgreSQL data type for storing JSON with queryable fields
**MVP Fields** - 30 core metadata fields required for asset upload
**Soft Delete** - Marking records as 'superseded' instead of deleting (preserves history)
**CreativeX Score** - Quality score (0-100) from CreativeX AI analysis
**OMG Job Number** - Project identifier from OMG project management system
**Folder-Only Mode** - Tracking ID with -N suffix, inherits only upload folder
**LlamaExtract** - AI service for extracting data from PDF documents
**pg_dump** - PostgreSQL backup tool creating SQL dump
**Cron** - Linux job scheduler running scripts on schedule
---
## FAQ
**Q: Can I edit a filename after upload to Box?**
A: No. The script processes based on exact filename at time of discovery. Editing causes mismatches.
**Q: What if I upload the wrong tracking ID?**
A: Delete the file from Box, rename with correct tracking ID, re-upload.
**Q: Can the same master have multiple derivatives with different tracking IDs?**
A: No. One master = one tracking ID. All derivatives of that master use the same tracking ID.
**Q: What happens if CreativeX PDF is never uploaded?**
A: Asset uploads successfully with default score=0. Functional but not scored.
**Q: Can I skip the naming tool and create filenames manually?**
A: Technically yes if you know the exact V2.1 format, but NOT recommended. Tool prevents errors.
**Q: How long until my uploaded file is processed?**
A: Maximum 5 minutes (next cron run). Usually 1-3 minutes.
**Q: What if my file is stuck in Box for > 10 minutes?**
A: Check logs for errors, verify filename format, check tracking ID in database.
**Q: Can I upload multiple derivatives at once?**
A: Yes! A2A3 processes one file per run but runs every 5 minutes. All will be processed within minutes.
**Q: What's the maximum file size?**
A: Limited by Box upload limits and network bandwidth. Tested with files up to 500MB successfully.
---
## Quick Reference
### Box Folder IDs
| Workflow | Folder ID | Purpose |
|----------|-----------|---------|
| A1A2 | 348304357505 | Master downloads (local campaigns) |
| A2A3 | 348526703108 | Agency derivative uploads |
| A5A6 | 349441822875 | Rejected asset downloads |
| B1B2 | 349261192115 | Global master downloads |
| CreativeX | 350605024645 | CreativeX PDF reports |
### Status Codes
| Code | Meaning | Next Step |
|------|---------|-----------|
| A1 | Campaign created, masters ready | A1A2 automation |
| A2 | Masters in Box, ready for localization | Agency work |
| A3 | Derivatives uploaded, ready for approval | Approval process |
| A4 | Approved, ready for distribution | Go live |
| A5 | Rejected, needs rework | A5A6 automation |
| A6 | Rework in progress | Agency fixes, then A2A3 |
| B1 | Global masters ready | B1B2 automation |
| B2 | Global masters distributed | Regional teams |
### Script Commands
```bash
# Manual execution
python scripts/a1_to_a2_download.py
python scripts/a2_to_a3_upload_polling.py
python scripts/a5_to_a6_download.py
python scripts/b1_to_b2_download.py
python scripts/creativex_scoring_storing.py
# With flags
python scripts/a2_to_a3_upload_polling.py --dryrun # Test mode
python scripts/a1_to_a2_download.py --auth-pfx # mTLS auth
python scripts/a2_to_a3_upload_polling.py --keep-files # Don't delete from Box
# Database operations
./database/backup.sh --daily
./database/restore.sh --list
./database/check_backups.sh
# Monitoring
tail -f logs/a2_to_a3.log # Watch logs
grep -i error logs/*.log | grep "$(date +%Y-%m-%d)" # Today's errors
```
### Database Queries
```bash
# Recent master assets
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT tracking_id, original_filename, created_at
FROM master_assets
ORDER BY created_at DESC LIMIT 10;
"
# Recent uploads
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT tracking_id, derivative_filename, uploaded_at
FROM derivative_assets
ORDER BY uploaded_at DESC LIMIT 10;
"
# Active CreativeX scores
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT filename, quality_score, extracted_at
FROM creativex_scores
WHERE status = 'active'
ORDER BY extracted_at DESC LIMIT 10;
"
# Campaign status
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT campaign_number, status, webhook_sent, updated_at
FROM campaign_status
ORDER BY updated_at DESC LIMIT 10;
"
```
---
## Contact Information
### Escalation Path
**Level 1: Self-Service**
- Check this guide
- Review logs
- Check email notifications
- Query database
**Level 2: Operations Team**
- Email: operations@oliver.agency
- Response time: Within 2 hours (business hours)
- Can: Review logs, check database, restart workflows
**Level 3: Technical Lead**
- Email: daveporter@oliver.agency
- Response time: Within 4 hours
- Can: Code fixes, configuration changes, database recovery
**Level 4: Ferrero IT**
- Contact: [To be provided]
- Response time: As per SLA
- Can: DAM access issues, network problems, credential resets
### Support Resources
**Documentation:**
- COMPLETE_WORKFLOW_GUIDE.md (this document)
- CREATIVEX_DEPLOYMENT.md (CreativeX setup)
- DATABASE_BACKUP_GUIDE.md (Backup/restore)
- CUTOVER.md (Production deployment)
- CUTOVER-TODOS.md (Deployment checklist)
**Logs:**
- `/opt/ferrero-automation/Python-Version/logs/`
**Database:**
- Host: localhost:5437
- Database: ferrero_tracking
- User: ferrero_user
**Monitoring:**
- Daily email reports (7:00 PM)
- Real-time notifications
- Log files
---
## Summary: The Golden Rules
### For Campaign Managers
1. Set campaign to A1 when ready
2. Wait for automation (don't rush)
3. Forward Box URLs to agency promptly
4. Monitor email notifications
5. Review daily reports
6. Don't manually move files between DAM folders
7. Don't change status codes manually
### For Agencies
1. **ALWAYS use the Naming Convention Tool**
2. Copy filename EXACTLY from tool (don't type)
3. Wait for A1A2 email before starting
4. Get tracking IDs from email, not elsewhere
5. Upload CreativeX PDFs before derivatives
6. Upload to correct Box folder (check email URL)
7. Use SAME tracking ID for rework
8. Don't manually type filenames
9. Don't reuse tracking IDs across campaigns
10. Don't upload directly to DAM
### For Approvers
1. Always add detailed rejection comments
2. Be specific about what needs fixing
3. Check compliance before approving
4. Don't reject without explaining why
### For Operations
1. Monitor daily reports
2. Check backup health weekly
3. Review error patterns
4. Test restore monthly
5. Keep documentation updated
---
## Document Version History
**Version 2.1** - November 2025
- Updated for V2.1 filename structure
- Added social media field
- Added asset type mapping
- Added CreativeX master score extraction
- Complete workflow overhaul documentation
**Version 2.0** - October 2025
- Initial comprehensive guide
- All workflows documented
- Mermaid diagrams added
---
**End of Document**
**Total Pages:** ~65 pages (when printed)
**Last Updated:** November 2025
**Maintained By:** Operations Team
**Review Cycle:** Quarterly or after major changes
For questions or feedback, contact: operations@oliver.agency