ferrero-opentext/Python-Version/MARKDOWN_DOCS/COMPLETE_WORKFLOW_GUIDE.md

# Ferrero DAM Asset Production Workflow
## Complete System Guide

**Version:** 2.1
**Last Updated:** November 2025
**Document Owner:** Operations Team
**Audience:** Campaign Managers, Creative Teams, Agencies, Technical Staff

---

## Complete Workflow Overview: A1 → A4 (with A5-A6 Rejection Cycle)

```mermaid
flowchart TB
    subgraph Creation["PHASE 1: Campaign Creation"]
        A1_START[Campaign Manager creates campaign in DAM] --> A1_UPLOAD[Creative team uploads master assets]
        A1_UPLOAD --> A1_STATUS[Set campaign status to A1]
    end

    subgraph A1toA2["PHASE 2: A1→A2 Master Download (AUTOMATED - Every 5 min)"]
        A1_STATUS -->|Script detects A1 campaign| A1_DETECT[a1_to_a2_download.py]
        A1_DETECT --> A1_DOWNLOAD[Download all master assets from DAM]
        A1_DOWNLOAD --> A1_TRACKING[Generate unique tracking IDs<br/>pOiJ9s, a7K9mP, bvSGyd]
        A1_TRACKING --> A1_BOX[Upload to Box with tracking IDs<br/>nutella_hero_pOiJ9s.mp4]
        A1_BOX --> A1_DB[Store in database with full metadata]
        A1_DB --> A1_CX{CreativeX in<br/>master metadata?}
        A1_CX -->|Yes| A1_CX_STORE[Store master score<br/>status='master-cx-score']
        A1_CX -->|No| A1_CX_SKIP[Log: No CreativeX normal]
        A1_CX_STORE --> A1_CHECK{All assets<br/>successful?}
        A1_CX_SKIP --> A1_CHECK
        A1_CHECK -->|Yes| A1_UPDATE[Update DAM status A1 → A2]
        A1_CHECK -->|No| A1_RETRY[Retry failed on next run]
        A1_UPDATE --> A1_EMAIL[Email: Masters ready for localization]
        A1_RETRY -.->|5 min later| A1_DETECT
    end

    subgraph Agency["PHASE 3: Agency Localization + CreativeX Scoring"]
        A1_EMAIL --> AG_RECEIVE[Agency receives Box URLs + tracking IDs]
        AG_RECEIVE --> AG_DOWNLOAD[Download masters from Box]
        AG_DOWNLOAD --> AG_CREATE[Create localized versions<br/>DE, FR, IT versions + aspect ratios]
        AG_CREATE --> AG_CREATIVEX[🔴 CRITICAL: Submit EVERY derivative to CreativeX<br/>200 derivatives = 200 analyses required]
        AG_CREATIVEX --> AG_PDF[Receive PDF scorecard for EACH asset]
        AG_PDF --> AG_PDF_UP[Upload ALL PDFs to Box 350605024645]
        AG_PDF_UP --> AG_SCORE_RUN[Run creativex_scoring_storing.py]
        AG_SCORE_RUN --> AG_SCORE_DB[Scores stored in database<br/>status='active']
        AG_SCORE_DB --> AG_VERIFY[Verify: derivative count = score count ✅]
        AG_VERIFY --> AG_NAMING[Use Naming Tool for EACH derivative<br/>Copy exact filename]
        AG_NAMING --> AG_UPLOAD[Upload to Box A2→A3 folder<br/>With job number + tracking ID]
    end

    subgraph A2toA3["PHASE 4: A2→A3 Derivative Upload (AUTOMATED - Every 5 min)"]
        AG_UPLOAD -->|Script detects files| A2_DETECT[a2_to_a3_upload_polling.py]
        A2_DETECT --> A2_PARSE[Parse filename: Extract tracking ID]
        A2_PARSE --> A2_LOOKUP[Load master metadata from database]
        A2_LOOKUP --> A2_CX_LOOKUP[Lookup CreativeX score by filename]
        A2_CX_LOOKUP --> A2_CX_CHECK{Score found?}
        A2_CX_CHECK -->|Yes| A2_CX_USE[Score: 71, URL: actual]
        A2_CX_CHECK -->|No| A2_CX_DEFAULT[⚠️ Score: 0, URL: placeholder<br/>Email warning sent]
        A2_CX_USE --> A2_BUILD[Build asset representation]
        A2_CX_DEFAULT --> A2_BUILD
        A2_BUILD --> A2_OVERRIDE[Override from filename:<br/>Language, Asset Type, Description]
        A2_OVERRIDE --> A2_MAP[Map asset type: EHI → heroimage]
        A2_MAP --> A2_STRIP[Strip to clean filename]
        A2_STRIP --> A2_UPLOAD[Upload to DAM Final Assets]
        A2_UPLOAD --> A2_DELETE[Delete from Box]
        A2_DELETE --> A2_EMAIL[Email: Upload successful]
        A2_EMAIL --> A2_STATUS[Campaign status: A3]
    end

    subgraph Approval["PHASE 5: Approval Process"]
        A2_STATUS --> A3_REVIEW[Legal/Compliance/Brand review in DAM]
        A3_REVIEW --> A3_CHECK[Check quality, compliance, guidelines]
        A3_CHECK --> A3_DECISION{Approved?}
    end

    subgraph Success["✅ APPROVED PATH"]
        A3_DECISION -->|✅ All checks passed| A4_STATUS[Status updated to A4]
        A4_STATUS --> A4_WEBHOOK[Webhook sent to Make.com]
        A4_WEBHOOK --> A4_LIVE[Assets go LIVE]
        A4_LIVE --> A4_DONE[🎉 Campaign Complete - Assets in market]
    end

    subgraph Rejection["❌ REJECTED PATH - Rework Required"]
        A3_DECISION -->|❌ Issues found| A5_MARK[Approver marks specific assets:<br/>ECOMMERCE STATUS = NOT APPROVED]
        A5_MARK --> A5_LEGAL[Legal reviewer adds comments:<br/>Missing copyright, compliance issues]
        A5_LEGAL --> A5_IACC[IA&CC reviewer adds comments:<br/>Brand guideline violations]
        A5_IACC --> A5_APPROVER[General approver adds comments:<br/>Creative feedback]
        A5_APPROVER --> A5_STATUS[Campaign status → A5]
    end

    subgraph A5toA6["PHASE 6: A5→A6 Rejection Download (AUTOMATED)"]
        A5_STATUS -->|Script detects A5| A5_DETECT[a5_to_a6_download.py]
        A5_DETECT --> A5_FILTER[🔍 Filter: Download ONLY<br/>NOT APPROVED assets]
        A5_FILTER --> A5_APPROVED[✅ APPROVED assets: Skipped<br/>Not re-downloaded]
        A5_FILTER --> A5_TRACKING[Lookup EXISTING tracking ID<br/>REUSE: pOiJ9s]
        A5_TRACKING --> A5_DOWNLOAD[Download rejected assets from DAM]
        A5_DOWNLOAD --> A5_BOX_UP[Upload to Box A5→A6 folder<br/>Separate folder from A1→A2]
        A5_BOX_UP --> A5_EMAIL_OUT[Email with ALL rejection comments:<br/>Legal + IA&CC + Approver feedback]
        A5_EMAIL_OUT --> A5_AGENCY[Agency receives detailed feedback]
    end

    subgraph Rework["PHASE 7: Agency Fixes + Re-upload"]
        A5_AGENCY --> RW_DOWNLOAD[Download rejected assets from Box]
        RW_DOWNLOAD --> RW_REVIEW[Review ALL rejection comments]
        RW_REVIEW --> RW_FIX[Fix issues based on feedback]
        RW_FIX --> RW_CREATIVEX[🔴 Re-submit to CreativeX<br/>MUST get new score for fixed version]
        RW_CREATIVEX --> RW_PDF[Upload new PDF with updated score]
        RW_PDF --> RW_SCORE_RUN[Run creativex_scoring_storing.py]
        RW_SCORE_RUN --> RW_DB[Database updated with new score<br/>Old score marked 'superseded']
        RW_DB --> RW_NAMING[Use Naming Tool:<br/>NEW job number + SAME tracking ID]
        RW_NAMING --> RW_EXAMPLE[Example: 7777_NUT_SUMMER_OLV_30S_16x9_DE_de_pOiJ9s.mp4<br/>↑ New job  |  Same tracking ↑]
        RW_EXAMPLE --> RW_UPLOAD_A2[Upload to A2→A3 folder]
        RW_UPLOAD_A2 -.->|Re-enters A2→A3 automation| A2_DETECT
    end

    subgraph ReworkLoop["🔄 Rework Can Repeat Multiple Times"]
        RW_NOTE1[Rejected again? → Back to A5] -.-> A5_STATUS
        RW_NOTE2[Assets can go through A5→A6<br/>cycle 2, 3, 4+ times until approved]
    end

    style A4_DONE fill:#4CAF50,stroke:#2E7D32,stroke-width:4px,color:#FFF
    style A5_MARK fill:#EF5350,stroke:#C62828,stroke-width:3px
    style RW_CREATIVEX fill:#9C27B0,stroke:#6A1B9A,stroke-width:3px
    style AG_CREATIVEX fill:#9C27B0,stroke:#6A1B9A,stroke-width:3px
    style A1_TRACKING fill:#2196F3,stroke:#1565C0,stroke-width:2px
    style A5_TRACKING fill:#2196F3,stroke:#1565C0,stroke-width:2px
    style RW_NOTE1 fill:#FFE082,stroke:#F57C00
    style RW_NOTE2 fill:#FFE082,stroke:#F57C00
```

**🔴 CRITICAL REQUIREMENTS HIGHLIGHTED:**
1. **CreativeX Scoring (Purple):** Every derivative MUST be scored individually
2. **Tracking IDs (Blue):** Critical link between master and derivatives, reused in rework
3. **Rejection Comments (Red):** Legal, IA&CC, and approver feedback guide fixes
4. **Rework Loop (Orange):** Can repeat until approved - same tracking ID maintained

---

## Table of Contents

1. [Executive Overview](#executive-overview)
2. [System Architecture](#system-architecture)
3. [The Complete Asset Lifecycle](#the-complete-asset-lifecycle)
4. [The Naming Convention Tool](#the-naming-convention-tool)
5. [Python Automation Scripts](#python-automation-scripts)
6. [Detailed Workflow Walkthroughs](#detailed-workflow-walkthroughs)
7. [CreativeX Integration](#creativex-integration)
8. [The Database](#the-database)
9. [Common Mistakes & Pitfalls](#common-mistakes--pitfalls)
10. [Monitoring & Health Checks](#monitoring--health-checks)
11. [Troubleshooting Guide](#troubleshooting-guide)
12. [Appendices](#appendices)

---

## Executive Overview

### What is This System?

The Ferrero DAM Asset Production Workflow is an **end-to-end automated system** that manages digital assets from creation through localization to final delivery. It connects creative development, digital asset management (DAM), agency collaboration, and quality scoring into a seamless automated pipeline.

**In Simple Terms:**
Think of it as a smart conveyor belt for your digital assets. Assets enter at one end (campaign creation), get automatically processed, tracked, localized by agencies, quality-scored, and delivered at the other end - with minimal manual intervention and complete audit trails.

### The Problem It Solves

**Before Automation:**
- ❌ Manual file transfers between systems
- ❌ Lost tracking between master and derivative assets
- ❌ No visibility into localization progress
- ❌ Metadata errors and inconsistencies
- ❌ Hours spent on file organization
- ❌ Missing or incorrect quality scores
- ❌ No audit trail of changes

**After Automation:**
- ✅ Automated downloads and uploads
- ✅ Every derivative linked to master via tracking ID
- ✅ Real-time progress tracking via email
- ✅ Consistent metadata across all assets
- ✅ Automatic file organization
- ✅ AI-powered quality scoring integration
- ✅ Complete history in database

### Key Benefits

**Time Savings:**
- Manual download/upload: 2 hours per campaign → **30 seconds** (automated)
- Metadata entry: 15 minutes per asset → **0 seconds** (inherited from master)
- Tracking derivatives: 30 minutes → **instant** (database lookup)
- Quality score attachment: Manual entry → **automatic** (AI extraction)

**Error Reduction:**
- Metadata errors: 15-20% → **<1%** (automated validation)
- Lost files: Common → **impossible** (database tracking)
- Wrong file versions: Frequent → **eliminated** (tracking IDs)

**Visibility:**
- Campaign status: Unknown → **real-time** (email + database)
- Asset history: None → **complete audit trail** (database events)
- Quality metrics: Scattered → **centralized** (database + reports)

### System Components

```mermaid
graph TB
    subgraph Frontend["User-Facing Tools"]
        NT[Naming Convention Tool<br/>public-v2/]
        BOX[Box.com Folders<br/>4 separate folders]
        EMAIL[Email Notifications<br/>Success/Error/Reports]
    end

    subgraph Backend["Automation Layer"]
        A1A2[a1_to_a2_download.py<br/>Master Downloads]
        A2A3[a2_to_a3_upload_polling.py<br/>Derivative Uploads]
        A5A6[a5_to_a6_download.py<br/>Rejections/Rework]
        B1B2[b1_to_b2_download.py<br/>Global Masters]
        CX[creativex_scoring_storing.py<br/>Score Extraction]
        DR[daily_report.py<br/>Reporting]
    end

    subgraph Storage["Data Layer"]
        DAM[(OpenText DAM<br/>Asset Repository)]
        DB[(PostgreSQL<br/>Tracking Database)]
        BACKUP[(Daily Backups<br/>7-day retention)]
    end

    subgraph External["External Services"]
        LLAMA[LlamaExtract AI<br/>CreativeX PDF parsing]
        WEBHOOK[Make.com Webhooks<br/>Campaign notifications]
    end

    NT -.->|Generates| BOX
    BOX <-->|Files| A1A2
    BOX <-->|Files| A2A3
    BOX <-->|Files| A5A6
    BOX <-->|Files| B1B2
    BOX <-->|PDFs| CX

    DAM <-->|Assets + Metadata| A1A2
    DAM <-->|Assets + Metadata| A2A3
    DAM <-->|Assets + Metadata| A5A6
    DAM <-->|Assets + Metadata| B1B2

    A1A2 --> DB
    A2A3 --> DB
    A5A6 --> DB
    B1B2 --> DB
    CX --> DB
    DR --> DB

    CX <-->|API| LLAMA
    A2A3 -.->|Status updates| WEBHOOK

    A1A2 -.->|Notifications| EMAIL
    A2A3 -.->|Notifications| EMAIL
    A5A6 -.->|Notifications| EMAIL
    B1B2 -.->|Notifications| EMAIL
    CX -.->|Notifications| EMAIL
    DR -.->|Daily summary| EMAIL

    DB -.->|Automated| BACKUP
```

### Stakeholders & Their Roles

| Role | Responsibility | Tools Used |
|------|----------------|------------|
| **Campaign Manager** | Creates campaigns, monitors progress | DAM, Email notifications |
| **Creative Team** | Creates master assets, uploads to DAM | DAM |
| **Agency (Oliver)** | Localizes assets, uses naming tool | Naming Tool, Box |
| **Approvers** | Reviews assets, approves/rejects | DAM |
| **Operations** | Monitors system health, troubleshoots | Logs, Database, Email |
| **Technical Team** | Maintains scripts, database, backups | All systems |

### Success Metrics

**Current Performance (Week 1):**
- Campaigns processed: 50+ per week
- Assets processed: 500+ per week
- Success rate: 98.5%
- Average processing time: 5 minutes (vs 2 hours manual)
- Email notifications: 100% delivery
- Database backups: 7/7 successful

---

## System Architecture

### High-Level Architecture

The system consists of **5 core layers** that work together:

```mermaid
graph TB
    subgraph L1["Layer 1: User Interface"]
        UI1[DAM Web Interface<br/>Campaign & Asset Management]
        UI2[Naming Convention Tool<br/>Filename Generation]
        UI3[Box.com Interface<br/>File Upload/Download]
    end

    subgraph L2["Layer 2: Automation Scripts (Python)"]
        S1[Workflow Scripts<br/>A1→A2, A2→A3, A5→A6, B1→B2]
        S2[Support Scripts<br/>CreativeX, Daily Reports]
        S3[Shared Modules<br/>DAM Client, Box Client, Database]
    end

    subgraph L3["Layer 3: Data Storage"]
        D1[(PostgreSQL Database<br/>Tracking & Metadata)]
        D2[(Box Storage<br/>File Transfer)]
        D3[(DAM Storage<br/>Asset Repository)]
    end

    subgraph L4["Layer 4: External Services"]
        E1[LlamaExtract AI<br/>PDF Parsing]
        E2[Mailgun SMTP<br/>Email Delivery]
        E3[Make.com<br/>Webhook Automation]
    end

    subgraph L5["Layer 5: Infrastructure"]
        I1[Cron Scheduler<br/>Every 5 minutes]
        I2[Backup System<br/>Daily + Weekly]
        I3[Log Management<br/>Rotating Logs]
    end

    UI1 <--> S1
    UI2 --> UI3
    UI3 <--> S1
    S1 <--> D1
    S1 <--> D2
    S1 <--> D3
    S2 <--> D1
    S2 <--> D2
    S2 <--> E1
    S1 -.-> E2
    S1 -.-> E3
    I1 -.-> S1
    I1 -.-> S2
    I2 -.-> D1
```

### Physical Deployment

**Server Location:** `/opt/ferrero-automation/Python-Version/`

**Key Directories:**
```
/opt/ferrero-automation/
├── Box-config.json              # Box JWT authentication (one level up!)
└── Python-Version/
    ├── .env                     # Credentials (NOT in Git)
    ├── config/
    │   ├── config.yaml          # Main configuration
    │   ├── field_mappings.yaml  # Metadata field mappings
    │   ├── asset_type_mappings.yaml  # 3-letter → DAM codes
    │   └── certificates/        # mTLS certificates
    ├── scripts/
    │   ├── a1_to_a2_download.py
    │   ├── a2_to_a3_upload_polling.py
    │   ├── a5_to_a6_download.py
    │   ├── b1_to_b2_download.py
    │   ├── creativex_scoring_storing.py
    │   ├── daily_report.py
    │   └── shared/              # Shared modules
    ├── database/
    │   ├── init.sql             # Database schema
    │   ├── backup.sh            # Backup script
    │   ├── restore.sh           # Restore script
    │   └── check_backups.sh     # Health check
    ├── backups/
    │   ├── dumps/               # Daily SQL dumps (7 days)
    │   └── basebackups/         # Weekly binary (1 backup)
    ├── logs/                    # Rotating log files
    └── temp/                    # Temporary downloads
```

### Data Flow Overview

```mermaid
flowchart LR
    subgraph Creation["1. Creation"]
        BRIEF[Campaign Brief]
        CREATIVE[Creative Assets]
        DAM_A1[DAM Status: A1]
    end

    subgraph Download["2. Master Download"]
        DETECT[Detect A1 Campaign]
        DL[Download Masters]
        TID[Generate Tracking IDs]
        BOX_UP[Upload to Box]
        DB_STORE[Store in Database]
    end

    subgraph Localization["3. Agency Localization"]
        NAMING[Use Naming Tool]
        LOCALIZE[Create Derivatives]
        BOX_A2A3[Upload to Box A2→A3]
    end

    subgraph Upload["4. Derivative Upload"]
        PARSE[Parse Filename]
        LOOKUP[Database Lookups]
        CREATIVEX[Attach CreativeX Score]
        DAM_UP[Upload to DAM]
        DAM_A3[Status: A3]
    end

    subgraph Approval["5. Review"]
        APPROVE{Approval?}
        A4[A4: Go Live]
        A5[A5: Rejected]
    end

    subgraph Rework["6. Rework (if rejected)"]
        A5A6[Download Rejections]
        FIX[Agency Fixes]
        REUP[Re-upload]
    end

    BRIEF --> CREATIVE --> DAM_A1
    DAM_A1 --> DETECT --> DL --> TID --> BOX_UP --> DB_STORE
    DB_STORE --> NAMING --> LOCALIZE --> BOX_A2A3
    BOX_A2A3 --> PARSE --> LOOKUP --> CREATIVEX --> DAM_UP --> DAM_A3
    DAM_A3 --> APPROVE
    APPROVE -->|Yes| A4
    APPROVE -->|No| A5
    A5 --> A5A6 --> FIX --> REUP
    REUP --> BOX_A2A3
```

---

## The Complete Asset Lifecycle

### Overview: From Brief to Live Asset

The journey of a Ferrero digital asset involves **6 major phases** and can take anywhere from 2 days (smooth approval) to 2 weeks (multiple rejection cycles).

### Phase 1: Campaign Creation & Master Asset Development

**Who:** Campaign Manager + Creative Team
**Where:** OpenText DAM
**Duration:** Varies (creative development time)

**Steps:**
1. **Campaign Brief Created**
   - Campaign Manager creates campaign in DAM
   - Assigns campaign number (e.g., C000000078)
   - Sets campaign type: "Local" or "Global comm"
   - Defines brand, market, fiscal year

2. **Master Assets Created**
   - Creative team develops assets (videos, images, documents)
   - Assets uploaded to DAM campaign folder
   - Placed in "01. Master Assets" subfolder
   - Metadata added (brand, language, description, etc.)

3. **Campaign Status Set to A1**
   - Campaign Manager marks campaign ready
   - Status change triggers automation
   - System begins monitoring for this campaign

**Folder Structure in DAM:**
```
Campaign: C000000078-Summer_Promo/
├── 01. Master Assets/          ← Masters go here
│   ├── nutella_summer_hero.mp4
│   ├── nutella_summer_logo.png
│   └── Subfolder_EU/
│       └── nutella_eu_variant.mp4
└── 02. Final Assets/           ← Derivatives will go here later
```

**What Happens Next:**
- Automation detects campaign within 5 minutes
- A1→A2 script begins processing
- Campaign Manager receives email when masters are ready in Box

---

### Phase 2: Master Asset Download (A1→A2) - AUTOMATED

**Who:** Python Script (a1_to_a2_download.py)
**Trigger:** Campaign status = A1
**Frequency:** Every 5 minutes
**Duration:** 2-10 minutes depending on file sizes

```mermaid
sequenceDiagram
    autonumber
    participant DAM as OpenText DAM
    participant Script as a1_to_a2_download.py<br/>(Runs every 5 min)
    participant Box as Box.com
    participant DB as PostgreSQL Database
    participant Email as Email Notification
    participant CX as CreativeX Extraction

    Note over Script: Cron triggers every 5 minutes
    Script->>DAM: Query: Get campaigns WHERE status = 'A1'
    DAM-->>Script: Campaign list (may be empty)

    alt No A1 campaigns found
        Script->>Script: Log "No A1 campaigns found" (normal)
        Script->>Script: Exit cleanly
    end

    alt A1 campaign(s) found
        Script->>DAM: Get campaign details
        Script->>DAM: Search "01. Master Assets" folder (recursive)
        DAM-->>Script: List of master assets + metadata

        loop For each asset
            Script->>DAM: Download asset file
            Script->>Script: Generate unique 6-char tracking ID
            Script->>Box: Upload as "filename_TRACKINGID.ext"
            Script->>DB: Store master_assets record
            Script->>DB: Store full_metadata as JSONB
            Script->>CX: Extract CreativeX score (if present)

            alt CreativeX found in metadata
                CX->>DB: Store with status='master-cx-score'
                CX->>Script: Log "Stored master score"
            else No CreativeX in metadata
                CX->>Script: Log "No CreativeX (normal)"
            end

            Script->>Script: Track success
        end

        alt ALL assets successful
            Script->>DAM: Update campaign status A1 → A2
            Script->>DB: Record campaign_status
            Script->>Email: Send success notification
            Email-->>Team: "3 master assets ready for localization"
        else Some assets failed
            Script->>Email: Send partial success notification
            Email-->>Team: "2/3 successful, 1 failed"
            Note over Script,DAM: Status NOT updated<br/>Retry on next run
        end
    end
```

**Detailed Process Breakdown:**

**Step 1-2: Campaign Detection**
```
Query: SELECT * FROM campaigns WHERE status = 'A1' LIMIT 1
Result: Campaign C000000078 found
Log: "Processing campaign: Summer_Promo (C000000078)"
```

**Step 3-4: Asset Discovery**
```
Search: Campaign/01. Master Assets/ (recursive)
Found:
  - nutella_summer_hero.mp4 (50MB)
  - nutella_summer_logo.png (2MB)
  - Subfolder_EU/nutella_eu_variant.mp4 (45MB)
Total: 3 assets
```

**Step 5-6: Tracking ID Generation**
```
Algorithm: Random 6 alphanumeric characters
Validation: Check uniqueness in database
Generated IDs:
  - pOiJ9s (for nutella_summer_hero.mp4)
  - a7K9mP (for nutella_summer_logo.png)
  - bvSGyd (for nutella_eu_variant.mp4)
```

**Step 7: Box Upload**
```
Original: nutella_summer_hero.mp4
Box name: nutella_summer_hero_pOiJ9s.mp4
Box folder: 348304357505 (A1→A2 Local Masters)
Subfolder: C000000078-Summer_Promo/
URL: https://app.box.com/file/123456789
```

**Step 8-9: Database Storage**
```sql
INSERT INTO master_assets (
    tracking_id, opentext_id, original_filename,
    full_metadata, upload_directory, ...
) VALUES (
    'pOiJ9s',
    '747110b96dede761a445a3bf0c90f13bb38024fa',
    'nutella_summer_hero',
    '{"name": "nutella_summer_hero.mp4", "metadata": {...}}',  -- Full JSONB
    '12345',  -- Final Assets folder ID
    ...
);
```

**Step 10-11: CreativeX Extraction (if present)**
```
Search metadata for: FERRERO.TAB.FIELD.CREATIVEX
Found: Score = 85, URL = https://app.creativex.com/...
Store:
  INSERT INTO creativex_scores (
      filename, quality_score, creativex_url,
      tracking_id, status
  ) VALUES (
      'nutella_summer_hero.mp4', '85', 'https://...',
      'pOiJ9s', 'master-cx-score'
  );
Log: "Stored master CreativeX score: 85 (Tracking: pOiJ9s)"
```

**Step 12: Status Update**
```
Check: All 3 assets successful? YES
Update: Campaign status A1 → A2
Record: campaign_status table (prevents re-processing)
```

**Step 13: Email Notification**
```
To: ferrero-team@ferrero.com, operations@oliver.agency
Subject: ✅ Master Assets Downloaded - Campaign Summer_Promo
Body:
  - Campaign: Summer_Promo (C000000078)
  - Assets Downloaded: 3
  - Status Updated: A1 → A2

  Processed Assets:
  1. nutella_summer_hero.mp4
     Tracking ID: pOiJ9s
     Box URL: https://app.box.com/file/123456789
     CreativeX Score: 85 (master)

  2. nutella_summer_logo.png
     Tracking ID: a7K9mP
     Box URL: https://app.box.com/file/123456790

  3. nutella_eu_variant.mp4 (from Subfolder_EU/)
     Tracking ID: bvSGyd
     Box URL: https://app.box.com/file/123456791
```

**What If Something Fails?**

**Scenario: Network timeout on asset 2**
```
Asset 1: ✅ Success (pOiJ9s)
Asset 2: ❌ Network timeout
Asset 3: ⏭️ Not attempted (stop on error for safety)

Result:
- Status NOT updated (still A1)
- Email sent: "Partial Download - 1/3 successful"
- Next run (5 minutes): Retries asset 2 and 3
- Eventually all succeed → Status updated to A2
```

**Why Status Update is All-or-Nothing:**
- Prevents partial state in DAM
- Clear definition: A2 means "ALL masters in Box"
- Easier troubleshooting (either all done or not done)
- Safer for downstream processes

---

### Phase 3: Agency Localization

**Who:** Agency (Oliver) Creative Team
**Where:** Box + Naming Convention Tool
**Duration:** 1-5 days depending on complexity

**Steps:**

**Step 1: Receive Notification**
Agency receives email:
```
Subject: ✅ Master Assets Downloaded - Campaign Summer_Promo

Your master assets are ready for localization:
- nutella_summer_hero_pOiJ9s.mp4 (Box URL)
- nutella_summer_logo_a7K9mP.png (Box URL)
- nutella_eu_variant_bvSGyd.mp4 (Box URL)

Next steps:
1. Download masters from Box
2. Create localized versions
3. Use Naming Tool to generate filenames
4. Upload to Box A2→A3 folder: [URL]
```

**Step 2: Download Masters from Box**
Agency clicks Box URLs in email, downloads files

**Step 3: Create Localized Versions**
For each market/language:
- German version of hero video
- French version of hero video
- Italian version of hero video
- etc.

**Step 4: Use Naming Convention Tool**

**CRITICAL:** This is where the naming tool becomes essential!

Agency opens: `http://naming-tool-url/public-v2/index.php`

**For EACH localized file, agency enters:**
- OMG Job Number: `6487512` (from their project management)
- Brand: `NUT` (Nutella)
- Subject: `SUMMER` (from brief)
- Asset Type: `OLV` (Online Video) → Maps to "onlinevideodigitalvideo" in DAM
- Duration: `30` (30 seconds)
- Aspect Ratio: `16x9`
- Spot Version: `MST` (Master version)
- Country: `DE` (Germany)
- Language: `de` (German)
- Social Media: `IGF` (if Instagram Feed version)
- Tracking ID: `pOiJ9s` (from master asset filename)

**Tool Generates:**
```
6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
```

**Why Each Field Matters:**

| Field | Purpose | What Happens If Wrong |
|-------|---------|----------------------|
| **Job Number** | Links to OMG project billing | Can't track project costs |
| **Brand** | Metadata inheritance, organization | Wrong brand in DAM |
| **Subject** | Description field in DAM | Searchability issues |
| **Asset Type** | Critical for DAM categorization | Upload may fail or wrong type |
| **Duration** | Asset specifications | Optional, info only |
| **Aspect Ratio** | Asset specifications | Required for videos/images |
| **Spot Version** | Master vs Reference designation | Classification |
| **Country** | Market targeting | Wrong market metadata |
| **Language** | Language metadata in DAM | Wrong language = unusable |
| **Social Media** | Platform-specific metadata | Platform targeting |
| **Tracking ID** | Links to master (DATABASE LOOKUP) | **Upload fails, can't find master** |

**Step 5: Upload to Box**
- Agency uploads to Box folder: 348526703108
- Uses EXACT filename from naming tool
- One file at a time or batch upload

**What Automation Sees:**
```
Box folder poll finds:
- 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
- 6487512_NUT_SUMMER_OLV_30S_16x9_MST_FR_fr_IGF_pOiJ9s.mp4
- 6487512_NUT_SUMMER_OLV_30S_16x9_MST_IT_it_IGF_pOiJ9s.mp4

Validation:
✅ All filenames parse correctly
✅ All tracking IDs found in database
✅ Ready for upload
```

---

### Phase 4: Derivative Asset Upload (A2→A3) - AUTOMATED

**Who:** Python Script (a2_to_a3_upload_polling.py)
**Trigger:** Files appear in Box A2→A3 folder
**Frequency:** Every 5 minutes
**Duration:** 1-3 minutes per file

```mermaid
sequenceDiagram
    autonumber
    participant Agency
    participant Box as Box A2→A3 Folder
    participant Script as a2_to_a3_upload_polling.py
    participant Parser as Filename Parser V2.1
    participant DB as Database
    participant Mapper as Asset Type Mapper
    participant DAM
    participant Email

    Agency->>Box: Upload 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4

    Note over Script: Cron runs every 5 minutes
    Script->>Box: List files in folder
    Box-->>Script: Found 1 file

    Script->>Parser: Parse filename
    Parser->>Parser: Split by underscore
    Parser->>Parser: Position 1: Job = 6487512
    Parser->>Parser: Position 2: Brand = NUT
    Parser->>Parser: Position 3: Subject = SUMMER
    Parser->>Parser: Position 4: Asset = OLV
    Parser->>Parser: Position 5: Duration = 30S
    Parser->>Parser: Position 6: Ratio = 16x9
    Parser->>Parser: Position 7: Spot = MST
    Parser->>Parser: Position 8: Country = DE
    Parser->>Parser: Position 9: Language = de
    Parser->>Parser: Position 10: Social = IGF
    Parser->>Parser: Position 11: Tracking = pOiJ9s
    Parser-->>Script: Parsed components

    Script->>DB: SELECT * FROM master_assets WHERE tracking_id = 'pOiJ9s'
    DB-->>Script: Master metadata (brand, campaign, folders, etc.)

    Script->>DB: SELECT * FROM creativex_scores WHERE filename = '6487512_NUT...'

    alt CreativeX score found
        DB-->>Script: Score: 85, URL: https://creativex...
        Script->>Script: creativex_found = True
    else No score found
        Script->>Script: Use defaults (Score: 0, placeholder URL)
        Script->>Script: creativex_found = False
    end

    Script->>Script: Build asset representation
    Script->>Script: Merge master metadata + filename overrides
    Script->>Script: Override: Language = de (from filename)
    Script->>Script: Override: Description = SUMMER (from filename)

    Script->>Mapper: Map asset type: OLV
    Mapper-->>Script: onlinevideodigitalvideo
    Script->>Script: Set FERRERO.FIELD.MKTG.ASSET TYPE = onlinevideodigitalvideo

    Script->>Script: Set STATE = Local (forced)
    Script->>Script: Set AGENCY NAME = Oliver
    Script->>Script: Add CreativeX Score = 85
    Script->>Script: Add CreativeX URL = https://...

    Script->>Script: Strip filename: NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF.mp4

    Script->>DAM: Upload to Final Assets folder
    DAM-->>Script: Asset ID: xyz789

    Script->>DB: INSERT INTO derivative_assets
    Script->>Box: Delete processed file

    alt CreativeX was found
        Script->>Email: Success email (green, shows Score: 85)
    else CreativeX defaulted
        Script->>Email: Success email (orange warning box)
    end

    Email-->>Agency: "Asset uploaded successfully"
```

**Detailed Process:**

**Input File:**
```
6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
```

**Parsing (V2.1 Structure):**
```python
{
    'omg_job_number': '6487512',
    'brand_code': 'NUT',
    'subject_title': 'SUMMER',
    'asset_type': 'OLV',
    'seconds': '30',
    'aspect_ratio': '16x9',
    'spot_version': 'MST',
    'country_code': 'DE',
    'language_code': 'de',
    'social_media_version': 'IGF',
    'tracking_id': 'pOiJ9s',
    'tracking_mode': 'full'  # Full metadata inheritance
}
```

**Database Lookups:**

**Lookup 1: Master Asset**
```sql
SELECT * FROM master_assets
WHERE tracking_id = 'pOiJ9s'
```
Returns: Complete master metadata, upload folder ID, campaign info

**Lookup 2: CreativeX Score**
```sql
SELECT * FROM creativex_scores
WHERE filename = '6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4'
  AND status = 'active'
```
Returns: Score: 85, URL: https://app.creativex.com/... (or None if not found)

**Metadata Building:**

**From Master (Inherited):**
- Brand Name: "Nutella"
- Brand Code: "NUT"
- Campaign ID: "C000000078"
- Campaign Name: "Summer_Promo"
- Upload Folder: Final Assets folder ID
- All other metadata fields (fiscal year, tags, etc.)

**From Filename (Overrides):**
- Language: `de` (German) - OVERRIDES master
- Country: `DE` (Germany)
- Description: `SUMMER` - OVERRIDES master
- Asset Type: `OLV` → mapped to `onlinevideodigitalvideo`

**Forced Values:**
- STATE: `Local` (always for A2→A3)
- AGENCY NAME: `Oliver`

**From Database (CreativeX):**
- CreativeX Score: `85` (or `0` if not found)
- CreativeX URL: `https://app.creativex.com/...` (or placeholder)

**Asset Type Mapping:**
```
Filename has: OLV
Mapping: OLV → onlinevideodigitalvideo
DAM receives: "onlinevideodigitalvideo" (valid domain value)
```

**Filename Stripping:**
```
Input:  6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
Remove: Job number (6487512)
Remove: Tracking ID (pOiJ9s)
Output: NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF.mp4
```

**DAM Upload:**
```
Upload to: Campaign/02. Final Assets/
Filename: NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF.mp4
Metadata: 27 MVP fields populated
Result: Asset ID xyz789
```

**Database Record:**
```sql
INSERT INTO derivative_assets (
    tracking_id, derivative_filename, dam_asset_id
) VALUES (
    'pOiJ9s',
    'NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF',
    'xyz789'
);
```

**Email Notification:**
```
Subject: ✅ Asset Uploaded to DAM - NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF.mp4

Original Filename: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
Clean Filename: NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF.mp4
DAM Asset ID: xyz789
Tracking ID: pOiJ9s

What Was Done:
✅ Downloaded from Box
✅ Loaded master metadata (pOiJ9s)
✅ Built asset representation (27 fields)
✅ Updated Language from filename: de
✅ Updated Description: SUMMER
✅ Set STATE to Local
✅ CreativeX Score Added: 85 (from database)  ← If found
✅ Uploaded to DAM Final Assets folder
✅ Deleted from Box
```

**If CreativeX NOT Found:**
```
⚠️ CreativeX Score Missing
No score found for: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4

Default Values Used:
- Score: 0
- URL: https://app.creativex.com/preflight/pretests

To add CreativeX score: Upload PDF report to Box folder 350605024645
and run creativex_scoring_storing.py
```

---

### Phase 5: Approval & Distribution (A3→A4)

**Who:** Approvers (Ferrero stakeholders)
**Where:** DAM
**Duration:** 1-3 days

**Process:**
1. Assets reach A3 status after upload
2. Designated approvers review in DAM
3. Check quality, compliance, brand guidelines
4. Decision: Approve or Reject

**If Approved:**
- Status updated to A4
- Assets ready for distribution
- Campaign complete

**If Rejected:**
- Mark individual assets: `ECOMMERCE STATUS = "NOT APPROVED"`
- Add rejection comments (why rejected)
- Campaign status → A5
- Triggers rejection workflow (see Phase 6)

---

### Phase 6: Rejection & Rework Cycle (A5→A6→A2→A3)

**Who:** Python Script + Agency
**Trigger:** Campaign status = A5 + assets marked "NOT APPROVED"
**Duration:** 1-3 days for rework + 5 minutes for automation

```mermaid
flowchart TD
    A[Asset in A3] --> B{Approver Review}
    B -->|Approved| C[A4: Go Live ✅]
    B -->|Rejected| D[Mark: ECOMMERCE STATUS = NOT APPROVED]
    D --> E[Add Comments:<br/>Legal/IA&CC/Approver]
    E --> F[Campaign Status → A5]
    F -->|5 min| G[a5_to_a6_download.py detects A5]
    G --> H[Query: Get ONLY NOT APPROVED assets]
    H --> I{Asset has tracking ID?}
    I -->|Yes| J[Reuse existing tracking ID]
    I -->|No| K[Generate new tracking ID]
    J --> L[Download from DAM]
    K --> L
    L --> M[Upload to Box A5→A6 folder]
    M --> N[Email with rejection reasons]
    N --> O[Agency receives notification]
    O --> P[Agency reviews comments]
    P --> Q[Agency fixes issues]
    Q --> R[Use Naming Tool with SAME tracking ID]
    R --> S[Upload to Box A2→A3 folder]
    S -->|Standard A2→A3 flow| T[a2_to_a3 processes]
    T --> U[Upload to DAM]
    U --> V[Overwrites previous version]
    V --> W[Back to A3 status]
    W --> X{Approval round 2}
    X -->|Approved| Y[A4: Done ✅]
    X -->|Rejected again| Z[Back to A5]
    Z -.->|Cycle repeats| G

    style C fill:#90EE90
    style Y fill:#90EE90
    style D fill:#FFB6C1
    style Z fill:#FFB6C1
```

**A5→A6: Rejection Download (AUTOMATED)**

**Critical Filter:**
```python
# ONLY download assets with ECOMMERCE STATUS = "NOT APPROVED"
# Do NOT re-download approved assets
approved_assets_skipped = assets_with_status_approved
rejected_assets = assets_with_status_not_approved
```

**Tracking ID Reuse Logic:**
```python
# Critical: Use existing tracking ID to maintain link
result = db.find_or_create_tracking_id(
    opentext_id=asset_id,
    local_campaign_id=campaign_number
)

if result['is_existing']:
    tracking_id = result['tracking_id']  # Reuse existing
    logger.info("Reusing tracking ID for rework: {}".format(tracking_id))
else:
    tracking_id = result['tracking_id']  # New ID
    logger.info("New asset, generated tracking ID: {}".format(tracking_id))
```

**Why Tracking ID Reuse Matters:**
- Links rework to original master
- Preserves metadata inheritance
- Enables version tracking
- Maintains audit trail

**Rejection Comment Extraction:**
```python
# Extract 3 types of rejection comments
comments = {
    'approver': "Logo too small, add disclaimer",
    'legal': "Missing copyright notice",
    'ia_cc': "Brand colors not compliant"
}

# Included in email to agency
```

**Email Notification:**
```
Subject: ⚠️ NOT APPROVED Assets - Rework Required - Campaign Summer_Promo

Assets requiring rework: 1

Rejected Asset:
- NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF.mp4
  Tracking ID: pOiJ9s
  Box URL: https://app.box.com/file/999

  Rejection Comments:
  📝 Approver: "Logo too small, add disclaimer"
  ⚖️ Legal: "Missing copyright notice"

Next Steps:
1. Download from Box
2. Fix issues based on comments
3. Use Naming Tool with SAME tracking ID: pOiJ9s
4. Upload to A2→A3 folder: [URL]
```

**Rework Process:**

**Step 1: Agency Reviews Comments**
- Downloads rejected file from Box
- Reads rejection reasons
- Plans fixes

**Step 2: Agency Fixes Asset**
- Makes required changes
- Adds disclaimer
- Fixes logo size
- Adds copyright notice

**Step 3: Re-Upload with SAME Tracking ID**
**CRITICAL:** Must use the same tracking ID!

**Naming Tool Input:**
- OMG Job: `7654321` (NEW job number, different from original)
- Brand: `NUT`
- Subject: `SUMMER`
- Asset: `OLV`
- Duration: `30`
- Ratio: `16x9`
- Spot: `MST`
- Country: `DE`
- Language: `de`
- Social: `IGF`
- **Tracking ID: `pOiJ9s`** ← SAME AS ORIGINAL

**Generated:**
```
7654321_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
         ↑ Different job number (new project)
                                                    ↑ SAME tracking ID
```

**Step 4: A2→A3 Processes Rework**
- Parses new filename
- Finds same tracking ID: `pOiJ9s`
- Loads same master metadata
- Uploads to DAM
- **Overwrites** previous version in Final Assets
- Back to A3 for re-approval

**Rework Can Happen Multiple Times:**
```
Upload 1 → Rejected (logo issue)
Upload 2 → Rejected (missing disclaimer)
Upload 3 → Rejected (color compliance)
Upload 4 → Approved ✅
```

Each time uses the SAME tracking ID to maintain the link!

---

### Workflow 4: Global Master Campaigns (B1→B2)

**Purpose:** Distribution of global master content to regional teams

**Who:** Python Script (b1_to_b2_download.py)
**Trigger:** Campaign type = "Global comm" AND status = B1
**Frequency:** Every 5 minutes

```mermaid
flowchart TD
    A[Global Campaign Created] --> B[Type = Global comm]
    B --> C[Assets in Final Assets folder<br/>NOT Master Assets]
    C --> D[Status = B1]
    D -->|5 min| E[b1_to_b2_download.py runs]
    E --> F{Filter campaigns}
    F --> G[Type = Global comm<br/>Status = B1]
    G --> H[Download from Final Assets]
    H --> I[Generate tracking IDs]
    I --> J[Upload to Box folder 349261192115]
    J --> K[Folder name: MASTERS_CampaignName]
    K --> L[Store in database]
    L --> M[Email to team]
    M --> N[NO webhook sent]
    N --> O[Regional teams download from Box]
    O --> P[Use for regional adaptations]
```

**Key Differences from A1→A2:**

| Aspect | A1→A2 (Local) | B1→B2 (Global) |
|--------|---------------|----------------|
| **Campaign Type** | Local campaigns | "Global comm" only |
| **Source Folder** | 01. Master Assets | **02. Final Assets** |
| **Box Folder** | 348304357505 | **349261192115** |
| **Folder Naming** | C000000078-Campaign_Name | **MASTERS_**Campaign_Name |
| **Webhook** | Sent on A4 | **NOT sent** |
| **Purpose** | Localization workflow | Distribution to regions |
| **Next Step** | A2→A3 (agency localization) | Regional teams download |

**Example:**
```
Campaign: C000000099-Global_Nutella_2025
Type: Global comm
Status: B1
Source: 02. Final Assets/

Download:
- global_nutella_master_final.mp4
- global_nutella_logo_final.png

Upload to Box:
Folder: MASTERS_Global_Nutella_2025/
Files:
- global_nutella_master_final_xYz123.mp4
- global_nutella_logo_final_aBc456.png

Email: "Global master assets ready for distribution"
NO webhook sent (different from local campaigns)
```

---

## The Naming Convention Tool

### Location & Purpose

**Path:** `/Users/daveporter/Python-Enviroments/Ferrero-naming-convention/public-v2/`

**URL:** `http://naming-tool-url/public-v2/index.php`

**Purpose:** Generate compliant V2.1 filenames that:
- Parse correctly in automation
- Extract metadata accurately
- Link derivatives to masters via tracking IDs
- Enable CreativeX score matching
- Comply with DAM validation rules

### Why Naming is CRITICAL

**The Automation Depends On It:**

Every Python script relies on filename parsing:

```mermaid
graph LR
    A[Filename] --> B[Parser]
    B --> C{Valid?}
    C -->|Yes| D[Extract Metadata]
    C -->|No| E[Skip File]
    D --> F[Database Lookup]
    F --> G[Upload to DAM]
    E --> H[Stuck in Box]
    H --> I[Manual Intervention]
```

**What Gets Extracted from Filename:**

| Component | Used For | Impact if Wrong |
|-----------|----------|-----------------|
| **Job Number** | Billing, project tracking | Can't link to OMG project |
| **Brand Code** | Validation, filtering | Wrong brand in DAM |
| **Subject** | Description field | Wrong description |
| **Asset Type** | **DAM categorization** | **Upload fails or wrong type** |
| **Duration** | Asset specs | Info only, optional |
| **Aspect Ratio** | Asset specs | Required for validation |
| **Spot Version** | Master/Ref designation | Classification |
| **Country** | Market targeting | **Wrong market = unusable** |
| **Language** | **Language metadata** | **Wrong language = unusable** |
| **Social Media** | Platform targeting | Platform-specific metadata |
| **Tracking ID** | **Master link** | **Can't find master = UPLOAD FAILS** |

### V2.1 Filename Structure

**Format:**
```
[JOB]_[BRAND]_[SUBJECT]_[ASSET]_[DUR]_[RATIO]_[SPOT]_[COUNTRY]_[LANG]_[SOCIAL]_[TRACKING]
  1      2        3        4       5     6       7       8        9      10       11
```

**Field-by-Field Explanation:**

#### 1. Job Number (OMG Job)
- **Format:** 1-10 digits
- **Example:** `6487512`
- **Purpose:** Links to OMG project management system
- **Required:** Yes (for working files)
- **Stripped:** Yes (removed before DAM upload)
- **Why Important:** Billing, project tracking, client reporting

#### 2. Brand Code
- **Format:** 2-5 uppercase letters
- **Example:** `NUT` (Nutella), `RAF` (Raffaello), `BUEC` (Bueno Ice Cream)
- **Purpose:** Brand identification
- **Required:** Yes
- **Stripped:** No (kept in final filename)

#### 3. Subject Title
- **Format:** 1-15 characters, uppercase, hyphens allowed
- **Example:** `SUMMER`, `ME-MOMENT`, `SPREAD`, `TEST-E2E`
- **Purpose:** Becomes Description field in DAM
- **Required:** Yes
- **Stripped:** No
- **Validation:** Max 15 chars (warning if exceeded)

#### 4. Asset Type
- **Format:** Exactly 3 uppercase letters
- **Example:** `OLV`, `IMG`, `TVC`, `EHI`
- **Purpose:** Asset categorization in DAM
- **Required:** Yes
- **Mapped:** Yes - 3-letter code → DAM code (e.g., EHI → heroimage)
- **Full List:** 45 asset types (see asset_type_mappings.yaml)

**Common Asset Types:**
| Code | DAM Code | Description |
|------|----------|-------------|
| IMG | keyvisual | Image / Front of Pack |
| OLV | onlinevideodigitalvideo | Online Video |
| TVC | tvc | TV Commercial |
| EHI | heroimage | E-Comm Hero Image |
| EBS | beautyshot | E-Comm Beauty Shot |
| LOG | ferrerologo | Logo |

#### 5. Duration (Optional)
- **Format:** 1-3 digits + `S`
- **Example:** `6S`, `30S`, `120S`
- **Purpose:** Video length in seconds
- **Required:** No (omit for static assets like images)
- **Stripped:** No

#### 6. Aspect Ratio
- **Format:** Width `x` Height or Width `:` Height
- **Example:** `16x9`, `1x1`, `9x16`, `4x5`
- **Purpose:** Asset dimensions/format
- **Required:** Yes
- **Stripped:** No

#### 7. Spot Version (Optional)
- **Format:** `MST` or `REF` only
- **Example:** `MST` (Master), `REF` (Reference)
- **Purpose:** Designates master vs reference file
- **Required:** No
- **Stripped:** No

#### 8. Country Code
- **Format:** 2 uppercase letters (ISO 3166-1 alpha-2)
- **Example:** `DE` (Germany), `FR` (France), `IT` (Italy), `US` (USA)
- **Purpose:** Market targeting metadata
- **Required:** Yes
- **Stripped:** No

#### 9. Language Code
- **Format:** 2-3 lowercase letters (ISO 639)
- **Example:** `de` (German), `en` (English), `it` (Italian), `eng` (English)
- **Purpose:** Language metadata in DAM
- **Required:** Yes
- **Stripped:** No

#### 10. Social Media Version (Optional)
- **Format:** 3 uppercase letters
- **Example:** `FBP`, `FBR`, `IGF`, `IGR`
- **Purpose:** Platform-specific metadata
- **Required:** No (only for social media assets)
- **Stripped:** No

**Current Codes:**
| Code | Platform | Description |
|------|----------|-------------|
| FBP | Facebook | Facebook Post |
| FBR | Facebook | Facebook Reels |
| IGF | Instagram | Instagram Feed |
| IGR | Instagram | Instagram Reels |

**More codes will be added:** TikTok, YouTube Shorts, LinkedIn, etc.

#### 11. Tracking ID
- **Format:** 6 alphanumeric characters (case-sensitive)
- **Example:** `pOiJ9s`, `a7K9mP`, `bvSGyd`
- **Purpose:** Links derivative to master asset in database
- **Required:** Yes (for derivatives with metadata inheritance)
- **Stripped:** Yes (removed before DAM upload)
- **Special:** Can have `-N` suffix for "folder-only" mode

**Tracking ID Modes:**

**Standard Mode: `abc123`**
- Full metadata inheritance from master
- Override: Language and Asset Type from derivative
- Use for: Standard localizations

**Folder-Only Mode: `abc123-N`**
- ONLY upload folder from master
- NO metadata inheritance
- All metadata from derivative filename
- Use for: New assets in same campaign with different brand/metadata

**Example:**
```
Master: NUT_HERO_IMG_16x9_IT_it.png (tracking: abc123)
Derivative 1: 111_NUT_HERO_IMG_16x9_DE_de_abc123.png (inherits NUT brand)
Derivative 2: 222_RAF_PROMO_IMG_1x1_FR_fr_abc123-N.png (DIFFERENT brand RAF, uses folder only)
```

### Using the Naming Tool: Step-by-Step

**Step 1: Open the Tool**
```
URL: http://naming-tool-url/public-v2/index.php
```

**Step 2: Select Asset Type**
Dropdown shows all 45 asset types with descriptions:
- E-COMM: Hero Image (EHI)
- Online Video (OLV)
- TV Commercial (TVC)
- etc.

**Step 3: Fill Required Fields**
```
OMG Job Number: [6487512]
Brand: [NUT - Nutella] (dropdown)
Subject Title: [SUMMER] (15 char max)
Country: [DE - Germany] (dropdown)
Language: [de - German] (dropdown)
```

**Step 4: Fill Optional Fields (if applicable)**
```
Duration: [30] seconds (for videos only)
Aspect Ratio: [16x9] (required)
Spot Version: [MST] (dropdown: MST/REF or leave empty)
Social Media: [IGF] (dropdown: FBP/FBR/IGF/IGR or leave empty)
```

**Step 5: Enter Tracking ID**
**CRITICAL STEP:**
```
Tracking ID: [pOiJ9s]
```

**How to Get Tracking ID:**
- From A1→A2 email notification
- From Box filename (last component before extension)
- From database query
- **IMPORTANT:** Copy exactly, case-sensitive!

**Step 6: Generate Filename**
- Click "Generate Filename" button
- Tool validates all fields
- Shows any errors in red
- If valid: Shows generated filename

**Step 7: Copy Filename**
- Click "Copy to Clipboard" button
- Paste EXACTLY when saving localized file
- **Do NOT modify** - use exact generated name

**Generated Example:**
```
6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
```

### Tool Validation Rules

**The tool validates:**
- ✅ Job number is digits only
- ✅ Brand exists in domain
- ✅ Subject ≤ 15 characters
- ✅ Asset type is valid 3-letter code
- ✅ Duration format is correct (if provided)
- ✅ Aspect ratio format is valid
- ✅ Country is valid ISO code
- ✅ Language is valid ISO code
- ✅ Social media code is recognized (if provided)
- ✅ Tracking ID is exactly 6 characters

**Validation Errors Shown:**
- 🔴 "Brand code must be 2-5 characters"
- 🔴 "Subject title exceeds 15 character limit"
- 🔴 "Invalid tracking ID (must be 6 alphanumeric)"
- 🔴 "Country code must be 2 characters"

### What Happens When Naming is Wrong

**The Domino Effect of Bad Filenames:**

```mermaid
flowchart TD
    A[Agency uploads file with wrong name] --> B{Filename parses?}
    B -->|No| C[A2→A3 skips file]
    C --> D[File stuck in Box forever]
    D --> E[Agency wonders why not uploaded]
    E --> F[Email sent: File invalid]
    F --> G[Manual intervention required]
    G --> H[Download from Box]
    H --> I[Rename correctly]
    I --> J[Re-upload]
    J --> K[Finally processes]

    B -->|Yes, but wrong tracking ID| L[Database lookup fails]
    L --> M[Error: No master found]
    M --> D

    B -->|Yes, but wrong asset type| N[Asset type mapping fails]
    N --> O[DAM rejects upload]
    O --> D

    B -->|Yes, but wrong country/language| P[Wrong metadata in DAM]
    P --> Q[Asset uploaded but unusable]
    Q --> R[Manual DAM cleanup needed]

    style D fill:#FFB6C1
    style Q fill:#FFB6C1
    style K fill:#90EE90
```

**Real Example: Wrong Field Order (Old V1 Format)**

❌ **Incorrect (Old V1 position):**
```
6487512_NUT_DE_de_SUMMER_OLV_30S_16x9_MST_pOiJ9s.mp4
         Position: 3  4    5      6
```

**What Parser Sees:**
```
Country: DE ✅ (but actually in position 3 now)
Language: de ✅ (but actually in position 4 now)
Subject: SUMMER ❌ (parser thinks this is position 5, but it's now 3)
Asset: OLV ❌ (parser thinks this is position 6, but it's now 4)
```

**Result:**
```
Parsed incorrectly:
- Subject: "DE" ❌ (wrong - it's the country!)
- Asset Type: "de" ❌ (wrong - it's the language!)
- Country: "SUMMER" ❌ (wrong - it's the subject!)
- Language: "OLV" ❌ (wrong - it's the asset type!)

Validation: FAILS
File: SKIPPED
Status: Stuck in Box
```

✅ **Correct (New V2.1 position):**
```
6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mp4
         Position: 3      4
```

**What Parser Sees:**
```
Subject: SUMMER ✅
Asset: OLV ✅
Country: DE ✅ (now in position 8)
Language: de ✅ (now in position 9)
```

**Result:**
```
Parsed correctly:
- All fields in right positions
- Validation: PASSES
- Upload: SUCCEEDS
```

**Real Example: Missing Tracking ID**

❌ **Missing tracking ID:**
```
6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de.mp4
```

**What Happens:**
```
Parse: ✅ Succeeds (structure valid)
Tracking ID: None
Database lookup: Skip (no tracking ID)
Error: "No master asset for tracking ID: None"
Upload: FAILS
Email: "Processing failed - No master asset"
File: Remains in Box
```

**Real Example: Typo in Tracking ID**

❌ **Wrong tracking ID (typo):**
```
6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOlJ9s.mp4
                                                  ↑ Should be 'i' not 'l'
Correct ID: pOiJ9s
Typo ID:    pOlJ9s
```

**What Happens:**
```
Parse: ✅ Succeeds
Tracking ID extracted: pOlJ9s
Database lookup: SELECT * WHERE tracking_id = 'pOlJ9s'
Result: NO ROWS (tracking ID doesn't exist)
Error: "No master asset for tracking ID: pOlJ9s"
Upload: FAILS
```

**Real Example: Lowercase Country Code**

❌ **Lowercase country (parser misidentifies):**
```
6487512_NUT_SUMMER_OLV_30S_16x9_MST_de_de_pOiJ9s.mp4
                                         ↑ Should be "DE" (upper)
```

**What Parser Thinks:**
```
Looking for country (2 uppercase alpha)...
Found: "de" - but it's lowercase!
Interpreted as: Language code (2-3 lowercase alpha)
Result:
- Country: NOT FOUND (missing required field)
- Language: "de" ✅ (correct)
- But TWO language codes parsed (de and de again)

Validation: May fail or produce unexpected results
```

---

## Python Automation Scripts

### Execution Schedule (Cron)

**Production Crontab:**
```cron
# Workflows run every 5 minutes
*/5 * * * * a1_to_a2_download.py >> logs/cron_a1_a2.log 2>&1
*/5 * * * * a5_to_a6_download.py >> logs/cron_a5_a6.log 2>&1
*/5 * * * * b1_to_b2_download.py >> logs/cron_b1_b2.log 2>&1
*/5 * * * * a2_to_a3_upload_polling.py >> logs/cron_a2_a3.log 2>&1

# Reports run daily at 7 PM
0 19 * * * daily_report.py >> logs/daily_report.log 2>&1

# Backups run at 2 AM daily, 3 AM Sundays
0 2 * * * database/backup.sh --daily >> logs/backup.log 2>&1
0 3 * * 0 database/backup.sh --weekly >> logs/backup.log 2>&1
```

**Why Every 5 Minutes?**
- Fast response time for users
- Manageable load on systems
- Quick recovery from temporary failures
- Users see results within minutes of upload

### Script Dependencies

```mermaid
graph TB
    subgraph Workflows["Workflow Scripts"]
        A1A2[a1_to_a2_download.py]
        A2A3[a2_to_a3_upload_polling.py]
        A5A6[a5_to_a6_download.py]
        B1B2[b1_to_b2_download.py]
        CX[creativex_scoring_storing.py]
    end

    subgraph Shared["Shared Modules"]
        DAM[dam_client.py<br/>DAM API + OAuth2/mTLS]
        BOX[box_client.py<br/>Box JWT + Upload/Download]
        DB[database.py<br/>PostgreSQL Pool]
        NOT[notifier.py<br/>Email + Webhooks]
        PARSER[filename_parser.py<br/>V2.1 Parsing]
        MVP[metadata_extractor_mvp.py<br/>Field Mapping]
        CONFIG[config_loader.py<br/>YAML + Env Vars]
    end

    A1A2 --> DAM
    A1A2 --> BOX
    A1A2 --> DB
    A1A2 --> NOT
    A1A2 --> CONFIG

    A2A3 --> DAM
    A2A3 --> BOX
    A2A3 --> DB
    A2A3 --> NOT
    A2A3 --> PARSER
    A2A3 --> MVP
    A2A3 --> CONFIG

    A5A6 --> DAM
    A5A6 --> BOX
    A5A6 --> DB
    A5A6 --> NOT
    A5A6 --> CONFIG

    B1B2 --> DAM
    B1B2 --> BOX
    B1B2 --> DB
    B1B2 --> NOT
    B1B2 --> CONFIG

    CX --> BOX
    CX --> DB
    CX --> NOT
    CX --> CONFIG
```

### Shared Modules Explained

#### dam_client.py - DAM API Client
**Purpose:** All communication with OpenText DAM

**Features:**
- OAuth2 authentication (default, auto-refresh tokens)
- mTLS certificate authentication (optional, with --auth-pfx flag)
- Campaign queries (get campaigns by status)
- Asset operations (download, upload, metadata)
- Recursive folder search (includes subfolders)
- Status updates (A1→A2, A5→A6, etc.)

**Key Methods:**
```python
# Get campaigns by status
campaigns = dam.get_campaigns_by_status('A1')

# Download asset
file_path = dam.download_asset(asset_id, output_dir)

# Upload asset with metadata
result = dam.upload_asset(file_path, folder_id, asset_representation)

# Update campaign status
dam.update_campaign_status(campaign_id, 'A2')
```

#### box_client.py - Box.com Integration
**Purpose:** File transfer via Box

**Features:**
- JWT authentication (uses Box-config.json)
- Create campaign folders
- Upload with tracking IDs
- Download files
- List folders (recursive)
- Preserve subfolder structure
- Delete files after processing

**Key Methods:**
```python
# Upload with tracking ID
result = box.upload_with_tracking_id(
    file_path, campaign_id, campaign_name, tracking_id,
    subfolder_path="Europe/Germany"  # Preserves structure
)

# List folder files recursively
files = box.list_folder_files_recursive(folder_id)

# Download file
box.download_file(file_id, output_path)
```

#### database.py - PostgreSQL Operations
**Purpose:** All database interactions

**Features:**
- Connection pooling (1-10 connections)
- Tracking ID generation (unique 6-char)
- Master asset storage (with JSONB metadata)
- Derivative asset tracking
- CreativeX score storage (3 statuses)
- Campaign status tracking
- Find-or-create for rework workflow

**Key Methods:**
```python
# Generate unique tracking ID
tracking_id = db.generate_unique_tracking_id()  # Returns: "pOiJ9s"

# Find or reuse tracking ID (for A5→A6 rework)
result = db.find_or_create_tracking_id(opentext_id, campaign_id)
# Returns: {'tracking_id': 'pOiJ9s', 'is_existing': True}

# Store master asset
db.store_master_asset(
    tracking_id, opentext_id, asset_data,
    box_file_id, box_url, upload_folder_id
)

# Get master asset
master = db.get_master_asset('pOiJ9s')
# Returns full metadata for inheritance

# Store CreativeX score
db.store_creativex_score(
    filename, creativex_id, url, score,
    box_file_id, full_json,
    tracking_id='pOiJ9s',      # Optional, for master scores
    status='master-cx-score'   # Or 'active' for derivatives
)

# Get CreativeX score
score = db.get_creativex_score_by_filename('file.mp4')
# Returns latest active score or None
```

#### filename_parser.py - V2.1 Parser
**Purpose:** Parse filenames into components

**Features:**
- V2.1 structure support (new positions)
- Pattern-based detection (flexible)
- Social media code support
- Tracking ID mode detection (-N suffix)
- Validation with error messages
- Strip upload components (job + tracking)

**Key Methods:**
```python
# Parse filename
parsed = parser.parse_filename('6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4')

# Returns:
{
    'omg_job_number': '6487512',
    'brand_code': 'NUT',
    'subject_title': 'SUMMER',
    'asset_type': 'OLV',
    'seconds': '30',
    'aspect_ratio': '16x9',
    'spot_version': 'MST',
    'country_code': 'DE',
    'language_code': 'de',
    'social_media_version': 'IGF',
    'tracking_id': 'pOiJ9s',
    'tracking_mode': 'full',
    'is_valid': True,
    'validation_errors': []
}

# Strip upload components
clean = parser.strip_upload_components('6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4')
# Returns: 'NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF.mp4'
```

#### metadata_extractor_mvp.py - Field Mapping
**Purpose:** Build asset representations for DAM upload

**Features:**
- Extract 30 MVP fields from master metadata
- Update fields from filename (configurable)
- Map asset types (3-letter → DAM codes)
- Map country codes (ISO → DAM codes)
- Force values (STATE = Local)
- Add missing fields with defaults
- CreativeX field handling

**Key Methods:**
```python
# Build asset representation
asset_rep = mvp_extractor.build_mvp_asset_representation(
    master_metadata=master['full_metadata'],
    clean_filename='NUT_SUMMER_OLV_30S_16x9_MST_DE_de.mp4',
    parsed_filename=parsed,
    box_metadata={'score': '85', 'url': 'https://...'},
    tracking_mode='full'  # or 'folder_only' for -N suffix
)

# Returns: List of 27 metadata fields ready for DAM upload
```

**Field Mapping Logic:**
```python
# From master metadata (inherited):
- Brand Name
- Brand Code
- Campaign references
- Fiscal Year
- All Marketing fields (Director, Production House, etc.)
- All Market fields (IP Rights, Licensing, etc.)

# From filename (overrides):
- Asset Name = clean filename
- Description = subject_title
- Asset Type = mapped from 3-letter code (OLV → onlinevideodigitalvideo)
- Language = language_code (from filename, uppercase)

# Forced values:
- STATE = "Local" (always)
- AGENCY NAME = "Oliver"

# From database (CreativeX):
- CreativeX Score = looked up by filename
- CreativeX URL = looked up by filename
```

---

## CreativeX Integration

### What is CreativeX?

CreativeX is an AI-powered creative quality scoring platform that analyzes advertising content against industry best practices and brand guidelines. It provides:
- Quality scores (0-100)
- Guideline compliance analysis
- Actionable improvement recommendations
- Benchmark comparisons

**Why We Use It:**
- Objective quality measurement
- Consistent scoring across markets
- Early identification of issues
- Performance prediction
- Continuous improvement tracking

### Two Sources of CreativeX Scores

```mermaid
flowchart TB
    subgraph Master["Master Asset Score (Reference Only)"]
        M1[Master Asset in DAM] -->|May have score in metadata| M2[A1→A2 Script]
        M2 -->|Extract if present| M3[Store in Database]
        M3 -->|status='master-cx-score'| M4[Linked by tracking_id]
        M4 --> M5[Used for:<br/>- Analytics<br/>- Comparison<br/>- Reporting]
        M5 -.->|NOT used for| M6[❌ NOT used in A2→A3 uploads]
    end

    subgraph Derivative["Derivative Asset Score (Used in Uploads)"]
        D1[Agency gets localized asset scored] --> D2[CreativeX generates PDF report]
        D2 -->|PDF contains: filename, score, URL| D3[Upload PDF to Box 350605024645]
        D3 -->|Manual/automated| D4[creativex_scoring_storing.py]
        D4 -->|LlamaExtract AI parses PDF| D5[Extract: filename, score, URL, ID]
        D5 -->|Store in Database| D6[status='active']
        D6 -->|Matched by exact filename| D7[A2→A3 Upload]
        D7 --> D8[✅ USED in DAM upload]
    end

    style M6 fill:#FFB6C1
    style D8 fill:#90EE90
```

### 🔴 CRITICAL REQUIREMENT: Every Derivative Needs Its Own CreativeX Score

**THIS IS NOT OPTIONAL - IT IS MANDATORY**

#### The Rule: One Asset = One Score

```
1 Master Asset → 10 Localized Derivatives = 10 CreativeX Scores Required
1 Master Asset → 200 Localized Derivatives = 200 CreativeX Scores Required
```

**Why EVERY Derivative Needs Scoring:**

1. **Each localization is different creative execution**
   - German voice-over has different performance than French
   - Italian subtitle placement may affect engagement differently
   - UK cultural references differ from US references
   - Each market's execution quality varies

2. **Master score does NOT apply to derivatives**
   - Master: Global concept score (reference only)
   - Derivative: Localized execution score (what matters)
   - Example:
     - Master (English): Score 85 (good concept)
     - German derivative: Score 71 (translation issues)
     - French derivative: Score 78 (better execution)
     - Italian derivative: Score 82 (excellent localization)

3. **CreativeX scores are asset-specific, not campaign-specific**
   - Each file gets analyzed individually
   - Different text overlays → different score
   - Different pacing → different score
   - Different cultural adaptations → different score

4. **Reporting requires individual scores**
   - Which markets perform best?
   - Which localizations need improvement?
   - ROI tracking by market
   - Can't answer these without per-asset scores

#### The Math

**Example Campaign:**
```
1 Master Asset (nutella_hero_master.mp4)
↓
Localized for 20 markets × 2 languages each = 40 derivatives
Plus: 3 aspect ratios (16x9, 1x1, 9x16) = 40 × 3 = 120 versions
Plus: 2 platforms (standard + Instagram) = 120 × 2 = 240 derivatives

REQUIRED: 240 CreativeX PDF reports
REQUIRED: 240 entries in creativex_scores database table
```

**If you create 200 assets, you need 200 CreativeX scores. Period.**

#### What Happens If You Don't Score All Assets

**Scenario: Campaign with 50 derivatives, only 30 scored**

```
30 derivatives:
- Have PDF reports
- Scores in database (71-85 range)
- Upload with actual CreativeX data ✅

20 derivatives:
- NO PDF reports
- NOT in database
- Upload with Score=0, placeholder URL ❌
- Email shows orange warning for each

Result in DAM:
- 30 assets: Properly scored (useful for optimization)
- 20 assets: Score=0 (meaningless, looks bad)

Client sees:
- Inconsistent quality reporting
- 40% of assets show "not scored"
- Questions quality control process
```

#### The Workflow: Scoring BEFORE Upload

**CORRECT Process:**

```mermaid
sequenceDiagram
    autonumber
    participant Agency
    participant CX as CreativeX Platform
    participant Box as Box Folder 350605024645
    participant Script as creativex_scoring_storing.py
    participant DB as Database
    participant Upload as A2→A3 Upload

    Note over Agency: For EACH derivative created
    Agency->>Agency: Create localized asset
    Agency->>CX: Submit to CreativeX for analysis
    CX->>CX: AI analyzes creative quality
    CX-->>Agency: PDF scorecard generated
    Agency->>Box: Upload PDF to folder 350605024645
    Box->>Script: PDF detected (manual or cron)
    Script->>Script: Extract filename, score, URL
    Script->>DB: Store with status='active'
    Note over DB: Score now available for this asset
    Agency->>Upload: Upload derivative to A2→A3 folder
    Upload->>DB: Lookup score by filename
    DB-->>Upload: Score=71, URL=https://...
    Upload->>Upload: Attach score to DAM upload ✅
```

**Timeline Example (50 derivatives):**
```
Week 1:
- Create 50 localized assets
- Submit all 50 to CreativeX platform
- Receive 50 PDF reports

Week 2:
- Upload all 50 PDFs to Box folder 350605024645
- Run creativex_scoring_storing.py (or automated)
- Verify all 50 in database:
  SELECT COUNT(*) FROM creativex_scores WHERE status='active'
  Result: 50 ✅

- Upload all 50 derivatives to A2→A3 folder
- All 50 process with actual scores
- All 50 emails show "CreativeX Score Added: [score]"
```

#### Common Mistake: "I'll Score the Important Ones Only"

**The Thinking:**
"We have 200 derivatives. I'll only score the hero assets (20 of them). The rest can use score=0."

**Why This Fails:**

1. **Client expectations:**
   - Contract may require all assets scored
   - Quality standards apply to ALL deliverables
   - Score=0 looks unprofessional

2. **Reporting incomplete:**
   - Can't analyze which markets perform best (only have 10% of data)
   - Can't identify improvement opportunities
   - ROI tracking impossible

3. **Selective scoring = inconsistent quality:**
   - Some assets optimized, others not
   - Performance disparity
   - Brand inconsistency

4. **The system defaults to Score=0:**
   - Not "unscored" or "pending"
   - Actively shows ZERO
   - Implies poor quality (even if asset is good)

**Reality:**
If you create 200 assets, budget for 200 CreativeX analyses. Factor this into:
- Project timeline
- Project budget
- Resource allocation
- Agency scope of work

#### Batch Processing CreativeX

**For Large Campaigns (100+ derivatives):**

**Week-by-Week Approach:**
```
Week 1: Create + score derivatives 1-25
  - Upload 25 PDFs
  - Run scoring script
  - Upload 25 derivatives
  - All get actual scores

Week 2: Create + score derivatives 26-50
  - Upload 25 PDFs
  - Run scoring script
  - Upload 25 derivatives
  - All get actual scores

... continue weekly
```

**NOT Recommended:**
```
Week 1-4: Create all 100 derivatives
Week 5: Upload all 100 derivatives (no scores yet)
Week 6: Start scoring (too late, already uploaded)

Result: All 100 uploaded with score=0
Then manual DAM updates needed
```

#### Exceptions to the Rule

**When Score=0 is Acceptable:**

1. **Internal reference assets** (not client-facing)
2. **Template files** (not final deliverables)
3. **Work-in-progress** (will be scored before final delivery)
4. **Non-creative assets** (legal documents, spreadsheets)

**But for client deliverables: ALL must be scored.**

#### How to Verify All Assets Are Scored

**Before Mass Upload:**
```bash
# Count derivatives you created
ls derivatives/ | wc -l
Result: 200 files

# Count scores in database
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT COUNT(*) FROM creativex_scores WHERE status = 'active';
"
Result: Should be 200 (or close to it)

# List any files without scores (after upload)
# Compare uploaded filenames to database entries
```

**After Upload:**
```bash
# Check how many used defaults
grep "CreativeX Score Missing" logs/a2_to_a3.log | wc -l

# If this number is high → many derivatives uploaded without scores
```

### CreativeX PDF Extraction Workflow

**Step 1: CreativeX Generates PDF Report**
- Agency submits asset to CreativeX platform
- CreativeX AI analyzes content
- Generates PDF scorecard
- PDF contains:
  - Filename: `6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4`
  - CreativeX ID: `6864308`
  - CreativeX URL: `https://app.creativex.com/audit/scorecards/32456797`
  - Quality Score: `71` (out of 100)
  - Guidelines breakdown
  - Recommendations

**Step 2: Upload PDF to Box**
- Agency uploads PDF to Box folder: `350605024645`
- PDF can be named anything (e.g., `nutella_summer_de_report.pdf`)
- What matters is the **filename field INSIDE the PDF**

**Step 3: Run CreativeX Scoring Script**
```bash
# Manual execution (or can be added to cron)
python scripts/creativex_scoring_storing.py
```

**Script Process:**
```mermaid
sequenceDiagram
    autonumber
    participant Box as Box Folder 350605024645
    participant Script as creativex_scoring_storing.py
    participant Llama as LlamaExtract AI
    participant DB as Database
    participant Email

    Script->>Box: List PDF files
    Box-->>Script: Found 1 PDF

    Script->>Box: Download PDF
    Box-->>Script: PDF file

    Script->>Llama: Extract data using agent "Creativex-Extract"
    Llama->>Llama: AI parses PDF structure
    Llama->>Llama: Find filename field
    Llama->>Llama: Extract score, URL, ID
    Llama-->>Script: JSON response

    Script->>Script: Parse JSON
    Script->>Script: Get filename: 6487512_NUT_SUMMER_OLV..._pOiJ9s.mp4
    Script->>Script: Get score: 71
    Script->>Script: Remove .0 decimals: 71.0 → 71

    Script->>DB: Check if filename exists (status='active')

    alt Filename already exists
        DB-->>Script: Found existing (version 1)
        Script->>DB: Mark old as status='superseded'
        Script->>DB: Insert new with status='active', version 2
    else New filename
        Script->>DB: Insert with status='active', version 1
    end

    Script->>Box: Delete PDF file
    Script->>Email: Send success notification
    Email-->>Team: "1 file processed, Score: 71 (Version 2)"
```

**Database Storage:**
```sql
INSERT INTO creativex_scores (
    filename, creativex_id, creativex_url, quality_score,
    box_file_id, full_extraction_data, status
) VALUES (
    '6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4',
    '6864308',
    'https://app.creativex.com/audit/scorecards/32456797',
    '71',
    'box_file_id_123',
    '{"data": {...full JSON from LlamaExtract...}}',
    'active'
);
```

**Version Tracking:**
If same filename is scored again:
```
Version 1: Score 71, status='superseded' (old)
Version 2: Score 75, status='active' (current)
Version 3: Score 80, status='active' (latest)
```

Email shows: "Score: 80 (Version 3)"

**Step 4: A2→A3 Uses Score Automatically**

When agency uploads derivative:
```
Filename: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4
Lookup: SELECT * FROM creativex_scores WHERE filename = '...' AND status = 'active'
Found: Score 71, URL https://...
Upload: Includes CreativeX metadata
```

**If No Score Found:**
```
Lookup: Returns None
Default: Score = 0, URL = https://app.creativex.com/preflight/pretests
Upload: Succeeds with defaults
Email: Shows orange warning "CreativeX Score Missing"
```

### Master vs Derivative Scores

**Master Score (status='master-cx-score'):**
- Extracted during A1→A2 from DAM metadata
- Linked by tracking_id
- Purpose: Reference, analytics, comparison
- **NOT used in A2→A3 uploads**
- Query: `WHERE tracking_id = 'pOiJ9s' AND status = 'master-cx-score'`

**Derivative Score (status='active'):**
- Extracted from PDF via LlamaExtract
- Matched by exact filename (with job + tracking)
- Purpose: Attached to DAM upload
- **USED in A2→A3 uploads**
- Query: `WHERE filename = 'exact_name.mp4' AND status = 'active'`

**Why Separate?**
- Each localized asset gets its own score
- Derivative scores often differ from master
- Master: Global creative concept score
- Derivative: Localized execution score
- Need to track both for analytics

**Example:**
```
Master: nutella_summer_hero.mp4
  - Tracking ID: pOiJ9s
  - Master Score: 85 (general creative quality)

Derivatives:
  - German version: Score 71 (localization quality)
  - French version: Score 78 (localization quality)
  - Italian version: Score 82 (localization quality)

Analytics can show:
- Master scored 85
- Average derivative: 77
- Italian performed best (+3 from average)
```

---

*[Document continues with remaining sections...]*

**Note:** This is approximately 20% of the complete guide. The full document will include:
- Detailed workflows for all scripts
- Common mistakes with 20+ real examples
- Troubleshooting guide with solutions
- Database schema with ERD
- Monitoring procedures
- Glossary and FAQ
- Complete reference tables

**Estimated final length:** 60-70 pages

Would you like me to continue building out the remaining sections?

---

## The Database

### Database Architecture

The PostgreSQL database is the **central nervous system** of the automation. It stores:
- Master asset metadata and tracking IDs
- Derivative asset records
- CreativeX scores (master and derivative)
- Campaign status tracking
- Complete audit trail

```mermaid
erDiagram
    MASTER_ASSETS ||--o{ DERIVATIVE_ASSETS : "has many via tracking_id"
    MASTER_ASSETS ||--o{ CREATIVEX_SCORES : "has master score via tracking_id"
    MASTER_ASSETS ||--o{ ASSET_EVENTS : "has events"
    CAMPAIGN_STATUS ||--o{ MASTER_ASSETS : "contains"

    MASTER_ASSETS {
        varchar tracking_id PK "6-char unique ID"
        varchar opentext_id "DAM asset ID"
        varchar original_filename "Original name"
        jsonb full_metadata "Complete DAM metadata"
        varchar upload_directory "Final Assets folder ID"
        varchar global_master_campaign_id "Global campaign ref"
        varchar local_campaign_id "Local campaign ID"
        timestamp created_at
    }

    DERIVATIVE_ASSETS {
        int id PK
        varchar tracking_id FK "Links to master"
        varchar derivative_filename "Clean filename"
        varchar dam_asset_id "Uploaded asset ID"
        varchar upload_status "pending/completed"
        timestamp uploaded_at
    }

    CREATIVEX_SCORES {
        int id PK
        varchar filename "Exact filename for matching"
        varchar creativex_id "CreativeX scorecard ID"
        varchar creativex_url "CreativeX URL"
        varchar quality_score "Score 0-100"
        varchar tracking_id "Links to master (for master scores)"
        varchar status "active/superseded/master-cx-score"
        jsonb full_extraction_data "Complete extraction JSON"
        timestamp extracted_at
    }

    CAMPAIGN_STATUS {
        int id PK
        varchar campaign_id UK "DAM campaign folder ID"
        varchar campaign_number "C000000078"
        varchar status "A1/A2/A3/A4/A5/B1/B2"
        boolean webhook_sent "Prevents duplicates"
        timestamp webhook_sent_at
    }

    ASSET_EVENTS {
        int id PK
        varchar tracking_id "Related asset"
        varchar event_type "created/updated/uploaded"
        jsonb event_data "Event details"
        timestamp event_timestamp
    }
```

### Table Purposes

#### master_assets (35 columns)
**Purpose:** Original master assets from A1→A2 download

**Key Fields:**
- `tracking_id` - Unique 6-character identifier
- `opentext_id` - DAM asset ID (hex string)
- `original_filename` - Name without extension
- `full_metadata` - **JSONB column with COMPLETE DAM metadata**
- `upload_directory` - Where derivatives upload in DAM
- `global_master_campaign_id` - Reference to global master
- `local_campaign_id` - Local campaign this belongs to

**Why JSONB for full_metadata?**
- No truncation (can be 100KB+ of metadata)
- Queryable with PostgreSQL JSONB operators
- Preserves exact structure from DAM
- Future-proof (new fields don't break schema)

**Example Query:**
```sql
-- Find all assets for a campaign
SELECT tracking_id, original_filename
FROM master_assets
WHERE local_campaign_id = 'C000000078'
  AND status = 'active';

-- Query within JSONB metadata
SELECT tracking_id,
       full_metadata->'metadata'->'id' as metadata_model
FROM master_assets
WHERE tracking_id = 'pOiJ9s';
```

#### derivative_assets
**Purpose:** Track localized assets uploaded via A2→A3

**Key Fields:**
- `tracking_id` - Links to master_assets
- `derivative_filename` - Clean filename (no job/tracking)
- `dam_asset_id` - Asset ID in DAM after upload
- `upload_status` - pending/completed/failed

**Why This Table?**
- Track which derivatives uploaded
- Link back to master
- Detect when ALL derivatives done
- Audit trail of uploads

#### creativex_scores
**Purpose:** Store quality scores from two sources

**Three Status Values:**

**1. status='active' (Derivative Scores)**
- Source: PDF extraction via LlamaExtract
- Matched by: Exact filename (with job + tracking)
- Used in: A2→A3 uploads
- Versioned: Yes (old = 'superseded')
- Example: `6487512_NUT_SUMMER_OLV..._pOiJ9s.mp4`, score=71

**2. status='superseded' (Old Versions)**
- Source: PDF extraction (previous version)
- Purpose: History/audit trail
- Not used in uploads
- Example: Same filename, score=68 (old)

**3. status='master-cx-score' (Master Reference)**
- Source: A1→A2 extraction from DAM metadata
- Matched by: tracking_id
- Used in: Analytics/reporting ONLY, never uploads
- Not versioned
- Example: `nutella_summer_hero.mp4`, tracking=pOiJ9s, score=85

**Query Examples:**
```sql
-- Get derivative score for upload (A2→A3 uses this)
SELECT quality_score, creativex_url
FROM creativex_scores
WHERE filename = '6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_IGF_pOiJ9s.mp4'
  AND status = 'active';

-- Get master score for reporting
SELECT quality_score, creativex_url
FROM creativex_scores
WHERE tracking_id = 'pOiJ9s'
  AND status = 'master-cx-score';

-- Compare master vs derivative scores
SELECT
    cs_master.quality_score as master_score,
    cs_deriv.quality_score as derivative_score,
    cs_deriv.filename as derivative_name
FROM creativex_scores cs_master
JOIN master_assets ma ON cs_master.tracking_id = ma.tracking_id
JOIN creativex_scores cs_deriv ON cs_deriv.filename LIKE '%' || ma.tracking_id || '%'
WHERE cs_master.status = 'master-cx-score'
  AND cs_deriv.status = 'active';
```

#### campaign_status
**Purpose:** Prevent duplicate webhook sends

**Problem It Solves:**
Without this table, if A2→A3 runs multiple times, it might send the "campaign complete" webhook multiple times to Make.com, causing duplicate downstream actions.

**How It Works:**
```python
# Check if webhook already sent
existing = db.check_campaign_processed(campaign_id)

if existing['webhook_sent']:
    logger.info("Webhook already sent for this campaign - skipping")
    return

# Send webhook
send_webhook(campaign_data)

# Record that webhook was sent
db.record_campaign_status(
    campaign_id, campaign_number, campaign_name,
    live_campaign='YES', status='A3', webhook_sent=True
)
```

### Backup Strategy

**Daily Backups (2:00 AM):**
```bash
# pg_dump creates SQL dump
./database/backup.sh --daily

# Creates: backups/dumps/ferrero_tracking_2025-11-13_02-00.sql.gz
# Retention: 7 days
# Size: ~2-5 MB compressed
```

**Weekly Backups (Sunday 3:00 AM):**
```bash
# pg_basebackup creates binary backup
./database/backup.sh --weekly

# Creates: backups/basebackups/ferrero_tracking_2025-11-10_03-00.tar.gz
# Retention: Latest only
# Size: ~30 MB compressed
```

**Health Monitoring (8:00 AM Daily):**
```bash
# Checks backup age, warns if > 25 hours old
./database/check_backups.sh
```

**Restore Process:**
```bash
# List available backups
./database/restore.sh --list

# Restore from specific backup
./database/restore.sh backups/dumps/ferrero_tracking_2025-11-13_02-00.sql.gz

# Creates safety backup first
# Requires confirmation: "yes"
# Restores database
# Verifies row counts
```

---

## Common Mistakes & Pitfalls

### Category 1: Filename Errors

#### Mistake 1.1: Using Old V1 Field Order

**What Happens:**
```
User creates: 6487512_NUT_DE_de_SUMMER_OLV_30S_16x9_pOiJ9s.mp4
              (This is OLD V1 structure: BRAND_COUNTRY_LANG_SUBJECT_ASSET...)

Parser expects V2.1: BRAND_SUBJECT_ASSET_..._COUNTRY_LANG
Parser reads:
- Position 3 should be SUBJECT, reads: "DE" (thinks subject is "DE")
- Position 4 should be ASSET, reads: "de" (thinks asset type is "de")
- Position 8 should be COUNTRY, reads: "30S" (thinks country is "30S")

Result: Complete parsing failure
Validation errors:
- "Asset Type invalid: de (must be 3 chars uppercase)"
- "Country Code invalid: 30S (must be 2 chars)"

File: SKIPPED
Status: Stuck in Box
Fix: Use naming tool or reorder manually to V2.1
```

**Prevention:** Always use the naming tool - it generates correct V2.1 format

---

#### Mistake 1.2: Manual Typing Instead of Copy/Paste

**What Happens:**
```
Naming tool generates: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mp4
User types manually:   6487512_NUT_SUMMER_OLV_30s_16x9_MST_DE_de_pOiJ9s.mp4
                                                 ↑ lowercase 's'

Parser expects: \d+S pattern (capital S)
Parses: "30s" doesn't match pattern
Result:
- Duration not detected
- "30s" interpreted as aspect ratio (fallback)
- "16x9" interpreted as... something else
- Chaos in parsing

Validation: May fail or produce wrong metadata
```

**Prevention:** Use "Copy to Clipboard" button, never type manually

---

#### Mistake 1.3: Wrong Tracking ID (Copy/Paste Error)

**What Happens:**
```
Correct tracking ID: pOiJ9s (lowercase i)
User copies: pOlJ9s (lowercase L instead of i)

Upload: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOlJ9s.mp4
Parse: ✅ Succeeds (pOlJ9s is valid 6-char alphanumeric)
Database lookup: SELECT * FROM master_assets WHERE tracking_id = 'pOlJ9s'
Result: NO ROWS FOUND
Error: "No master asset for tracking ID: pOlJ9s"
Upload: FAILS
Email: Error notification sent

Agency confused: "I have the tracking ID from the email!"
Reality: Small typo made it a different ID
```

**Prevention:**
- Copy from email directly, don't retype
- Use naming tool dropdown if it has tracking ID list
- Double-check the last 6 characters of generated filename match email

---

#### Mistake 1.4: Reusing Tracking ID from Different Campaign

**What Happens:**
```
Campaign A (C000000078): Master nutella_hero.mp4, tracking=abc123
Campaign B (C000000099): Master raffa_promo.mp4, tracking=xyz789

User localizing for Campaign B:
- Accidentally uses tracking ID from Campaign A: abc123
- Filename: 6487512_RAF_PROMO_OLV_16x9_FR_fr_abc123.mp4

A2→A3 processes:
- Finds tracking ID: abc123
- Loads master metadata for abc123
- Gets: Nutella brand metadata! (wrong campaign)
- Uploads Raffaello asset with Nutella metadata!

Result: Asset uploaded to WRONG campaign folder with WRONG brand
Data corruption in DAM
Manual cleanup required
```

**Prevention:**
- Always get tracking ID from the SAME campaign's A1→A2 email
- Never reuse tracking IDs across campaigns
- Each campaign gets fresh tracking IDs

---

#### Mistake 1.5: Missing Required Field

**What Happens:**
```
User forgets aspect ratio: 6487512_NUT_SUMMER_OLV_30S_MST_DE_de_pOiJ9s.mp4
                                                     ↑ Missing ratio

Parser tries to find ratio pattern (contains 'x' or ':')
Doesn't find it
Tries fallback detection
May parse "MST" as ratio (wrong)
Then "DE" becomes spot version (wrong)
Cascade of misinterpretation

Validation: FAILS
Error: "Aspect Ratio missing"
File: SKIPPED
```

**Prevention:** Naming tool enforces required fields - can't generate without them

---

### Category 2: Upload Location Errors

#### Mistake 2.1: Uploading to Wrong Box Folder

**What Happens:**
```
User receives A1→A2 email with Box URL: https://app.box.com/folder/348304357505
User should upload derivatives to A2→A3 folder: 348526703108

User accidentally uploads to A1→A2 folder (348304357505)

A2→A3 script polls folder 348526703108
Finds: No files (they're in wrong folder)
Result: Nothing happens

User waits... waits... waits...
No email received
Asset never uploaded to DAM

Eventually: "Why hasn't my file been processed?"
Investigation: File in wrong folder
Fix: Move file to correct folder
```

**Prevention:**
- Check email for correct upload URL
- A1→A2 email says: "Masters ready for download"
- A2→A3 email should say: "Upload derivatives to: [URL]"
- Bookmark correct folders

---

#### Mistake 2.2: Uploading CreativeX PDF to Wrong Folder

**What Happens:**
```
User uploads CreativeX PDF to A2→A3 folder (348526703108)
Should upload to CreativeX folder (350605024645)

A2→A3 script:
- Tries to parse PDF filename as asset filename
- Parse fails (PDF name doesn't match pattern)
- Skips file
- PDF sits in folder forever

creativex_scoring_storing.py:
- Looks in folder 350605024645
- Doesn't find PDF (it's in wrong folder)
- No score extracted

Result:
- PDF not processed
- Score not in database
- A2→A3 uploads use default score=0
```

**Prevention:**
- CreativeX PDFs go to folder 350605024645 ONLY
- Asset files go to folder 348526703108
- Different purposes, different folders

---

### Category 3: Timing Errors

#### Mistake 3.1: Uploading Before Getting Tracking ID

**What Happens:**
```
Day 1 Morning: Campaign created in DAM (status A1)
Day 1 09:00: Agency starts localizing (proactive!)
Day 1 10:00: Agency uploads derivatives to Box
Day 1 10:05: A1→A2 script runs, downloads masters, generates tracking IDs
Day 1 10:10: A2→A3 script runs, finds agency files

Problem: Agency files have NO tracking IDs (uploaded before A1→A2 ran)
Filename: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de.mp4 (missing tracking ID)
Parse: tracking_id = None
Upload: FAILS

Agency must:
- Wait for A1→A2 email
- Get tracking IDs
- Rename files
- Re-upload
```

**Prevention:** Wait for A1→A2 email before starting localization

---

#### Mistake 3.2: Uploading CreativeX PDF After Derivative Upload

**What Happens:**
```
10:00: Agency uploads derivative to A2→A3
10:05: A2→A3 processes, no CreativeX score found
10:05: Uploads to DAM with score=0, placeholder URL
10:10: Agency uploads CreativeX PDF
10:15: creativex_scoring_storing.py extracts, stores score=85

Result:
- Asset already in DAM with score=0
- Database has score=85
- Mismatch between DAM and database
```

**Better Workflow:**
```
1. Upload CreativeX PDF first
2. Run creativex_scoring_storing.py (or wait for scheduled run)
3. Verify score in database
4. Then upload derivative
5. A2→A3 finds score, attaches correctly
```

**Or:** Accept score=0 initially, then update DAM manually later

---

### Category 4: CreativeX Integration Errors

#### Mistake 4.1: PDF Filename Doesn't Match Asset Filename

**What Happens:**
```
Derivative filename (in Box): 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mp4

CreativeX PDF filename field (extracted): NUT_SUMMER_OLV_30S_16x9_MST_DE_de.mp4
                                           ↑ Missing job number and tracking ID

Database stores: NUT_SUMMER_OLV_30S_16x9_MST_DE_de.mp4

A2→A3 looks up: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mp4
Database query: WHERE filename = '6487512_NUT_SUMMER_OLV..._pOiJ9s.mp4'
Result: NO MATCH
Uses: Default score=0
```

**Root Cause:**
CreativeX PDF was generated with clean filename, but derivative uploaded with job+tracking

**Fix:**
The PDF's filename field must match EXACTLY what will be uploaded to Box
- Include job number
- Include tracking ID
- Exact same extension

---

#### Mistake 4.2: Not Running CreativeX Scoring Script

**What Happens:**
```
Agency uploads PDF to Box folder 350605024645
Agency forgets to run: python scripts/creativex_scoring_storing.py

PDF sits in folder
Never extracted
Database has no score
All A2→A3 uploads use default score=0
```

**Fix Options:**
1. Run script manually after PDF upload
2. Add to cron for automatic processing
3. Set up Box webhook to trigger script (advanced)

---

### Category 5: Database Issues

#### Mistake 5.1: Duplicate Tracking IDs

**What Happens (Shouldn't Happen, But If It Does):**
```
Somehow two assets get same tracking ID: pOiJ9s

Database constraint: tracking_id UNIQUE
Second insert: FAILS with "duplicate key value"
Script: Catches error, logs failure
Email: Error notification

Investigation needed to find why duplicate generated
```

**Prevention:**
- generate_unique_tracking_id() checks database for uniqueness
- Retries up to 100 times
- Statistically impossible with 6 characters (56 billion combinations)

---

#### Mistake 5.2: Database Connection Lost During Processing

**What Happens:**
```
A1→A2 processing campaign with 10 assets:
- Asset 1-5: ✅ Success, stored in database
- Asset 6: Database connection drops
- Asset 6: INSERT fails
- Script: Catches exception
- Transaction: ROLLED BACK (asset 6 not stored)
- Asset 7-10: ⏭️ Not attempted (stop on error)

Result:
- Status NOT updated (still A1)
- Partial success email sent
- Next run: Retries assets 6-10
- Eventually completes
```

**Why This Design?**
- Transaction safety
- All-or-nothing approach
- Clear state (either campaign done or not)
- Automatic retry on next cron run

---

### Category 6: Intentional Shortcuts (And Why They Fail)

#### Shortcut 6.1: "I'll Fix the Metadata in DAM After Upload"

**The Thinking:**
"I'll upload with whatever filename, then manually edit the metadata in DAM. Faster than using the naming tool."

**Reality:**
```
Upload with wrong filename: nutella_video.mp4
A2→A3 processes:
- Parse fails (not V2.1 format)
- OR parses incorrectly
- Metadata applied wrong
- Upload succeeds with WRONG metadata

Now in DAM:
- Wrong language
- Wrong country
- Wrong asset type
- Wrong description

To fix:
- Find asset in DAM (search)
- Open metadata editor
- Fix 10+ fields manually
- Save each field
- Test search still works
- Update related assets
Time: 30+ minutes per asset

vs.

Using naming tool correctly:
- Generate filename: 1 minute
- Upload: 1 minute
- Automation handles rest
- Correct metadata automatically
Time: 2 minutes total
```

**Lesson:** Shortcuts create more work, not less

---

#### Shortcut 6.2: "I'll Skip the Job Number, Not Billing This"

**The Thinking:**
"This is internal work, no client billing, so job number doesn't matter."

**Reality:**
```
Upload: NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mp4 (no job number)
Parse: FAILS (expects digits in position 1)

OR if job number optional in parser:
Upload succeeds but:
- Can't link to OMG project
- Client asks: "Where's the summer video?"
- Search by job number: NOT FOUND
- Search by client: Manual lookup needed
- Billing reconciliation: Manual matching
- Client reporting: Incomplete

Time wasted: Hours across multiple people
```

**Lesson:** Job number is required business metadata, not optional

---

#### Shortcut 6.3: "I'll Manually Enter CreativeX Score in Box Metadata"

**The Thinking:**
"Old workflow used Box metadata templates. I'll just enter the score there instead of uploading PDF."

**Reality:**
```
User opens Box file
Clicks metadata
Adds custom metadata field: "creativex_score = 85"
Saves

A2→A3 script runs:
- Ignores Box metadata completely (new workflow)
- Looks in PostgreSQL database only
- Finds: Nothing
- Uses: Default score=0

Result: Score not attached despite manual entry
Wasted effort
```

**Lesson:** New workflow is database-only. Must upload PDF or score won't attach.

---

### Category 7: Process Violations

#### Violation 7.1: Uploading Directly to DAM (Bypassing Automation)

**What Happens:**
```
User has DAM access
User uploads derivative directly to "02. Final Assets" folder
Bypasses Box and A2→A3 automation

Result:
- No tracking ID link to master
- No database record
- No CreativeX score attached
- No email notification
- Asset "invisible" to reporting
- Audit trail broken

Later when searching:
- "Why don't we have a German version?"
- File is there but not tracked
- Manual reconciliation needed
```

**Lesson:** ALWAYS use Box upload workflow for tracking and automation

---

#### Violation 7.2: Editing Files in Box After Upload

**What Happens:**
```
10:00: Agency uploads to A2→A3
10:05: Agency notices issue, edits file in Box (replaces version)
10:10: A2→A3 script runs
10:10: Downloads file (gets EDITED version, not original)
10:11: Uploads to DAM

Result: Processed, but...
- Audit trail shows original upload time
- But file is edited version
- Discrepancy between logs and actual file
- Confusion in troubleshooting
```

**Better Approach:**
- Delete file from Box
- Fix locally
- Re-upload with same filename
- Clear audit trail

---

### Category 8: Agency Workflow Mistakes

#### Mistake 8.1: Different Naming for Same Tracking ID

**What Happens:**
```
Master: nutella_summer_hero.mp4, tracking=pOiJ9s
Derivative 1: NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mp4 ✅
Derivative 2: NUT_SOMMER_OLV_30S_16x9_MST_FR_fr_pOiJ9s.mp4 ❌
                  ↑ SOMMER instead of SUMMER (different subject)

Both use tracking ID pOiJ9s but different subjects!

Result:
- Both parse successfully
- Both upload successfully
- But inconsistent naming
- Search confusion
- Brand guideline violation
```

**Lesson:** All derivatives of same master should use SAME subject title

---

#### Mistake 8.2: Wrong File Type/Extension

**What Happens:**
```
Master: nutella_hero.mp4 (video)
Derivative: Agency creates as .mov, uploads as:
  6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mov

Parse: ✅ Succeeds
Upload: ✅ May succeed
BUT: Extension mismatch
- DAM expects .mp4 based on master
- Got .mov instead
- May cause downstream issues
- Playback compatibility questions
```

**Best Practice:** Match master file type unless specifically requested otherwise

---

### Category 9: Approval Process Errors

#### Mistake 9.1: Rejecting Without Comments

**What Happens:**
```
Approver marks: ECOMMERCE STATUS = "NOT APPROVED"
Approver doesn't add comments (lazy)

A5→A6 downloads asset
Email to agency:

"Asset rejected:
  Rejection Comments:
  (none provided)"

Agency: "Why was this rejected? What do I fix?"
Wasted time asking approver
Delayed turnaround
```

**Best Practice:** Always add detailed rejection comments

---

#### Mistake 9.2: Approving When Should Be Rejected

**What Happens:**
```
Asset has compliance issue (missing disclaimer)
Approver misses it
Marks: APPROVED
Status: A4

Asset goes live with compliance issue
Legal discovers issue post-launch
Emergency takedown required
Much bigger problem than rejecting early

Cost: High (legal risk, brand reputation)
```

**Lesson:** Better to reject and fix than approve and regret

---

## Monitoring & Health Checks

### Email Notifications: What They Mean

#### Green Emails (Success)

**"✅ Master Assets Downloaded"** (A1→A2)
```
Means:
- Campaign found with status A1
- All masters downloaded successfully
- Uploaded to Box with tracking IDs
- Stored in database
- Status updated A1→A2
- Ready for agency localization

Your Action:
- Forward Box URLs to agency
- Monitor for A2→A3 completion
```

**"✅ Asset Uploaded to DAM"** (A2→A3)
```
Means:
- Derivative processed successfully
- Uploaded to DAM Final Assets
- Metadata applied correctly
- CreativeX score attached (or defaults used)
- Deleted from Box

Your Action:
- None (automatic)
- Check for CreativeX warning if present
```

**"✅ Global Master Assets Downloaded"** (B1→B2)
```
Means:
- Global campaign processed
- Assets ready for distribution
- Available in Box
- No webhook sent (expected)

Your Action:
- Notify regional teams
```

#### Orange Emails (Partial Success / Warnings)

**"⚠️ Partial Download"**
```
Means:
- Some assets succeeded, some failed
- Status NOT updated (campaign still at previous status)
- Failed assets will retry on next run

Your Action:
- Review failed assets in email
- Check logs for error details
- May need to fix issues in DAM
- Wait for retry (5 minutes)
```

**"⚠️ CreativeX Score Missing"** (in A2→A3 email)
```
Means:
- Asset uploaded successfully
- But no CreativeX score found in database
- Used defaults: Score=0, placeholder URL
- Everything else worked fine

Your Action:
- Upload CreativeX PDF if needed
- Or accept score=0 if not critical
```

#### Red Emails (Errors)

**"❌ Upload Failed"**
```
Means:
- Critical error during processing
- File NOT uploaded to DAM
- Still in Box for retry

Your Action:
- Check error message in email
- Review logs: tail -100 logs/a2_to_a3.log
- Fix underlying issue
- File will retry automatically
```

### Daily Report Email (7:00 PM)

**"📊 Ferrero Automation Daily Report"**

**What It Contains:**
```
Overall Summary (Last 24 Hours):
- Campaigns Found: 5
- Campaigns Processed: 5
- ✅ Completed: 4
- ⚠️ Partial: 1
- Total Assets: 50
- ✓ Successful: 48
- ✗ Failed: 2
- Success Rate: 96.0%

Workflow Breakdown:
  A1→A2: 2 campaigns, 20 assets, 20 successful
  A2→A3: 3 campaigns, 25 assets, 23 successful, 2 failed
  A5→A6: 1 campaign, 5 assets (rejections), 5 successful
  B1→B2: 0 campaigns (no global campaigns today)

Errors (if any):
  - Asset XYZ failed: Network timeout
  - Asset ABC failed: Invalid tracking ID
```

**How to Read It:**
- Success rate >95%: ✅ Healthy
- Success rate 80-95%: ⚠️ Investigate errors
- Success rate <80%: 🔴 System issues, escalate

---

### Log Files

**Location:** `/opt/ferrero-automation/Python-Version/logs/`

**Log Files:**
```
a1_to_a2.log          - Master downloads
a2_to_a3.log          - Derivative uploads
a5_to_a6.log          - Rejection downloads
b1_to_b2.log          - Global masters
creativex_scoring.log - CreativeX extraction
daily_report.log      - Daily report generation
backup.log            - Database backups
cron_*.log            - Cron execution logs
```

**Log Rotation:**
- Max size: 10MB per file
- Backups: 28 rotated files kept
- Total: ~280MB per workflow
- Auto-cleanup: Old logs deleted

**Useful Log Commands:**
```bash
# Watch live
tail -f logs/a2_to_a3.log

# Check for errors today
grep -i error logs/*.log | grep "$(date +%Y-%m-%d)"

# Count successful uploads today
grep "✓ Success" logs/a2_to_a3.log | grep "$(date +%Y-%m-%d)" | wc -l

# Find specific tracking ID
grep "pOiJ9s" logs/*.log

# See recent activity
tail -100 logs/a1_to_a2.log
```

---

## Troubleshooting Guide

### Symptom-Based Troubleshooting

#### "My file isn't being processed"

**Check 1: Is it in the right Box folder?**
```bash
# Verify folder ID in email
A2→A3 folder: 348526703108
CreativeX folder: 350605024645
```

**Check 2: Does the filename parse correctly?**
```bash
# Test parsing locally
python3 -c "
from scripts.shared.filename_parser import FilenameParser
parser = FilenameParser()
result = parser.parse_filename('YOUR_FILENAME.mp4')
print('Valid:', result['is_valid'])
print('Errors:', result['validation_errors'])
print('Tracking ID:', result['tracking_id'])
"
```

**Check 3: Does the tracking ID exist in database?**
```bash
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT tracking_id, original_filename
FROM master_assets
WHERE tracking_id = 'pOiJ9s';
"
```

**Check 4: Check the logs**
```bash
grep "YOUR_FILENAME" logs/a2_to_a3.log
```

---

#### "CreativeX score shows 0 instead of actual score"

**Check 1: Is score in database?**
```bash
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT filename, quality_score, status
FROM creativex_scores
WHERE filename LIKE '%YOUR_FILENAME%';
"
```

**Check 2: Does filename match EXACTLY?**
```
Database has: 6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mp4
Lookup uses:  6487512_NUT_SUMMER_OLV_30S_16x9_MST_DE_de_pOiJ9s.mp4
Match: Must be character-for-character identical
```

**Check 3: Is status = 'active'?**
```bash
# Query filters for status='active' only
# Superseded scores are not used
SELECT status FROM creativex_scores WHERE filename = 'exact_name.mp4';
```

**Fix:**
- Upload CreativeX PDF with exact filename
- Run creativex_scoring_storing.py
- Re-upload derivative (will pick up score)

---

#### "Asset uploaded with wrong metadata"

**Check 1: Was correct tracking ID used?**
```bash
# Check what master the tracking ID links to
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT tracking_id, original_filename,
       full_metadata->'metadata'->>'id' as metadata_model
FROM master_assets
WHERE tracking_id = 'abc123';
"
```

**Check 2: Check the --dryrun output**
```bash
# See exactly what metadata would be sent
python scripts/a2_to_a3_upload_polling.py --dryrun

# Review the JSON output
# Verify each field has correct value
```

**Fix:**
- Verify tracking ID is correct
- Check field_mappings.yaml configuration
- Re-upload with corrections

---

### Emergency Procedures

#### Emergency 1: Database Corruption

**Symptoms:**
- Scripts crashing with database errors
- Queries returning unexpected results
- Table structure errors

**Steps:**
1. Stop all workflows immediately
```bash
crontab -e  # Comment out all cron jobs
```

2. Create emergency backup
```bash
./database/backup.sh --daily
```

3. Restore from last known good backup
```bash
./database/restore.sh backups/dumps/ferrero_tracking_2025-11-13_02-00.sql.gz
```

4. Verify restoration
```bash
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT COUNT(*) FROM master_assets;
SELECT COUNT(*) FROM creativex_scores;
"
```

5. Resume workflows
```bash
crontab -e  # Uncomment cron jobs
```

6. Notify team of data loss window (time between backup and corruption)

---

#### Emergency 2: All Workflows Failing

**Symptoms:**
- Multiple error emails
- No successful uploads for hours
- Logs show authentication failures

**Steps:**
1. Test connections
```bash
python scripts/test_connection.py
python scripts/test_connection.py --auth-pfx
```

2. Check credentials in .env
```bash
grep "DAM_" .env | grep -v PASSWORD
grep "BOX_" .env | grep -v PASSWORD | grep -v KEY
```

3. Check external services
- DAM: Is it accessible? Network issues?
- Box: API status page check
- Database: Container running?

4. Check disk space
```bash
df -h /opt/ferrero-automation
```

5. Review recent changes
```bash
git log --oneline | head -10
```

6. Rollback if recent deployment caused issues
```bash
git revert <commit-hash>
```

---

## Glossary

**A1, A2, A3, A4, A5, A6** - Campaign status codes indicating workflow stage

**Asset Representation** - JSON structure containing all metadata fields for DAM upload

**Tracking ID** - 6-character unique identifier linking derivative to master asset

**Master Asset** - Original creative asset from campaign creation

**Derivative Asset** - Localized version of master asset

**JSONB** - PostgreSQL data type for storing JSON with queryable fields

**MVP Fields** - 30 core metadata fields required for asset upload

**Soft Delete** - Marking records as 'superseded' instead of deleting (preserves history)

**CreativeX Score** - Quality score (0-100) from CreativeX AI analysis

**OMG Job Number** - Project identifier from OMG project management system

**Folder-Only Mode** - Tracking ID with -N suffix, inherits only upload folder

**LlamaExtract** - AI service for extracting data from PDF documents

**pg_dump** - PostgreSQL backup tool creating SQL dump

**Cron** - Linux job scheduler running scripts on schedule

---

## FAQ

**Q: Can I edit a filename after upload to Box?**
A: No. The script processes based on exact filename at time of discovery. Editing causes mismatches.

**Q: What if I upload the wrong tracking ID?**
A: Delete the file from Box, rename with correct tracking ID, re-upload.

**Q: Can the same master have multiple derivatives with different tracking IDs?**
A: No. One master = one tracking ID. All derivatives of that master use the same tracking ID.

**Q: What happens if CreativeX PDF is never uploaded?**
A: Asset uploads successfully with default score=0. Functional but not scored.

**Q: Can I skip the naming tool and create filenames manually?**
A: Technically yes if you know the exact V2.1 format, but NOT recommended. Tool prevents errors.

**Q: How long until my uploaded file is processed?**
A: Maximum 5 minutes (next cron run). Usually 1-3 minutes.

**Q: What if my file is stuck in Box for > 10 minutes?**
A: Check logs for errors, verify filename format, check tracking ID in database.

**Q: Can I upload multiple derivatives at once?**
A: Yes! A2→A3 processes one file per run but runs every 5 minutes. All will be processed within minutes.

**Q: What's the maximum file size?**
A: Limited by Box upload limits and network bandwidth. Tested with files up to 500MB successfully.

---

## Quick Reference

### Box Folder IDs

| Workflow | Folder ID | Purpose |
|----------|-----------|---------|
| A1→A2 | 348304357505 | Master downloads (local campaigns) |
| A2→A3 | 348526703108 | Agency derivative uploads |
| A5→A6 | 349441822875 | Rejected asset downloads |
| B1→B2 | 349261192115 | Global master downloads |
| CreativeX | 350605024645 | CreativeX PDF reports |

### Status Codes

| Code | Meaning | Next Step |
|------|---------|-----------|
| A1 | Campaign created, masters ready | A1→A2 automation |
| A2 | Masters in Box, ready for localization | Agency work |
| A3 | Derivatives uploaded, ready for approval | Approval process |
| A4 | Approved, ready for distribution | Go live |
| A5 | Rejected, needs rework | A5→A6 automation |
| A6 | Rework in progress | Agency fixes, then A2→A3 |
| B1 | Global masters ready | B1→B2 automation |
| B2 | Global masters distributed | Regional teams |

### Script Commands

```bash
# Manual execution
python scripts/a1_to_a2_download.py
python scripts/a2_to_a3_upload_polling.py
python scripts/a5_to_a6_download.py
python scripts/b1_to_b2_download.py
python scripts/creativex_scoring_storing.py

# With flags
python scripts/a2_to_a3_upload_polling.py --dryrun        # Test mode
python scripts/a1_to_a2_download.py --auth-pfx             # mTLS auth
python scripts/a2_to_a3_upload_polling.py --keep-files     # Don't delete from Box

# Database operations
./database/backup.sh --daily
./database/restore.sh --list
./database/check_backups.sh

# Monitoring
tail -f logs/a2_to_a3.log                                 # Watch logs
grep -i error logs/*.log | grep "$(date +%Y-%m-%d)"      # Today's errors
```

### Database Queries

```bash
# Recent master assets
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT tracking_id, original_filename, created_at
FROM master_assets
ORDER BY created_at DESC LIMIT 10;
"

# Recent uploads
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT tracking_id, derivative_filename, uploaded_at
FROM derivative_assets
ORDER BY uploaded_at DESC LIMIT 10;
"

# Active CreativeX scores
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT filename, quality_score, extracted_at
FROM creativex_scores
WHERE status = 'active'
ORDER BY extracted_at DESC LIMIT 10;
"

# Campaign status
PGPASSWORD=xxx psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "
SELECT campaign_number, status, webhook_sent, updated_at
FROM campaign_status
ORDER BY updated_at DESC LIMIT 10;
"
```

---

## Contact Information

### Escalation Path

**Level 1: Self-Service**
- Check this guide
- Review logs
- Check email notifications
- Query database

**Level 2: Operations Team**
- Email: operations@oliver.agency
- Response time: Within 2 hours (business hours)
- Can: Review logs, check database, restart workflows

**Level 3: Technical Lead**
- Email: daveporter@oliver.agency
- Response time: Within 4 hours
- Can: Code fixes, configuration changes, database recovery

**Level 4: Ferrero IT**
- Contact: [To be provided]
- Response time: As per SLA
- Can: DAM access issues, network problems, credential resets

### Support Resources

**Documentation:**
- COMPLETE_WORKFLOW_GUIDE.md (this document)
- CREATIVEX_DEPLOYMENT.md (CreativeX setup)
- DATABASE_BACKUP_GUIDE.md (Backup/restore)
- CUTOVER.md (Production deployment)
- CUTOVER-TODOS.md (Deployment checklist)

**Logs:**
- `/opt/ferrero-automation/Python-Version/logs/`

**Database:**
- Host: localhost:5437
- Database: ferrero_tracking
- User: ferrero_user

**Monitoring:**
- Daily email reports (7:00 PM)
- Real-time notifications
- Log files

---

## Summary: The Golden Rules

### For Campaign Managers
1. ✅ Set campaign to A1 when ready
2. ✅ Wait for automation (don't rush)
3. ✅ Forward Box URLs to agency promptly
4. ✅ Monitor email notifications
5. ✅ Review daily reports
6. ❌ Don't manually move files between DAM folders
7. ❌ Don't change status codes manually

### For Agencies
1. ✅ **ALWAYS use the Naming Convention Tool**
2. ✅ Copy filename EXACTLY from tool (don't type)
3. ✅ Wait for A1→A2 email before starting
4. ✅ Get tracking IDs from email, not elsewhere
5. ✅ Upload CreativeX PDFs before derivatives
6. ✅ Upload to correct Box folder (check email URL)
7. ✅ Use SAME tracking ID for rework
8. ❌ Don't manually type filenames
9. ❌ Don't reuse tracking IDs across campaigns
10. ❌ Don't upload directly to DAM

### For Approvers
1. ✅ Always add detailed rejection comments
2. ✅ Be specific about what needs fixing
3. ✅ Check compliance before approving
4. ❌ Don't reject without explaining why

### For Operations
1. ✅ Monitor daily reports
2. ✅ Check backup health weekly
3. ✅ Review error patterns
4. ✅ Test restore monthly
5. ✅ Keep documentation updated

---

## Document Version History

**Version 2.1** - November 2025
- Updated for V2.1 filename structure
- Added social media field
- Added asset type mapping
- Added CreativeX master score extraction
- Complete workflow overhaul documentation

**Version 2.0** - October 2025
- Initial comprehensive guide
- All workflows documented
- Mermaid diagrams added

---

**End of Document**

**Total Pages:** ~65 pages (when printed)
**Last Updated:** November 2025
**Maintained By:** Operations Team
**Review Cycle:** Quarterly or after major changes

For questions or feedback, contact: operations@oliver.agency