ferrero-opentext/Python-Version
nickviljoen 9e92db185a Feature: Apply naming-tool pre-upload metadata overrides on A2→A3 upload
The naming tool's metadata editor saves pre-upload overrides to the
override_metadata table (shared ferrero_tracking DB), but until now the
Python upload pipeline never read from it — every edit was being saved
but never applied to DAM. This wires up the consumer side so user edits
land on the uploaded asset.

- database.py: get_override_metadata() / mark_override_applied(),
  resilient to a missing override_metadata table on dev DBs
- metadata_extractor_mvp.py: OVERRIDE_FIELD_MAP (mirrors the naming
  tool's editor-field → DAM-field-ID map) + _apply_override_fields().
  Applied after master/filename/forced/CreativeX values but before
  asset_type_overrides so EOL/LTD compliance still wins. Empty editor
  values are skipped (leaves inherited value alone). Validity ISO
  dates normalised to MM/DD/YYYY for DAM
- a2_to_a3_upload_polling.py: lookup before building the asset rep,
  pass override_fields into build_mvp_asset_representation, mark
  applied only after confirmed upload success

Override priority: user edit > master metadata > forced defaults >
hardcoded today+365 validity — so the team's per-asset validity
period (e.g. 1 month) now flows through end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 12:06:06 +02:00
..
backups Add automated PostgreSQL backup and restore system 2025-11-11 17:30:10 -05:00
config Fix: EOL/LTD asset type overrides — IP Rights, CreativeX, descriptions 2026-04-30 13:24:19 +02:00
database Enhancement: Capture CreativeX score on B1→B2 global masters 2026-04-29 11:31:07 +02:00
locks feat: Implement new Python script locking, relocate PHP workflow, and update Python scripts and documentation. 2025-11-21 17:20:34 -05:00
MARKDOWN_DOCS Docs: Refresh A1 empty-folder doc and LTD asset type notes 2026-04-30 18:19:06 +02:00
migrations Fix A2→A3 email template and database logging issues 2025-12-22 10:12:36 -05:00
scripts Feature: Apply naming-tool pre-upload metadata overrides on A2→A3 upload 2026-05-19 12:06:06 +02:00
tests refactor: Relocate test scripts to a dedicated tests/ directory and introduce orchestrator.py. 2025-11-21 17:10:04 -05:00
.gitignore Security: Add .env files to .gitignore 2026-01-31 18:07:44 +00:00
compare_ppr_structure.py PPR Environment: Use SIMPLE metadata structure for tabular fields 2026-01-23 16:52:50 +02:00
DAM- CX mappings.csv Revert "Fix: Add YouTube platform mapping and social media code fallback for CreativeX" 2026-02-13 17:17:06 +02:00
deploy.sh changed database port 2025-11-05 13:51:30 -06:00
docker-compose.yml changed database port 2025-11-05 13:51:30 -06:00
DOCS feat: Add Python project dependencies. 2025-11-20 22:45:04 -05:00
ferrero-backup-daily.service Update backup services to run as root for Docker permissions 2025-12-01 22:39:00 -05:00
ferrero-backup-daily.timer feat: Add systemd services and timers for daily and weekly database backups, introduce a token retrieval script, and update the backup guide documentation. 2025-12-01 22:19:24 -05:00
ferrero-backup-weekly.service Update backup services to run as root for Docker permissions 2025-12-01 22:39:00 -05:00
ferrero-backup-weekly.timer feat: Add systemd services and timers for daily and weekly database backups, introduce a token retrieval script, and update the backup guide documentation. 2025-12-01 22:19:24 -05:00
ferrero-orchestrator-prod.service Add systemd service files for Prod Orchestrator and CreativeX Service 2025-12-01 16:09:26 -05:00
ferrero-orchestrator.service feat: Add A1→A3 campaign advance script, introduce systemd service for orchestrator, and ref 2025-11-24 13:50:16 -05:00
ferrero_dam_backup.sql PPR Environment: Use SIMPLE metadata structure for tabular fields 2026-01-23 16:52:50 +02:00
PPR_ANALYSIS_INDEX.md PPR Environment: Use SIMPLE metadata structure for tabular fields 2026-01-23 16:52:50 +02:00
PPR_COMPARISON_REPORT.md PPR Environment: Use SIMPLE metadata structure for tabular fields 2026-01-23 16:52:50 +02:00
PPR_COMPARISON_SUMMARY.md PPR Environment: Use SIMPLE metadata structure for tabular fields 2026-01-23 16:52:50 +02:00
PPR_EXECUTIVE_SUMMARY.md PPR Environment: Use SIMPLE metadata structure for tabular fields 2026-01-23 16:52:50 +02:00
PPR_RESULTS.txt PPR Environment: Use SIMPLE metadata structure for tabular fields 2026-01-23 16:52:50 +02:00
PPR_SIDE_BY_SIDE.md PPR Environment: Use SIMPLE metadata structure for tabular fields 2026-01-23 16:52:50 +02:00
query_db.py Implement Auth V2 (Hybrid mTLS/OAuth) and update field mappings 2025-11-21 16:46:37 -05:00
README.md Docs: Refresh A1 empty-folder doc and LTD asset type notes 2026-04-30 18:19:06 +02:00
requirements.txt Add CreativeX score extraction and storage system 2025-11-11 16:15:45 -05:00
RUN_ORCHESTRATOR.md Add A1 retry logic and orchestrator off-hours cadence 2026-01-31 17:38:57 +02:00
setup.sh changed database port 2025-11-05 13:51:30 -06:00
test_mailgun_recipients.py Update Mailgun test: try US/EU endpoints, handle non-JSON errors 2026-03-13 13:29:27 +02:00

Ferrero Content Scaling - Python Automation System

Complete automated workflow for Ferrero DAM Content Scaling

Version: 2.1 Last Updated: April 16, 2026 Status: Production Ready & Fully Tested


📋 Table of Contents

  1. Overview
  2. What's Included
  3. Quick Start
  4. Server Deployment
  5. Workflows
  6. Authentication
  7. Configuration
  8. Monitoring
  9. Troubleshooting

🎯 Overview

Automated end-to-end workflow system that:

  • Downloads master assets from OpenText DAM
  • Uploads to Box with tracking IDs
  • Processes localized assets from Box
  • Filters rejected assets for rework
  • Sends email notifications
  • Generates daily summary reports
  • Supports OAuth2 and mTLS authentication

All workflows run autonomously via cron (every 5 minutes)


📦 What's Included

4 Workflow Automation Scripts

  1. A1→A2 - Master Assets Download (Local Adaptation)

    • Downloads master assets from DAM
    • Uploads to Box with tracking IDs (never start with 'M')
    • Stores in database with full metadata
    • Updates status A1 → A2
  2. A5→A6 - Rework Assets Download (Rejections)

    • Filters for NOT APPROVED assets only
    • Extracts rejection comments from 3 reviewers
    • Uploads to separate Revisions Box folder
    • Smart tracking ID reuse
    • Updates status A5 → A6
  3. B1→B2 - Global Master Assets Download

    • Downloads global master assets
    • Uses MASTERS_CampaignNumber naming
    • Separate Box folder for global assets
    • Tracking IDs always start with 'M'
    • Updates status B1 → B2
  4. A2→A3 - Upload from Box (Polling)

    • Monitors Box folder for agency uploads
    • Matches tracking IDs to master assets
    • Uploads to DAM with inherited metadata
    • Updates status A2 → A3
  5. CreativeX Scoring - Independent Service

    • Runs as a separate service (every 15 mins)
    • Extracts scores from PDFs using LlamaCloud
    • Stores scores in database
    • Independent of main orchestrator to prevent blocking

Additional Scripts

  1. Daily Report - Automated summary email (7 PM daily)

    • Analyzes all workflow logs (last 24 hours)
    • Campaign and asset statistics
    • Success rate calculations
    • Error summaries
    • Professional dashboard-style email
  2. Test Scripts

    • test_connection.py - Tests DAM, Box, Database connections
    • test_mtls_cert.py - Tests mTLS certificate loading
    • test_mtls_debug.py - Detailed mTLS debugging

🚀 Quick Start

Prerequisites

  • Python 3.6+ (server) or 3.10+ (local)
  • PostgreSQL 15+ database
  • Box account with JWT app
  • OpenText DAM API credentials
  • SMTP server for emails (Mailgun recommended)

Local Setup

cd Python-Version

# 1. Create virtual environment
python3 -m venv venv
source venv/bin/activate

# 2. Install dependencies
pip install -r requirements.txt

# 3. Configure environment
cp .env.example .env  # If exists, or edit .env directly
nano .env

# 4. Test connections
python scripts/test_connection.py

# Should show:
# ✓ DAM connection OK
# ✓ Box connection OK
# ✓ Database connection OK

🖥️ Server Deployment

Step 1: Copy Files to Server

Important: Box-config.json location

The Box JWT configuration file must be one folder up from Python-Version:

/opt/ferrero-automation/
├── Box-config.json          ← Box JWT config HERE
└── Python-Version/
    ├── .env
    ├── config/
    │   └── config.yaml      ← Points to ../Box-config.json
    └── scripts/

Copy command:

# Copy both folders
scp Box-config.json user@server:/opt/ferrero-automation/
scp -r Python-Version/ user@server:/opt/ferrero-automation/

Or use rsync:

# From Ferrero-Opentext folder
rsync -av Box-config.json Python-Version/ user@server:/opt/ferrero-automation/

Step 2: Setup Database

Using Docker (Recommended):

cd /opt/ferrero-automation/Python-Version

docker run -d \
  --name ferrero-tracking-db \
  --restart always \
  -e POSTGRES_DB=ferrero_tracking \
  -e POSTGRES_USER=ferrero_user \
  -e POSTGRES_PASSWORD=ferrero_pass_2025 \
  -p 5437:5432 \
  -v $(pwd)/database/init.sql:/docker-entrypoint-initdb.d/init.sql \
  -v ferrero-db-data:/var/lib/postgresql/data \
  postgres:15

# Wait 10 seconds for initialization
sleep 10

# Verify database
docker exec ferrero-tracking-db psql -U ferrero_user -d ferrero_tracking -c "\dt"

Step 3: Install Python Dependencies

cd /opt/ferrero-automation/Python-Version

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Step 4: Configure for Server

Update .env:

nano .env

# Update database (if different from defaults)
DB_HOST=localhost
DB_PORT=5437
DB_USER=ferrero_user
DB_PASSWORD=ferrero_pass_2025  # CHANGE IN PRODUCTION!

# Verify all other credentials are correct

Step 5: Test Everything

source venv/bin/activate

# Test connections
python scripts/test_connection.py

# Test each workflow (won't process if no campaigns)
python scripts/a1_to_a2_download.py
python scripts/a5_to_a6_download.py
python scripts/b1_to_b2_download.py

# Generate test report
python scripts/daily_report.py

Step 6: Setup Cron Jobs

crontab -e

Add:

# Ferrero Automation (every 5 minutes)
*/5 * * * * cd /opt/ferrero-automation/Python-Version && venv/bin/python scripts/a1_to_a2_download.py >> logs/cron_a1_a2.log 2>&1
*/5 * * * * cd /opt/ferrero-automation/Python-Version && venv/bin/python scripts/a5_to_a6_download.py >> logs/cron_a5_a6.log 2>&1
*/5 * * * * cd /opt/ferrero-automation/Python-Version && venv/bin/python scripts/b1_to_b2_download.py >> logs/cron_b1_b2.log 2>&1
*/5 * * * * cd /opt/ferrero-automation/Python-Version && venv/bin/python scripts/a2_to_a3_upload_polling.py >> logs/cron_a2_a3.log 2>&1

# Daily report (7 PM)
0 19 * * * cd /opt/ferrero-automation/Python-Version && venv/bin/python scripts/daily_report.py >> logs/daily_report.log 2>&1

🔄 Workflows

A1→A2: Master Assets Download

Purpose: Download master assets from DAM for local adaptation

Process:

  1. Search for campaigns with status = A1
  2. Process first campaign only (cron runs every 5 min)
  3. Recursively search Master Assets folder (including subfolders)
  4. Download all assets from DAM
  5. Upload to Box folder 348304357505
    • Folder name: C000000078-Campaign_Name
    • Preserves DAM subfolder structure
  6. Store in database with tracking ID (never starts with 'M' for regular files)
  7. ONLY update A1→A2 if ALL assets successful
  8. Send webhook + email notification

Box Folder: 348304357505 (Local Adaptation) Email: a1_to_a2_complete, a1_to_a2_partial, a1_to_a2_no_assets


A5→A6: Rework Assets Download (NOT APPROVED Filter)

Purpose: Download ONLY rejected assets that need rework

Process:

  1. Search for campaigns with status = A5
  2. Process first campaign only
  3. Recursively search Final Assets folder
  4. Filter for ECOMMERCE STATUS = "NOT APPROVED" only
  5. Skip all approved/other status assets
  6. Download NOT APPROVED assets
  7. Extract rejection details:
    • Approver comments, certifier, date
    • Legal comments, certifier, date
    • IA&CC comments, certifier, date
  8. Upload to Box folder 349441822875 (Revisions)
    • Folder name: C000000078-Campaign_Name-Revisions
    • Preserves DAM subfolder structure
  9. Smart tracking ID lookup (reuse if asset exists from A1→A2)
  10. ONLY update A5→A6 if ALL rejected assets successful
  11. Send detailed email with rejection comments

Box Folder: 349441822875 (Revisions) Email: a5_to_a6_rejections, a5_to_a6_partial, a5_to_a6_no_rejections

Special Features:

  • Filters assets by ECOMMERCE STATUS field
  • Extracts 10 rejection-related fields
  • Shows detailed comments in email (color-coded by reviewer)
  • Reuses tracking IDs from original download

B1→B2: Global Master Assets Download

Purpose: Download global master assets for global campaigns

Process:

  1. Search for campaigns with status = B1 AND type = "Global comm"
  2. Process first campaign only
  3. Recursively search Final Assets folder (B1 uses Final, not Master)
  4. Download all assets
  5. Upload to Box folder 349261192115
    • Folder name: MASTERS_C000000068-Campaign_Name
    • Preserves DAM subfolder structure
  6. Store in database with tracking ID (always starts with 'M' for master files)
  7. ONLY update B1→B2 if ALL assets successful
  8. Send email notification (no webhook)

Box Folder: 349261192115 (Global Masters) Email: b1_to_b2_complete, b1_to_b2_partial, b1_to_b2_no_assets


A2→A3: Upload from Box

Purpose: Process localized assets uploaded by agency to Box

Process:

  1. Poll Box folder 348526703108 for new files
  2. Match tracking ID from filename
  3. Load master metadata from database
  4. Build asset representation (27 MVP fields)
  5. Update Description, Language, State from filename
  6. Upload to DAM Final Assets folder
  7. Delete file from Box
  8. Update A2→A3 when ALL campaign assets uploaded

Two tracking modes:

  • Full inheritance (standard): Inherits all metadata from the master asset
  • Folder-only (-N suffix): Builds metadata from a reference template (config/asset_representation_template.json) and populates values from the filename. Used when the derivative only needs the upload folder from the master.

Asset type overrides: Certain asset types (e.g., EOL) trigger field overrides configured in the environment's field mappings file (e.g., Agency Name, Languages, IP Rights, validity dates).

Box Folder: 348526703108 (Agency Uploads) Email: a2_to_a3_file_uploaded, a2_to_a3_complete


Daily Report

Purpose: Automated daily summary email at 7 PM

Process:

  1. Analyzes all 4 workflow logs (last 24 hours)
  2. Calculates statistics:
    • Campaigns found/processed/completed/partial
    • Assets successful/failed/skipped
    • NOT APPROVED count (A5→A6)
    • Success rate percentage
    • Error count
  3. Per-workflow breakdown
  4. Campaign details (collapsible)
  5. Error logs (collapsible)
  6. Professional dashboard-style email

Runs: Daily at 7:00 PM via cron Email: Blue dashboard theme with statistics


🔐 Authentication

Both OAuth2 and mTLS authentication are supported across all workflows.

OAuth2 (Default - Production Ready)

Current production authentication method - No special configuration needed.

Configuration (.env):

DAM_BASE_URL=https://ppr.dam.ferrero.com/otmmapi
DAM_AUTH_URL=https://ppr.dam.ferrero.com/otdsws/oauth2/token
DAM_CLIENT_ID=otds-OLV
DAM_CLIENT_SECRET=hs28LZ9ZzQ5I9rlW3P7Wwyw85oOatlC1

Usage (default - no flag needed):

# All workflows use OAuth2 by default
python scripts/a1_to_a2_download.py
python scripts/a2_to_a3_upload_polling.py
python scripts/a4_webhook_monitor.py
python scripts/a5_to_a6_download.py
python scripts/b1_to_b2_download.py

How it works:

  • Requests OAuth2 token from /otdsws/oauth2/token
  • Uses client_credentials grant type
  • Token added to Authorization: Bearer header
  • Token auto-refreshes (1 hour expiry)
  • No IP whitelisting required

mTLS Certificate (Optional - Enhanced Security)

Client certificate authentication - Requires IP whitelisting.

Configuration (.env):

DAM_MTLS_BASE_URL=https://dev-auth.app-api.ferrero.com/00003/mm
DAM_MTLS_CERT_PATH=config/certificates/dam-mtls-dev.pfx
DAM_MTLS_CERT_PASSWORD=fnJ8xrnh!54NE&2HR62=2P3YEy+hy9RajZ7v5&=y

Usage (add --auth-pfx flag):

# All workflows support mTLS with --auth-pfx flag
python scripts/a1_to_a2_download.py --auth-pfx
python scripts/a2_to_a3_upload_polling.py --auth-pfx
python scripts/a4_webhook_monitor.py --auth-pfx
python scripts/a5_to_a6_download.py --auth-pfx
python scripts/b1_to_b2_download.py --auth-pfx

How it works:

  • Uses PFX/P12 client certificate
  • Certificate auto-converted to PEM format
  • Presented during TLS handshake
  • Different base URL than OAuth2
  • Requires IP whitelisting on Ferrero's side

Requirements:

  • ⚠️ Must run from whitelisted IP address
  • ⚠️ Different API endpoint: https://dev-auth.app-api.ferrero.com/00003/mm
  • ⚠️ For production: Get production certificate (current is dev)

Testing mTLS:

# Test certificate loading
python scripts/test_mtls_cert.py

# Should show:
# ✓ Certificate loaded successfully
# ✓ Private key extracted
# ✓ Certificate chain: 3 certificates
# ✓ Valid PEM format

# Test DAM connection with mTLS
python scripts/test_connection.py --auth-pfx

# Should show:
# ✓ DAM connection OK (mTLS)
# ✓ Box connection OK
# ✓ Database connection OK

Authentication Comparison

Feature OAuth2 (Default) mTLS (--auth-pfx)
Base URL https://ppr.dam.ferrero.com/otmmapi https://dev-auth.app-api.ferrero.com/00003/mm
IP Whitelisting Not required Required
Certificate Not needed PFX/P12 file required
Production Ready Yes ⚠️ Need production cert
Flag None (default) --auth-pfx
Token Refresh Automatic Not applicable

New hybrid authentication method - Uses mTLS to get OAuth token, then direct DAM access. Solves APIM file size limits (500MB+)

Configuration (.env):

DAM_MTLS_OAUTH_URL=https://dev-auth.app-api.ferrero.com/00003/mm/v6_OAUTH
DAM_MTLS_CERT_PATH=config/certificates/dam-mtls-dev.pfx
DAM_MTLS_CERT_PASSWORD=your_password

Usage (add --auth-pfx-v2 flag):

python scripts/a1_to_a2_download.py --auth-pfx-v2

How it works:

  1. Authenticates with mTLS cert to DAM_MTLS_OAUTH_URL
  2. Receives OAuth bearer token
  3. Connects DIRECTLY to DAM (bypassing APIM gateway)
  4. Uploads/Downloads large files without size limits

🎼 Orchestration (New)

Recommended way to run all workflows.

Instead of managing multiple cron jobs, use the Orchestrator:

  • Runs every minute
  • Manages schedules for all scripts
  • Prevents overlaps (locks execution)
  • Handles Daily Report schedule (7 PM)
  • Logs to logs/orchestrator.log

Usage:

# Run manually (force all tasks)
python scripts/orchestrator.py --force

# Run via Cron (Standard)
python scripts/orchestrator.py

Configuration: Modify scripts/orchestrator.py TASKS list to change intervals.


🖥️ Server Deployment (Updated)

Step 1-5: Same as above

Step 6: Setup Cron Job (Orchestrator)

You only need ONE cron entry now:

crontab -e

Add:

# Ferrero Orchestrator (Runs every minute)
* * * * * cd /opt/ferrero-automation/Python-Version && venv/bin/python scripts/orchestrator.py >> logs/orchestrator_cron.log 2>&1

We now have dedicated systemd services for all components:

  1. Orchestrator Service (Main workflows)

    sudo cp ferrero-orchestrator-prod.service /etc/systemd/system/
    sudo systemctl enable --now ferrero-orchestrator-prod
    
  2. CreativeX Service (Scoring)

    sudo cp scripts/creativex-service.service /etc/systemd/system/
    sudo systemctl enable --now creativex-service
    
  3. Database Backups (Daily & Weekly)

    sudo cp ferrero-backup-*.{service,timer} /etc/systemd/system/
    sudo systemctl enable --now ferrero-backup-daily.timer
    sudo systemctl enable --now ferrero-backup-weekly.timer
    

View Logs:

# Orchestrator
tail -f logs/orchestrator_prod.log

# CreativeX
tail -f logs/creativex_service.log

# Backups
sudo journalctl -u ferrero-backup-daily.service

🎛️ Script Options & Flags

A1→A2 Download Script

# OAuth2 (default)
python scripts/a1_to_a2_download.py

# mTLS certificate authentication
python scripts/a1_to_a2_download.py --auth-pfx

# Help
python scripts/a1_to_a2_download.py --help

Available Flags:

  • --auth-pfx - Use mTLS certificate authentication instead of OAuth2

A2→A3 Upload Script

# OAuth2 (default)
python scripts/a2_to_a3_upload_polling.py

# mTLS authentication
python scripts/a2_to_a3_upload_polling.py --auth-pfx

# Keep files in Box after upload (testing)
python scripts/a2_to_a3_upload_polling.py --keep-files

# Force status update A2→A3 (testing)
python scripts/a2_to_a3_upload_polling.py --A3update

# Combination (mTLS + keep files + force update)
python scripts/a2_to_a3_upload_polling.py --auth-pfx --keep-files --A3update

# Help
python scripts/a2_to_a3_upload_polling.py --help

Available Flags:

  • --auth-pfx - Use mTLS certificate authentication
  • --keep-files - Don't delete files from Box after upload (for testing)
  • --A3update - Force campaign status update A2→A3 after upload (for testing)

A4 Webhook Monitor Script

# OAuth2 (default)
python scripts/a4_webhook_monitor.py

# mTLS authentication
python scripts/a4_webhook_monitor.py --auth-pfx

# Help
python scripts/a4_webhook_monitor.py --help

Available Flags:

  • --auth-pfx - Use mTLS certificate authentication

Behavior:

  • Processes ALL A4 campaigns in one run
  • Prevents duplicate webhooks via database tracking
  • Sends "Live Campaign: NO" webhook
  • Shows summary statistics

A5→A6 Download Script

# OAuth2 (default)
python scripts/a5_to_a6_download.py

# mTLS authentication
python scripts/a5_to_a6_download.py --auth-pfx

# Help
python scripts/a5_to_a6_download.py --help

Available Flags:

  • --auth-pfx - Use mTLS certificate authentication

Special Features:

  • Filters for ECOMMERCE STATUS = "NOT APPROVED" only
  • Extracts rejection comments from 3 reviewers
  • Reuses tracking IDs from A1→A2 if asset exists

B1→B2 Download Script

# OAuth2 (default)
python scripts/b1_to_b2_download.py

# mTLS authentication
python scripts/b1_to_b2_download.py --auth-pfx

# Help
python scripts/b1_to_b2_download.py --help

Available Flags:

  • --auth-pfx - Use mTLS certificate authentication

Special Features:

  • Filters for campaigns with type = "Global comm"
  • Uses "MASTERS_" prefix for Box folders
  • Searches Final Assets (not Master Assets)

Daily Report Script

# Generate daily report
python scripts/daily_report.py

# No authentication flags - analyzes local logs only

No flags needed - This script only reads local log files and doesn't connect to DAM.


Test Connection Script

# Test all connections with OAuth2
python scripts/test_connection.py

# Test all connections with mTLS
python scripts/test_connection.py --auth-pfx

Available Flags:

  • --auth-pfx - Test mTLS certificate authentication

Tests:

  • DAM connection (OAuth2 or mTLS)
  • Box connection
  • Database connection

Reset Campaign to A1 (Interactive)

Purpose: Reset any campaign from any A# status back to A1 for testing/retrying workflows.

# Reset any A# campaigns to A1 (shows all A2, A3, A4, A5, A6)
python scripts/reset_campaign_to_a1.py

# Reset only A4 campaigns to A1
python scripts/reset_campaign_to_a1.py --status A4

# With mTLS authentication
python scripts/reset_campaign_to_a1.py --auth-pfx

# Filter A2 campaigns with mTLS
python scripts/reset_campaign_to_a1.py --status A2 --auth-pfx

Available Flags:

  • --auth-pfx - Use mTLS certificate authentication
  • --status A2 - Filter for specific status (A2, A3, A4, A5, A6)

Interactive Behavior:

  • Shows campaign name, number, and current status
  • Asks yes/no for each campaign individually
  • Waits for user input before proceeding to next
  • Shows summary at the end

Example Output:

======================================================================
Campaign 1/2
======================================================================
Campaign: CONTENT SCALING 12345
Number:   C000000078
Current:  A4
======================================================================
Reset this campaign to A1? (yes/no): yes

Resetting campaign to A1...
✓ SUCCESS: Campaign reset to A1

Advance A1 to A3 (Interactive)

Purpose: Advance A1 campaigns directly to A3, skipping A2. Useful for testing A2→A3 upload workflow.

# Advance A1 campaigns to A3
python scripts/advance_a1_to_a3.py

# With mTLS authentication
python scripts/advance_a1_to_a3.py --auth-pfx

Available Flags:

  • --auth-pfx - Use mTLS certificate authentication

Interactive Behavior:

  • Finds all A1 campaigns
  • Shows campaign name and number
  • Asks yes/no for each campaign individually
  • Advances to A3 if user confirms
  • Shows summary at the end

Example Output:

======================================================================
Campaign 1/3
======================================================================
Campaign: CONTENT SCALING TEST
Number:   C000000551
Current:  A1
Target:   A3
======================================================================
Advance this campaign A1 → A3? (yes/no): yes

Advancing campaign A1 → A3...
✓ SUCCESS: Campaign advanced to A3

Use Case:

  • Testing A2→A3 upload workflow without running full A1→A2 download
  • Quickly setting up test campaigns in A3 status
  • Skipping the download step when you already have files in Box

Update Campaign Status (Targeted)

Purpose: Update a specific campaign to a specific status. Search by campaign number or name.

# Update specific campaign by number
python scripts/update_campaign_status.py --camp C000000078 --status A2

# Update by partial campaign name
python scripts/update_campaign_status.py --camp "CONTENT SCALING" --status A4

# Update to A3 with mTLS
python scripts/update_campaign_status.py --camp C000000551 --status A3 --auth-pfx

# Reset to A1
python scripts/update_campaign_status.py --camp "KINDER JOY" --status A1

Required Flags:

  • --camp - Campaign number (e.g., C000000078) or partial campaign name
  • --status - Target status (A1, A2, A3, A4, A5, A6, B1, B2)

Optional Flags:

  • --auth-pfx - Use mTLS certificate authentication

Interactive Behavior:

  • Searches all statuses for matching campaign
  • Shows current status and target status
  • If multiple matches, lets user choose which one
  • Asks for yes/no confirmation before updating
  • Shows success/failure result

Example Output:

======================================================================
Found 1 matching campaign(s)
======================================================================

======================================================================
Selected Campaign
======================================================================
Name:           CONTENT SCALING KINDER JOY
Number:         C000000553
Current Status: A4
Target Status:  A1
======================================================================

Update campaign status A4 → A1? (yes/no): yes

Updating campaign status...

======================================================================
✓ SUCCESS
======================================================================
Campaign:    CONTENT SCALING KINDER JOY
Number:      C000000553
New Status:  A1
======================================================================

Use Cases:

  • Quickly update specific campaign without searching through lists
  • Testing workflow progression: A1 → A2 → A3 → A4
  • Resetting specific campaigns for retesting
  • Works with campaign number OR campaign name (partial match)

⚙️ Configuration

Environment Variables (.env)

Required variables:

# Environment
ENV=staging  # or production

# DAM - OAuth2 (default)
DAM_BASE_URL=https://ppr.dam.ferrero.com/otmmapi
DAM_AUTH_URL=https://ppr.dam.ferrero.com/otdsws/oauth2/token
DAM_CLIENT_ID=otds-OLV
DAM_CLIENT_SECRET=your_secret_here

# DAM - mTLS (optional with --auth-pfx)
DAM_MTLS_BASE_URL=https://dev-auth.app-api.ferrero.com/00003/mm
DAM_MTLS_CERT_PATH=config/certificates/dam-mtls-dev.pfx
DAM_MTLS_CERT_PASSWORD=your_cert_password

# Box
BOX_CLIENT_ID=your_box_client_id
BOX_CLIENT_SECRET=your_box_secret
BOX_JWT_KEY_ID=your_jwt_key
BOX_PASSPHRASE=your_passphrase
BOX_ENTERPRISE_ID=43984435

# Box Folders
BOX_ROOT_FOLDER_A1_A2=348304357505  # Local Adaptation downloads
BOX_ROOT_FOLDER_A2_A3=348526703108  # Agency uploads
BOX_ROOT_FOLDER_B1_B2=349261192115  # Global Masters

# Database
DB_HOST=localhost
DB_PORT=5437
DB_USER=ferrero_user
DB_PASSWORD=ferrero_pass_2025

# Email (Mailgun SMTP)
SMTP_SERVER=smtp.mailgun.org
SMTP_PORT=587
SMTP_USER=your_smtp_user
SMTP_PASSWORD=your_smtp_password
SENDER_EMAIL=TWIST-UK-SERVER@oliver.agency
ERROR_EMAIL=daveporter@oliver.agency
REPORT_EMAILS=daveporter@oliver.agency

# Webhook
CAMPAIGN_STATUS_WEBHOOK_URL=https://hook.us1.make.celonis.com/your_webhook_id

Box JWT Configuration

File location: ../Box-config.json (one folder up from Python-Version)

Structure:

Ferrero-Opentext/
├── Box-config.json          ← Box JWT config
└── Python-Version/
    └── config/
        └── config.yaml      ← References ../Box-config.json

For server deployment, copy both:

scp Box-config.json user@server:/opt/ferrero-automation/
scp -r Python-Version/ user@server:/opt/ferrero-automation/

Field Mappings (Environment-Specific)

The system auto-detects the environment from the DAM URL and loads the appropriate config:

  • PPR: config/field_mappings_ppr.yaml (pre-production, ppr.dam.ferrero.com)
  • PROD: config/field_mappings_prod.yaml (production, dam.ferrero.com)

Each file defines: MVP fields, filename update rules, forced values, defaults, and asset type overrides.

Asset Type Mappings

config/asset_type_mappings.yaml maps 3-letter codes from the naming tool to DAM domain values (e.g., EHI -> heroimage, EOL -> externallegalopinion).

Last updated: April 16, 2026 per Scaling Agencies Metadata List. 38 asset types mapped (was 39). Changes:

  • Removed: CID, ECB, EBS, EOP, EUG, EWB, FPO, PKI, PRI
  • Added: EAN, ESI, NTB, PIR, PKC, PKT, SCP, SNC, UPI
  • Changed: DAT DAM code updated from digitalassettoolkit to digitalasset

Asset Representation Template

config/asset_representation_template.json is the reference template for folder-only mode (-N flag uploads). It contains the full field metadata structure that the DAM API requires for asset creation. This template was provided by the client and should be updated if the DAM metadata model changes.

Asset Type Overrides (EOL / LTD)

Certain asset types trigger field overrides configured in the field mappings file. Currently configured for both PPR and PROD:

EOL (External Legal Opinion)

  • Agency Name = "-"
  • Production House = "-"
  • Main Languages = "Global"
  • IP Rights = "Yes"
  • Licensing = "No"
  • Validity dates removed

LTD (Licensing Translation Document) — supports the EOL workflow with translated license claims. Same overrides as EOL, plus a fixed Description: "Translation of License claim - For approval purposes only". Currently mapped to the same DAM-side code (externallegalopinion) as a placeholder pending client confirmation.

These overrides are applied after all other field processing and take final precedence. An empty-string override removes the field; a non-empty override targeting a field that isn't in mvp_fields will be appended as a simple string field.


🗄️ Database

Connection Details

Default (Docker):

  • Host: localhost
  • Port: 5437
  • Database: ferrero_tracking
  • User: ferrero_user
  • Password: ferrero_pass_2025 (⚠️ CHANGE IN PRODUCTION!)

Schema

Tables:

  1. master_assets (35 columns)
    • tracking_id, opentext_id, full_metadata (JSONB)
    • global_master_campaign_id, local_campaign_id
    • All metadata fields
  2. derivative_assets - Localized assets
  3. asset_events - Audit log
  4. workflow_state - Workflow tracking

Complete schema: See DATABASE_SCHEMA.md or database/init.sql

Setup Database

# Docker method
docker run -d \
  --name ferrero-tracking-db \
  --restart always \
  -e POSTGRES_DB=ferrero_tracking \
  -e POSTGRES_USER=ferrero_user \
  -e POSTGRES_PASSWORD=ferrero_pass_2025 \
  -p 5437:5432 \
  -v $(pwd)/database/init.sql:/docker-entrypoint-initdb.d/init.sql \
  -v ferrero-db-data:/var/lib/postgresql/data \
  postgres:15

# Or native PostgreSQL
psql -h localhost -U postgres -f database/init.sql

📊 Monitoring

Log Files

All logs in logs/ directory:

  • a1_to_a2.log - A1→A2 workflow
  • a5_to_a6.log - A5→A6 workflow (rework)
  • b1_to_b2.log - B1→B2 workflow (global)
  • a2_to_a3.log - A2→A3 workflow (upload)
  • cron_*.log - Cron execution logs
  • daily_report.log - Report generation

Log rotation:

  • Max size: 10MB per file
  • Backups: 28 files (~1 month retention)
  • Auto-cleanup

View logs:

tail -f logs/a1_to_a2.log
tail -f logs/a5_to_a6.log

# Search for errors
grep -i error logs/*.log
grep -i "NOT APPROVED" logs/a5_to_a6.log

Database Queries

Connect to database:

PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking

Quick queries (from command line):

# View recent A1→A2 downloads
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT tracking_id, original_filename, brand_code, country_code, local_campaign_id, created_at FROM master_assets ORDER BY created_at DESC LIMIT 10;"

# View full details for recent assets
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT id, tracking_id, original_filename, file_extension, file_size_bytes, brand_code, country_code, language_code, local_campaign_id, global_master_campaign_id, created_at FROM master_assets ORDER BY created_at DESC LIMIT 20;"

# Count total assets
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT COUNT(*) as total_assets FROM master_assets WHERE status = 'active';"

# Assets by campaign
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT local_campaign_id, COUNT(*) as asset_count FROM master_assets WHERE status = 'active' GROUP BY local_campaign_id ORDER BY asset_count DESC;"

# View campaign status (webhooks)
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT campaign_number, campaign_name, status, live_campaign, webhook_sent, webhook_sent_at FROM campaign_status ORDER BY created_at DESC LIMIT 10;"

# View derivative assets (A2→A3 uploads)
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT tracking_id, derivative_filename, upload_status, uploaded_at FROM derivative_assets ORDER BY created_at DESC LIMIT 10;"

# View recent audit events
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT tracking_id, event_type, event_timestamp FROM asset_events ORDER BY event_timestamp DESC LIMIT 20;"

Advanced queries (in psql):

-- Assets with full metadata
SELECT tracking_id, original_filename,
       full_metadata->>'name' as metadata_name,
       jsonb_array_length(full_metadata->'metadata'->'metadata_element_list') as field_count,
       created_at
FROM master_assets
ORDER BY created_at DESC
LIMIT 10;

-- Global master relationships
SELECT local_campaign_id, global_master_campaign_id, COUNT(*) as asset_count
FROM master_assets
WHERE global_master_campaign_id IS NOT NULL
GROUP BY local_campaign_id, global_master_campaign_id
ORDER BY asset_count DESC;

-- Tracking ID with master and derivatives
SELECT
    m.tracking_id,
    m.original_filename as master_filename,
    m.local_campaign_id,
    COUNT(d.id) as derivative_count,
    m.created_at
FROM master_assets m
LEFT JOIN derivative_assets d ON m.tracking_id = d.tracking_id
GROUP BY m.tracking_id, m.original_filename, m.local_campaign_id, m.created_at
ORDER BY m.created_at DESC
LIMIT 20;

-- Assets by brand
SELECT brand_code, COUNT(*) as count
FROM master_assets
WHERE status = 'active'
GROUP BY brand_code
ORDER BY count DESC;

-- Check for duplicates
SELECT opentext_id, local_campaign_id, COUNT(*)
FROM master_assets
GROUP BY opentext_id, local_campaign_id
HAVING COUNT(*) > 1;

-- Recent workflow activity
SELECT workflow_name, campaign_id, last_status, last_run_at
FROM workflow_state
ORDER BY last_run_at DESC
LIMIT 10;

Database management:

# Initialize/reset database
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -f database/init.sql

# Backup database
pg_dump -h localhost -p 5437 -U ferrero_user ferrero_tracking > backup_$(date +%Y%m%d).sql

# View table sizes
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "\dt+"

# View all tables
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "\dt"

Email Notifications

All workflows send emails:

  • Success (green theme)
  • ⚠️ Partial/failures (orange theme)
  • Errors (red theme)
  • 📊 Daily report (blue dashboard)

Templates include:

  • Campaign details
  • Asset lists with tracking IDs
  • Box URLs
  • Rejection comments (A5→A6)
  • Error messages
  • Next steps

🔧 Troubleshooting

Connection Test Fails

# Run diagnostic
python scripts/test_connection.py

# Check each component
python scripts/test_connection.py --auth-pfx  # Test mTLS

Common issues:

  • DAM: Check credentials in .env
  • Box: Verify Box-config.json is one folder up
  • Database: Check PostgreSQL is running on port 5437
  • mTLS: Check you're on whitelisted IP

Box JWT File Not Found

Error: Failed to initialize Box client: [Errno 2] No such file or directory: '../Box-config.json'

Solution:

# Check Box-config.json location
ls -la ../Box-config.json

# Should be one folder up from Python-Version
# Structure:
#   /opt/ferrero-automation/
#   ├── Box-config.json
#   └── Python-Version/

Database Connection Fails

# Check Docker container running
docker ps | grep ferrero

# Check PostgreSQL service
sudo systemctl status postgresql

# Test connection manually
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT 1;"

No Campaigns Found

Expected behavior - scripts check every 5 minutes:

No A1 campaigns found - exiting

This is normal! The script will check again in 5 minutes.

Email Not Sending

# Check SMTP configuration in .env
grep SMTP .env

# Test email manually
python -c "from scripts.shared.notifier import Notifier; from shared.config_loader import load_config; n=Notifier(load_config('config/config.yaml')); print('OK')"

mTLS Certificate Issues

# Test certificate loads
python scripts/test_mtls_cert.py

# Should show:
# ✓ Certificate loaded successfully
# ✓ Valid PEM format

# Check permissions
ls -l config/certificates/*.pfx
# Should be: -rw------- (600)

# Fix if needed
chmod 600 config/certificates/*.pfx

Logs Not Created

# Create log directories
mkdir -p logs/backup
chmod 755 logs

📁 Complete File Structure

/opt/ferrero-automation/
├── Box-config.json                    ← Box JWT (one folder up!)
└── Python-Version/
    ├── .env                          ← Environment variables
    ├── config/
    │   ├── config.yaml              ← Main configuration
    │   ├── field_mappings.yaml      ← Field definitions
    │   └── certificates/            ← mTLS certificates (gitignored)
    │       └── dam-mtls-dev.pfx
    ├── database/
    │   └── init.sql                 ← Database schema
    ├── scripts/
    │   ├── a1_to_a2_download.py    ← A1→A2 workflow
    │   ├── a5_to_a6_download.py    ← A5→A6 workflow (rework)
    │   ├── b1_to_b2_download.py    ← B1→B2 workflow (global)
    │   ├── a2_to_a3_upload_polling.py  ← A2→A3 workflow
    │   ├── daily_report.py          ← Daily summary
    │   ├── test_connection.py       ← Connection tester
    │   ├── test_mtls_cert.py        ← Certificate tester
    │   ├── test_mtls_debug.py       ← mTLS debugger
    │   └── shared/
    │       ├── dam_client.py        ← DAM API (dual auth)
    │       ├── box_client.py        ← Box API
    │       ├── database.py          ← PostgreSQL
    │       ├── notifier.py          ← Email + webhooks (12 templates)
    │       └── config_loader.py     ← Config loader
    ├── logs/                        ← Auto-created
    ├── temp/                        ← Auto-created
    ├── venv/                        ← Virtual environment
    ├── requirements.txt             ← Python dependencies
    ├── DEPLOYMENT_GUIDE.md          ← Full deployment guide
    ├── DATABASE_SCHEMA.md           ← Schema documentation
    └── README.md                    ← This file

🔒 Security

Production Checklist

Before going live:

  • Change database password from default
  • Update all .env passwords
  • Get production mTLS certificate (if using mTLS)
  • Verify Box JWT credentials are production keys
  • Set file permissions:
    chmod 600 .env
    chmod 600 config/certificates/*.pfx
    chmod 600 ../Box-config.json
    
  • Configure firewall rules
  • Enable database backups
  • Set up log monitoring
  • Test all workflows end-to-end

Gitignored Files (Never Committed)

.env                     # All credentials
config/certificates/     # mTLS certificates
*.pfx, *.p12            # Certificate files
logs/                    # Log files
temp/                    # Temporary downloads
__pycache__/            # Python cache
*.pyc                    # Compiled files

🎯 Key Features

Recursive Folder Search (All Workflows)

  • Searches subfolders within Master/Final Assets folders
  • Preserves folder structure in Box
  • Logs subfolder paths

NOT APPROVED Filtering (A5→A6 Only)

  • Only downloads assets with ECOMMERCE STATUS = "NOT APPROVED"
  • Skips approved/other status assets
  • Extracts detailed rejection comments from 3 reviewers

Smart Tracking ID Management

  • 6-character unique IDs
  • Reuses existing IDs for rework assets (A5→A6)
  • Links master assets to derivatives
  • Full audit trail in database

Professional Email Notifications

  • Unified modern styling across all templates
  • Color-coded by severity (green/orange/red/blue)
  • Detailed asset lists with Box URLs
  • Rejection comments with reviewer names (A5→A6)
  • Collapsible sections in daily report

📞 Support

Getting Help

  1. Check logs: tail -f logs/a1_to_a2.log
  2. Test connections: python scripts/test_connection.py
  3. Check database: psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking
  4. Review config: cat config/config.yaml

Common Commands

# Activate environment
source venv/bin/activate

# Test everything
python scripts/test_connection.py

# Manual workflow run
python scripts/a1_to_a2_download.py

# Generate report
python scripts/daily_report.py

# Check cron logs
tail -f logs/cron_*.log

# Database stats
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT COUNT(*) FROM master_assets;"

📝 Additional Documentation

  • DEPLOYMENT_GUIDE.md - Complete server deployment guide
  • DATABASE_SCHEMA.md - Full database schema reference
  • config/field_mappings.yaml - Editable field definitions

Production Ready

Fully tested and working:

  • All 4 workflows automated
  • Recursive folder search
  • NOT APPROVED filtering
  • Email notifications (12 templates)
  • Daily summary reports
  • Dual authentication (OAuth2 + mTLS)
  • Complete database schema
  • Deployment documentation

Compatible with:

  • Python 3.6+ (server)
  • Python 3.10+ (local development)
  • PostgreSQL 15+
  • Box SDK 3.x
  • OpenText DAM API v6

Version: 2.1 - Production Ready Last Updated: April 16, 2026 Repository: bitbucket.org:zlalani/ferrero-opentext.git

🚀 Ready to deploy!