Commit graph

122 commits

Author SHA1 Message Date
DJP
87b593c5f9 Fix syntax errors in orchestrator-prod.py args 2025-11-26 15:34:14 -05:00
DJP
c53e79cbaf Add production orchestrator configuration 2025-11-26 15:31:32 -05:00
DJP
07bce09d65 Fix B1→B2 bug: total_assets referenced before assignment 2025-11-26 15:07:27 -05:00
DJP
8ca44fcf1e Add metadata diagnostic tool for troubleshooting field issues 2025-11-26 14:52:39 -05:00
DJP
f9c11ef3f5 Fix misleading log message for A5 campaigns with no rejections 2025-11-26 14:45:48 -05:00
DJP
795e4e7d96 Improve A5 notification logic to handle status changes 2025-11-26 14:26:36 -05:00
DJP
0f1c3dd0ec Prevent duplicate 'no rejections' emails for A5 campaigns 2025-11-26 14:25:57 -05:00
DJP
936071d7ad Change script intervals from 5 to 3 minutes for faster processing 2025-11-26 14:18:45 -05:00
DJP
f15ae9a8d1 Stream full script output to console in real-time 2025-11-26 14:04:15 -05:00
DJP
d6b68af5d5 Fix A5→A6 to also use OAuth authentication 2025-11-26 14:03:07 -05:00
DJP
16527f6e43 Temporarily disable mTLS auth, use OAuth for troubleshooting 2025-11-26 13:59:48 -05:00
DJP
3518f7c909 Remove A1->A2 Download task from orchestrator and add run guide 2025-11-26 13:53:48 -05:00
DJP
f9e57c7d57 docs: add orchestrator running instructions. 2025-11-26 13:51:14 -05:00
DJP
7599fe7cd2 feat: Remove A1->A2 Download script from orchestrator configuration 2025-11-26 13:48:48 -05:00
DJP
99d8621266 Increase throughput: process 2 campaigns in A1→A2, all files in A2→A3 2025-11-26 13:43:06 -05:00
DJP
98fb7eaee2 Fix smoke test to use prod-auth endpoint instead of test-auth 2025-11-26 10:20:59 -05:00
DJP
9d207d0480 chore: Update DAM mTLS base and OAuth URLs in production environment to /token endpoint. 2025-11-26 10:18:56 -05:00
DJP
6064b0971e Fix smoke test to explicitly load .env-prod file 2025-11-26 10:14:26 -05:00
DJP
d53b605f56 Update .env-prod with production mTLS configuration 2025-11-26 10:08:58 -05:00
DJP
cabc1d5548 Add production smoke test script for mTLS V2 authentication 2025-11-26 10:08:04 -05:00
DJP
c1f338022c fix: Ensure type field is added when updating CreativeX URL
- Modified _set_field_value to include 'type': 'string' in all code paths
- Adds type field when updating existing CreativeX URL field
- Ensures consistent structure whether creating or updating field
2025-11-25 09:14:25 -05:00
DJP
80316cad32 fix: Add missing type field to CreativeX URL metadata
- Added 'type': 'string' to FERRERO.FIELD.CREATIVEX LINK value structure
- Fixes DAM validation error for CreativeX URL field
- Structure now matches DAM requirements
2025-11-25 09:11:41 -05:00
DJP
548c30344b feat: Support multiple CreativeX platforms in metadata
- Updated creativex_scoring_storing.py to map multiple placements to platforms
- Modified get_mapped_platform to get_mapped_platforms (returns list)
- Updated a2_to_a3_upload_polling.py to retrieve platforms list from DB
- Enhanced metadata_extractor_mvp.py to build multi-value CreativeX field
- Added DAM-CX mappings.csv for channel/placement to platform mapping
- Supports single channel with multiple placements generating multiple Platform^Score values
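
As a rough illustration of the multi-value field described above, here is a minimal sketch. It assumes a hypothetical in-memory mapping in place of DAM-CX mappings.csv; the channel, placement, and platform names, the score, and the signature given to get_mapped_platforms are illustrative only and may not match the real script.

```python
# Hypothetical (channel, placement) -> platform rows; the real ones live in DAM-CX mappings.csv.
MAPPINGS = {
    ("Social", "Feed"): "Facebook",
    ("Social", "Stories"): "Instagram",
}


def get_mapped_platforms(channel, placements):
    """Return the de-duplicated platforms for a channel's placements, in input order."""
    platforms = []
    for placement in placements:
        platform = MAPPINGS.get((channel, placement))
        if platform and platform not in platforms:
            platforms.append(platform)
    return platforms


def build_platform_scores(platforms, score):
    """Build one Platform^Score value per platform."""
    return [f"{platform}^{score}" for platform in platforms]


# One channel with two placements yields two Platform^Score values.
platforms = get_mapped_platforms("Social", ["Feed", "Stories"])
values = build_platform_scores(platforms, 85)
```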
2025-11-24 14:44:11 -05:00
DJP
491fc8e938 feat: Add A1→A3 campaign advance script, introduce systemd service for orchestrator, and ref 2025-11-24 13:50:16 -05:00
DJP
0af15563bc feat: Implement new Python script locking, relocate PHP workflow, and update Python scripts and documentation. 2025-11-21 17:20:34 -05:00
DJP
22069ed66f refactor: Relocate test scripts to a dedicated tests/ directory and introduce orchestrator.py. 2025-11-21 17:10:04 -05:00
DJP
2cad2c2955 Fix AttributeError in DAMClient.test_connection 2025-11-21 16:57:43 -05:00
DJP
5aeab8d9a3 Update a1_to_a2_download.py to support Auth V2 2025-11-21 16:53:32 -05:00
DJP
6fe2ba234b Implement Auth V2 (Hybrid mTLS/OAuth) and update field mappings 2025-11-21 16:46:37 -05:00
DJP
b906434f67 Add A1->A2 and A4 Box CSV uploader scripts 2025-11-20 22:52:26 -05:00
DJP
20a187a61d feat: Add Python project dependencies. 2025-11-20 22:45:04 -05:00
DJP
e839816fbc Add non-technical user guide for agencies and creative teams
Simple, action-oriented guide focused on what users need to DO,
not how the system works internally. Perfect for onboarding and
daily reference.

USER_GUIDE.md (15-minute read):

Target Audience:
- Creative teams creating localized assets
- Agencies doing derivative work
- Campaign managers coordinating uploads
- Anyone who needs to USE the system (not maintain it)

Content Structure:

1. Big Picture (Simple Flowchart):
   - 6-step process diagram
   - "You do step 3-4, system handles rest"
   - Clear role definition

2. 3 Golden Rules:
   - Always use naming tool (never type manually)
   - Every asset needs CreativeX score (no exceptions)
   - Always use SAME tracking ID (for all versions)

3. Step-by-Step Workflow:
   - Receive email → Download → Localize → Score → Name → Upload
   - Each step explained in plain language
   - What to look for in emails
   - How to use naming tool (field-by-field)
   - Where to upload
   - What emails to expect

4. Rejection & Rework Process:
   - What rejection means (normal, not failure)
   - How to read rejection comments (Legal/IA&CC/Approver)
   - How to fix and re-upload
   - CRITICAL: Must re-score after fixes
   - SAME tracking ID, NEW job number

5. Common Questions (10 FAQs):
   - How to find tracking ID
   - Do I really need 200 scores? (Yes!)
   - What if typo in tracking ID?
   - Can I upload before scoring? (Yes but not recommended)
   - Wrong folder - what to do?
   - How long to process? (5 minutes max)
   - Can I edit filename? (NO!)

6. Troubleshooting:
   - "File not processed" → Check folder, filename, tracking ID
   - "Score=0 but I uploaded PDF" → Check filename match
   - "Error: wrong tracking ID" → Copy from email exactly

7. Quick Checklist:
   - 15-point checklist before upload
   - 7 additional steps for rework
   - All checkboxes format

8. What NOT to Do (5 critical don'ts):
   - Don't type manually
   - Don't skip CreativeX
   - Don't reuse tracking IDs across campaigns
   - Don't upload to wrong folder
   - Don't edit generated filenames

9. Quick Reference Tables:
   - Box folders and when to use
   - Email types and meanings
   - Naming tool field guide
   - Contact information

Key Differences from Technical Guide:
- No system architecture
- No database schemas
- No Python code
- No technical troubleshooting
- No server commands

Covers instead:
- What to click
- Where to upload
- How to use naming tool
- What emails mean
- How to fix common mistakes
- Who to contact

Tone:
- Friendly and supportive
- Clear and direct
- Action-oriented ("Do this, not that")
- Visual with tables and checklists
- Assumes no technical knowledge

Examples Are Real-World:
- Actual tracking IDs (pOiJ9s, a7K9mP)
- Actual folder IDs (348526703108)
- Real error messages users will see
- Common typos (pOlJ9s vs pOiJ9s)

Length: ~800 lines (~20 pages when formatted)

Perfect for:
- New agency onboarding
- Quick reference during work
- Sharing with non-technical stakeholders
- Training sessions

Complements COMPLETE_WORKFLOW_GUIDE.md (technical deep-dive)
with practical hands-on instructions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 14:35:46 -05:00
DJP
e9acff76c5 Add comprehensive A1→A4 workflow diagram with A5-A6 rejection cycle at top
Adds master flowchart at document start showing complete workflow from
campaign creation through approval, including legal/compliance rejection
and rework cycles.

Master Workflow Diagram Features:

7 Phases Visualized:
1. Campaign Creation (A1 setup)
2. A1→A2 Master Download (automated every 5 min)
3. Agency Localization + CreativeX Scoring
4. A2→A3 Derivative Upload (automated)
5. Legal/Compliance/Brand Approval
6. A5→A6 Rejection Download (automated)
7. Agency Rework + Re-upload

Rejection Cycle Details:
- Legal reviewer adds compliance comments
- IA&CC reviewer adds brand guideline feedback
- General approver adds creative feedback
- All comments sent to agency in single email
- Agency fixes issues
- Re-scores with CreativeX (mandatory)
- Re-uploads with SAME tracking ID but NEW job number
- Re-enters A2→A3 flow (can repeat multiple times)

Color Coding:
🟣 Purple - CreativeX scoring (CRITICAL, highlighted twice)
🔵 Blue - Tracking IDs (critical links)
🔴 Red - Rejection path and comments
🟢 Green - Success/completion
🟠 Orange - Rework loop warnings

Critical Requirements Called Out:
1. "🔴 CRITICAL: Submit EVERY derivative to CreativeX"
   - 200 derivatives = 200 analyses required
   - Emphasized in agency phase

2. "🔴 Re-submit to CreativeX"
   - MUST get new score for fixed version
   - Emphasized in rework phase

3. Legal/IA&CC/Approver comment flow
   - Shows 3 different reviewer types
   - All feedback consolidated in email

4. Tracking ID reuse
   - Blue highlighting shows where tracking IDs critical
   - Same ID used throughout rework cycles

Example Shown:
- Original: 6666_NUT_SUMMER_OLV_30S_16x9_DE_de_pOiJ9s.mp4
- Rework:  7777_NUT_SUMMER_OLV_30S_16x9_DE_de_pOiJ9s.mp4
           ↑ New job number, SAME tracking ID ↑
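
The rework rule in this example could be checked with a small helper like the one below. This is only a sketch: it assumes the job number is the first underscore-separated token and the tracking ID is the last one, and the function names are hypothetical.

```python
import os


def tracking_id(filename):
    """Return the last underscore-separated token of the name, without the extension."""
    stem, _ext = os.path.splitext(filename)
    return stem.rsplit("_", 1)[-1]


def job_number(filename):
    """Return the first underscore-separated token of the name."""
    return filename.split("_", 1)[0]


def is_valid_rework(original, rework):
    """A rework keeps the tracking ID and changes the job number."""
    return (tracking_id(original) == tracking_id(rework)
            and job_number(original) != job_number(rework))


original = "6666_NUT_SUMMER_OLV_30S_16x9_DE_de_pOiJ9s.mp4"
rework = "7777_NUT_SUMMER_OLV_30S_16x9_DE_de_pOiJ9s.mp4"
ok = is_valid_rework(original, rework)
```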

Decision Points Visualized:
- All assets successful? (A1→A2)
- Score found in database? (A2→A3)
- Approved? (A3 review)
- Loops back if rejected

Placement:
- At very top of document (lines 11-136)
- Before Table of Contents
- First thing users see
- Sets context for entire guide

Impact:
Users immediately see complete workflow including rejection paths and
understand CreativeX is required at TWO points: initial upload AND rework.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 14:22:05 -05:00
DJP
594c93c905 Add critical section: Every derivative requires its own CreativeX score
Emphasizes mandatory requirement that EVERY derivative asset must be
scored individually - master scores do not apply to derivatives.

New Section Added (lines 1,830-2,055):
"🔴 CRITICAL REQUIREMENT: Every Derivative Needs Its Own CreativeX Score"

Key Points Emphasized:

The Rule:
- 1 master → 10 derivatives = 10 scores required
- 1 master → 200 derivatives = 200 scores required
- NO EXCEPTIONS for client deliverables

Why Every Derivative Needs Scoring:
1. Each localization is different execution (voice-over, subtitles, pacing)
2. Master score is reference only, NOT applicable to derivatives
3. Scores are asset-specific, not campaign-specific
4. Reporting requires individual scores for market analysis

The Math Example:
- 1 master asset
- 20 markets × 2 languages = 40 derivatives
- × 3 aspect ratios = 120 versions
- × 2 platforms = 240 total derivatives
- REQUIRED: 240 CreativeX PDF reports
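
The same arithmetic, written out as a quick check (variable names are illustrative):

```python
markets = 20
languages_per_market = 2
aspect_ratios = 3
platforms = 2

derivatives = markets * languages_per_market  # market/language combinations
versions = derivatives * aspect_ratios        # each combination in every aspect ratio
total = versions * platforms                  # each version on every platform

# One CreativeX PDF report is required per derivative.
required_reports = total
```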

What Happens Without Scoring:
- Assets upload with Score=0 (default)
- Looks unprofessional (zero implies poor quality)
- Incomplete reporting (can't analyze market performance)
- Inconsistent quality control
- Client questions raised

Correct Workflow (Mermaid Sequence Diagram):
- Create derivative
- Submit to CreativeX
- Get PDF report
- Upload PDF to Box 350605024645
- Run scoring script
- THEN upload derivative to A2→A3
- Score automatically attached

Batch Processing Strategy:
- Week-by-week approach for 100+ derivatives
- Create 25 → Score 25 → Upload 25 (repeat)
- NOT: Create all → Upload all → Score later (too late)

Common Mistake Addressed:
"I'll only score the important ones" → Why this fails:
- Client contracts may require all scored
- Reporting incomplete
- Selective scoring = inconsistent quality
- Score=0 is visible and looks bad

Verification Methods:
- Count derivatives vs scores in database (should match)
- Check logs for "CreativeX Score Missing" warnings
- Query database before and after uploads

Exceptions Where Score=0 Acceptable:
- Internal reference assets
- Template files
- Work-in-progress
- Non-creative assets (documents, spreadsheets)

Impact:
Users now understand CreativeX scoring is NOT optional. Every asset
created requires individual analysis and scoring. Plan projects with
this requirement from the start.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 13:53:15 -05:00
DJP
ba7b68a38d Add comprehensive 70-page workflow guide with Mermaid diagrams
Complete end-to-end documentation of Ferrero DAM asset production
system covering all workflows, tools, processes, and troubleshooting.

COMPLETE_WORKFLOW_GUIDE.md (3,438 lines, ~70 pages):

1. Executive Overview:
   - System purpose and benefits
   - Time savings metrics (2 hours → 30 seconds)
   - Error reduction (15-20% → <1%)
   - Stakeholder roles
   - High-level architecture Mermaid diagram

2. Complete Asset Lifecycle:
   - Phase 1-6 detailed walkthroughs
   - From brief to live asset
   - All workflow stages explained
   - Master flow, revision flow, global flow diagrams

3. Naming Convention Tool:
   - Why naming is CRITICAL for automation
   - V2.1 structure field-by-field
   - How to use the tool (step-by-step)
   - Real examples: wrong vs right filenames
   - What happens when naming fails

4. Python Automation Scripts:
   - All 6 scripts explained in detail
   - Cron schedule and why every 5 minutes
   - Shared modules (DAM client, Box client, Database, Parser, etc.)
   - Sequence diagrams for each workflow
   - Dependencies and data flow

5. Detailed Workflows with Mermaid Diagrams:
   - A1→A2: 15-step sequence diagram
   - A2→A3: 25-step sequence diagram
   - A5→A6: Rejection/rework cycle flowchart
   - B1→B2: Global masters flow
   - Each with database queries, email examples, error scenarios

6. CreativeX Integration:
   - Master vs derivative scores explained
   - PDF extraction workflow (LlamaExtract AI)
   - Database storage (3 statuses)
   - A2→A3 integration and fallbacks
   - Version tracking
   - Dual-source flow diagram

7. Database Architecture:
   - Entity Relationship Diagram (ERD)
   - 5 tables explained (master_assets, derivative_assets, creativex_scores, etc.)
   - JSONB usage for metadata
   - Backup strategy (daily + weekly)
   - Query examples for each table

8. Common Mistakes & Pitfalls (15+ scenarios):
   - Wrong field order (V1 vs V2.1)
   - Manual typing errors
   - Tracking ID typos
   - Wrong Box folder
   - Timing issues
   - CreativeX filename mismatches
   - Intentional shortcuts and why they fail
   - Process violations
   - Real examples with consequences

9. Monitoring & Health Checks:
   - Email notification guide (what each color means)
   - Daily report interpretation
   - Log file locations and commands
   - Health check procedures

10. Troubleshooting Guide:
    - Symptom-based troubleshooting
    - "File not processing" → 4-step checklist
    - "CreativeX score shows 0" → diagnosis
    - "Wrong metadata uploaded" → recovery
    - Emergency procedures (DB corruption, all workflows failing)

11. Reference Materials:
    - Glossary of terms
    - FAQ (10 common questions)
    - Box folder ID reference
    - Status code meanings
    - Quick command reference
    - Contact information and escalation path

12. Golden Rules:
    - For campaign managers (7 do's and don'ts)
    - For agencies (10 critical rules)
    - For approvers (4 best practices)
    - For operations (5 monitoring practices)

Mermaid Diagrams Included (10 diagrams):
1. System architecture overview (5 layers)
2. Complete asset lifecycle flow
3. A1→A2 sequence diagram (detailed)
4. A2→A3 sequence diagram (detailed)
5. Rejection/rework cycle flowchart
6. Global masters (B1→B2) flow
7. CreativeX dual-source architecture
8. Database ERD with relationships
9. Naming error cascade
10. Emergency recovery flowchart

Key Topics Covered:
- Why naming matters (automation dependency)
- All workflow stages (A1→A6, B1→B2)
- Tracking ID lifecycle and reuse
- CreativeX integration (master vs derivative)
- Asset type mapping (45 types)
- Common mistakes with real examples
- Error scenarios and recovery
- Monitoring and troubleshooting
- Complete reference materials

Target Audience:
- Campaign managers (understand workflow)
- Creative teams (know requirements)
- Agencies (follow processes correctly)
- Approvers (provide good feedback)
- Operations (monitor and troubleshoot)
- New team members (onboarding)
- Executives (high-level overview)

Document serves as:
- Training material for new users
- Reference guide for daily work
- Troubleshooting resource
- Process documentation
- System architecture reference

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 13:50:53 -05:00
DJP
0bf3c34cd0 Add asset type mapping from 3-letter codes to DAM codes
Maps frontend naming tool 3-letter codes (EHI, IMG, TVC) to DAM's
lowercase descriptive codes (heroimage, keyvisual, tvc) for uploads.

New File: config/asset_type_mappings.yaml
- 45 asset type mappings from DAM lookup domains
- Maps 3-letter codes to DAM codes
- E-Commerce types: EHI→heroimage, EBS→beautyshot, etc.
- Standard types: IMG→keyvisual, TVC→tvc, etc.
- Expandable as new asset types added

Field Mappings Update (field_mappings.yaml):
- Added FERRERO.FIELD.MKTG.ASSET TYPE to filename_updates
- Source: asset_type (from parsed filename)
- Required: true

Metadata Extractor Updates (metadata_extractor_mvp.py):
- Added _load_asset_type_mappings() method
- Added _map_asset_type() method
- Integrated mapping into _update_fields()
- Logs mappings: "Asset type mapping: EHI -> heroimage"
- Warns if no mapping found (may fail DAM validation)

Example Flow:
1. Filename: ROC_TEST-E2E2_EHI_1x1_DE_de.png
2. Parse: asset_type = "EHI"
3. Map: EHI → "heroimage"
4. Update field: FERRERO.FIELD.MKTG.ASSET TYPE = "heroimage"
5. DAM accepts "heroimage" (valid domain value)
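
The flow above can be sketched roughly as follows. The dictionary holds only the four mappings named in this message rather than the full YAML file, and returning the unmapped code unchanged is an assumption about what the real _map_asset_type() does; map_asset_type here is a stand-in, not the actual method.

```python
import logging

logger = logging.getLogger("asset_type_mapping")

# Subset of the mappings, taken from the examples in this commit message.
ASSET_TYPE_MAPPINGS = {
    "EHI": "heroimage",
    "EBS": "beautyshot",
    "IMG": "keyvisual",
    "TVC": "tvc",
}


def map_asset_type(code, mappings=ASSET_TYPE_MAPPINGS):
    """Return the DAM code for a 3-letter code, or the code unchanged if unmapped."""
    dam_code = mappings.get(code.upper())
    if dam_code is None:
        # Assumption: pass the raw code through and let DAM validation decide.
        logger.warning("No asset type mapping for %s (may fail DAM validation)", code)
        return code
    logger.info("Asset type mapping: %s -> %s", code, dam_code)
    return dam_code
```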

Without Mapping (before):
- Field value: "EHI"
- DAM validation: FAILS (EHI not in domain)

With Mapping (after):
- Field value: "heroimage"
- DAM validation: PASSES

Complete Mapping List (45 types):
E-Commerce: ECA, ECB, EBS, EBR, EEM, EHI, EIL, EOP, EUG, EWB
Standard: 3RT, APC, BBK, BRC, BSG, CKV, CID, DAT, FLA, FNT, GDT,
         GRG, IMG, FPO, LGL, LOG, MLF, OLV, PAW, PKI, POS, PDM,
         PRI, QRC, SND, SIP, SGL, TVC, VIE

DAM Answer: Uses descriptive lowercase codes, not 3-letter codes.
Frontend uses 3-letter for brevity, backend maps to DAM format.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 13:08:01 -05:00
DJP
d1ab8551e5 Add asset type filename update and make field updates configurable
Enables asset type to be updated from derivative filename and refactors
_update_fields() to use filename_updates configuration dynamically.

Field Mappings Configuration (field_mappings.yaml):

Added to filename_updates:
- FERRERO.FIELD.MKTG.ASSET TYPE:
    source: asset_type
    required: true

Now updates from derivative filename:
- ROC_TEST-E2E2_EHI_1x1_DE_de.png → Asset Type = "EHI"

Metadata Extractor Refactor (metadata_extractor_mvp.py):

Old _update_fields():
- Hardcoded field updates (ASSET NAME, DESCRIPTION, STATE)
- Not using filename_updates configuration
- Required code changes to add new fields

New _update_fields():
- Dynamically processes filename_updates from config
- Supports transform: uppercase/lowercase
- Supports any source field from parsed_filename
- Uses forced_values from config (was hardcoded before)
- Add new fields via config, no code changes needed

Configuration-Driven Updates:
- ARTESIA.FIELD.ASSET NAME ← clean_filename
- ARTESIA.FIELD.ASSET DESCRIPTION ← subject_title
- FERRERO.FIELD.MKTG.ASSET TYPE ← asset_type (NEW)
- MAIN_LANGUAGES ← language_code (uppercase)
- FERRERO.FIELD.STATE ← "Local" (forced value)
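
A configuration-driven update of this kind might look like the sketch below. The dictionary layout is an assumed shape for the relevant part of field_mappings.yaml after loading, and build_field_updates is a hypothetical helper rather than the real _update_fields().

```python
# Assumed shape of the relevant part of field_mappings.yaml, already loaded into a dict.
CONFIG = {
    "filename_updates": {
        "ARTESIA.FIELD.ASSET NAME": {"source": "clean_filename"},
        "ARTESIA.FIELD.ASSET DESCRIPTION": {"source": "subject_title"},
        "FERRERO.FIELD.MKTG.ASSET TYPE": {"source": "asset_type", "required": True},
        "MAIN_LANGUAGES": {"source": "language_code", "transform": "uppercase"},
    },
    "forced_values": {"FERRERO.FIELD.STATE": "Local"},
}

TRANSFORMS = {"uppercase": str.upper, "lowercase": str.lower}


def build_field_updates(parsed_filename, config=CONFIG):
    """Return {field_id: value} driven entirely by the config."""
    updates = {}
    for field_id, rule in config["filename_updates"].items():
        value = parsed_filename.get(rule["source"])
        if value is None:
            if rule.get("required"):
                raise ValueError(f"Missing required source '{rule['source']}' for {field_id}")
            continue
        transform = TRANSFORMS.get(rule.get("transform"))
        updates[field_id] = transform(value) if transform else value
    # Forced values are applied last so they always win.
    updates.update(config["forced_values"])
    return updates
```

Under this layout, adding a field means adding one entry to filename_updates; the function itself does not change.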

Benefits:
- Asset type now correctly populated from filename
- Configuration-driven (add fields without code changes)
- Cleaner code (uses config instead of hardcoded logic)
- Forced values also configurable
- Easier to maintain and extend

Example:
Filename: ROC_TEST-E2E2_EHI_1x1_DE_de.png
Parsed asset_type: "EHI"
Field FERRERO.FIELD.MKTG.ASSET TYPE updated to: "EHI"

Impact:
All A2→A3 uploads will now have correct Asset Type from derivative
filename instead of inheriting from master (which may be different).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 12:58:10 -05:00
DJP
193dd00bd9 Update CreativeX documentation with tracking_id column and master scores
Documents database migration for tracking_id column and explains
master-cx-score status for scores extracted from A1→A2 master assets.

Documentation Updates:

Database Table Creation:
- Added tracking_id VARCHAR(6) to CREATE TABLE statement
- Added idx_creativex_tracking_id index
- Included migration SQL for existing databases (ALTER TABLE)

Status Values Documented:
- 'active' - Current derivative score (PDF extraction)
- 'superseded' - Old derivative scores (version history)
- 'master-cx-score' - Master asset score (A1→A2, reference only)

Workflow Section Split:
- CreativeX PDF Extraction (manual process)
- Master Asset CreativeX (automatic during A1→A2)
- Clarifies master scores NOT used for uploads

New Query Examples:
- Get master score by tracking_id
- View all master scores
- Count records by status (shows master_scores separately)
- Updated history queries to include tracking_id column

Migration Instructions:
Production servers with existing creativex_scores table should run:
ALTER TABLE creativex_scores ADD COLUMN tracking_id VARCHAR(6);
CREATE INDEX idx_creativex_tracking_id ON creativex_scores(tracking_id);

Use Case Clarification:
Master scores stored for reference/reporting/analytics only.
A2→A3 uploads always use filename-based lookup (derivative scores).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 14:19:22 -05:00
DJP
e44f42dad5 Add master CreativeX score extraction and storage in A1→A2 workflow
Stores master asset CreativeX scores from DAM metadata during A1→A2
download for reference/reporting purposes (not used in uploads).

Database Changes:

creativex_scores table:
- Added: tracking_id VARCHAR(6) column
- Added: idx_creativex_tracking_id index
- Updated comment: status can be 'active', 'superseded', or 'master-cx-score'

Status Values:
- 'active' - Current derivative score (from PDF extraction)
- 'superseded' - Old derivative score (version history)
- 'master-cx-score' - Master asset score (from A1→A2 DAM metadata) ← NEW

Migration SQL (for existing databases):
ALTER TABLE creativex_scores ADD COLUMN tracking_id VARCHAR(6);
CREATE INDEX idx_creativex_tracking_id ON creativex_scores(tracking_id);

Database Method Updates (database.py):

store_creativex_score() signature:
- Added: tracking_id parameter (optional, default None)
- Added: status parameter (optional, default 'active')

Logic:
- If status='master-cx-score': Simple insert, no versioning
- If status='active': Soft delete versioning as before
- Always stores tracking_id if provided
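
The branching above could be sketched as follows. This uses an in-memory sqlite3 database purely as a stand-in for PostgreSQL, assumes a score column alongside the filename, tracking_id, and status columns mentioned in this message, and interprets "soft delete versioning" as marking the previous active row superseded. The real database.py may differ on all three points.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL database
conn.execute(
    "CREATE TABLE creativex_scores ("
    "id INTEGER PRIMARY KEY, filename TEXT, score INTEGER, "
    "tracking_id VARCHAR(6), status TEXT)"
)
conn.execute("CREATE INDEX idx_creativex_tracking_id ON creativex_scores(tracking_id)")


def store_creativex_score(conn, filename, score, tracking_id=None, status="active"):
    """Insert a score; for 'active' rows, first mark older active rows as superseded."""
    if status == "active":
        conn.execute(
            "UPDATE creativex_scores SET status = 'superseded' "
            "WHERE filename = ? AND status = 'active'",
            (filename,),
        )
    # Master scores (status='master-cx-score') are a plain insert with no versioning.
    conn.execute(
        "INSERT INTO creativex_scores (filename, score, tracking_id, status) "
        "VALUES (?, ?, ?, ?)",
        (filename, score, tracking_id, status),
    )
    conn.commit()


store_creativex_score(conn, "file.mp4", 70)
store_creativex_score(conn, "file.mp4", 82)
store_creativex_score(conn, "nutella_pbased.jpg", 85, tracking_id="7xXgKp",
                      status="master-cx-score")
```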

A1→A2 Script Updates (a1_to_a2_download.py):

New Function: extract_creativex_from_dam_metadata()
- Searches metadata_element_list for CREATIVEX fields
- Extracts FERRERO.TAB.FIELD.CREATIVEX (score)
- Extracts FERRERO.FIELD.CREATIVEX LINK (url)
- Returns dict with score/url or None if not found
- Handles tabular field structure for score
- Handles nested value structure for URL
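
A heavily simplified sketch of such an extraction is shown below. The element layout (dicts with "id" and "value" keys, tabular columns nested under a further metadata_element_list) is an assumption about the DAM response, so the helpers search recursively instead of relying on a fixed path; only the two field IDs and the function name come from this message.

```python
SCORE_FIELD = "FERRERO.TAB.FIELD.CREATIVEX"
URL_FIELD = "FERRERO.FIELD.CREATIVEX LINK"


def _find_element(node, field_id):
    """Depth-first search for a dict whose 'id' equals field_id."""
    if isinstance(node, dict):
        if node.get("id") == field_id:
            return node
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = _find_element(child, field_id)
        if found is not None:
            return found
    return None


def _unwrap(value):
    """Follow nested {'value': ...} wrappers down to the innermost value."""
    while isinstance(value, dict) and "value" in value:
        value = value["value"]
    return value


def extract_creativex_from_dam_metadata(asset):
    """Return a dict with 'score' and/or 'url', or None when neither has a value."""
    elements = asset.get("metadata_element_list", [])
    result = {}
    for key, field_id in (("score", SCORE_FIELD), ("url", URL_FIELD)):
        element = _find_element(elements, field_id)
        value = _unwrap(element.get("value")) if element is not None else None
        # Skip missing, empty, or still-structured values.
        if value is None or value == "" or isinstance(value, (dict, list)):
            continue
        result[key] = value
    return result or None
```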

Integration:
- After successful master asset storage
- Extracts CreativeX from asset metadata
- If found: Stores in creativex_scores with status='master-cx-score'
- Links to master via tracking_id
- Logs when score found/stored
- Logs "normal" when not found (not all masters are scored)

Use Cases:

A2→A3 Upload:
- Still uses filename-based lookup ONLY
- No changes to A2→A3 logic
- Master scores not used for uploads

Reporting/Analytics Tools:
- Can query master score by tracking_id
- Compare master vs derivative scores
- Track score improvements
- Audit trail

Query Examples:
-- Get master score for tracking ID
SELECT * FROM creativex_scores
WHERE tracking_id = '7xXgKp' AND status = 'master-cx-score';

-- Get derivative score for filename
SELECT * FROM creativex_scores
WHERE filename = 'file.mp4' AND status = 'active';

Test Record Created:
- Filename: nutella_pbased.jpg
- Tracking ID: 7xXgKp
- Score: 85
- Status: master-cx-score

Benefits:
- Historical reference of master scores
- Enables score comparison analytics
- No impact on A2→A3 upload logic
- Automatic extraction during A1→A2
- Optional (works even if masters don't have scores)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 13:56:22 -05:00
DJP
68180b23cf Update filename parser to V2.1 structure with new field positions
Complete rewrite of filename parser to support new field order where
Subject/Asset moved up and Country/Language moved down, plus new
Social Media field.

BREAKING CHANGE: V2.1 Structure (November 2025)

Old (V1):
[JOB]_[BRAND]_[COUNTRY]_[LANG]_[SUBJECT]_[ASSET]_[SPOT]_[DUR]_[RATIO]_[TRACKING]

New (V2.1):
[JOB]_[BRAND]_[SUBJECT]_[ASSET]_[DUR]_[RATIO]_[SPOT]_[COUNTRY]_[LANG]_[SOCIAL]_[TRACKING]

Field Position Changes:
- Subject Title: Position 5 → 3 (MOVED UP)
- Asset Type: Position 6 → 4 (MOVED UP)
- Duration: Position 8 → 5 (MOVED UP)
- Aspect Ratio: Position 9 → 6 (MOVED UP)
- Spot Version: Position 7 → 7 (SAME)
- Country Code: Position 3 → 8 (MOVED DOWN)
- Language Code: Position 4 → 9 (MOVED DOWN)
- Social Media: NEW → Position 10
- Tracking ID: Position 10 → 11

New Social Media Field:
- Field: social_media_version
- Position: 10 (after language, before tracking)
- Format: 3 uppercase letters
- Codes: FBP, FBR, IGF, IGR (expandable)
- Optional: Only present for social media assets

Parse Algorithm Changes:
- Positions 1-4 now: Job, Brand, Subject, Asset (fixed)
- Positions 5-11: Pattern-based detection (flexible)
- Duration detected by \d+S pattern
- Aspect ratio detected by \d+x\d+ or contains 'x'
- Spot detected by MST/REF
- Country detected by 2 upper alpha (after ratio)
- Language detected by 2-3 lower alpha (after country)
- Social detected by known codes (after language)
- Tracking ID detected by 6 alphanumeric + optional -N
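
The detection rules above can be approximated with an ordered list of patterns, as in this sketch. It is not the actual parser: it assumes a job number is always present, that the subject contains no underscores, and it uses only the \d+x\d+ form for aspect ratio. The function name and the returned key names (other than social_media_version) are hypothetical.

```python
import os
import re

SOCIAL_CODES = {"FBP", "FBR", "IGF", "IGR"}

# Optional fields in V2.1 order, each with the pattern that identifies it.
OPTIONAL_FIELDS = [
    ("duration", re.compile(r"\d+S")),
    ("aspect_ratio", re.compile(r"\d+x\d+")),
    ("spot_version", re.compile(r"MST|REF")),
    ("country_code", re.compile(r"[A-Z]{2}")),
    ("language_code", re.compile(r"[a-z]{2,3}")),
    ("social_media_version", re.compile("|".join(sorted(SOCIAL_CODES)))),
    ("tracking_id", re.compile(r"[A-Za-z0-9]{6}(?:-N)?")),
]


def parse_filename(filename):
    """Parse a V2.1 filename into a dict; absent optional fields are None."""
    stem, ext = os.path.splitext(filename)
    tokens = stem.split("_")
    if len(tokens) < 4:
        raise ValueError(f"Expected at least job, brand, subject, asset: {filename}")
    # Positions 1-4 are fixed.
    parsed = dict(zip(("job_number", "brand", "subject_title", "asset_type"), tokens))
    parsed["extension"] = ext
    # Positions 5-11 are matched by pattern, in order, skipping absent fields.
    rest = tokens[4:]
    index = 0
    for name, pattern in OPTIONAL_FIELDS:
        if index < len(rest) and pattern.fullmatch(rest[index]):
            parsed[name] = rest[index]
            index += 1
        else:
            parsed[name] = None
    if index != len(rest):
        raise ValueError(f"Unrecognized token '{rest[index]}' in {filename}")
    return parsed
```

Because each pattern is tried only against the next unconsumed token, a missing field simply leaves the token for the following pattern, which is what keeps the order-dependent rules (country after ratio, language after country) intact.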

strip_upload_components() Updated:
Now outputs: [BRAND]_[SUBJECT]_[ASSET]_[DUR]_[RATIO]_[SPOT]_[COUNTRY]_[LANG]_[SOCIAL]
- Includes social media version if present
- Still strips job number and tracking ID
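
A minimal sketch of the stripping step, under the assumption that the job number is always the first token and that a trailing token of six alphanumerics (optionally followed by -N) is the tracking ID. The real strip_upload_components() presumably works from the parsed fields instead.

```python
import os
import re

TRACKING_ID = re.compile(r"[A-Za-z0-9]{6}(?:-N)?")


def strip_upload_components(filename):
    """Drop the leading job number and the trailing tracking ID, if present."""
    stem, ext = os.path.splitext(filename)
    tokens = stem.split("_")[1:]  # drop the leading job number
    if tokens and TRACKING_ID.fullmatch(tokens[-1]):
        tokens = tokens[:-1]  # drop the trailing tracking ID
    return "_".join(tokens) + ext
```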

Testing:
All 7 test cases from specification passed:
- All fields present
- Minimal (no duration/social/tracking)
- No duration
- No spot version
- With -N tracking (folder-only mode)
- No social media (most common)
- No tracking ID

Example:
Input:  1234567_RAF_TEST_OLV_6S_1x1_REF_GL_it_IGF_abc123.mp4
Parsed: brand=RAF, subject=TEST, country=GL, lang=it, social=IGF
Clean:  RAF_TEST_OLV_6S_1x1_REF_GL_it_IGF.mp4

Backward Compatibility:
None - system not live yet, clean cutover to V2.1 format only.

Backup: filename_parser_v1_backup.py contains old version for reference.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 13:41:19 -05:00
DJP
15eb47fc43 Add CreativeX fields to asset representation if missing from master metadata
Fixes issue where CreativeX score field was not appearing in final upload
because it didn't exist in the master metadata from DAM.

Problem:
- Master metadata from A1→A2 doesn't include CREATIVEX fields (new fields)
- _update_creativex_fields() only UPDATED existing fields
- If field not present, it logged error but didn't add the field
- Result: CREATIVEX score missing from upload, only URL appeared

Solution:
- Check if CREATIVEX Score field exists in mvp_fields
- If NOT found: Create and append field with proper structure
- If found: Update value as before
- Same logic for CREATIVEX URL field

Field Structures Created:

CREATIVEX Score (FERRERO.TAB.FIELD.CREATIVEX):
- Type: MetadataTableField (tabular field)
- Parent: FERRERO.TABULAR.FIELD.PLATFORMRATING
- Data type: INTEGER
- Value structure: {'value': {'value': score}}

CREATIVEX URL (FERRERO.FIELD.CREATIVEX LINK):
- Type: MetadataField (regular field)
- Data type: CHAR
- Value structure: {'value': {'value': url}}
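
The update-or-append behaviour described above might be sketched like this. The "id" key, the extra attribute names, and the sample values are assumptions for illustration; only the field IDs and the {'value': {'value': ...}} structure come from this message.

```python
SCORE_FIELD = "FERRERO.TAB.FIELD.CREATIVEX"
URL_FIELD = "FERRERO.FIELD.CREATIVEX LINK"


def upsert_field(mvp_fields, field_id, value, **attributes):
    """Update the field's value if it exists, otherwise append a new field."""
    for field in mvp_fields:
        if field.get("id") == field_id:
            field["value"] = {"value": {"value": value}}
            return "updated"
    new_field = {"id": field_id, "value": {"value": {"value": value}}}
    new_field.update(attributes)  # e.g. data type or parent, names are hypothetical
    mvp_fields.append(new_field)
    return "added"


mvp_fields = []  # master metadata without any CreativeX fields
first = upsert_field(mvp_fields, SCORE_FIELD, 85, data_type="INTEGER",
                     parent="FERRERO.TABULAR.FIELD.PLATFORMRATING")
second = upsert_field(mvp_fields, SCORE_FIELD, 90)
upsert_field(mvp_fields, URL_FIELD, "https://example.com/report", data_type="CHAR")
```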

Logging:
- Changed from ERROR to WARNING when field not found
- Logs "adding it now" instead of just error
- Confirms field added with value

Impact:
Both CreativeX fields will now appear in uploads even if master
metadata doesn't have them (common for older campaigns downloaded
before CreativeX integration).

Testing:
Run with --dryrun to verify both CREATIVEX fields in JSON output.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 12:44:17 -05:00
DJP
39a41df21d Fix CreativeX lookup to use original Box filename not stripped version
Changes database lookup strategy to match on full filename as it appears
in Box and in the CreativeX PDF report filename field.

Critical Design Change:

Old (incorrect):
- Strip job number and tracking ID from Box filename
- Lookup: NUT_PL_pl_TEST-E2E_EHI_1x1.png
- Database has: 6487512_NUT_PL_pl_TEST-E2E_EHI_1x1_7xXgKp.png
- RESULT: No match found, uses defaults

New (correct):
- Use original Box filename for lookup
- Lookup: 6487512_NUT_PL_pl_TEST-E2E_EHI_1x1_7xXgKp.png
- Database has: 6487512_NUT_PL_pl_TEST-E2E_EHI_1x1_7xXgKp.png
- RESULT: Match found, uses actual score

Rationale:
The CreativeX PDF report contains a "filename" field that stores the
actual asset filename including job number and tracking ID. This is
the name that gets extracted by LlamaExtract and stored in database.

The A2→A3 workflow receives files from Box with the SAME filename
structure (job_brand_country_lang_subject_trackingID.ext).

Therefore, we match on the complete original filename, not the stripped
version.

Database Storage Pattern:
- CreativeX PDF named: anything.pdf (name doesn't matter)
- PDF contains field: filename = "6487512_NUT_PL_pl_TEST-E2E_EHI_1x1_7xXgKp.png"
- Database stores: filename = "6487512_NUT_PL_pl_TEST-E2E_EHI_1x1_7xXgKp.png"
- A2→A3 receives: 6487512_NUT_PL_pl_TEST-E2E_EHI_1x1_7xXgKp.png from Box
- Lookup matches exactly
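
The difference between the two lookup strategies can be shown with a dictionary standing in for the creativex_scores table. The score of 85 is a made-up value and lookup_score is a hypothetical helper; the default of 0 mirrors the Score=0 fallback used when no match is found.

```python
# Stand-in for the creativex_scores table, keyed by the filename stored from the PDF.
scores = {"6487512_NUT_PL_pl_TEST-E2E_EHI_1x1_7xXgKp.png": 85}


def lookup_score(filename, scores, default=0):
    """Exact-match lookup; falls back to the default when nothing matches."""
    return scores.get(filename, default)


box_filename = "6487512_NUT_PL_pl_TEST-E2E_EHI_1x1_7xXgKp.png"
stripped_filename = "NUT_PL_pl_TEST-E2E_EHI_1x1.png"

new_result = lookup_score(box_filename, scores)       # original Box filename
old_result = lookup_score(stripped_filename, scores)  # stripped name never matches
```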

Clean filename still used for DAM upload, only the lookup is on original.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 12:29:55 -05:00
DJP
2fef8878cd Add verbose debugging to _set_field_value for CreativeX troubleshooting
Adds detailed logging to trace exactly how field values are being set
and diagnose why CreativeX score/URL aren't appearing in final JSON.

Changes to _set_field_value():
- Logs field ID being updated
- Logs current field['value'] structure BEFORE setting
- Logs which code path is taken (nested vs created)
- Logs field['value'] structure AFTER setting
- Shows full JSON structure at each step

Output Example:
_set_field_value called for: FERRERO.TAB.FIELD.CREATIVEX with value: 85
Current field['value']: {
  "is_locked": false,
  "domain_value": false,
  ...
}
Created field['value'] = {'value': {'value': 85}}
After setting, field['value']: {
  "value": {
    "value": 85
  }
}

Purpose:
Diagnose why CreativeX fields show empty value dicts in asset
representation even though logs say "Set CREATIVEX Score to: 0".

This verbose logging will show:
1. What the field structure looks like before we set it
2. Which code path is executed
3. What the field structure looks like after we set it
4. Whether the value is actually being placed in the right location

Run with --dryrun to see full debug output without uploading.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 12:16:53 -05:00
DJP
d3722d1bb2 Fix CreativeX URL logging to say 'from database' not 'from Box'
Corrects misleading log messages and adds debugging for URL field.

Changes:
- Changed: "Updating CreativeX URL from Box"
- To: "Updating CreativeX URL from database"
- Added url_field_found flag
- Added URL field structure logging
- Added error handling with traceback
- Logs error if URL field not found in mvp_fields

Now both CreativeX fields log correctly:
- "Updating CreativeX Score from database: 0"
- "Updating CreativeX URL from database: https://..."

Accurate logging shows data source is PostgreSQL creativex_scores
table, not Box metadata templates (which are no longer used).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 12:11:58 -05:00
DJP
899d15322b Improve CreativeX field value setting with better error handling
Enhances _set_field_value() to handle empty value structures and
adds detailed logging for debugging CreativeX field issues.

Changes to metadata_extractor_mvp.py:

_set_field_value() Enhancement:
- Handles empty nested dicts by creating value structure
- If field['value']['value'] is empty dict, creates {'value': value}
- If field['value'] is empty dict, creates {'value': {'value': value}}
- Preserves existing behavior for populated structures
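
The two empty-structure cases above can be sketched roughly as follows. This is a minimal illustration, not the code in metadata_extractor_mvp.py: the name `set_field_value_sketch`, the treatment of a missing `value` key as empty, and the populated-structure path (overwriting only the innermost `value`) are all assumptions.

```python
def set_field_value_sketch(field: dict, value) -> dict:
    """Hypothetical stand-in for _set_field_value(); mutates and returns field."""
    outer = field.get("value")

    # Case 1: field['value'] is an empty dict (or, by assumption, missing):
    # build the full nesting {'value': {'value': value}}.
    if not isinstance(outer, dict) or not outer:
        field["value"] = {"value": {"value": value}}
        return field

    inner = outer.get("value")

    # Case 2: field['value']['value'] is an empty dict (or, by assumption,
    # missing): create {'value': value} and keep sibling keys of the outer dict.
    if not isinstance(inner, dict) or not inner:
        outer["value"] = {"value": value}
        return field

    # Case 3 (assumed existing behavior): the structure is populated,
    # so only the innermost value is overwritten.
    inner["value"] = value
    return field
```

Keeping each case as an early return makes it easy to add one log line per branch, which is what the later debugging commits rely on to show which path ran.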

CreativeX Score Field Debugging:
- Added score_field_found flag to detect if field exists in mvp_fields
- Logs field structure before attempting to set value
- Shows dict keys to understand nesting
- Catches and logs full traceback on errors
- Logs an error if the CREATIVEX Score field is not found in mvp_fields
- Changed log: "from Box" → "from database" (accurate)

CreativeX URL Field:
- Existing logic preserved
- Uses same enhanced _set_field_value()

Purpose:
Diagnose why the CreativeX score is not appearing in the asset
representation even though logs show "Set CREATIVEX Score to: 0".

Next Steps:
Run with --dryrun to see field structure logging and verify values
are being set correctly in the JSON output.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 12:11:22 -05:00
DJP
cdb839e3a8 Add --dryrun mode to A2→A3 and fix Agency name to Oliver
Adds debugging mode to A2→A3 workflow that builds full asset metadata
but doesn't upload to DAM, displaying complete JSON for field validation.

Changes to A2→A3 Script (a2_to_a3_upload_polling.py):

--dryrun Flag:
- New argument: --dryrun (build metadata but don't upload)
- Displays full asset representation as formatted JSON
- Shows field count
- Shows CreativeX lookup status
- Keeps file in Box (no deletion)
- Logs "DRYRUN MODE" clearly
- Returns success with 'DRYRUN_NO_UPLOAD' as asset_id
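
A minimal sketch of how such a flag could be wired up, assuming `argparse` is used for argument parsing. The helper names `parse_args` and `process_asset`, the `upload` callable, and the `fields` key inside the asset representation are hypothetical and are not taken from a2_to_a3_upload_polling.py.

```python
import argparse
import json


def parse_args(argv=None):
    """Parse command-line arguments; --dryrun defaults to False."""
    parser = argparse.ArgumentParser(description="A2->A3 upload (sketch)")
    parser.add_argument(
        "--dryrun",
        action="store_true",
        help="build metadata but don't upload",
    )
    return parser.parse_args(argv)


def process_asset(asset_representation: dict, dryrun: bool, upload=None) -> dict:
    """Print the metadata in dryrun mode, otherwise call the upload callable."""
    if dryrun:
        separator = "=" * 60
        print(separator)
        print("DRYRUN MODE: metadata built, nothing uploaded")
        print(json.dumps(asset_representation, indent=2))
        # 'fields' is an assumed key name for the list of MVP fields.
        print(f"Field count: {len(asset_representation.get('fields', []))}")
        print(separator)
        return {"success": True, "asset_id": "DRYRUN_NO_UPLOAD"}

    # Real path: 'upload' is a hypothetical callable returning the new asset ID.
    asset_id = upload(asset_representation)
    return {"success": True, "asset_id": asset_id}
```

Returning a sentinel asset ID instead of `None` lets downstream code treat a dry run as a success without special-casing it.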

Dryrun Output Includes:
- Complete asset_representation JSON (all MVP fields)
- Field count (should be 27 fields)
- CreativeX status (found/missing)
- CreativeX score and URL values
- Clean separation with === lines

Usage:
python scripts/a2_to_a3_upload_polling.py --dryrun

Benefits:
- Debug metadata issues without DAM uploads
- Verify all fields present before going live
- Check that the CreativeX integration is working
- Validate field values and formatting
- Safe testing with production data

Changes to Field Mappings (config/field_mappings.yaml):

Agency Name Fixed:
- Changed: FERRERO.MARKETING.FIELD.AGENCY NAME: "-"
- To: FERRERO.MARKETING.FIELD.AGENCY NAME: "Oliver"
- Exact case as required by DAM
- Comment updated to reflect this is final value

Impact:
- All A2→A3 uploads now have Agency Name = "Oliver"
- Not "Oliver Agency" (wrong)
- Not "-" placeholder (old)
- Exact case: "Oliver" (capital O, remaining letters lowercase)

Use Case:
Run with --dryrun to see full JSON metadata, verify Agency name is
"Oliver", check all 27 MVP fields are present, then remove --dryrun
flag to perform actual uploads.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 11:59:55 -05:00
DJP
0f84b4be38 Add production cutover TODO checklist
Comprehensive checkbox-based cutover plan with 25 major sections and
200+ individual tasks for safe production deployment.

CUTOVER-TODOS.md Structure:

Pre-Cutover Phase:
✓ 1-3: Obtain credentials (DAM, Box, Database, Email, Webhooks, CreativeX, mTLS)
✓ 4-8: Backup current system and prepare server

Installation Phase:
✓ 9-14: Deploy code, install dependencies, configure environment, setup backups

Go/No-Go Decision:
✓ 15: Pre-go-live validation with 11-point criteria checklist

Go-Live Phase:
✓ 16-19: Cutover execution, enable automation, monitor first campaigns

Post-Cutover Monitoring:
✓ 20-21: Hourly checks (24h), daily checks (7 days)

Validation & Sign-Off:
✓ 22-23: Week 1 validation, production sign-off with signatures

Rollback Procedures:
✓ 24: Emergency rollback steps if issues found

Completion:
✓ 25: Final steps and documentation updates

Features:
- 200+ checkbox items for tracking progress
- Fill-in-the-blank fields for credentials and IDs
- Pass/fail checkboxes for validation
- Timestamp fields for tracking when steps completed
- Notes section for issues encountered
- Lessons learned template
- Sign-off section with signatures
- Emergency contacts quick reference
- Status tracking (Not Started / In Progress / Complete / Rolled Back)

Benefits:
- Clear task ownership and accountability
- Visual progress tracking
- Nothing forgotten or skipped
- Document becomes permanent record of cutover
- Audit trail for compliance
- Reusable for future deployments

Use Case:
Print or use as working document during cutover. Check off each item
as completed. Fill in actual values (IDs, URLs, results). Archive as
permanent record of production deployment.

Complements CUTOVER.md (detailed procedures) with actionable checklist.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 09:19:32 -05:00
DJP
32337df2b8 Add comprehensive production cutover plan
Complete guide for transitioning from development/staging to production
environment with detailed checklists, validation steps, and rollback procedures.

CUTOVER.md Contents:

Pre-Cutover:
- Complete credential inventory (DAM, Box, Database, Email, Webhooks)
- Development vs Production configuration comparison
- Backup and safety procedures
- Testing checklist before go-live

Cutover Phases:
1. Preparation (1-2 days before)
   - Update server code
   - Install dependencies
   - Update database schema
   - Configure production credentials

2. Testing (Before cutover)
   - Connection tests (DAM, Box, Database, Email, Webhook)
   - CreativeX extraction test
   - Dry run workflows with production data
   - Verify all integrations

3. Go/No-Go Decision
   - 11-point checklist for go-live approval
   - Clear criteria for proceeding or delaying

4. Go-Live Execution
   - Step-by-step cutover procedure
   - Cron job configuration
   - Initial monitoring plan

Post-Cutover Monitoring:
- Hourly checks (first 24 hours)
- Daily checks (first week)
- Weekly review tasks
- Success metrics and KPIs

Rollback Plan:
- Quick rollback to staging
- Database restore procedures
- Code reversion steps
- Configuration rollback

Production Differences:
- DAM URLs (staging vs production)
- Box folder IDs
- OAuth2 credentials
- Email recipients
- Webhook endpoints
- Database passwords
- mTLS certificates

Support Procedures:
- Monitoring commands
- Troubleshooting guides
- Emergency contacts
- Escalation path

Communication Plan:
- Stakeholder notification
- Email templates
- Status update schedule

Checklists:
- Infrastructure readiness
- Code deployment
- Configuration
- Testing completion
- Automation setup
- Sign-off approvals

Features:
- Complete credential inventory
- Step-by-step cutover procedure
- Testing and validation at each phase
- Clear go/no-go criteria
- Comprehensive rollback plan
- Post-cutover monitoring plan
- Troubleshooting procedures
- Communication templates

Purpose:
Provides operations team with complete roadmap for safe, structured
transition to production with minimal risk and clear rollback options.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 09:11:59 -05:00
DJP
ca43aaf1bc Remove email notification when no CreativeX files found
Changes behavior to only log when Box folder is empty, not send emails.

Rationale:
- Empty folder is normal operation (not an error condition)
- Reduces email noise when script runs on cron
- Still logs the event for monitoring
- Similar to other workflows (a1_to_a2, etc.) that don't email when there is no work to do

Changes:
- Removed notifier.send_email() call for 'creativex_no_files' template
- Enhanced log message: "No PDF files found - this is normal when folder is empty"
- Added: "Script completed successfully with no files to process"
- Still returns success=True (not an error)
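
The resulting control flow might look like the sketch below. The function name `process_pdf_folder`, the logger name, and the shape of the returned dict are assumptions; only the two log messages and the success result come from the change description.

```python
import logging

logger = logging.getLogger("creativex_sketch")  # hypothetical logger name


def process_pdf_folder(pdf_files: list) -> dict:
    """Log (rather than email) when there is nothing to process."""
    if not pdf_files:
        # Empty folder is normal operation: log only, no notifier call.
        logger.info("No PDF files found - this is normal when folder is empty")
        logger.info("Script completed successfully with no files to process")
        return {"success": True, "processed": 0}

    # Placeholder for the real per-file extraction work.
    return {"success": True, "processed": len(pdf_files)}
```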

Template 'creativex_no_files' retained for potential future use but not called.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 20:58:09 -05:00
DJP
e0128d98b8 Add automated PostgreSQL backup and restore system
Implements dual backup strategy with daily SQL dumps and weekly binary
backups, complete with restore capabilities and health monitoring.

Backup System Components:

1. database/backup.sh:
   - Daily mode: pg_dump SQL dumps (7-day retention)
   - Weekly mode: pg_basebackup binary backup (latest only)
   - Automatic cleanup of old backups
   - Compression (gzip) for space efficiency
   - Email notifications on failures
   - Docker-compatible execution

2. database/restore.sh:
   - Restore from SQL dump backups
   - Safety backup before restore
   - Confirmation prompts
   - Validation and verification
   - List available backups

3. database/check_backups.sh:
   - Health check monitoring
   - Verifies latest backup age (warns if > 25 hours)
   - Displays backup counts and sizes
   - Quiet mode for cron automation
   - Lists all available backups
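
The age check in the health monitor can be illustrated in Python, although the actual check lives in a shell script. This sketch assumes dumps are gzip files matching `*.sql.gz` and that file modification time is used as the backup time; both are assumptions, as are the function names.

```python
from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional

MAX_AGE = timedelta(hours=25)  # warning threshold from the description above


def latest_backup_age(dump_dir: Path, now: Optional[datetime] = None) -> Optional[timedelta]:
    """Return the age of the newest dump, or None if there are no dumps."""
    now = now or datetime.now()
    # '*.sql.gz' is an assumed naming pattern for the daily dumps.
    files = [p for p in dump_dir.glob("*.sql.gz") if p.is_file()]
    if not files:
        return None
    newest_mtime = max(p.stat().st_mtime for p in files)
    return now - datetime.fromtimestamp(newest_mtime)


def is_healthy(age: Optional[timedelta]) -> bool:
    """A missing backup or one older than 25 hours counts as unhealthy."""
    return age is not None and age <= MAX_AGE
```

A 25-hour threshold rather than 24 leaves an hour of slack, so a daily job that runs slightly late does not trigger a warning.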

Documentation:

- DATABASE_BACKUP_GUIDE.md: Complete backup/restore guide
  - Automated cron setup
  - Manual backup procedures
  - Restore scenarios
  - Troubleshooting
  - Disk space management

- backups/README.md: Quick reference
  - Directory structure
  - Common commands
  - Retention policy
  - Security notes

Configuration:

- Updated .gitignore to exclude backup files
- Backup locations: backups/dumps/, backups/basebackups/
- Logs: logs/backup.log, logs/restore.log
- Retention: 7 daily dumps + 1 weekly basebackup
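
The retention rule for daily dumps can be sketched as a count-based prune. This assumes dump filenames embed a sortable timestamp, so that lexical order matches chronological order, and that retention keeps the seven newest files; backup.sh may instead prune by file age. The function name is hypothetical.

```python
def dumps_to_delete(dump_names: list[str], keep: int = 7) -> list[str]:
    """Return the dump names that fall outside the newest `keep` files."""
    # Newest first, relying on the assumed sortable timestamp in each name.
    ordered = sorted(dump_names, reverse=True)
    return ordered[keep:]
```

For example, given ten dumps named `dump_2025-11-01.sql.gz` through `dump_2025-11-10.sql.gz`, the three oldest (01 to 03) would be selected for deletion.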

Cron Schedule (Production):
- Daily: 2:00 AM (pg_dump)
- Weekly: Sundays 3:00 AM (pg_basebackup)
- Health Check: 8:00 AM daily

Features:
- Automated daily and weekly backups
- Dual strategy (logical + physical)
- Space-efficient (7-day retention, ~50 MB total)
- Safety backups before restore
- Email alerts on failures
- Health monitoring
- Docker-compatible
- Tested locally

Testing Performed:
- Daily backup created successfully (77K compressed)
- Backup file integrity verified (gzip test passed)
- Health check shows "Backup system healthy"
- Restore --list command working
- All scripts executable and functional

Disk Usage Estimate:
- Daily dumps: 7 × ~2 MB = ~14 MB
- Weekly backup: 1 × ~30 MB = ~30 MB
- Total: ~44 MB (under the ~50 MB maximum)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 17:30:10 -05:00