DJP d818a8b6a8 Update comprehensive README and reorganize documentation files

Major README overhaul with complete deployment and configuration guide.
Moved old docs to tests/ folder for archive.

README.md UPDATES (880 lines - completely rewritten):
✓ Table of contents with navigation
✓ Complete overview of all 4 workflows + daily report
✓ Detailed authentication section (OAuth2 vs mTLS)
✓ Box-config.json location explanation
✓ Server deployment step-by-step guide
✓ Database setup (Docker + native PostgreSQL)
✓ Cron job examples for all workflows
✓ Comprehensive troubleshooting section
✓ Security checklist
✓ Monitoring and log rotation details
✓ Common SQL queries
✓ File structure diagram

KEY SECTIONS ADDED:
1. What's Included - All 5 scripts explained
2. Quick Start - Local setup guide
3. Server Deployment - 6-step process with commands
4. Workflows - Detailed process for each (A1→A2, A5→A6, B1→B2, A2→A3, Daily Report)
5. Authentication - OAuth2 vs mTLS with examples
6. Configuration - All .env variables documented
7. Database - Schema, setup, queries
8. Monitoring - Logs, emails, database queries
9. Troubleshooting - Common issues + solutions
10. File Structure - Complete directory tree

BOX-CONFIG.JSON LOCATION DOCUMENTED:
✓ Must be one folder up from Python-Version
✓ Referenced as ../Box-config.json in config.yaml
✓ Server deployment instructions include copying both files
✓ Troubleshooting section explains file not found errors

MTLS DOCUMENTATION:
✓ Different base URL explained (dev-auth.app-api.ferrero.com)
✓ --auth-pfx flag usage
✓ Whitelisted IP requirement noted
✓ Certificate testing commands

REORGANIZATION:
- Moved old DEPLOYMENT.md → tests/DEPLOYMENT.md (archive)
- Moved old WORKFLOW_DIAGRAMS.md → tests/WORKFLOW_DIAGRAMS.md (archive)
- New DEPLOYMENT_GUIDE.md is the current deployment doc
- README.md is now comprehensive one-stop documentation

Changes:
- Python-Version/README.md (completely rewritten, 880 lines)
- Moved 2 old docs to tests/ folder
- Added test files to tests/ folder

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-11-05 10:57:11 -05:00

27 KiB

Raw Permalink Blame History

Python Automation Workflow Diagrams

Detailed Mermaid flowcharts for each Python automation script.

A1 to A2 - Local Adaptation Master Asset Downloader

Script: a1_to_a2_download.py Trigger: Cron (every 5 minutes) Purpose: Download Local Adaptation master assets and upload to Box

flowchart TD
    Start([Cron triggers script every 5 min]) --> Init[Initialize Clients: DAM, Box, Database, Notifier]
    Init --> TestConn{Test All Connections}
    TestConn -->|Any Failed| ExitFail([Exit with error code 1])
    TestConn -->|All OK| Search[Search DAM for campaigns with Status A1 and Campaign Type Local Adaptation]

    Search --> CheckCampaigns{Found A1 Campaigns?}
    CheckCampaigns -->|No| LogNone[Log: No A1 campaigns found]
    LogNone --> ExitClean([Exit with code 0])

    CheckCampaigns -->|Yes| SelectFirst[Select FIRST campaign only]
    SelectFirst --> GetMasterAssets[Get Master Assets from 01 Master Assets folder]

    GetMasterAssets --> CheckAssets{Found Assets?}
    CheckAssets -->|No| SkipCampaign[Log: No master assets found]
    SkipCampaign --> ExitFail

    CheckAssets -->|Yes| FindFinalFolder[Find 01 Final Assets folder for upload directory]
    FindFinalFolder --> InitCounters[Initialize: processed count 0, failed count 0]

    InitCounters --> LoopAssets[For Each Asset in Campaign]
    LoopAssets --> DownloadAsset[Download asset from DAM to temp folder]

    DownloadAsset --> GenTrackID[Generate unique 6-char tracking ID from database]
    GenTrackID --> CreateBoxFolder[Create or find Box folder: CampaignID-CampaignName]
    CreateBoxFolder --> UploadBox[Upload to Box with tracking ID suffix: filename_ABC123.ext]

    UploadBox --> StoreDB[Store in PostgreSQL: tracking ID, DAM asset ID, Box file ID, full metadata JSONB, upload directory]

    StoreDB --> CheckSuccess{Upload and Storage Successful?}
    CheckSuccess -->|Yes| IncrementSuccess[Increment processed count, Add to success list]
    CheckSuccess -->|No| IncrementFail[Increment failed count, Add to failed list with error]

    IncrementSuccess --> CleanTemp[Delete temp file]
    IncrementFail --> CleanTemp

    CleanTemp --> MoreAssets{More Assets to Process?}
    MoreAssets -->|Yes| LoopAssets

    MoreAssets -->|No| AllDoneCheck{All Assets Successful? processed equals total}

    AllDoneCheck -->|No| LogPartial[Log: Campaign incomplete - some failed]
    LogPartial --> SendPartialEmail[Send email: a1_to_a2_partial with success and failed lists]
    SendPartialEmail --> NoStatusUpdate[Status stays A1 - will retry on next run]
    NoStatusUpdate --> ExitFail

    AllDoneCheck -->|Yes| LogAllDone[Log: All assets processed successfully]
    LogAllDone --> UpdateStatus[Update campaign status A1 to A2 in DAM]

    UpdateStatus --> StatusCheck{Status Update Successful?}
    StatusCheck -->|No| LogStatusFail[Log: Status update failed]
    LogStatusFail --> ExitFail

    StatusCheck -->|Yes| SendWebhook[Send webhook to Make.com: campaign ID, campaign number, status change, asset list, timestamp]
    SendWebhook --> SendEmail[Send email: a1_to_a2_complete with all processed assets, tracking IDs, Box URLs]
    SendEmail --> LogSuccess[Log: Campaign completed successfully]
    LogSuccess --> ExitSuccess([Exit with code 0])

    style Start fill:#e1f5e1
    style ExitSuccess fill:#e1f5e1
    style ExitFail fill:#ffe1e1
    style ExitClean fill:#e1f5e1
    style AllDoneCheck fill:#fff3cd
    style StatusCheck fill:#fff3cd

A2 to A3 - Box to DAM Upload Handler (Polling)

Script: a2_to_a3_upload_polling.py Trigger: Cron (every 5 minutes) Purpose: Upload agency-processed files from Box to DAM

flowchart TD
    Start([Cron triggers script every 5 min]) --> Init[Initialize Clients: DAM, Box folder 348526703108, Database, Notifier, FilenameParser, MetadataExtractorMVP]
    Init --> TestConn{Test All Connections}
    TestConn -->|Any Failed| ExitFail([Exit with error code 1])
    TestConn -->|All OK| ListBox[List all files in Box folder 348526703108]

    ListBox --> CheckFiles{Found Files?}
    CheckFiles -->|No| LogNoFiles[Log: No files found in Box folder]
    LogNoFiles --> ExitClean([Exit with code 0])

    CheckFiles -->|Yes| FilterValid[Filter for valid V2 filenames with tracking IDs]
    FilterValid --> CheckValid{Found Valid V2 Files?}
    CheckValid -->|No| LogNoValid[Log: No valid V2 files to process]
    LogNoValid --> ExitClean

    CheckValid -->|Yes| SelectFirst[Select FIRST file only - more processed on next run]
    SelectFirst --> ParseFilename[Parse V2 filename: extract all 10 components]

    ParseFilename --> ValidateV2{V2 Filename Valid?}
    ValidateV2 -->|No| LogInvalid[Log: Invalid V2 filename with errors]
    LogInvalid --> SendFailEmail[Send email: upload failed with validation errors]
    SendFailEmail --> ExitFail

    ValidateV2 -->|Yes| ExtractTracking[Extract tracking ID from filename]
    ExtractTracking --> CheckTracking{Tracking ID Exists?}
    CheckTracking -->|No| LogNoTracking[Log: No tracking ID in filename]
    LogNoTracking --> ExitFail

    CheckTracking -->|Yes| LoadMaster[Load master metadata from PostgreSQL by tracking ID]
    LoadMaster --> CheckMaster{Master Asset Found?}
    CheckMaster -->|No| LogNoMaster[Log: No master asset for tracking ID]
    LogNoMaster --> ExitFail

    CheckMaster -->|Yes| GetBoxMeta[Read Box metadata template: ferrerodammetadata]
    GetBoxMeta --> ExtractCreativeX[Extract: creativexScore, creativexUrl]

    ExtractCreativeX --> DownloadBox[Download file from Box to temp folder]
    DownloadBox --> StripFilename[Strip OMG Job Number and Tracking ID from filename]

    StripFilename --> BuildMVP[Build MVP asset representation: Load 24 fields from master, Add 3 missing from filename, Update Description State Language, Update CreativeX from Box metadata, Total 27-28 fields]

    BuildMVP --> RenameTemp[Rename temp file to clean filename]
    RenameTemp --> UploadDAM[Upload to DAM Final Assets folder with MVP metadata and video dimensions]

    UploadDAM --> CheckUpload{Upload Successful HTTP 201 or 202?}
    CheckUpload -->|No| LogUploadFail[Log: Upload failed with error]
    LogUploadFail --> CleanFail[Delete temp file]
    CleanFail --> SendFailEmail2[Send email: upload failed with error details]
    SendFailEmail2 --> ExitFail

    CheckUpload -->|Yes| StoreDerivative[Store derivative asset record in database: tracking ID and filename]
    StoreDerivative --> DeleteBox[Delete file from Box after successful upload]

    DeleteBox --> CleanSuccess[Delete local temp file]
    CleanSuccess --> SendSuccessEmail[Send email: a2_to_a3_file_uploaded with asset ID, tracking ID, clean filename, Box metadata values]

    SendSuccessEmail --> LogSuccess[Log: File processed successfully with asset ID]
    LogSuccess --> ExitSuccess([Exit with code 0])

    style Start fill:#e1f5e1
    style ExitSuccess fill:#e1f5e1
    style ExitFail fill:#ffe1e1
    style ExitClean fill:#e1f5e1
    style BuildMVP fill:#e1f0ff
    style GetBoxMeta fill:#ffe1ff
    style DeleteBox fill:#fff3cd

B1 to B2 - Global Masters Asset Downloader

Script: b1_to_b2_download.py Trigger: Cron (every 5 minutes) Purpose: Download Global Master assets and upload to Box

flowchart TD
    Start([Cron triggers script every 5 min]) --> Init[Initialize Clients: DAM, Box folder 349261192115, Database, Notifier]
    Init --> TestConn{Test All Connections}
    TestConn -->|Any Failed| ExitFail([Exit with error code 1])
    TestConn -->|All OK| Search[Search DAM for campaigns with Status B1 and Campaign Type Global comm]

    Search --> CheckCampaigns{Found B1 Campaigns?}
    CheckCampaigns -->|No| LogNone[Log: No B1 campaigns found]
    LogNone --> ExitClean([Exit with code 0])

    CheckCampaigns -->|Yes| SelectFirst[Select FIRST campaign only]
    SelectFirst --> GetGlobalAssets[Get Global Master Assets from 05 Final Assets folder using is_global equals True]

    GetGlobalAssets --> CheckAssets{Found Assets?}
    CheckAssets -->|No| SkipCampaign[Log: No master assets found]
    SkipCampaign --> ExitFail

    CheckAssets -->|Yes| FindFinalFolder[Find 05 Final Assets folder - same as source]
    FindFinalFolder --> InitCounters[Initialize: processed count 0, failed count 0]

    InitCounters --> LoopAssets[For Each Asset in Campaign]
    LoopAssets --> DownloadAsset[Download asset from DAM to temp folder]

    DownloadAsset --> GenTrackID[Generate unique 6-char tracking ID from database]
    GenTrackID --> CreateBoxFolder[Create or find Box folder: MASTERS_CampaignName no campaign number]
    CreateBoxFolder --> UploadBox[Upload to Box with tracking ID suffix to folder 349261192115]

    UploadBox --> StoreDB[Store in PostgreSQL: tracking ID, DAM asset ID, Box file ID, full metadata JSONB, upload directory]

    StoreDB --> CheckSuccess{Upload and Storage Successful?}
    CheckSuccess -->|Yes| IncrementSuccess[Increment processed count, Add to success list]
    CheckSuccess -->|No| IncrementFail[Increment failed count, Add to failed list with error]

    IncrementSuccess --> CleanTemp[Delete temp file]
    IncrementFail --> CleanTemp

    CleanTemp --> MoreAssets{More Assets to Process?}
    MoreAssets -->|Yes| LoopAssets

    MoreAssets -->|No| AllDoneCheck{All Assets Successful? processed equals total}

    AllDoneCheck -->|No| LogPartial[Log: Global campaign incomplete - some failed]
    LogPartial --> SendPartialEmail[Send email: b1_to_b2_partial with success and failed lists]
    SendPartialEmail --> NoStatusUpdate[Status stays B1 - will retry on next run]
    NoStatusUpdate --> ExitFail

    AllDoneCheck -->|Yes| LogAllDone[Log: All Global assets processed successfully]
    LogAllDone --> UpdateStatus[Update campaign status B1 to B2 in DAM]

    UpdateStatus --> StatusCheck{Status Update Successful?}
    StatusCheck -->|No| LogStatusFail[Log: Status update failed]
    LogStatusFail --> ExitFail

    StatusCheck -->|Yes| NoWebhook[Note: B1 to B2 does NOT send webhook only email]
    NoWebhook --> SendEmail[Send email: b1_to_b2_complete with all processed assets, tracking IDs, Box URLs, Box folder 349261192115]
    SendEmail --> LogSuccess[Log: Campaign completed successfully]
    LogSuccess --> ExitSuccess([Exit with code 0])

    style Start fill:#e1f5e1
    style ExitSuccess fill:#e1f5e1
    style ExitFail fill:#ffe1e1
    style ExitClean fill:#e1f5e1
    style AllDoneCheck fill:#fff3cd
    style StatusCheck fill:#fff3cd
    style NoWebhook fill:#e1f0ff

Data Flow: A1 to A2 Workflow

How data moves through the system:

flowchart LR
    DAM[(OpenText DAM)] -->|1. Download| Script[a1_to_a2_download.py]
    Script -->|2. Generate Tracking ID| DB[(PostgreSQL Database)]
    Script -->|3. Upload with Tracking ID| Box[(Box.com Folder 348304357505)]
    Script -->|4. Store Full Metadata JSONB| DB
    Script -->|5. Update Status A1 to A2| DAM
    Script -->|6. Send Campaign Info| Webhook[Make.com Webhook]
    Script -->|7. Send Details| Email[SMTP Email to daveporter at oliver.agency]

    Box -->|Folder Structure| BoxFolder[C000000078-Campaign_Name]
    BoxFolder -->|Files| BoxFile[06_RAFFAELLO_MAESTRO_SD_ABC123.mp4]

    DB -->|Stores| DBData[tracking_id: ABC123, opentext_id: 0008a50..., full_metadata: complete JSON 200KB, upload_directory: ea0dbf...]

    style Script fill:#e1f0ff
    style DAM fill:#ffe1e1
    style Box fill:#e1ffe1
    style DB fill:#ffe1ff

Data Flow: A2 to A3 Workflow

How data moves through the system:

flowchart LR
    Box[(Box.com Folder 348526703108)] -->|1. Poll for Files| Script[a2_to_a3_upload_polling.py]
    Box -->|2. Read Metadata Template| Script
    Script -->|3. Get Master Metadata| DB[(PostgreSQL Database)]
    Script -->|4. Download File| Box
    Script -->|5. Parse V2 Filename| Parser[FilenameParser]

    Parser -->|Components| Script
    Script -->|6. Build MVP Fields| MVP[MetadataExtractorMVP]
    MVP -->|27-28 Fields| Script

    Script -->|7. Update Fields| Update[Description from filename, State to Local, Language from filename, CreativeX from Box]
    Update -->|Final Metadata| Script

    Script -->|8. Upload Clean File| DAM[(OpenText DAM)]
    Script -->|9. Store Derivative| DB
    Script -->|10. Delete from Box| Box
    Script -->|11. Send Notification| Email[SMTP Email]

    Box -->|Metadata Template| Meta[ferrerodammetadata: creativexScore 90, creativexUrl https...]

    style Script fill:#e1f0ff
    style Box fill:#e1ffe1
    style DAM fill:#ffe1e1
    style DB fill:#ffe1ff
    style MVP fill:#fff3cd

Data Flow: B1 to B2 Workflow

How data moves through the system:

flowchart LR
    DAM[(OpenText DAM)] -->|1. Download from 05 Final Assets| Script[b1_to_b2_download.py]
    Script -->|2. Generate Tracking ID| DB[(PostgreSQL Database)]
    Script -->|3. Upload with Tracking ID| Box[(Box.com Folder 349261192115)]
    Script -->|4. Store Full Metadata JSONB| DB
    Script -->|5. Update Status B1 to B2| DAM
    Script -->|6. Send Details No Webhook| Email[SMTP Email to daveporter at oliver.agency]

    Box -->|Folder Structure| BoxFolder[MASTERS_NUTELLA_PLANT-BASED_LAUNCH]
    BoxFolder -->|Files| BoxFile[nutella_pbased_ABC123.jpg]

    DB -->|Stores| DBData[tracking_id: ABC123, opentext_id: 747110..., full_metadata: complete JSON, upload_directory: folder ID]

    style Script fill:#e1f0ff
    style DAM fill:#ffe1e1
    style Box fill:#e1ffe1
    style DB fill:#ffe1ff

Detailed Step: MVP Metadata Building (A2 to A3)

How 27-28 MVP fields are constructed:

flowchart TD
    Start[Build MVP Asset Representation] --> LoadMaster[Load full_metadata JSONB from PostgreSQL 200KB plus]

    LoadMaster --> SearchFields[Search through 15 categories for 28 MVP field IDs]
    SearchFields --> Extract[Extract fields with exact structure from master]

    Extract --> Count[Count: Found 24 out of 28 fields typically]

    Count --> UpdateName[Update ARTESIA.FIELD.ASSET NAME to clean filename RAF_DE_de_TEST_OLV_001_6S_16x9.mp4]
    UpdateName --> UpdateDesc[Update ARTESIA.FIELD.ASSET DESCRIPTION to subject title from filename TEST-JOB]
    UpdateDesc --> ForceState[Force FERRERO.FIELD.STATE to Local]

    ForceState --> AddLang{MAIN_LANGUAGES missing?}
    AddLang -->|Yes| AddLangField[Add MAIN_LANGUAGES from filename: de to DE]
    AddLang -->|No| CheckCompliance
    AddLangField --> CheckCompliance

    CheckCompliance{ASSETCOMPLIANCE missing?} -->|Yes| AddCompliance[Add FERRERO.FIELD.ASSETCOMPLIANCE: dash]
    CheckCompliance -->|No| CheckTag
    AddCompliance --> CheckTag

    CheckTag{MARKETING_TAG missing?} -->|Yes| AddTag[Add MARKETING_TAG: Tag]
    CheckTag -->|No| CheckCreativeX
    AddTag --> CheckCreativeX

    CheckCreativeX{Box Metadata Has CreativeX?} -->|Yes| UpdateCXURL[Update FERRERO.FIELD.CREATIVEX LINK to creativexUrl from Box]
    CheckCreativeX -->|No| BuildRep
    UpdateCXURL --> LogCXScore[Log creativexScore value for reference]
    LogCXScore --> BuildRep

    BuildRep[Build final asset_resource structure: metadata with 27-28 fields, metadata_model_id ECOMMERCE, security_policy_list 1594]

    BuildRep --> Return[Return complete asset representation ready for DAM upload]

    style Start fill:#e1f0ff
    style Return fill:#e1f5e1
    style BuildRep fill:#fff3cd
    style UpdateCXURL fill:#ffe1ff

Detailed Step: All-Done Check (A1 to A2)

How the system determines if all assets are processed:

flowchart TD
    Start[Campaign Processing Complete] --> CountTotal[total_assets: Count of assets in campaign]
    CountTotal --> CountSuccess[processed_count: Count of successfully processed assets]
    CountSuccess --> CountFail[failed_count: Count of failed assets]

    CountFail --> Compare{processed_count equals total_assets?}

    Compare -->|No| PartialPath[PARTIAL COMPLETION PATH]
    PartialPath --> LogRatio[Log: processed over total successful]
    LogRatio --> StatusStays[Status remains A1 - DO NOT UPDATE]
    StatusStays --> EmailPartial[Send partial email with: success list, failed list with errors]
    EmailPartial --> WillRetry[Next cron run will retry same campaign]
    WillRetry --> ExitPartial([Exit with code 1 - indicates failure])

    Compare -->|Yes| FullPath[FULL COMPLETION PATH]
    FullPath --> LogAllDone[Log: All assets processed successfully]
    LogAllDone --> SafeUpdate[Safe to update status - all work complete]
    SafeUpdate --> UpdateStatus[Call DAM API: PATCH status A1 to A2]

    UpdateStatus --> VerifyUpdate{Update Successful?}
    VerifyUpdate -->|No| UpdateFailed[Log: Status update failed - investigate]
    UpdateFailed --> ExitPartial

    VerifyUpdate -->|Yes| SendWebhook[Send webhook with campaign details]
    SendWebhook --> SendEmail[Send success email with all asset details]
    SendEmail --> NextRun[Next cron run will find next A1 campaign]
    NextRun --> ExitSuccess([Exit with code 0 - success])

    style Compare fill:#fff3cd
    style VerifyUpdate fill:#fff3cd
    style FullPath fill:#e1f5e1
    style PartialPath fill:#ffe1e1
    style ExitSuccess fill:#e1f5e1
    style ExitPartial fill:#ffe1e1

Detailed Step: Box Metadata Extraction (A2 to A3)

How CreativeX data is read from Box and applied:

flowchart TD
    Start[Process Box File] --> GetFileID[File ID: 2035459900168]
    GetFileID --> CallAPI[Call Box API: GET files slash file_id slash metadata slash enterprise slash ferrerodammetadata]

    CallAPI --> CheckResponse{Metadata Found HTTP 200?}

    CheckResponse -->|No 404| LogNoMeta[Log: No metadata template found on file]
    LogNoMeta --> ReturnEmpty[Return empty dict - proceed without CreativeX data]

    CheckResponse -->|Yes| ParseMeta[Parse metadata response JSON]
    ParseMeta --> ExtractScore[Extract creativexScore field: 90]
    ExtractScore --> ExtractURL[Extract creativexUrl field: https://www.bbc.com]

    ExtractURL --> LogValues[Log: CreativeX Score 90, CreativeX URL https...]
    LogValues --> ReturnData[Return dict: score 90, url https://www.bbc.com]

    ReturnData --> PassToMVP[Pass to MetadataExtractorMVP.build_mvp_asset_representation]
    PassToMVP --> UpdateFields[Update FERRERO.FIELD.CREATIVEX LINK with creativexUrl]
    UpdateFields --> LogUpdated[Log: Set CREATIVEX LINK to URL]

    LogUpdated --> IncludeInUpload[CreativeX URL included in asset representation uploaded to DAM]

    ReturnEmpty --> PassToMVP2[Pass empty dict to MVP extractor]
    PassToMVP2 --> NoUpdate[No CreativeX fields updated - use master values]
    NoUpdate --> IncludeInUpload

    style Start fill:#e1f0ff
    style CheckResponse fill:#fff3cd
    style IncludeInUpload fill:#e1f5e1
    style UpdateFields fill:#ffe1ff

Error Handling Flow (All Scripts)

How errors are handled and retried:

flowchart TD
    Start[Script Execution] --> TryCatch[Wrap in try-except block]

    TryCatch --> MainLogic[Execute main workflow logic]

    MainLogic --> ErrorOccurs{Error Occurs?}

    ErrorOccurs -->|No| Success[Complete successfully]
    Success --> LogSuccess[Log success message]
    LogSuccess --> SendSuccessEmail[Send success email]
    SendSuccessEmail --> ExitZero([Exit code 0])

    ErrorOccurs -->|Yes| ClassifyError{Error Type?}

    ClassifyError -->|Connection Error| LogConnError[Log: Connection failed]
    ClassifyError -->|Validation Error| LogValError[Log: Validation failed with details]
    ClassifyError -->|API Error| LogAPIError[Log: API returned error code and message]
    ClassifyError -->|Other Error| LogGenError[Log: Exception with stack trace]

    LogConnError --> CheckRetryable{Is Retryable?}
    LogValError --> NotRetryable[Not Retryable - needs manual fix]
    LogAPIError --> CheckRetryable
    LogGenError --> CheckRetryable

    CheckRetryable -->|Yes| LogWillRetry[Log: Will retry on next cron run 5 min]
    CheckRetryable -->|No| NotRetryable

    LogWillRetry --> SendErrorEmail[Send error email with: filename, error message, retry info]
    NotRetryable --> SendErrorEmail

    SendErrorEmail --> StatusNotUpdated[Status NOT updated - campaign remains at current state]
    StatusNotUpdated --> ExitOne([Exit code 1 - indicates failure])

    ExitOne -.->|Next Cron Run 5 min later| Start

    style Start fill:#e1f0ff
    style ExitZero fill:#e1f5e1
    style ExitOne fill:#ffe1e1
    style CheckRetryable fill:#fff3cd
    style LogWillRetry fill:#e1f5e1

Cron Scheduling Flow

How the three scripts work together:

flowchart TD
    Cron([Cron Daemon Running on Server]) --> Every5Min{Every 5 Minutes}

    Every5Min --> TriggerA1[Trigger: a1_to_a2_download.py]
    Every5Min --> TriggerA2[Trigger: a2_to_a3_upload_polling.py]
    Every5Min --> TriggerB1[Trigger: b1_to_b2_download.py]

    TriggerA1 --> A1Script[A1 to A2 Script Executes]
    A1Script --> A1Search[Search for A1 Local campaigns]
    A1Search --> A1Process{Found A1?}
    A1Process -->|Yes| A1Download[Download, Box upload, DB store, Update to A2]
    A1Process -->|No| A1Exit[Exit - nothing to do]
    A1Download --> A1Exit
    A1Exit --> A1Log[Write to logs/a1_to_a2.log]

    TriggerA2 --> A2Script[A2 to A3 Script Executes]
    A2Script --> A2Poll[Poll Box folder 348526703108]
    A2Poll --> A2Process{Found V2 Files?}
    A2Process -->|Yes| A2Upload[Download from Box, Upload to DAM, Delete from Box]
    A2Process -->|No| A2Exit[Exit - nothing to do]
    A2Upload --> A2Exit
    A2Exit --> A2Log[Write to logs/a2_to_a3.log]

    TriggerB1 --> B1Script[B1 to B2 Script Executes]
    B1Script --> B1Search[Search for B1 Global campaigns]
    B1Search --> B1Process{Found B1?}
    B1Process -->|Yes| B1Download[Download, Box upload, DB store, Update to B2]
    B1Process -->|No| B1Exit[Exit - nothing to do]
    B1Download --> B1Exit
    B1Exit --> B1Log[Write to logs/b1_to_b2.log]

    A1Log --> Wait[Wait 5 minutes]
    A2Log --> Wait
    B1Log --> Wait

    Wait --> Every5Min

    style Cron fill:#e1f0ff
    style A1Download fill:#e1f5e1
    style A2Upload fill:#e1f5e1
    style B1Download fill:#e1f5e1

Database Interaction Flow

How scripts interact with PostgreSQL:

flowchart TD
    Script[Python Script] --> ConnPool[Get Connection from Pool min 1 max 10 connections]

    ConnPool --> Operation{Operation Type?}

    Operation -->|Generate ID| GenID[Generate 6-char random ID]
    GenID --> CheckUnique[SELECT COUNT FROM master_assets WHERE tracking_id equals ID]
    CheckUnique --> IsUnique{Count equals 0?}
    IsUnique -->|No| GenID
    IsUnique -->|Yes| ReturnID[Return unique tracking ID]

    Operation -->|Store Master| InsertMaster[INSERT INTO master_assets: tracking_id, opentext_id, full_metadata JSONB, upload_directory, description, all extracted fields]
    InsertMaster --> OnConflict[ON CONFLICT tracking_id DO UPDATE SET upload_directory, full_metadata, updated_at]
    OnConflict --> CommitMaster[COMMIT transaction]

    Operation -->|Get Master| SelectMaster[SELECT tracking_id, opentext_id, upload_directory, full_metadata FROM master_assets WHERE tracking_id equals ID AND status equals active]
    SelectMaster --> DecodeMeta[Decode full_metadata JSONB to Python dict]
    DecodeMeta --> ReturnMaster[Return complete master asset with 200KB metadata]

    Operation -->|Store Derivative| InsertDeriv[INSERT INTO derivative_assets: tracking_id, derivative_filename, file_extension, status active]
    InsertDeriv --> CommitDeriv[COMMIT transaction]

    Operation -->|All Done Check| CountMasters[SELECT COUNT DISTINCT tracking_id FROM master_assets WHERE campaign_id equals ID]
    CountMasters --> CountDerivs[SELECT COUNT DISTINCT tracking_id FROM derivative_assets WHERE tracking_id IN master_assets for campaign]
    CountDerivs --> CompareCounts{Counts Equal?}
    CompareCounts -->|Yes| ReturnTrue[Return True - all done]
    CompareCounts -->|No| ReturnFalse[Return False - more pending]

    ReturnID --> ReturnConn[Return connection to pool]
    CommitMaster --> ReturnConn
    ReturnMaster --> ReturnConn
    CommitDeriv --> ReturnConn
    ReturnTrue --> ReturnConn
    ReturnFalse --> ReturnConn

    ReturnConn --> Script

    style Script fill:#e1f0ff
    style ConnPool fill:#ffe1ff
    style CommitMaster fill:#e1f5e1
    style ReturnTrue fill:#e1f5e1

Box Folder Organization

How files are organized in Box:

flowchart TD
    BoxRoot[Box Root Folders] --> A1Folder[348304357505: Local Adaptation A1 to A2]
    BoxRoot --> A2Folder[348526703108: Agency Uploads A2 to A3 Input]
    BoxRoot --> B1Folder[349261192115: Global Masters B1 to B2]

    A1Folder --> A1Camp1[C000000078-Local_adaptation_test_2]
    A1Camp1 --> A1File1[06_RAFFAELLO_MAESTRO_SD_ABC123.mp4]
    A1Camp1 --> A1File2[8000500247167_8_DEF456.tif]

    A2Folder --> A2File1[1234567_RAF_DE_de_TEST_OLV_001_6S_16x9_ABC123.mp4]
    A2File1 -.->|Has Metadata| A2Meta[ferrerodammetadata: creativexScore 90, creativexUrl https...]

    B1Folder --> B1Camp1[MASTERS_NUTELLA_PLANT-BASED_LAUNCH]
    B1Camp1 --> B1File1[nutella_pbased_XYZ789.jpg]
    B1Camp1 --> B1File2[nutella_pbased_beauty_GHI012.jpg]

    style A1Folder fill:#e1f5e1
    style A2Folder fill:#ffe1e1
    style B1Folder fill:#e1f0ff
    style A2Meta fill:#ffe1ff

Complete System Architecture

How all components work together:

flowchart TB
    Cron[Cron Scheduler] --> A1Script[a1_to_a2_download.py Every 5 min]
    Cron --> A2Script[a2_to_a3_upload_polling.py Every 5 min]
    Cron --> B1Script[b1_to_b2_download.py Every 5 min]

    A1Script --> DAM[(OpenText DAM API)]
    A2Script --> DAM
    B1Script --> DAM

    A1Script --> Box[(Box.com API)]
    A2Script --> Box
    B1Script --> Box

    A1Script --> DB[(PostgreSQL Database)]
    A2Script --> DB
    B1Script --> DB

    A1Script --> Email[SMTP Mailgun]
    A2Script --> Email
    B1Script --> Email

    A1Script --> Webhook[Make.com Webhook]

    DAM -.->|OAuth2| Auth[Access Token auto-refresh]
    Box -.->|JWT| BoxAuth[JWT Token enterprise auth]
    DB -.->|Connection Pool| DBPool[1-10 connections]

    A1Script --> LogA1[logs/a1_to_a2.log Rotated 28 files 10MB each]
    A2Script --> LogA2[logs/a2_to_a3.log Rotated 28 files 10MB each]
    B1Script --> LogB1[logs/b1_to_b2.log Rotated 28 files 10MB each]

    Config[config.yaml + .env] -.->|Configuration| A1Script
    Config -.->|Configuration| A2Script
    Config -.->|Configuration| B1Script

    style Cron fill:#e1f0ff
    style DAM fill:#ffe1e1
    style Box fill:#e1ffe1
    style DB fill:#ffe1ff
    style Email fill:#fff3cd

Notes

Diagram Legend

Colors:

🟢 Green: Success paths, completed states
🔴 Red: Error paths, failed states
🟡 Yellow: Decision points, important checks
🔵 Blue: Processing steps, data transformation
🟣 Purple: External services, special features

Key Decision Points

All-Done Check:

Critical for preventing premature status updates
Only proceeds when ALL assets successful
Failed assets cause retry on next run

Single-Run Mode:

Each script processes ONE item (campaign or file)
Cron handles scheduling (every 5 minutes)
Allows controlled processing
Easy to test and debug

Box Metadata:

Optional enhancement
If template not found, continues without
Doesn't block upload if missing

File Naming

A1→A2 and B1→B2 (Box Upload):

Original: filename.ext
With Tracking: filename_ABC123.ext

A2→A3 (DAM Upload):

From Box: 1234567_RAF_DE_de_TEST_OLV_001_6S_16x9_ABC123.mp4
To DAM: RAF_DE_de_TEST_OLV_001_6S_16x9.mp4

For complete documentation, see PROJECT_STATUS_2025-11-03.md

27 KiB Raw Permalink Blame History

Python Automation Workflow Diagrams

A1 to A2 - Local Adaptation Master Asset Downloader

A2 to A3 - Box to DAM Upload Handler (Polling)

B1 to B2 - Global Masters Asset Downloader

Data Flow: A1 to A2 Workflow

Data Flow: A2 to A3 Workflow

Data Flow: B1 to B2 Workflow

Detailed Step: MVP Metadata Building (A2 to A3)

Detailed Step: All-Done Check (A1 to A2)

Detailed Step: Box Metadata Extraction (A2 to A3)

Error Handling Flow (All Scripts)

Cron Scheduling Flow

Database Interaction Flow

Box Folder Organization

Complete System Architecture

Notes

Diagram Legend

Key Decision Points

File Naming

27 KiB

Raw Permalink Blame History