ferrero-opentext/Python-Version/tests/WORKFLOW_DIAGRAMS.md
DJP d818a8b6a8 Update comprehensive README and reorganize documentation files
Major README overhaul with complete deployment and configuration guide.
Moved old docs to tests/ folder for archive.

README.md UPDATES (880 lines - completely rewritten):
✓ Table of contents with navigation
✓ Complete overview of all 4 workflows + daily report
✓ Detailed authentication section (OAuth2 vs mTLS)
✓ Box-config.json location explanation
✓ Server deployment step-by-step guide
✓ Database setup (Docker + native PostgreSQL)
✓ Cron job examples for all workflows
✓ Comprehensive troubleshooting section
✓ Security checklist
✓ Monitoring and log rotation details
✓ Common SQL queries
✓ File structure diagram

KEY SECTIONS ADDED:
1. What's Included - All 5 scripts explained
2. Quick Start - Local setup guide
3. Server Deployment - 6-step process with commands
4. Workflows - Detailed process for each (A1→A2, A5→A6, B1→B2, A2→A3, Daily Report)
5. Authentication - OAuth2 vs mTLS with examples
6. Configuration - All .env variables documented
7. Database - Schema, setup, queries
8. Monitoring - Logs, emails, database queries
9. Troubleshooting - Common issues + solutions
10. File Structure - Complete directory tree

BOX-CONFIG.JSON LOCATION DOCUMENTED:
✓ Must be one folder up from Python-Version
✓ Referenced as ../Box-config.json in config.yaml
✓ Server deployment instructions include copying both files
✓ Troubleshooting section explains file not found errors

MTLS DOCUMENTATION:
✓ Different base URL explained (dev-auth.app-api.ferrero.com)
✓ --auth-pfx flag usage
✓ Whitelisted IP requirement noted
✓ Certificate testing commands

REORGANIZATION:
- Moved old DEPLOYMENT.md → tests/DEPLOYMENT.md (archive)
- Moved old WORKFLOW_DIAGRAMS.md → tests/WORKFLOW_DIAGRAMS.md (archive)
- New DEPLOYMENT_GUIDE.md is the current deployment doc
- README.md is now comprehensive one-stop documentation

Changes:
- Python-Version/README.md (completely rewritten, 880 lines)
- Moved 2 old docs to tests/ folder
- Added test files to tests/ folder

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-05 10:57:11 -05:00

27 KiB

Python Automation Workflow Diagrams

Detailed Mermaid flowcharts for each Python automation script.


A1 to A2 - Local Adaptation Master Asset Downloader

Script: a1_to_a2_download.py Trigger: Cron (every 5 minutes) Purpose: Download Local Adaptation master assets and upload to Box

flowchart TD
    Start([Cron triggers script every 5 min]) --> Init[Initialize Clients: DAM, Box, Database, Notifier]
    Init --> TestConn{Test All Connections}
    TestConn -->|Any Failed| ExitFail([Exit with error code 1])
    TestConn -->|All OK| Search[Search DAM for campaigns with Status A1 and Campaign Type Local Adaptation]

    Search --> CheckCampaigns{Found A1 Campaigns?}
    CheckCampaigns -->|No| LogNone[Log: No A1 campaigns found]
    LogNone --> ExitClean([Exit with code 0])

    CheckCampaigns -->|Yes| SelectFirst[Select FIRST campaign only]
    SelectFirst --> GetMasterAssets[Get Master Assets from 01 Master Assets folder]

    GetMasterAssets --> CheckAssets{Found Assets?}
    CheckAssets -->|No| SkipCampaign[Log: No master assets found]
    SkipCampaign --> ExitFail

    CheckAssets -->|Yes| FindFinalFolder[Find 01 Final Assets folder for upload directory]
    FindFinalFolder --> InitCounters[Initialize: processed count 0, failed count 0]

    InitCounters --> LoopAssets[For Each Asset in Campaign]
    LoopAssets --> DownloadAsset[Download asset from DAM to temp folder]

    DownloadAsset --> GenTrackID[Generate unique 6-char tracking ID from database]
    GenTrackID --> CreateBoxFolder[Create or find Box folder: CampaignID-CampaignName]
    CreateBoxFolder --> UploadBox[Upload to Box with tracking ID suffix: filename_ABC123.ext]

    UploadBox --> StoreDB[Store in PostgreSQL: tracking ID, DAM asset ID, Box file ID, full metadata JSONB, upload directory]

    StoreDB --> CheckSuccess{Upload and Storage Successful?}
    CheckSuccess -->|Yes| IncrementSuccess[Increment processed count, Add to success list]
    CheckSuccess -->|No| IncrementFail[Increment failed count, Add to failed list with error]

    IncrementSuccess --> CleanTemp[Delete temp file]
    IncrementFail --> CleanTemp

    CleanTemp --> MoreAssets{More Assets to Process?}
    MoreAssets -->|Yes| LoopAssets

    MoreAssets -->|No| AllDoneCheck{All Assets Successful? processed equals total}

    AllDoneCheck -->|No| LogPartial[Log: Campaign incomplete - some failed]
    LogPartial --> SendPartialEmail[Send email: a1_to_a2_partial with success and failed lists]
    SendPartialEmail --> NoStatusUpdate[Status stays A1 - will retry on next run]
    NoStatusUpdate --> ExitFail

    AllDoneCheck -->|Yes| LogAllDone[Log: All assets processed successfully]
    LogAllDone --> UpdateStatus[Update campaign status A1 to A2 in DAM]

    UpdateStatus --> StatusCheck{Status Update Successful?}
    StatusCheck -->|No| LogStatusFail[Log: Status update failed]
    LogStatusFail --> ExitFail

    StatusCheck -->|Yes| SendWebhook[Send webhook to Make.com: campaign ID, campaign number, status change, asset list, timestamp]
    SendWebhook --> SendEmail[Send email: a1_to_a2_complete with all processed assets, tracking IDs, Box URLs]
    SendEmail --> LogSuccess[Log: Campaign completed successfully]
    LogSuccess --> ExitSuccess([Exit with code 0])

    style Start fill:#e1f5e1
    style ExitSuccess fill:#e1f5e1
    style ExitFail fill:#ffe1e1
    style ExitClean fill:#e1f5e1
    style AllDoneCheck fill:#fff3cd
    style StatusCheck fill:#fff3cd

A2 to A3 - Box to DAM Upload Handler (Polling)

Script: a2_to_a3_upload_polling.py Trigger: Cron (every 5 minutes) Purpose: Upload agency-processed files from Box to DAM

flowchart TD
    Start([Cron triggers script every 5 min]) --> Init[Initialize Clients: DAM, Box folder 348526703108, Database, Notifier, FilenameParser, MetadataExtractorMVP]
    Init --> TestConn{Test All Connections}
    TestConn -->|Any Failed| ExitFail([Exit with error code 1])
    TestConn -->|All OK| ListBox[List all files in Box folder 348526703108]

    ListBox --> CheckFiles{Found Files?}
    CheckFiles -->|No| LogNoFiles[Log: No files found in Box folder]
    LogNoFiles --> ExitClean([Exit with code 0])

    CheckFiles -->|Yes| FilterValid[Filter for valid V2 filenames with tracking IDs]
    FilterValid --> CheckValid{Found Valid V2 Files?}
    CheckValid -->|No| LogNoValid[Log: No valid V2 files to process]
    LogNoValid --> ExitClean

    CheckValid -->|Yes| SelectFirst[Select FIRST file only - more processed on next run]
    SelectFirst --> ParseFilename[Parse V2 filename: extract all 10 components]

    ParseFilename --> ValidateV2{V2 Filename Valid?}
    ValidateV2 -->|No| LogInvalid[Log: Invalid V2 filename with errors]
    LogInvalid --> SendFailEmail[Send email: upload failed with validation errors]
    SendFailEmail --> ExitFail

    ValidateV2 -->|Yes| ExtractTracking[Extract tracking ID from filename]
    ExtractTracking --> CheckTracking{Tracking ID Exists?}
    CheckTracking -->|No| LogNoTracking[Log: No tracking ID in filename]
    LogNoTracking --> ExitFail

    CheckTracking -->|Yes| LoadMaster[Load master metadata from PostgreSQL by tracking ID]
    LoadMaster --> CheckMaster{Master Asset Found?}
    CheckMaster -->|No| LogNoMaster[Log: No master asset for tracking ID]
    LogNoMaster --> ExitFail

    CheckMaster -->|Yes| GetBoxMeta[Read Box metadata template: ferrerodammetadata]
    GetBoxMeta --> ExtractCreativeX[Extract: creativexScore, creativexUrl]

    ExtractCreativeX --> DownloadBox[Download file from Box to temp folder]
    DownloadBox --> StripFilename[Strip OMG Job Number and Tracking ID from filename]

    StripFilename --> BuildMVP[Build MVP asset representation: Load 24 fields from master, Add 3 missing from filename, Update Description State Language, Update CreativeX from Box metadata, Total 27-28 fields]

    BuildMVP --> RenameTemp[Rename temp file to clean filename]
    RenameTemp --> UploadDAM[Upload to DAM Final Assets folder with MVP metadata and video dimensions]

    UploadDAM --> CheckUpload{Upload Successful HTTP 201 or 202?}
    CheckUpload -->|No| LogUploadFail[Log: Upload failed with error]
    LogUploadFail --> CleanFail[Delete temp file]
    CleanFail --> SendFailEmail2[Send email: upload failed with error details]
    SendFailEmail2 --> ExitFail

    CheckUpload -->|Yes| StoreDerivative[Store derivative asset record in database: tracking ID and filename]
    StoreDerivative --> DeleteBox[Delete file from Box after successful upload]

    DeleteBox --> CleanSuccess[Delete local temp file]
    CleanSuccess --> SendSuccessEmail[Send email: a2_to_a3_file_uploaded with asset ID, tracking ID, clean filename, Box metadata values]

    SendSuccessEmail --> LogSuccess[Log: File processed successfully with asset ID]
    LogSuccess --> ExitSuccess([Exit with code 0])

    style Start fill:#e1f5e1
    style ExitSuccess fill:#e1f5e1
    style ExitFail fill:#ffe1e1
    style ExitClean fill:#e1f5e1
    style BuildMVP fill:#e1f0ff
    style GetBoxMeta fill:#ffe1ff
    style DeleteBox fill:#fff3cd

B1 to B2 - Global Masters Asset Downloader

Script: b1_to_b2_download.py Trigger: Cron (every 5 minutes) Purpose: Download Global Master assets and upload to Box

flowchart TD
    Start([Cron triggers script every 5 min]) --> Init[Initialize Clients: DAM, Box folder 349261192115, Database, Notifier]
    Init --> TestConn{Test All Connections}
    TestConn -->|Any Failed| ExitFail([Exit with error code 1])
    TestConn -->|All OK| Search[Search DAM for campaigns with Status B1 and Campaign Type Global comm]

    Search --> CheckCampaigns{Found B1 Campaigns?}
    CheckCampaigns -->|No| LogNone[Log: No B1 campaigns found]
    LogNone --> ExitClean([Exit with code 0])

    CheckCampaigns -->|Yes| SelectFirst[Select FIRST campaign only]
    SelectFirst --> GetGlobalAssets[Get Global Master Assets from 05 Final Assets folder using is_global equals True]

    GetGlobalAssets --> CheckAssets{Found Assets?}
    CheckAssets -->|No| SkipCampaign[Log: No master assets found]
    SkipCampaign --> ExitFail

    CheckAssets -->|Yes| FindFinalFolder[Find 05 Final Assets folder - same as source]
    FindFinalFolder --> InitCounters[Initialize: processed count 0, failed count 0]

    InitCounters --> LoopAssets[For Each Asset in Campaign]
    LoopAssets --> DownloadAsset[Download asset from DAM to temp folder]

    DownloadAsset --> GenTrackID[Generate unique 6-char tracking ID from database]
    GenTrackID --> CreateBoxFolder[Create or find Box folder: MASTERS_CampaignName no campaign number]
    CreateBoxFolder --> UploadBox[Upload to Box with tracking ID suffix to folder 349261192115]

    UploadBox --> StoreDB[Store in PostgreSQL: tracking ID, DAM asset ID, Box file ID, full metadata JSONB, upload directory]

    StoreDB --> CheckSuccess{Upload and Storage Successful?}
    CheckSuccess -->|Yes| IncrementSuccess[Increment processed count, Add to success list]
    CheckSuccess -->|No| IncrementFail[Increment failed count, Add to failed list with error]

    IncrementSuccess --> CleanTemp[Delete temp file]
    IncrementFail --> CleanTemp

    CleanTemp --> MoreAssets{More Assets to Process?}
    MoreAssets -->|Yes| LoopAssets

    MoreAssets -->|No| AllDoneCheck{All Assets Successful? processed equals total}

    AllDoneCheck -->|No| LogPartial[Log: Global campaign incomplete - some failed]
    LogPartial --> SendPartialEmail[Send email: b1_to_b2_partial with success and failed lists]
    SendPartialEmail --> NoStatusUpdate[Status stays B1 - will retry on next run]
    NoStatusUpdate --> ExitFail

    AllDoneCheck -->|Yes| LogAllDone[Log: All Global assets processed successfully]
    LogAllDone --> UpdateStatus[Update campaign status B1 to B2 in DAM]

    UpdateStatus --> StatusCheck{Status Update Successful?}
    StatusCheck -->|No| LogStatusFail[Log: Status update failed]
    LogStatusFail --> ExitFail

    StatusCheck -->|Yes| NoWebhook[Note: B1 to B2 does NOT send webhook only email]
    NoWebhook --> SendEmail[Send email: b1_to_b2_complete with all processed assets, tracking IDs, Box URLs, Box folder 349261192115]
    SendEmail --> LogSuccess[Log: Campaign completed successfully]
    LogSuccess --> ExitSuccess([Exit with code 0])

    style Start fill:#e1f5e1
    style ExitSuccess fill:#e1f5e1
    style ExitFail fill:#ffe1e1
    style ExitClean fill:#e1f5e1
    style AllDoneCheck fill:#fff3cd
    style StatusCheck fill:#fff3cd
    style NoWebhook fill:#e1f0ff

Data Flow: A1 to A2 Workflow

How data moves through the system:

flowchart LR
    DAM[(OpenText DAM)] -->|1. Download| Script[a1_to_a2_download.py]
    Script -->|2. Generate Tracking ID| DB[(PostgreSQL Database)]
    Script -->|3. Upload with Tracking ID| Box[(Box.com Folder 348304357505)]
    Script -->|4. Store Full Metadata JSONB| DB
    Script -->|5. Update Status A1 to A2| DAM
    Script -->|6. Send Campaign Info| Webhook[Make.com Webhook]
    Script -->|7. Send Details| Email[SMTP Email to daveporter at oliver.agency]

    Box -->|Folder Structure| BoxFolder[C000000078-Campaign_Name]
    BoxFolder -->|Files| BoxFile[06_RAFFAELLO_MAESTRO_SD_ABC123.mp4]

    DB -->|Stores| DBData[tracking_id: ABC123, opentext_id: 0008a50..., full_metadata: complete JSON 200KB, upload_directory: ea0dbf...]

    style Script fill:#e1f0ff
    style DAM fill:#ffe1e1
    style Box fill:#e1ffe1
    style DB fill:#ffe1ff

Data Flow: A2 to A3 Workflow

How data moves through the system:

flowchart LR
    Box[(Box.com Folder 348526703108)] -->|1. Poll for Files| Script[a2_to_a3_upload_polling.py]
    Box -->|2. Read Metadata Template| Script
    Script -->|3. Get Master Metadata| DB[(PostgreSQL Database)]
    Script -->|4. Download File| Box
    Script -->|5. Parse V2 Filename| Parser[FilenameParser]

    Parser -->|Components| Script
    Script -->|6. Build MVP Fields| MVP[MetadataExtractorMVP]
    MVP -->|27-28 Fields| Script

    Script -->|7. Update Fields| Update[Description from filename, State to Local, Language from filename, CreativeX from Box]
    Update -->|Final Metadata| Script

    Script -->|8. Upload Clean File| DAM[(OpenText DAM)]
    Script -->|9. Store Derivative| DB
    Script -->|10. Delete from Box| Box
    Script -->|11. Send Notification| Email[SMTP Email]

    Box -->|Metadata Template| Meta[ferrerodammetadata: creativexScore 90, creativexUrl https...]

    style Script fill:#e1f0ff
    style Box fill:#e1ffe1
    style DAM fill:#ffe1e1
    style DB fill:#ffe1ff
    style MVP fill:#fff3cd

Data Flow: B1 to B2 Workflow

How data moves through the system:

flowchart LR
    DAM[(OpenText DAM)] -->|1. Download from 05 Final Assets| Script[b1_to_b2_download.py]
    Script -->|2. Generate Tracking ID| DB[(PostgreSQL Database)]
    Script -->|3. Upload with Tracking ID| Box[(Box.com Folder 349261192115)]
    Script -->|4. Store Full Metadata JSONB| DB
    Script -->|5. Update Status B1 to B2| DAM
    Script -->|6. Send Details No Webhook| Email[SMTP Email to daveporter at oliver.agency]

    Box -->|Folder Structure| BoxFolder[MASTERS_NUTELLA_PLANT-BASED_LAUNCH]
    BoxFolder -->|Files| BoxFile[nutella_pbased_ABC123.jpg]

    DB -->|Stores| DBData[tracking_id: ABC123, opentext_id: 747110..., full_metadata: complete JSON, upload_directory: folder ID]

    style Script fill:#e1f0ff
    style DAM fill:#ffe1e1
    style Box fill:#e1ffe1
    style DB fill:#ffe1ff

Detailed Step: MVP Metadata Building (A2 to A3)

How 27-28 MVP fields are constructed:

flowchart TD
    Start[Build MVP Asset Representation] --> LoadMaster[Load full_metadata JSONB from PostgreSQL 200KB plus]

    LoadMaster --> SearchFields[Search through 15 categories for 28 MVP field IDs]
    SearchFields --> Extract[Extract fields with exact structure from master]

    Extract --> Count[Count: Found 24 out of 28 fields typically]

    Count --> UpdateName[Update ARTESIA.FIELD.ASSET NAME to clean filename RAF_DE_de_TEST_OLV_001_6S_16x9.mp4]
    UpdateName --> UpdateDesc[Update ARTESIA.FIELD.ASSET DESCRIPTION to subject title from filename TEST-JOB]
    UpdateDesc --> ForceState[Force FERRERO.FIELD.STATE to Local]

    ForceState --> AddLang{MAIN_LANGUAGES missing?}
    AddLang -->|Yes| AddLangField[Add MAIN_LANGUAGES from filename: de to DE]
    AddLang -->|No| CheckCompliance
    AddLangField --> CheckCompliance

    CheckCompliance{ASSETCOMPLIANCE missing?} -->|Yes| AddCompliance[Add FERRERO.FIELD.ASSETCOMPLIANCE: dash]
    CheckCompliance -->|No| CheckTag
    AddCompliance --> CheckTag

    CheckTag{MARKETING_TAG missing?} -->|Yes| AddTag[Add MARKETING_TAG: Tag]
    CheckTag -->|No| CheckCreativeX
    AddTag --> CheckCreativeX

    CheckCreativeX{Box Metadata Has CreativeX?} -->|Yes| UpdateCXURL[Update FERRERO.FIELD.CREATIVEX LINK to creativexUrl from Box]
    CheckCreativeX -->|No| BuildRep
    UpdateCXURL --> LogCXScore[Log creativexScore value for reference]
    LogCXScore --> BuildRep

    BuildRep[Build final asset_resource structure: metadata with 27-28 fields, metadata_model_id ECOMMERCE, security_policy_list 1594]

    BuildRep --> Return[Return complete asset representation ready for DAM upload]

    style Start fill:#e1f0ff
    style Return fill:#e1f5e1
    style BuildRep fill:#fff3cd
    style UpdateCXURL fill:#ffe1ff

Detailed Step: All-Done Check (A1 to A2)

How the system determines if all assets are processed:

flowchart TD
    Start[Campaign Processing Complete] --> CountTotal[total_assets: Count of assets in campaign]
    CountTotal --> CountSuccess[processed_count: Count of successfully processed assets]
    CountSuccess --> CountFail[failed_count: Count of failed assets]

    CountFail --> Compare{processed_count equals total_assets?}

    Compare -->|No| PartialPath[PARTIAL COMPLETION PATH]
    PartialPath --> LogRatio[Log: processed over total successful]
    LogRatio --> StatusStays[Status remains A1 - DO NOT UPDATE]
    StatusStays --> EmailPartial[Send partial email with: success list, failed list with errors]
    EmailPartial --> WillRetry[Next cron run will retry same campaign]
    WillRetry --> ExitPartial([Exit with code 1 - indicates failure])

    Compare -->|Yes| FullPath[FULL COMPLETION PATH]
    FullPath --> LogAllDone[Log: All assets processed successfully]
    LogAllDone --> SafeUpdate[Safe to update status - all work complete]
    SafeUpdate --> UpdateStatus[Call DAM API: PATCH status A1 to A2]

    UpdateStatus --> VerifyUpdate{Update Successful?}
    VerifyUpdate -->|No| UpdateFailed[Log: Status update failed - investigate]
    UpdateFailed --> ExitPartial

    VerifyUpdate -->|Yes| SendWebhook[Send webhook with campaign details]
    SendWebhook --> SendEmail[Send success email with all asset details]
    SendEmail --> NextRun[Next cron run will find next A1 campaign]
    NextRun --> ExitSuccess([Exit with code 0 - success])

    style Compare fill:#fff3cd
    style VerifyUpdate fill:#fff3cd
    style FullPath fill:#e1f5e1
    style PartialPath fill:#ffe1e1
    style ExitSuccess fill:#e1f5e1
    style ExitPartial fill:#ffe1e1

Detailed Step: Box Metadata Extraction (A2 to A3)

How CreativeX data is read from Box and applied:

flowchart TD
    Start[Process Box File] --> GetFileID[File ID: 2035459900168]
    GetFileID --> CallAPI[Call Box API: GET files slash file_id slash metadata slash enterprise slash ferrerodammetadata]

    CallAPI --> CheckResponse{Metadata Found HTTP 200?}

    CheckResponse -->|No 404| LogNoMeta[Log: No metadata template found on file]
    LogNoMeta --> ReturnEmpty[Return empty dict - proceed without CreativeX data]

    CheckResponse -->|Yes| ParseMeta[Parse metadata response JSON]
    ParseMeta --> ExtractScore[Extract creativexScore field: 90]
    ExtractScore --> ExtractURL[Extract creativexUrl field: https://www.bbc.com]

    ExtractURL --> LogValues[Log: CreativeX Score 90, CreativeX URL https...]
    LogValues --> ReturnData[Return dict: score 90, url https://www.bbc.com]

    ReturnData --> PassToMVP[Pass to MetadataExtractorMVP.build_mvp_asset_representation]
    PassToMVP --> UpdateFields[Update FERRERO.FIELD.CREATIVEX LINK with creativexUrl]
    UpdateFields --> LogUpdated[Log: Set CREATIVEX LINK to URL]

    LogUpdated --> IncludeInUpload[CreativeX URL included in asset representation uploaded to DAM]

    ReturnEmpty --> PassToMVP2[Pass empty dict to MVP extractor]
    PassToMVP2 --> NoUpdate[No CreativeX fields updated - use master values]
    NoUpdate --> IncludeInUpload

    style Start fill:#e1f0ff
    style CheckResponse fill:#fff3cd
    style IncludeInUpload fill:#e1f5e1
    style UpdateFields fill:#ffe1ff

Error Handling Flow (All Scripts)

How errors are handled and retried:

flowchart TD
    Start[Script Execution] --> TryCatch[Wrap in try-except block]

    TryCatch --> MainLogic[Execute main workflow logic]

    MainLogic --> ErrorOccurs{Error Occurs?}

    ErrorOccurs -->|No| Success[Complete successfully]
    Success --> LogSuccess[Log success message]
    LogSuccess --> SendSuccessEmail[Send success email]
    SendSuccessEmail --> ExitZero([Exit code 0])

    ErrorOccurs -->|Yes| ClassifyError{Error Type?}

    ClassifyError -->|Connection Error| LogConnError[Log: Connection failed]
    ClassifyError -->|Validation Error| LogValError[Log: Validation failed with details]
    ClassifyError -->|API Error| LogAPIError[Log: API returned error code and message]
    ClassifyError -->|Other Error| LogGenError[Log: Exception with stack trace]

    LogConnError --> CheckRetryable{Is Retryable?}
    LogValError --> NotRetryable[Not Retryable - needs manual fix]
    LogAPIError --> CheckRetryable
    LogGenError --> CheckRetryable

    CheckRetryable -->|Yes| LogWillRetry[Log: Will retry on next cron run 5 min]
    CheckRetryable -->|No| NotRetryable

    LogWillRetry --> SendErrorEmail[Send error email with: filename, error message, retry info]
    NotRetryable --> SendErrorEmail

    SendErrorEmail --> StatusNotUpdated[Status NOT updated - campaign remains at current state]
    StatusNotUpdated --> ExitOne([Exit code 1 - indicates failure])

    ExitOne -.->|Next Cron Run 5 min later| Start

    style Start fill:#e1f0ff
    style ExitZero fill:#e1f5e1
    style ExitOne fill:#ffe1e1
    style CheckRetryable fill:#fff3cd
    style LogWillRetry fill:#e1f5e1

Cron Scheduling Flow

How the three scripts work together:

flowchart TD
    Cron([Cron Daemon Running on Server]) --> Every5Min{Every 5 Minutes}

    Every5Min --> TriggerA1[Trigger: a1_to_a2_download.py]
    Every5Min --> TriggerA2[Trigger: a2_to_a3_upload_polling.py]
    Every5Min --> TriggerB1[Trigger: b1_to_b2_download.py]

    TriggerA1 --> A1Script[A1 to A2 Script Executes]
    A1Script --> A1Search[Search for A1 Local campaigns]
    A1Search --> A1Process{Found A1?}
    A1Process -->|Yes| A1Download[Download, Box upload, DB store, Update to A2]
    A1Process -->|No| A1Exit[Exit - nothing to do]
    A1Download --> A1Exit
    A1Exit --> A1Log[Write to logs/a1_to_a2.log]

    TriggerA2 --> A2Script[A2 to A3 Script Executes]
    A2Script --> A2Poll[Poll Box folder 348526703108]
    A2Poll --> A2Process{Found V2 Files?}
    A2Process -->|Yes| A2Upload[Download from Box, Upload to DAM, Delete from Box]
    A2Process -->|No| A2Exit[Exit - nothing to do]
    A2Upload --> A2Exit
    A2Exit --> A2Log[Write to logs/a2_to_a3.log]

    TriggerB1 --> B1Script[B1 to B2 Script Executes]
    B1Script --> B1Search[Search for B1 Global campaigns]
    B1Search --> B1Process{Found B1?}
    B1Process -->|Yes| B1Download[Download, Box upload, DB store, Update to B2]
    B1Process -->|No| B1Exit[Exit - nothing to do]
    B1Download --> B1Exit
    B1Exit --> B1Log[Write to logs/b1_to_b2.log]

    A1Log --> Wait[Wait 5 minutes]
    A2Log --> Wait
    B1Log --> Wait

    Wait --> Every5Min

    style Cron fill:#e1f0ff
    style A1Download fill:#e1f5e1
    style A2Upload fill:#e1f5e1
    style B1Download fill:#e1f5e1

Database Interaction Flow

How scripts interact with PostgreSQL:

flowchart TD
    Script[Python Script] --> ConnPool[Get Connection from Pool min 1 max 10 connections]

    ConnPool --> Operation{Operation Type?}

    Operation -->|Generate ID| GenID[Generate 6-char random ID]
    GenID --> CheckUnique[SELECT COUNT FROM master_assets WHERE tracking_id equals ID]
    CheckUnique --> IsUnique{Count equals 0?}
    IsUnique -->|No| GenID
    IsUnique -->|Yes| ReturnID[Return unique tracking ID]

    Operation -->|Store Master| InsertMaster[INSERT INTO master_assets: tracking_id, opentext_id, full_metadata JSONB, upload_directory, description, all extracted fields]
    InsertMaster --> OnConflict[ON CONFLICT tracking_id DO UPDATE SET upload_directory, full_metadata, updated_at]
    OnConflict --> CommitMaster[COMMIT transaction]

    Operation -->|Get Master| SelectMaster[SELECT tracking_id, opentext_id, upload_directory, full_metadata FROM master_assets WHERE tracking_id equals ID AND status equals active]
    SelectMaster --> DecodeMeta[Decode full_metadata JSONB to Python dict]
    DecodeMeta --> ReturnMaster[Return complete master asset with 200KB metadata]

    Operation -->|Store Derivative| InsertDeriv[INSERT INTO derivative_assets: tracking_id, derivative_filename, file_extension, status active]
    InsertDeriv --> CommitDeriv[COMMIT transaction]

    Operation -->|All Done Check| CountMasters[SELECT COUNT DISTINCT tracking_id FROM master_assets WHERE campaign_id equals ID]
    CountMasters --> CountDerivs[SELECT COUNT DISTINCT tracking_id FROM derivative_assets WHERE tracking_id IN master_assets for campaign]
    CountDerivs --> CompareCounts{Counts Equal?}
    CompareCounts -->|Yes| ReturnTrue[Return True - all done]
    CompareCounts -->|No| ReturnFalse[Return False - more pending]

    ReturnID --> ReturnConn[Return connection to pool]
    CommitMaster --> ReturnConn
    ReturnMaster --> ReturnConn
    CommitDeriv --> ReturnConn
    ReturnTrue --> ReturnConn
    ReturnFalse --> ReturnConn

    ReturnConn --> Script

    style Script fill:#e1f0ff
    style ConnPool fill:#ffe1ff
    style CommitMaster fill:#e1f5e1
    style ReturnTrue fill:#e1f5e1

Box Folder Organization

How files are organized in Box:

flowchart TD
    BoxRoot[Box Root Folders] --> A1Folder[348304357505: Local Adaptation A1 to A2]
    BoxRoot --> A2Folder[348526703108: Agency Uploads A2 to A3 Input]
    BoxRoot --> B1Folder[349261192115: Global Masters B1 to B2]

    A1Folder --> A1Camp1[C000000078-Local_adaptation_test_2]
    A1Camp1 --> A1File1[06_RAFFAELLO_MAESTRO_SD_ABC123.mp4]
    A1Camp1 --> A1File2[8000500247167_8_DEF456.tif]

    A2Folder --> A2File1[1234567_RAF_DE_de_TEST_OLV_001_6S_16x9_ABC123.mp4]
    A2File1 -.->|Has Metadata| A2Meta[ferrerodammetadata: creativexScore 90, creativexUrl https...]

    B1Folder --> B1Camp1[MASTERS_NUTELLA_PLANT-BASED_LAUNCH]
    B1Camp1 --> B1File1[nutella_pbased_XYZ789.jpg]
    B1Camp1 --> B1File2[nutella_pbased_beauty_GHI012.jpg]

    style A1Folder fill:#e1f5e1
    style A2Folder fill:#ffe1e1
    style B1Folder fill:#e1f0ff
    style A2Meta fill:#ffe1ff

Complete System Architecture

How all components work together:

flowchart TB
    Cron[Cron Scheduler] --> A1Script[a1_to_a2_download.py Every 5 min]
    Cron --> A2Script[a2_to_a3_upload_polling.py Every 5 min]
    Cron --> B1Script[b1_to_b2_download.py Every 5 min]

    A1Script --> DAM[(OpenText DAM API)]
    A2Script --> DAM
    B1Script --> DAM

    A1Script --> Box[(Box.com API)]
    A2Script --> Box
    B1Script --> Box

    A1Script --> DB[(PostgreSQL Database)]
    A2Script --> DB
    B1Script --> DB

    A1Script --> Email[SMTP Mailgun]
    A2Script --> Email
    B1Script --> Email

    A1Script --> Webhook[Make.com Webhook]

    DAM -.->|OAuth2| Auth[Access Token auto-refresh]
    Box -.->|JWT| BoxAuth[JWT Token enterprise auth]
    DB -.->|Connection Pool| DBPool[1-10 connections]

    A1Script --> LogA1[logs/a1_to_a2.log Rotated 28 files 10MB each]
    A2Script --> LogA2[logs/a2_to_a3.log Rotated 28 files 10MB each]
    B1Script --> LogB1[logs/b1_to_b2.log Rotated 28 files 10MB each]

    Config[config.yaml + .env] -.->|Configuration| A1Script
    Config -.->|Configuration| A2Script
    Config -.->|Configuration| B1Script

    style Cron fill:#e1f0ff
    style DAM fill:#ffe1e1
    style Box fill:#e1ffe1
    style DB fill:#ffe1ff
    style Email fill:#fff3cd

Notes

Diagram Legend

Colors:

  • 🟢 Green: Success paths, completed states
  • 🔴 Red: Error paths, failed states
  • 🟡 Yellow: Decision points, important checks
  • 🔵 Blue: Processing steps, data transformation
  • 🟣 Purple: External services, special features

Key Decision Points

All-Done Check:

  • Critical for preventing premature status updates
  • Only proceeds when ALL assets successful
  • Failed assets cause retry on next run

Single-Run Mode:

  • Each script processes ONE item (campaign or file)
  • Cron handles scheduling (every 5 minutes)
  • Allows controlled processing
  • Easy to test and debug

Box Metadata:

  • Optional enhancement
  • If template not found, continues without
  • Doesn't block upload if missing

File Naming

A1→A2 and B1→B2 (Box Upload):

  • Original: filename.ext
  • With Tracking: filename_ABC123.ext

A2→A3 (DAM Upload):

  • From Box: 1234567_RAF_DE_de_TEST_OLV_001_6S_16x9_ABC123.mp4
  • To DAM: RAF_DE_de_TEST_OLV_001_6S_16x9.mp4

For complete documentation, see PROJECT_STATUS_2025-11-03.md