Major README overhaul with complete deployment and configuration guide. Moved old docs to tests/ folder for archive. README.md UPDATES (880 lines - completely rewritten): ✓ Table of contents with navigation ✓ Complete overview of all 4 workflows + daily report ✓ Detailed authentication section (OAuth2 vs mTLS) ✓ Box-config.json location explanation ✓ Server deployment step-by-step guide ✓ Database setup (Docker + native PostgreSQL) ✓ Cron job examples for all workflows ✓ Comprehensive troubleshooting section ✓ Security checklist ✓ Monitoring and log rotation details ✓ Common SQL queries ✓ File structure diagram KEY SECTIONS ADDED: 1. What's Included - All 5 scripts explained 2. Quick Start - Local setup guide 3. Server Deployment - 6-step process with commands 4. Workflows - Detailed process for each (A1→A2, A5→A6, B1→B2, A2→A3, Daily Report) 5. Authentication - OAuth2 vs mTLS with examples 6. Configuration - All .env variables documented 7. Database - Schema, setup, queries 8. Monitoring - Logs, emails, database queries 9. Troubleshooting - Common issues + solutions 10. File Structure - Complete directory tree BOX-CONFIG.JSON LOCATION DOCUMENTED: ✓ Must be one folder up from Python-Version ✓ Referenced as ../Box-config.json in config.yaml ✓ Server deployment instructions include copying both files ✓ Troubleshooting section explains file not found errors MTLS DOCUMENTATION: ✓ Different base URL explained (dev-auth.app-api.ferrero.com) ✓ --auth-pfx flag usage ✓ Whitelisted IP requirement noted ✓ Certificate testing commands REORGANIZATION: - Moved old DEPLOYMENT.md → tests/DEPLOYMENT.md (archive) - Moved old WORKFLOW_DIAGRAMS.md → tests/WORKFLOW_DIAGRAMS.md (archive) - New DEPLOYMENT_GUIDE.md is the current deployment doc - README.md is now comprehensive one-stop documentation Changes: - Python-Version/README.md (completely rewritten, 880 lines) - Moved 2 old docs to tests/ folder - Added test files to tests/ folder 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
27 KiB
Python Automation Workflow Diagrams
Detailed Mermaid flowcharts for each Python automation script.
A1 to A2 - Local Adaptation Master Asset Downloader
Script: a1_to_a2_download.py
Trigger: Cron (every 5 minutes)
Purpose: Download Local Adaptation master assets and upload to Box
flowchart TD
Start([Cron triggers script every 5 min]) --> Init[Initialize Clients: DAM, Box, Database, Notifier]
Init --> TestConn{Test All Connections}
TestConn -->|Any Failed| ExitFail([Exit with error code 1])
TestConn -->|All OK| Search[Search DAM for campaigns with Status A1 and Campaign Type Local Adaptation]
Search --> CheckCampaigns{Found A1 Campaigns?}
CheckCampaigns -->|No| LogNone[Log: No A1 campaigns found]
LogNone --> ExitClean([Exit with code 0])
CheckCampaigns -->|Yes| SelectFirst[Select FIRST campaign only]
SelectFirst --> GetMasterAssets[Get Master Assets from 01 Master Assets folder]
GetMasterAssets --> CheckAssets{Found Assets?}
CheckAssets -->|No| SkipCampaign[Log: No master assets found]
SkipCampaign --> ExitFail
CheckAssets -->|Yes| FindFinalFolder[Find 01 Final Assets folder for upload directory]
FindFinalFolder --> InitCounters[Initialize: processed count 0, failed count 0]
InitCounters --> LoopAssets[For Each Asset in Campaign]
LoopAssets --> DownloadAsset[Download asset from DAM to temp folder]
DownloadAsset --> GenTrackID[Generate unique 6-char tracking ID from database]
GenTrackID --> CreateBoxFolder[Create or find Box folder: CampaignID-CampaignName]
CreateBoxFolder --> UploadBox[Upload to Box with tracking ID suffix: filename_ABC123.ext]
UploadBox --> StoreDB[Store in PostgreSQL: tracking ID, DAM asset ID, Box file ID, full metadata JSONB, upload directory]
StoreDB --> CheckSuccess{Upload and Storage Successful?}
CheckSuccess -->|Yes| IncrementSuccess[Increment processed count, Add to success list]
CheckSuccess -->|No| IncrementFail[Increment failed count, Add to failed list with error]
IncrementSuccess --> CleanTemp[Delete temp file]
IncrementFail --> CleanTemp
CleanTemp --> MoreAssets{More Assets to Process?}
MoreAssets -->|Yes| LoopAssets
MoreAssets -->|No| AllDoneCheck{All Assets Successful? processed equals total}
AllDoneCheck -->|No| LogPartial[Log: Campaign incomplete - some failed]
LogPartial --> SendPartialEmail[Send email: a1_to_a2_partial with success and failed lists]
SendPartialEmail --> NoStatusUpdate[Status stays A1 - will retry on next run]
NoStatusUpdate --> ExitFail
AllDoneCheck -->|Yes| LogAllDone[Log: All assets processed successfully]
LogAllDone --> UpdateStatus[Update campaign status A1 to A2 in DAM]
UpdateStatus --> StatusCheck{Status Update Successful?}
StatusCheck -->|No| LogStatusFail[Log: Status update failed]
LogStatusFail --> ExitFail
StatusCheck -->|Yes| SendWebhook[Send webhook to Make.com: campaign ID, campaign number, status change, asset list, timestamp]
SendWebhook --> SendEmail[Send email: a1_to_a2_complete with all processed assets, tracking IDs, Box URLs]
SendEmail --> LogSuccess[Log: Campaign completed successfully]
LogSuccess --> ExitSuccess([Exit with code 0])
style Start fill:#e1f5e1
style ExitSuccess fill:#e1f5e1
style ExitFail fill:#ffe1e1
style ExitClean fill:#e1f5e1
style AllDoneCheck fill:#fff3cd
style StatusCheck fill:#fff3cd
A2 to A3 - Box to DAM Upload Handler (Polling)
Script: a2_to_a3_upload_polling.py
Trigger: Cron (every 5 minutes)
Purpose: Upload agency-processed files from Box to DAM
flowchart TD
Start([Cron triggers script every 5 min]) --> Init[Initialize Clients: DAM, Box folder 348526703108, Database, Notifier, FilenameParser, MetadataExtractorMVP]
Init --> TestConn{Test All Connections}
TestConn -->|Any Failed| ExitFail([Exit with error code 1])
TestConn -->|All OK| ListBox[List all files in Box folder 348526703108]
ListBox --> CheckFiles{Found Files?}
CheckFiles -->|No| LogNoFiles[Log: No files found in Box folder]
LogNoFiles --> ExitClean([Exit with code 0])
CheckFiles -->|Yes| FilterValid[Filter for valid V2 filenames with tracking IDs]
FilterValid --> CheckValid{Found Valid V2 Files?}
CheckValid -->|No| LogNoValid[Log: No valid V2 files to process]
LogNoValid --> ExitClean
CheckValid -->|Yes| SelectFirst[Select FIRST file only - more processed on next run]
SelectFirst --> ParseFilename[Parse V2 filename: extract all 10 components]
ParseFilename --> ValidateV2{V2 Filename Valid?}
ValidateV2 -->|No| LogInvalid[Log: Invalid V2 filename with errors]
LogInvalid --> SendFailEmail[Send email: upload failed with validation errors]
SendFailEmail --> ExitFail
ValidateV2 -->|Yes| ExtractTracking[Extract tracking ID from filename]
ExtractTracking --> CheckTracking{Tracking ID Exists?}
CheckTracking -->|No| LogNoTracking[Log: No tracking ID in filename]
LogNoTracking --> ExitFail
CheckTracking -->|Yes| LoadMaster[Load master metadata from PostgreSQL by tracking ID]
LoadMaster --> CheckMaster{Master Asset Found?}
CheckMaster -->|No| LogNoMaster[Log: No master asset for tracking ID]
LogNoMaster --> ExitFail
CheckMaster -->|Yes| GetBoxMeta[Read Box metadata template: ferrerodammetadata]
GetBoxMeta --> ExtractCreativeX[Extract: creativexScore, creativexUrl]
ExtractCreativeX --> DownloadBox[Download file from Box to temp folder]
DownloadBox --> StripFilename[Strip OMG Job Number and Tracking ID from filename]
StripFilename --> BuildMVP[Build MVP asset representation: Load 24 fields from master, Add 3 missing from filename, Update Description State Language, Update CreativeX from Box metadata, Total 27-28 fields]
BuildMVP --> RenameTemp[Rename temp file to clean filename]
RenameTemp --> UploadDAM[Upload to DAM Final Assets folder with MVP metadata and video dimensions]
UploadDAM --> CheckUpload{Upload Successful HTTP 201 or 202?}
CheckUpload -->|No| LogUploadFail[Log: Upload failed with error]
LogUploadFail --> CleanFail[Delete temp file]
CleanFail --> SendFailEmail2[Send email: upload failed with error details]
SendFailEmail2 --> ExitFail
CheckUpload -->|Yes| StoreDerivative[Store derivative asset record in database: tracking ID and filename]
StoreDerivative --> DeleteBox[Delete file from Box after successful upload]
DeleteBox --> CleanSuccess[Delete local temp file]
CleanSuccess --> SendSuccessEmail[Send email: a2_to_a3_file_uploaded with asset ID, tracking ID, clean filename, Box metadata values]
SendSuccessEmail --> LogSuccess[Log: File processed successfully with asset ID]
LogSuccess --> ExitSuccess([Exit with code 0])
style Start fill:#e1f5e1
style ExitSuccess fill:#e1f5e1
style ExitFail fill:#ffe1e1
style ExitClean fill:#e1f5e1
style BuildMVP fill:#e1f0ff
style GetBoxMeta fill:#ffe1ff
style DeleteBox fill:#fff3cd
B1 to B2 - Global Masters Asset Downloader
Script: b1_to_b2_download.py
Trigger: Cron (every 5 minutes)
Purpose: Download Global Master assets and upload to Box
flowchart TD
Start([Cron triggers script every 5 min]) --> Init[Initialize Clients: DAM, Box folder 349261192115, Database, Notifier]
Init --> TestConn{Test All Connections}
TestConn -->|Any Failed| ExitFail([Exit with error code 1])
TestConn -->|All OK| Search[Search DAM for campaigns with Status B1 and Campaign Type Global comm]
Search --> CheckCampaigns{Found B1 Campaigns?}
CheckCampaigns -->|No| LogNone[Log: No B1 campaigns found]
LogNone --> ExitClean([Exit with code 0])
CheckCampaigns -->|Yes| SelectFirst[Select FIRST campaign only]
SelectFirst --> GetGlobalAssets[Get Global Master Assets from 05 Final Assets folder using is_global equals True]
GetGlobalAssets --> CheckAssets{Found Assets?}
CheckAssets -->|No| SkipCampaign[Log: No master assets found]
SkipCampaign --> ExitFail
CheckAssets -->|Yes| FindFinalFolder[Find 05 Final Assets folder - same as source]
FindFinalFolder --> InitCounters[Initialize: processed count 0, failed count 0]
InitCounters --> LoopAssets[For Each Asset in Campaign]
LoopAssets --> DownloadAsset[Download asset from DAM to temp folder]
DownloadAsset --> GenTrackID[Generate unique 6-char tracking ID from database]
GenTrackID --> CreateBoxFolder[Create or find Box folder: MASTERS_CampaignName no campaign number]
CreateBoxFolder --> UploadBox[Upload to Box with tracking ID suffix to folder 349261192115]
UploadBox --> StoreDB[Store in PostgreSQL: tracking ID, DAM asset ID, Box file ID, full metadata JSONB, upload directory]
StoreDB --> CheckSuccess{Upload and Storage Successful?}
CheckSuccess -->|Yes| IncrementSuccess[Increment processed count, Add to success list]
CheckSuccess -->|No| IncrementFail[Increment failed count, Add to failed list with error]
IncrementSuccess --> CleanTemp[Delete temp file]
IncrementFail --> CleanTemp
CleanTemp --> MoreAssets{More Assets to Process?}
MoreAssets -->|Yes| LoopAssets
MoreAssets -->|No| AllDoneCheck{All Assets Successful? processed equals total}
AllDoneCheck -->|No| LogPartial[Log: Global campaign incomplete - some failed]
LogPartial --> SendPartialEmail[Send email: b1_to_b2_partial with success and failed lists]
SendPartialEmail --> NoStatusUpdate[Status stays B1 - will retry on next run]
NoStatusUpdate --> ExitFail
AllDoneCheck -->|Yes| LogAllDone[Log: All Global assets processed successfully]
LogAllDone --> UpdateStatus[Update campaign status B1 to B2 in DAM]
UpdateStatus --> StatusCheck{Status Update Successful?}
StatusCheck -->|No| LogStatusFail[Log: Status update failed]
LogStatusFail --> ExitFail
StatusCheck -->|Yes| NoWebhook[Note: B1 to B2 does NOT send webhook only email]
NoWebhook --> SendEmail[Send email: b1_to_b2_complete with all processed assets, tracking IDs, Box URLs, Box folder 349261192115]
SendEmail --> LogSuccess[Log: Campaign completed successfully]
LogSuccess --> ExitSuccess([Exit with code 0])
style Start fill:#e1f5e1
style ExitSuccess fill:#e1f5e1
style ExitFail fill:#ffe1e1
style ExitClean fill:#e1f5e1
style AllDoneCheck fill:#fff3cd
style StatusCheck fill:#fff3cd
style NoWebhook fill:#e1f0ff
Data Flow: A1 to A2 Workflow
How data moves through the system:
flowchart LR
DAM[(OpenText DAM)] -->|1. Download| Script[a1_to_a2_download.py]
Script -->|2. Generate Tracking ID| DB[(PostgreSQL Database)]
Script -->|3. Upload with Tracking ID| Box[(Box.com Folder 348304357505)]
Script -->|4. Store Full Metadata JSONB| DB
Script -->|5. Update Status A1 to A2| DAM
Script -->|6. Send Campaign Info| Webhook[Make.com Webhook]
Script -->|7. Send Details| Email[SMTP Email to daveporter at oliver.agency]
Box -->|Folder Structure| BoxFolder[C000000078-Campaign_Name]
BoxFolder -->|Files| BoxFile[06_RAFFAELLO_MAESTRO_SD_ABC123.mp4]
DB -->|Stores| DBData[tracking_id: ABC123, opentext_id: 0008a50..., full_metadata: complete JSON 200KB, upload_directory: ea0dbf...]
style Script fill:#e1f0ff
style DAM fill:#ffe1e1
style Box fill:#e1ffe1
style DB fill:#ffe1ff
Data Flow: A2 to A3 Workflow
How data moves through the system:
flowchart LR
Box[(Box.com Folder 348526703108)] -->|1. Poll for Files| Script[a2_to_a3_upload_polling.py]
Box -->|2. Read Metadata Template| Script
Script -->|3. Get Master Metadata| DB[(PostgreSQL Database)]
Script -->|4. Download File| Box
Script -->|5. Parse V2 Filename| Parser[FilenameParser]
Parser -->|Components| Script
Script -->|6. Build MVP Fields| MVP[MetadataExtractorMVP]
MVP -->|27-28 Fields| Script
Script -->|7. Update Fields| Update[Description from filename, State to Local, Language from filename, CreativeX from Box]
Update -->|Final Metadata| Script
Script -->|8. Upload Clean File| DAM[(OpenText DAM)]
Script -->|9. Store Derivative| DB
Script -->|10. Delete from Box| Box
Script -->|11. Send Notification| Email[SMTP Email]
Box -->|Metadata Template| Meta[ferrerodammetadata: creativexScore 90, creativexUrl https...]
style Script fill:#e1f0ff
style Box fill:#e1ffe1
style DAM fill:#ffe1e1
style DB fill:#ffe1ff
style MVP fill:#fff3cd
Data Flow: B1 to B2 Workflow
How data moves through the system:
flowchart LR
DAM[(OpenText DAM)] -->|1. Download from 05 Final Assets| Script[b1_to_b2_download.py]
Script -->|2. Generate Tracking ID| DB[(PostgreSQL Database)]
Script -->|3. Upload with Tracking ID| Box[(Box.com Folder 349261192115)]
Script -->|4. Store Full Metadata JSONB| DB
Script -->|5. Update Status B1 to B2| DAM
Script -->|6. Send Details No Webhook| Email[SMTP Email to daveporter at oliver.agency]
Box -->|Folder Structure| BoxFolder[MASTERS_NUTELLA_PLANT-BASED_LAUNCH]
BoxFolder -->|Files| BoxFile[nutella_pbased_ABC123.jpg]
DB -->|Stores| DBData[tracking_id: ABC123, opentext_id: 747110..., full_metadata: complete JSON, upload_directory: folder ID]
style Script fill:#e1f0ff
style DAM fill:#ffe1e1
style Box fill:#e1ffe1
style DB fill:#ffe1ff
Detailed Step: MVP Metadata Building (A2 to A3)
How 27-28 MVP fields are constructed:
flowchart TD
Start[Build MVP Asset Representation] --> LoadMaster[Load full_metadata JSONB from PostgreSQL 200KB plus]
LoadMaster --> SearchFields[Search through 15 categories for 28 MVP field IDs]
SearchFields --> Extract[Extract fields with exact structure from master]
Extract --> Count[Count: Found 24 out of 28 fields typically]
Count --> UpdateName[Update ARTESIA.FIELD.ASSET NAME to clean filename RAF_DE_de_TEST_OLV_001_6S_16x9.mp4]
UpdateName --> UpdateDesc[Update ARTESIA.FIELD.ASSET DESCRIPTION to subject title from filename TEST-JOB]
UpdateDesc --> ForceState[Force FERRERO.FIELD.STATE to Local]
ForceState --> AddLang{MAIN_LANGUAGES missing?}
AddLang -->|Yes| AddLangField[Add MAIN_LANGUAGES from filename: de to DE]
AddLang -->|No| CheckCompliance
AddLangField --> CheckCompliance
CheckCompliance{ASSETCOMPLIANCE missing?} -->|Yes| AddCompliance[Add FERRERO.FIELD.ASSETCOMPLIANCE: dash]
CheckCompliance -->|No| CheckTag
AddCompliance --> CheckTag
CheckTag{MARKETING_TAG missing?} -->|Yes| AddTag[Add MARKETING_TAG: Tag]
CheckTag -->|No| CheckCreativeX
AddTag --> CheckCreativeX
CheckCreativeX{Box Metadata Has CreativeX?} -->|Yes| UpdateCXURL[Update FERRERO.FIELD.CREATIVEX LINK to creativexUrl from Box]
CheckCreativeX -->|No| BuildRep
UpdateCXURL --> LogCXScore[Log creativexScore value for reference]
LogCXScore --> BuildRep
BuildRep[Build final asset_resource structure: metadata with 27-28 fields, metadata_model_id ECOMMERCE, security_policy_list 1594]
BuildRep --> Return[Return complete asset representation ready for DAM upload]
style Start fill:#e1f0ff
style Return fill:#e1f5e1
style BuildRep fill:#fff3cd
style UpdateCXURL fill:#ffe1ff
Detailed Step: All-Done Check (A1 to A2)
How the system determines if all assets are processed:
flowchart TD
Start[Campaign Processing Complete] --> CountTotal[total_assets: Count of assets in campaign]
CountTotal --> CountSuccess[processed_count: Count of successfully processed assets]
CountSuccess --> CountFail[failed_count: Count of failed assets]
CountFail --> Compare{processed_count equals total_assets?}
Compare -->|No| PartialPath[PARTIAL COMPLETION PATH]
PartialPath --> LogRatio[Log: processed over total successful]
LogRatio --> StatusStays[Status remains A1 - DO NOT UPDATE]
StatusStays --> EmailPartial[Send partial email with: success list, failed list with errors]
EmailPartial --> WillRetry[Next cron run will retry same campaign]
WillRetry --> ExitPartial([Exit with code 1 - indicates failure])
Compare -->|Yes| FullPath[FULL COMPLETION PATH]
FullPath --> LogAllDone[Log: All assets processed successfully]
LogAllDone --> SafeUpdate[Safe to update status - all work complete]
SafeUpdate --> UpdateStatus[Call DAM API: PATCH status A1 to A2]
UpdateStatus --> VerifyUpdate{Update Successful?}
VerifyUpdate -->|No| UpdateFailed[Log: Status update failed - investigate]
UpdateFailed --> ExitPartial
VerifyUpdate -->|Yes| SendWebhook[Send webhook with campaign details]
SendWebhook --> SendEmail[Send success email with all asset details]
SendEmail --> NextRun[Next cron run will find next A1 campaign]
NextRun --> ExitSuccess([Exit with code 0 - success])
style Compare fill:#fff3cd
style VerifyUpdate fill:#fff3cd
style FullPath fill:#e1f5e1
style PartialPath fill:#ffe1e1
style ExitSuccess fill:#e1f5e1
style ExitPartial fill:#ffe1e1
Detailed Step: Box Metadata Extraction (A2 to A3)
How CreativeX data is read from Box and applied:
flowchart TD
Start[Process Box File] --> GetFileID[File ID: 2035459900168]
GetFileID --> CallAPI[Call Box API: GET files slash file_id slash metadata slash enterprise slash ferrerodammetadata]
CallAPI --> CheckResponse{Metadata Found HTTP 200?}
CheckResponse -->|No 404| LogNoMeta[Log: No metadata template found on file]
LogNoMeta --> ReturnEmpty[Return empty dict - proceed without CreativeX data]
CheckResponse -->|Yes| ParseMeta[Parse metadata response JSON]
ParseMeta --> ExtractScore[Extract creativexScore field: 90]
ExtractScore --> ExtractURL[Extract creativexUrl field: https://www.bbc.com]
ExtractURL --> LogValues[Log: CreativeX Score 90, CreativeX URL https...]
LogValues --> ReturnData[Return dict: score 90, url https://www.bbc.com]
ReturnData --> PassToMVP[Pass to MetadataExtractorMVP.build_mvp_asset_representation]
PassToMVP --> UpdateFields[Update FERRERO.FIELD.CREATIVEX LINK with creativexUrl]
UpdateFields --> LogUpdated[Log: Set CREATIVEX LINK to URL]
LogUpdated --> IncludeInUpload[CreativeX URL included in asset representation uploaded to DAM]
ReturnEmpty --> PassToMVP2[Pass empty dict to MVP extractor]
PassToMVP2 --> NoUpdate[No CreativeX fields updated - use master values]
NoUpdate --> IncludeInUpload
style Start fill:#e1f0ff
style CheckResponse fill:#fff3cd
style IncludeInUpload fill:#e1f5e1
style UpdateFields fill:#ffe1ff
Error Handling Flow (All Scripts)
How errors are handled and retried:
flowchart TD
Start[Script Execution] --> TryCatch[Wrap in try-except block]
TryCatch --> MainLogic[Execute main workflow logic]
MainLogic --> ErrorOccurs{Error Occurs?}
ErrorOccurs -->|No| Success[Complete successfully]
Success --> LogSuccess[Log success message]
LogSuccess --> SendSuccessEmail[Send success email]
SendSuccessEmail --> ExitZero([Exit code 0])
ErrorOccurs -->|Yes| ClassifyError{Error Type?}
ClassifyError -->|Connection Error| LogConnError[Log: Connection failed]
ClassifyError -->|Validation Error| LogValError[Log: Validation failed with details]
ClassifyError -->|API Error| LogAPIError[Log: API returned error code and message]
ClassifyError -->|Other Error| LogGenError[Log: Exception with stack trace]
LogConnError --> CheckRetryable{Is Retryable?}
LogValError --> NotRetryable[Not Retryable - needs manual fix]
LogAPIError --> CheckRetryable
LogGenError --> CheckRetryable
CheckRetryable -->|Yes| LogWillRetry[Log: Will retry on next cron run 5 min]
CheckRetryable -->|No| NotRetryable
LogWillRetry --> SendErrorEmail[Send error email with: filename, error message, retry info]
NotRetryable --> SendErrorEmail
SendErrorEmail --> StatusNotUpdated[Status NOT updated - campaign remains at current state]
StatusNotUpdated --> ExitOne([Exit code 1 - indicates failure])
ExitOne -.->|Next Cron Run 5 min later| Start
style Start fill:#e1f0ff
style ExitZero fill:#e1f5e1
style ExitOne fill:#ffe1e1
style CheckRetryable fill:#fff3cd
style LogWillRetry fill:#e1f5e1
Cron Scheduling Flow
How the three scripts work together:
flowchart TD
Cron([Cron Daemon Running on Server]) --> Every5Min{Every 5 Minutes}
Every5Min --> TriggerA1[Trigger: a1_to_a2_download.py]
Every5Min --> TriggerA2[Trigger: a2_to_a3_upload_polling.py]
Every5Min --> TriggerB1[Trigger: b1_to_b2_download.py]
TriggerA1 --> A1Script[A1 to A2 Script Executes]
A1Script --> A1Search[Search for A1 Local campaigns]
A1Search --> A1Process{Found A1?}
A1Process -->|Yes| A1Download[Download, Box upload, DB store, Update to A2]
A1Process -->|No| A1Exit[Exit - nothing to do]
A1Download --> A1Exit
A1Exit --> A1Log[Write to logs/a1_to_a2.log]
TriggerA2 --> A2Script[A2 to A3 Script Executes]
A2Script --> A2Poll[Poll Box folder 348526703108]
A2Poll --> A2Process{Found V2 Files?}
A2Process -->|Yes| A2Upload[Download from Box, Upload to DAM, Delete from Box]
A2Process -->|No| A2Exit[Exit - nothing to do]
A2Upload --> A2Exit
A2Exit --> A2Log[Write to logs/a2_to_a3.log]
TriggerB1 --> B1Script[B1 to B2 Script Executes]
B1Script --> B1Search[Search for B1 Global campaigns]
B1Search --> B1Process{Found B1?}
B1Process -->|Yes| B1Download[Download, Box upload, DB store, Update to B2]
B1Process -->|No| B1Exit[Exit - nothing to do]
B1Download --> B1Exit
B1Exit --> B1Log[Write to logs/b1_to_b2.log]
A1Log --> Wait[Wait 5 minutes]
A2Log --> Wait
B1Log --> Wait
Wait --> Every5Min
style Cron fill:#e1f0ff
style A1Download fill:#e1f5e1
style A2Upload fill:#e1f5e1
style B1Download fill:#e1f5e1
Database Interaction Flow
How scripts interact with PostgreSQL:
flowchart TD
Script[Python Script] --> ConnPool[Get Connection from Pool min 1 max 10 connections]
ConnPool --> Operation{Operation Type?}
Operation -->|Generate ID| GenID[Generate 6-char random ID]
GenID --> CheckUnique[SELECT COUNT FROM master_assets WHERE tracking_id equals ID]
CheckUnique --> IsUnique{Count equals 0?}
IsUnique -->|No| GenID
IsUnique -->|Yes| ReturnID[Return unique tracking ID]
Operation -->|Store Master| InsertMaster[INSERT INTO master_assets: tracking_id, opentext_id, full_metadata JSONB, upload_directory, description, all extracted fields]
InsertMaster --> OnConflict[ON CONFLICT tracking_id DO UPDATE SET upload_directory, full_metadata, updated_at]
OnConflict --> CommitMaster[COMMIT transaction]
Operation -->|Get Master| SelectMaster[SELECT tracking_id, opentext_id, upload_directory, full_metadata FROM master_assets WHERE tracking_id equals ID AND status equals active]
SelectMaster --> DecodeMeta[Decode full_metadata JSONB to Python dict]
DecodeMeta --> ReturnMaster[Return complete master asset with 200KB metadata]
Operation -->|Store Derivative| InsertDeriv[INSERT INTO derivative_assets: tracking_id, derivative_filename, file_extension, status active]
InsertDeriv --> CommitDeriv[COMMIT transaction]
Operation -->|All Done Check| CountMasters[SELECT COUNT DISTINCT tracking_id FROM master_assets WHERE campaign_id equals ID]
CountMasters --> CountDerivs[SELECT COUNT DISTINCT tracking_id FROM derivative_assets WHERE tracking_id IN master_assets for campaign]
CountDerivs --> CompareCounts{Counts Equal?}
CompareCounts -->|Yes| ReturnTrue[Return True - all done]
CompareCounts -->|No| ReturnFalse[Return False - more pending]
ReturnID --> ReturnConn[Return connection to pool]
CommitMaster --> ReturnConn
ReturnMaster --> ReturnConn
CommitDeriv --> ReturnConn
ReturnTrue --> ReturnConn
ReturnFalse --> ReturnConn
ReturnConn --> Script
style Script fill:#e1f0ff
style ConnPool fill:#ffe1ff
style CommitMaster fill:#e1f5e1
style ReturnTrue fill:#e1f5e1
Box Folder Organization
How files are organized in Box:
flowchart TD
BoxRoot[Box Root Folders] --> A1Folder[348304357505: Local Adaptation A1 to A2]
BoxRoot --> A2Folder[348526703108: Agency Uploads A2 to A3 Input]
BoxRoot --> B1Folder[349261192115: Global Masters B1 to B2]
A1Folder --> A1Camp1[C000000078-Local_adaptation_test_2]
A1Camp1 --> A1File1[06_RAFFAELLO_MAESTRO_SD_ABC123.mp4]
A1Camp1 --> A1File2[8000500247167_8_DEF456.tif]
A2Folder --> A2File1[1234567_RAF_DE_de_TEST_OLV_001_6S_16x9_ABC123.mp4]
A2File1 -.->|Has Metadata| A2Meta[ferrerodammetadata: creativexScore 90, creativexUrl https...]
B1Folder --> B1Camp1[MASTERS_NUTELLA_PLANT-BASED_LAUNCH]
B1Camp1 --> B1File1[nutella_pbased_XYZ789.jpg]
B1Camp1 --> B1File2[nutella_pbased_beauty_GHI012.jpg]
style A1Folder fill:#e1f5e1
style A2Folder fill:#ffe1e1
style B1Folder fill:#e1f0ff
style A2Meta fill:#ffe1ff
Complete System Architecture
How all components work together:
flowchart TB
Cron[Cron Scheduler] --> A1Script[a1_to_a2_download.py Every 5 min]
Cron --> A2Script[a2_to_a3_upload_polling.py Every 5 min]
Cron --> B1Script[b1_to_b2_download.py Every 5 min]
A1Script --> DAM[(OpenText DAM API)]
A2Script --> DAM
B1Script --> DAM
A1Script --> Box[(Box.com API)]
A2Script --> Box
B1Script --> Box
A1Script --> DB[(PostgreSQL Database)]
A2Script --> DB
B1Script --> DB
A1Script --> Email[SMTP Mailgun]
A2Script --> Email
B1Script --> Email
A1Script --> Webhook[Make.com Webhook]
DAM -.->|OAuth2| Auth[Access Token auto-refresh]
Box -.->|JWT| BoxAuth[JWT Token enterprise auth]
DB -.->|Connection Pool| DBPool[1-10 connections]
A1Script --> LogA1[logs/a1_to_a2.log Rotated 28 files 10MB each]
A2Script --> LogA2[logs/a2_to_a3.log Rotated 28 files 10MB each]
B1Script --> LogB1[logs/b1_to_b2.log Rotated 28 files 10MB each]
Config[config.yaml + .env] -.->|Configuration| A1Script
Config -.->|Configuration| A2Script
Config -.->|Configuration| B1Script
style Cron fill:#e1f0ff
style DAM fill:#ffe1e1
style Box fill:#e1ffe1
style DB fill:#ffe1ff
style Email fill:#fff3cd
Notes
Diagram Legend
Colors:
- 🟢 Green: Success paths, completed states
- 🔴 Red: Error paths, failed states
- 🟡 Yellow: Decision points, important checks
- 🔵 Blue: Processing steps, data transformation
- 🟣 Purple: External services, special features
Key Decision Points
All-Done Check:
- Critical for preventing premature status updates
- Only proceeds when ALL assets successful
- Failed assets cause retry on next run
Single-Run Mode:
- Each script processes ONE item (campaign or file)
- Cron handles scheduling (every 5 minutes)
- Allows controlled processing
- Easy to test and debug
Box Metadata:
- Optional enhancement
- If template not found, continues without
- Doesn't block upload if missing
File Naming
A1→A2 and B1→B2 (Box Upload):
- Original: filename.ext
- With Tracking: filename_ABC123.ext
A2→A3 (DAM Upload):
- From Box: 1234567_RAF_DE_de_TEST_OLV_001_6S_16x9_ABC123.mp4
- To DAM: RAF_DE_de_TEST_OLV_001_6S_16x9.mp4
For complete documentation, see PROJECT_STATUS_2025-11-03.md