Compare commits

58 commits
ppr ... main

Author SHA1 Message Date
nickviljoen
9e92db185a Feature: Apply naming-tool pre-upload metadata overrides on A2→A3 upload
The naming tool's metadata editor saves pre-upload overrides to the
override_metadata table (shared ferrero_tracking DB), but until now the
Python upload pipeline never read from it — every edit was being saved
but never applied to DAM. This wires up the consumer side so user edits
land on the uploaded asset.

- database.py: get_override_metadata() / mark_override_applied(),
  resilient to a missing override_metadata table on dev DBs
- metadata_extractor_mvp.py: OVERRIDE_FIELD_MAP (mirrors the naming
  tool's editor-field → DAM-field-ID map) + _apply_override_fields().
  Applied after master/filename/forced/CreativeX values but before
  asset_type_overrides so EOL/LTD compliance still wins. Empty editor
  values are skipped (leaves inherited value alone). Validity ISO
  dates normalised to MM/DD/YYYY for DAM
- a2_to_a3_upload_polling.py: lookup before building the asset rep,
  pass override_fields into build_mvp_asset_representation, mark
  applied only after confirmed upload success

Override priority: user edit > master metadata > forced defaults >
hardcoded today+365 validity — so the team's per-asset validity
period (e.g. 1 month) now flows through end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 12:06:06 +02:00
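A rough illustration of the override step described in 9e92db185a: the sketch below skips empty editor values and normalises ISO validity dates to MM/DD/YYYY. The field names, the `DATE_FIELDS` set, and the `apply_overrides` helper are hypothetical; the real `OVERRIDE_FIELD_MAP` and `_apply_override_fields()` are not shown in this log and may differ.

```python
from datetime import date

# Hypothetical editor fields that carry ISO dates (assumption, not from the repo).
DATE_FIELDS = {"validity_start", "validity_end"}


def iso_to_dam_date(value: str) -> str:
    """Convert 'YYYY-MM-DD' to the 'MM/DD/YYYY' form the commit says DAM expects."""
    return date.fromisoformat(value).strftime("%m/%d/%Y")


def apply_overrides(inherited: dict, overrides: dict) -> dict:
    """Return a copy of `inherited` with every non-empty override applied on top."""
    result = dict(inherited)
    for field, value in overrides.items():
        if value is None or str(value).strip() == "":
            continue  # empty editor value: keep the inherited value
        result[field] = iso_to_dam_date(value) if field in DATE_FIELDS else value
    return result


merged = apply_overrides(
    {"validity_end": "05/19/2027", "agency": "X"},
    {"validity_end": "2026-06-19", "agency": ""},
)
```

Here the one-month validity edit replaces the inherited date, while the blank agency edit leaves the inherited agency untouched.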
nickviljoen
4e9fb6d18f Feature: Add check_campaign_status.py read-only status lookup
Wraps find_campaign_by_identifier() from update_campaign_status.py so
operators can query a campaign's current DAM status by number or partial
name without performing any updates.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 20:27:37 +02:00
nickviljoen
db35697091 Feature: Add Spotify (SPT) to social media codes
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 21:14:29 +02:00
nickviljoen
c12aef0eb1 Fix: Populate MAIN_LANGUAGES in folder-only mode (-N) uploads
Folder-only mode deep-copies the asset template with MAIN_LANGUAGES.values=[]
and never repopulated it from language_code, so the DAM rejected -N uploads
(SND/voiceover) with "Cannot set null value for a required field: MAIN_LANGUAGES".
Now mirrors the full-inheritance path's tabular values structure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 21:19:09 +02:00
nickviljoen
6d6213024a Fix: Merge A+B live campaigns into single CSV for OMG
OMG's Box automation treats each new live_campaigns_*.csv as a full-list
replacement, so the per-series global CSV introduced 2026-04-30 stomped
the local list whenever a B1→B2 ran. Collapse to one combined CSV
(A-series + B-series) emitted by every handler.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 17:36:43 +02:00
nickviljoen
28586308d7 Docs: Refresh A1 empty-folder doc and LTD asset type notes
A1_RETRY_LOGIC.md updated to reflect the 2026-04-28 rework: empty
folders are now treated as expected workflow (silent skip + one-time
warning at poll 20, no auto permanent-fail), while the original
3-strikes-then-permanently-fail behavior is preserved for genuine
folder errors via the mark_failed_at_max flag.

README.md adds LTD (Licensing Translation Document) to the asset type
override section alongside EOL, and notes that empty overrides remove
fields while non-empty overrides on non-MVP fields are appended.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 18:19:06 +02:00
nickviljoen
ba4f1a9bf7 Feature: Global live campaigns CSV + B4 closure flow
Wires B-series (global) campaigns into OMG using the same Box
automation as A-series. Mirrors the A1/A4 lifecycle for B1/B4.

- b1_to_b2_download: after B2 status update, mark live=YES status=B2
  and upload live_campaigns_global_<ts>.csv to the existing Box folder
  (BOX_LIVE_CAMPAIGNS_FOLDER_ID, 352181382858 in PROD). Filename keeps
  the live_campaigns_ prefix so the existing OMG automation rule picks
  it up.
- b4_box_uploader (new): polls DAM for status B4, marks live=NO, regens
  the global CSV. Mirrors a4_box_uploader.
- a4_box_uploader: reads prior status before overwriting; if it was
  B-series, regenerate the global CSV instead. b4_box_uploader does the
  symmetric A-series fallback. Defensive in case DAM doesn't enforce
  type-specific status transitions.
- database: add get_all_live_global_campaigns() (status LIKE 'B%').
  Tighten get_all_live_campaigns() to status LIKE 'A%' so any cross-type
  rows can't leak into the wrong CSV.
- orchestrator + orchestrator-prod: register B4 Box Uploader at 10min.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 18:12:49 +02:00
nickviljoen
b74c9c68aa Fix: EOL/LTD asset type overrides — IP Rights, CreativeX, descriptions
- LTD DAM code confirmed by client: licensingtranslationdocument (was placeholder)
- EOL + LTD: IP Rights forced to "No" (was "Yes")
- EOL + LTD: Remove CreativeX URL and score (not applicable to legal asset types)
- EOL: Description forced to "Legal Studio Name"
- Reorder _apply_asset_type_overrides() to run after _update_creativex_fields()
  so overrides have true final precedence (Box CreativeX was clobbering removals)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 13:24:19 +02:00
nickviljoen
5909e017a4 Reporting: Format CreativeX score as '100 (DV360)' in B1→B2 emails
DAM stores the CreativeX tabular cell as '<platform>^<score>', e.g.
'DV360^100'. Add format_cx_score_for_display() and apply at the point
where the email asset dict is built — both new-download and skipped
paths. Raw value stays in creativex_scores.quality_score so all platform
info is preserved for queries; only the email display is reshaped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 12:04:27 +02:00
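The display reshaping in 5909e017a4 could look roughly like the following. Only the function name and the `'<platform>^<score>'` cell format come from the commit message; the body, and the choice to pass through values without a `^`, are assumptions.

```python
def format_cx_score_for_display(raw):
    """Turn 'DV360^100' into '100 (DV360)'; return anything else unchanged."""
    if not raw or "^" not in raw:
        return raw  # no platform prefix (or no score at all): nothing to reshape
    platform, score = raw.split("^", 1)
    return f"{score.strip()} ({platform.strip()})"
```

Because only the display string is reshaped, the raw `'DV360^100'` value can still be stored unchanged for later queries.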
nickviljoen
8bf8dc1325 Fix: Recursively walk metadata_element_list when extracting CreativeX
Diagnostic confirmed FERRERO.TAB.FIELD.CREATIVEX (score) lives at depth 2
in B1 master metadata — nested under FERRERO.TABULAR.FIELD.CREATIVEX
inside a category — and FERRERO.FIELD.CREATIVEX LINK lives at depth 1.
The flat top-level walk used previously never reached them, so live B1
runs and the backfill both reported zero CX scores. Updated extractor
in b1_to_b2_download.py and the inline copy in
backfill_b1_creativex_scores.py to descend recursively.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:53:15 +02:00
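A minimal sketch of the recursive descent described in 8bf8dc1325, assuming each element is a dict that carries its identifier under an `id` key and its children under `metadata_element_list`. The `id` key and both helper names are hypothetical; only the `metadata_element_list` nesting and the field IDs are taken from the commit messages.

```python
def iter_elements(node, depth=0):
    """Yield (depth, element) for every element nested under metadata_element_list."""
    for element in node.get("metadata_element_list", []):
        yield depth, element
        # Categories and tabular fields hold their children in the same key.
        yield from iter_elements(element, depth + 1)


def find_element(metadata, element_id):
    """Return (depth, element) for the first match at any depth, else None."""
    for depth, element in iter_elements(metadata):
        if element.get("id") == element_id:
            return depth, element
    return None


sample = {
    "metadata_element_list": [
        {
            "id": "CATEGORY",
            "metadata_element_list": [
                {"id": "FERRERO.FIELD.CREATIVEX LINK"},
                {
                    "id": "FERRERO.TABULAR.FIELD.CREATIVEX",
                    "metadata_element_list": [
                        {"id": "FERRERO.TAB.FIELD.CREATIVEX", "value": "DV360^100"}
                    ],
                },
            ],
        }
    ]
}
```

A flat loop over `sample["metadata_element_list"]` would only ever see the category, which matches the zero-score symptom the commit describes.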
nickviljoen
a463eb42f8 Diagnostic: Recursively walk nested metadata_element_list for CX search
Previous version only looked at top-level metadata_element_list, which
contains categories — actual fields nest under each category. Now
recursively descends through all nested metadata_element_list arrays
and counts every element_id at any depth, then searches the full set
for CX/score/quality hints. Reports max nesting depth and the depth at
which each CX-flavored ID was found.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:49:54 +02:00
nickviljoen
3c69e7545a Fix: Escape literal % in LIKE pattern in B1 metadata diagnostic
psycopg2 performs %-substitution when params are passed to execute(),
so 'M%' in the LIKE clause was being interpreted as a positional
placeholder, raising IndexError when there's only one real %s (LIMIT).
Escape as 'M%%' so it's preserved as a literal percent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:47:21 +02:00
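The escaping fix in 3c69e7545a can be sketched as follows. The table and column names come from the backfill commit; the `sample_masters` helper is hypothetical, and the final line uses plain Python `%` formatting purely as a stand-in to show what `%%` collapses to, not as a reproduction of what the driver does internally.

```python
QUERY = (
    "SELECT tracking_id FROM master_assets "
    "WHERE tracking_id LIKE 'M%%' "  # %% survives parameter substitution as a literal %
    "LIMIT %s"
)


def sample_masters(cursor, limit):
    """Run the query on a DB-API cursor that uses %s placeholders."""
    cursor.execute(QUERY, (limit,))
    return cursor.fetchall()


# Stand-in only: shows the text that results once %% and %s are resolved.
rendered = QUERY % (5,)
```

An alternative that avoids escaping altogether is to pass the pattern itself as a parameter, for example `LIKE %s` with `("M%", limit)`.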
nickviljoen
23bcc057c5 Diagnostic: Inspect B1 master metadata structure for CX fields
Read-only script that samples B1 global masters from master_assets and
reports: top-level keys in full_metadata, presence of
metadata.metadata_element_list, and any element_ids matching
creativex/cx/score/quality (case-insensitive). Helps diagnose why the CX
backfill found 0 matches — distinguishes "client masters have no CX
score yet" from "CX field uses a different element_id than A1".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:45:35 +02:00
nickviljoen
b9d5ac9feb Backfill: One-shot script to populate CX scores for existing B1 masters
Walks master_assets for B1 global masters (tracking_id LIKE 'M%' AND
local_campaign_id IS NULL), extracts CreativeX score from full_metadata
JSONB, and inserts into creativex_scores with status='b1-master-cx-score'.
Idempotent — relies on the existing tracking_id dedup in
db.store_creativex_score, so re-runs are safe. Supports --dry-run for
preview before applying.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:40:48 +02:00
nickviljoen
f28b5221f7 Enhancement: Capture CreativeX score on B1→B2 global masters
Extracts CreativeX score and URL from DAM master metadata during the
B1→B2 download, persists to creativex_scores with new status
'b1-master-cx-score' (dedup by tracking_id), and surfaces the score in
the b1_to_b2_complete and b1_to_b2_partial emails — falling back to
"No CreativeX Score" when the master has no score yet. Skipped
already-downloaded assets backfill from full_metadata JSONB on next pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:31:07 +02:00
nickviljoen
74977f2366 Rename: SDA asset type → LTD (Licensing Translation Document)
Renames the asset type code introduced in 0f49cc6 from SDA (Supporting
Documents for Approval) to LTD (Licensing Translation Document). All
field overrides and the fixed Description value are unchanged.

DAM-side asset type code remains externallegalopinion as a placeholder
pending client confirmation; will update in a follow-up commit if the
DAM code differs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 21:05:44 +02:00
nickviljoen
0f49cc6cbc Enhancement: SDA (Supporting Documents for Approval) asset type
Adds SDA as a new asset type for License claim translations supporting
the EOL (External Legal Opinion) workflow.

- SDA maps to externallegalopinion in DAM (same as EOL).
- Field overrides match EOL (Agency = "-", Prod Company = "-",
  Languages = Global, IP Right = Yes, Licensing = No, validity dates
  removed) plus a fixed Description: "Translation of License claim -
  For approval purposes only".
- Added asset_type_overrides section to field_mappings_ppr.yaml; it
  was missing, so EOL overrides weren't actually applying on PPR.
  Both EOL and SDA blocks are now defined for both PPR and PROD.
- _apply_asset_type_overrides now appends a simple string field when
  the override targets a field not yet in mvp_fields, so the SDA
  description is set even if the filename has no subject_title.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 16:08:03 +02:00
nickviljoen
90f326aecb Enhancement: Treat empty A1 folders as expected workflow
Campaign managers often create the campaign in DAM before assets are
uploaded, so an empty Master Assets folder is the normal pre-asset state
rather than a failure. Stop marking these as permanently failed and stop
emailing on every poll.

- increment_a1_retry() gains mark_failed_at_max param; empty-folder path
  passes False so the campaign keeps polling indefinitely until assets
  appear (or the DAM status changes).
- Empty-folder branch now skips silently on every poll and sends a single
  warning email at poll 20 (~1 hour at the 3-min cadence) so genuinely
  stuck campaigns still surface.
- New a1_to_a2_no_assets_warning email template — one-time soft warning,
  no permanent-failure language.
- Existing reset_a1_retry() on successful A1→A2 still clears the counter
  when assets eventually appear.
- Other folder-error paths (folder not found, etc.) keep the original
  3-retry-then-fail behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 15:20:41 +02:00
nickviljoen
ab557b78de Fix: Skip permanently-failed campaigns before A1 per-run cap
The A1→A2 uploader processes up to 2 campaigns per run. Permanently-failed
campaigns were skipped only inside the loop, so they still consumed slots
and could starve the rest of the queue indefinitely. Filter them out
before the slice so eligible campaigns get processed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 14:54:36 +02:00
nickviljoen
2c06f3936f Reporting: Split new vs previously-downloaded assets in A1→A2 / B1→B2 emails
When a campaign is re-opened (status reset to A1/B1 after new files are
added), the tool correctly skips already-downloaded assets but the email
report and CSV previously listed the whole folder as "processed", which
was misleading. Reports now show "Total: 14 (12 previously downloaded,
2 new this run)" with new assets in full detail and previously-downloaded
assets in a compact list. B1→B2 CSV gains a Status column matching A1→A2.
2026-04-23 14:11:00 +02:00
nickviljoen
d83e41707c Docs: Update README with asset type mapping changes and current date
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 17:44:55 +02:00
nickviljoen
455cc1bf2a Update asset type mappings per Scaling Agencies Metadata List
Remove 9 deprecated types (CID, ECB, EBS, EOP, EUG, EWB, FPO, PKI, PRI),
add 9 new types (EAN, ESI, NTB, PIR, PKC, PKT, SCP, SNC, UPI), and update
DAT DAM code from digitalassettoolkit to digitalasset. Display names updated
to match current client naming conventions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 17:40:21 +02:00
nickviljoen
695eefadf3 Fix: Recurse into subfolders with numeric extensions (e.g. "2.0")
DAM subfolder "WND_PCS 2026 2.0" was being treated as a file because
".0" was not in the known extensions list and defaulted to is_folder=False.
This caused an HTTP 404 on download since it's a folder, not a file.

Added numeric-only extension check (.0, .1, etc.) to the folder detection
logic so the script correctly recurses into versioned subfolders and
downloads the assets inside them.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 09:46:32 +02:00
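The numeric-extension check from 695eefadf3 might be sketched like this. The function name and the rule that a name without any extension counts as a folder are assumptions; the commit only states that numeric-only extensions such as `.0` are now treated as folders and that other unrecognised extensions default to "file".

```python
import os


def looks_like_folder(name: str) -> bool:
    """Heuristic folder detection based on the item name alone."""
    _, ext = os.path.splitext(name)
    if not ext:
        return True  # assumption: no extension at all means a folder
    if ext[1:].isdigit():
        return True  # ".0", ".1", ...: a version number in a folder name
    return False  # anything else is treated as a file
```

With this rule, `"WND_PCS 2026 2.0"` is recursed into rather than downloaded as a file.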
nickviljoen
0408d282a5 Revert "Fix: Skip subfolders with numeric extensions in B1→B2 downloads"
This reverts commit 4dff200e10.
2026-04-10 09:44:41 +02:00
nickviljoen
4dff200e10 Fix: Skip subfolders with numeric extensions in B1→B2 downloads
DAM subfolder "WND_PCS 2026 2.0" was being treated as a downloadable
asset because ".0" passed the existing extension check. Added safeguard
to skip items with numeric-only extensions (e.g. .0, .1) which are
version numbers in folder names, not real files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 09:42:29 +02:00
nickviljoen
39a495e4cc Fix: Skip already-processed assets on B1→B2 retry runs
Previously the script re-downloaded and re-uploaded all assets on every
retry, even those already successfully stored in DB and Box. For large
campaigns (1300+ assets) this caused unnecessary load and duplicate uploads.

Now checks DB via find_global_master_by_opentext_id() before downloading.
Assets already in DB with a valid Box URL are skipped and counted toward
the processed total, so only genuinely failed assets are retried.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 09:07:07 +02:00
nickviljoen
03c5ab65a8 Docs: Update README and CLAUDE.md with folder-only template and EOL workflow
Added documentation for template-based folder-only mode (-N flag),
asset type overrides (EOL), environment-specific field mappings,
and updated config file references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 21:33:35 +02:00
nickviljoen
95edece5f3 Enhancement: EOL (External Legal Opinion) workflow
Adds EOL as a new asset type with field overrides for both PPR and PROD:
- Asset type maps to 'externallegalopinion' in DAM
- Agency Name = "-", Production House = "-"
- Main Languages = "Global"
- IP Rights = "Yes", Licensing = "No"
- Validity dates removed
Also adds VOD platform code and removes OLV asset type.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 15:53:37 +02:00
nickviljoen
33e71be453 Fix: Template-based folder-only mode for -N flag uploads
Folder-only mode (-N suffix files) was sending minimal metadata that DAM
rejected with "unmarshalling parameter" error. Now uses a reference
asset_representation_template.json as the base for all metadata fields,
ensuring the full field structure (column_name, data_type, domain_id, etc.)
the DAM API requires. Also fixes default/forced value handling to use
DomainValue format for domained fields from the template.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 15:53:10 +02:00
nickviljoen
5905f3262a Fix: Folder-only mode metadata format for PROD DAM compatibility
Folder-only mode (-N suffix files) was sending simplified metadata that
PROD DAM rejected with "unmarshalling parameter" error. Updated to use
DomainValue format for domained fields, correct asset type field ID
(FERRERO.FIELD.MKTG.ASSET TYPE), asset type code mapping (e.g. SND→sound),
validity dates, and forced values from config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 12:31:02 +02:00
nickviljoen
51e915e67c Add global_master_tracking_id to link A1→A2 local assets to B1→B2 global masters
A1→A2 now looks up the opentext_id in master_assets for an M-prefixed record
from B1→B2 and stores it as global_master_tracking_id on the local asset record.
This provides traceability from local campaign assets back to their global master
without changing any existing workflow logic or DAM metadata.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 13:12:55 +02:00
nickviljoen
78a4ca0976 Fix: CreativeX score supersede now matches base filename ignoring timestamp suffix
Previously, re-scored assets with a DAM timestamp suffix (e.g. _2026-03-13-05-53-36)
were treated as new files, leaving multiple 'active' records. Now strips the timestamp
and uses LIKE matching so all variants of the same base asset are properly superseded.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 21:12:50 +02:00
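One way to strip the DAM timestamp suffix mentioned in 78a4ca0976 is a regular expression anchored to the end of the name. This is a sketch under the assumption that the suffix sits directly before the file extension (or at the very end); the helper name and sample filenames are hypothetical.

```python
import re

# Matches "_YYYY-MM-DD-HH-MM-SS" only when followed by the extension or end of string.
TIMESTAMP_SUFFIX = re.compile(r"_\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}(?=\.[^.]+$|$)")


def base_filename(filename: str) -> str:
    """Remove a trailing DAM timestamp suffix, keeping the extension."""
    return TIMESTAMP_SUFFIX.sub("", filename)
```

All re-scored variants then reduce to the same base name, which can be used to supersede older records for that asset.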
nickviljoen
4dded5de14 Fix: Send Mailgun API emails one recipient at a time
Mailgun silently drops emails with multiple recipients in the to field.
Send individual API calls per recipient and split comma-separated addresses.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 13:39:55 +02:00
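The per-recipient sending described in 4dded5de14 can be sketched without any Mailgun specifics. Both helper names are hypothetical, and `send_one` stands in for whatever function performs the actual API call.

```python
def split_recipients(recipients):
    """Flatten a string or list of possibly comma-separated addresses."""
    if isinstance(recipients, str):
        recipients = [recipients]
    addresses = []
    for entry in recipients:
        for address in entry.split(","):
            address = address.strip()
            if address:
                addresses.append(address)
    return addresses


def send_individually(recipients, send_one):
    """Call send_one(address) once per recipient; return the addresses attempted."""
    addresses = split_recipients(recipients)
    for address in addresses:
        send_one(address)
    return addresses
```

Passing `["a@example.com, b@example.com", "c@example.com"]` results in three separate calls, one per address.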
nickviljoen
e6a6357403 Update Mailgun test: try US/EU endpoints, handle non-JSON errors
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 13:29:27 +02:00
nickviljoen
467a735e94 Add Mailgun recipient format test script
Diagnose daily report email delivery issue - tests single recipient,
comma-separated string in list, and properly split list formats.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 12:54:44 +02:00
nickviljoen
dc779724fc Add Mailgun API support for PROD email notifications
Mailgun API is used when MAILGUN_API_KEY and MAILGUN_DOMAIN are set,
with SMTP as fallback for PPR. Also fixes A2→A3 batch subject line
that was rendering Jinja2 syntax literally instead of substituting values.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 14:39:16 +02:00
nickviljoen
96b33fa084 Fix: Correct MARKETING_TAG parent_table_id in folder-only mode
Was generating FERRERO.TABULAR.FIELD.MARKETING_TAG (underscore) but DAM
expects FERRERO.TABULAR.FIELD.MARKETING.TAG (dot). Added explicit mapping
for tabular field parent table IDs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 16:13:11 +02:00
nickviljoen
6bc1b397d0 Fix: Use simple value structure for non-domain default fields in folder-only mode
VIDEO_POST_PROD_COMPANY and AUDIO_POST_PROD_COMPANY are not domain fields
but were being wrapped with DomainValue, causing unmarshalling errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 16:07:21 +02:00
nickviljoen
6e0bb08a5f Fix: Add type field to folder-only mode (-N) metadata values for DAM API
The _build_fields_from_filename method was using {"value": "..."} without
the required {"type": "string", "value": "..."} structure, causing
unmarshalling errors on the DAM API for -N suffix uploads.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 16:03:02 +02:00
nickviljoen
faa33cf44f Fix: Use DomainValue wrapper for non-tabular default fields in folder-only mode (-N)
Fixes unmarshalling error on DAM upload when using -N suffix files. The API
requires the DomainValue structure when domain_value is true.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 15:30:46 +02:00
nickviljoen
8299a87180 Fix: Update MAIN_LANGUAGES values array for tabular fields in DAM upload
The filename_updates logic was only updating field['value'] (singular) but for
tabular fields like MAIN_LANGUAGES, the DAM reads from field['values'] (plural
array). This caused the master's original language (e.g. "Global") to persist
instead of the correct language from the filename (e.g. "PL").

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 17:26:31 +02:00
nickviljoen
63e42d1196 Fix: Don't send generic CreativeX URL when no score exists
When no CreativeX score is found for a file, the system was sending a
generic placeholder URL (app.creativex.com/preflight/pretests) to the DAM.
Now sends no URL at all, so only files with actual CreativeX scores get a URL.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 17:42:57 +02:00
nickviljoen
74141689e6 Enable FERRERO.MASTERASSETIDS and multi-master support for PROD
Remove PPR-only gates so PROD supports the same MASTERASSETIDS tabular
field and multi-master ID parsing as PPR. DAM deployment scheduled for
Feb 18 — do not push until then.

Changes:
- filename_parser: Remove is_ppr check, allow multi-master ID parsing in PROD
- a2_to_a3: Populate master_opentext_ids for single-master PROD case
- dam_client: Remove PPR-only skip on domain registration
- metadata_extractor_mvp: Update docstrings only

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 18:12:30 +02:00
nickviljoen
f6c84762ae Fix: Map CreativeX API channel/publisher to DAM platform names for PROD
The new CreativeX API format stores channel/publisher at the top level
of full_extraction_data instead of inside a data.ferrero_mapped_platforms
wrapper. Add fallback mapping so platforms are correctly populated for
DAM uploads.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 17:43:58 +02:00
nickviljoen
052558961a Revert "Fix: Add YouTube platform mapping and social media code fallback for CreativeX"
This reverts commit 799b6d50e8.
2026-02-13 17:17:06 +02:00
nickviljoen
799b6d50e8 Fix: Add YouTube platform mapping and social media code fallback for CreativeX
YouTube Ads was missing from the DAM-CX mappings CSV, causing empty
Platform > Rating fields for YouTube assets. Also adds a fallback that
derives the CreativeX platform from the filename social media code (e.g.
YTA -> YouTube) when the database has no mapped platforms.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 17:00:47 +02:00
nickviljoen
9dbb7ce8d9 Revert "Fix: Re-enable FERRERO.MASTERASSETIDS field for PROD single-master uploads"
This reverts commit ea85749e0a.
2026-02-13 16:41:57 +02:00
nickviljoen
ea85749e0a Fix: Re-enable FERRERO.MASTERASSETIDS field for PROD single-master uploads
Populates master_opentext_ids for single-master case so uploads use the
tabular FERRERO.MASTERASSETIDS field instead of the ARTESIA.FIELD.ASSET_ID
fallback. Reverts the workaround from 6517a4f now that the field is being
configured in PROD DAM.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 14:20:01 +02:00
nickviljoen
98826d51c4 Fix: CreativeX tracking ID fallback, filename stripping, and social media codes
CreativeX lookup now falls back to tracking ID search when filename match fails
(handles mismatched naming from CreativeX PDFs). strip_upload_components now
only removes job number and tracking ID, keeping social media codes (YTA, DV3,
etc.) in the clean filename. Updated SOCIAL_MEDIA_CODES from 4 to 39 codes
sourced from the Ferrero naming tool.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 13:24:23 +02:00
nickviljoen
6517a4f83f Fix: Skip FERRERO.MASTERASSETIDS field on PROD - field not yet configured
PROD DAM rejects FERRERO.MASTERASSETIDS as it only exists in PPR. Remove the
single-master-to-list conversion so PROD uses the existing single-ID field
(master_opentext_id) instead. Will be re-added when client configures the
tabular field in PROD.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 12:33:43 +02:00
nickviljoen
27916062ff Fix: Pass notifier to process_box_file and use case-sensitive Master ID check
The notifier variable was referenced inside process_box_file but never passed
as a parameter, causing NameError for any file hitting the Master Tracking ID
check. Also changed the check from case-insensitive (.upper().startswith('M'))
to case-sensitive (.startswith('M')) to avoid false positives on random tracking
IDs like mviSv5.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 11:49:21 +02:00
nickviljoen
636b555d9d Fix: Define master_opentext_ids variable in A2→A3 and add multi-master support
The PROD a2_to_a3 script referenced master_opentext_ids without defining it,
causing NameError for all file uploads. Brings in multi-master tracking ID
support from PPR: filename parser handles multiple IDs (PPR) or single ID
(PROD), metadata extractor supports MASTERASSETIDS tabular field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 11:37:18 +02:00
nickviljoen
d72d37a83d Enhancement: Campaign re-opening support and PPR master asset ID registration
A1→A2 now handles re-processing when campaign is reset to A1 after adding new
master assets. Existing assets reuse tracking IDs and skip Box upload, new assets
are processed normally. Also includes PPR domain registration for multiple master
asset IDs in a2_to_a3 and dam_client.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 21:07:13 +02:00
nickviljoen
57b4df2799 Security: Remove database password from permanently failed email template
Replace exposed database credentials and SQL commands in A1 permanently failed notification email with support contact information (optical@oliver.agency).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-02 07:24:49 +02:00
nick.viljoen
fc9539d4b5 Security: Add .env files to .gitignore
.env files contain sensitive credentials and should never be committed to git.
Removed .env-prod from tracking while preserving local file.
2026-01-31 18:07:44 +00:00
nickviljoen
c90032b1d9 Fix: A1 retry logic now catches folder not found errors
Problem:
- Retry logic only triggered for empty folders (total_assets == 0)
- When "Master Assets" folder doesn't exist, error thrown BEFORE retry check
- Exception caught by outer try/except, sent old upload_failed template
- No database tracking, emails sent every 3 minutes indefinitely

Solution:
- Added retry logic to outer exception handler
- Detects folder/assets errors and applies same 3-attempt tracking
- Now handles both: (1) folder doesn't exist, (2) folder is empty
- Database tracking works for both scenarios

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-31 19:34:29 +02:00
nickviljoen
e1f15ea632 Add A1 retry logic and orchestrator off-hours cadence
Feature 1: A1→A2 Empty Folder Retry Logic
- Track retry attempts (max 3) for campaigns with no master assets
- Mark campaigns as permanently failed after 3 attempts
- Stop processing and sending emails for permanently failed campaigns
- Two new email templates: retry notification and permanent failure
- Database migration adds 4 new columns to campaign_status table
- Comprehensive documentation in A1_RETRY_LOGIC.md

Feature 2: Orchestrator Off-Hours Cadence
- Add 30 minutes to all task intervals during off-hours
- Off-hours: 10 PM - 5 AM weekdays + all day Saturday/Sunday
- Tasks only run at minutes 0 and 30 during off-hours
- Configurable and easy to enable/disable
- Daily Report (7 PM) remains unchanged

Files changed:
- NEW: database/migrations/003_add_a1_retry_tracking.sql
- NEW: MARKDOWN_DOCS/A1_RETRY_LOGIC.md
- MODIFIED: scripts/shared/database.py (added 3 methods)
- MODIFIED: scripts/a1_to_a2_box_uploader.py (added retry logic)
- MODIFIED: scripts/shared/notifier.py (added 2 templates)
- MODIFIED: scripts/orchestrator-prod.py (added off-hours config)
- MODIFIED: RUN_ORCHESTRATOR.md (added off-hours docs)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-31 17:38:57 +02:00
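The off-hours window from e1f15ea632 could be expressed as below. This is a sketch only: the function names are hypothetical, the boundary handling (22:00 inclusive, 05:00 exclusive) is an assumption, and the orchestrator's actual configuration is not shown in this log.

```python
from datetime import datetime


def is_off_hours(now: datetime) -> bool:
    """True from 10 PM to 5 AM on weekdays and all day Saturday/Sunday."""
    if now.weekday() >= 5:  # Monday is 0, so Saturday is 5 and Sunday is 6
        return True
    return now.hour >= 22 or now.hour < 5


def may_run(now: datetime) -> bool:
    """During off-hours, allow tasks only at minute 0 or 30."""
    if not is_off_hours(now):
        return True
    return now.minute in (0, 30)
```

The Daily Report is described as unchanged, so a real scheduler would presumably exempt it from `may_run`.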
nickviljoen
b7e0430636 Fix: Prevent DAM folder creation attempts causing timeouts
Remove folder creation logic in get_or_create_subfolder_path() since DAM does not allow folder creation via API. When a subfolder doesn't exist, upload to the parent folder instead of attempting to create it (which was causing 120 second timeouts).

This resolves upload failures in PROD environment during A2→A3 workflow.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-30 10:34:53 +02:00
32 changed files with 4742 additions and 444 deletions

@@ -1,66 +0,0 @@
# Ferrero Automation Environment Variables
# Environment (staging or production)
ENV=prod
# DAM Credentials - OAuth2 (default authentication)
DAM_BASE_URL=https://dam.ferrero.com/otmmapi
DAM_AUTH_URL=https://dam.ferrero.com/otdsws/oauth2/token
DAM_CLIENT_ID=otds-OLV
DAM_CLIENT_SECRET=hs28LZ9ZzQ5I9rlW3P7Wwyw85oOatlC1
# DAM mTLS Certificate (optional - only used with --auth-pfx flag)
DAM_MTLS_BASE_URL=https://prod-auth.app-api.ferrero.com/00003/mm/token
DAM_MTLS_CERT_PATH=config/certificates/SAP-XX-Orange-Logic-to-APP-APIM-prod.pfx
DAM_MTLS_CERT_PASSWORD=(aP5IzJdg1d)e)V39Sq5k]13LwO[49D43#iR{}ks
# Box Credentials
BOX_CLIENT_ID=l2atwxxq4xna7phcjr2uifm4mbah69qp
BOX_CLIENT_SECRET=6XcuCQ6akpk9daE0UHaGSv3mSxWaER4l
BOX_JWT_KEY_ID=n1izyn3l
BOX_PASSPHRASE=971585f5fd6171428c14a7c8899af5ab
BOX_ENTERPRISE_ID=43984435
# Box Folder Configuration
BOX_ROOT_FOLDER_A1_A2=348304357505
BOX_ROOT_FOLDER_A2_A3=348526703108
BOX_ROOT_FOLDER_B1_B2=349261192115
BOX_ROOT_FOLDER_CREATIVEX=350605024645
# Database
DB_HOST=localhost
DB_PORT=5437
DB_USER=ferrero_user
DB_PASSWORD=ferrero_pass_2025
# Mailgun / SMTP (for email notifications)
SMTP_SERVER=smtp.mailgun.org
SMTP_PORT=587
SMTP_USER=twist@mail.dev.oliver.solutions
SMTP_PASSWORD=102115e9f3b9d7332d0cd1d4329bc0d4-77751bfc-ca066b71
SENDER_EMAIL=TWIST-UK-SERVER@oliver.agency
ERROR_EMAIL=daveporter@oliver.agency
REPORT_EMAILS=daveporter@oliver.agency
# Mailgun API (alternative to SMTP)
MAILGUN_API_KEY=your_mailgun_api_key_here
MAILGUN_DOMAIN=mail.dev.oliver.solutions
# Webhook Configuration
CAMPAIGN_STATUS_WEBHOOK_URL=https://hook.us1.make.celonis.com/3f9ztwl8qnljufo0l65utfv5wvvnt9m5
WEBHOOK_AUTH_TOKEN=
WEBHOOK_RECEIVER_PORT=5555
BOX_WEBHOOK_PRIMARY_KEY=your_box_webhook_primary_key
BOX_WEBHOOK_SECONDARY_KEY=your_box_webhook_secondary_key
# CreativeX Configuration
LLAMA_CLOUD_API_KEY=llx-EDmfh0ZReUbXUbaa5i5275TAP2LznNDqc3skJRL3HY4RUDcf
CREATIVEX_AGENT_NAME=Creativex-Extract
BOX_LIVE_CAMPAIGNS_FOLDER_ID=352181382858
# DAM mTLS V2 (Hybrid)
DAM_MTLS_OAUTH_URL=https://prod-auth.app-api.ferrero.com/00003/mm/token
# Master Asset ID Field Configuration
MASTER_ASSET_ID_FIELD=ARTESIA.FIELD.ASSET_ID

@@ -5,3 +5,5 @@ temp/
logs/
.DS_Store
.env
.env-prod
.env

@@ -0,0 +1,324 @@
# A1→A2 Empty Folder Handling
**Purpose:** Avoid spam emails and false-positive permanent failures for the common workflow where campaign managers create an A1 campaign before uploading the master assets.
**Initial implementation:** January 31, 2026
**Reworked:** April 28, 2026 — empty folders are now treated as expected client workflow rather than failures.
**Related files:**
- `scripts/a1_to_a2_box_uploader.py` (main script)
- `scripts/shared/database.py` (retry tracking methods)
- `database/migrations/003_add_a1_retry_tracking.sql` (schema)
---
## How It Works (current behavior)
### The empty-folder case (most common)
When a campaign is at A1 in DAM but the Master Assets folder is empty, the script treats this as a normal pre-asset state, not a failure.
**Flow:**
1. Every poll: `a1_retry_count` is incremented for visibility, the script logs `No master assets yet (poll N) - skipping until assets appear`, and exits silently.
2. At poll 20 (~1 hour at the 3-minute orchestrator cadence) the script sends a single `a1_to_a2_no_assets_warning` email so genuinely-stuck campaigns still surface.
3. After poll 20, the script keeps skipping silently. **`a1_permanently_failed` is never auto-set for empty folders.**
4. When assets eventually appear and A1→A2 succeeds, `db.reset_a1_retry()` clears the counter automatically.
The threshold lives in `scripts/a1_to_a2_box_uploader.py` as `EMPTY_FOLDER_WARNING_THRESHOLD = 20`.
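The flow above can be sketched as a small in-memory model. This is a minimal illustration, not the script's actual code: `CampaignRetryState`, `handle_empty_folder` and `handle_success` are hypothetical names, and the real implementation persists the counter in `campaign_status` rather than in a dataclass.

```python
from dataclasses import dataclass

EMPTY_FOLDER_WARNING_THRESHOLD = 20  # same constant name as in the script


@dataclass
class CampaignRetryState:
    """Hypothetical stand-in for the retry columns on campaign_status."""
    a1_retry_count: int = 0
    a1_permanently_failed: bool = False


def handle_empty_folder(state: CampaignRetryState) -> bool:
    """Record one empty-folder poll; return True when the one-time warning is due."""
    state.a1_retry_count += 1
    # a1_permanently_failed is deliberately left untouched on this path.
    return state.a1_retry_count == EMPTY_FOLDER_WARNING_THRESHOLD


def handle_success(state: CampaignRetryState) -> None:
    """Clear the counter once assets appear and A1→A2 succeeds."""
    state.a1_retry_count = 0


state = CampaignRetryState()
warning_polls = [poll for poll in range(1, 41) if handle_empty_folder(state)]
print(warning_polls)  # [20]
```

Comparing with `==` rather than `>=` is what keeps the warning to a single email: every poll after the 20th skips silently until a success resets the counter.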
### The genuine-error case
The 3-retries-then-permanently-fail behavior **still exists** for actual folder-level errors (e.g. `Assets folder not found (tried Master Assets)`), which are caught by the script's exception handler. These DO mark `a1_permanently_failed=TRUE` after 3 failures and DO send the retry / permanently-failed emails.
`db.increment_a1_retry()` accepts `mark_failed_at_max=True|False` to switch between the two behaviors. The empty-folder branch passes `False`; the exception handler passes `True` (default).
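As a rough sketch of how the `mark_failed_at_max` switch separates the two paths, the function below works on a plain dict instead of the database. The dict-based signature and the `reason` argument are assumptions made for illustration; only the function name, the flag, and `MAX_RETRIES = 3` come from this document.

```python
MAX_RETRIES = 3  # value documented for database.py


def increment_a1_retry(row: dict, reason: str, mark_failed_at_max: bool = True) -> dict:
    """Hypothetical in-memory version; the real method updates campaign_status."""
    row["a1_retry_count"] = row.get("a1_retry_count", 0) + 1
    row["a1_failure_reason"] = reason
    # Only the genuine-error path is allowed to flip the permanent-failure flag.
    if mark_failed_at_max and row["a1_retry_count"] >= MAX_RETRIES:
        row["a1_permanently_failed"] = True
    return row


# Genuine error: three failures mark the campaign permanently failed.
errored: dict = {}
for _ in range(3):
    increment_a1_retry(errored, "Assets folder not found (tried Master Assets)")

# Empty folder: the counter grows, the flag is never set.
empty: dict = {}
for _ in range(25):
    increment_a1_retry(empty, "No master assets yet", mark_failed_at_max=False)
```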
### Queue-slot filter
The A1→A2 script processes up to 2 campaigns per run (`campaigns[:2]`). Permanently-failed campaigns are filtered out **before** the slot cap so they no longer block the queue (`scripts/a1_to_a2_box_uploader.py:652`).
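The ordering matters: filtering after the slice could leave both slots occupied by campaigns that will never be processed. A minimal sketch, assuming campaigns are dicts carrying an `a1_permanently_failed` key (the function name and data shape are hypothetical):

```python
def select_campaigns(campaigns: list[dict], slots: int = 2) -> list[dict]:
    """Drop permanently-failed campaigns first, then apply the slot cap."""
    eligible = [c for c in campaigns if not c.get("a1_permanently_failed")]
    return eligible[:slots]


queue = [
    {"campaign_number": "C1", "a1_permanently_failed": True},
    {"campaign_number": "C2", "a1_permanently_failed": True},
    {"campaign_number": "C3", "a1_permanently_failed": False},
    {"campaign_number": "C4", "a1_permanently_failed": False},
]

selected = [c["campaign_number"] for c in select_campaigns(queue)]
print(selected)  # ['C3', 'C4']
```

Slicing first (`queue[:2]`) and filtering afterwards would return an empty list for this queue.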
### Database tracking
Four fields on the `campaign_status` table:
- `a1_retry_count` (INTEGER): Number of polls where the folder was empty / errored. For empty-folder cases this can grow unbounded; reset on success.
- `a1_last_retry_at` (TIMESTAMP): When the last attempt occurred
- `a1_permanently_failed` (BOOLEAN): TRUE only via the genuine-error path (after 3 failures), never via the empty-folder path
- `a1_failure_reason` (TEXT): Why it failed (e.g., "Assets folder not found (tried Master Assets)")
---
## Configuration
### Empty-folder warning threshold
`scripts/a1_to_a2_box_uploader.py`:
```python
EMPTY_FOLDER_WARNING_THRESHOLD = 20 # ~1 hour at 3-min poll cadence
```
Send the one-time warning sooner/later by adjusting this constant.
### Genuine-error retry attempts
`scripts/shared/database.py`, in `increment_a1_retry()`:
```python
MAX_RETRIES = 3
```
Applies only when the caller passes `mark_failed_at_max=True` (default), i.e. the exception handler in `process_campaign()`. The empty-folder branch passes `False` and is unaffected.
---
## Email Notifications
### Empty-folder warning (one-time, at poll 20)
**Template:** `a1_to_a2_no_assets_warning`
**Subject:** ⚠️ Campaign in A1 with no assets yet - {campaign_name}
**Recipients:** Error notification list
**Sent:** exactly once per stuck campaign, when `a1_retry_count == 20`. Counter resets on success, so a future re-stuck event would warn again.
### Genuine-error retry email (attempts 1-2)
**Template:** `a1_to_a2_no_assets_retry`
**Subject:** ⚠️ No Assets Found (Attempt X/3) - Campaign {name}
**Recipients:** Error notification list
**Trigger:** non-empty-folder errors caught by `process_campaign()`'s exception handler.
### Genuine-error final failure (attempt 3)
**Template:** `a1_to_a2_permanently_failed`
**Subject:** ❌ PERMANENTLY FAILED - Campaign {name} (No Assets After 3 Attempts)
**Recipients:** Error notification list
**Content:**
- Campaign marked as permanently failed (campaign filtered from future queue runs)
- Required actions to fix
- SQL command to manually reset
---
## Manual Operations
### Check Campaign Retry Status
```sql
SELECT campaign_number, campaign_name, status,
a1_retry_count, a1_last_retry_at,
a1_permanently_failed, a1_failure_reason
FROM campaign_status
WHERE campaign_id = 'YOUR_CAMPAIGN_ID';
```
### Reset Single Campaign
```sql
UPDATE campaign_status
SET a1_retry_count = 0,
a1_last_retry_at = NULL,
a1_permanently_failed = FALSE,
a1_failure_reason = NULL
WHERE campaign_id = 'YOUR_CAMPAIGN_ID';
```
**Or using psql command:**
```bash
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking <<EOF
UPDATE campaign_status
SET a1_retry_count = 0,
a1_last_retry_at = NULL,
a1_permanently_failed = FALSE,
a1_failure_reason = NULL
WHERE campaign_id = 'YOUR_CAMPAIGN_ID';
EOF
```
### Reset All Failed Campaigns
```sql
UPDATE campaign_status
SET a1_retry_count = 0,
a1_last_retry_at = NULL,
a1_permanently_failed = FALSE,
a1_failure_reason = NULL
WHERE a1_permanently_failed = TRUE;
```
### View All Failed Campaigns
```sql
SELECT campaign_number, campaign_name,
a1_retry_count, a1_last_retry_at, a1_failure_reason
FROM campaign_status
WHERE a1_permanently_failed = TRUE
ORDER BY a1_last_retry_at DESC;
```
---
## Failure Scenarios
### Scenario 1: Temporary Empty Folder
**What Happens:**
- Early polls: retry counter increments, no email sent (fewer than 20 polls)
- Assets added to folder
- Next run finds assets, processes successfully
- Retry counter automatically reset to 0
**Result:** Problem self-resolves with no notifications
### Scenario 2: Persistent Empty Folder
**What Happens:**
- Polls 1-19: skipped silently, retry counter increments on each poll
- Poll 20 (~1 hour): single `a1_to_a2_no_assets_warning` email sent
- Poll 21 onward: skipped silently, counter keeps incrementing
- Campaign is never marked permanently failed
- No further emails until the counter is reset by a successful run
**Result:** Support team alerted once, infinite emails prevented
### Scenario 3: Wrong Status Assignment
**What Happens:**
- Campaign set to A1 by mistake (no assets intended)
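- Folder stays empty, so polls are skipped silently (one warning email if poll 20 is reached); the campaign is not marked permanently failed
- Admin realizes the mistake and changes the status to a different value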
- Campaign no longer appears in A1 search results
**Result:** No reset needed, campaign excluded from processing
---
## Testing
### Test Retry Logic
1. Create test campaign in DAM with A1 status
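2. Trigger a genuine folder-level error (e.g. `Assets folder not found (tried Master Assets)`); an empty Master Assets folder is only skipped and never reaches permanent failure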
3. Run A1→A2 script manually 3 times
4. Verify emails received and database state
```bash
# Run 1
python scripts/a1_to_a2_box_uploader.py --auth-pfx-v2
# Check database
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT campaign_number, a1_retry_count, a1_permanently_failed FROM campaign_status WHERE status = 'A1';"
# Run 2 (wait 3 minutes or run immediately for testing)
python scripts/a1_to_a2_box_uploader.py --auth-pfx-v2
# Check again
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT campaign_number, a1_retry_count, a1_permanently_failed FROM campaign_status WHERE status = 'A1';"
# Run 3
python scripts/a1_to_a2_box_uploader.py --auth-pfx-v2
# Verify permanently failed
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT campaign_number, a1_retry_count, a1_permanently_failed, a1_failure_reason FROM campaign_status WHERE a1_permanently_failed = TRUE;"
```
### Test Reset Logic
```bash
# Reset the test campaign
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "UPDATE campaign_status SET a1_retry_count = 0, a1_permanently_failed = FALSE WHERE campaign_number = 'TEST_CAMPAIGN';"
# Run again
python scripts/a1_to_a2_box_uploader.py --auth-pfx-v2
# Verify it retries
```
---
## Monitoring
### Dashboard Query: Current Retry Status
```sql
SELECT
COUNT(*) FILTER (WHERE a1_retry_count = 0) as "No Issues",
COUNT(*) FILTER (WHERE a1_retry_count = 1) as "Attempt 1",
COUNT(*) FILTER (WHERE a1_retry_count = 2) as "Attempt 2",
COUNT(*) FILTER (WHERE a1_permanently_failed = TRUE) as "Permanently Failed"
FROM campaign_status
WHERE status = 'A1';
```
### Alert Query: Campaigns Near Failure
```sql
SELECT campaign_number, campaign_name, a1_retry_count, a1_last_retry_at
FROM campaign_status
WHERE status = 'A1'
AND a1_retry_count >= 2
AND a1_permanently_failed = FALSE
ORDER BY a1_retry_count DESC, a1_last_retry_at DESC;
```
---
## Troubleshooting
### Q: Campaign keeps failing even after adding assets
**A:** Check if campaign was marked permanently failed. Reset using SQL command above.
### Q: Want to change from 3 to 5 retry attempts
**A:** Edit `MAX_RETRIES = 3` in `database.py` line ~567. Also update email templates to reflect new maximum.
### Q: How to disable retry logic completely?
**A:** Not recommended, but you can:
1. Set `MAX_RETRIES = 999` (effectively infinite)
2. Or revert to old `a1_to_a2_no_assets` template without retry tracking
### Q: Can I set different retry counts for different campaigns?
**A:** No, it's a global setting. All campaigns use the same `MAX_RETRIES` value.
### Q: What if I want to delete permanently failed campaigns from database?
**A:** Don't delete. Instead, change their status to something other than A1. They'll be excluded from processing automatically.
---
## Future Enhancements
Potential improvements for future versions:
1. **Configurable retry timing:**
- Instead of relying on cron frequency (3 min)
- Check `a1_last_retry_at` and skip if too recent
- Allow exponential backoff (3 min, 10 min, 30 min)
2. **Campaign-specific retry limits:**
- Add optional `a1_max_retries` column
- Allow different campaigns to have different thresholds
- Default to global MAX_RETRIES if not set
3. **Automatic cleanup:**
- After 30 days, auto-reset permanently failed campaigns
- Or send weekly digest of stuck campaigns
4. **Webhook notifications:**
- Send to external system when campaign permanently fails
- Integrate with ticketing system
5. **Admin UI:**
- Web interface to view/reset retry status
- Bulk reset operations
---
## Code Locations
**Quick reference for developers:**
| Component | File | Line Range |
|-----------|------|------------|
| Retry check logic | `a1_to_a2_box_uploader.py` | ~176-186 |
| Empty folder detection | `a1_to_a2_box_uploader.py` | ~193-231 |
| Success reset | `a1_to_a2_box_uploader.py` | ~354-356 |
| `get_a1_retry_status()` | `database.py` | ~522-558 |
| `increment_a1_retry()` | `database.py` | ~560-620 |
| `reset_a1_retry()` | `database.py` | ~622-655 |
| Email templates | `notifier.py` | ~593-687 |
| Database migration | `migrations/003_add_a1_retry_tracking.sql` | All |
---
## Change Log
**January 31, 2026:**
- Initial implementation
- 3-attempt retry mechanism
- Permanent failure tracking
- Two new email templates
- This documentation created
**April 28, 2026:**
- Reworked empty-folder handling: empty folders are now treated as expected client workflow rather than failures
- One-time `a1_to_a2_no_assets_warning` email at poll 20; `a1_permanently_failed` is no longer auto-set for empty folders
**Future updates will be logged here.**


@ -126,16 +126,22 @@ tail -f logs/cron_a1_a2.log
- Rejection comment extraction (A5→A6 specific)
- **metadata_extractor_mvp.py**: Field mapping and metadata transformation
- Loads 27 MVP fields from `config/field_mappings.yaml`
- Handles filename-based updates
- Loads MVP fields from environment-specific config (`field_mappings_ppr.yaml` or `field_mappings_prod.yaml`)
- **Two tracking modes:** Full inheritance (from master metadata) and folder-only (`-N` suffix, uses `config/asset_representation_template.json` as base)
- Handles filename-based updates, forced values, defaults, and asset type overrides
- Asset type overrides (e.g., EOL) can set/remove fields with final precedence
- Force-sets required values (e.g., STATE = "Local")
- Uses DomainValue format for domained fields when setting values on template fields
### Configuration Architecture
**Hierarchical config system:**
- `.env`: Environment variables (credentials, never committed)
- `config/config.yaml`: Main configuration (references .env vars)
- `config/field_mappings.yaml`: Editable field definitions (add/remove fields without code changes)
- `config/field_mappings_ppr.yaml`: PPR field definitions (auto-loaded when DAM URL contains 'ppr')
- `config/field_mappings_prod.yaml`: PROD field definitions (auto-loaded otherwise)
- `config/asset_type_mappings.yaml`: 3-letter code to DAM code mappings (e.g., EOL -> externallegalopinion)
- `config/asset_representation_template.json`: Reference template for folder-only mode (-N flag), contains full field metadata structure
- `../Box-config.json`: Box JWT credentials (one directory up from Python-Version)
**Important**: Box-config.json MUST be located at `../Box-config.json` (one folder up). This is hardcoded in config.yaml as `rsa_private_key_path: ../Box-config.json`.
@ -181,8 +187,11 @@ tail -f logs/cron_a1_a2.log
**A2→A3 (Upload from Box):**
- Polls `BOX_ROOT_FOLDER_A2_A3` (348526703108) for new files
- Parses tracking ID from filename (V2 format)
- Loads master metadata from database
- Two tracking modes:
- **Full inheritance**: Loads master metadata from database, inherits all fields
- **Folder-only** (`-N` suffix): Uses `config/asset_representation_template.json` as base, populates from filename
- Updates Description, Language, State fields from filename
- Applies asset type overrides (e.g., EOL sets Agency="-", Languages="Global", IPRights="Yes", removes validity dates)
- Deletes file from Box after successful upload
- Updates campaign status A2→A3 when ALL assets uploaded
@ -219,7 +228,10 @@ New workflow scripts should follow this pattern:
### Modifying Field Mappings
**To add/remove fields, edit `config/field_mappings.yaml`:**
**To add/remove fields, edit the environment-specific file:**
- PPR: `config/field_mappings_ppr.yaml`
- PROD: `config/field_mappings_prod.yaml`
```yaml
mvp_fields:
- FERRERO.FIELD.NEW_FIELD_NAME # Add here
@ -228,6 +240,19 @@ mvp_fields:
**No code changes required** - the system dynamically loads fields at runtime.
### Asset Type Overrides
To add field overrides for a specific asset type, add an `asset_type_overrides` section to the field mappings file:
```yaml
asset_type_overrides:
EOL: # Keyed by 3-letter asset type code
FERRERO.MARKETING.FIELD.AGENCY NAME: "-"
MAIN_LANGUAGES: "Global"
FERRERO.FIELD.ASSET VALIDITY START PERIOD: "" # Empty string removes the field
```
Overrides run after all other field processing (forced values, defaults) and take final precedence. An empty string value removes the field entirely from the payload.
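The precedence rule can be illustrated with a short sketch. It assumes the earlier processing steps can be modelled as plain dicts applied in order; `resolve_fields` and the sample values are hypothetical and not taken from `metadata_extractor_mvp.py`.

```python
def resolve_fields(layers: list[dict[str, str]], overrides: dict[str, str]) -> dict[str, str]:
    """Apply earlier layers in order, then asset type overrides last."""
    resolved: dict[str, str] = {}
    for layer in layers:  # later layers win over earlier ones
        resolved.update(layer)
    for field_id, value in overrides.items():
        if value == "":
            resolved.pop(field_id, None)  # empty string removes the field
        else:
            resolved[field_id] = value  # override takes final precedence
    return resolved


master = {
    "MAIN_LANGUAGES": "Italian",
    "FERRERO.FIELD.ASSET VALIDITY START PERIOD": "01/01/2026",
}
forced = {"FERRERO.FIELD.STATE": "Local"}
eol_overrides = {
    "FERRERO.MARKETING.FIELD.AGENCY NAME": "-",
    "MAIN_LANGUAGES": "Global",
    "FERRERO.FIELD.ASSET VALIDITY START PERIOD": "",
}

payload = resolve_fields([master, forced], eol_overrides)
```

Here `payload` keeps the forced `STATE` value, takes `MAIN_LANGUAGES` and the agency name from the EOL overrides, and no longer contains the validity start field.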
### Database Queries
Common patterns used in the codebase:
@ -301,7 +326,10 @@ except Exception as e:
├── .env # Environment variables
├── config/
│ ├── config.yaml
│ ├── field_mappings.yaml
│ ├── field_mappings_ppr.yaml
│ ├── field_mappings_prod.yaml
│ ├── asset_type_mappings.yaml
│ ├── asset_representation_template.json
│ └── certificates/
│ └── dam-mtls-dev.pfx
├── database/


@ -2,8 +2,8 @@
**Complete automated workflow for Ferrero DAM Content Scaling**
**Version:** 2.0
**Last Updated:** November 5, 2025
**Version:** 2.1
**Last Updated:** April 16, 2026
**Status:** ✅ Production Ready & Fully Tested
---
@ -334,6 +334,12 @@ crontab -e
7. Delete file from Box
8. **Update A2→A3 when ALL campaign assets uploaded**
**Two tracking modes:**
- **Full inheritance** (standard): Inherits all metadata from the master asset
- **Folder-only** (`-N` suffix): Builds metadata from a reference template (`config/asset_representation_template.json`) and populates values from the filename. Used when the derivative only needs the upload folder from the master.
**Asset type overrides:** Certain asset types (e.g., EOL) trigger field overrides configured in the environment's field mappings file (e.g., Agency Name, Languages, IP Rights, validity dates).
**Box Folder:** 348526703108 (Agency Uploads)
**Email:** a2_to_a3_file_uploaded, a2_to_a3_complete
@ -946,6 +952,44 @@ scp Box-config.json user@server:/opt/ferrero-automation/
scp -r Python-Version/ user@server:/opt/ferrero-automation/
```
### Field Mappings (Environment-Specific)
The system auto-detects the environment from the DAM URL and loads the appropriate config:
- **PPR:** `config/field_mappings_ppr.yaml` (pre-production, `ppr.dam.ferrero.com`)
- **PROD:** `config/field_mappings_prod.yaml` (production, `dam.ferrero.com`)
Each file defines: MVP fields, filename update rules, forced values, defaults, and asset type overrides.
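A minimal sketch of the auto-detection, assuming the check is a substring test for `ppr` on the DAM URL. The function name and the case-insensitive comparison are assumptions; the file paths are the ones listed above.

```python
def field_mappings_path(dam_url: str) -> str:
    """Pick the field mappings file from the DAM URL (hypothetical helper)."""
    if "ppr" in dam_url.lower():
        return "config/field_mappings_ppr.yaml"
    return "config/field_mappings_prod.yaml"


ppr_path = field_mappings_path("https://ppr.dam.ferrero.com")
prod_path = field_mappings_path("https://dam.ferrero.com")
```

With a substring test, any non-PPR URL falls through to the PROD file, so a mistyped PPR hostname would load the production mappings.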
### Asset Type Mappings
`config/asset_type_mappings.yaml` maps 3-letter codes from the naming tool to DAM domain values (e.g., `EHI` -> `heroimage`, `EOL` -> `externallegalopinion`).
**Last updated:** April 16, 2026 per Scaling Agencies Metadata List. 38 asset types mapped (was 39). Changes:
- **Removed:** CID, ECB, EBS, EOP, EUG, EWB, FPO, PKI, PRI
- **Added:** EAN, ESI, NTB, PIR, PKC, PKT, SCP, SNC, UPI
- **Changed:** DAT DAM code updated from `digitalassettoolkit` to `digitalasset`
### Asset Representation Template
`config/asset_representation_template.json` is the reference template for folder-only mode (`-N` flag uploads). It contains the full field metadata structure that the DAM API requires for asset creation. This template was provided by the client and should be updated if the DAM metadata model changes.
### Asset Type Overrides (EOL / LTD)
Certain asset types trigger field overrides configured in the field mappings file. Currently configured for both PPR and PROD:
**EOL (External Legal Opinion)**
- Agency Name = "-"
- Production House = "-"
- Main Languages = "Global"
- IP Rights = "Yes"
- Licensing = "No"
- Validity dates removed
**LTD (Licensing Translation Document)** — supports the EOL workflow with translated license claims. Same overrides as EOL, plus a fixed Description: `"Translation of License claim - For approval purposes only"`. Currently mapped to the same DAM-side code (`externallegalopinion`) as a placeholder pending client confirmation.
These overrides are applied after all other field processing and take final precedence. An empty-string override removes the field; a non-empty override targeting a field that isn't in `mvp_fields` will be appended as a simple string field.
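The removal and append behaviour can be sketched against a list of metadata elements. This is an illustration only: the function names are hypothetical, every value is written in the simple string shape seen in the template's non-domained fields, and the separate format used for domained fields is not reproduced here.

```python
def simple_string_value(text: str) -> dict:
    """Value shape copied from the template's non-domained fields."""
    return {"value": {"type": "string", "value": text}}


def apply_overrides_to_elements(elements: list[dict], overrides: dict[str, str]) -> list[dict]:
    """Hypothetical helper: remove, replace, or append fields by id."""
    result = []
    matched = set()
    for element in elements:
        field_id = element["id"]
        if field_id not in overrides:
            result.append(element)
            continue
        matched.add(field_id)
        if overrides[field_id] == "":
            continue  # empty-string override removes the field
        result.append({**element, "value": simple_string_value(overrides[field_id])})
    # Non-empty overrides for fields that were not present are appended.
    for field_id, value in overrides.items():
        if field_id not in matched and value != "":
            result.append({"id": field_id, "value": simple_string_value(value)})
    return result


elements = [
    {"id": "FERRERO.FIELD.ASSET VALIDITY START PERIOD", "value": simple_string_value("01/01/2026")},
    {"id": "ARTESIA.FIELD.ASSET DESCRIPTION", "value": simple_string_value("Hero image")},
]
overrides = {"FERRERO.FIELD.ASSET VALIDITY START PERIOD": "", "MAIN_LANGUAGES": "Global"}

updated = apply_overrides_to_elements(elements, overrides)
print([e["id"] for e in updated])  # ['ARTESIA.FIELD.ASSET DESCRIPTION', 'MAIN_LANGUAGES']
```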
---
## 🗄️ Database
@ -1397,8 +1441,8 @@ PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrer
---
**Version:** 2.0 - Production Ready
**Last Updated:** November 5, 2025
**Version:** 2.1 - Production Ready
**Last Updated:** April 16, 2026
**Repository:** bitbucket.org:zlalani/ferrero-opentext.git
🚀 **Ready to deploy!**


@ -45,6 +45,119 @@ Checks once, runs any due tasks, then exits. This is what cron would call.
---
## Off-Hours Configuration
### Overview
The orchestrator automatically reduces task frequency during off-hours to minimize system load during low-activity periods.
**What changes during off-hours:**
- All tasks run less frequently (only at 0 and 30 minute marks)
- Example: A 3-minute task normally runs at minutes 0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, etc.
- During off-hours: Runs only at minutes 0 and 30 (every 30 minutes)
- Daily Report (7 PM) remains unchanged
**Off-hours definition:**
- Late night: 10 PM (22:00) to 5 AM (05:00) every day
- All day Saturday (00:00-23:59)
- All day Sunday (00:00-23:59)
### Configuration
**Location:** `scripts/orchestrator-prod.py` lines ~88-107
```python
OFF_HOURS_CONFIG = {
'enabled': True, # Set to False to disable
'extra_minutes': 30, # Minutes to add during off-hours
'late_night_start': 22, # Start hour (22 = 10 PM)
'late_night_end': 5, # End hour (5 = 5 AM)
'weekend_days': [5, 6], # Saturday=5, Sunday=6
'exempt_tasks': [
'Daily Report' # Tasks that ignore off-hours
]
}
```
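One way the off-hours window could be evaluated from this config is sketched below. The function is hypothetical and assumes the end hour is exclusive (05:00 itself counts as normal hours); exempt tasks are not handled here.

```python
from datetime import datetime

# Subset of the config above, repeated so the sketch is self-contained.
OFF_HOURS_CONFIG = {
    "enabled": True,
    "late_night_start": 22,
    "late_night_end": 5,
    "weekend_days": [5, 6],  # datetime.weekday(): Monday=0 ... Sunday=6
}


def is_off_hours(now: datetime, config: dict = OFF_HOURS_CONFIG) -> bool:
    """Return True when `now` falls on a weekend day or in the late-night window."""
    if not config["enabled"]:
        return False
    if now.weekday() in config["weekend_days"]:
        return True
    start, end = config["late_night_start"], config["late_night_end"]
    if start > end:  # window wraps past midnight, e.g. 22 to 5
        return now.hour >= start or now.hour < end
    return start <= now.hour < end


print(is_off_hours(datetime(2026, 2, 2, 14, 0)))   # Monday 2 PM -> False
print(is_off_hours(datetime(2026, 2, 2, 23, 0)))   # Monday 11 PM -> True
print(is_off_hours(datetime(2026, 1, 31, 14, 0)))  # Saturday 2 PM -> True
```

The `start > end` branch covers the default 22-to-5 window; a non-wrapping window such as 0 to 6 takes the second branch.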
### Examples
**Business Hours (Monday 2 PM):**
```
A1→A2: Runs every 3 minutes (0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, ...)
A4 Box: Runs every 10 minutes (0, 10, 20, 30, 40, 50)
```
**Off-Hours (Monday 11 PM or Saturday):**
```
A1→A2: Runs every 30 minutes (0, 30)
A4 Box: Runs every 30 minutes (0, 30)
All tasks: Only run at minutes 0 and 30
```
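The schedules in these examples can be reproduced with a simple minute-mark check. This sketch models only the documented outcome (normal cadence versus the 0 and 30 minute marks); it is not the orchestrator's actual scheduling code, and `is_task_due` and its parameters are hypothetical.

```python
def is_task_due(minute: int, interval_minutes: int, off_hours: bool, exempt: bool = False) -> bool:
    """Return True if a task with the given interval should run at this minute."""
    if off_hours and not exempt:
        return minute % 30 == 0  # only the 0 and 30 minute marks
    return minute % interval_minutes == 0


normal_a4 = [m for m in range(60) if is_task_due(m, 10, off_hours=False)]
off_hours_a1 = [m for m in range(60) if is_task_due(m, 3, off_hours=True)]
print(normal_a4)     # [0, 10, 20, 30, 40, 50]
print(off_hours_a1)  # [0, 30]
```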
### Customization
#### Change off-hours timing
Edit `orchestrator-prod.py`:
```python
# Late night only from midnight to 6 AM
'late_night_start': 0,
'late_night_end': 6,
# Include only Sunday as weekend
'weekend_days': [6], # 6 = Sunday
```
#### Disable off-hours completely
```python
OFF_HOURS_CONFIG = {
'enabled': False, # Turns off all off-hours logic
# ... rest unchanged
}
```
#### Exempt specific tasks
```python
'exempt_tasks': [
'Daily Report',
'A4 Webhook Monitor' # This task will run at normal cadence even in off-hours
]
```
### Monitoring
Check orchestrator logs to see current mode:
```bash
# Watch for mode changes
tail -f logs/orchestrator.log | grep "MODE"
# Output examples:
# Orchestrator tick: 2026-01-30 14:00:00 [NORMAL MODE]
# Orchestrator tick: 2026-01-30 22:00:00 [OFF-HOURS MODE]
# Adding 30 minutes to all task intervals
```
### Testing
```bash
# Test without affecting production
python scripts/orchestrator-prod.py --force
# Look for these log messages:
# [OFF-HOURS MODE] or [NORMAL MODE]
# "Adding 30 minutes to all task intervals"
# "Task 'A1->A2' due (off-hours: 3min + 30min cadence)"
```
---
## Logs
- **Orchestrator logs**: `logs/orchestrator.log`


@ -0,0 +1,830 @@
{
"asset_resource": {
"asset": {
"metadata": {
"metadata_element_list": [
{
"column_name": "ASSET_TYPE",
"data_length": 30,
"data_type": "CHAR",
"displayable": true,
"domain_id": "FERRERO.DOMAIN.MARKETING.ASSETTYPE",
"domained": true,
"edit_type": "COMBO",
"editable": true,
"enabled": true,
"facetable": true,
"id": "FERRERO.FIELD.MKTG.ASSET TYPE",
"multilingual": false,
"name": "Asset Type",
"prompt": "Asset Type",
"required": false,
"restriction_id": 7242,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "MARKETING_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "FISCAL__YEAR",
"data_length": 100,
"data_type": "CHAR",
"displayable": true,
"domain_id": "FERRERO_DOMAIN_FISCAL_YEAR",
"domained": true,
"edit_type": "COMBO",
"editable": true,
"enabled": true,
"facetable": true,
"id": "FERRERO.FIELD.FISCAL YEAR",
"multilingual": false,
"name": "Release Fiscal Year",
"prompt": "Release Fiscal Year",
"required": true,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "ECOMMERCE",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "DESCR",
"data_length": 249,
"data_type": "CHAR",
"description": "Descriptive information",
"displayable": true,
"domained": false,
"edit_type": "TEXTAREA",
"editable": true,
"enabled": true,
"facetable": false,
"id": "ARTESIA.FIELD.ASSET DESCRIPTION",
"multilingual": false,
"name": "Description",
"prompt": "Description",
"required": false,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "UOIS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"value": {
"type": "string",
"value": ""
}
}
},
{
"column_name": "FLAVOUR",
"data_length": 2000,
"data_type": "CHAR",
"displayable": true,
"domain_id": "FERRERO.DOMAIN.MARKETING.FLAVOUR",
"domained": true,
"edit_type": "COMBO",
"editable": true,
"enabled": true,
"facetable": true,
"id": "FERRERO.FIELD.MARKETING.FLAVOUR",
"multilingual": false,
"name": "Flavour",
"prompt": "Flavour",
"required": false,
"restriction_id": 5429,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "MARKETING_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "SIZE",
"data_length": 2000,
"data_type": "CHAR",
"displayable": true,
"domain_id": "FERRERO.DOMAIN.MARKETING.SIZE",
"domained": true,
"edit_type": "COMBO",
"editable": true,
"enabled": true,
"facetable": true,
"id": "FERRERO.FIELD.MARKETING.SIZE",
"multilingual": false,
"name": "Size",
"prompt": "Size",
"required": false,
"restriction_id": 5429,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "MARKETING_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "ASSET_STATE",
"data_length": 2000,
"data_type": "CHAR",
"displayable": true,
"domain_id": "FERRERO.DOMAIN.GLOBAL.LOCAL",
"domained": true,
"edit_type": "COMBO_NOTNULL",
"editable": true,
"enabled": true,
"facetable": true,
"id": "FERRERO.FIELD.STATE",
"multilingual": false,
"name": "Global/Local",
"prompt": "Global/Local",
"required": true,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "FERRERO_IC_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "NAME",
"data_length": 256,
"data_type": "CHAR",
"description": "Original name of master object",
"displayable": true,
"domained": false,
"edit_type": "SIMPLE",
"editable": false,
"enabled": true,
"facetable": false,
"id": "ARTESIA.FIELD.ASSET NAME",
"multilingual": false,
"name": "Asset Name",
"prompt": "Asset Name",
"required": false,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "UOIS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"value": {
"type": "string",
"value": ""
}
}
},
{
"column_name": "SUB_BRAND",
"data_length": 2000,
"data_type": "CHAR",
"displayable": true,
"domain_id": "FERRERO.DOMAIN.SUBBRAND",
"domained": true,
"edit_type": "COMBO",
"editable": true,
"enabled": true,
"facetable": true,
"id": "FERRERO.FIELD.SUB BRAND",
"multilingual": false,
"name": "Sub-Brands",
"prompt": "Sub-Brands",
"required": false,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "ECOMMERCE",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "VALIDATION_STARTING_DATE",
"data_length": 20,
"data_type": "DATE",
"displayable": true,
"domained": false,
"edit_type": "DATE",
"editable": true,
"enabled": true,
"facetable": true,
"id": "FERRERO.FIELD.ASSET VALIDITY START PERIOD",
"multilingual": false,
"name": "Asset validity start period",
"prompt": "Asset validity start period",
"required": false,
"restriction_id": 7242,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "FERRERO_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"value": {
"type": "string",
"value": ""
}
}
},
{
"column_name": "VALIDATION_ENDING_DATE",
"data_length": 20,
"data_type": "DATE",
"displayable": true,
"domained": false,
"edit_type": "DATE",
"editable": true,
"enabled": true,
"facetable": true,
"id": "FERRERO.FIELD.ASSET VALIDITY END PERIOD",
"multilingual": false,
"name": "Asset validity end period",
"prompt": "Asset validity end period",
"required": false,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "FERRERO_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"value": {
"type": "string",
"value": ""
}
}
},
{
"column_name": "AGENCY_NAME",
"data_length": 2000,
"data_type": "CHAR",
"displayable": true,
"domain_id": "FERRERO.MARKETING.AGENCY_NAME",
"domained": true,
"edit_type": "COMBO",
"editable": true,
"enabled": true,
"facetable": false,
"id": "FERRERO.MARKETING.FIELD.AGENCY NAME",
"multilingual": false,
"name": "Agency Name",
"prompt": "Agency Name",
"required": false,
"restriction_id": 7336,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "MARKETING_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "CREATIVE_LINK",
"data_length": 2000,
"data_type": "CHAR",
"displayable": true,
"domained": false,
"edit_type": "TEXTAREA",
"editable": true,
"enabled": true,
"facetable": false,
"id": "FERRERO.FIELD.CREATIVEX LINK",
"multilingual": false,
"name": "CreativeX Hyperlink",
"prompt": "CreativeX Hyperlink",
"required": false,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "FERRERO_ASSET_CREATIVEX",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"value": {
"type": "string",
"value": ""
}
}
},
{
"column_name": "IP_RIGHTS",
"data_length": 200,
"data_type": "CHAR",
"displayable": true,
"domain_id": "FERRERO.MARKETING.IPRIGHTS",
"domained": true,
"edit_type": "COMBO",
"editable": true,
"enabled": true,
"facetable": true,
"id": "FERRERO.MARKET.FIELD.IPRIGHT",
"multilingual": false,
"name": "IP Rights",
"prompt": "IP Rights",
"required": false,
"restriction_id": 5429,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "MARKETING_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "TOTAL_BUYOUT",
"data_length": 200,
"data_type": "CHAR",
"displayable": true,
"domain_id": "FERRERO.MARKETING.TOTAL_BUYOUT",
"domained": true,
"edit_type": "COMBO",
"editable": true,
"enabled": true,
"facetable": false,
"id": "FERRERO.MARKET.FIELD.BUYOUT",
"multilingual": false,
"name": "Touchpoint Scope",
"prompt": "Touchpoint Scope",
"required": false,
"restriction_id": 5429,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "MARKETING_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "FERRERO_PROPERTY",
"data_length": 2000,
"data_type": "CHAR",
"displayable": true,
"domain_id": "FERRERO.MARKETING.FERRERO_PROPERTY",
"domained": true,
"edit_type": "COMBO",
"editable": true,
"enabled": true,
"facetable": false,
"id": "FERRERO.MARKET.FIELD.FERRERO PROPERTY",
"multilingual": false,
"name": "Ferrero Property",
"prompt": "Ferrero Property",
"required": false,
"restriction_id": 5429,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "MARKETING_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "VID_AND_STAT_RIGHT",
"data_length": 100,
"data_type": "CHAR",
"displayable": true,
"domain_id": "FERRERO.MARKET.TECH_VALID",
"domained": true,
"edit_type": "COMBO",
"editable": true,
"enabled": true,
"facetable": false,
"id": "FERRERO.MARKET.VID_N_STAT",
"multilingual": false,
"name": "Video and Static Right",
"prompt": "Video and Static Right",
"required": false,
"restriction_id": 5429,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "MARKETING_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "PRODUCTION_COMPANY",
"data_length": 200,
"data_type": "CHAR",
"displayable": true,
"domain_id": "FERRERO.DOMAIN.MARKETING.PRODUCT",
"domained": true,
"edit_type": "COMBO",
"editable": true,
"enabled": true,
"facetable": false,
"id": "FERRERO.MARKET.PROD_COMPANY",
"multilingual": false,
"name": "Production House",
"prompt": "Production House",
"required": false,
"restriction_id": 5429,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "MARKETING_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "LICENSING",
"data_length": 200,
"data_type": "CHAR",
"displayable": true,
"domain_id": "FERRERO.MARKET.TECH_VALID",
"domained": true,
"edit_type": "COMBO",
"editable": true,
"enabled": true,
"facetable": false,
"id": "FERRERO.MARKET.FIELD.LICENSIN",
"multilingual": false,
"name": "Licensing",
"prompt": "Licensing",
"required": false,
"restriction_id": 5429,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "MARKETING_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"cascading_group_id": "FERRERO.MARKET.CG.LICENSE",
"column_name": "LICENSOR",
"data_length": 2000,
"data_type": "CHAR",
"displayable": true,
"domained": false,
"edit_type": "CASCADING",
"editable": true,
"enabled": true,
"facetable": false,
"id": "FERRERO.MARKET.FIELD.LICENSE",
"multilingual": false,
"name": "License",
"prompt": "License",
"required": false,
"restriction_id": 5429,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "MARKETING_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "SPOT_VERSION",
"data_length": 100,
"data_type": "CHAR",
"displayable": true,
"domain_id": "FERRERO.MARKETING.SPOT_VERSION",
"domained": true,
"edit_type": "COMBO",
"editable": true,
"enabled": true,
"facetable": false,
"id": "FERRERO.MARKETING.FIELD.SPOT_VERSION",
"multilingual": false,
"name": "Spot Version",
"prompt": "Spot Version",
"required": false,
"restriction_id": 5429,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "MARKETING_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "DIRECTOR_NAME",
"data_length": 100,
"data_type": "CHAR",
"displayable": true,
"domained": false,
"edit_type": "SIMPLE",
"editable": true,
"enabled": true,
"facetable": false,
"id": "FERRERO.MARKETING.FIELD.DIRECTOR_NAME",
"multilingual": false,
"name": "Director Name",
"prompt": "Director Name",
"required": false,
"restriction_id": 5429,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "MARKETING_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "VIDEO_POST_PRODUCTION_COMPANY",
"data_length": 200,
"data_type": "CHAR",
"displayable": true,
"domained": false,
"edit_type": "SIMPLE",
"editable": true,
"enabled": true,
"facetable": false,
"id": "FERRERO.MARKETING.FIELD.VIDEO_POST_PROD_COMPANY",
"multilingual": false,
"name": "Video Post-Production Company",
"prompt": "Video Post-Production Company",
"required": false,
"restriction_id": 5429,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "MARKETING_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "VIDEO_COMPANY_DETAILS",
"data_length": 2000,
"data_type": "CHAR",
"displayable": true,
"domained": false,
"edit_type": "TEXTAREA",
"editable": true,
"enabled": true,
"facetable": false,
"id": "FERRERO.MARKETING.FIELD.VID_POST_PROD_CONTACT",
"multilingual": false,
"name": "Video Post Production Company Contact Details",
"prompt": "Video Post Production Company Contact Details",
"required": false,
"restriction_id": 5429,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "MARKETING_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "AUDIO_POST_PRODUCTION_COMPANY",
"data_length": 200,
"data_type": "CHAR",
"displayable": true,
"domained": false,
"edit_type": "SIMPLE",
"editable": true,
"enabled": true,
"facetable": false,
"id": "FERRERO.MARKETING.FIELD.AUDIO_POST_PROD_COMPANY",
"multilingual": false,
"name": "Audio Post-Production Company",
"prompt": "Audio Post-Production Company",
"required": false,
"restriction_id": 5429,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "MARKETING_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"column_name": "AUDIO_COMPANY_DETAILS",
"data_length": 2000,
"data_type": "CHAR",
"displayable": true,
"domained": false,
"edit_type": "TEXTAREA",
"editable": true,
"enabled": true,
"facetable": false,
"id": "FERRERO.MARKETING.FIELD.AUDIO_POST_PROD_CONTACT",
"multilingual": false,
"name": "Audio Post Production Company Contact Details",
"prompt": "Audio Post Production Company Contact Details",
"required": false,
"restriction_id": 5429,
"scale": 0,
"searchable": true,
"searchable_scope_id": "1",
"searchable_scope_num_id": 1,
"sortable": true,
"system_field": false,
"table_name": "MARKETING_ASSET_DETAILS",
"trigger_field": false,
"type": "com.artesia.metadata.MetadataField",
"value": {
"cascading_domain_value": false,
"domain_value": false,
"is_locked": false
}
},
{
"id": "MAIN_LANGUAGES",
"parent_table_id": "FERRERO.TABULAR.FIELD.MAIN LANGUAGES",
"type": "com.artesia.metadata.MetadataTableField",
"values": []
},
{
"id": "FERRERO.FIELD.ASSETCOMPLIANCE",
"parent_table_id": "FERRERO.TABULAR.FIELD.ASSETCOMPLIANCE",
"type": "com.artesia.metadata.MetadataTableField",
"values": []
},
{
"id": "MARKETING_TAG",
"parent_table_id": "FERRERO.TABULAR.FIELD.MARKETING_TAG",
"type": "com.artesia.metadata.MetadataTableField",
"values": []
},
{
"id": "FERRERO.MARKET.FIELD.TYPE_VID",
"parent_table_id": "FERRERO.TABULAR.VID_STAT_TYPE",
"type": "com.artesia.metadata.MetadataTableField",
"values": []
}
]
},
"metadata_model_id": "ECOMMERCE",
"security_policy_list": [
{
"id": 1594
}
]
}
}
}


@ -2,18 +2,15 @@
# Frontend naming tool uses 3-letter codes (EHI, IMG, TVC, etc.)
# DAM uses descriptive lowercase codes (heroimage, keyvisual, tvc, etc.)
# This file maps between them
# Updated: 2026-04-16 per Scaling Agencies Metadata List
# E-Commerce Asset Types
ECA: aplus # E-COMM: A+
ECB: backpackshot # E-COMM: Back Packshot
EBS: beautyshot # E-COMM: Beauty shot
EBR: brandstore # E-COMM: Brand Store
EEM: emedia # E-COMM: E-Media
EHI: heroimage # E-COMM: Hero Image
EIL: ingredientslist # E-COMM: Ingredients List
EOP: outofpack # E-COMM: Out Of Pack
EUG: ugc # E-COMM: UGC
EWB: whybuy # E-COMM: Why Buy
ECA: aplus # A+ content (E-COMM)
EBR: brandstore # Brand Store (E-COMM)
EEM: emedia # E-Media (E-COMM)
EHI: heroimage # Hero Image (E-COMM)
EIL: ingredientslist # Ingredients List
ESI: secondaryimage # Secondary image (E-COMM)
# Standard Asset Types
3RT: coretoys # 3D Real Toys
@ -22,29 +19,36 @@ BBK: brandbook # Brand Book
BRC: brandcharacter # Brand Character
BSG: brandsignature # Brand Signature
CKV: campaignkeyvisual # Campaign Key Visual
CID: CreativeIdea # Creative Idea
DAT: digitalassettoolkit # Digital Assets/Toolkit
FLA: flyerartworks # Flyer Artworks
DAT: digitalasset # Digital Asset
EAN: eancodeclaim # EAN CODE - claim
FLA: flyerartworks # Trade Leaflet
FNT: font # Font
GDT: gadget # Gadget
GDT: gadget # Gadget / Prize
GRG: groupguidelines # Group Guidelines
IMG: keyvisual # Immagine Guida / Front of Pack Image (was FPO)
FPO: keyvisual # Front of Pack Image (alias for IMG)
IMG: keyvisual # Immagine Guida/Product and Key Ingredients
LGL: localguidelines # Local Guidelines
LOG: ferrerologo # Logo
MLF: marketingleaflet # Marketing Leaflet
OLV: onlinevideodigitalvideo # On Line Video
PAW: packartworks # Pack Artworks
PKI: packshot # Pack Images (was packshot)
MLF: marketingleaflet # Toys Marketing Leaflet
NTB: nutritionalclaim # Nutritional table
PAW: packartworks # Pack Artwork
PIR: prepinstructionclaim # Prep. Instruction and recipes
PKC: packcurendering # Pack CU Rendering
PKT: packturendering # Pack TU/SU Rendering
POS: posm # POS Material
PDM: productdemo # Product Demo
PRI: productimages # Product Images
QRC: qrcode # QR code
QRC: qrcode # QR Code
SCP: sizecomparisonclaim # Size comparison picture
SNC: certificationsustainabilityclaim # Certification/sustainability/nutritional claim
SND: sound # Sound
SIP: internalproperties # Styleguide Internal Properties
SGL: licenseshighlights # Styleguide Licenses
TVC: tvc # TVC
VIE: visualidentityelements # Visual Identity Elements
UPI: unwrappedproductimage # Unwrapped Product Images
VIE: visualidentityelements # Brand Visual Identity Elements
# External Legal Opinion
EOL: externallegalopinion # External Legal Opinion (triggers field overrides)
LTD: licensingtranslationdocument # Licensing Translation Document - License claim translations (triggers field overrides)
# Note: If a 3-letter code is not in this mapping, it will be passed through as-is
# and may fail DAM validation if the code doesn't exist in DAM's domain
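The pass-through behaviour described in the closing note can be sketched as a plain dictionary lookup. This is a minimal illustration only: `ASSET_TYPE_MAP` and `to_dam_asset_type` are hypothetical names, and the three entries are copied from the mapping above rather than from the pipeline's real loader.

```python
# Hypothetical excerpt of the mapping; the real file holds many more codes.
ASSET_TYPE_MAP = {
    "EHI": "heroimage",
    "IMG": "keyvisual",
    "TVC": "tvc",
}


def to_dam_asset_type(code):
    """Return the DAM code for a naming-tool code.

    Unmapped codes are returned unchanged, which mirrors the note above:
    they are passed through as-is and may later fail DAM validation.
    """
    return ASSET_TYPE_MAP.get(code, code)
```

A caller that wants to fail early, rather than at DAM validation time, could check `code in ASSET_TYPE_MAP` before uploading.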


@ -80,11 +80,15 @@ retry:
notifications:
enabled: true
smtp:
server: ${SMTP_SERVER}
port: ${SMTP_PORT}
user: ${SMTP_USER}
password: ${SMTP_PASSWORD}
sender_email: ${SENDER_EMAIL}
server: ${SMTP_SERVER:-}
port: ${SMTP_PORT:-587}
user: ${SMTP_USER:-}
password: ${SMTP_PASSWORD:-}
sender_email: ${SENDER_EMAIL:-}
mailgun:
api_key: ${MAILGUN_API_KEY:-}
domain: ${MAILGUN_DOMAIN:-}
sender_email: ${MAILGUN_SENDER_EMAIL:-}
recipients:
success:
- ${REPORT_EMAILS}
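The `${VAR:-default}` placeholders follow the shell convention: use the default when the variable is unset or empty. The config loader is not part of this diff, so the following is only a sketch of how such placeholders could be expanded. `expand_env` and its regular expression are assumptions, not the project's actual implementation.

```python
import os
import re

# Matches ${NAME} and ${NAME:-default}; the default may be empty.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand_env(text, env=None):
    """Expand ${VAR} and ${VAR:-default} placeholders in a config string."""
    env = os.environ if env is None else env

    def _sub(match):
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        if default is None:
            # Plain ${VAR}: keep the placeholder visible if the variable is unset.
            return match.group(0) if value is None else value
        # ${VAR:-default}: fall back when the variable is unset or empty.
        return value if value else default

    return _PLACEHOLDER.sub(_sub, text)
```

Under this reading, `port: ${SMTP_PORT:-587}` resolves to `587` when `SMTP_PORT` is missing, while `${SMTP_SERVER:-}` resolves to an empty string instead of leaving a literal placeholder in the loaded config.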


@ -76,3 +76,31 @@ defaults:
FERRERO.MARKETING.FIELD.VIDEO_POST_PROD_COMPANY: "Oliver Marketing Ltd"
FERRERO.MARKETING.FIELD.AUDIO_POST_PROD_COMPANY: "Oliver Marketing Ltd"
FERRERO.MARKET.PROD_COMPANY: "-" # Production House
# Asset type overrides (keyed by 3-letter asset type code)
# Applied AFTER normal field updates and forced values
# Overrides specific fields when a matching asset type is detected in the filename
asset_type_overrides:
EOL: # External Legal Opinion - selected as asset type in naming tool
FERRERO.MARKETING.FIELD.AGENCY NAME: "-"
FERRERO.MARKET.PROD_COMPANY: "-"
MAIN_LANGUAGES: "Global"
FERRERO.MARKET.FIELD.IPRIGHT: "No"
FERRERO.MARKET.FIELD.LICENSIN: "No"
FERRERO.FIELD.ASSET VALIDITY START PERIOD: "" # Remove validity dates for EOL
FERRERO.FIELD.ASSET VALIDITY END PERIOD: "" # Remove validity dates for EOL
FERRERO.FIELD.CREATIVEX LINK: "" # Remove CreativeX URL for EOL
FERRERO.TAB.FIELD.CREATIVEX: "" # Remove CreativeX score for EOL
ARTESIA.FIELD.ASSET DESCRIPTION: "Legal Studio Name"
LTD: # Licensing Translation Document - License claim translations supporting EOL
FERRERO.MARKETING.FIELD.AGENCY NAME: "-"
FERRERO.MARKET.PROD_COMPANY: "-"
MAIN_LANGUAGES: "Global"
FERRERO.MARKET.FIELD.IPRIGHT: "No"
FERRERO.MARKET.FIELD.LICENSIN: "No"
FERRERO.FIELD.ASSET VALIDITY START PERIOD: "" # Remove validity dates for LTD
FERRERO.FIELD.ASSET VALIDITY END PERIOD: "" # Remove validity dates for LTD
FERRERO.FIELD.CREATIVEX LINK: "" # Remove CreativeX URL for LTD
FERRERO.TAB.FIELD.CREATIVEX: "" # Remove CreativeX score for LTD
ARTESIA.FIELD.ASSET DESCRIPTION: "Translation of License claim - For approval purposes only"
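The comments above say the overrides are applied after normal field updates and forced values, so the override wins on any field it names. A minimal sketch of that layering follows. `apply_asset_type_overrides` is a hypothetical helper and the sample values are illustrative. The sketch simply stores the empty string for cleared fields; how the pipeline turns `""` into a removed DAM value is not shown in this diff.

```python
def apply_asset_type_overrides(fields, asset_type, overrides):
    """Return a copy of fields with the overrides for asset_type layered on top."""
    merged = dict(fields)
    merged.update(overrides.get(asset_type, {}))
    return merged


# Hypothetical excerpt of the asset_type_overrides block above.
OVERRIDES = {
    "EOL": {
        "FERRERO.MARKET.PROD_COMPANY": "-",
        "FERRERO.MARKET.FIELD.LICENSIN": "No",
        "FERRERO.FIELD.CREATIVEX LINK": "",
    },
}

# Illustrative field values as they might look after defaults and forced values.
fields = {
    "FERRERO.MARKET.FIELD.LICENSIN": "Yes",
    "FERRERO.FIELD.CREATIVEX LINK": "https://example.invalid/score",
    "FERRERO.MARKETING.FIELD.DIRECTOR_NAME": "unchanged",
}

eol_fields = apply_asset_type_overrides(fields, "EOL", OVERRIDES)
tvc_fields = apply_asset_type_overrides(fields, "TVC", OVERRIDES)
```

Asset types without an entry, such as `TVC` here, come back unchanged, so the override block only affects the codes it lists.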


@ -76,3 +76,31 @@ defaults:
FERRERO.MARKETING.FIELD.VIDEO_POST_PROD_COMPANY: "Oliver Marketing Ltd"
FERRERO.MARKETING.FIELD.AUDIO_POST_PROD_COMPANY: "Oliver Marketing Ltd"
FERRERO.MARKET.PROD_COMPANY: "-" # Production House
# Asset type overrides (keyed by 3-letter asset type code)
# Applied AFTER normal field updates and forced values
# Overrides specific fields when a matching asset type is detected in the filename
asset_type_overrides:
EOL: # External Legal Opinion - selected as asset type in naming tool
FERRERO.MARKETING.FIELD.AGENCY NAME: "-"
FERRERO.MARKET.PROD_COMPANY: "-"
MAIN_LANGUAGES: "Global"
FERRERO.MARKET.FIELD.IPRIGHT: "No"
FERRERO.MARKET.FIELD.LICENSIN: "No"
FERRERO.FIELD.ASSET VALIDITY START PERIOD: "" # Remove validity dates for EOL
FERRERO.FIELD.ASSET VALIDITY END PERIOD: "" # Remove validity dates for EOL
FERRERO.FIELD.CREATIVEX LINK: "" # Remove CreativeX URL for EOL
FERRERO.TAB.FIELD.CREATIVEX: "" # Remove CreativeX score for EOL
ARTESIA.FIELD.ASSET DESCRIPTION: "Legal Studio Name"
LTD: # Licensing Translation Document - License claim translations supporting EOL
FERRERO.MARKETING.FIELD.AGENCY NAME: "-"
FERRERO.MARKET.PROD_COMPANY: "-"
MAIN_LANGUAGES: "Global"
FERRERO.MARKET.FIELD.IPRIGHT: "No"
FERRERO.MARKET.FIELD.LICENSIN: "No"
FERRERO.FIELD.ASSET VALIDITY START PERIOD: "" # Remove validity dates for LTD
FERRERO.FIELD.ASSET VALIDITY END PERIOD: "" # Remove validity dates for LTD
FERRERO.FIELD.CREATIVEX LINK: "" # Remove CreativeX URL for LTD
FERRERO.TAB.FIELD.CREATIVEX: "" # Remove CreativeX score for LTD
ARTESIA.FIELD.ASSET DESCRIPTION: "Translation of License claim - For approval purposes only"


@ -51,6 +51,7 @@ CREATE TABLE IF NOT EXISTS master_assets (
global_master_campaign_id VARCHAR(50),
global_master_folder_id VARCHAR(255),
local_campaign_id VARCHAR(50),
global_master_tracking_id VARCHAR(6),
-- Workflow Information
upload_directory VARCHAR(1000),
@ -198,7 +199,7 @@ CREATE TABLE IF NOT EXISTS creativex_scores (
-- Timestamps
extracted_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
status VARCHAR(50) DEFAULT 'active', -- 'active', 'superseded', 'master-cx-score'
status VARCHAR(50) DEFAULT 'active', -- 'active', 'superseded', 'master-cx-score' (A1 local masters), 'b1-master-cx-score' (B1 global masters)
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
@ -221,6 +222,7 @@ CREATE INDEX IF NOT EXISTS idx_master_assets_created_at ON master_assets(created
CREATE INDEX IF NOT EXISTS idx_master_assets_global_master ON master_assets(global_master_campaign_id);
CREATE INDEX IF NOT EXISTS idx_master_assets_local_campaign ON master_assets(local_campaign_id);
CREATE INDEX IF NOT EXISTS idx_master_assets_opentext_local ON master_assets(opentext_id, local_campaign_id);
CREATE INDEX IF NOT EXISTS idx_master_assets_global_master_tracking ON master_assets(global_master_tracking_id);
-- derivative_assets indexes
CREATE INDEX IF NOT EXISTS idx_derivative_tracking_id ON derivative_assets(tracking_id);


@ -0,0 +1,32 @@
-- Migration: Add A1 retry tracking to campaign_status table
-- Purpose: Prevent infinite error emails for empty A1 campaigns
-- Date: January 31, 2026
\echo 'Adding A1 retry tracking fields to campaign_status table...'
ALTER TABLE campaign_status
ADD COLUMN IF NOT EXISTS a1_retry_count INTEGER DEFAULT 0,
ADD COLUMN IF NOT EXISTS a1_last_retry_at TIMESTAMP,
ADD COLUMN IF NOT EXISTS a1_permanently_failed BOOLEAN DEFAULT FALSE,
ADD COLUMN IF NOT EXISTS a1_failure_reason TEXT;
\echo 'Fields added successfully'
-- Create index for faster queries
CREATE INDEX IF NOT EXISTS idx_campaign_status_a1_failed ON campaign_status(a1_permanently_failed);
\echo 'Index created'
-- Add comments for documentation
COMMENT ON COLUMN campaign_status.a1_retry_count IS 'Number of times A1→A2 processing attempted with empty folder';
COMMENT ON COLUMN campaign_status.a1_last_retry_at IS 'Timestamp of last retry attempt';
COMMENT ON COLUMN campaign_status.a1_permanently_failed IS 'TRUE if campaign failed all 3 retry attempts';
COMMENT ON COLUMN campaign_status.a1_failure_reason IS 'Description of why campaign was marked as permanently failed';
\echo ''
\echo '============================================================'
\echo 'Migration 003 complete!'
\echo '============================================================'
\echo 'Added fields: a1_retry_count, a1_last_retry_at, a1_permanently_failed, a1_failure_reason'
\echo 'Purpose: Track A1 empty folder retries (max 3 attempts)'
\echo '============================================================'
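The four columns added by this migration are enough to model a bounded retry counter. The sketch below mimics that bookkeeping with an in-memory dict instead of the `campaign_status` table. `increment_retry` and `MAX_A1_RETRIES` are hypothetical names, and the `mark_failed_at_max` default of `True` is an assumption inferred from how callers elsewhere in this diff use the real `db.increment_a1_retry()`.

```python
from datetime import datetime, timezone

MAX_A1_RETRIES = 3  # "max 3 attempts" per the migration's closing message


def increment_retry(row, reason, mark_failed_at_max=True):
    """Bump the retry counter on one campaign_status row (modelled as a dict)."""
    row["a1_retry_count"] = row.get("a1_retry_count", 0) + 1
    row["a1_last_retry_at"] = datetime.now(timezone.utc)
    if mark_failed_at_max and row["a1_retry_count"] >= MAX_A1_RETRIES:
        row["a1_permanently_failed"] = True
        row["a1_failure_reason"] = reason
    return {
        "success": True,
        "retry_count": row["a1_retry_count"],
        "permanently_failed": row.get("a1_permanently_failed", False),
    }
```

With `mark_failed_at_max=False` the counter keeps growing for visibility but the row is never flagged as permanently failed.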


@ -0,0 +1,13 @@
-- Migration 004: Add global_master_tracking_id column to master_assets
-- Purpose: Links local campaign assets (A1→A2) back to their global master (B1→B2)
-- by storing the M-prefixed tracking ID from the B1 record
-- Date: 2026-03-21
ALTER TABLE master_assets
ADD COLUMN IF NOT EXISTS global_master_tracking_id VARCHAR(6);
-- Index for lookups
CREATE INDEX IF NOT EXISTS idx_master_assets_global_master_tracking
ON master_assets(global_master_tracking_id);
\echo 'Migration 004 complete: Added global_master_tracking_id to master_assets'


@ -0,0 +1,14 @@
-- Migration 005: Document new 'b1-master-cx-score' status value in creativex_scores
-- Purpose: B1→B2 global master CreativeX scores are now persisted to creativex_scores
-- with status='b1-master-cx-score' so they can be queried directly without
-- joining through master_assets. No DDL change needed (status is VARCHAR(50)
-- and accepts arbitrary values); this migration exists for documentation only.
-- Date: 2026-04-29
-- Existing status values:
-- 'active' - currently-valid A2 scoring extraction (versioned)
-- 'superseded' - older A2 scoring extraction replaced by a newer one
-- 'master-cx-score' - A1→A2 local master reference score
-- 'b1-master-cx-score' - B1→B2 global master reference score (NEW)
\echo 'Migration 005 complete: b1-master-cx-score status documented (no schema change)'


@ -50,6 +50,11 @@ logging.basicConfig(
logger = logging.getLogger('A1toA2Box')
# Empty A1 folders are an expected client workflow (folder created before assets uploaded).
# Skip silently and send a single warning email at this poll count to flag genuinely-stuck
# campaigns without spamming. At ~3-min poll cadence, 20 polls ≈ 1 hour.
EMPTY_FOLDER_WARNING_THRESHOLD = 20
def extract_creativex_from_dam_metadata(asset_metadata):
"""
Extract CreativeX score and URL from DAM asset metadata if present
@ -171,6 +176,15 @@ def process_campaign(campaign, dam, box, db, notifier, config):
logger.info("Processing campaign: {} ({})".format(campaign_name, campaign_number))
logger.info("=" * 60)
# CHECK RETRY STATUS FIRST
retry_status = db.get_a1_retry_status(campaign_id)
if retry_status and retry_status['permanently_failed']:
logger.warning("Campaign {} is marked as permanently failed - skipping".format(campaign_number))
logger.info("Failure reason: {}".format(retry_status.get('failure_reason', 'Unknown')))
logger.info("To retry this campaign, manually reset it using database.reset_a1_retry()")
return {'success': False, 'processed': 0, 'failed': 0, 'skipped': True}
total_assets = 0
try:
# Get master assets
@ -180,17 +194,38 @@ def process_campaign(campaign, dam, box, db, notifier, config):
logger.info("Found {} master assets".format(total_assets))
if total_assets == 0:
logger.warning("No master assets found in Master Assets folder")
# Send email notification about empty campaign (keep error notifications)
notifier.send_email(
template_name='a1_to_a2_no_assets',
recipients=config['notifications']['recipients']['errors'],
data={
'campaign_name': campaign_name,
'campaign_id': campaign_id,
'campaign_number': campaign_number
}
# Empty folders are expected when a campaign manager creates the campaign
# before uploading assets. Track the count for visibility but never auto-fail
# — keep retrying every poll until assets appear (or status changes in DAM).
retry_result = db.increment_a1_retry(
campaign_id=campaign_id,
campaign_number=campaign_number,
campaign_name=campaign_name,
reason="No master assets found in Master Assets folder",
mark_failed_at_max=False
)
if not retry_result['success']:
logger.error("Failed to update retry counter")
retry_count = retry_result.get('retry_count', 0)
logger.info("No master assets yet (poll {}) - skipping until assets appear".format(retry_count))
# Send a single warning email when the campaign has been empty for ~1 hour
# so genuinely-stuck campaigns still surface, without spamming on every poll.
if retry_count == EMPTY_FOLDER_WARNING_THRESHOLD:
logger.warning("Campaign has been empty for {} polls - sending one-time warning".format(retry_count))
notifier.send_email(
template_name='a1_to_a2_no_assets_warning',
recipients=config['notifications']['recipients']['errors'],
data={
'campaign_name': campaign_name,
'campaign_id': campaign_id,
'campaign_number': campaign_number,
'poll_count': retry_count
}
)
return {'success': False, 'processed': 0, 'failed': 0}
# Track results
@ -216,16 +251,74 @@ def process_campaign(campaign, dam, box, db, notifier, config):
else:
logger.info("Processing: {}".format(asset_name))
# 1. Download from DAM
# 1. Extract Global Campaign Reference (needed for tracking ID lookup)
global_ref = db.extract_global_campaign_reference(asset, campaign_number)
# 1b. Look up matching B1→B2 global master by opentext_id
global_master_tid = db.find_global_master_by_opentext_id(asset_id)
if global_master_tid:
logger.info("Linked to global master: {}{}".format(asset_name, global_master_tid))
# 2. Find existing tracking ID or generate new one
# Handles re-processing: if campaign was reset to A1 after adding new masters,
# existing assets keep their tracking IDs, new assets get new IDs
tracking_result = db.find_or_create_tracking_id(
opentext_id=asset_id,
local_campaign_id=global_ref['local_campaign_id']
)
tracking_id = tracking_result['tracking_id']
is_existing = tracking_result['is_existing']
if is_existing:
# Asset already processed in a previous A1→A2 cycle
existing_master = db.get_master_asset(tracking_id)
if existing_master and existing_master.get('box_file_id'):
logger.info("Re-processing: reusing tracking ID {} for existing asset {} (skipping download/upload)".format(
tracking_id, asset_name))
box_result = {
'file_id': existing_master['box_file_id'],
'url': existing_master['box_url']
}
# Update database metadata (asset_data may have changed in DAM)
db_result = db.store_master_asset(
tracking_id=tracking_id,
opentext_id=asset_id,
asset_data=asset,
box_file_id=box_result['file_id'],
box_url=box_result['url'],
upload_folder_id=final_folder_id,
global_master_campaign_id=global_ref['global_master_campaign_id'],
global_master_folder_id=global_ref['global_master_folder_id'],
local_campaign_id=global_ref['local_campaign_id'],
global_master_tracking_id=global_master_tid
)
if db_result['success']:
processed_assets.append({
'asset_id': asset_id,
'asset_name': asset_name,
'tracking_id': tracking_id,
'box_file_id': box_result['file_id'],
'box_url': box_result['url'],
'is_existing': True
})
logger.info("✓ Existing asset confirmed: {}{} (skipped)".format(asset_name, tracking_id))
else:
raise Exception("Database update failed for existing asset")
continue # Skip to next asset
else:
# Tracking ID exists but no usable Box info - process normally
logger.info("Existing tracking ID {} found but no Box info - downloading/uploading".format(tracking_id))
# 3. Download from DAM (new assets or existing without Box info)
file_path = dam.download_asset(
asset_id,
output_dir='temp/downloads/{}'.format(campaign_id)
)
# 2. Generate tracking ID (regular files never start with 'M')
tracking_id = db.generate_unique_tracking_id(is_master=False)
# 3. Upload to Box (preserve folder structure from DAM)
# 4. Upload to Box (preserve folder structure from DAM)
box_result = box.upload_with_tracking_id(
file_path=file_path,
campaign_id=campaign_number,
@ -234,9 +327,6 @@ def process_campaign(campaign, dam, box, db, notifier, config):
subfolder_path=folder_path
)
# 4. Extract Global Campaign Reference and Local Campaign ID
global_ref = db.extract_global_campaign_reference(asset, campaign_number)
# 5. Store in database
db_result = db.store_master_asset(
tracking_id=tracking_id,
@ -247,7 +337,8 @@ def process_campaign(campaign, dam, box, db, notifier, config):
upload_folder_id=final_folder_id,
global_master_campaign_id=global_ref['global_master_campaign_id'],
global_master_folder_id=global_ref['global_master_folder_id'],
local_campaign_id=global_ref['local_campaign_id']
local_campaign_id=global_ref['local_campaign_id'],
global_master_tracking_id=global_master_tid
)
if db_result['success']:
@ -275,9 +366,10 @@ def process_campaign(campaign, dam, box, db, notifier, config):
'asset_name': asset_name,
'tracking_id': tracking_id,
'box_file_id': box_result['file_id'],
'box_url': box_result['url']
'box_url': box_result['url'],
'is_existing': False
})
logger.info("Success: {}{}".format(asset_name, tracking_id))
logger.info("New asset: {}{}".format(asset_name, tracking_id))
else:
raise Exception("Database storage failed")
@ -295,10 +387,17 @@ def process_campaign(campaign, dam, box, db, notifier, config):
# CHECK: All assets processed successfully?
all_done = len(processed_assets) == total_assets
# Count new vs existing assets
new_assets = [a for a in processed_assets if not a.get('is_existing')]
existing_assets = [a for a in processed_assets if a.get('is_existing')]
logger.info("")
logger.info("Campaign {} Results:".format(campaign_id))
logger.info(" Total: {}".format(total_assets))
logger.info(" Successful: {}".format(len(processed_assets)))
if existing_assets:
logger.info(" - New assets: {}".format(len(new_assets)))
logger.info(" - Existing assets (skipped download/upload): {}".format(len(existing_assets)))
logger.info(" Failed: {}".format(len(failed_assets)))
logger.info(" All Done: {}".format("YES" if all_done else "NO"))
logger.info("")
@ -312,6 +411,9 @@ def process_campaign(campaign, dam, box, db, notifier, config):
if status_result['success']:
logger.info("✓ Status updated successfully")
# RESET retry counter on success
db.reset_a1_retry(campaign_id)
# Record campaign status in database
logger.info("Recording campaign status in database...")
db.record_campaign_status(
@ -341,19 +443,18 @@ def process_campaign(campaign, dam, box, db, notifier, config):
os.makedirs("temp")
with open(csv_path, 'w', newline='') as csvfile:
fieldnames = ['Filename', 'Tracking ID', 'Campaign Number']
fieldnames = ['Filename', 'Tracking ID', 'Campaign Number', 'Status']
writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
writer.writeheader()
for asset in processed_assets:
# 2024-03-22: Clean filename request (remove tracking ID)
# Assuming tracking ID is at the end or we just want the asset_name
clean_name = asset['asset_name'] # asset_name from db.store_master_asset is typically used
clean_name = asset['asset_name']
writer.writerow({
'Filename': clean_name,
'Tracking ID': asset['tracking_id'],
'Campaign Number': campaign_number
'Campaign Number': campaign_number,
'Status': 'Existing' if asset.get('is_existing') else 'New'
})
logger.info("Generated CSV report: {}".format(csv_path))
@ -372,7 +473,11 @@ def process_campaign(campaign, dam, box, db, notifier, config):
'campaign_id': campaign_id,
'campaign_number': campaign_number,
'asset_count': len(processed_assets),
'processed_assets': processed_assets
'new_asset_count': len(new_assets),
'existing_asset_count': len(existing_assets),
'processed_assets': processed_assets,
'new_assets': new_assets,
'existing_assets': existing_assets
},
attachments=attachments
)
@ -416,20 +521,66 @@ def process_campaign(campaign, dam, box, db, notifier, config):
except Exception as e:
logger.error("Campaign processing failed: {}".format(str(e)))
# Send error notification for this specific campaign failure
try:
notifier.send_email(
template_name='upload_failed',
recipients=config['notifications']['recipients']['errors'],
data={
'filename': "Campaign: {}".format(campaign_name),
'tracking_id': campaign_number,
'error': str(e)
}
# Check if this is a "folder not found" or "no assets" error - use retry logic
error_str = str(e).lower()
is_folder_issue = 'folder not found' in error_str or 'no assets' in error_str or 'assets folder' in error_str
if is_folder_issue:
logger.warning("Detected folder/assets issue - applying retry logic")
# Increment retry counter
retry_result = db.increment_a1_retry(
campaign_id=campaign_id,
campaign_number=campaign_number,
campaign_name=campaign_name,
reason=str(e)
)
except Exception as email_error:
logger.error("Failed to send error email: {}".format(str(email_error)))
if not retry_result['success']:
logger.error("Failed to update retry counter")
is_permanently_failed = retry_result.get('permanently_failed', False)
retry_count = retry_result.get('retry_count', 0)
# Determine which email template to use
if is_permanently_failed:
# Send FINAL failure email (after 3 attempts)
template_name = 'a1_to_a2_permanently_failed'
else:
# Send standard retry notification
template_name = 'a1_to_a2_no_assets_retry'
# Send email notification
try:
notifier.send_email(
template_name=template_name,
recipients=config['notifications']['recipients']['errors'],
data={
'campaign_name': campaign_name,
'campaign_id': campaign_id,
'campaign_number': campaign_number,
'retry_count': retry_count,
'max_retries': 3,
'is_permanently_failed': is_permanently_failed
}
)
except Exception as email_error:
logger.error("Failed to send error email: {}".format(str(email_error)))
else:
# Other errors - send generic failure notification
try:
notifier.send_email(
template_name='upload_failed',
recipients=config['notifications']['recipients']['errors'],
data={
'filename': "Campaign: {}".format(campaign_name),
'tracking_id': campaign_number,
'error': str(e)
}
)
except Exception as email_error:
logger.error("Failed to send error email: {}".format(str(email_error)))
return {'success': False, 'processed': 0, 'failed': total_assets}
@ -495,10 +646,30 @@ def main():
db.close()
sys.exit(0)
# Exclude permanently-failed campaigns so they don't consume processing slots
eligible_campaigns = []
skipped_failed = []
for campaign in campaigns:
retry_status = db.get_a1_retry_status(campaign['asset_id'])
if retry_status and retry_status['permanently_failed']:
skipped_failed.append(campaign.get('campaign_id', 'N/A'))
else:
eligible_campaigns.append(campaign)
if skipped_failed:
logger.info("Excluding {} permanently-failed campaign(s): {}".format(
len(skipped_failed), ", ".join(skipped_failed)
))
if not eligible_campaigns:
logger.info("No eligible A1 campaigns to process - exiting")
db.close()
sys.exit(0)
# Process UP TO 2 campaigns
campaigns_to_process = campaigns[:2]
logger.info("Found {} A1 campaigns - processing {} campaign(s)".format(
len(campaigns), len(campaigns_to_process)
campaigns_to_process = eligible_campaigns[:2]
logger.info("Found {} A1 campaigns ({} eligible) - processing {} campaign(s)".format(
len(campaigns), len(eligible_campaigns), len(campaigns_to_process)
))
logger.info("")
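Two small rules in this file's diff are easy to check in isolation: the one-time warning that fires only when the poll count equals `EMPTY_FOLDER_WARNING_THRESHOLD`, and the selection of at most two campaigns after permanently-failed ones are excluded. The sketch below restates both without the DAM, Box, or database dependencies. `should_send_empty_warning`, `select_campaigns`, and `MAX_CAMPAIGNS_PER_RUN` are hypothetical names introduced for illustration.

```python
EMPTY_FOLDER_WARNING_THRESHOLD = 20  # value taken from the diff above
MAX_CAMPAIGNS_PER_RUN = 2            # the diff slices eligible_campaigns[:2]


def should_send_empty_warning(retry_count):
    """True only on the poll that hits the threshold, so the email is sent once."""
    return retry_count == EMPTY_FOLDER_WARNING_THRESHOLD


def select_campaigns(campaigns, is_permanently_failed, limit=MAX_CAMPAIGNS_PER_RUN):
    """Drop permanently-failed campaigns first, then cap the batch size."""
    eligible = [c for c in campaigns if not is_permanently_failed(c)]
    return eligible[:limit]
```

Filtering before slicing is the point of the change: a failed campaign at the head of the list no longer takes one of the two processing slots. The equality test in `should_send_empty_warning` relies on the counter increasing by exactly one per poll; if a poll could skip a value, a flag recording that the warning was sent would be more robust.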


@ -57,7 +57,7 @@ logging.basicConfig(
logger = logging.getLogger('A2toA3')
def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, keep_files=False, dryrun=False):
def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, notifier, keep_files=False, dryrun=False):
"""
Process a single file from Box folder
@ -93,11 +93,43 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, kee
if subfolder_path:
logger.info("From Box subfolder: {} -> will create in DAM".format(subfolder_path))
# 2. Load master metadata from database
master_asset = db.get_master_asset(tracking_id)
# 2. Load master metadata from database (support multiple tracking IDs in PPR)
tracking_ids = parsed.get('tracking_ids', [tracking_id]) # Get all IDs or fallback to single
has_multiple_masters = parsed.get('has_multiple_masters', False)
# CHECK: Warn if Master Tracking ID is used (starts with M)
if tracking_id.upper().startswith('M'):
# Load all master assets (supports multiple masters in both PPR and PROD)
master_assets = []
master_opentext_ids = []
if has_multiple_masters:
logger.info("Multiple master assets detected: {}".format(', '.join(tracking_ids)))
for tid in tracking_ids:
master = db.get_master_asset(tid)
if not master:
logger.warning("Master asset not found for tracking ID: {} - skipping".format(tid))
continue
master_assets.append(master)
master_opentext_ids.append(master['opentext_id'])
if not master_assets:
raise ValueError("No master assets found for tracking IDs: {}".format(', '.join(tracking_ids)))
# Use first master for metadata inheritance
master_asset = master_assets[0]
logger.info("Using primary master {} for metadata, linking {} total masters".format(
tracking_ids[0], len(master_assets)))
else:
# Single master (backward compatible)
master_asset = db.get_master_asset(tracking_id)
if not master_asset:
# Will check below
master_asset = None
else:
master_opentext_ids = [master_asset['opentext_id']]
# CHECK: Warn if Master Tracking ID is used (starts with uppercase M)
if tracking_id.startswith('M'):
logger.warning("Detected Master Tracking ID in Version/Derivative upload folder: {}".format(tracking_id))
# Send notification
@ -141,10 +173,8 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, kee
if not master_asset:
raise ValueError("No master asset for tracking ID: {}".format(tracking_id))
# 3. Get CreativeX score from database (lookup by original Box filename)
# The PDF contains the filename field with the full name (job + tracking ID)
# So we lookup using the original filename from Box, not the stripped version
creativex_data = db.get_creativex_score_by_filename(filename)
# 3. Get CreativeX score from database (lookup by filename, fallback to tracking ID)
creativex_data = db.get_creativex_score_by_filename(filename, tracking_id=tracking_id)
# Build box_metadata dict (for compatibility with existing code)
if creativex_data:
@ -156,7 +186,47 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, kee
# If legacy single platform exists, add it to list
if not platforms and data_obj.get('ferrero_mapped_platform'):
platforms = [data_obj.get('ferrero_mapped_platform')]
# Fallback: Handle new CreativeX API format (no 'data' wrapper)
# Maps API channel/publisher back to DAM platform names
if not platforms and isinstance(full_data, dict) and 'channel' in full_data:
api_channel = full_data.get('channel', '')
api_publisher = full_data.get('publisher', '')
CHANNEL_TO_DAM = {
'google_ads': 'Google',
'dv360': 'DV360',
'tiktok_paid': 'TikTok',
'snapchat_paid': 'Snap',
'pinterest': 'Pinterest',
'twitter_paid': 'Twitter',
'amazon_paid': 'Amazon',
}
FB_PUBLISHER_TO_DAM = {
'facebook': 'FB - Feed',
'audience_network': 'Audience Network - An Classic',
'messenger': 'Messenger - Inbox',
}
IG_PUBLISHER_TO_DAM = {
'instagram': 'IG - Feed',
}
if api_channel in CHANNEL_TO_DAM:
platforms = [CHANNEL_TO_DAM[api_channel]]
elif api_channel == 'facebook_paid' and api_publisher in FB_PUBLISHER_TO_DAM:
platforms = [FB_PUBLISHER_TO_DAM[api_publisher]]
elif api_channel == 'instagram_paid' and api_publisher in IG_PUBLISHER_TO_DAM:
platforms = [IG_PUBLISHER_TO_DAM[api_publisher]]
elif api_channel == 'facebook_paid':
platforms = ['FB - Feed']
elif api_channel == 'instagram_paid':
platforms = ['IG - Feed']
if platforms:
logger.info("CreativeX: Mapped API channel '{}'/publisher '{}' to DAM platform '{}'".format(
api_channel, api_publisher, platforms[0]))
box_metadata = {
'score': creativex_data['quality_score'],
'url': creativex_data['creativex_url'],
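Reviewer note: the channel/publisher fallback added in this hunk reads more easily as a pure function. The sketch below reuses the three lookup tables from the diff verbatim; the function name `map_api_channel_to_platform` is hypothetical, and the logic is intended to be equivalent to the `if`/`elif` chain above, not a replacement the commit actually contains.

```python
CHANNEL_TO_DAM = {
    'google_ads': 'Google',
    'dv360': 'DV360',
    'tiktok_paid': 'TikTok',
    'snapchat_paid': 'Snap',
    'pinterest': 'Pinterest',
    'twitter_paid': 'Twitter',
    'amazon_paid': 'Amazon',
}
FB_PUBLISHER_TO_DAM = {
    'facebook': 'FB - Feed',
    'audience_network': 'Audience Network - An Classic',
    'messenger': 'Messenger - Inbox',
}
IG_PUBLISHER_TO_DAM = {
    'instagram': 'IG - Feed',
}


def map_api_channel_to_platform(channel, publisher=''):
    """Return the DAM platform name for an API channel, or None if unmapped."""
    if channel in CHANNEL_TO_DAM:
        return CHANNEL_TO_DAM[channel]
    # Meta channels depend on the publisher; unknown publishers fall back to Feed.
    if channel == 'facebook_paid':
        return FB_PUBLISHER_TO_DAM.get(publisher, 'FB - Feed')
    if channel == 'instagram_paid':
        return IG_PUBLISHER_TO_DAM.get(publisher, 'IG - Feed')
    return None
```

An unmapped channel returns `None`, which corresponds to `platforms` staying empty in the diff, so no platform is reported for it.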
@ -167,12 +237,12 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, kee
))
creativex_found = True
else:
# Use default values when no CreativeX score found
# Use default values when no CreativeX score found - no URL sent
box_metadata = {
'score': '0',
'url': 'https://app.creativex.com/preflight/pretests'
'url': ''
}
logger.warning("No CreativeX score found for: {} - Using default values (Score: 0, Placeholder URL)".format(
logger.warning("No CreativeX score found for: {} - Using default values (Score: 0, No URL)".format(
filename
))
creativex_found = False
@ -184,14 +254,28 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, kee
# 5. Get clean filename
clean_filename = parser.strip_upload_components(filename)
# 6. Build MVP asset representation with CreativeX data from database
# 6. Look up pre-upload metadata override saved by the naming tool's editor.
# The naming tool stores filename without extension, so strip it here.
filename_no_ext = os.path.splitext(filename)[0]
override = db.get_override_metadata(filename_no_ext)
override_fields = None
if override:
override_fields = override.get('override_fields')
logger.info("Found pre-upload override (id={}) for {}: {} field(s)".format(
override.get('id'), filename_no_ext,
len(override_fields) if override_fields else 0
))
# 7. Build MVP asset representation with CreativeX data from database
asset_rep = mvp_extractor.build_mvp_asset_representation(
master_metadata=master_asset['full_metadata'],
clean_filename=clean_filename,
parsed_filename=parsed,
box_metadata=box_metadata, # Pass CreativeX data from database
tracking_mode=tracking_mode, # Pass tracking mode for folder-only handling
master_opentext_id=master_asset['opentext_id'] # Pass master DAM ID for derivative tracking
master_opentext_id=master_asset['opentext_id'], # Primary master DAM ID
master_opentext_ids=master_opentext_ids, # All master IDs (multiple or single)
override_fields=override_fields # Pre-upload edits from naming tool
)
# DRYRUN MODE: Display full asset representation and exit
@ -215,6 +299,20 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, kee
logger.info(" Score: {}".format(box_metadata.get('score')))
logger.info(" URL: {}".format(box_metadata.get('url')))
logger.info("")
# Register master asset IDs in lookup domain (even in dryrun for testing)
# This API call is safe - it only adds values to the lookup table, doesn't create assets
if master_opentext_ids:
logger.info("Domain Registration Test:")
registration_result = dam.register_master_asset_ids_for_ppr(master_opentext_ids)
if registration_result.get('skipped'):
logger.info(" Skipped (not PPR environment)")
else:
logger.info(" Registered: {}".format(registration_result.get('registered_ids', [])))
if registration_result.get('failed_ids'):
logger.info(" Failed: {}".format(registration_result.get('failed_ids', [])))
logger.info("")
logger.info("DRYRUN: No upload performed, file kept in Box")
logger.info("=" * 80)
@ -226,7 +324,7 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, kee
'clean_filename': clean_filename,
'creativex_found': creativex_found,
'creativex_score': box_metadata.get('score', '0'),
'creativex_url': box_metadata.get('url', 'https://app.creativex.com/preflight/pretests'),
'creativex_url': box_metadata.get('url', ''),
'dryrun': True
}
@ -248,6 +346,11 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, kee
)
logger.info("Will upload to: 01. Final Assets/{}".format(subfolder_path))
# Register master asset IDs in lookup domain before upload
# OpenText API requires domain values to exist before they can be used in asset creation
if master_opentext_ids:
dam.register_master_asset_ids_for_ppr(master_opentext_ids)
upload_result = dam.upload_asset(
file_path=clean_temp_file,
folder_id=upload_folder_id,
@ -265,6 +368,10 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, kee
filename=clean_filename
)
# Mark pre-upload override as applied (only after confirmed DAM upload success).
if override:
db.mark_override_applied(filename_no_ext)
# 9. Delete file from Box after successful upload (unless --keep-files flag set)
if keep_files:
logger.info("--keep-files flag set - File kept in Box: {}".format(filename))
@ -289,7 +396,7 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, kee
'clean_filename': clean_filename,
'creativex_found': creativex_found,
'creativex_score': box_metadata.get('score', '0'),
'creativex_url': box_metadata.get('url', 'https://app.creativex.com/preflight/pretests'),
'creativex_url': box_metadata.get('url', ''),
'subfolder_path': subfolder_path # Add subfolder path to result
}
@ -347,7 +454,7 @@ def main():
box = BoxClient(config, root_folder_id=config['box'].get('root_folder_a2_a3'))
db = Database(config)
notifier = Notifier(config)
parser = FilenameParser()
parser = FilenameParser(dam_base_url=dam.base_url) # Pass DAM URL for environment detection
mvp_extractor = MetadataExtractorMVP(field_mappings)
# Test connections
@ -426,7 +533,7 @@ def main():
logger.info("Processing file {}/{}".format(idx, len(valid_files)))
logger.info("=" * 60)
result = process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, keep_files=args.keep_files, dryrun=args.dryrun)
result = process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, notifier, keep_files=args.keep_files, dryrun=args.dryrun)
if result['success']:
successful_files.append(result)
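Reviewer note: the override handling in this file follows a look-up, apply, then mark pattern, where the override is marked as applied only after the upload has succeeded. The sketch below illustrates that ordering with an in-memory store. `InMemoryOverrides`, `upload_with_override`, and the `validity_end` field name are hypothetical; the real `Database` methods and the upload call have signatures not fully shown in this diff, and the sketch assumes the upload raises on failure.

```python
import os


class InMemoryOverrides:
    """Hypothetical stand-in for the override_metadata table."""

    def __init__(self, rows):
        self.rows = rows
        self.applied = set()

    def get_override_metadata(self, filename_no_ext):
        return self.rows.get(filename_no_ext)

    def mark_override_applied(self, filename_no_ext):
        self.applied.add(filename_no_ext)


def upload_with_override(filename, store, upload):
    """Look up the override by extension-less filename, upload, then mark applied."""
    key = os.path.splitext(filename)[0]
    override = store.get_override_metadata(key)
    fields = override.get('override_fields') if override else None
    upload(filename, fields)  # assumed to raise on failure
    if override:
        # Only reached when the upload did not raise.
        store.mark_override_applied(key)
    return fields


store = InMemoryOverrides(
    {'banner_v2': {'id': 7, 'override_fields': {'validity_end': '2026-06-19'}}}
)
uploads = []
upload_with_override('banner_v2.mp4', store, lambda f, fields: uploads.append((f, fields)))
```

If the upload raises, the override stays unmarked, so a retry of the same file picks up the same edits again.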


@ -52,61 +52,57 @@ logger = logging.getLogger('A4Box')
def generate_and_upload_csv(db, box, config):
"""
Generate CSV of all live campaigns and upload to Box
Generate the combined live-campaigns CSV (A-series + B-series) and upload
to Box. OMG's automation treats each new file as a full replacement of
its live list, so we always emit the complete list under one filename.
"""
try:
logger.info("Generating live campaigns CSV...")
# 1. Get all live campaigns from DB
campaigns = db.get_all_live_campaigns()
if not campaigns:
logger.warning("No live campaigns found to report")
# Even if empty, we might want to upload an empty CSV to clear the list?
# For now, let's upload it even if empty to reflect that no campaigns are live.
logger.info("Found {} live campaigns".format(len(campaigns)))
# 2. Generate CSV file
timestamp = datetime.now(timezone.utc).strftime('%Y-%m-%d_%H%M%S_UTC')
csv_filename = 'live_campaigns_{}.csv'.format(timestamp)
csv_path = os.path.join('temp', csv_filename)
os.makedirs('temp', exist_ok=True)
with open(csv_path, 'w', newline='') as csvfile:
fieldnames = ['code', 'description']
writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
writer.writeheader()
for camp in campaigns:
writer.writerow({
'code': "{}-{}".format(camp['campaign_number'], camp['campaign_name']),
'description': camp['campaign_name']
})
logger.info("Generated CSV: {}".format(csv_path))
# 3. Upload to Box
folder_id = config['box'].get('live_campaigns_folder_id')
if not folder_id:
logger.error("Box live_campaigns_folder_id not configured")
return False
upload_result = box.upload_file(
file_path=csv_path,
folder_id=folder_id,
target_filename=csv_filename
)
logger.info("Uploaded CSV to Box: {} (File ID: {})".format(
csv_filename, upload_result['file_id']
))
# Clean up
os.remove(csv_path)
return True
except Exception as e:
logger.error("Failed to generate/upload CSV: {}".format(str(e)))
return False
@ -149,11 +145,9 @@ def process_campaign(campaign, dam, box, db, notifier, config):
webhook_sent=True # Mark as processed
)
# Generate and upload updated CSV
# This will now exclude the campaign we just marked as NO
logger.info("Generating and uploading updated live campaigns CSV...")
csv_success = generate_and_upload_csv(db, box, config)
if csv_success:
logger.info("✓ CSV report uploaded successfully")
else:


@ -10,8 +10,10 @@ Compatible with Python 3.6+
import sys
import os
import time
import csv
import logging
import argparse
from datetime import datetime, timezone
# Add shared library to path
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))
@ -52,6 +54,136 @@ logging.basicConfig(
logger = logging.getLogger('B1toB2')
def _walk_metadata_elements(elements):
    """Recursively yield every element in nested metadata_element_list arrays.
    Categories and tables both nest fields underneath them, so a flat walk
    misses anything below the top level."""
    for e in elements or []:
        if not isinstance(e, dict):
            continue
        yield e
        nested = e.get('metadata_element_list')
        if isinstance(nested, list):
            for sub in _walk_metadata_elements(nested):
                yield sub
def extract_creativex_from_dam_metadata(asset_metadata):
    """
    Extract CreativeX score and URL from DAM asset metadata if present.
    Walks the metadata_element_list recursively because the score field
    (FERRERO.TAB.FIELD.CREATIVEX) is nested at depth 2 under its parent
    table FERRERO.TABULAR.FIELD.CREATIVEX, not at the top level.
    """
    try:
        top = (asset_metadata or {}).get('metadata', {}).get('metadata_element_list', [])
        cx = {'score': None, 'url': None}
        for element in _walk_metadata_elements(top):
            element_id = element.get('id')
            if element_id == 'FERRERO.TAB.FIELD.CREATIVEX':
                values = element.get('values', [])
                if values:
                    value_obj = values[0].get('value', {})
                    if isinstance(value_obj, dict):
                        field_value = value_obj.get('field_value', {})
                        if isinstance(field_value, dict):
                            score = field_value.get('value')
                            if score:
                                cx['score'] = str(score)
            elif element_id == 'FERRERO.FIELD.CREATIVEX LINK':
                value_obj = element.get('value', {})
                if isinstance(value_obj, dict):
                    nested_value = value_obj.get('value', {})
                    if isinstance(nested_value, dict):
                        url = nested_value.get('value')
                        if url:
                            cx['url'] = url
        return cx
    except Exception as e:
        logger.warning("Failed to extract CreativeX from metadata: {}".format(str(e)))
        return {'score': None, 'url': None}
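Reviewer note: the reason for the recursive walk is easier to see on a small payload. The sketch below builds a sample structure whose shape is inferred from the key paths the extractor reads; it is an assumption, not a captured DAM response. The `CATEGORY` id and the example URL are made up, and the walker is rewritten with `yield from` for brevity.

```python
def walk_metadata_elements(elements):
    """Depth-first walk over nested metadata_element_list arrays."""
    for element in elements or []:
        if not isinstance(element, dict):
            continue
        yield element
        nested = element.get('metadata_element_list')
        if isinstance(nested, list):
            yield from walk_metadata_elements(nested)


# Assumed payload shape: category -> table -> score cell, plus a sibling link field.
sample = {'metadata': {'metadata_element_list': [
    {'id': 'CATEGORY', 'metadata_element_list': [
        {'id': 'FERRERO.TABULAR.FIELD.CREATIVEX', 'metadata_element_list': [
            {'id': 'FERRERO.TAB.FIELD.CREATIVEX',
             'values': [{'value': {'field_value': {'value': 'DV360^100'}}}]},
        ]},
        {'id': 'FERRERO.FIELD.CREATIVEX LINK',
         'value': {'value': {'value': 'https://example.com/cx/1'}}},
    ]},
]}}

top = sample['metadata']['metadata_element_list']
flat_ids = [e['id'] for e in top]                            # top level only
deep_ids = [e['id'] for e in walk_metadata_elements(top)]    # every level
```

A top-level scan sees only the category element, while the recursive walk reaches both the score cell and the link field.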
def generate_and_upload_csv(db, box, config):
    """
    Generate the combined live-campaigns CSV (A-series + B-series) and upload
    to Box. OMG's automation treats each new file as a full replacement of
    its live list, so we always emit the complete list under one filename.
    """
    try:
        logger.info("Generating live campaigns CSV...")
        campaigns = db.get_all_live_campaigns()
        if not campaigns:
            logger.warning("No live campaigns found to report")
        logger.info("Found {} live campaigns".format(len(campaigns)))
        timestamp = datetime.now(timezone.utc).strftime('%Y-%m-%d_%H%M%S_UTC')
        csv_filename = 'live_campaigns_{}.csv'.format(timestamp)
        csv_path = os.path.join('temp', csv_filename)
        os.makedirs('temp', exist_ok=True)
        with open(csv_path, 'w', newline='') as csvfile:
            fieldnames = ['code', 'description']
            writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
            writer.writeheader()
            for camp in campaigns:
                writer.writerow({
                    'code': "{}-{}".format(camp['campaign_number'], camp['campaign_name']),
                    'description': camp['campaign_name']
                })
        logger.info("Generated CSV: {}".format(csv_path))
        folder_id = config['box'].get('live_campaigns_folder_id')
        if not folder_id:
            logger.error("Box live_campaigns_folder_id not configured")
            return False
        upload_result = box.upload_file(
            file_path=csv_path,
            folder_id=folder_id,
            target_filename=csv_filename
        )
        logger.info("Uploaded CSV to Box: {} (File ID: {})".format(
            csv_filename, upload_result['file_id']
        ))
        os.remove(csv_path)
        return True
    except Exception as e:
        logger.error("Failed to generate/upload CSV: {}".format(str(e)))
        return False
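Reviewer note: the CSV layout produced by `generate_and_upload_csv` can be checked without touching Box or the database. The sketch below renders the same two columns into an in-memory buffer. `render_live_campaigns_csv` is a hypothetical helper and the campaign values are invented.

```python
import csv
import io


def render_live_campaigns_csv(campaigns):
    """Render the code/description CSV as a string (no file or Box access)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=['code', 'description'])
    writer.writeheader()
    for camp in campaigns:
        writer.writerow({
            'code': "{}-{}".format(camp['campaign_number'], camp['campaign_name']),
            'description': camp['campaign_name'],
        })
    return buf.getvalue()


rows = render_live_campaigns_csv([
    {'campaign_number': '1001', 'campaign_name': 'Spring Launch'},
]).splitlines()
empty_rows = render_live_campaigns_csv([]).splitlines()
```

With no live campaigns the output is the header row alone, which matches the full-replacement behaviour described in the docstring: an uploaded header-only file would clear the downstream list.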
def format_cx_score_for_display(raw_score):
    """DAM stores the CreativeX score as a tabular cell that concatenates
    platform and score with a caret, e.g. 'DV360^100'. Convert to
    '100 (DV360)' for human-readable email output. Returns the raw value
    unchanged if it doesn't match the expected pattern."""
    if not raw_score:
        return raw_score
    if '^' in raw_score:
        platform, _, score = raw_score.partition('^')
        platform = platform.strip()
        score = score.strip()
        if platform and score:
            return "{} ({})".format(score, platform)
    return raw_score
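Reviewer note: a few sample inputs show how the caret handling behaves at the edges. The function body is reproduced from the diff so the snippet runs on its own; the sample strings are invented.

```python
def format_cx_score_for_display(raw_score):
    """Turn 'PLATFORM^SCORE' into 'SCORE (PLATFORM)'; pass anything else through."""
    if not raw_score:
        return raw_score
    if '^' in raw_score:
        platform, _, score = raw_score.partition('^')
        platform = platform.strip()
        score = score.strip()
        if platform and score:
            return "{} ({})".format(score, platform)
    return raw_score


samples = ['DV360^100', ' TikTok ^ 87 ', '100', '^100', '', None]
formatted = [format_cx_score_for_display(s) for s in samples]
```

Because `str.partition` splits on the first caret only, a value with a missing platform such as `'^100'` is returned unchanged instead of being rendered with an empty platform.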
def process_campaign(campaign, dam, box, db, notifier, config):
"""
Process single campaign - download all master assets
@ -103,6 +235,7 @@ def process_campaign(campaign, dam, box, db, notifier, config):
return {'success': False, 'processed': 0, 'failed': total_assets}
# Process each asset
skipped_count = 0
for asset in master_assets:
asset_id = asset['asset_id']
asset_name = asset.get('name', 'unknown')
@ -117,7 +250,7 @@ def process_campaign(campaign, dam, box, db, notifier, config):
# SAFEGUARD: Check if it's a folder (should be handled by dam_client, but double check)
asset_type = asset.get('asset_type', {})
type_name = asset_type.get('name', '') if isinstance(asset_type, dict) else str(asset_type)
if 'folder' in type_name.lower():
logger.warning("Skipping item identified as folder: {} (Type: {})".format(asset_name, type_name))
continue
@ -128,6 +261,37 @@ def process_campaign(campaign, dam, box, db, notifier, config):
logger.warning("Skipping item with no extension (likely folder/container): {}".format(asset_name))
continue
# SKIP CHECK: If this asset was already processed (exists in DB), skip re-downloading
existing_tracking_id = db.find_global_master_by_opentext_id(asset_id)
if existing_tracking_id:
existing_asset = db.get_master_asset(existing_tracking_id)
if existing_asset and existing_asset.get('box_url'):
skipped_count += 1
logger.info("⏭ Already processed: {}{} (skipping)".format(asset_name, existing_tracking_id))
cx = extract_creativex_from_dam_metadata(existing_asset.get('full_metadata') or {})
if cx['score'] or cx['url']:
db.store_creativex_score(
filename=asset_name,
creativex_id='',
creativex_url=cx['url'] or '',
quality_score=cx['score'] or '',
box_file_id=existing_asset.get('box_file_id', ''),
full_extraction_data={'master_metadata': True, 'source': 'b1_to_b2', 'data': cx},
tracking_id=existing_tracking_id,
status='b1-master-cx-score'
)
processed_assets.append({
'asset_id': asset_id,
'asset_name': asset_name,
'tracking_id': existing_tracking_id,
'box_file_id': existing_asset.get('box_file_id', ''),
'box_url': existing_asset.get('box_url', ''),
'creativex_score': format_cx_score_for_display(cx['score']),
'creativex_url': cx['url'],
'is_existing': True
})
continue
# 1. Download from DAM
file_path = dam.download_asset(
asset_id,
@ -161,12 +325,29 @@ def process_campaign(campaign, dam, box, db, notifier, config):
)
if db_result['success']:
cx = extract_creativex_from_dam_metadata(asset)
if cx['score']:
logger.info("CreativeX score on master {}: {}".format(asset_name, cx['score']))
if cx['score'] or cx['url']:
db.store_creativex_score(
filename=asset_name,
creativex_id='',
creativex_url=cx['url'] or '',
quality_score=cx['score'] or '',
box_file_id=box_result['file_id'],
full_extraction_data={'master_metadata': True, 'source': 'b1_to_b2', 'data': cx},
tracking_id=tracking_id,
status='b1-master-cx-score'
)
processed_assets.append({
'asset_id': asset_id,
'asset_name': asset_name,
'tracking_id': tracking_id,
'box_file_id': box_result['file_id'],
'box_url': box_result['url']
'box_url': box_result['url'],
'creativex_score': format_cx_score_for_display(cx['score']),
'creativex_url': cx['url'],
'is_existing': False
})
logger.info("✓ Success: {}{}".format(asset_name, tracking_id))
else:
@ -186,10 +367,16 @@ def process_campaign(campaign, dam, box, db, notifier, config):
# CHECK: All assets processed successfully?
all_done = len(processed_assets) == total_assets
# Split new vs existing for reporting
new_assets = [a for a in processed_assets if not a.get('is_existing')]
existing_assets = [a for a in processed_assets if a.get('is_existing')]
logger.info("")
logger.info("Campaign {} Results:".format(campaign_id))
logger.info(" Total: {}".format(total_assets))
logger.info(" Successful: {}".format(len(processed_assets)))
logger.info(" Skipped (already done): {}".format(skipped_count))
logger.info(" New this run: {}".format(len(new_assets)))
logger.info(" Failed: {}".format(len(failed_assets)))
logger.info(" All Done: {}".format("YES" if all_done else "NO"))
logger.info("")
@ -203,6 +390,28 @@ def process_campaign(campaign, dam, box, db, notifier, config):
if status_result['success']:
logger.info("✓ Status updated successfully")
# Record campaign status in database — marks it as LIVE so the
# global CSV picks it up. B4 closure (or A4 with prior B-status)
# later flips this to NO.
logger.info("Recording campaign status in database (Live: YES, status B2)...")
db.record_campaign_status(
campaign_id=campaign_id,
campaign_number=campaign_number,
campaign_name=campaign_name,
live_campaign='YES',
status='B2',
webhook_sent=False # B-series workflow doesn't send a webhook
)
# Regenerate and upload the combined live campaigns CSV to Box.
# Box automation forwards it to OMG as a full-list replacement.
logger.info("Generating and uploading live campaigns CSV...")
csv_success = generate_and_upload_csv(db, box, config)
if csv_success:
logger.info("✓ CSV report uploaded successfully")
else:
logger.error("✗ CSV report generation/upload failed")
# NOTE: B1→B2 workflow does NOT send webhook (only email notification)
# Webhook is only used for A1→A2 workflow
@ -215,7 +424,7 @@ def process_campaign(campaign, dam, box, db, notifier, config):
os.makedirs("temp")
with open(csv_path, 'w', newline='') as csvfile:
fieldnames = ['Filename', 'Tracking ID', 'Campaign Number']
fieldnames = ['Filename', 'Tracking ID', 'Campaign Number', 'Status']
writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
writer.writeheader()
@ -223,7 +432,8 @@ def process_campaign(campaign, dam, box, db, notifier, config):
writer.writerow({
'Filename': asset['asset_name'],
'Tracking ID': asset['tracking_id'],
'Campaign Number': campaign_number
'Campaign Number': campaign_number,
'Status': 'Existing' if asset.get('is_existing') else 'New'
})
logger.info("Generated CSV report: {}".format(csv_path))
@ -242,7 +452,11 @@ def process_campaign(campaign, dam, box, db, notifier, config):
'campaign_id': campaign_id,
'campaign_number': campaign_number,
'asset_count': len(processed_assets),
'processed_assets': processed_assets
'new_asset_count': len(new_assets),
'existing_asset_count': len(existing_assets),
'processed_assets': processed_assets,
'new_assets': new_assets,
'existing_assets': existing_assets
},
attachments=attachments
)


@ -0,0 +1,283 @@
#!/usr/bin/env python3
"""
B4 Box Uploader
Monitors campaigns with status B4 (Global - Not Going Live)
Updates status in DB to live_campaign='NO'
Generates and uploads updated GLOBAL CSV of live campaigns to Box.
Mirrors a4_box_uploader.py for the global (B-series) workflow.
"""
import sys
import os
import time
import logging
import argparse
import csv
from datetime import datetime, timezone
# Add shared library to path
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))
from shared.config_loader import load_config
from shared.dam_client import DAMClient
from shared.box_client import BoxClient
from shared.database import Database
from shared.notifier import Notifier
# Setup logging with rotation
from logging.handlers import RotatingFileHandler
# Create logs directory if it doesn't exist
os.makedirs('logs', exist_ok=True)
os.makedirs('logs/backup', exist_ok=True)
# Configure logging with rotation
log_handler = RotatingFileHandler(
'logs/b4_box.log',
maxBytes=10*1024*1024, # 10MB per file
backupCount=28
)
log_handler.setLevel(logging.INFO)
log_handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.INFO)
console_handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
logging.basicConfig(
level=logging.INFO,
handlers=[log_handler, console_handler]
)
logger = logging.getLogger('B4Box')
def generate_and_upload_csv(db, box, config):
    """
    Generate the combined live-campaigns CSV (A-series + B-series) and upload
    to Box. OMG's automation treats each new file as a full replacement of
    its live list, so we always emit the complete list under one filename.
    """
    try:
        logger.info("Generating live campaigns CSV...")
        campaigns = db.get_all_live_campaigns()
        if not campaigns:
            logger.warning("No live campaigns found to report")
        logger.info("Found {} live campaigns".format(len(campaigns)))
        timestamp = datetime.now(timezone.utc).strftime('%Y-%m-%d_%H%M%S_UTC')
        csv_filename = 'live_campaigns_{}.csv'.format(timestamp)
        csv_path = os.path.join('temp', csv_filename)
        os.makedirs('temp', exist_ok=True)
        with open(csv_path, 'w', newline='') as csvfile:
            fieldnames = ['code', 'description']
            writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
            writer.writeheader()
            for camp in campaigns:
                writer.writerow({
                    'code': "{}-{}".format(camp['campaign_number'], camp['campaign_name']),
                    'description': camp['campaign_name']
                })
        logger.info("Generated CSV: {}".format(csv_path))
        folder_id = config['box'].get('live_campaigns_folder_id')
        if not folder_id:
            logger.error("Box live_campaigns_folder_id not configured")
            return False
        upload_result = box.upload_file(
            file_path=csv_path,
            folder_id=folder_id,
            target_filename=csv_filename
        )
        logger.info("Uploaded CSV to Box: {} (File ID: {})".format(
            csv_filename, upload_result['file_id']
        ))
        os.remove(csv_path)
        return True
    except Exception as e:
        logger.error("Failed to generate/upload CSV: {}".format(str(e)))
        return False
def process_campaign(campaign, dam, box, db, notifier, config):
    """
    Process B4 campaign - mark not-live and regenerate the global CSV.
    """
    campaign_id = campaign['asset_id']
    campaign_name = campaign['campaign_name']
    campaign_number = campaign.get('campaign_id') or 'UNKNOWN'
    logger.info("=" * 60)
    logger.info("Processing B4 campaign: {} ({})".format(campaign_name, campaign_number))
    logger.info("=" * 60)
    try:
        campaign_check = db.check_campaign_processed(campaign_id)
        if campaign_check['exists'] and campaign_check['webhook_sent']:
            logger.info("Campaign already processed")
            logger.info(" Processed at: {}".format(campaign_check['webhook_sent_at']))
            logger.info(" Status: {}".format(campaign_check['status']))
            logger.info(" Live Campaign: {}".format(campaign_check['live_campaign']))
            logger.info("Skipping to avoid duplicate processing")
            return {'success': True, 'processed': False, 'already_processed': True}
        logger.info("Recording campaign status in database (Live: NO)...")
        db.record_campaign_status(
            campaign_id=campaign_id,
            campaign_number=campaign_number,
            campaign_name=campaign_name,
            live_campaign='NO',
            status='B4',
            webhook_sent=True
        )
        logger.info("Generating and uploading updated live campaigns CSV...")
        csv_success = generate_and_upload_csv(db, box, config)
        if csv_success:
            logger.info("✓ CSV report uploaded successfully")
        else:
            logger.error("✗ CSV report generation/upload failed")
        notifier.send_email(
            template_name='a4_webhook_sent',  # Reuse template: conveys "closure processed"
            recipients=config['notifications']['recipients']['success'],
            data={
                'campaign_name': campaign_name,
                'campaign_id': campaign_id,
                'campaign_number': campaign_number,
                'webhook_url': 'CSV Uploaded to Box (Global)'
            }
        )
        return {'success': True, 'processed': True}
    except Exception as e:
        logger.error("Campaign processing failed: {}".format(str(e)))
        return {'success': False, 'processed': False}
def main():
    """Main polling loop"""
    parser = argparse.ArgumentParser(description='Ferrero B4 Box Uploader')
    parser.add_argument('--auth-pfx', action='store_true',
                        help='Use mTLS certificate authentication (Legacy APIM)')
    parser.add_argument('--auth-pfx-v2', action='store_true',
                        help='Use mTLS V2 (Hybrid) authentication')
    args = parser.parse_args()
    logger.info("=" * 60)
    logger.info("Ferrero B4 Box Uploader Starting")
    auth_mode = 'oauth'
    if args.auth_pfx_v2:
        auth_mode = 'mtls_v2'
        logger.info("Authentication: mTLS V2 (Hybrid)")
    elif args.auth_pfx:
        auth_mode = 'mtls'
        logger.info("Authentication: mTLS Certificate (Legacy)")
    else:
        logger.info("Authentication: OAuth2 (default)")
    logger.info("=" * 60)
    config = load_config('config/config.yaml')
    dam = DAMClient(config, auth_mode=auth_mode)
    box = BoxClient(config)
    db = Database(config)
    notifier = Notifier(config)
    logger.info("Testing connections...")
    if not dam.test_connection():
        logger.error("DAM connection failed - exiting")
        sys.exit(1)
    if not box.test_connection():
        logger.error("Box connection failed - exiting")
        sys.exit(1)
    if not db.test_connection():
        logger.error("Database connection failed - exiting")
        sys.exit(1)
    logger.info("All connections OK")
    logger.info("")
    try:
        logger.info("Searching for B4 campaigns...")
        campaigns = dam.search_campaigns(status='B4')
        if not campaigns:
            logger.info("No B4 campaigns found - exiting")
            db.close()
            sys.exit(0)
        logger.info("Found {} B4 campaign(s) - processing all".format(len(campaigns)))
        logger.info("")
        processed_count = 0
        failed_count = 0
        already_processed_count = 0
        for campaign in campaigns:
            result = process_campaign(campaign, dam, box, db, notifier, config)
            if result['success']:
                if result.get('processed'):
                    processed_count += 1
                if result.get('already_processed'):
                    already_processed_count += 1
            else:
                failed_count += 1
        logger.info("")
        logger.info("=" * 60)
        logger.info("B4 Box Uploader Summary")
        logger.info("=" * 60)
        logger.info("Total campaigns found: {}".format(len(campaigns)))
        logger.info("Processed (CSV updated): {}".format(processed_count))
        logger.info("Already processed: {}".format(already_processed_count))
        logger.info("Failed: {}".format(failed_count))
        logger.info("=" * 60)
        db.close()
        if failed_count == 0:
            sys.exit(0)
        elif processed_count > 0:
            sys.exit(0)
        else:
            sys.exit(1)
    except Exception as e:
        logger.critical("Script error: {}".format(str(e)))
        notifier.send_email(
            template_name='upload_failed',
            recipients=config['notifications']['recipients']['critical'],
            data={
                'filename': 'B4 Box Uploader',
                'tracking_id': 'N/A',
                'error': str(e)
            }
        )
        db.close()
        sys.exit(1)
if __name__ == '__main__':
    main()
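Reviewer note: the exit-status rule at the end of `main()` is compact enough to state as a function. The sketch below mirrors the three branches in the diff; `exit_code` is a hypothetical name and nothing here calls `sys.exit`.

```python
def exit_code(processed_count, failed_count):
    """Mirror of the B4 uploader's exit policy.

    0 when nothing failed, 0 when failures were mixed with at least one
    newly processed campaign, 1 when every attempted campaign failed.
    """
    if failed_count == 0:
        return 0
    if processed_count > 0:
        return 0
    return 1


codes = {
    'all_ok': exit_code(processed_count=3, failed_count=0),
    'only_already_processed': exit_code(processed_count=0, failed_count=0),
    'partial_failure': exit_code(processed_count=2, failed_count=1),
    'total_failure': exit_code(processed_count=0, failed_count=3),
}
```

One consequence worth noting for whoever schedules the job: a partial failure still exits 0, so a monitor that watches only the exit status will not see it and the log summary is the place to look.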


@ -0,0 +1,203 @@
#!/usr/bin/env python3
"""
One-shot backfill: Populate creativex_scores with status='b1-master-cx-score'
for B1B2 global masters already in master_assets that don't yet have a row.
Identification rule:
tracking_id LIKE 'M%' AND local_campaign_id IS NULL AND status = 'active'
B1B2 stores masters without local_campaign_id; A1A2 always sets it, so this
cleanly separates global from local masters that share the M-prefix.
The CX score is read out of master_assets.full_metadata JSONB. Rows where the
DAM metadata has no CreativeX score AND no URL are reported but skipped.
db.store_creativex_score(..., status='b1-master-cx-score') already dedupes by
tracking_id, so re-running is safe.
Usage:
python scripts/backfill_b1_creativex_scores.py # apply
python scripts/backfill_b1_creativex_scores.py --dry-run # preview only
"""
import sys
import os
import argparse
import logging
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))
from shared.config_loader import load_config
from shared.database import Database
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger('B1CXBackfill')
def _walk_metadata_elements(elements):
    """Recursively yield every element in nested metadata_element_list arrays."""
    for e in elements or []:
        if not isinstance(e, dict):
            continue
        yield e
        nested = e.get('metadata_element_list')
        if isinstance(nested, list):
            for sub in _walk_metadata_elements(nested):
                yield sub
def extract_creativex_from_dam_metadata(asset_metadata):
    """Mirror of the extractor in b1_to_b2_download.py, duplicated here
    to keep the backfill script self-contained (avoids triggering
    b1_to_b2_download's module-level logging setup on import).
    Walks recursively: the score field is at depth 2 (nested inside
    FERRERO.TABULAR.FIELD.CREATIVEX, which lives inside a category)."""
    try:
        top = (asset_metadata or {}).get('metadata', {}).get('metadata_element_list', [])
        cx = {'score': None, 'url': None}
        for element in _walk_metadata_elements(top):
            element_id = element.get('id')
            if element_id == 'FERRERO.TAB.FIELD.CREATIVEX':
                values = element.get('values', [])
                if values:
                    value_obj = values[0].get('value', {})
                    if isinstance(value_obj, dict):
                        field_value = value_obj.get('field_value', {})
                        if isinstance(field_value, dict):
                            score = field_value.get('value')
                            if score:
                                cx['score'] = str(score)
            elif element_id == 'FERRERO.FIELD.CREATIVEX LINK':
                value_obj = element.get('value', {})
                if isinstance(value_obj, dict):
                    nested = value_obj.get('value', {})
                    if isinstance(nested, dict):
                        url = nested.get('value')
                        if url:
                            cx['url'] = url
        return cx
    except Exception as e:
        logger.warning('Failed to extract CreativeX from metadata: %s', e)
        return {'score': None, 'url': None}
def fetch_b1_masters(db):
conn = db.get_connection()
try:
cursor = conn.cursor()
cursor.execute("""
SELECT tracking_id, original_filename, file_extension,
full_metadata, description
FROM master_assets
WHERE tracking_id LIKE 'M%'
AND local_campaign_id IS NULL
AND status = 'active'
ORDER BY created_at
""")
rows = cursor.fetchall()
return [
{
'tracking_id': r[0],
'filename': (r[1] or '') + (r[2] or ''),
'full_metadata': r[3] if isinstance(r[3], dict) else (r[3] or {}),
'box_file_id': Database.parse_box_info_from_description(r[4]).get('box_file_id') or '',
}
for r in rows
]
finally:
cursor.close()
db.put_connection(conn)
def existing_cx_tracking_ids(db):
"""Return set of tracking_ids that already have a b1-master-cx-score row."""
conn = db.get_connection()
try:
cursor = conn.cursor()
cursor.execute("""
SELECT DISTINCT tracking_id
FROM creativex_scores
WHERE status = 'b1-master-cx-score'
AND tracking_id IS NOT NULL
""")
return {row[0] for row in cursor.fetchall()}
finally:
cursor.close()
db.put_connection(conn)
def main():
parser = argparse.ArgumentParser(description='Backfill B1 master CreativeX scores')
parser.add_argument('--dry-run', action='store_true',
help='Report what would be inserted without touching the DB')
args = parser.parse_args()
config = load_config('config/config.yaml')
db = Database(config)
if not db.test_connection():
logger.error('Database connection failed')
sys.exit(1)
masters = fetch_b1_masters(db)
already_have = existing_cx_tracking_ids(db)
logger.info('Scanned %d B1 global masters in master_assets', len(masters))
logger.info('Existing b1-master-cx-score rows: %d', len(already_have))
inserted = 0
skipped_no_cx = 0
skipped_already = 0
for m in masters:
if m['tracking_id'] in already_have:
skipped_already += 1
continue
cx = extract_creativex_from_dam_metadata(m['full_metadata'])
if not (cx['score'] or cx['url']):
skipped_no_cx += 1
logger.debug('No CX in metadata for %s (%s)', m['tracking_id'], m['filename'])
continue
if args.dry_run:
logger.info('[DRY-RUN] Would insert: %s | %s | score=%s url=%s',
m['tracking_id'], m['filename'], cx['score'], cx['url'])
inserted += 1
continue
result = db.store_creativex_score(
filename=m['filename'],
creativex_id='',
creativex_url=cx['url'] or '',
quality_score=cx['score'] or '',
box_file_id=m['box_file_id'],
full_extraction_data={'master_metadata': True, 'source': 'b1_backfill', 'data': cx},
tracking_id=m['tracking_id'],
status='b1-master-cx-score'
)
if result.get('success'):
if result.get('already_exists'):
# Race or stale already_have set — count as already
skipped_already += 1
else:
inserted += 1
logger.info('Inserted: %s | %s | score=%s', m['tracking_id'], m['filename'], cx['score'])
else:
logger.error('Failed for %s: %s', m['tracking_id'], result.get('error'))
logger.info('=' * 60)
logger.info('Backfill summary%s:', ' (DRY-RUN)' if args.dry_run else '')
logger.info(' Scanned B1 masters: %d', len(masters))
logger.info(' Already had CX row: %d', skipped_already)
logger.info(' No CX in metadata: %d', skipped_no_cx)
logger.info(' %s: %d', 'Would insert' if args.dry_run else 'Inserted', inserted)
logger.info('=' * 60)
db.close()
if __name__ == '__main__':
main()

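The backfill script above depends on a recursive walk because the score field sits two levels below the top-level list. The following is a minimal sketch of that idea on a hand-made metadata blob. The sample structure and the `EXAMPLE.*` IDs are assumptions for illustration only; the one real ID is the score field ID that the script compares against.

```python
def walk_metadata_elements(elements):
    """Yield every dict element, descending into nested metadata_element_list arrays."""
    for element in elements or []:
        if not isinstance(element, dict):
            continue
        yield element
        nested = element.get('metadata_element_list')
        if isinstance(nested, list):
            yield from walk_metadata_elements(nested)


# Hypothetical sample: category (depth 0) -> table (depth 1) -> score field (depth 2).
sample = {
    'metadata': {
        'metadata_element_list': [
            {
                'id': 'EXAMPLE.CATEGORY',
                'metadata_element_list': [
                    {
                        'id': 'EXAMPLE.TABLE',
                        'metadata_element_list': [
                            {
                                'id': 'FERRERO.TAB.FIELD.CREATIVEX',
                                'values': [{'value': {'field_value': {'value': 87}}}],
                            }
                        ],
                    }
                ],
            },
            'not-a-dict',  # non-dict entries are skipped by the walker
        ]
    }
}

top = sample['metadata']['metadata_element_list']

# Flat iteration only sees the top-level category.
flat_ids = [e.get('id') for e in top if isinstance(e, dict)]

# The recursive walk reaches the nested score field as well.
all_ids = [e.get('id') for e in walk_metadata_elements(top)]
```

With this sample, flat iteration never reaches the score field, which is the failure mode the recursive walk is meant to avoid.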
@ -0,0 +1,117 @@
#!/usr/bin/env python3
"""
Campaign Status Check - Read-only lookup of a campaign's current status on the DAM

Searches all A#/B# statuses for a campaign by number or partial name and prints
the current status. Makes no changes.

Compatible with Python 3.6+
"""
import sys
import os
import logging
import argparse

# Add shared library to path
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))

from shared.config_loader import load_config
from shared.dam_client import DAMClient
from scripts.update_campaign_status import find_campaign_by_identifier

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger('CheckStatus')


def main():
    parser = argparse.ArgumentParser(
        description='Check the current status of a campaign on the DAM (read-only)',
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  # Check campaign C000000078 (dev environment, OAuth)
  python scripts/check_campaign_status.py --camp C000000078

  # Check by partial name
  python scripts/check_campaign_status.py --camp "CONTENT SCALING"

  # Production environment with mTLS V2
  python scripts/check_campaign_status.py --camp C000000078 --auth-pfx-v2 --env prod
"""
    )
    parser.add_argument('--camp', type=str, required=True,
                        help='Campaign number (e.g., C000000078) or partial campaign name')
    parser.add_argument('--auth-pfx', action='store_true',
                        help='Use mTLS certificate authentication (Legacy APIM)')
    parser.add_argument('--auth-pfx-v2', action='store_true',
                        help='Use mTLS V2 (Hybrid) authentication')
    parser.add_argument('--env', type=str, choices=['dev', 'prod'], default='dev',
                        help='Environment: dev (default) or prod')
    args = parser.parse_args()

    auth_mode = 'oauth'
    if args.auth_pfx_v2:
        auth_mode = 'mtls_v2'
    elif args.auth_pfx:
        auth_mode = 'mtls'

    os.environ['ENV'] = args.env

    print("")
    print("=" * 70)
    print("Ferrero Campaign Status Check")
    print("=" * 70)
    print("Campaign Identifier: {}".format(args.camp))
    print("Environment: {}".format(args.env.upper()))
    if auth_mode == 'mtls_v2':
        print("Authentication: mTLS V2 (Hybrid)")
    elif auth_mode == 'mtls':
        print("Authentication: mTLS Certificate (Legacy)")
    else:
        print("Authentication: OAuth2 (default)")
    print("=" * 70)
    print("")

    config = load_config('config/config.yaml')
    dam = DAMClient(config, auth_mode=auth_mode)

    logger.info("Testing DAM connection...")
    if not dam.test_connection():
        logger.error("DAM connection failed - exiting")
        sys.exit(1)
    logger.info("DAM connection OK")
    print("")

    campaigns = find_campaign_by_identifier(dam, args.camp)
    if not campaigns:
        print("")
        print("=" * 70)
        print("No campaigns found matching: {}".format(args.camp))
        print("=" * 70)
        print("")
        print("Searched statuses: A1, A2, A3, A4, A5, A6, B1, B2")
        print("Try:")
        print(" - Exact campaign number: C000000078")
        print(" - Partial campaign name: CONTENT SCALING")
        sys.exit(1)

    print("")
    print("=" * 70)
    print("Found {} matching campaign(s)".format(len(campaigns)))
    print("=" * 70)
    print("")
    for i, campaign in enumerate(campaigns, 1):
        print("{}. {}".format(i, campaign.get('campaign_name', 'Unknown')))
        print(" Campaign Number: {}".format(campaign.get('campaign_id', 'N/A')))
        print(" Current Status: {}".format(campaign['current_status']))
        print(" DAM Asset ID: {}".format(campaign.get('asset_id', 'N/A')))
        print("")


if __name__ == '__main__':
    main()

@ -0,0 +1,162 @@
#!/usr/bin/env python3
"""
Diagnostic: Inspect what metadata B1 global masters actually carry in
master_assets.full_metadata, so we can tell why the CX backfill found 0.

Two checks:
  1. Top-level keys of full_metadata (does the structure even contain
     metadata.metadata_element_list?).
  2. Across a larger sample, count occurrences of any element_id that
     looks CX/score/quality-related (case-insensitive). This surfaces the
     actual element IDs used by client B1 masters, in case they differ
     from the A1 IDs the extractor expects.

Read-only. Safe to run any time.

Usage:
    python scripts/diagnose_b1_master_metadata.py
    python scripts/diagnose_b1_master_metadata.py --sample 200
"""
import sys
import os
import json
import argparse
import logging
from collections import Counter

sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))

from shared.config_loader import load_config
from shared.database import Database

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger('B1MetaDiag')

CX_HINTS = ('creativex', 'cx', 'score', 'quality')


def walk_elements(elements, depth=0):
    """Recursively yield (depth, element) for every element in a nested
    metadata_element_list. Categories and tables both contain nested
    metadata_element_list arrays, so flat iteration misses everything below
    the top level."""
    for e in elements or []:
        if not isinstance(e, dict):
            continue
        yield depth, e
        nested = e.get('metadata_element_list')
        if isinstance(nested, list):
            for sub in walk_elements(nested, depth + 1):
                yield sub


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--sample', type=int, default=100,
                        help='How many B1 masters to scan for element-ID counts (default 100)')
    parser.add_argument('--show-full', type=int, default=2,
                        help='How many sample full_metadata blobs to dump in full (default 2)')
    args = parser.parse_args()

    config = load_config('config/config.yaml')
    db = Database(config)
    if not db.test_connection():
        sys.exit(1)

    conn = db.get_connection()
    try:
        cursor = conn.cursor()
        cursor.execute("""
            SELECT tracking_id, original_filename, full_metadata
            FROM master_assets
            WHERE tracking_id LIKE 'M%%'
              AND local_campaign_id IS NULL
              AND status = 'active'
            ORDER BY created_at DESC
            LIMIT %s
        """, (args.sample,))
        rows = cursor.fetchall()
    finally:
        cursor.close()
        db.put_connection(conn)

    logger.info('Sampled %d B1 global masters', len(rows))

    # 1. Top-level structure check
    top_key_counter = Counter()
    has_meta_list = 0
    empty_full_meta = 0
    for r in rows:
        full = r[2] if isinstance(r[2], dict) else (r[2] or {})
        if not full:
            empty_full_meta += 1
            continue
        for k in full.keys():
            top_key_counter[k] += 1
        meta = full.get('metadata')
        if isinstance(meta, dict) and isinstance(meta.get('metadata_element_list'), list):
            has_meta_list += 1

    logger.info('=' * 60)
    logger.info('Top-level keys present in full_metadata (count of rows containing the key):')
    for k, c in top_key_counter.most_common():
        logger.info(' %-30s %d', k, c)
    logger.info('Rows with empty full_metadata: %d', empty_full_meta)
    logger.info('Rows with metadata.metadata_element_list: %d', has_meta_list)
    logger.info('=' * 60)

    # 2. Recursive hunt for CX-flavored element IDs (nested metadata_element_list)
    id_counter = Counter()
    cx_id_depth = {}  # eid -> depth at which it was first seen
    cx_id_counter = Counter()
    rows_with_cx_hint = 0
    max_depth_seen = 0
    for r in rows:
        full = r[2] if isinstance(r[2], dict) else (r[2] or {})
        top_list = (full.get('metadata') or {}).get('metadata_element_list') or []
        row_had_hint = False
        for depth, e in walk_elements(top_list):
            if depth > max_depth_seen:
                max_depth_seen = depth
            eid = (e.get('id') or '').strip()
            if not eid:
                continue
            id_counter[eid] += 1
            lower = eid.lower()
            if any(h in lower for h in CX_HINTS):
                cx_id_counter[eid] += 1
                cx_id_depth.setdefault(eid, depth)
                row_had_hint = True
        if row_had_hint:
            rows_with_cx_hint += 1

    logger.info('Distinct element_ids seen across sample (any depth): %d', len(id_counter))
    logger.info('Max nesting depth observed: %d', max_depth_seen)
    logger.info('Rows containing at least one CX-flavored element_id: %d / %d',
                rows_with_cx_hint, len(rows))
    logger.info('-' * 60)
    if cx_id_counter:
        logger.info('CX/score/quality-flavored element_ids found (id @ depth, count):')
        for eid, c in cx_id_counter.most_common():
            logger.info(' %-50s @depth %d %d', eid, cx_id_depth[eid], c)
    else:
        logger.info('NO CX/score/quality-flavored element_ids found at any depth.')
        logger.info('Likely: client B1 masters were uploaded before CX scoring ran on them.')
    logger.info('=' * 60)

    # 3. Dump first few full blobs verbatim for manual inspection
    if args.show_full > 0:
        logger.info('First %d full_metadata blobs (truncated to 4KB each):', args.show_full)
        for r in rows[:args.show_full]:
            full = r[2] if isinstance(r[2], dict) else (r[2] or {})
            blob = json.dumps(full, indent=2, default=str)
            if len(blob) > 4096:
                blob = blob[:4096] + '\n... [truncated]'
            logger.info('--- %s (%s) ---\n%s', r[0], r[1], blob)

    db.close()


if __name__ == '__main__':
    main()

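The second check in the diagnostic above amounts to a substring match over element IDs plus a record of the depth at which each matching ID first appears. The sketch below reproduces that counting on a small hand-made tree. `count_hinted_ids` and the `EXAMPLE.*` IDs are hypothetical names used only for illustration. Note that substring matching is deliberately broad: any ID containing `cx` or `score` is counted, whether or not it relates to CreativeX.

```python
from collections import Counter

CX_HINTS = ('creativex', 'cx', 'score', 'quality')


def walk_with_depth(elements, depth=0):
    """Yield (depth, element) pairs for every dict in a nested element list."""
    for element in elements or []:
        if not isinstance(element, dict):
            continue
        yield depth, element
        nested = element.get('metadata_element_list')
        if isinstance(nested, list):
            yield from walk_with_depth(nested, depth + 1)


def count_hinted_ids(top_list):
    """Count IDs containing any hint (case-insensitive) and note their first depth."""
    counts = Counter()
    first_depth = {}
    for depth, element in walk_with_depth(top_list):
        element_id = (element.get('id') or '').strip()
        if element_id and any(h in element_id.lower() for h in CX_HINTS):
            counts[element_id] += 1
            first_depth.setdefault(element_id, depth)
    return counts, first_depth


# Hypothetical tree; only the CREATIVEX ID is taken from the scripts above.
tree = [
    {'id': 'EXAMPLE.CATEGORY', 'metadata_element_list': [
        {'id': 'EXAMPLE.QUALITY.FLAG'},
        {'id': 'EXAMPLE.TABLE', 'metadata_element_list': [
            {'id': 'FERRERO.TAB.FIELD.CREATIVEX'},
        ]},
    ]},
    {'id': 'EXAMPLE.TITLE'},
]

counts, first_depth = count_hinted_ids(tree)
```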
@ -75,6 +75,12 @@ TASKS = [
'interval_minutes': 10,
'args': ['--auth-pfx-v2'] # Production uses mTLS V2
},
{
'name': 'B4 Box Uploader',
'script': 'scripts/b4_box_uploader.py',
'interval_minutes': 10,
'args': ['--auth-pfx-v2'] # Production uses mTLS V2
},
{
'name': 'Daily Report',
'script': 'scripts/daily_report.py',
@ -84,9 +90,77 @@ TASKS = [
}
]
# ==========================================
# OFF-HOURS CONFIGURATION
# ==========================================
# Off-hours definition
OFF_HOURS_CONFIG = {
'enabled': True, # Set to False to disable off-hours slowdown
'extra_minutes': 30, # Minutes to add to intervals during off-hours
# Late night: 10 PM (22:00) to 5 AM (05:00) every day
'late_night_start': 22, # Hour (0-23)
'late_night_end': 5, # Hour (0-23)
# Weekend: All day Saturday and Sunday
'weekend_days': [5, 6], # 0=Monday, 5=Saturday, 6=Sunday
# Tasks exempt from off-hours slowdown (always run at normal cadence)
'exempt_tasks': [
'Daily Report' # Task name to exclude (runs at 7 PM regardless)
]
}
LOCK_DIR = 'locks'
STATE_FILE = 'orchestrator_state.json'
# ==========================================
# OFF-HOURS DETECTION
# ==========================================
def is_off_hours(now=None):
"""
Determine if current time is in off-hours period
Args:
now: datetime object (defaults to current time)
Returns:
bool: True if in off-hours, False otherwise
"""
if not OFF_HOURS_CONFIG['enabled']:
return False
if now is None:
now = datetime.now()
current_hour = now.hour
current_weekday = now.weekday() # 0=Monday, 6=Sunday
# Check if weekend (all day Saturday or Sunday)
if current_weekday in OFF_HOURS_CONFIG['weekend_days']:
logger.debug("Off-hours: Weekend (day {})".format(current_weekday))
return True
# Check if late night
late_night_start = OFF_HOURS_CONFIG['late_night_start']
late_night_end = OFF_HOURS_CONFIG['late_night_end']
if late_night_start > late_night_end:
# Wraps around midnight (e.g., 22:00 to 5:00)
is_late_night = current_hour >= late_night_start or current_hour < late_night_end
else:
# Same day range (e.g., 1:00 to 5:00)
is_late_night = late_night_start <= current_hour < late_night_end
if is_late_night:
logger.debug("Off-hours: Late night (hour {})".format(current_hour))
return True
logger.debug("Business hours (hour {}, weekday {})".format(current_hour, current_weekday))
return False
# ==========================================
# CORE CLASSES
# ==========================================
@ -177,22 +251,55 @@ class TaskRunner:
now = datetime.now()
current_hour = now.hour
current_minute = now.minute
logger.info(f"Orchestrator tick: {now.strftime('%Y-%m-%d %H:%M:%S')}")
# Determine if we're in off-hours
in_off_hours = is_off_hours(now)
if in_off_hours:
logger.info("=" * 80)
logger.info("Orchestrator tick: {} [OFF-HOURS MODE]".format(now.strftime('%Y-%m-%d %H:%M:%S')))
logger.info("Adding {} minutes to all task intervals".format(OFF_HOURS_CONFIG['extra_minutes']))
logger.info("=" * 80)
else:
logger.info("Orchestrator tick: {} [NORMAL MODE]".format(now.strftime('%Y-%m-%d %H:%M:%S')))
for task in TASKS:
# Check for specific hour schedule
task_name = task['name']
# Check for specific hour schedule (e.g., Daily Report at 7 PM)
if 'run_at_hour' in task:
target_hour = task['run_at_hour']
# Run only at the top of the hour (minute 0)
if current_hour == target_hour and current_minute == 0:
logger.info("Scheduled task '{}' due at {}:00".format(task_name, target_hour))
self.run_task(task)
continue
# Standard interval check
interval = task.get('interval_minutes', 5)
if interval > 0 and current_minute % interval == 0:
self.run_task(task)
# Standard interval check with off-hours adjustment
base_interval = task.get('interval_minutes', 5)
# Check if task is exempt from off-hours slowdown
is_exempt = task_name in OFF_HOURS_CONFIG['exempt_tasks']
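# In off-hours, non-exempt tasks are throttled to 30-minute boundaries
if in_off_hours and not is_exempt:
# Task runs only if BOTH hold:
# 1. Current minute matches base interval (normal check)
# 2. Current minute is a 30-minute boundary (0 or 30)
# Effective cadence is therefore 30 minutes for intervals that divide 30
# (e.g. 5 or 10), not base interval + OFF_HOURS_CONFIG['extra_minutes'].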
if base_interval > 0:
matches_interval = current_minute % base_interval == 0
at_boundary = current_minute % 30 == 0
if matches_interval and at_boundary:
logger.info("Task '{}' due (off-hours: {}min + 30min cadence)".format(
task_name, base_interval
))
self.run_task(task)
else:
# Normal business hours OR exempt task
if base_interval > 0 and current_minute % base_interval == 0:
logger.info("Task '{}' due ({}min interval)".format(task_name, base_interval))
self.run_task(task)
def main():
parser = argparse.ArgumentParser(description='Ferrero Orchestrator')
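
The off-hours branch in this hunk combines two modulo tests, so a task fires only on minutes divisible by both its base interval and 30. That is not the same as adding 30 minutes to the interval. The sketch below is a minimal reconstruction under the defaults shown in OFF_HOURS_CONFIG (22:00 to 05:00, Saturday and Sunday). `in_late_night` and `due_minutes` are hypothetical helper names, not part of the orchestrator.

```python
from datetime import datetime


def in_late_night(hour, start=22, end=5):
    """True if hour falls in [start, end), allowing the window to wrap past midnight."""
    if start > end:
        return hour >= start or hour < end
    return start <= hour < end


def is_off_hours(now, weekend_days=(5, 6)):
    """Weekend all day, or late night on any day (0=Monday ... 6=Sunday)."""
    return now.weekday() in weekend_days or in_late_night(now.hour)


def due_minutes(base_interval, off_hours, boundary=30):
    """Minutes of the hour at which a task fires under the hunk's modulo checks."""
    if base_interval <= 0:
        return []
    return [
        minute for minute in range(60)
        if minute % base_interval == 0
        and (not off_hours or minute % boundary == 0)
    ]


saturday_noon = datetime(2026, 5, 16, 12, 0)
tuesday_noon = datetime(2026, 5, 19, 12, 0)
tuesday_late = datetime(2026, 5, 19, 23, 0)
```

Under these assumptions a 10-minute task fires at :00 and :30 during off-hours, while a 20-minute task fires only at :00, because 20 and 30 share no other common multiple below 60.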

@ -75,6 +75,12 @@ TASKS = [
'interval_minutes': 10,
'args': [] # Temporarily using OAuth instead of --auth-pfx-v2
},
{
'name': 'B4 Box Uploader',
'script': 'scripts/b4_box_uploader.py',
'interval_minutes': 10,
'args': [] # Temporarily using OAuth instead of --auth-pfx-v2
},
{
'name': 'Daily Report',
'script': 'scripts/daily_report.py',

@ -583,6 +583,9 @@ class DAMClient:
# If extension has spaces in it, it's not a real extension
elif ' ' in ext:
is_folder = True
# Numeric-only extension = version number (e.g. "WND_PCS 2026 2.0"), not a file
elif ext[1:].isdigit():
is_folder = True
else:
# Has an extension-like string, but not in our known list
# Could be an uncommon file type - assume it's a file to be safe
@ -1201,6 +1204,151 @@ class DAMClient:
mime_type, _ = mimetypes.guess_type(file_path)
return mime_type or 'application/octet-stream'
def register_master_asset_id_domain_value(self, master_asset_id):
"""
Register a master asset ID in the FERRERO_MASTER_ASSET_ID lookup domain.
Required in PPR environment before using the ID in asset creation.
The OpenText API does not support creating new domain values during asset
creation, so this must be called before the create asset API.
Args:
master_asset_id: The master asset ID to register
Returns:
dict with success, http_code, and optional error
"""
# Only for PPR environment
if 'ppr' not in self.base_url.lower():
return {'success': True, 'skipped': True, 'reason': 'Not PPR environment'}
try:
payload = {
"domain_value_resource": {
"domain_value": {
"description": master_asset_id,
"display_value": master_asset_id,
"field_value": {
"type": "string",
"value": master_asset_id
}
}
}
}
logger.info("PPR: Registering master asset ID '{}' in lookup domain...".format(master_asset_id))
response = self._make_api_request(
'POST',
"{}/v6/lookupdomains/FERRERO_MASTER_ASSET_ID/lookupvalues".format(self.base_url),
json=payload,
headers={
'Content-Type': 'application/json',
'Accept': 'application/json'
}
)
# Success cases
if response.status_code in [200, 201, 202]:
logger.info("PPR: Master asset ID '{}' registered successfully".format(master_asset_id))
return {
'success': True,
'http_code': response.status_code,
'already_existed': False
}
# Already exists - OpenText returns 409 OR 500 with "duplicate code" message
if response.status_code == 409:
logger.info("PPR: Master asset ID '{}' already exists in lookup domain".format(master_asset_id))
return {
'success': True,
'http_code': response.status_code,
'already_existed': True
}
# Check for duplicate error in 500 response (OpenText quirk)
if response.status_code == 500:
try:
error_data = response.json()
error_msg = error_data.get('exception_body', {}).get('message', '')
if 'duplicate' in error_msg.lower():
logger.info("PPR: Master asset ID '{}' already exists in lookup domain".format(master_asset_id))
return {
'success': True,
'http_code': response.status_code,
'already_existed': True
}
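except (ValueError, AttributeError):
# Body was not JSON or not the expected shape; fall through to the failure path
pass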
# Actual failure
error_msg = "Failed to register master asset ID '{}': HTTP {} - {}".format(
master_asset_id,
response.status_code,
response.text[:200] if response.text else 'No response'
)
logger.warning(error_msg)
return {
'success': False,
'http_code': response.status_code,
'error': error_msg
}
except Exception as e:
error_msg = "Exception registering master asset ID '{}': {}".format(master_asset_id, str(e))
logger.error(error_msg)
return {
'success': False,
'error': error_msg
}
def register_master_asset_ids_for_ppr(self, master_asset_ids):
"""
Register all master asset IDs in the lookup domain.
Call this before creating an asset that references these IDs.
The OpenText DAM API does not support creating new domain values during
asset creation. We must first add each master asset ID to the
FERRERO_MASTER_ASSET_ID domain value table before the create asset call.
Args:
master_asset_ids: List of master asset IDs to register
Returns:
dict with success, registered_ids, failed_ids
"""
if not master_asset_ids:
return {'success': True, 'registered_ids': [], 'failed_ids': []}
logger.info("=" * 60)
logger.info("Registering {} master asset ID(s) in lookup domain".format(len(master_asset_ids)))
logger.info(" IDs: {}".format(', '.join(master_asset_ids)))
logger.info("=" * 60)
registered = []
failed = []
for master_id in master_asset_ids:
result = self.register_master_asset_id_domain_value(master_id)
if result.get('success'):
registered.append(master_id)
else:
failed.append({'id': master_id, 'error': result.get('error')})
logger.info("Domain registration complete - {}/{} succeeded".format(
len(registered), len(master_asset_ids)))
if failed:
logger.warning("Failed to register: {}".format(
', '.join([f['id'] for f in failed])))
# 'success' is False if any ID failed; the caller decides whether to attempt the upload anyway
return {
'success': len(failed) == 0,
'registered_ids': registered,
'failed_ids': failed
}
def get_or_create_subfolder_path(self, base_folder_id, subfolder_path):
"""
Create or find subfolder structure in DAM matching Box structure
@ -1235,14 +1383,10 @@ class DAMClient:
current_folder_id = existing
logger.info("Found existing folder: {} (ID: {})".format(folder_name, current_folder_id))
else:
# Create it
new_id = self._create_folder(current_folder_id, folder_name)
if new_id:
current_folder_id = new_id
logger.info("Created folder: {} (ID: {})".format(folder_name, current_folder_id))
else:
logger.error("Failed to create folder: {}".format(folder_name))
return base_folder_id # Return base folder if creation fails
# Folder doesn't exist - DAM doesn't allow folder creation via API
# Upload to parent folder instead
logger.warning("Folder '{}' not found in DAM. DAM does not allow folder creation. Files will be uploaded to parent folder.".format(folder_name))
return current_folder_id # Return current parent folder instead of trying to create
return current_folder_id
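
The registration method in this file treats three outcomes as "the ID is usable": a 2xx response, a 409, and a 500 whose message mentions a duplicate. The sketch below isolates that classification as a pure function so it can be exercised without a DAM connection. `classify_registration` is a hypothetical name, and the JSON shape (`exception_body.message`) is taken from the hunk above rather than from any API documentation.

```python
import json


def classify_registration(status_code, body_text):
    """Return 'created', 'already_existed', or 'failed' for a lookup-value POST."""
    if status_code in (200, 201, 202):
        return 'created'
    if status_code == 409:
        return 'already_existed'
    if status_code == 500:
        # Per the hunk above, a duplicate can surface as HTTP 500 with a
        # "duplicate" message; anything unparseable counts as a real failure.
        try:
            message = json.loads(body_text)['exception_body']['message']
            if 'duplicate' in message.lower():
                return 'already_existed'
        except (ValueError, KeyError, TypeError, AttributeError):
            pass
    return 'failed'
```

Catching only the parse and shape errors, rather than using a bare `except`, keeps interrupts and unrelated bugs from being silently swallowed.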

@ -148,7 +148,45 @@ class Database:
cursor.close()
self.put_connection(conn)
def store_master_asset(self, tracking_id, opentext_id, asset_data, box_file_id, box_url, upload_folder_id, global_master_campaign_id=None, global_master_folder_id=None, local_campaign_id=None):
def find_global_master_by_opentext_id(self, opentext_id):
"""
Look up a B1B2 global master asset by opentext_id.
Returns the M-prefixed tracking ID if a matching global master exists.
Args:
opentext_id: DAM asset ID to search for
Returns:
str: M-prefixed tracking ID if found, None otherwise
"""
conn = self.get_connection()
try:
cursor = conn.cursor()
cursor.execute("""
SELECT tracking_id FROM master_assets
WHERE opentext_id = %s
AND tracking_id LIKE 'M%%'
AND status = 'active'
LIMIT 1
""", (opentext_id,))
row = cursor.fetchone()
if row:
logger.info("Found global master tracking ID {} for opentext_id {}".format(
row[0], opentext_id
))
return row[0]
else:
logger.debug("No global master found for opentext_id {}".format(opentext_id))
return None
finally:
cursor.close()
self.put_connection(conn)
def store_master_asset(self, tracking_id, opentext_id, asset_data, box_file_id, box_url, upload_folder_id, global_master_campaign_id=None, global_master_folder_id=None, local_campaign_id=None, global_master_tracking_id=None):
"""
Store master asset with FULL metadata in JSONB column
@ -162,6 +200,7 @@ class Database:
global_master_campaign_id: Global master campaign ID (from GLOBAL CAMPAIGN REFERENCE)
global_master_folder_id: Global master folder ID
local_campaign_id: Local campaign ID (immediate campaign this asset belongs to)
global_master_tracking_id: M-prefixed tracking ID from B1B2 global master (if found)
Returns:
dict with success boolean
@ -190,9 +229,10 @@ class Database:
tracking_id, opentext_id, original_filename, file_extension,
file_size_bytes, mime_type, upload_directory,
description, full_metadata, status,
global_master_campaign_id, global_master_folder_id, local_campaign_id
global_master_campaign_id, global_master_folder_id, local_campaign_id,
global_master_tracking_id
) VALUES (
%s, %s, %s, %s, %s, %s, %s, %s, %s, 'active', %s, %s, %s
%s, %s, %s, %s, %s, %s, %s, %s, %s, 'active', %s, %s, %s, %s
)
ON CONFLICT (tracking_id) DO UPDATE SET
upload_directory = EXCLUDED.upload_directory,
@ -201,6 +241,7 @@ class Database:
global_master_campaign_id = EXCLUDED.global_master_campaign_id,
global_master_folder_id = EXCLUDED.global_master_folder_id,
local_campaign_id = EXCLUDED.local_campaign_id,
global_master_tracking_id = EXCLUDED.global_master_tracking_id,
updated_at = CURRENT_TIMESTAMP
""", (
tracking_id,
@ -214,7 +255,8 @@ class Database:
full_metadata_json,
global_master_campaign_id,
global_master_folder_id,
local_campaign_id
local_campaign_id,
global_master_tracking_id
))
conn.commit()
@ -256,18 +298,49 @@ class Database:
# Parse JSONB as dict
full_metadata = row[3] if isinstance(row[3], dict) else json.loads(row[3])
# Parse Box info from description
box_info = self.parse_box_info_from_description(row[4])
return {
'tracking_id': row[0],
'opentext_id': row[1],
'upload_directory': row[2],
'full_metadata': full_metadata,
'description': row[4]
'description': row[4],
'box_file_id': box_info.get('box_file_id'),
'box_url': box_info.get('box_url')
}
finally:
cursor.close()
self.put_connection(conn)
@staticmethod
def parse_box_info_from_description(description):
"""
Parse Box file ID and URL from master asset description field.
Description format:
Box File ID: {id}
Box URL: {url}
DAM Asset ID: {opentext_id}
Returns:
dict with box_file_id and box_url (None if not found)
"""
result = {'box_file_id': None, 'box_url': None}
if not description:
return result
for line in description.split('\n'):
line = line.strip()
if line.startswith('Box File ID:'):
result['box_file_id'] = line.split(':', 1)[1].strip()
elif line.startswith('Box URL:'):
result['box_url'] = line.split(':', 1)[1].strip()
return result
def check_campaign_upload_complete(self, campaign_id):
"""
Check if ALL master assets for a campaign have been uploaded
@ -519,6 +592,160 @@ class Database:
cursor.close()
self.put_connection(conn)
def get_a1_retry_status(self, campaign_id):
"""
Get A1 retry status for campaign
Args:
campaign_id: DAM campaign folder ID
Returns:
dict with retry_count, last_retry_at, permanently_failed, failure_reason
Returns None if campaign not found
"""
conn = self.get_connection()
try:
cursor = conn.cursor()
cursor.execute("""
SELECT a1_retry_count, a1_last_retry_at,
a1_permanently_failed, a1_failure_reason
FROM campaign_status
WHERE campaign_id = %s
""", (campaign_id,))
row = cursor.fetchone()
if row:
return {
'retry_count': row[0] or 0,
'last_retry_at': row[1],
'permanently_failed': row[2] or False,
'failure_reason': row[3]
}
else:
return None
finally:
cursor.close()
self.put_connection(conn)
def increment_a1_retry(self, campaign_id, campaign_number, campaign_name, reason, mark_failed_at_max=True):
"""
Increment A1 retry counter and mark as permanently failed if max attempts reached
Args:
campaign_id: DAM campaign folder ID
campaign_number: Campaign number (e.g., C000000078)
campaign_name: Campaign name
reason: Description of failure (e.g., "No master assets found")
mark_failed_at_max: If True (default), set a1_permanently_failed=True at MAX_RETRIES.
Set False for empty-folder polling where the campaign is expected
to eventually receive assets and should keep retrying silently.
Returns:
dict with success, retry_count, permanently_failed
"""
conn = self.get_connection()
try:
cursor = conn.cursor()
# Maximum retry attempts before marking as permanently failed
MAX_RETRIES = 3
# Get current retry count
cursor.execute("""
SELECT a1_retry_count FROM campaign_status
WHERE campaign_id = %s
""", (campaign_id,))
row = cursor.fetchone()
current_count = (row[0] or 0) if row else 0
new_count = current_count + 1
is_permanently_failed = mark_failed_at_max and new_count >= MAX_RETRIES
# Insert or update campaign status with retry tracking
cursor.execute("""
INSERT INTO campaign_status (
campaign_id, campaign_number, campaign_name,
live_campaign, status, webhook_sent,
a1_retry_count, a1_last_retry_at,
a1_permanently_failed, a1_failure_reason
) VALUES (%s, %s, %s, 'NO', 'A1', FALSE, %s, CURRENT_TIMESTAMP, %s, %s)
ON CONFLICT (campaign_id) DO UPDATE SET
a1_retry_count = EXCLUDED.a1_retry_count,
a1_last_retry_at = EXCLUDED.a1_last_retry_at,
a1_permanently_failed = EXCLUDED.a1_permanently_failed,
a1_failure_reason = EXCLUDED.a1_failure_reason,
updated_at = CURRENT_TIMESTAMP
""", (
campaign_id,
campaign_number,
campaign_name,
new_count,
is_permanently_failed,
reason if is_permanently_failed else None
))
conn.commit()
logger.info("A1 retry tracking: Campaign {} - Attempt {}/{} (Permanently Failed: {})".format(
campaign_number, new_count, MAX_RETRIES, is_permanently_failed
))
return {
'success': True,
'retry_count': new_count,
'permanently_failed': is_permanently_failed
}
except Exception as e:
conn.rollback()
logger.error("Failed to increment A1 retry: {}".format(str(e)))
return {'success': False, 'error': str(e)}
finally:
cursor.close()
self.put_connection(conn)
def reset_a1_retry(self, campaign_id):
"""
Reset A1 retry tracking for campaign (used when campaign is fixed manually)
Args:
campaign_id: DAM campaign folder ID
Returns:
dict with success boolean
"""
conn = self.get_connection()
try:
cursor = conn.cursor()
cursor.execute("""
UPDATE campaign_status
SET a1_retry_count = 0,
a1_last_retry_at = NULL,
a1_permanently_failed = FALSE,
a1_failure_reason = NULL,
updated_at = CURRENT_TIMESTAMP
WHERE campaign_id = %s
""", (campaign_id,))
conn.commit()
logger.info("Reset A1 retry tracking for campaign: {}".format(campaign_id))
return {'success': True}
except Exception as e:
conn.rollback()
logger.error("Failed to reset A1 retry: {}".format(str(e)))
return {'success': False, 'error': str(e)}
finally:
cursor.close()
self.put_connection(conn)
def check_campaign_processed(self, campaign_id):
"""
Check if campaign has already been processed (webhook sent)
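
The retry bookkeeping added in the hunk above can be summarised as a small state transition, independent of the database. The sketch below assumes the same MAX_RETRIES value of 3 and the same rule that the failure reason is stored only once the campaign is marked permanently failed. `next_retry_state` is a hypothetical helper, not a method of the Database class.

```python
MAX_RETRIES = 3


def next_retry_state(current_count, reason, mark_failed_at_max=True):
    """Compute the values that would be written to campaign_status for one more attempt."""
    new_count = (current_count or 0) + 1
    permanently_failed = bool(mark_failed_at_max and new_count >= MAX_RETRIES)
    return {
        'retry_count': new_count,
        'permanently_failed': permanently_failed,
        # The reason is persisted only at the point of permanent failure.
        'failure_reason': reason if permanently_failed else None,
    }


first = next_retry_state(None, 'No master assets found')
third = next_retry_state(2, 'No master assets found')
polling = next_retry_state(5, 'Folder still empty', mark_failed_at_max=False)
```

With `mark_failed_at_max=False` the counter keeps increasing but the campaign is never flagged, which matches the empty-folder polling case described in the docstring.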
@ -587,6 +814,41 @@ class Database:
import json
full_json = json.dumps(full_extraction_data) if isinstance(full_extraction_data, dict) else full_extraction_data
# B1→B2 global masters: dedup by tracking_id so re-runs and previously-downloaded
# assets don't create duplicate rows.
if status == 'b1-master-cx-score':
cursor.execute("""
SELECT id FROM creativex_scores
WHERE tracking_id = %s AND status = 'b1-master-cx-score'
LIMIT 1
""", (tracking_id,))
if cursor.fetchone():
logger.debug("B1 master CreativeX score already recorded for tracking {}, skipping insert".format(tracking_id))
return {'success': True, 'is_update': False, 'already_exists': True}
cursor.execute("""
INSERT INTO creativex_scores (
filename, creativex_id, creativex_url, quality_score,
box_file_id, full_extraction_data, tracking_id, status
) VALUES (%s, %s, %s, %s, %s, %s, %s, %s)
""", (
filename,
creativex_id,
creativex_url,
quality_score,
box_file_id,
full_json,
tracking_id,
'b1-master-cx-score'
))
conn.commit()
logger.info("Stored B1 master CreativeX score: {} (Tracking: {}, Score: {})".format(
filename, tracking_id, quality_score
))
return {'success': True, 'is_update': False, 'version_number': 1}
# Handle master-cx-score differently (no versioning, just reference storage)
if status == 'master-cx-score':
# Simple insert for master score reference (no versioning)
@ -618,33 +880,52 @@ class Database:
}
# For 'active' status - use soft delete versioning
# Step 1: Check if filename already exists with status='active'
# Also count total versions for this filename
cursor.execute("""
SELECT id, quality_score FROM creativex_scores
WHERE filename = %s AND status = 'active'
""", (filename,))
# Strip timestamp suffix (e.g. _2026-03-13-05-53-36) from filename
# so re-scored assets supersede previous versions regardless of timestamp
import re
dot_idx = filename.rfind('.')
name_part = filename[:dot_idx] if dot_idx >= 0 else filename
ext = filename[dot_idx:] if dot_idx >= 0 else ''
base_filename = re.sub(r'_\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}$', '', name_part) + ext
existing = cursor.fetchone()
# Step 1: Check if this base asset already exists with status='active'
# Use LIKE pattern to match any timestamp variant of the same base filename
if base_filename != filename:
# Filename has a timestamp - match base pattern with any/no timestamp
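# Slice the extension off the end; str.replace would also drop any earlier occurrence of ext
like_pattern = base_filename[:len(base_filename) - len(ext)] + '%' + ext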
cursor.execute("""
SELECT id, quality_score, filename FROM creativex_scores
WHERE filename LIKE %s AND status = 'active'
""", (like_pattern,))
else:
# No timestamp in filename - still match variants that do have one
like_pattern = name_part + '%' + ext
cursor.execute("""
SELECT id, quality_score, filename FROM creativex_scores
WHERE filename LIKE %s AND status = 'active'
""", (like_pattern,))
# Count total versions (including superseded)
existing = cursor.fetchall()
# Count total versions (including superseded) for the base asset
cursor.execute("""
SELECT COUNT(*) FROM creativex_scores
WHERE filename = %s
""", (filename,))
WHERE filename LIKE %s
""", (like_pattern,))
total_versions = cursor.fetchone()[0]
if existing:
# Step 2: Mark existing record(s) as 'superseded'
# Step 2: Mark all existing active records as 'superseded'
cursor.execute("""
UPDATE creativex_scores
SET status = 'superseded'
WHERE filename = %s AND status = 'active'
""", (filename,))
WHERE filename LIKE %s AND status = 'active'
""", (like_pattern,))
logger.info("Superseded previous CreativeX score for: {} (old score: {})".format(
filename, existing[1]
superseded_filenames = [row[2] for row in existing]
logger.info("Superseded {} previous CreativeX score(s) for base asset: {} (old filenames: {})".format(
len(existing), base_filename, superseded_filenames
))
# Step 3: Insert new 'active' record
@ -670,8 +951,9 @@ class Database:
version_number = total_versions + 1
if existing:
logger.info("Updated CreativeX score: {} (Score: {} -> {}, Version: {})".format(
filename, existing[1], quality_score, version_number
old_scores = [row[1] for row in existing]
logger.info("Updated CreativeX score: {} (Old scores: {} -> {}, Version: {})".format(
filename, old_scores, quality_score, version_number
))
else:
logger.info("Stored new CreativeX score: {} (Score: {}, Version: {})".format(
@ -693,15 +975,18 @@ class Database:
cursor.close()
self.put_connection(conn)
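The timestamp stripping and LIKE matching in the hunk above can be tried in isolation. The sketch below is a minimal standalone version, not the code from database.py: `split_base_filename`, `escape_like` and `build_like_pattern` are hypothetical names. The escaping step is an addition that the diff does not contain. It assumes PostgreSQL, where backslash is the default LIKE escape character, and it is included because `_` is itself a single-character wildcard in LIKE and these filenames are underscore-delimited.

```python
import re

# Trailing timestamp such as _2026-03-13-05-53-36 at the end of the stem.
TIMESTAMP_SUFFIX = re.compile(r'_\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}$')


def split_base_filename(filename):
    """Return (stem_without_timestamp, extension) for a scored asset filename."""
    dot_idx = filename.rfind('.')
    stem = filename[:dot_idx] if dot_idx >= 0 else filename
    ext = filename[dot_idx:] if dot_idx >= 0 else ''
    return TIMESTAMP_SUFFIX.sub('', stem), ext


def escape_like(text, escape_char='\\'):
    """Escape LIKE wildcards so they match literally (hypothetical helper)."""
    # The escape character itself must be handled first.
    for ch in (escape_char, '%', '_'):
        text = text.replace(ch, escape_char + ch)
    return text


def build_like_pattern(filename):
    """Pattern matching the base asset with any, or no, timestamp suffix."""
    stem, ext = split_base_filename(filename)
    return escape_like(stem) + '%' + escape_like(ext)


if __name__ == '__main__':
    print(split_base_filename('RAF_TEST_OLV_2026-03-13-05-53-36.mp4'))
    print(build_like_pattern('RAF_TEST_OLV.mp4'))
```

Even with escaping, a `stem%ext` pattern also matches any other asset whose stem merely starts with the same text, so a stricter match (for example a regular expression anchored on the timestamp format) may be worth considering.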
def get_creativex_score_by_filename(self, filename):
def get_creativex_score_by_filename(self, filename, tracking_id=None):
"""
Get CreativeX score data by filename
Performs extension-agnostic lookup: if exact filename not found,
tries common video/image extensions (.mp4, .jpg, .png, .mov, etc.)
If still not found and tracking_id provided, falls back to LIKE search
on tracking ID (handles mismatched naming from CreativeX PDFs).
Args:
filename: Filename to search for
tracking_id: Optional tracking ID for fallback lookup
Returns:
dict with creativex data or None if not found
@ -748,6 +1033,24 @@ class Database:
if row:
break # Found with alternative extension
# If still not found, try tracking ID fallback
# CreativeX PDFs sometimes have different naming (extra text, stripped hyphens)
# but tracking ID is always consistent
if not row and tracking_id:
cursor.execute("""
SELECT filename, creativex_id, creativex_url, quality_score,
box_file_id, full_extraction_data, extracted_at
FROM creativex_scores
WHERE filename LIKE %s AND status = 'active'
ORDER BY extracted_at DESC
LIMIT 1
""", ('%' + tracking_id + '%',))
row = cursor.fetchone()
if row:
logger.info("CreativeX: Found score via tracking ID fallback '{}' -> {}".format(
tracking_id, row[0]))
if not row:
return None
@ -771,33 +1074,114 @@ class Database:
def get_all_live_campaigns(self):
"""
Get all live campaigns for CSV report
Returns:
list of dicts with campaign_number, campaign_name
Get all live campaigns (A-series local + B-series global) for the
single combined CSV that OMG ingests as a full replacement list.
"""
conn = self.get_connection()
try:
cursor = conn.cursor()
cursor.execute("""
SELECT campaign_number, campaign_name
FROM campaign_status
SELECT campaign_number, campaign_name
FROM campaign_status
WHERE live_campaign = 'YES'
AND (status LIKE 'A%' OR status LIKE 'B%')
ORDER BY campaign_number DESC
""")
rows = cursor.fetchall()
campaigns = []
for row in rows:
campaigns.append({
'campaign_number': row[0],
'campaign_name': row[1]
})
return campaigns
finally:
cursor.close()
self.put_connection(conn)
def get_override_metadata(self, filename_without_ext):
"""
Look up pre-upload metadata override saved by the naming tool.
Returns the latest unapplied override row for this filename, or None.
If the override_metadata table doesn't exist (e.g., on a dev DB where the
naming tool migration hasn't been run), returns None — upload behaviour
falls back to today's defaults.
"""
conn = self.get_connection()
try:
cursor = conn.cursor()
cursor.execute("""
SELECT id, tracking_id, override_fields
FROM override_metadata
WHERE filename = %s
AND applied_to_upload = FALSE
ORDER BY created_at DESC
LIMIT 1
""", (filename_without_ext,))
row = cursor.fetchone()
if not row:
return None
override_fields = row[2] if isinstance(row[2], dict) else json.loads(row[2])
return {
'id': row[0],
'tracking_id': row[1],
'override_fields': override_fields,
}
except psycopg2.errors.UndefinedTable:
conn.rollback()
logger.warning("override_metadata table does not exist - skipping override lookup")
return None
except Exception as e:
conn.rollback()
logger.error("Failed to query override_metadata for '{}': {}".format(
filename_without_ext, str(e)
))
return None
finally:
cursor.close()
self.put_connection(conn)
def mark_override_applied(self, filename_without_ext):
"""
Mark a pre-upload override row as applied after a successful DAM upload.
Only updates rows that are currently applied_to_upload = FALSE.
"""
conn = self.get_connection()
try:
cursor = conn.cursor()
cursor.execute("""
UPDATE override_metadata
SET applied_to_upload = TRUE,
applied_at = CURRENT_TIMESTAMP
WHERE filename = %s
AND applied_to_upload = FALSE
""", (filename_without_ext,))
updated = cursor.rowcount
conn.commit()
if updated:
logger.info("Marked {} override row(s) as applied for '{}'".format(
updated, filename_without_ext
))
return updated
except psycopg2.errors.UndefinedTable:
conn.rollback()
return 0
except Exception as e:
conn.rollback()
logger.error("Failed to mark override applied for '{}': {}".format(
filename_without_ext, str(e)
))
return 0
finally:
cursor.close()
self.put_connection(conn)
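The two methods above are meant to be used as a pair: look up the pending override before the upload, and mark it applied only once the upload has succeeded. The sketch below illustrates that ordering under stated assumptions. `InMemoryOverrides` is a hypothetical list-backed stand-in for the `Database` class, and `upload_with_overrides` and the `upload` callable are illustrative names, not functions from a2_to_a3_upload_polling.py.

```python
class InMemoryOverrides:
    """Hypothetical stand-in for the two Database methods, backed by a list."""

    def __init__(self, rows):
        # Each row: dict with 'filename', 'override_fields', 'applied_to_upload'.
        # Rows are assumed to be appended in creation order.
        self.rows = rows

    def get_override_metadata(self, filename_without_ext):
        pending = [r for r in self.rows
                   if r['filename'] == filename_without_ext
                   and not r['applied_to_upload']]
        return pending[-1] if pending else None  # latest unapplied row

    def mark_override_applied(self, filename_without_ext):
        updated = 0
        for r in self.rows:
            if r['filename'] == filename_without_ext and not r['applied_to_upload']:
                r['applied_to_upload'] = True
                updated += 1
        return updated


def upload_with_overrides(db, filename_without_ext, upload):
    """Look up overrides, upload, and mark applied only on success."""
    override = db.get_override_metadata(filename_without_ext)
    override_fields = override['override_fields'] if override else None
    ok = upload(override_fields)  # 'upload' is a caller-supplied callable
    if ok and override:
        db.mark_override_applied(filename_without_ext)
    return ok
```

Because the row is only flagged after success, a failed upload leaves the override pending and it is picked up again on the next attempt.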


@ -15,10 +15,43 @@ class FilenameParser:
[JOB]_[BRAND]_[SUBJECT]_[ASSET]_[DUR]_[RATIO]_[SPOT]_[COUNTRY]_[LANG]_[SOCIAL]_[TRACKING]
Example: 1234567_RAF_ME-MOMENT_OLV_6S_1x1_REF_GL_it_IGF_pOiJ9s
PPR Environment: Supports multiple tracking IDs (e.g., pOiJ9s+BqB8vo+laRJo0)
PROD Environment: Single tracking ID only (backward compatible)
"""
# Known social media platform codes
SOCIAL_MEDIA_CODES = ['FBP', 'FBR', 'IGF', 'IGR'] # Expandable
# Known social media platform codes (from Ferrero naming tool data.json)
SOCIAL_MEDIA_CODES = [
# Facebook
'FBD', 'FGF', 'FBR', 'FRO', 'FBS', 'FBF', 'FBP', 'FIA', 'FIV',
'FMP', 'FPF', 'FRC', 'FSE', 'FSS', 'FSV', 'FUK', 'FVF',
# Instagram
'IGF', 'IGE', 'IGG', 'IGT', 'IPF', 'IPR', 'IGR', 'IGO', 'IGS', 'ISH', 'IST',
# Audience Network
'ANC', 'ANI', 'ANR',
# Messenger
'MSI', 'MSS',
# YouTube
'YTA', 'YTB', 'YTS',
# Other platforms
'AMZ', 'DV3', 'GOO', 'PIN', 'SNA', 'SPT', 'TIK', 'TWI', 'VOD',
]
def __init__(self, dam_base_url=None):
"""
Initialize parser with optional environment detection
Args:
dam_base_url: DAM base URL for environment detection (optional)
"""
self.dam_base_url = dam_base_url
self.is_ppr = self._is_ppr_environment()
def _is_ppr_environment(self):
"""Check if running in PPR environment"""
if not self.dam_base_url:
return False
return 'ppr.dam.ferrero.com' in self.dam_base_url.lower()
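The PPR/PROD split described in the class docstring can be sketched as a standalone function. This is a simplified illustration, not the parser's actual code path: `parse_tracking_part` is a hypothetical name, and it returns only a subset of the keys that `parse_filename` stores. The regular expression and the `-N` handling are copied from the diff.

```python
import re

# One or more 6-character IDs joined by '+', each optionally suffixed with -N.
TRACKING_PART = re.compile(r'^[a-zA-Z0-9]{6}(-N)?(\+[a-zA-Z0-9]{6}(-N)?)*$')


def parse_tracking_part(part, is_ppr):
    """Parse the tracking segment of a filename; return None if it does not match."""
    if not TRACKING_PART.match(part):
        return None

    raw_ids = part.split('+')
    if not is_ppr:
        # PROD: only the first tracking ID is honoured.
        raw_ids = raw_ids[:1]

    ids, modes = [], []
    for raw in raw_ids:
        if raw.endswith('-N'):
            ids.append(raw[:-2])          # strip the folder-only suffix
            modes.append('folder_only')
        else:
            ids.append(raw)
            modes.append('full')

    return {
        'tracking_id': ids[0],            # primary ID, kept for backward compatibility
        'tracking_mode': modes[0],
        'tracking_ids': ids,
        'tracking_modes': modes,
        'has_multiple_masters': len(ids) > 1,
    }


if __name__ == '__main__':
    print(parse_tracking_part('pOiJ9s+BqB8vo-N+laRJo0', is_ppr=True))
    print(parse_tracking_part('pOiJ9s+BqB8vo-N+laRJo0', is_ppr=False))
```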
def parse_filename(self, filename):
"""
@ -178,21 +211,68 @@ class FilenameParser:
logger.debug("Found social media: {}".format(part))
index += 1
# Tracking ID: 6 alphanumeric, optionally with -N suffix
elif re.match(r'^[a-zA-Z0-9]{6}(-N)?$', part):
tracking = part
tracking_mode = 'full'
base_tracking_id = tracking
# Tracking ID(s): 6 alphanumeric, optionally with -N suffix
# PPR: Supports multiple IDs (e.g., "BqB8vo+SfUQ7m+laRJo0")
# PROD: Single ID only (backward compatible)
elif re.match(r'^[a-zA-Z0-9]{6}(-N)?(\+[a-zA-Z0-9]{6}(-N)?)*$', part):
# Check if multiple IDs provided
if '+' in part and self.is_ppr:
# PPR ONLY: Parse multiple tracking IDs
tracking_ids = []
tracking_modes = []
tracking_ids_with_suffix = []
if tracking.endswith('-N'):
tracking_mode = 'folder_only'
base_tracking_id = tracking[:-2] # Strip -N suffix
logger.info("Detected folder-only tracking ID: {} (base: {})".format(tracking, base_tracking_id))
id_parts = part.split('+')
logger.info("PPR Environment - Multiple tracking IDs detected: {}".format(len(id_parts)))
for tracking in id_parts:
tracking_mode = 'full'
base_tracking_id = tracking
if tracking.endswith('-N'):
tracking_mode = 'folder_only'
base_tracking_id = tracking[:-2]
logger.info("Folder-only tracking ID: {} (base: {})".format(tracking, base_tracking_id))
tracking_ids.append(base_tracking_id)
tracking_modes.append(tracking_mode)
tracking_ids_with_suffix.append(tracking)
# Store primary (first) for backward compatibility
parsed['tracking_id'] = tracking_ids[0]
parsed['tracking_mode'] = tracking_modes[0]
parsed['tracking_id_with_suffix'] = tracking_ids_with_suffix[0]
# Store all IDs for multi-master support
parsed['tracking_ids'] = tracking_ids
parsed['tracking_modes'] = tracking_modes
parsed['tracking_ids_with_suffix'] = tracking_ids_with_suffix
parsed['has_multiple_masters'] = True
logger.info("Parsed {} tracking IDs: {}".format(len(tracking_ids), ', '.join(tracking_ids)))
else:
# PROD or Single ID: Use only first tracking ID
if '+' in part:
logger.warning("PROD Environment - Multiple tracking IDs not supported, using first ID only")
part = part.split('+')[0] # Take only first ID
tracking = part
tracking_mode = 'full'
base_tracking_id = tracking
if tracking.endswith('-N'):
tracking_mode = 'folder_only'
base_tracking_id = tracking[:-2]
logger.info("Folder-only tracking ID: {} (base: {})".format(tracking, base_tracking_id))
parsed['tracking_id'] = base_tracking_id
parsed['tracking_mode'] = tracking_mode
parsed['tracking_id_with_suffix'] = tracking
parsed['tracking_ids'] = [base_tracking_id] # Single item list for compatibility
parsed['has_multiple_masters'] = False
logger.debug("Found tracking ID: {}".format(tracking))
parsed['tracking_id'] = base_tracking_id
parsed['tracking_mode'] = tracking_mode
parsed['tracking_id_with_suffix'] = tracking
logger.debug("Found tracking ID: {}".format(tracking))
index += 1
# Unknown part - could be aspect ratio fallback
@ -216,8 +296,8 @@ class FilenameParser:
def strip_upload_components(self, filename):
"""
Strip OMG Job Number and Tracking ID from filename
Returns clean filename in V2.1 order
Strip OMG Job Number from front and Tracking ID from back of filename.
Keeps everything else as-is (including social media codes, DV3, etc.)
Args:
filename: Original filename
@ -226,40 +306,23 @@ class FilenameParser:
Clean filename for upload (no job number, no tracking ID)
Example:
Input: 1234567_RAF_TEST_OLV_6S_1x1_REF_GL_it_IGF_abc123.mp4
Output: RAF_TEST_OLV_6S_1x1_REF_GL_it_IGF.mp4
Input: 6662777_NUT_XMAS-SHARETHELOVE-GLAS_OLV_6S_16X9_PL_pl_YTA_EvQJrM.mp4
Output: NUT_XMAS-SHARETHELOVE-GLAS_OLV_6S_16X9_PL_pl_YTA.mp4
"""
parsed = self.parse_filename(filename)
import os
if not parsed:
base, ext = os.path.splitext(filename)
parts = base.split('_')
if len(parts) < 3:
return filename
# Build clean filename in V2.1 order
# [BRAND]_[SUBJECT]_[ASSET]_[DUR]_[RATIO]_[SPOT]_[COUNTRY]_[LANG]_[SOCIAL]
clean_parts = []
# Strip job number from front (digits only)
if parts[0].isdigit():
parts = parts[1:]
if parsed['brand_code']:
clean_parts.append(parsed['brand_code'])
if parsed['subject_title']:
clean_parts.append(parsed['subject_title'])
if parsed['asset_type']:
clean_parts.append(parsed['asset_type'])
if parsed['seconds']:
clean_parts.append(parsed['seconds'] + 'S')
if parsed['aspect_ratio']:
clean_parts.append(parsed['aspect_ratio'])
if parsed['spot_version']:
clean_parts.append(parsed['spot_version'])
if parsed['country_code']:
clean_parts.append(parsed['country_code'])
if parsed['language_code']:
clean_parts.append(parsed['language_code'])
if parsed['social_media_version']:
clean_parts.append(parsed['social_media_version'])
# Strip tracking ID(s) from back (6 alphanumeric chars, optionally with +joined IDs or -N suffix)
if parts and re.match(r'^[a-zA-Z0-9]{6}(-N)?(\+[a-zA-Z0-9]{6}(-N)?)*$', parts[-1]):
parts = parts[:-1]
clean_filename = '_'.join(clean_parts)
if parsed['extension']:
clean_filename += parsed['extension']
return clean_filename
return '_'.join(parts) + ext
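For quick experimentation, the new positional stripping logic can be reproduced as a module-level function. The sketch below follows the added lines of the diff; only the free-function form and the `TRACKING_RE` constant name are assumptions.

```python
import os
import re

# Same pattern as the diff: 6 alphanumerics, optional -N, optional '+'-joined IDs.
TRACKING_RE = re.compile(r'^[a-zA-Z0-9]{6}(-N)?(\+[a-zA-Z0-9]{6}(-N)?)*$')


def strip_upload_components(filename):
    """Drop a leading all-digit job number and a trailing tracking ID segment."""
    base, ext = os.path.splitext(filename)
    parts = base.split('_')
    if len(parts) < 3:
        return filename  # too short to be a structured name; leave untouched

    # Job number: first segment, digits only.
    if parts[0].isdigit():
        parts = parts[1:]

    # Tracking ID(s): last segment, if it looks like one.
    if parts and TRACKING_RE.match(parts[-1]):
        parts = parts[:-1]

    return '_'.join(parts) + ext


if __name__ == '__main__':
    print(strip_upload_components(
        '6662777_NUT_XMAS-SHARETHELOVE-GLAS_OLV_6S_16X9_PL_pl_YTA_EvQJrM.mp4'))
```

One consequence of matching by shape rather than by parsed position: any final segment of exactly six alphanumeric characters is treated as a tracking ID, so a name with no tracking ID whose last segment happens to have that shape (for example `RAF_TEST_SUMMER.mp4`) loses that segment.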


@ -5,12 +5,44 @@ Compatible with Python 3.6+
"""
import logging
import json
import copy
from datetime import datetime, timedelta
import os
from shared.config_loader import load_country_code_mappings
logger = logging.getLogger('MetadataExtractorMVP')
# Editor field name -> DAM metadata field ID.
# Mirrors the canonical mapping in the naming tool's public-v2/Database.php
# so that pre-upload overrides saved via the metadata editor are applied to
# the matching DAM fields on upload.
OVERRIDE_FIELD_MAP = {
'validity_start': 'FERRERO.FIELD.ASSET VALIDITY START PERIOD',
'validity_end': 'FERRERO.FIELD.ASSET VALIDITY END PERIOD',
'marketing_tag': 'MARKETING_TAG',
'agency_name': 'FERRERO.MARKETING.FIELD.AGENCY NAME',
'spot_version': 'FERRERO.MARKETING.FIELD.SPOT_VERSION',
'director_name': 'FERRERO.MARKETING.FIELD.DIRECTOR_NAME',
'video_post_prod_company': 'FERRERO.MARKETING.FIELD.VIDEO_POST_PROD_COMPANY',
'video_post_prod_contact': 'FERRERO.MARKETING.FIELD.VID_POST_PROD_CONTACT',
'audio_post_prod_company': 'FERRERO.MARKETING.FIELD.AUDIO_POST_PROD_COMPANY',
'audio_post_prod_contact': 'FERRERO.MARKETING.FIELD.AUDIO_POST_PROD_CONTACT',
'video_type': 'FERRERO.MARKET.FIELD.TYPE_VID',
'ip_rights': 'FERRERO.MARKET.FIELD.IPRIGHT',
'production_company': 'FERRERO.MARKET.PROD_COMPANY',
'licensing': 'FERRERO.MARKET.FIELD.LICENSIN',
'buyout': 'FERRERO.MARKET.FIELD.BUYOUT',
'ferrero_property': 'FERRERO.MARKET.FIELD.FERRERO PROPERTY',
'video_status': 'FERRERO.MARKET.VID_N_STAT',
'license': 'FERRERO.MARKET.FIELD.LICENSE',
'creativex_score': 'FERRERO.TAB.FIELD.CREATIVEX',
'creativex_link': 'FERRERO.FIELD.CREATIVEX LINK',
}
DATE_OVERRIDE_FIELDS = {'validity_start', 'validity_end'}
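As a rough illustration of how a map like this might be consumed, the sketch below translates editor field names to DAM field IDs, skips empty values, and converts ISO dates to MM/DD/YYYY, as the commit message describes. It is not the `_apply_override_fields` implementation, which is not shown in this part of the diff. `translate_overrides`, `to_dam_date` and `FIELD_MAP_SUBSET` are hypothetical names, and values are assumed to be strings.

```python
from datetime import datetime

# Two entries copied from OVERRIDE_FIELD_MAP; the real map is longer.
FIELD_MAP_SUBSET = {
    'validity_end': 'FERRERO.FIELD.ASSET VALIDITY END PERIOD',
    'ip_rights': 'FERRERO.MARKET.FIELD.IPRIGHT',
}
DATE_FIELDS = {'validity_start', 'validity_end'}


def to_dam_date(value):
    """Convert an ISO date (YYYY-MM-DD...) to MM/DD/YYYY; pass other strings through."""
    try:
        return datetime.strptime(value[:10], '%Y-%m-%d').strftime('%m/%d/%Y')
    except ValueError:
        return value  # not ISO; assume it is already in the DAM's format


def translate_overrides(override_fields, field_map=FIELD_MAP_SUBSET):
    """Return {dam_field_id: value} for every non-empty, known editor field."""
    translated = {}
    for editor_name, value in override_fields.items():
        dam_id = field_map.get(editor_name)
        if dam_id is None or value in (None, ''):
            continue  # unknown field, or empty value: keep the inherited value
        if editor_name in DATE_FIELDS:
            value = to_dam_date(value)
        translated[dam_id] = value
    return translated


if __name__ == '__main__':
    print(translate_overrides({'validity_end': '2026-06-19', 'ip_rights': ''}))
```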
class MetadataExtractorMVP:
def __init__(self, field_mappings):
"""
@ -23,6 +55,7 @@ class MetadataExtractorMVP:
self.filename_updates = field_mappings.get('filename_updates', {})
self.forced_values = field_mappings.get('forced_values', {})
self.defaults = field_mappings.get('defaults', {})
self.asset_type_overrides = field_mappings.get('asset_type_overrides', {})
# Load country code mappings (ISO -> DAM codes)
self.country_mappings = load_country_code_mappings()
@ -34,6 +67,22 @@ class MetadataExtractorMVP:
if self.asset_type_mappings:
logger.info("Loaded {} asset type mappings (3-letter->DAM)".format(len(self.asset_type_mappings)))
# Load asset representation template for folder-only mode
self.template_fields = self._load_asset_representation_template()
if self.template_fields:
logger.info("Loaded asset representation template with {} fields".format(len(self.template_fields)))
def _load_asset_representation_template(self):
"""Load the asset representation template JSON for folder-only mode"""
template_path = 'config/asset_representation_template.json'
try:
with open(template_path, 'r') as f:
data = json.load(f)
return data['asset_resource']['asset']['metadata']['metadata_element_list']
except Exception as e:
logger.warning("Could not load asset representation template: {}".format(str(e)))
return []
def extract_mvp_fields(self, master_metadata):
"""
Extract only MVP fields from full master metadata
@ -94,7 +143,7 @@ class MetadataExtractorMVP:
return extracted_fields
def build_mvp_asset_representation(self, master_metadata, clean_filename, parsed_filename, box_metadata=None, tracking_mode='full', master_opentext_id=None):
def build_mvp_asset_representation(self, master_metadata, clean_filename, parsed_filename, box_metadata=None, tracking_mode='full', master_opentext_id=None, master_opentext_ids=None, override_fields=None):
"""
Build asset representation with MVP fields + updates from filename
@ -105,6 +154,10 @@ class MetadataExtractorMVP:
box_metadata: Optional Box metadata
tracking_mode: 'full' (inherit all metadata) or 'folder_only' (only use folder)
master_opentext_id: Optional DAM Asset ID of master asset (for derivative tracking)
master_opentext_ids: Optional list of DAM Asset IDs of master assets (multi-master support);
takes priority over master_opentext_id when provided
override_fields: Optional dict of pre-upload metadata overrides keyed by
editor field name (e.g. {'validity_end': '...', 'ip_rights': 'Yes'}).
Applied after master/filename/forced values but before asset-type
overrides so EOL/LTD compliance still wins. Empty values are skipped.
Returns:
Asset representation dict ready for upload
@ -127,15 +180,41 @@ class MetadataExtractorMVP:
mvp_fields = []
mvp_fields = self._build_fields_from_filename(parsed_filename, clean_filename)
# Apply forced values from config (e.g., AGENCY NAME)
# STATE is already handled in _build_fields_from_filename
mvp_fields = self._apply_forced_values(mvp_fields)
# Add missing MVP fields with defaults (both modes)
mvp_fields = self._add_missing_fields(mvp_fields, parsed_filename)
# Add empty required fields that DAM expects (even if empty) - folder-only mode needs these
mvp_fields = self._add_empty_required_fields(mvp_fields)
# Update CreativeX fields from Box metadata if provided
if box_metadata:
mvp_fields = self._update_creativex_fields(mvp_fields, box_metadata)
# Add Master Asset ID field if provided (derivative tracking)
if master_opentext_id:
# Apply pre-upload metadata overrides from the naming tool's editor.
# Runs after master/filename/forced/default/CreativeX values so it wins
# over them, but before asset_type_overrides so EOL/LTD compliance rules
# still take final precedence.
if override_fields:
mvp_fields = self._apply_override_fields(mvp_fields, override_fields)
# Apply asset type overrides (e.g., EOL, LTD) - takes final precedence over
# forced values, defaults, and CreativeX (LTD removes CreativeX entirely).
mvp_fields = self._apply_asset_type_overrides(mvp_fields, parsed_filename)
# Add MASTERASSETIDS field with all master IDs
# Priority: Use master_opentext_ids if provided (multiple IDs), otherwise fall back to single master_opentext_id
if master_opentext_ids and len(master_opentext_ids) > 0:
mvp_fields = self._add_master_asset_ids_field(mvp_fields, master_opentext_ids)
if len(master_opentext_ids) > 1:
logger.info("PPR - Added MASTERASSETIDS field with {} master IDs".format(len(master_opentext_ids)))
else:
logger.info("Added MASTERASSETIDS field with 1 master ID")
elif master_opentext_id:
# Fallback to single master ID if master_opentext_ids not provided
mvp_fields = self._add_master_asset_id_field(mvp_fields, master_opentext_id)
logger.info("Added Master Asset ID field: {}".format(master_opentext_id))
@ -190,8 +269,28 @@ class MetadataExtractorMVP:
# Update the field
for field in mvp_fields:
if field.get('id') == field_id:
self._set_field_value(field, value)
logger.info("Updated {} from filename: {}".format(field_id, value))
# For tabular fields (like MAIN_LANGUAGES), update the 'values' array
# The DAM reads from 'values' (plural), not 'value' (singular)
if field.get('type') == 'com.artesia.metadata.MetadataTableField' or 'values' in field:
field['values'] = [
{
'cascading_domain_value': False,
'domain_value': True,
'is_locked': False,
'value': {
'expired_value': False,
'field_value': {
'type': 'string',
'value': value
},
'type': 'com.artesia.metadata.DomainValue'
}
}
]
logger.info("Updated tabular field {} values array from filename: {}".format(field_id, value))
else:
self._set_field_value(field, value)
logger.info("Updated {} from filename: {}".format(field_id, value))
break
# Apply country code mapping (ISO -> DAM codes)
@ -268,6 +367,96 @@ class MetadataExtractorMVP:
return mvp_fields
def _apply_asset_type_overrides(self, mvp_fields, parsed_filename):
"""
Apply asset type overrides when a matching asset type (e.g., EOL) is detected in the filename.
These overrides take final precedence over forced values and defaults.
Args:
mvp_fields: List of MVP field objects
parsed_filename: Parsed filename dict (must contain 'asset_type' key)
Returns:
Updated mvp_fields list
"""
if not parsed_filename:
return mvp_fields
asset_type = parsed_filename.get('asset_type')
if not asset_type:
return mvp_fields
overrides = self.asset_type_overrides.get(asset_type)
if not overrides:
return mvp_fields
logger.info("Applying {} asset type overrides for '{}'".format(len(overrides), asset_type))
for field_id, override_value in overrides.items():
# Empty string means remove the field entirely
if override_value == '':
before_count = len(mvp_fields)
mvp_fields = [f for f in mvp_fields if f.get('id') != field_id]
if len(mvp_fields) < before_count:
logger.info("Asset type override: removed field {}".format(field_id))
else:
logger.debug("Asset type override: field {} not present (nothing to remove)".format(field_id))
continue
field_found = False
for field in mvp_fields:
if field.get('id') == field_id:
field_found = True
# For tabular fields (like MAIN_LANGUAGES), update both 'value' and 'values'
if field.get('type') == 'com.artesia.metadata.MetadataTableField' or 'values' in field:
domain_value_obj = {
'type': 'com.artesia.metadata.DomainValue',
'field_value': {'type': 'string', 'value': override_value},
'display_value': override_value,
'expired_value': False,
'active_to': '',
'active_from': ''
}
field['value'] = {
'value': domain_value_obj,
'is_locked': False,
'domain_value': True,
'cascading_domain_value': False
}
field['values'] = [
{
'cascading_domain_value': False,
'domain_value': True,
'is_locked': False,
'value': {
'expired_value': False,
'field_value': {
'type': 'string',
'value': override_value
},
'type': 'com.artesia.metadata.DomainValue'
}
}
]
logger.info("Asset type override: {} = {} (tabular)".format(field_id, override_value))
else:
self._set_field_value(field, override_value)
logger.info("Asset type override: {} = {}".format(field_id, override_value))
break
if not field_found:
# Field not present yet (e.g. description has no subject_title from filename).
# Append as a simple string field so the override still takes effect. Tabular
# / domained overrides aren't supported here — they should already be in
# mvp_fields via _add_missing_fields.
mvp_fields.append({
'id': field_id,
'value': {'value': {'type': 'string', 'value': override_value}}
})
logger.info("Asset type override: {} = {} (added missing field)".format(field_id, override_value))
return mvp_fields
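The override semantics of `_apply_asset_type_overrides` (an empty string removes the field, any other value sets it, and a missing field is appended) can be shown on plain string fields. The sketch below deliberately leaves out the tabular and DomainValue branches; the function name, the field IDs and the config contents are hypothetical.

```python
def apply_asset_type_overrides(fields, asset_type, overrides_by_type):
    """Apply per-asset-type overrides to a list of simple string fields."""
    overrides = overrides_by_type.get(asset_type)
    if not overrides:
        return fields

    for field_id, value in overrides.items():
        if value == '':
            # Empty string means: remove the field entirely.
            fields = [f for f in fields if f.get('id') != field_id]
            continue

        new_value = {'value': {'type': 'string', 'value': value}}
        for field in fields:
            if field.get('id') == field_id:
                field['value'] = new_value
                break
        else:
            # Field not present yet: append it so the override still applies.
            fields.append({'id': field_id, 'value': new_value})

    return fields


if __name__ == '__main__':
    # Hypothetical config and field IDs, for illustration only.
    config = {'EOL': {'EXAMPLE.FIELD.SCORE': '', 'EXAMPLE.FIELD.STATE': 'Expired'}}
    fields = [
        {'id': 'EXAMPLE.FIELD.SCORE', 'value': {'value': {'type': 'string', 'value': '87'}}},
        {'id': 'EXAMPLE.FIELD.STATE', 'value': {'value': {'type': 'string', 'value': 'Active'}}},
    ]
    print(apply_asset_type_overrides(fields, 'EOL', config))
```

Running this step last in the build order is what lets it win over user overrides, forced values and defaults.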
def _add_missing_fields(self, mvp_fields, parsed_filename):
"""Add missing MVP fields from filename or defaults"""
field_ids = [f.get('id') for f in mvp_fields]
@ -278,29 +467,82 @@ class MetadataExtractorMVP:
language = parsed_filename['language_code'].upper()
logger.info("Adding MAIN_LANGUAGES: {}".format(language))
domain_value_obj = {
'type': 'com.artesia.metadata.DomainValue',
'field_value': {'type': 'string', 'value': language},
'display_value': language,
'expired_value': False,
'active_to': '',
'active_from': ''
}
mvp_fields.append({
'id': 'MAIN_LANGUAGES',
'name': 'MAIN LANGUAGES',
'parent_table_id': 'FERRERO.TABULAR.FIELD.MAIN LANGUAGES',
'type': 'com.artesia.metadata.MetadataTableField',
'value': {
'value': domain_value_obj,
'is_locked': False,
'domain_value': True,
'cascading_domain_value': False
},
'values': [
{
'cascading_domain_value': False,
'domain_value': True,
'is_locked': False,
'value': {
'field_value': {
'type': 'string',
'value': language
},
'expired_value': False,
'field_value': {'type': 'string', 'value': language},
'type': 'com.artesia.metadata.DomainValue'
}
}
]
],
'tabular': True,
'domained': True,
'required': True,
'domain_id': 'FERRERO.DOMAIN.MAIN LAGUAGES_LU'
})
# Add other missing fields with defaults
field_ids = [f.get('id') for f in mvp_fields]
for field_id, default_value in self.defaults.items():
if field_id in field_ids:
# Field exists (e.g. from template) - check if value is empty and set default
for field in mvp_fields:
if field.get('id') == field_id:
# Tabular fields use 'values' array - skip if already populated
if field.get('type') == 'com.artesia.metadata.MetadataTableField':
if field.get('values'):
break # Already has values
# Empty tabular - fall through to add as new below
break
# Regular field - check if it has an actual value set
val = field.get('value', {})
has_value = 'value' in val and isinstance(val.get('value'), dict) and 'value' in val['value']
if not has_value:
# Use DomainValue format for domained fields
if field.get('domained', False):
field['value'] = {
'cascading_domain_value': False,
'domain_value': True,
'is_locked': False,
'value': {
'active_from': '',
'active_to': '',
'display_value': default_value,
'expired_value': False,
'field_value': {'type': 'string', 'value': default_value},
'type': 'com.artesia.metadata.DomainValue'
}
}
else:
field['value'] = {'value': {'type': 'string', 'value': default_value}}
logger.info("Set default on template field {} = {}".format(field_id, default_value))
break
continue
if field_id not in field_ids:
logger.info("Adding {} with default: {}".format(field_id, default_value))
@ -310,12 +552,72 @@ class MetadataExtractorMVP:
]
if is_tabular:
# Map field IDs to correct parent table IDs
parent_table_map = {
'FERRERO.FIELD.ASSETCOMPLIANCE': 'FERRERO.TABULAR.FIELD.ASSETCOMPLIANCE',
'MARKETING_TAG': 'FERRERO.TABULAR.FIELD.MARKETING.TAG',
}
parent_table_id = parent_table_map.get(field_id, 'FERRERO.TABULAR.FIELD.' + field_id.split('.')[-1])
domain_value_obj = {
'type': 'com.artesia.metadata.DomainValue',
'field_value': {'type': 'string', 'value': default_value},
'display_value': default_value,
'expired_value': False,
'active_to': '',
'active_from': ''
}
mvp_fields.append({
'id': field_id,
'parent_table_id': 'FERRERO.TABULAR.FIELD.' + field_id.split('.')[-1],
'parent_table_id': parent_table_id,
'type': 'com.artesia.metadata.MetadataTableField',
'value': {
'value': domain_value_obj,
'is_locked': False,
'domain_value': True,
'cascading_domain_value': False
},
'values': [
{
'cascading_domain_value': False,
'domain_value': True,
'is_locked': False,
'value': {
'expired_value': False,
'field_value': {
'type': 'string',
'value': default_value
},
'type': 'com.artesia.metadata.DomainValue'
}
}
],
'tabular': True,
'domained': True
})
else:
# Non-domain fields use simple value structure
non_domain_fields = [
'FERRERO.MARKETING.FIELD.VIDEO_POST_PROD_COMPANY',
'FERRERO.MARKETING.FIELD.AUDIO_POST_PROD_COMPANY',
]
if field_id in non_domain_fields:
mvp_fields.append({
'id': field_id,
'type': 'com.artesia.metadata.MetadataField',
'value': {
'value': {
'type': 'string',
'value': default_value
}
}
})
else:
mvp_fields.append({
'id': field_id,
'type': 'com.artesia.metadata.MetadataField',
'value': {
'cascading_domain_value': False,
'domain_value': True,
'value': {
@ -326,21 +628,154 @@ class MetadataExtractorMVP:
'type': 'com.artesia.metadata.DomainValue'
}
}
]
})
else:
mvp_fields.append({
'id': field_id,
'type': 'com.artesia.metadata.MetadataField',
'value': {
'cascading_domain_value': False,
'domain_value': True,
'value': {
'type': 'string',
'value': default_value
})
return mvp_fields
def _apply_forced_values(self, mvp_fields):
"""
Apply forced values from config to existing fields.
For fields not yet present, adds them with DomainValue format.
Used in folder-only mode where _update_fields is not called.
"""
field_ids = [f.get('id') for f in mvp_fields]
for field_id, forced_value in self.forced_values.items():
if field_id in field_ids:
# Field exists - set value with proper format based on field type
for field in mvp_fields:
if field.get('id') == field_id:
if field.get('domained', False):
field['value'] = {
'cascading_domain_value': False,
'domain_value': True,
'is_locked': False,
'value': {
'active_from': '',
'active_to': '',
'display_value': forced_value,
'expired_value': False,
'field_value': {'type': 'string', 'value': forced_value},
'type': 'com.artesia.metadata.DomainValue'
}
}
else:
self._set_field_value(field, forced_value)
logger.info("Forced value applied: {} = {}".format(field_id, forced_value))
break
else:
# Field not present - add with DomainValue format
mvp_fields.append({
'id': field_id,
'type': 'com.artesia.metadata.MetadataField',
'value': {
'cascading_domain_value': False,
'domain_value': True,
'value': {
'field_value': {'type': 'string', 'value': forced_value},
'type': 'com.artesia.metadata.DomainValue'
}
})
}
})
logger.info("Forced value added: {} = {}".format(field_id, forced_value))
return mvp_fields
def _add_empty_required_fields(self, mvp_fields):
"""
Add fields that the DAM expects to be present even if empty.
In full-inheritance mode these come from the master asset.
In folder-only mode they must be explicitly added.
Only adds fields not already present.
"""
field_ids = [f.get('id') for f in mvp_fields]
# Empty value structure for domained fields with no value set
empty_domained_value = {
'is_locked': False,
'domain_value': False,
'cascading_domain_value': False
}
# Fields with empty domained values
empty_domained_fields = [
'FERRERO.FIELD.MARKETING.FLAVOUR',
'FERRERO.FIELD.MARKETING.SIZE',
'FERRERO.FIELD.SUB BRAND',
'FERRERO.MARKET.FIELD.BUYOUT',
'FERRERO.MARKET.FIELD.FERRERO PROPERTY',
'FERRERO.MARKET.VID_N_STAT',
'FERRERO.MARKETING.FIELD.SPOT_VERSION',
]
for field_id in empty_domained_fields:
if field_id not in field_ids:
mvp_fields.append({
'id': field_id,
'type': 'com.artesia.metadata.MetadataField',
'value': dict(empty_domained_value)
})
# Fields with empty non-domained values
empty_plain_fields = [
'FERRERO.MARKETING.FIELD.DIRECTOR_NAME',
'FERRERO.MARKETING.FIELD.VID_POST_PROD_CONTACT',
'FERRERO.MARKETING.FIELD.AUDIO_POST_PROD_CONTACT',
'FERRERO.MARKET.FIELD.LICENSE',
]
for field_id in empty_plain_fields:
if field_id not in field_ids:
mvp_fields.append({
'id': field_id,
'type': 'com.artesia.metadata.MetadataField',
'value': {
'is_locked': False,
'domain_value': False,
'cascading_domain_value': False
}
})
# Domained fields with default "No" value
no_value_fields = [
'FERRERO.MARKET.FIELD.IPRIGHT',
'FERRERO.MARKET.FIELD.LICENSIN',
]
for field_id in no_value_fields:
if field_id not in field_ids:
mvp_fields.append({
'id': field_id,
'type': 'com.artesia.metadata.MetadataField',
'value': {
'value': {
'type': 'com.artesia.metadata.DomainValue',
'field_value': {'type': 'string', 'value': 'No'},
'display_value': 'No',
'expired_value': False,
'active_to': '',
'active_from': ''
},
'is_locked': False,
'domain_value': True,
'cascading_domain_value': False
}
})
# Empty tabular field: Type of Video & Static Right
if 'FERRERO.MARKET.FIELD.TYPE_VID' not in field_ids:
mvp_fields.append({
'id': 'FERRERO.MARKET.FIELD.TYPE_VID',
'parent_table_id': 'FERRERO.TABULAR.VID_STAT_TYPE',
'type': 'com.artesia.metadata.MetadataTableField',
'values': [],
'tabular': True,
'domained': True
})
added_count = len(mvp_fields) - len(field_ids)
if added_count > 0:
logger.info("Added {} empty required fields for DAM compatibility".format(added_count))
return mvp_fields
@ -415,63 +850,104 @@ class MetadataExtractorMVP:
def _build_fields_from_filename(self, parsed_filename, clean_filename):
"""
Build ALL metadata fields from parsed filename
Used in folder-only mode (tracking ID with -N suffix)
Build ALL metadata fields from parsed filename using the reference template.
Used in folder-only mode (tracking ID with -N suffix).
Note: Uses codes directly for now. Can add lookup tables later
for brand_code->brand_name, country_code->country_name, etc.
Deep copies the asset representation template and populates values
from the parsed filename. This ensures all fields have the full metadata
structure (column_name, data_type, etc.) that the DAM API requires.
"""
fields = []
if not self.template_fields:
logger.error("No asset representation template loaded - folder-only mode cannot proceed")
return []
# Deep copy the template so we don't modify the original
fields = copy.deepcopy(self.template_fields)
# Build lookup for quick access
fields_by_id = {f['id']: f for f in fields}
# Helper to set a domained field value with DomainValue structure
def set_domained_value(field, value):
field['value'] = {
'cascading_domain_value': False,
'domain_value': True,
'is_locked': False,
'value': {
'active_from': '',
'active_to': '',
'display_value': value,
'expired_value': False,
'field_value': {'type': 'string', 'value': value},
'type': 'com.artesia.metadata.DomainValue'
}
}
# Helper to set a plain string field value
def set_string_value(field, value):
field['value'] = {'value': {'type': 'string', 'value': value}}
# --- Populate fields from filename ---
# ASSET NAME
fields.append({
'id': 'ARTESIA.FIELD.ASSET NAME',
'value': {'value': {'value': clean_filename}}
})
if 'ARTESIA.FIELD.ASSET NAME' in fields_by_id:
set_string_value(fields_by_id['ARTESIA.FIELD.ASSET NAME'], clean_filename)
# DESCRIPTION (from subject_title)
if parsed_filename.get('subject_title'):
fields.append({
'id': 'ARTESIA.FIELD.ASSET DESCRIPTION',
'value': {'value': {'value': parsed_filename['subject_title']}}
})
# DESCRIPTION
if parsed_filename.get('subject_title') and 'ARTESIA.FIELD.ASSET DESCRIPTION' in fields_by_id:
set_string_value(fields_by_id['ARTESIA.FIELD.ASSET DESCRIPTION'], parsed_filename['subject_title'])
# BRAND (use code for now, could add lookup later)
if parsed_filename.get('brand_code'):
fields.append({
'id': 'FERRERO.FIELD.BRAND',
'value': {'value': {'value': parsed_filename['brand_code']}}
})
# Note: BRAND and COUNTRY are NOT set in the metadata payload.
# They are inherited from the DAM folder structure.
# COUNTRY (map ISO code to DAM code)
if parsed_filename.get('country_code'):
dam_country_code = self._map_country_code(parsed_filename['country_code'])
fields.append({
'id': 'FERRERO.FIELD.COUNTRY',
'value': {'value': {'value': dam_country_code}}
})
# LANGUAGE (use code for now)
if parsed_filename.get('language_code'):
fields.append({
'id': 'FERRERO.FIELD.LANGUAGES',
'value': {'value': {'value': parsed_filename['language_code']}}
})
# ASSET TYPE (use code for now)
# ASSET TYPE (use config field ID, map code via lookup)
if parsed_filename.get('asset_type'):
fields.append({
'id': 'FERRERO.FIELD.ASSET TYPE',
'value': {'value': {'value': parsed_filename['asset_type']}}
})
asset_type_field_id = 'FERRERO.FIELD.ASSET TYPE'
for field_id, config in self.filename_updates.items():
if config.get('source') == 'asset_type':
asset_type_field_id = field_id
break
# STATE (force to Local)
fields.append({
'id': 'FERRERO.FIELD.STATE',
'value': {'value': {'value': 'Local'}}
})
mapped_asset_type = self._map_asset_type(parsed_filename['asset_type'])
if asset_type_field_id in fields_by_id:
set_domained_value(fields_by_id[asset_type_field_id], mapped_asset_type)
logger.info("Built {} fields from filename (folder-only mode)".format(len(fields)))
# STATE (forced to Local)
if 'FERRERO.FIELD.STATE' in fields_by_id:
set_domained_value(fields_by_id['FERRERO.FIELD.STATE'], 'Local')
# MAIN_LANGUAGES (tabular field — populate values array from language_code)
if parsed_filename.get('language_code') and 'MAIN_LANGUAGES' in fields_by_id:
language = parsed_filename['language_code'].upper()
fields_by_id['MAIN_LANGUAGES']['values'] = [
{
'cascading_domain_value': False,
'domain_value': True,
'is_locked': False,
'value': {
'expired_value': False,
'field_value': {'type': 'string', 'value': language},
'type': 'com.artesia.metadata.DomainValue'
}
}
]
logger.info("Set MAIN_LANGUAGES (folder-only mode): {}".format(language))
# VALIDITY DATES (Start = Today, End = Today + 1 Year)
try:
today = datetime.now()
one_year_later = today + timedelta(days=365)
start_date_str = today.strftime('%m/%d/%Y')
end_date_str = one_year_later.strftime('%m/%d/%Y')
if 'FERRERO.FIELD.ASSET VALIDITY START PERIOD' in fields_by_id:
set_string_value(fields_by_id['FERRERO.FIELD.ASSET VALIDITY START PERIOD'], start_date_str)
if 'FERRERO.FIELD.ASSET VALIDITY END PERIOD' in fields_by_id:
set_string_value(fields_by_id['FERRERO.FIELD.ASSET VALIDITY END PERIOD'], end_date_str)
except Exception as e:
logger.error("Failed to set validity dates in folder-only mode: {}".format(str(e)))
logger.info("Built {} fields from template (folder-only mode)".format(len(fields)))
return fields
@ -486,6 +962,72 @@ class MetadataExtractorMVP:
return field['value']['value']['field_value'].get('value')
return None
    def _apply_override_fields(self, mvp_fields, override_fields):
        """
        Apply pre-upload metadata overrides from the naming tool.

        For each non-empty entry in override_fields, map the editor field name
        to its DAM field ID via OVERRIDE_FIELD_MAP and write the value into the
        matching field in mvp_fields. Empty strings are skipped (treat as
        "user didn't set this, leave inherited value alone"). Validity dates
        from the editor arrive as ISO 8601 strings and are normalised to the
        MM/DD/YYYY format DAM expects.
        """
        if not override_fields:
            return mvp_fields
        applied = 0
        for editor_field, raw_value in override_fields.items():
            if raw_value is None or raw_value == '':
                continue
            dam_field_id = OVERRIDE_FIELD_MAP.get(editor_field)
            if not dam_field_id:
                logger.debug("Override: no DAM mapping for editor field '{}' - skipping".format(editor_field))
                continue
            value = raw_value
            if editor_field in DATE_OVERRIDE_FIELDS:
                value = self._normalize_iso_date(raw_value)
                if not value:
                    continue
            target = None
            for field in mvp_fields:
                if field.get('id') == dam_field_id:
                    target = field
                    break
            if target is None:
                logger.warning("Override: field {} (DAM id {}) not present in mvp_fields - skipping".format(
                    editor_field, dam_field_id
                ))
                continue
            if editor_field in DATE_OVERRIDE_FIELDS:
                self._set_date_field_value(target, value)
            else:
                self._set_field_value(target, value)
            logger.info("Override applied: {} ({}) = {}".format(editor_field, dam_field_id, value))
            applied += 1
        if applied:
            logger.info("Applied {} pre-upload override field(s) from naming tool".format(applied))
        return mvp_fields

    def _normalize_iso_date(self, iso_str):
        """Convert an ISO 8601 date string (with or without time/timezone) to MM/DD/YYYY."""
        if not iso_str:
            return None
        try:
            date_part = iso_str.split('T')[0]
            dt = datetime.strptime(date_part, '%Y-%m-%d')
            return dt.strftime('%m/%d/%Y')
        except Exception as e:
            logger.warning("Could not normalize override date '{}': {}".format(iso_str, str(e)))
            return None
def _set_field_value(self, field, value):
"""Set field value handling different structures"""
import json
@ -757,7 +1299,56 @@ class MetadataExtractorMVP:
}
})
logger.info("Added new Master Asset ID field: {}".format(master_field_id))
return mvp_fields
    def _add_master_asset_ids_field(self, mvp_fields, master_opentext_ids):
        """
        Add FERRERO.MASTERASSETIDS tabular field with multiple master asset IDs.
        Supports Many-to-Many relationship between derivatives and masters.

        Args:
            mvp_fields: List of MVP fields
            master_opentext_ids: List of DAM Asset IDs of master assets

        Returns:
            Updated mvp_fields list with FERRERO.MASTERASSETIDS
        """
        if not master_opentext_ids or len(master_opentext_ids) == 0:
            logger.info("No master_opentext_ids provided - skipping FERRERO.MASTERASSETIDS field")
            return mvp_fields
        # Check if field already exists
        for field in mvp_fields:
            if self._get_field_id(field) == 'FERRERO.MASTERASSETIDS':
                logger.info("FERRERO.MASTERASSETIDS already present - skipping")
                return mvp_fields
        # Build values array with all master asset IDs
        values = []
        for master_id in master_opentext_ids:
            values.append({
                'cascading_domain_value': False,
                'domain_value': False,
                'is_locked': False,
                'value': {
                    'type': 'string',
                    'value': master_id
                }
            })
        # Create tabular field
        new_field = {
            'id': 'FERRERO.MASTERASSETIDS',
            'parent_table_id': 'FERRERO.TABULAR.FIELD.MASTERASSETIDS',
            'type': 'com.artesia.metadata.MetadataTableField',
            'values': values
        }
        mvp_fields.append(new_field)
        logger.info("Added FERRERO.MASTERASSETIDS field with {} master asset ID(s): {}".format(
            len(values), ', '.join(master_opentext_ids[:3]) + ('...' if len(master_opentext_ids) > 3 else '')))
        return mvp_fields
def _get_field_id(self, field):
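The override flow added in this file can be illustrated outside the class with a minimal, self-contained sketch. It is not the project's implementation: the entries in `EXAMPLE_OVERRIDE_MAP`, the editor field names, `EXAMPLE_DATE_FIELDS`, and the simplified plain-string value structure are assumptions made for illustration, since the real `OVERRIDE_FIELD_MAP`, `DATE_OVERRIDE_FIELDS`, and `_set_field_value` bodies are not shown in this diff.

```python
from datetime import datetime

# Hypothetical editor-field -> DAM-field-ID map; the real OVERRIDE_FIELD_MAP
# is not visible in this diff, so these entries are illustrative only.
EXAMPLE_OVERRIDE_MAP = {
    'validity_end': 'FERRERO.FIELD.ASSET VALIDITY END PERIOD',
    'description': 'ARTESIA.FIELD.ASSET DESCRIPTION',
}
EXAMPLE_DATE_FIELDS = {'validity_end'}  # assumed name, stands in for DATE_OVERRIDE_FIELDS


def normalize_iso_date(iso_str):
    """Convert 'YYYY-MM-DD' (optionally followed by 'T...') to 'MM/DD/YYYY', or None."""
    if not iso_str:
        return None
    try:
        date_part = iso_str.split('T')[0]
        return datetime.strptime(date_part, '%Y-%m-%d').strftime('%m/%d/%Y')
    except ValueError:
        return None


def apply_overrides(fields, overrides):
    """Write non-empty override values into matching fields; return how many were applied."""
    by_id = {f['id']: f for f in fields}
    applied = 0
    for editor_field, raw in overrides.items():
        if raw is None or raw == '':
            continue  # empty editor value: leave the inherited value alone
        target = by_id.get(EXAMPLE_OVERRIDE_MAP.get(editor_field))
        if target is None:
            continue  # unmapped editor field, or DAM field absent from the payload
        value = normalize_iso_date(raw) if editor_field in EXAMPLE_DATE_FIELDS else raw
        if not value:
            continue  # unparseable date: skip rather than send a bad value
        target['value'] = {'value': {'type': 'string', 'value': value}}
        applied += 1
    return applied
```

Returning the applied count mirrors the logging in `_apply_override_fields` and gives a caller a simple way to decide whether an override row should later be marked as applied.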


@ -203,8 +203,28 @@ class MetadataExtractorMVP:
# Update the field
for field in mvp_fields:
if field.get('id') == field_id:
self._set_field_value(field, value)
logger.info("Updated {} from filename: {}".format(field_id, value))
# For tabular fields (like MAIN_LANGUAGES), update the 'values' array
# The DAM reads from 'values' (plural), not 'value' (singular)
if field.get('type') == 'com.artesia.metadata.MetadataTableField' or 'values' in field:
field['values'] = [
{
'cascading_domain_value': False,
'domain_value': True,
'is_locked': False,
'value': {
'expired_value': False,
'field_value': {
'type': 'string',
'value': value
},
'type': 'com.artesia.metadata.DomainValue'
}
}
]
logger.info("Updated tabular field {} values array from filename: {}".format(field_id, value))
else:
self._set_field_value(field, value)
logger.info("Updated {} from filename: {}".format(field_id, value))
break
# Apply country code mapping (ISO -> DAM codes)
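The tabular branch in this hunk writes a list of row dictionaries to `values` instead of a single `value`. A small sketch of that shape follows, assuming only the keys visible in the diff; the helper names `build_domain_value_row` and `set_tabular_values` are hypothetical and do not exist in the codebase.

```python
def build_domain_value_row(value):
    """Build one row for a tabular (MetadataTableField) 'values' array."""
    return {
        'cascading_domain_value': False,
        'domain_value': True,
        'is_locked': False,
        'value': {
            'expired_value': False,
            'field_value': {'type': 'string', 'value': value},
            'type': 'com.artesia.metadata.DomainValue',
        },
    }


def set_tabular_values(field, values):
    """Replace the field's 'values' array with one row per supplied value."""
    field['values'] = [build_domain_value_row(v) for v in values]
    return field


# Example: populate a MAIN_LANGUAGES-style field with a single language code.
field = {'id': 'MAIN_LANGUAGES', 'type': 'com.artesia.metadata.MetadataTableField', 'values': []}
set_tabular_values(field, ['EN'])
```

Factoring the row structure into one helper would also cover the identical literal used in folder-only mode, though whether that refactor suits the project is a judgment for its maintainers.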


@ -18,7 +18,7 @@ class Notifier:
self.config = config
self.enabled = config['notifications']['enabled']
# SMTP configuration (preferred method)
# SMTP configuration
smtp_config = config['notifications'].get('smtp', {})
self.smtp_server = smtp_config.get('server')
self.smtp_port = smtp_config.get('port', 587)
@ -26,6 +26,12 @@ class Notifier:
self.smtp_password = smtp_config.get('password')
self.sender_email = smtp_config.get('sender_email')
# Mailgun API configuration (preferred over SMTP when configured)
mailgun_config = config['notifications'].get('mailgun', {})
self.mailgun_api_key = mailgun_config.get('api_key')
self.mailgun_domain = mailgun_config.get('domain')
self.mailgun_sender = mailgun_config.get('sender_email') or self.sender_email
self.recipients = config['notifications']['recipients']
self.webhook_config = config.get('webhooks', {})
@ -43,8 +49,8 @@ class Notifier:
logger.info("Notifications disabled, skipping email")
return
if not self.smtp_server or not self.smtp_user:
logger.warning("SMTP not configured, skipping email")
if not self.mailgun_api_key and (not self.smtp_server or not self.smtp_user):
logger.warning("Neither Mailgun API nor SMTP configured, skipping email")
return
try:
@ -60,24 +66,59 @@ class Notifier:
<div style="background-color: #d4edda; border-left: 4px solid #28a745; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>Campaign:</strong> {{ campaign_name }} ({{ campaign_number }})</p>
<p style="margin: 5px 0 0 0;"><strong>Assets Downloaded:</strong> {{ asset_count }}</p>
<p style="margin: 5px 0 0 0;"><strong>Total Assets:</strong> {{ asset_count }}
{% if existing_asset_count and existing_asset_count > 0 %}
({{ existing_asset_count }} previously downloaded, <strong>{{ new_asset_count }} new this run</strong>)
{% endif %}
</p>
<p style="margin: 5px 0 0 0;"><strong>Status Updated:</strong> A1 → A2</p>
</div>
<h3 style="margin-top: 30px; color: #333;">Processed Assets:</h3>
{% for asset in processed_assets %}
<div style="border: 1px solid #ddd; margin: 15px 0; padding: 15px; background-color: #fafafa; border-radius: 4px;">
<div style="background-color: #28a745; color: white; padding: 10px 15px; margin: -15px -15px 15px -15px; border-radius: 4px 4px 0 0;">
<strong>{{ asset.asset_name }}</strong>
{% if new_assets is defined %}
{% if new_assets|length > 0 %}
<h3 style="margin-top: 30px; color: #28a745;">🆕 New This Run ({{ new_assets|length }}):</h3>
{% for asset in new_assets %}
<div style="border: 1px solid #ddd; margin: 15px 0; padding: 15px; background-color: #fafafa; border-radius: 4px;">
<div style="background-color: #28a745; color: white; padding: 10px 15px; margin: -15px -15px 15px -15px; border-radius: 4px 4px 0 0;">
<strong>{{ asset.asset_name }}</strong>
</div>
<div style="padding: 10px; background-color: white; border-radius: 4px;">
<p style="margin: 5px 0;"><span style="font-weight: bold;">Tracking ID:</span> <code>{{ asset.tracking_id }}</code></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box File ID:</span> {{ asset.box_file_id }}</p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box URL:</span> <a href="{{ asset.box_url }}">{{ asset.box_url }}</a></p>
{% if asset.folder_path %}<p style="margin: 5px 0;"><span style="font-weight: bold;">DAM Path:</span> {{ asset.folder_path }}</p>{% endif %}
</div>
</div>
<div style="padding: 10px; background-color: white; border-radius: 4px;">
<p style="margin: 5px 0;"><span style="font-weight: bold;">Tracking ID:</span> <code>{{ asset.tracking_id }}</code></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box File ID:</span> {{ asset.box_file_id }}</p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box URL:</span> <a href="{{ asset.box_url }}">{{ asset.box_url }}</a></p>
{% if asset.folder_path %}<p style="margin: 5px 0;"><span style="font-weight: bold;">DAM Path:</span> {{ asset.folder_path }}</p>{% endif %}
{% endfor %}
{% endif %}
{% if existing_assets is defined and existing_assets|length > 0 %}
<h3 style="margin-top: 30px; color: #666;">📁 Previously Downloaded ({{ existing_assets|length }}):</h3>
<div style="border: 1px solid #ddd; padding: 10px 15px; background-color: #f5f5f5; border-radius: 4px;">
<p style="margin: 0 0 8px 0; color: #666; font-size: 13px;">These files were already in Box from an earlier run and were skipped.</p>
<ul style="margin: 5px 0 0 0; padding-left: 20px; color: #555;">
{% for asset in existing_assets %}
<li style="margin: 3px 0;">{{ asset.asset_name }} <code style="color: #888; font-size: 11px;">({{ asset.tracking_id }})</code></li>
{% endfor %}
</ul>
</div>
</div>
{% endfor %}
{% endif %}
{% else %}
<h3 style="margin-top: 30px; color: #333;">Processed Assets:</h3>
{% for asset in processed_assets %}
<div style="border: 1px solid #ddd; margin: 15px 0; padding: 15px; background-color: #fafafa; border-radius: 4px;">
<div style="background-color: #28a745; color: white; padding: 10px 15px; margin: -15px -15px 15px -15px; border-radius: 4px 4px 0 0;">
<strong>{{ asset.asset_name }}</strong>
</div>
<div style="padding: 10px; background-color: white; border-radius: 4px;">
<p style="margin: 5px 0;"><span style="font-weight: bold;">Tracking ID:</span> <code>{{ asset.tracking_id }}</code></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box File ID:</span> {{ asset.box_file_id }}</p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box URL:</span> <a href="{{ asset.box_url }}">{{ asset.box_url }}</a></p>
{% if asset.folder_path %}<p style="margin: 5px 0;"><span style="font-weight: bold;">DAM Path:</span> {{ asset.folder_path }}</p>{% endif %}
</div>
</div>
{% endfor %}
{% endif %}
<div style="background-color: #d4edda; border-left: 4px solid #28a745; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong> Complete:</strong> All assets downloaded from DAM and uploaded to Box with tracking IDs.</p>
@ -111,7 +152,7 @@ class Notifier:
"""
},
'a2_to_a3_batch_complete': {
'subject': "A2→A3 Batch Upload Complete - {{ successful_count }}/{{ total_files }} Successful",
'subject': "A2→A3 Batch Upload Complete - {successful_count}/{total_files} Successful",
'html': """
<div style="font-family: Arial, sans-serif; max-width: 900px; margin: 0 auto;">
<div style="background-color: {% if failed_count == 0 %}#28a745{% else %}#ff9800{% endif %}; color: white; padding: 20px; text-align: center; border-radius: 8px 8px 0 0;">
@ -300,7 +341,7 @@ class Notifier:
<p style="margin: 5px 0 0 0;"><strong>Default Values Used:</strong></p>
<ul style="margin: 5px 0 0 20px; padding: 0;">
<li>Score: 0</li>
<li>URL: https://app.creativex.com/preflight/pretests</li>
<li>URL: None (no CreativeX URL sent)</li>
</ul>
<p style="margin: 10px 0 0 0; font-size: 12px; color: #666;">
<em>To add CreativeX score: Upload PDF report to Box folder 350605024645 and run creativex_scoring_storing.py</em>
@ -326,24 +367,61 @@ class Notifier:
<div style="background-color: #e3f2fd; border-left: 4px solid #1976d2; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>Campaign:</strong> {{ campaign_name }} ({{ campaign_number }})</p>
<p style="margin: 5px 0 0 0;"><strong>Campaign Type:</strong> Global Masters</p>
<p style="margin: 5px 0 0 0;"><strong>Assets Downloaded:</strong> {{ asset_count }}</p>
<p style="margin: 5px 0 0 0;"><strong>Total Assets:</strong> {{ asset_count }}
{% if existing_asset_count and existing_asset_count > 0 %}
({{ existing_asset_count }} previously downloaded, <strong>{{ new_asset_count }} new this run</strong>)
{% endif %}
</p>
<p style="margin: 5px 0 0 0;"><strong>Status Updated:</strong> B1 → B2</p>
</div>
<h3 style="margin-top: 30px; color: #333;">Processed Assets:</h3>
{% for asset in processed_assets %}
<div style="border: 1px solid #ddd; margin: 15px 0; padding: 15px; background-color: #fafafa; border-radius: 4px;">
<div style="background-color: #1976d2; color: white; padding: 10px 15px; margin: -15px -15px 15px -15px; border-radius: 4px 4px 0 0;">
<strong>{{ asset.asset_name }}</strong>
{% if new_assets is defined %}
{% if new_assets|length > 0 %}
<h3 style="margin-top: 30px; color: #1976d2;">🆕 New This Run ({{ new_assets|length }}):</h3>
{% for asset in new_assets %}
<div style="border: 1px solid #ddd; margin: 15px 0; padding: 15px; background-color: #fafafa; border-radius: 4px;">
<div style="background-color: #1976d2; color: white; padding: 10px 15px; margin: -15px -15px 15px -15px; border-radius: 4px 4px 0 0;">
<strong>{{ asset.asset_name }}</strong>
</div>
<div style="padding: 10px; background-color: white; border-radius: 4px;">
<p style="margin: 5px 0;"><span style="font-weight: bold;">Tracking ID:</span> <code>{{ asset.tracking_id }}</code></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box File ID:</span> {{ asset.box_file_id }}</p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box URL:</span> <a href="{{ asset.box_url }}">{{ asset.box_url }}</a></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">CreativeX Score:</span> {% if asset.creativex_score %}{{ asset.creativex_score }}{% if asset.creativex_url %} (<a href="{{ asset.creativex_url }}">View on CreativeX</a>){% endif %}{% else %}<span style="color: #999;">No CreativeX Score</span>{% endif %}</p>
{% if asset.folder_path %}<p style="margin: 5px 0;"><span style="font-weight: bold;">DAM Path:</span> {{ asset.folder_path }}</p>{% endif %}
</div>
</div>
<div style="padding: 10px; background-color: white; border-radius: 4px;">
<p style="margin: 5px 0;"><span style="font-weight: bold;">Tracking ID:</span> <code>{{ asset.tracking_id }}</code></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box File ID:</span> {{ asset.box_file_id }}</p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box URL:</span> <a href="{{ asset.box_url }}">{{ asset.box_url }}</a></p>
{% if asset.folder_path %}<p style="margin: 5px 0;"><span style="font-weight: bold;">DAM Path:</span> {{ asset.folder_path }}</p>{% endif %}
{% endfor %}
{% endif %}
{% if existing_assets is defined and existing_assets|length > 0 %}
<h3 style="margin-top: 30px; color: #666;">📁 Previously Downloaded ({{ existing_assets|length }}):</h3>
<div style="border: 1px solid #ddd; padding: 10px 15px; background-color: #f5f5f5; border-radius: 4px;">
<p style="margin: 0 0 8px 0; color: #666; font-size: 13px;">These files were already in Box from an earlier run and were skipped.</p>
<ul style="margin: 5px 0 0 0; padding-left: 20px; color: #555;">
{% for asset in existing_assets %}
<li style="margin: 3px 0;">{{ asset.asset_name }} <code style="color: #888; font-size: 11px;">({{ asset.tracking_id }})</code> &mdash; <span style="font-size: 12px;">CreativeX: {% if asset.creativex_score %}{{ asset.creativex_score }}{% else %}<span style="color: #999;">none</span>{% endif %}</span></li>
{% endfor %}
</ul>
</div>
</div>
{% endfor %}
{% endif %}
{% else %}
<h3 style="margin-top: 30px; color: #333;">Processed Assets:</h3>
{% for asset in processed_assets %}
<div style="border: 1px solid #ddd; margin: 15px 0; padding: 15px; background-color: #fafafa; border-radius: 4px;">
<div style="background-color: #1976d2; color: white; padding: 10px 15px; margin: -15px -15px 15px -15px; border-radius: 4px 4px 0 0;">
<strong>{{ asset.asset_name }}</strong>
</div>
<div style="padding: 10px; background-color: white; border-radius: 4px;">
<p style="margin: 5px 0;"><span style="font-weight: bold;">Tracking ID:</span> <code>{{ asset.tracking_id }}</code></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box File ID:</span> {{ asset.box_file_id }}</p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box URL:</span> <a href="{{ asset.box_url }}">{{ asset.box_url }}</a></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">CreativeX Score:</span> {% if asset.creativex_score %}{{ asset.creativex_score }}{% if asset.creativex_url %} (<a href="{{ asset.creativex_url }}">View on CreativeX</a>){% endif %}{% else %}<span style="color: #999;">No CreativeX Score</span>{% endif %}</p>
{% if asset.folder_path %}<p style="margin: 5px 0;"><span style="font-weight: bold;">DAM Path:</span> {{ asset.folder_path }}</p>{% endif %}
</div>
</div>
{% endfor %}
{% endif %}
<div style="background-color: #e3f2fd; border-left: 4px solid #1976d2; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong> Complete:</strong> All Global Master assets downloaded from DAM and uploaded to Box with tracking IDs.</p>
@ -378,6 +456,7 @@ class Notifier:
<div style="padding: 10px; background-color: white; border-radius: 4px;">
<p style="margin: 5px 0;"><span style="font-weight: bold;">Tracking ID:</span> <code>{{ asset.tracking_id }}</code></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box URL:</span> <a href="{{ asset.box_url }}">{{ asset.box_url }}</a></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">CreativeX Score:</span> {% if asset.creativex_score %}{{ asset.creativex_score }}{% if asset.creativex_url %} (<a href="{{ asset.creativex_url }}">View on CreativeX</a>){% endif %}{% else %}<span style="color: #999;">No CreativeX Score</span>{% endif %}</p>
{% if asset.folder_path %}<p style="margin: 5px 0;"><span style="font-weight: bold;">DAM Path:</span> {{ asset.folder_path }}</p>{% endif %}
</div>
</div>
@ -590,6 +669,125 @@ class Notifier:
</div>
"""
},
'a1_to_a2_no_assets_retry': {
'subject': "⚠️ No Assets Found (Attempt {retry_count}/3) - Campaign {campaign_name}",
'html': """
<div style="font-family: Arial, sans-serif; max-width: 900px; margin: 0 auto;">
<div style="background-color: #ff9800; color: white; padding: 20px; text-align: center; border-radius: 8px 8px 0 0;">
<h1 style="margin: 0;"> No Master Assets Found (Retry {{ retry_count }}/{{ max_retries }})</h1>
</div>
<div style="background-color: #fff3cd; border-left: 4px solid #ffc107; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>Campaign:</strong> {{ campaign_name }} ({{ campaign_number }})</p>
<p style="margin: 5px 0 0 0;"><strong>Campaign ID:</strong> {{ campaign_id }}</p>
<p style="margin: 5px 0 0 0;"><strong>Status:</strong> A1</p>
<p style="margin: 5px 0 0 0;"><strong>Retry Attempt:</strong> {{ retry_count }} of {{ max_retries }}</p>
</div>
<div style="padding: 20px; background-color: #f8f9fa; border-radius: 4px; margin: 20px 0;">
<h3 style="color: #ff9800; margin-top: 0;">Campaign Set to A1 but No Assets Found</h3>
<p>The Master Assets folder was searched (including subfolders) but no assets were found.</p>
<p>This campaign is set to status A1 but appears to have no master assets ready for download.</p>
</div>
<div style="background-color: #fff3cd; border-left: 4px solid #ffc107; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>📌 What Happens Next:</strong></p>
<ul style="margin: 10px 0;">
<li>This is attempt <strong>{{ retry_count }}</strong> of <strong>{{ max_retries }}</strong></li>
<li>System will retry automatically on next run (every 3 minutes)</li>
{% if retry_count < max_retries %}
<li><strong>{{ max_retries - retry_count }} attempt(s) remaining</strong> before marking as permanently failed</li>
{% else %}
<li style="color: #d32f2f;"><strong>WARNING: This is the final attempt!</strong> Next failure will mark campaign as permanently failed.</li>
{% endif %}
<li>Please verify assets exist in Master Assets folder</li>
</ul>
</div>
<p style="color: #666; font-size: 12px; margin-top: 20px;">A1→A2 script will retry automatically. No action needed unless this persists.</p>
</div>
"""
},
'a1_to_a2_no_assets_warning': {
'subject': "⚠️ Campaign in A1 with no assets yet - {campaign_name}",
'html': """
<div style="font-family: Arial, sans-serif; max-width: 900px; margin: 0 auto;">
<div style="background-color: #ff9800; color: white; padding: 20px; text-align: center; border-radius: 8px 8px 0 0;">
<h1 style="margin: 0;"> Campaign in A1 with No Assets Yet</h1>
</div>
<div style="background-color: #fff3cd; border-left: 4px solid #ffc107; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>Campaign:</strong> {{ campaign_name }} ({{ campaign_number }})</p>
<p style="margin: 5px 0 0 0;"><strong>Campaign ID:</strong> {{ campaign_id }}</p>
<p style="margin: 5px 0 0 0;"><strong>Status:</strong> A1</p>
<p style="margin: 5px 0 0 0;"><strong>Polls with empty folder:</strong> {{ poll_count }}</p>
</div>
<div style="padding: 20px; background-color: #f8f9fa; border-radius: 4px; margin: 20px 0;">
<h3 style="color: #ff9800; margin-top: 0;">Master Assets Folder Has Been Empty for ~1 Hour</h3>
<p>This campaign has been at status A1 for roughly an hour with no master assets in the folder.</p>
<p>This is often expected: the folder may have been created before assets were uploaded, and the system will keep checking automatically.</p>
<p>This is a <strong>one-time warning</strong>; no further emails will be sent for this campaign.</p>
</div>
<div style="background-color: #e3f2fd; border-left: 4px solid #1976d2; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>📌 Action only needed if:</strong></p>
<ul style="margin: 10px 0;">
<li>You expected assets to be uploaded already</li>
<li>The campaign was set to A1 by mistake (change the status in DAM)</li>
</ul>
<p style="margin: 10px 0 0 0;">Otherwise no action is needed; processing will start automatically as soon as assets appear in the Master Assets folder.</p>
</div>
<p style="color: #666; font-size: 12px; margin-top: 20px;">A1→A2 script will continue to check silently every 3 minutes.</p>
</div>
"""
},
'a1_to_a2_permanently_failed': {
'subject': "❌ PERMANENTLY FAILED - Campaign {campaign_name} (No Assets After 3 Attempts)",
'html': """
<div style="font-family: Arial, sans-serif; max-width: 900px; margin: 0 auto;">
<div style="background-color: #d32f2f; color: white; padding: 20px; text-align: center; border-radius: 8px 8px 0 0;">
<h1 style="margin: 0;"> CAMPAIGN PERMANENTLY FAILED</h1>
</div>
<div style="background-color: #ffebee; border-left: 4px solid #d32f2f; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>Campaign:</strong> {{ campaign_name }} ({{ campaign_number }})</p>
<p style="margin: 5px 0 0 0;"><strong>Campaign ID:</strong> {{ campaign_id }}</p>
<p style="margin: 5px 0 0 0;"><strong>Status:</strong> A1</p>
<p style="margin: 5px 0 0 0;"><strong>Failed Attempts:</strong> {{ retry_count }} / {{ max_retries }}</p>
</div>
<div style="padding: 20px; background-color: #f8f9fa; border-radius: 4px; margin: 20px 0;">
<h3 style="color: #d32f2f; margin-top: 0;">Campaign Marked as Permanently Failed</h3>
<p>After {{ max_retries }} consecutive attempts, the system was unable to find any master assets in the Master Assets folder.</p>
<p><strong>This campaign will no longer be processed automatically.</strong></p>
</div>
<div style="background-color: #ffebee; border-left: 4px solid #d32f2f; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>🔧 Required Actions:</strong></p>
<ol style="margin: 10px 0;">
<li>Verify the campaign should actually be in A1 status</li>
<li>Check if Master Assets folder exists and contains files</li>
<li>If this is a mistake, change campaign status to something else</li>
<li>If assets need to be added, add them to Master Assets folder</li>
<li><strong>Once fixed, manually reset the retry counter</strong></li>
</ol>
</div>
<div style="background-color: #e3f2fd; border-left: 4px solid #1976d2; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>💡 How to Reset This Campaign:</strong></p>
<p style="margin: 10px 0; padding: 15px; background-color: white; border-radius: 4px;">
To reset the status and retry this campaign, please contact support at: <br>
<strong><a href="mailto:optical@oliver.agency" style="color: #1976d2;">optical@oliver.agency</a></strong>
</p>
<p style="margin: 5px 0 0 0; font-size: 12px; color: #666;">Support will reset the retry counter and investigate the issue.</p>
</div>
<p style="color: #666; font-size: 12px; margin-top: 20px;">Automated processing stopped. Manual intervention required.</p>
</div>
"""
},
'b1_to_b2_no_assets': {
'subject': "⚠️ No Assets Found - Global Campaign {campaign_name}",
'html': """
@ -894,59 +1092,105 @@ class Notifier:
html_content = jinja_template.render(data)
subject = template['subject'].format(**data)
# 2. Create MIME message
if attachments:
# Use MIMEMultipart for attachments
message = MIMEMultipart()
message['From'] = self.sender_email
message['To'] = ", ".join(recipients) if isinstance(recipients, list) else recipients
message['Subject'] = subject
# Attach HTML body
message.attach(MIMEText(html_content, "html"))
# Attach files
from email.mime.base import MIMEBase
from email import encoders
import os
for file_path in attachments:
try:
if os.path.exists(file_path):
with open(file_path, "rb") as attachment:
part = MIMEBase("application", "octet-stream")
part.set_payload(attachment.read())
encoders.encode_base64(part)
filename = os.path.basename(file_path)
part.add_header(
"Content-Disposition",
f"attachment; filename= {filename}",
)
message.attach(part)
logger.info("Attached file: {}".format(filename))
else:
logger.warning("Attachment not found: {}".format(file_path))
except Exception as e:
logger.error("Failed to attach file {}: {}".format(file_path, str(e)))
else:
# Use standard MIMEText for simple emails
message = MIMEText(html_content, "html")
message['From'] = self.sender_email
message['To'] = ", ".join(recipients) if isinstance(recipients, list) else recipients
message['Subject'] = subject
# 2. Send via Mailgun API or SMTP
recipient_list = recipients if isinstance(recipients, list) else [recipients]
# 3. Send via SMTP
with smtplib.SMTP(self.smtp_server, self.smtp_port) as server:
server.starttls()
server.login(self.smtp_user, self.smtp_password)
server.send_message(message)
if self.mailgun_api_key and self.mailgun_domain:
self._send_via_mailgun_api(recipient_list, subject, html_content, attachments)
else:
self._send_via_smtp(recipient_list, subject, html_content, attachments)
logger.info("Email sent to {} (Template: {})".format(recipients, template_name))
except Exception as e:
logger.error("Failed to send email: {}".format(str(e)))
def _send_via_mailgun_api(self, recipient_list, subject, html_content, attachments=None):
"""Send email via Mailgun REST API - sends one request per recipient for reliable delivery"""
import os
url = "https://api.mailgun.net/v3/{}/messages".format(self.mailgun_domain)
# Normalize: split any comma-separated strings into individual addresses
normalized = []
for r in recipient_list:
for addr in r.split(','):
addr = addr.strip()
if addr:
normalized.append(addr)
for recipient in normalized:
files = []
try:
if attachments:
for file_path in attachments:
if os.path.exists(file_path):
files.append(("attachment", (os.path.basename(file_path), open(file_path, "rb"))))
else:
logger.warning("Attachment not found: {}".format(file_path))
data = {
"from": self.mailgun_sender,
"to": [recipient],
"subject": subject,
"html": html_content,
}
response = requests.post(
url,
auth=("api", self.mailgun_api_key),
data=data,
files=files if files else None,
timeout=30,  # fail fast instead of hanging on a stalled connection
)
response.raise_for_status()
logger.info("Mailgun API sent to {}: {}".format(recipient, response.json()))
except Exception as e:
logger.error("Mailgun API failed for {}: {}".format(recipient, str(e)))
finally:
for _, file_tuple in files:
file_tuple[1].close()
def _send_via_smtp(self, recipient_list, subject, html_content, attachments=None):
"""Send email via SMTP"""
import os
from email.mime.base import MIMEBase
from email import encoders
if attachments:
message = MIMEMultipart()
message['From'] = self.sender_email
message['To'] = ", ".join(recipient_list)
message['Subject'] = subject
message.attach(MIMEText(html_content, "html"))
for file_path in attachments:
try:
if os.path.exists(file_path):
with open(file_path, "rb") as attachment:
part = MIMEBase("application", "octet-stream")
part.set_payload(attachment.read())
encoders.encode_base64(part)
filename = os.path.basename(file_path)
# Pass filename as a header parameter so it is quoted/encoded correctly
part.add_header(
"Content-Disposition",
"attachment",
filename=filename,
)
message.attach(part)
logger.info("Attached file: {}".format(filename))
else:
logger.warning("Attachment not found: {}".format(file_path))
except Exception as e:
logger.error("Failed to attach file {}: {}".format(file_path, str(e)))
else:
message = MIMEText(html_content, "html")
message['From'] = self.sender_email
message['To'] = ", ".join(recipient_list)
message['Subject'] = subject
with smtplib.SMTP(self.smtp_server, self.smtp_port) as server:
server.starttls()
server.login(self.smtp_user, self.smtp_password)
server.send_message(message)
def send_webhook(self, url, payload):
"""
url: Webhook URL


@@ -0,0 +1,88 @@
#!/usr/bin/env python3
"""
Quick test: Send via Mailgun API with multiple recipients
to diagnose daily report delivery issue.
"""
import os
import sys
import requests
# Load from environment (same as production)
api_key = os.environ.get('MAILGUN_API_KEY')
domain = os.environ.get('MAILGUN_DOMAIN')
sender = os.environ.get('MAILGUN_SENDER_EMAIL') or os.environ.get('SENDER_EMAIL')
if not api_key or not domain:
print("ERROR: MAILGUN_API_KEY and MAILGUN_DOMAIN must be set")
sys.exit(1)
print("Using domain: {}".format(domain))
print("Using sender: {}".format(sender))
# Show only the last 4 characters so the key is not leaked into logs
print("API key: ...{}".format(api_key[-4:]))
print()
# Try both US and EU endpoints
endpoints = [
("US", "https://api.mailgun.net/v3/{}/messages".format(domain)),
("EU", "https://api.eu.mailgun.net/v3/{}/messages".format(domain)),
]
# First, find which endpoint works
working_url = None
for region, url in endpoints:
print("Testing {} endpoint: {}".format(region, url))
test_data = {
"from": sender,
"to": ["nick.viljoen@oliver.agency"],
"subject": "Mailgun Endpoint Test - {} Region".format(region),
"html": "<p>Testing {} endpoint</p>".format(region),
}
resp = requests.post(url, auth=("api", api_key), data=test_data, timeout=30)
print(" Status: {}".format(resp.status_code))
print(" Response: {}".format(resp.text[:500]))
if resp.status_code == 200:
working_url = url
print(" >>> {} endpoint works!".format(region))
break
print()
if not working_url:
print("\nERROR: Neither US nor EU endpoint accepted the API key.")
print("Check that MAILGUN_API_KEY is correct and the domain is verified.")
sys.exit(1)
print()
print("=" * 60)
print("Using working endpoint: {}".format(working_url))
print("=" * 60)
# --- Test 1: Comma-separated string in list (how daily report currently sends) ---
print()
print("TEST 1: Comma-separated string in list (current daily report format)")
data1 = {
"from": sender,
"to": ["nick.viljoen@oliver.agency,daveporter@oliver.agency"],
"subject": "Mailgun Test 1 - Comma-Separated in List",
"html": "<h2>Test 1</h2><p>Comma-separated string in list. If you see this, the current format works.</p>",
}
resp1 = requests.post(working_url, auth=("api", api_key), data=data1, timeout=30)
print(" Status: {}".format(resp1.status_code))
print(" Response: {}".format(resp1.text[:500]))
# --- Test 2: Multiple recipients as separate list items (proper format) ---
print()
print("TEST 2: Separate list items (proper format)")
data2 = {
"from": sender,
"to": ["nick.viljoen@oliver.agency", "daveporter@oliver.agency"],
"subject": "Mailgun Test 2 - Separate List Items",
"html": "<h2>Test 2</h2><p>Separate list items. If you see this, the split format works.</p>",
}
resp2 = requests.post(working_url, auth=("api", api_key), data=data2, timeout=30)
print(" Status: {}".format(resp2.status_code))
print(" Response: {}".format(resp2.text[:500]))
print()
print("=" * 60)
print("DONE - Check inboxes for both tests")
print("=" * 60)
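
The two tests above contrast a comma-separated string inside a list with separate list items, and `_send_via_mailgun_api` normalizes the former into the latter before sending. A minimal standalone sketch of that normalization step follows. The function name `normalize_recipients` and the `example.com` addresses are hypothetical and not part of this change; the sketch also accepts a bare string, which is an assumption beyond what the diff shows.

```python
def normalize_recipients(recipients):
    """Flatten a string or list of strings into individual addresses.

    Hypothetical helper mirroring the split/strip loop in
    _send_via_mailgun_api; it does not validate or de-duplicate.
    """
    # Assumption: allow a single string as well as a list of strings.
    if isinstance(recipients, str):
        recipients = [recipients]

    normalized = []
    for entry in recipients:
        # An entry may itself hold several comma-separated addresses.
        for addr in entry.split(","):
            addr = addr.strip()
            if addr:  # skip empty fragments such as a trailing comma
                normalized.append(addr)
    return normalized


if __name__ == "__main__":
    # The "Test 1" shape: one list item holding two addresses.
    print(normalize_recipients(["a@example.com, b@example.com"]))
```

With a helper like this, the "Test 1" and "Test 2" inputs would resolve to the same list of addresses before any request is built.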