Compare commits


58 commits
ppr ... main

Author SHA1 Message Date
nickviljoen
9e92db185a Feature: Apply naming-tool pre-upload metadata overrides on A2→A3 upload
The naming tool's metadata editor saves pre-upload overrides to the
override_metadata table (shared ferrero_tracking DB), but until now the
Python upload pipeline never read from it — every edit was being saved
but never applied to DAM. This wires up the consumer side so user edits
land on the uploaded asset.

- database.py: get_override_metadata() / mark_override_applied(),
  resilient to a missing override_metadata table on dev DBs
- metadata_extractor_mvp.py: OVERRIDE_FIELD_MAP (mirrors the naming
  tool's editor-field → DAM-field-ID map) + _apply_override_fields().
  Applied after master/filename/forced/CreativeX values but before
  asset_type_overrides, so EOL/LTD compliance still wins. Empty editor
  values are skipped, leaving the inherited value alone. ISO validity
  dates are normalised to MM/DD/YYYY for DAM
- a2_to_a3_upload_polling.py: lookup before building the asset rep,
  pass override_fields into build_mvp_asset_representation, mark
  applied only after confirmed upload success

Override priority: user edit > master metadata > forced defaults >
hardcoded today+365 validity — so the team's per-asset validity
period (e.g. 1 month) now flows through end-to-end.
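
The override handling described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in metadata_extractor_mvp.py: the function names, the flat dict shape, and the `validity_start` / `validity_end` keys are hypothetical, and it assumes the editor stores validity dates as plain `YYYY-MM-DD` strings.

```python
from datetime import datetime


def normalise_validity_date(iso_date: str) -> str:
    """Convert an ISO date (assumed YYYY-MM-DD) to the MM/DD/YYYY form DAM expects."""
    return datetime.strptime(iso_date, "%Y-%m-%d").strftime("%m/%d/%Y")


def apply_override_fields(fields: dict, overrides: dict,
                          date_keys=("validity_start", "validity_end")) -> dict:
    """Return a copy of `fields` with non-empty user overrides applied.

    Hypothetical helper: empty override values are skipped so the
    inherited value is left alone, and validity dates are normalised.
    """
    merged = dict(fields)
    for key, value in overrides.items():
        if value is None or str(value).strip() == "":
            continue  # empty editor value: keep the inherited value
        merged[key] = normalise_validity_date(value) if key in date_keys else value
    return merged
```

Under these assumptions, an override of `{"validity_end": "2026-06-19", "agency": ""}` would replace only the validity date and leave the inherited agency untouched.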

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 12:06:06 +02:00
nickviljoen
4e9fb6d18f Feature: Add check_campaign_status.py read-only status lookup
Wraps find_campaign_by_identifier() from update_campaign_status.py so
operators can query a campaign's current DAM status by number or partial
name without performing any updates.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 20:27:37 +02:00
nickviljoen
db35697091 Feature: Add Spotify (SPT) to social media codes
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 21:14:29 +02:00
nickviljoen
c12aef0eb1 Fix: Populate MAIN_LANGUAGES in folder-only mode (-N) uploads
Folder-only mode deep-copies the asset template with MAIN_LANGUAGES.values=[]
and never repopulated it from language_code, so the DAM rejected -N uploads
(SND/voiceover) with "Cannot set null value for a required field: MAIN_LANGUAGES".
Now mirrors the full-inheritance path's tabular values structure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 21:19:09 +02:00
nickviljoen
6d6213024a Fix: Merge A+B live campaigns into single CSV for OMG
OMG's Box automation treats each new live_campaigns_*.csv as a full-list
replacement, so the per-series global CSV introduced 2026-04-30 stomped
the local list whenever a B1→B2 ran. Collapse to one combined CSV
(A-series + B-series) emitted by every handler.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 17:36:43 +02:00
nickviljoen
28586308d7 Docs: Refresh A1 empty-folder doc and LTD asset type notes
A1_RETRY_LOGIC.md updated to reflect the 2026-04-28 rework: empty
folders are now treated as expected workflow (silent skip + one-time
warning at poll 20, no auto permanent-fail), while the original
3-strikes-then-permanently-fail behavior is preserved for genuine
folder errors via the mark_failed_at_max flag.

README.md adds LTD (Licensing Translation Document) to the asset type
override section alongside EOL, and notes that empty overrides remove
fields while non-empty overrides on non-MVP fields are appended.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 18:19:06 +02:00
nickviljoen
ba4f1a9bf7 Feature: Global live campaigns CSV + B4 closure flow
Wires B-series (global) campaigns into OMG using the same Box
automation as A-series. Mirrors the A1/A4 lifecycle for B1/B4.

- b1_to_b2_download: after B2 status update, mark live=YES status=B2
  and upload live_campaigns_global_<ts>.csv to the existing Box folder
  (BOX_LIVE_CAMPAIGNS_FOLDER_ID, 352181382858 in PROD). Filename keeps
  the live_campaigns_ prefix so the existing OMG automation rule picks
  it up.
- b4_box_uploader (new): polls DAM for status B4, marks live=NO, regens
  the global CSV. Mirrors a4_box_uploader.
- a4_box_uploader: reads prior status before overwriting; if it was
  B-series, regenerate the global CSV instead. b4_box_uploader does the
  symmetric A-series fallback. Defensive in case DAM doesn't enforce
  type-specific status transitions.
- database: add get_all_live_global_campaigns() (status LIKE 'B%').
  Tighten get_all_live_campaigns() to status LIKE 'A%' so any cross-type
  rows can't leak into the wrong CSV.
- orchestrator + orchestrator-prod: register B4 Box Uploader at 10min.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 18:12:49 +02:00
nickviljoen
b74c9c68aa Fix: EOL/LTD asset type overrides — IP Rights, CreativeX, descriptions
- LTD DAM code confirmed by client: licensingtranslationdocument (was placeholder)
- EOL + LTD: IP Rights forced to "No" (was "Yes")
- EOL + LTD: Remove CreativeX URL and score (not applicable to legal asset types)
- EOL: Description forced to "Legal Studio Name"
- Reorder _apply_asset_type_overrides() to run after _update_creativex_fields()
  so overrides have true final precedence (Box CreativeX was clobbering removals)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 13:24:19 +02:00
nickviljoen
5909e017a4 Reporting: Format CreativeX score as '100 (DV360)' in B1→B2 emails
DAM stores the CreativeX tabular cell as '<platform>^<score>', e.g.
'DV360^100'. Add format_cx_score_for_display() and apply at the point
where the email asset dict is built — both new-download and skipped
paths. Raw value stays in creativex_scores.quality_score so all platform
info is preserved for queries; only the email display is reshaped.
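
A minimal sketch of the reshaping described above, assuming the raw cell always uses a single `^` separator. The function name comes from the commit message, but the body and the handling of empty or separator-less values are assumptions.

```python
def format_cx_score_for_display(raw):
    """Turn a raw '<platform>^<score>' cell into '<score> (<platform>)'.

    Assumed behaviour: empty input falls back to the email placeholder
    text, and a value without '^' is returned unchanged.
    """
    if not raw:
        return "No CreativeX Score"
    platform, separator, score = raw.partition("^")
    if not separator:
        return raw
    return f"{score} ({platform})"
```

Because only the display string is derived, the raw value stored in `creativex_scores.quality_score` stays queryable by platform.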

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 12:04:27 +02:00
nickviljoen
8bf8dc1325 Fix: Recursively walk metadata_element_list when extracting CreativeX
Diagnostic confirmed FERRERO.TAB.FIELD.CREATIVEX (score) lives at depth 2
in B1 master metadata — nested under FERRERO.TABULAR.FIELD.CREATIVEX
inside a category — and FERRERO.FIELD.CREATIVEX LINK lives at depth 1.
The flat top-level walk used previously never reached them, so live B1
runs and the backfill both reported zero CX scores. Updated extractor
in b1_to_b2_download.py and the inline copy in
backfill_b1_creativex_scores.py to descend recursively.
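
The recursive descent can be illustrated with a short generator. This is a sketch only: it assumes each element is a dict whose identifier sits under an `id` key and whose children sit under `metadata_element_list`, and the helper names are hypothetical rather than those used in b1_to_b2_download.py.

```python
def walk_elements(elements, depth=0):
    """Yield (depth, element) for every element at any nesting level."""
    for element in elements or []:
        if not isinstance(element, dict):
            continue
        yield depth, element
        # Categories and tabular fields nest their children one level down.
        yield from walk_elements(element.get("metadata_element_list"), depth + 1)


def find_element_depths(metadata, wanted_ids):
    """Map each wanted element id (assumed key: 'id') to the depth where it was found."""
    top_level = (metadata or {}).get("metadata_element_list", [])
    return {
        element["id"]: depth
        for depth, element in walk_elements(top_level)
        if element.get("id") in wanted_ids
    }
```

With top-level categories counted as depth 0, this matches the depths reported by the diagnostic: the link field at depth 1 and the score field at depth 2.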

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:53:15 +02:00
nickviljoen
a463eb42f8 Diagnostic: Recursively walk nested metadata_element_list for CX search
Previous version only looked at top-level metadata_element_list, which
contains categories — actual fields nest under each category. Now
recursively descends through all nested metadata_element_list arrays
and counts every element_id at any depth, then searches the full set
for CX/score/quality hints. Reports max nesting depth and the depth at
which each CX-flavored ID was found.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:49:54 +02:00
nickviljoen
3c69e7545a Fix: Escape literal % in LIKE pattern in B1 metadata diagnostic
psycopg2 performs %-substitution when params are passed to execute(),
so 'M%' in the LIKE clause was being interpreted as a positional
placeholder, raising IndexError when there's only one real %s (LIMIT).
Escape as 'M%%' so it's preserved as a literal percent.
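
For illustration, the escaped form of such a query might look like the sketch below. The table and column names are taken from the neighbouring commit messages, but the exact query text and the `fetch_sample` helper are assumptions.

```python
# '%%' survives parameter substitution as a literal '%';
# the single '%s' is the only real placeholder (LIMIT).
SAMPLE_QUERY = """
    SELECT tracking_id, full_metadata
    FROM master_assets
    WHERE tracking_id LIKE 'M%%'
      AND local_campaign_id IS NULL
    LIMIT %s
"""


def fetch_sample(cursor, limit=5):
    """Run the diagnostic query on an open DB-API cursor (hypothetical helper)."""
    cursor.execute(SAMPLE_QUERY, (limit,))
    return cursor.fetchall()
```

Passing the pattern itself as a bound parameter (`LIKE %s` with `'M%'` in the parameter tuple) would avoid the escaping altogether.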

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:47:21 +02:00
nickviljoen
23bcc057c5 Diagnostic: Inspect B1 master metadata structure for CX fields
Read-only script that samples B1 global masters from master_assets and
reports: top-level keys in full_metadata, presence of
metadata.metadata_element_list, and any element_ids matching
creativex/cx/score/quality (case-insensitive). Helps diagnose why the CX
backfill found 0 matches — distinguishes "client masters have no CX
score yet" from "CX field uses a different element_id than A1".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:45:35 +02:00
nickviljoen
b9d5ac9feb Backfill: One-shot script to populate CX scores for existing B1 masters
Walks master_assets for B1 global masters (tracking_id LIKE 'M%' AND
local_campaign_id IS NULL), extracts CreativeX score from full_metadata
JSONB, and inserts into creativex_scores with status='b1-master-cx-score'.
Idempotent — relies on the existing tracking_id dedup in
db.store_creativex_score, so re-runs are safe. Supports --dry-run for
preview before applying.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:40:48 +02:00
nickviljoen
f28b5221f7 Enhancement: Capture CreativeX score on B1→B2 global masters
Extracts CreativeX score and URL from DAM master metadata during the
B1→B2 download, persists to creativex_scores with new status
'b1-master-cx-score' (dedup by tracking_id), and surfaces the score in
the b1_to_b2_complete and b1_to_b2_partial emails — falling back to
"No CreativeX Score" when the master has no score yet. Skipped
already-downloaded assets backfill from full_metadata JSONB on next pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:31:07 +02:00
nickviljoen
74977f2366 Rename: SDA asset type → LTD (Licensing Translation Document)
Renames the asset type code introduced in 0f49cc6 from SDA (Supporting
Documents for Approval) to LTD (Licensing Translation Document). All
field overrides and the fixed Description value are unchanged.

DAM-side asset type code remains externallegalopinion as a placeholder
pending client confirmation; will update in a follow-up commit if the
DAM code differs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 21:05:44 +02:00
nickviljoen
0f49cc6cbc Enhancement: SDA (Supporting Documents for Approval) asset type
Adds SDA as a new asset type for License claim translations supporting
the EOL (External Legal Opinion) workflow.

- SDA maps to externallegalopinion in DAM (same as EOL).
- Field overrides match EOL (Agency = "-", Prod Company = "-",
  Languages = Global, IP Rights = Yes, Licensing = No, validity dates
  removed) plus a fixed Description: "Translation of License claim -
  For approval purposes only".
- Added asset_type_overrides section to field_mappings_ppr.yaml; it
  was missing, so EOL overrides weren't actually applying on PPR.
  Both EOL and SDA blocks are now defined for both PPR and PROD.
- _apply_asset_type_overrides now appends a simple string field when
  the override targets a field not yet in mvp_fields, so the SDA
  description is set even if the filename has no subject_title.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 16:08:03 +02:00
nickviljoen
90f326aecb Enhancement: Treat empty A1 folders as expected workflow
Campaign managers often create the campaign in DAM before assets are
uploaded, so an empty Master Assets folder is the normal pre-asset state
rather than a failure. Stop marking these as permanently failed and stop
emailing on every poll.

- increment_a1_retry() gains mark_failed_at_max param; empty-folder path
  passes False so the campaign keeps polling indefinitely until assets
  appear (or the DAM status changes).
- Empty-folder branch now skips silently on every poll and sends a single
  warning email at poll 20 (~1 hour at the 3-min cadence) so genuinely
  stuck campaigns still surface.
- New a1_to_a2_no_assets_warning email template — one-time soft warning,
  no permanent-failure language.
- Existing reset_a1_retry() on successful A1→A2 still clears the counter
  when assets eventually appear.
- Other folder-error paths (folder not found, etc.) keep the original
  3-retry-then-fail behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 15:20:41 +02:00
nickviljoen
ab557b78de Fix: Skip permanently-failed campaigns before A1 per-run cap
The A1→A2 uploader processes up to 2 campaigns per run. Permanently-failed
campaigns were skipped only inside the loop, so they still consumed slots
and could starve the rest of the queue indefinitely. Filter them out
before the slice so eligible campaigns get processed.
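
The ordering fix amounts to filtering before slicing. A minimal sketch, assuming campaigns are dicts carrying an `a1_permanently_failed` flag (the function name and the dict shape are hypothetical):

```python
def select_campaigns_for_run(campaigns, per_run_cap=2):
    """Drop permanently-failed campaigns first, then apply the per-run cap."""
    eligible = [c for c in campaigns if not c.get("a1_permanently_failed")]
    return eligible[:per_run_cap]
```

Slicing first and skipping inside the loop would let two failed campaigns at the head of the queue consume every slot on every run.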

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 14:54:36 +02:00
nickviljoen
2c06f3936f Reporting: Split new vs previously-downloaded assets in A1→A2 / B1→B2 emails
When a campaign is re-opened (status reset to A1/B1 after new files are
added), the tool correctly skips already-downloaded assets but the email
report and CSV previously listed the whole folder as "processed", which
was misleading. Reports now show "Total: 14 (12 previously downloaded,
2 new this run)" with new assets in full detail and previously-downloaded
assets in a compact list. B1→B2 CSV gains a Status column matching A1→A2.
2026-04-23 14:11:00 +02:00
nickviljoen
d83e41707c Docs: Update README with asset type mapping changes and current date
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 17:44:55 +02:00
nickviljoen
455cc1bf2a Update asset type mappings per Scaling Agencies Metadata List
Remove 9 deprecated types (CID, ECB, EBS, EOP, EUG, EWB, FPO, PKI, PRI),
add 9 new types (EAN, ESI, NTB, PIR, PKC, PKT, SCP, SNC, UPI), and update
DAT DAM code from digitalassettoolkit to digitalasset. Display names updated
to match current client naming conventions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 17:40:21 +02:00
nickviljoen
695eefadf3 Fix: Recurse into subfolders with numeric extensions (e.g. "2.0")
DAM subfolder "WND_PCS 2026 2.0" was being treated as a file because
".0" was not in the known extensions list and defaulted to is_folder=False.
This caused an HTTP 404 on download since it's a folder, not a file.

Added numeric-only extension check (.0, .1, etc.) to the folder detection
logic so the script correctly recurses into versioned subfolders and
downloads the assets inside them.
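
A rough sketch of the numeric-extension check follows. The function name is hypothetical, and treating names without any extension as folders is an assumption; the real detection logic also consults a known-extensions list that is not reproduced here.

```python
import os


def looks_like_folder(name: str) -> bool:
    """Heuristic: treat '.0', '.1', ... suffixes as version numbers, not file types."""
    _, ext = os.path.splitext(name)
    if not ext:
        return True  # assumption: no extension at all means a folder
    # 'WND_PCS 2026 2.0' splits into ('WND_PCS 2026 2', '.0')
    return ext[1:].isdigit()
```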

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 09:46:32 +02:00
nickviljoen
0408d282a5 Revert "Fix: Skip subfolders with numeric extensions in B1→B2 downloads"
This reverts commit 4dff200e10.
2026-04-10 09:44:41 +02:00
nickviljoen
4dff200e10 Fix: Skip subfolders with numeric extensions in B1→B2 downloads
DAM subfolder "WND_PCS 2026 2.0" was being treated as a downloadable
asset because ".0" passed the existing extension check. Added safeguard
to skip items with numeric-only extensions (e.g. .0, .1) which are
version numbers in folder names, not real files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 09:42:29 +02:00
nickviljoen
39a495e4cc Fix: Skip already-processed assets on B1→B2 retry runs
Previously the script re-downloaded and re-uploaded all assets on every
retry, even those already successfully stored in DB and Box. For large
campaigns (1300+ assets) this caused unnecessary load and duplicate uploads.

Now checks DB via find_global_master_by_opentext_id() before downloading.
Assets already in DB with a valid Box URL are skipped and counted toward
the processed total, so only genuinely failed assets are retried.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 09:07:07 +02:00
nickviljoen
03c5ab65a8 Docs: Update README and CLAUDE.md with folder-only template and EOL workflow
Added documentation for template-based folder-only mode (-N flag),
asset type overrides (EOL), environment-specific field mappings,
and updated config file references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 21:33:35 +02:00
nickviljoen
95edece5f3 Enhancement: EOL (External Legal Opinion) workflow
Adds EOL as a new asset type with field overrides for both PPR and PROD:
- Asset type maps to 'externallegalopinion' in DAM
- Agency Name = "-", Production House = "-"
- Main Languages = "Global"
- IP Rights = "Yes", Licensing = "No"
- Validity dates removed
Also adds VOD platform code and removes OLV asset type.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 15:53:37 +02:00
nickviljoen
33e71be453 Fix: Template-based folder-only mode for -N flag uploads
Folder-only mode (-N suffix files) was sending minimal metadata that DAM
rejected with "unmarshalling parameter" error. Now uses a reference
asset_representation_template.json as the base for all metadata fields,
ensuring the full field structure (column_name, data_type, domain_id, etc.)
the DAM API requires. Also fixes default/forced value handling to use
DomainValue format for domained fields from the template.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 15:53:10 +02:00
nickviljoen
5905f3262a Fix: Folder-only mode metadata format for PROD DAM compatibility
Folder-only mode (-N suffix files) was sending simplified metadata that
PROD DAM rejected with "unmarshalling parameter" error. Updated to use
DomainValue format for domained fields, correct asset type field ID
(FERRERO.FIELD.MKTG.ASSET TYPE), asset type code mapping (e.g. SND→sound),
validity dates, and forced values from config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 12:31:02 +02:00
nickviljoen
51e915e67c Add global_master_tracking_id to link A1→A2 local assets to B1→B2 global masters
A1→A2 now looks up the opentext_id in master_assets for an M-prefixed record
from B1→B2 and stores it as global_master_tracking_id on the local asset record.
This provides traceability from local campaign assets back to their global master
without changing any existing workflow logic or DAM metadata.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 13:12:55 +02:00
nickviljoen
78a4ca0976 Fix: CreativeX score supersede now matches base filename ignoring timestamp suffix
Previously, re-scored assets with a DAM timestamp suffix (e.g. _2026-03-13-05-53-36)
were treated as new files, leaving multiple 'active' records. Now strips the timestamp
and uses LIKE matching so all variants of the same base asset are properly superseded.
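
Stripping the suffix could look like the sketch below. The regular expression is inferred from the single example in the commit message (`_2026-03-13-05-53-36`), and the helper name is hypothetical; the LIKE query itself is not shown.

```python
import re

# '_YYYY-MM-DD-HH-MM-SS' immediately before the extension or at end of name.
TIMESTAMP_SUFFIX = re.compile(r"_\d{4}(?:-\d{2}){5}(?=\.[^.]+$|$)")


def strip_dam_timestamp(filename: str) -> str:
    """Return the base filename with any trailing DAM timestamp suffix removed."""
    return TIMESTAMP_SUFFIX.sub("", filename)
```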

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 21:12:50 +02:00
nickviljoen
4dded5de14 Fix: Send Mailgun API emails one recipient at a time
Mailgun silently drops emails with multiple recipients in the to field.
Send individual API calls per recipient and split comma-separated addresses.
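
The per-recipient approach can be sketched as below. Both helper names are hypothetical, and `send_one` stands in for whatever function performs the actual Mailgun API call; no Mailgun request details are assumed here.

```python
def split_recipients(recipients):
    """Flatten a string or list of possibly comma-separated addresses."""
    if isinstance(recipients, str):
        recipients = [recipients]
    addresses = []
    for entry in recipients:
        for address in entry.split(","):
            address = address.strip()
            if address:
                addresses.append(address)
    return addresses


def send_individually(recipients, send_one):
    """Call send_one(address) once per recipient and return the addresses used."""
    addresses = split_recipients(recipients)
    for address in addresses:
        send_one(address)
    return addresses
```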

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 13:39:55 +02:00
nickviljoen
e6a6357403 Update Mailgun test: try US/EU endpoints, handle non-JSON errors
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 13:29:27 +02:00
nickviljoen
467a735e94 Add Mailgun recipient format test script
Diagnose daily report email delivery issue - tests single recipient,
comma-separated string in list, and properly split list formats.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 12:54:44 +02:00
nickviljoen
dc779724fc Add Mailgun API support for PROD email notifications
Mailgun API is used when MAILGUN_API_KEY and MAILGUN_DOMAIN are set,
with SMTP as fallback for PPR. Also fixes A2→A3 batch subject line
that was rendering Jinja2 syntax literally instead of substituting values.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 14:39:16 +02:00
nickviljoen
96b33fa084 Fix: Correct MARKETING_TAG parent_table_id in folder-only mode
Was generating FERRERO.TABULAR.FIELD.MARKETING_TAG (underscore) but DAM
expects FERRERO.TABULAR.FIELD.MARKETING.TAG (dot). Added explicit mapping
for tabular field parent table IDs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 16:13:11 +02:00
nickviljoen
6bc1b397d0 Fix: Use simple value structure for non-domain default fields in folder-only mode
VIDEO_POST_PROD_COMPANY and AUDIO_POST_PROD_COMPANY are not domain fields
but were being wrapped with DomainValue, causing unmarshalling errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 16:07:21 +02:00
nickviljoen
6e0bb08a5f Fix: Add type field to folder-only mode (-N) metadata values for DAM API
The _build_fields_from_filename method was using {"value": "..."} without
the required {"type": "string", "value": "..."} structure, causing
unmarshalling errors on the DAM API for -N suffix uploads.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 16:03:02 +02:00
nickviljoen
faa33cf44f Fix: Use DomainValue wrapper for non-tabular default fields in folder-only mode (-N)
Fixes unmarshalling error on DAM upload when using -N suffix files. The API
requires the DomainValue structure when domain_value is true.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 15:30:46 +02:00
nickviljoen
8299a87180 Fix: Update MAIN_LANGUAGES values array for tabular fields in DAM upload
The filename_updates logic was only updating field['value'] (singular) but for
tabular fields like MAIN_LANGUAGES, the DAM reads from field['values'] (plural
array). This caused the master's original language (e.g. "Global") to persist
instead of the correct language from the filename (e.g. "PL").

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 17:26:31 +02:00
nickviljoen
63e42d1196 Fix: Don't send generic CreativeX URL when no score exists
When no CreativeX score is found for a file, the system was sending a
generic placeholder URL (app.creativex.com/preflight/pretests) to the DAM.
Now sends no URL at all, so only files with actual CreativeX scores get a URL.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 17:42:57 +02:00
nickviljoen
74141689e6 Enable FERRERO.MASTERASSETIDS and multi-master support for PROD
Remove PPR-only gates so PROD supports the same MASTERASSETIDS tabular
field and multi-master ID parsing as PPR. DAM deployment scheduled for
Feb 18 — do not push until then.

Changes:
- filename_parser: Remove is_ppr check, allow multi-master ID parsing in PROD
- a2_to_a3: Populate master_opentext_ids for single-master PROD case
- dam_client: Remove PPR-only skip on domain registration
- metadata_extractor_mvp: Update docstrings only

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 18:12:30 +02:00
nickviljoen
f6c84762ae Fix: Map CreativeX API channel/publisher to DAM platform names for PROD
The new CreativeX API format stores channel/publisher at the top level
of full_extraction_data instead of inside a data.ferrero_mapped_platforms
wrapper. Add fallback mapping so platforms are correctly populated for
DAM uploads.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 17:43:58 +02:00
nickviljoen
052558961a Revert "Fix: Add YouTube platform mapping and social media code fallback for CreativeX"
This reverts commit 799b6d50e8.
2026-02-13 17:17:06 +02:00
nickviljoen
799b6d50e8 Fix: Add YouTube platform mapping and social media code fallback for CreativeX
YouTube Ads was missing from the DAM-CX mappings CSV, causing empty
Platform > Rating fields for YouTube assets. Also adds a fallback that
derives the CreativeX platform from the filename social media code (e.g.
YTA -> YouTube) when the database has no mapped platforms.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 17:00:47 +02:00
nickviljoen
9dbb7ce8d9 Revert "Fix: Re-enable FERRERO.MASTERASSETIDS field for PROD single-master uploads"
This reverts commit ea85749e0a.
2026-02-13 16:41:57 +02:00
nickviljoen
ea85749e0a Fix: Re-enable FERRERO.MASTERASSETIDS field for PROD single-master uploads
Populates master_opentext_ids for single-master case so uploads use the
tabular FERRERO.MASTERASSETIDS field instead of the ARTESIA.FIELD.ASSET_ID
fallback. Reverts the workaround from 6517a4f now that the field is being
configured in PROD DAM.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 14:20:01 +02:00
nickviljoen
98826d51c4 Fix: CreativeX tracking ID fallback, filename stripping, and social media codes
CreativeX lookup now falls back to tracking ID search when filename match fails
(handles mismatched naming from CreativeX PDFs). strip_upload_components now
only removes job number and tracking ID, keeping social media codes (YTA, DV3,
etc.) in the clean filename. Updated SOCIAL_MEDIA_CODES from 4 to 39 codes
sourced from the Ferrero naming tool.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 13:24:23 +02:00
nickviljoen
6517a4f83f Fix: Skip FERRERO.MASTERASSETIDS field on PROD - field not yet configured
PROD DAM rejects FERRERO.MASTERASSETIDS as it only exists in PPR. Remove the
single-master-to-list conversion so PROD uses the existing single-ID field
(master_opentext_id) instead. Will be re-added when client configures the
tabular field in PROD.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 12:33:43 +02:00
nickviljoen
27916062ff Fix: Pass notifier to process_box_file and use case-sensitive Master ID check
The notifier variable was referenced inside process_box_file but never passed
as a parameter, causing NameError for any file hitting the Master Tracking ID
check. Also changed the check from case-insensitive (.upper().startswith('M'))
to case-sensitive (.startswith('M')) to avoid false positives on random tracking
IDs like mviSv5.
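
The difference between the two checks is easy to demonstrate. The function names below are hypothetical wrappers around the expressions quoted in the commit message.

```python
def is_master_id_old(tracking_id: str) -> bool:
    """Previous, case-insensitive check."""
    return tracking_id.upper().startswith("M")


def is_master_id(tracking_id: str) -> bool:
    """Current, case-sensitive check: only a capital 'M' marks a master ID."""
    return tracking_id.startswith("M")
```

A random tracking ID such as `mviSv5` passes the old check but not the new one.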

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 11:49:21 +02:00
nickviljoen
636b555d9d Fix: Define master_opentext_ids variable in A2→A3 and add multi-master support
The PROD a2_to_a3 script referenced master_opentext_ids without defining it,
causing NameError for all file uploads. Brings in multi-master tracking ID
support from PPR: filename parser handles multiple IDs (PPR) or single ID
(PROD), metadata extractor supports MASTERASSETIDS tabular field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 11:37:18 +02:00
nickviljoen
d72d37a83d Enhancement: Campaign re-opening support and PPR master asset ID registration
A1→A2 now handles re-processing when campaign is reset to A1 after adding new
master assets. Existing assets reuse tracking IDs and skip Box upload, new assets
are processed normally. Also includes PPR domain registration for multiple master
asset IDs in a2_to_a3 and dam_client.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 21:07:13 +02:00
nickviljoen
57b4df2799 Security: Remove database password from permanently failed email template
Replace exposed database credentials and SQL commands in A1 permanently failed notification email with support contact information (optical@oliver.agency).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-02 07:24:49 +02:00
nick.viljoen
fc9539d4b5 Security: Add .env files to .gitignore
.env files contain sensitive credentials and should never be committed to git.
Removed .env-prod from tracking while preserving local file.
2026-01-31 18:07:44 +00:00
nickviljoen
c90032b1d9 Fix: A1 retry logic now catches folder not found errors
Problem:
- Retry logic only triggered for empty folders (total_assets == 0)
- When "Master Assets" folder doesn't exist, error thrown BEFORE retry check
- Exception caught by outer try/except, sent old upload_failed template
- No database tracking, emails sent every 3 minutes indefinitely

Solution:
- Added retry logic to outer exception handler
- Detects folder/assets errors and applies same 3-attempt tracking
- Now handles both: (1) folder doesn't exist, (2) folder is empty
- Database tracking works for both scenarios

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-31 19:34:29 +02:00
nickviljoen
e1f15ea632 Add A1 retry logic and orchestrator off-hours cadence
Feature 1: A1→A2 Empty Folder Retry Logic
- Track retry attempts (max 3) for campaigns with no master assets
- Mark campaigns as permanently failed after 3 attempts
- Stop processing and sending emails for permanently failed campaigns
- Two new email templates: retry notification and permanent failure
- Database migration adds 4 new columns to campaign_status table
- Comprehensive documentation in A1_RETRY_LOGIC.md

Feature 2: Orchestrator Off-Hours Cadence
- Add 30 minutes to all task intervals during off-hours
- Off-hours: 10 PM - 5 AM weekdays + all day Saturday/Sunday
- Tasks only run at minutes 0 and 30 during off-hours
- Configurable and easy to enable/disable
- Daily Report (7 PM) remains unchanged
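
One way to read the off-hours rule is sketched below. It assumes the window is 22:00 to 04:59 local time on weekdays plus all of Saturday and Sunday, and that the 30 minutes are simply added to the base interval; the function names are hypothetical and the actual orchestrator-prod.py logic may differ.

```python
from datetime import datetime

OFF_HOURS_EXTRA_MINUTES = 30  # value from the commit message


def is_off_hours(now: datetime) -> bool:
    """True on weekends, or between 10 PM and 5 AM on weekdays (assumed boundaries)."""
    if now.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    return now.hour >= 22 or now.hour < 5


def effective_interval(base_minutes: int, now: datetime) -> int:
    """Base task interval, lengthened by 30 minutes during off-hours."""
    return base_minutes + OFF_HOURS_EXTRA_MINUTES if is_off_hours(now) else base_minutes
```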

Files changed:
- NEW: database/migrations/003_add_a1_retry_tracking.sql
- NEW: MARKDOWN_DOCS/A1_RETRY_LOGIC.md
- MODIFIED: scripts/shared/database.py (added 3 methods)
- MODIFIED: scripts/a1_to_a2_box_uploader.py (added retry logic)
- MODIFIED: scripts/shared/notifier.py (added 2 templates)
- MODIFIED: scripts/orchestrator-prod.py (added off-hours config)
- MODIFIED: RUN_ORCHESTRATOR.md (added off-hours docs)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-31 17:38:57 +02:00
nickviljoen
b7e0430636 Fix: Prevent DAM folder creation attempts causing timeouts
Remove folder creation logic in get_or_create_subfolder_path() since DAM does not allow folder creation via API. When a subfolder doesn't exist, upload to the parent folder instead of attempting to create it (which was causing 120 second timeouts).

This resolves upload failures in PROD environment during A2→A3 workflow.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-30 10:34:53 +02:00
34 changed files with 2704 additions and 1115 deletions


@@ -1,66 +0,0 @@
# Ferrero Automation Environment Variables
# Environment (staging or production)
ENV=prod
# DAM Credentials - OAuth2 (default authentication)
DAM_BASE_URL=https://dam.ferrero.com/otmmapi
DAM_AUTH_URL=https://dam.ferrero.com/otdsws/oauth2/token
DAM_CLIENT_ID=otds-OLV
DAM_CLIENT_SECRET=hs28LZ9ZzQ5I9rlW3P7Wwyw85oOatlC1
# DAM mTLS Certificate (optional - only used with --auth-pfx flag)
DAM_MTLS_BASE_URL=https://prod-auth.app-api.ferrero.com/00003/mm/token
DAM_MTLS_CERT_PATH=config/certificates/SAP-XX-Orange-Logic-to-APP-APIM-prod.pfx
DAM_MTLS_CERT_PASSWORD=(aP5IzJdg1d)e)V39Sq5k]13LwO[49D43#iR{}ks
# Box Credentials
BOX_CLIENT_ID=l2atwxxq4xna7phcjr2uifm4mbah69qp
BOX_CLIENT_SECRET=6XcuCQ6akpk9daE0UHaGSv3mSxWaER4l
BOX_JWT_KEY_ID=n1izyn3l
BOX_PASSPHRASE=971585f5fd6171428c14a7c8899af5ab
BOX_ENTERPRISE_ID=43984435
# Box Folder Configuration
BOX_ROOT_FOLDER_A1_A2=348304357505
BOX_ROOT_FOLDER_A2_A3=348526703108
BOX_ROOT_FOLDER_B1_B2=349261192115
BOX_ROOT_FOLDER_CREATIVEX=350605024645
# Database
DB_HOST=localhost
DB_PORT=5437
DB_USER=ferrero_user
DB_PASSWORD=ferrero_pass_2025
# Mailgun / SMTP (for email notifications)
SMTP_SERVER=smtp.mailgun.org
SMTP_PORT=587
SMTP_USER=twist@mail.dev.oliver.solutions
SMTP_PASSWORD=102115e9f3b9d7332d0cd1d4329bc0d4-77751bfc-ca066b71
SENDER_EMAIL=TWIST-UK-SERVER@oliver.agency
ERROR_EMAIL=daveporter@oliver.agency
REPORT_EMAILS=daveporter@oliver.agency
# Mailgun API (alternative to SMTP)
MAILGUN_API_KEY=your_mailgun_api_key_here
MAILGUN_DOMAIN=mail.dev.oliver.solutions
# Webhook Configuration
CAMPAIGN_STATUS_WEBHOOK_URL=https://hook.us1.make.celonis.com/3f9ztwl8qnljufo0l65utfv5wvvnt9m5
WEBHOOK_AUTH_TOKEN=
WEBHOOK_RECEIVER_PORT=5555
BOX_WEBHOOK_PRIMARY_KEY=your_box_webhook_primary_key
BOX_WEBHOOK_SECONDARY_KEY=your_box_webhook_secondary_key
# CreativeX Configuration
LLAMA_CLOUD_API_KEY=llx-EDmfh0ZReUbXUbaa5i5275TAP2LznNDqc3skJRL3HY4RUDcf
CREATIVEX_AGENT_NAME=Creativex-Extract
BOX_LIVE_CAMPAIGNS_FOLDER_ID=352181382858
# DAM mTLS V2 (Hybrid)
DAM_MTLS_OAUTH_URL=https://prod-auth.app-api.ferrero.com/00003/mm/token
# Master Asset ID Field Configuration
MASTER_ASSET_ID_FIELD=ARTESIA.FIELD.ASSET_ID


@@ -5,3 +5,5 @@ temp/
logs/
.DS_Store
.env
.env-prod
.env


@@ -0,0 +1,324 @@
# A1→A2 Empty Folder Handling
**Purpose:** Avoid spam emails and false-positive permanent failures for the common workflow where campaign managers create an A1 campaign before uploading the master assets.
**Initial implementation:** January 31, 2026
**Reworked:** April 28, 2026 — empty folders are now treated as expected client workflow rather than failures.
**Related files:**
- `scripts/a1_to_a2_box_uploader.py` (main script)
- `scripts/shared/database.py` (retry tracking methods)
- `database/migrations/003_add_a1_retry_tracking.sql` (schema)
---
## How It Works (current behavior)
### The empty-folder case (most common)
When a campaign is at A1 in DAM but the Master Assets folder is empty, the script treats this as a normal pre-asset state, not a failure.
**Flow:**
1. Every poll: `a1_retry_count` is incremented for visibility, the script logs `No master assets yet (poll N) - skipping until assets appear`, and exits silently.
2. At poll 20 (~1 hour at the 3-minute orchestrator cadence) the script sends a single `a1_to_a2_no_assets_warning` email so genuinely-stuck campaigns still surface.
3. After poll 20, the script keeps skipping silently. **`a1_permanently_failed` is never auto-set for empty folders.**
4. When assets eventually appear and A1→A2 succeeds, `db.reset_a1_retry()` clears the counter automatically.
The threshold lives in `scripts/a1_to_a2_box_uploader.py` as `EMPTY_FOLDER_WARNING_THRESHOLD = 20`.
### The genuine-error case
The 3-retries-then-permanently-fail behavior **still exists** for actual folder-level errors (e.g. `Assets folder not found (tried Master Assets)`), which are caught by the script's exception handler. These DO mark `a1_permanently_failed=TRUE` after 3 failures and DO send the retry / permanently-failed emails.
`db.increment_a1_retry()` accepts `mark_failed_at_max=True|False` to switch between the two behaviors. The empty-folder branch passes `False`; the exception handler passes `True` (default).
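The two branches can be sketched as follows. This is a minimal in-memory illustration, not the script's actual code: `RetryTracker` and `handle_empty_folder` are hypothetical names standing in for the database layer and the empty-folder branch. Only the two constants and the `mark_failed_at_max` flag come from the description above.

```python
EMPTY_FOLDER_WARNING_THRESHOLD = 20  # value documented for a1_to_a2_box_uploader.py
MAX_RETRIES = 3                      # value documented for database.py


class RetryTracker:
    """Hypothetical in-memory stand-in for the campaign_status retry columns."""

    def __init__(self):
        self.a1_retry_count = 0
        self.a1_permanently_failed = False

    def increment_a1_retry(self, mark_failed_at_max=True):
        """Bump the counter; mark permanent failure only when the caller allows it."""
        self.a1_retry_count += 1
        if mark_failed_at_max and self.a1_retry_count >= MAX_RETRIES:
            self.a1_permanently_failed = True
        return self.a1_retry_count


def handle_empty_folder(tracker):
    """Empty-folder branch: count the poll, return True only on the warning poll."""
    count = tracker.increment_a1_retry(mark_failed_at_max=False)
    return count == EMPTY_FOLDER_WARNING_THRESHOLD


# Simulate 25 polls of a campaign whose folder stays empty.
empty = RetryTracker()
sent = [handle_empty_folder(empty) for _ in range(25)]

# Simulate 3 genuine folder-level errors (default mark_failed_at_max=True).
errored = RetryTracker()
for _ in range(3):
    errored.increment_a1_retry()
```

In this sketch the empty-folder campaign triggers exactly one warning and is never marked failed, while the errored campaign is marked failed on its third attempt.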
### Queue-slot filter
The A1→A2 script processes up to 2 campaigns per run (`campaigns[:2]`). Permanently-failed campaigns are filtered out **before** the slot cap so they no longer block the queue (`scripts/a1_to_a2_box_uploader.py:652`).
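The ordering matters: filtering after the cap would let failed campaigns occupy both slots. A minimal sketch of filter-then-cap, assuming campaigns are plain dicts carrying an `a1_permanently_failed` key (the real script's data shape is not shown here, and `select_campaigns` is a hypothetical name):

```python
def select_campaigns(campaigns, max_slots=2):
    """Drop permanently-failed campaigns first, then apply the slot cap."""
    eligible = [c for c in campaigns if not c.get("a1_permanently_failed")]
    return eligible[:max_slots]


# Illustrative data only: the first two campaigns are permanently failed.
campaigns = [
    {"campaign_number": "C1", "a1_permanently_failed": True},
    {"campaign_number": "C2", "a1_permanently_failed": True},
    {"campaign_number": "C3", "a1_permanently_failed": False},
    {"campaign_number": "C4", "a1_permanently_failed": False},
]

selected = [c["campaign_number"] for c in select_campaigns(campaigns)]
naive = [c["campaign_number"] for c in campaigns[:2]]  # cap applied before filtering
```

With the cap applied first (`naive`), both slots go to failed campaigns and nothing useful is processed.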
### Database tracking
Four fields on the `campaign_status` table:
- `a1_retry_count` (INTEGER): Number of polls where the folder was empty / errored. For empty-folder cases this can grow unbounded; reset on success.
- `a1_last_retry_at` (TIMESTAMP): When last attempt occurred
- `a1_permanently_failed` (BOOLEAN): TRUE only via the genuine-error path (after 3 failures), never via the empty-folder path
- `a1_failure_reason` (TEXT): Why it failed (e.g., "Assets folder not found (tried Master Assets)")
---
## Configuration
### Empty-folder warning threshold
`scripts/a1_to_a2_box_uploader.py`:
```python
EMPTY_FOLDER_WARNING_THRESHOLD = 20 # ~1 hour at 3-min poll cadence
```
Send the one-time warning sooner/later by adjusting this constant.
### Genuine-error retry attempts
`scripts/shared/database.py`, in `increment_a1_retry()`:
```python
MAX_RETRIES = 3
```
Applies only when the caller passes `mark_failed_at_max=True` (default), i.e. the exception handler in `process_campaign()`. The empty-folder branch passes `False` and is unaffected.
---
## Email Notifications
### Empty-folder warning (one-time, at poll 20)
**Template:** `a1_to_a2_no_assets_warning`
**Subject:** ⚠️ Campaign in A1 with no assets yet - {campaign_name}
**Recipients:** Error notification list
**Sent:** exactly once per stuck campaign, when `a1_retry_count == 20`. Counter resets on success, so a future re-stuck event would warn again.
### Genuine-error retry email (attempts 1-2)
**Template:** `a1_to_a2_no_assets_retry`
**Subject:** ⚠️ No Assets Found (Attempt X/3) - Campaign {name}
**Recipients:** Error notification list
**Trigger:** non-empty-folder errors caught by `process_campaign()`'s exception handler.
### Genuine-error final failure (attempt 3)
**Template:** `a1_to_a2_permanently_failed`
**Subject:** ❌ PERMANENTLY FAILED - Campaign {name} (No Assets After 3 Attempts)
**Recipients:** Error notification list
**Content:**
- Campaign marked as permanently failed (campaign filtered from future queue runs)
- Required actions to fix
- SQL command to manually reset
---
## Manual Operations
### Check Campaign Retry Status
```sql
SELECT campaign_number, campaign_name, status,
a1_retry_count, a1_last_retry_at,
a1_permanently_failed, a1_failure_reason
FROM campaign_status
WHERE campaign_id = 'YOUR_CAMPAIGN_ID';
```
### Reset Single Campaign
```sql
UPDATE campaign_status
SET a1_retry_count = 0,
a1_last_retry_at = NULL,
a1_permanently_failed = FALSE,
a1_failure_reason = NULL
WHERE campaign_id = 'YOUR_CAMPAIGN_ID';
```
**Or using psql command:**
```bash
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking <<EOF
UPDATE campaign_status
SET a1_retry_count = 0,
a1_last_retry_at = NULL,
a1_permanently_failed = FALSE,
a1_failure_reason = NULL
WHERE campaign_id = 'YOUR_CAMPAIGN_ID';
EOF
```
### Reset All Failed Campaigns
```sql
UPDATE campaign_status
SET a1_retry_count = 0,
a1_last_retry_at = NULL,
a1_permanently_failed = FALSE,
a1_failure_reason = NULL
WHERE a1_permanently_failed = TRUE;
```
### View All Failed Campaigns
```sql
SELECT campaign_number, campaign_name,
a1_retry_count, a1_last_retry_at, a1_failure_reason
FROM campaign_status
WHERE a1_permanently_failed = TRUE
ORDER BY a1_last_retry_at DESC;
```
---
## Failure Scenarios
### Scenario 1: Temporary Empty Folder
**What Happens:**
- Early polls: folder is empty, `a1_retry_count` increments, script skips silently (no email)
- Assets added to folder before poll 20
- Next run finds assets, processes successfully
- Retry counter automatically reset to 0
**Result:** Problem self-resolves with no notifications
### Scenario 2: Persistent Empty Folder
**What Happens:**
- Polls 1-19: skipped silently, retry counter increments each poll
- Poll 20 (~1 hour at the 3-minute cadence): single `a1_to_a2_no_assets_warning` email sent
- Poll 21 onward: skipped silently, no more emails
- `a1_permanently_failed` is never set for an empty folder
**Result:** Support team alerted once, repeated emails prevented
### Scenario 3: Wrong Status Assignment
**What Happens:**
- Campaign set to A1 by mistake (no assets intended)
- Fails 3 times, marked permanently failed
- Admin realizes mistake, changes status to different value
- Campaign no longer appears in A1 search results
**Result:** No reset needed, campaign excluded from processing
---
## Testing
### Test Retry Logic
1. Create test campaign in DAM with A1 status
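2. Remove or rename the Master Assets folder so the script hits a genuine folder-level error (an empty folder is skipped silently and never marked permanently failed)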
3. Run A1→A2 script manually 3 times
4. Verify emails received and database state
```bash
# Run 1
python scripts/a1_to_a2_box_uploader.py --auth-pfx-v2
# Check database
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT campaign_number, a1_retry_count, a1_permanently_failed FROM campaign_status WHERE status = 'A1';"
# Run 2 (wait 3 minutes or run immediately for testing)
python scripts/a1_to_a2_box_uploader.py --auth-pfx-v2
# Check again
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT campaign_number, a1_retry_count, a1_permanently_failed FROM campaign_status WHERE status = 'A1';"
# Run 3
python scripts/a1_to_a2_box_uploader.py --auth-pfx-v2
# Verify permanently failed
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "SELECT campaign_number, a1_retry_count, a1_permanently_failed, a1_failure_reason FROM campaign_status WHERE a1_permanently_failed = TRUE;"
```
### Test Reset Logic
```bash
# Reset the test campaign
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c "UPDATE campaign_status SET a1_retry_count = 0, a1_permanently_failed = FALSE WHERE campaign_number = 'TEST_CAMPAIGN';"
# Run again
python scripts/a1_to_a2_box_uploader.py --auth-pfx-v2
# Verify it retries
```
---
## Monitoring
### Dashboard Query: Current Retry Status
```sql
SELECT
COUNT(*) FILTER (WHERE a1_retry_count = 0) as "No Issues",
COUNT(*) FILTER (WHERE a1_retry_count = 1) as "Attempt 1",
COUNT(*) FILTER (WHERE a1_retry_count = 2) as "Attempt 2",
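COUNT(*) FILTER (WHERE a1_retry_count >= 3 AND a1_permanently_failed = FALSE) as "Waiting for Assets",
COUNT(*) FILTER (WHERE a1_permanently_failed = TRUE) as "Permanently Failed"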
FROM campaign_status
WHERE status = 'A1';
```
### Alert Query: Campaigns Near Failure
```sql
SELECT campaign_number, campaign_name, a1_retry_count, a1_last_retry_at
FROM campaign_status
WHERE status = 'A1'
AND a1_retry_count >= 2
AND a1_permanently_failed = FALSE
ORDER BY a1_retry_count DESC, a1_last_retry_at DESC;
```
---
## Troubleshooting
### Q: Campaign keeps failing even after adding assets
**A:** Check if campaign was marked permanently failed. Reset using SQL command above.
### Q: Want to change from 3 to 5 retry attempts
**A:** Edit `MAX_RETRIES = 3` in `database.py` line ~567. Also update email templates to reflect new maximum.
### Q: How to disable retry logic completely?
**A:** Not recommended, but you can:
1. Set `MAX_RETRIES = 999` (effectively infinite)
2. Or revert to old `a1_to_a2_no_assets` template without retry tracking
### Q: Can I set different retry counts for different campaigns?
**A:** No, it's a global setting. All campaigns use same `MAX_RETRIES` value.
### Q: What if I want to delete permanently failed campaigns from database?
**A:** Don't delete. Instead, change their status to something other than A1. They'll be excluded from processing automatically.
---
## Future Enhancements
Potential improvements for future versions:
1. **Configurable retry timing:**
- Instead of relying on cron frequency (3 min)
- Check `a1_last_retry_at` and skip if too recent
- Allow exponential backoff (3 min, 10 min, 30 min)
2. **Campaign-specific retry limits:**
- Add optional `a1_max_retries` column
- Allow different campaigns to have different thresholds
- Default to global MAX_RETRIES if not set
3. **Automatic cleanup:**
- After 30 days, auto-reset permanently failed campaigns
- Or send weekly digest of stuck campaigns
4. **Webhook notifications:**
- Send to external system when campaign permanently fails
- Integrate with ticketing system
5. **Admin UI:**
- Web interface to view/reset retry status
- Bulk reset operations
---
## Code Locations
**Quick reference for developers:**
| Component | File | Line Range |
|-----------|------|------------|
| Retry check logic | `a1_to_a2_box_uploader.py` | ~176-186 |
| Empty folder detection | `a1_to_a2_box_uploader.py` | ~193-231 |
| Success reset | `a1_to_a2_box_uploader.py` | ~354-356 |
| `get_a1_retry_status()` | `database.py` | ~522-558 |
| `increment_a1_retry()` | `database.py` | ~560-620 |
| `reset_a1_retry()` | `database.py` | ~622-655 |
| Email templates | `notifier.py` | ~593-687 |
| Database migration | `migrations/003_add_a1_retry_tracking.sql` | All |
---
## Change Log
**January 31, 2026:**
- Initial implementation
- 3-attempt retry mechanism
- Permanent failure tracking
- Two new email templates
- This documentation created
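**April 28, 2026:**
- Empty folders now treated as expected client workflow rather than failures
- Single `a1_to_a2_no_assets_warning` email at poll 20; `a1_permanently_failed` never auto-set for empty folders
**Future updates will be logged here.**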


@ -1,378 +0,0 @@
# Option 1: Multiple Tracking IDs in Filename - Implementation Guide
## Overview
Allow a single derivative/localized asset to reference multiple master assets by including multiple tracking IDs in the filename.
**Example Filename:**
```
1234568_ROC_STRANGER-THINGS_SND_6S_16x9_REF_DE_de_BqB8vo+SfUQ7m+laRJo0.jpg
                                                  ^^^^^^^^^^^^^^^^^^^^
                                                  Multiple tracking IDs
```
**Delimiter:** Use `+` to separate multiple tracking IDs (could also use `,` or `_`)
---
## Changes Required
### 1⃣ Filename Parser (`scripts/shared/filename_parser.py`)
**Current Code (line ~182):**
```python
# Tracking ID: 6 alphanumeric, optionally with -N suffix
elif re.match(r'^[a-zA-Z0-9]{6}(-N)?$', part):
    tracking = part
    tracking_mode = 'full'
    base_tracking_id = tracking
    if tracking.endswith('-N'):
        tracking_mode = 'folder_only'
        base_tracking_id = tracking[:-2]  # Strip -N suffix
    parsed['tracking_id'] = base_tracking_id
    parsed['tracking_mode'] = tracking_mode
    parsed['tracking_id_with_suffix'] = tracking
    logger.debug("Found tracking ID: {}".format(tracking))
    index += 1
```
**Modified Code:**
```python
# Tracking ID(s): 6 alphanumeric, optionally with -N suffix
# Supports multiple IDs separated by + (e.g., "BqB8vo+SfUQ7m+laRJo0")
elif re.match(r'^[a-zA-Z0-9]{6}(-N)?(\+[a-zA-Z0-9]{6}(-N)?)*$', part):
    tracking_ids = []
    tracking_modes = []
    tracking_ids_with_suffix = []
    # Split by + delimiter to get all tracking IDs
    id_parts = part.split('+')
    for tracking in id_parts:
        tracking_mode = 'full'
        base_tracking_id = tracking
        if tracking.endswith('-N'):
            tracking_mode = 'folder_only'
            base_tracking_id = tracking[:-2]  # Strip -N suffix
            logger.info("Detected folder-only tracking ID: {} (base: {})".format(tracking, base_tracking_id))
        tracking_ids.append(base_tracking_id)
        tracking_modes.append(tracking_mode)
        tracking_ids_with_suffix.append(tracking)
    # Store primary (first) tracking ID for backward compatibility
    parsed['tracking_id'] = tracking_ids[0]
    parsed['tracking_mode'] = tracking_modes[0]
    parsed['tracking_id_with_suffix'] = tracking_ids_with_suffix[0]
    # Store all tracking IDs for multi-master support
    parsed['tracking_ids'] = tracking_ids
    parsed['tracking_modes'] = tracking_modes
    parsed['tracking_ids_with_suffix'] = tracking_ids_with_suffix
    parsed['has_multiple_masters'] = len(tracking_ids) > 1
    logger.debug("Found {} tracking ID(s): {}".format(len(tracking_ids), ', '.join(tracking_ids)))
    index += 1
```
**Key Changes:**
- Updated regex to match multiple IDs: `^[a-zA-Z0-9]{6}(-N)?(\+[a-zA-Z0-9]{6}(-N)?)*$`
- Split on `+` delimiter
- Store primary ID for backward compatibility
- Add new fields: `tracking_ids`, `has_multiple_masters`
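The parsing change can be tried in isolation. The sketch below lifts the regex and the `-N` handling from the modified code into a standalone function; `parse_tracking_segment` is a hypothetical name for illustration, and the real parser works on filename parts inside a larger loop rather than on a single string.

```python
import re

# Same pattern as the modified parser: one or more 6-character IDs,
# each optionally suffixed with -N, joined by '+'.
TRACKING_RE = re.compile(r'^[a-zA-Z0-9]{6}(-N)?(\+[a-zA-Z0-9]{6}(-N)?)*$')


def parse_tracking_segment(part):
    """Return tracking fields for one filename segment, or None if it does not match."""
    if not TRACKING_RE.match(part):
        return None
    tracking_ids, tracking_modes = [], []
    for tracking in part.split('+'):
        if tracking.endswith('-N'):
            tracking_ids.append(tracking[:-2])  # strip the -N suffix
            tracking_modes.append('folder_only')
        else:
            tracking_ids.append(tracking)
            tracking_modes.append('full')
    return {
        'tracking_id': tracking_ids[0],      # primary ID, kept for backward compatibility
        'tracking_mode': tracking_modes[0],
        'tracking_ids': tracking_ids,
        'tracking_modes': tracking_modes,
        'has_multiple_masters': len(tracking_ids) > 1,
    }


multi = parse_tracking_segment('BqB8vo+SfUQ7m+laRJo0')
single = parse_tracking_segment('BqB8vo-N')
not_tracking = parse_tracking_segment('REF')
```

A single-ID segment still yields the same `tracking_id` and `tracking_mode` as before, which is what keeps existing filenames working.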
---
### 2⃣ A2→A3 Upload Script (`scripts/a2_to_a3_upload_polling.py`)
**Current Code (line ~97):**
```python
# 2. Load master metadata from database
master_asset = db.get_master_asset(tracking_id)
if not master_asset:
    raise ValueError("No master asset for tracking ID: {}".format(tracking_id))
```
**Modified Code:**
```python
# 2. Load master metadata from database (support multiple tracking IDs)
tracking_ids = parsed.get('tracking_ids', [tracking_id])  # Get all tracking IDs or fallback to single
has_multiple_masters = parsed.get('has_multiple_masters', False)
# Load all master assets
master_assets = []
master_opentext_ids = []
if has_multiple_masters:
    logger.info("Multiple master assets detected: {}".format(', '.join(tracking_ids)))
    for tid in tracking_ids:
        master = db.get_master_asset(tid)
        if not master:
            logger.warning("Master asset not found for tracking ID: {}".format(tid))
            continue
        master_assets.append(master)
        master_opentext_ids.append(master['opentext_id'])
    if not master_assets:
        raise ValueError("No master assets found for tracking IDs: {}".format(', '.join(tracking_ids)))
    # Use first master for metadata inheritance (could enhance this later)
    master_asset = master_assets[0]
    logger.info("Using primary master {} for metadata, linking all {} masters".format(
        tracking_ids[0], len(master_assets)))
else:
    # Single master (backward compatible)
    master_asset = db.get_master_asset(tracking_id)
    if not master_asset:
        raise ValueError("No master asset for tracking ID: {}".format(tracking_id))
    master_opentext_ids = [master_asset['opentext_id']]
```
**Current Code (line ~194):**
```python
asset_rep = mvp_extractor.build_mvp_asset_representation(
    master_metadata=master_asset['full_metadata'],
    clean_filename=clean_filename,
    parsed_filename=parsed,
    box_metadata=box_metadata,
    tracking_mode=tracking_mode,
    master_opentext_id=master_asset['opentext_id']  # Single ID
)
```
**Modified Code:**
```python
# Pass all master opentext IDs (support multiple)
asset_rep = mvp_extractor.build_mvp_asset_representation(
    master_metadata=master_asset['full_metadata'],
    clean_filename=clean_filename,
    parsed_filename=parsed,
    box_metadata=box_metadata,
    tracking_mode=tracking_mode,
    master_opentext_id=master_asset['opentext_id'],  # Primary for ARTESIA.FIELD.ASSET_ID
    master_opentext_ids=master_opentext_ids  # All IDs for MASTERASSETIDS field
)
```
**Key Changes:**
- Extract multiple tracking IDs from parsed data
- Look up all master assets in database
- Collect all master opentext_ids
- Pass list to metadata extractor
---
### 3⃣ Metadata Extractor (`scripts/shared/metadata_extractor_mvp.py`)
**Current Method Signature (line ~97):**
```python
def build_mvp_asset_representation(self, master_metadata, clean_filename,
                                   parsed_filename, box_metadata=None,
                                   tracking_mode='full', master_opentext_id=None):
```
**Modified Method Signature:**
```python
def build_mvp_asset_representation(self, master_metadata, clean_filename,
                                   parsed_filename, box_metadata=None,
                                   tracking_mode='full', master_opentext_id=None,
                                   master_opentext_ids=None):
```
**Current Code (line ~139):**
```python
if master_opentext_id:
    mvp_fields = self._add_master_asset_id_field(mvp_fields, master_opentext_id)
    logger.info("Added Master Asset ID field: {}".format(master_opentext_id))
```
**Modified Code:**
```python
# Add Master Asset ID field(s) if provided (derivative tracking)
if master_opentext_id:
    mvp_fields = self._add_master_asset_id_field(mvp_fields, master_opentext_id)
    logger.info("Added Master Asset ID field: {}".format(master_opentext_id))
# Add MASTERASSETIDS tabular field with all master IDs (support multiple)
if master_opentext_ids and len(master_opentext_ids) > 0:
    mvp_fields = self._add_master_asset_ids_field(mvp_fields, master_opentext_ids)
    logger.info("Added MASTERASSETIDS field with {} value(s)".format(len(master_opentext_ids)))
```
**New Method (add after `_add_master_asset_id_field`):**
```python
def _add_master_asset_ids_field(self, mvp_fields, master_opentext_ids):
    """
    Add FERRERO.MASTERASSETIDS tabular field with multiple master asset IDs
    Supports Many-to-Many relationship between derivatives and masters

    Args:
        mvp_fields: List of MVP fields
        master_opentext_ids: List of DAM Asset IDs of master assets

    Returns:
        Updated mvp_fields list with FERRERO.MASTERASSETIDS
    """
    if not master_opentext_ids or len(master_opentext_ids) == 0:
        logger.info("No master_opentext_ids provided - skipping FERRERO.MASTERASSETIDS field")
        return mvp_fields
    # Check if field already exists
    for field in mvp_fields:
        if self._get_field_id(field) == 'FERRERO.MASTERASSETIDS':
            logger.info("FERRERO.MASTERASSETIDS already present")
            return mvp_fields
    # Build values array with all master asset IDs
    values = []
    for master_id in master_opentext_ids:
        values.append({
            'cascading_domain_value': False,
            'domain_value': True,
            'is_locked': False,
            'value': {
                'field_value': {
                    'type': 'string',
                    'value': master_id
                },
                'type': 'com.artesia.metadata.DomainValue'
            }
        })
    # Create tabular field
    new_field = {
        'id': 'FERRERO.MASTERASSETIDS',
        'parent_table_id': 'FERRERO.TABULAR.FIELD.MASTERASSETIDS',
        'type': 'com.artesia.metadata.MetadataTableField',
        'values': values
    }
    mvp_fields.append(new_field)
    logger.info("Added FERRERO.MASTERASSETIDS field with {} master asset ID(s): {}".format(
        len(values), ', '.join(master_opentext_ids[:3]) + ('...' if len(master_opentext_ids) > 3 else '')))
    return mvp_fields
```
**Key Changes:**
- Add `master_opentext_ids` parameter (list)
- New method `_add_master_asset_ids_field` that accepts a list
- Builds `values` array with all master IDs
- Backward compatible (still uses single `master_opentext_id` for ARTESIA.FIELD.ASSET_ID)
---
## Testing Examples
### Single Master (Backward Compatible)
**Filename:** `1234568_ROC_STRANGER-THINGS_SND_6S_16x9_REF_DE_de_BqB8vo.jpg`
**Parsed:**
```python
{
    'tracking_id': 'BqB8vo',
    'tracking_ids': ['BqB8vo'],
    'has_multiple_masters': False
}
```
**Result:** Single ID in MASTERASSETIDS field (current behavior)
---
### Multiple Masters (New Feature)
**Filename:** `1234568_ROC_STRANGER-THINGS_SND_6S_16x9_REF_DE_de_BqB8vo+SfUQ7m+laRJo0.jpg`
**Parsed:**
```python
{
    'tracking_id': 'BqB8vo',  # Primary (for backward compatibility)
    'tracking_ids': ['BqB8vo', 'SfUQ7m', 'laRJo0'],
    'has_multiple_masters': True
}
```
**Database Lookups:**
- BqB8vo → fc5c389776516bb58044c7d4bf479da458599baf
- SfUQ7m → ad3948d72ea8550a338a600ae87a1bdd1968b066
- laRJo0 → 020d76f957ec9f4ec0b18035a2d012cd3fd376c2
**Result:** 3 IDs in MASTERASSETIDS field values array
---
## Migration Path
1. **Phase 1 - Implement Code** (No Breaking Changes)
- Add changes to all 3 files
- Test with single tracking ID (should work exactly as before)
- Backward compatible with existing filenames
2. **Phase 2 - Test Multiple IDs**
- Create test file with multiple tracking IDs
- Upload to PPR with `--dryrun`
- Verify 3 values in MASTERASSETIDS field
3. **Phase 3 - Agency Tool Integration**
- Agency tool generates filenames with `+` delimiter
- Agency tool uses multiple tracking IDs when needed
- Most files will still have single tracking ID (normal case)
4. **Phase 4 - Production Deployment**
- Enable in PROD after testing in PPR
- Update field in PROD DAM schema first
- Deploy code changes
---
## Alternative Delimiters
If `+` causes issues, alternatives:
| Delimiter | Example | Notes |
|-----------|---------|-------|
| `+` | `BqB8vo+SfUQ7m` | ✅ Recommended (clear separator) |
| `,` | `BqB8vo,SfUQ7m` | ⚠️ Might conflict with CSV exports |
| `_` | `BqB8vo_SfUQ7m` | ⚠️ Already used in filename structure |
| `~` | `BqB8vo~SfUQ7m` | ✅ Alternative if + causes issues |
---
## Error Handling
**What happens if one tracking ID is not found?**
```python
# Option A: Skip missing masters (log warning)
for tid in tracking_ids:
    master = db.get_master_asset(tid)
    if not master:
        logger.warning("Master asset not found for tracking ID: {}".format(tid))
        continue  # Skip this one, continue with others

# Option B: Fail entire upload (strict)
for tid in tracking_ids:
    master = db.get_master_asset(tid)
    if not master:
        raise ValueError("Master asset not found for tracking ID: {}".format(tid))
```
**Recommendation:** Use Option A (skip missing) - derivative still uploads with available master links.
---
## Summary
**Files to Modify:**
1. `scripts/shared/filename_parser.py` - Parse multiple tracking IDs
2. `scripts/a2_to_a3_upload_polling.py` - Look up multiple masters
3. `scripts/shared/metadata_extractor_mvp.py` - Add all IDs to field
**Backward Compatible:** ✅ Yes - existing single-ID filenames work exactly as before
**Ready to Implement:** This document provides all code changes needed.


@ -1,179 +0,0 @@
# PPR-Only Multiple Tracking IDs - Implementation Complete
## ✅ Changes Implemented
Multiple tracking IDs feature is now **ACTIVE in PPR** and **DISABLED in PROD** via environment detection.
---
## Files Modified
### 1. `scripts/shared/filename_parser.py`
- Added `__init__` method with DAM URL parameter
- Added `_is_ppr_environment()` method
- Updated tracking ID parsing to:
- **PPR**: Parse multiple IDs separated by `+` (e.g., `BqB8vo+SfUQ7m+laRJo0`)
- **PROD**: Use only first ID (backward compatible)
### 2. `scripts/a2_to_a3_upload_polling.py`
- Pass DAM URL to FilenameParser for environment detection
- Loop through all tracking IDs (PPR) or single ID (PROD)
- Look up all master assets in database
- Collect all `opentext_id` values
- Pass list to metadata extractor
### 3. `scripts/shared/metadata_extractor_mvp.py`
- Added `master_opentext_ids` parameter (list)
- New method: `_add_master_asset_ids_field()` to handle multiple IDs
- Builds `values` array with all master IDs
---
## Environment Detection Logic
**PPR Environment:**
- DAM URL contains: `ppr.dam.ferrero.com`
- Multiple tracking IDs: ✅ **ENABLED**
- Filename format: `1234568_ROC_ST_SND_6S_16x9_REF_DE_de_BqB8vo+SfUQ7m.jpg`
**PROD Environment:**
- DAM URL contains: `dam.ferrero.com` (not ppr)
- Multiple tracking IDs: ❌ **DISABLED**
- Filename format: `1234568_ROC_ST_SND_6S_16x9_REF_DE_de_BqB8vo.jpg` (single ID)
- If multiple IDs provided, uses FIRST ID only with warning
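The detection rule above amounts to a substring check on the DAM URL. A minimal sketch, assuming standalone functions (the real `_is_ppr_environment()` is a method on the parser, and `select_tracking_ids` is a hypothetical helper name used only for illustration):

```python
PPR_HOST_MARKER = 'ppr.dam.ferrero.com'


def is_ppr_environment(dam_base_url):
    """True when the DAM URL points at the PPR host."""
    return PPR_HOST_MARKER in (dam_base_url or '')


def select_tracking_ids(tracking_ids, dam_base_url):
    """PPR keeps every parsed ID; PROD falls back to the first ID only."""
    if is_ppr_environment(dam_base_url):
        return list(tracking_ids)
    return list(tracking_ids[:1])


ids = ['BqB8vo', 'SfUQ7m', 'laRJo0']
ppr_ids = select_tracking_ids(ids, 'https://ppr.dam.ferrero.com/otmmapi')
prod_ids = select_tracking_ids(ids, 'https://dam.ferrero.com/otmmapi')
```

Because the PROD host does not contain the `ppr.` prefix, the same check cleanly separates the two environments without extra configuration.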
---
## Testing in PPR
### Test 1: Single Tracking ID (Backward Compatible)
**Filename:**
```
1234568_ROC_STRANGER-THINGS_SND_6S_16x9_REF_DE_de_BqB8vo.jpg
```
**Expected Result:**
- Parses as single tracking ID
- One master asset looked up
- One value in MASTERASSETIDS field
- ✅ Works exactly as before
### Test 2: Multiple Tracking IDs (New Feature)
**Filename:**
```
1234568_ROC_STRANGER-THINGS_SND_6S_16x9_REF_DE_de_BqB8vo+SfUQ7m+laRJo0.jpg
```
**Expected Result:**
- PPR environment detected
- Parses 3 tracking IDs: `BqB8vo`, `SfUQ7m`, `laRJo0`
- Looks up 3 master assets in database
- Gets 3 opentext_ids:
- `fc5c389776516bb58044c7d4bf479da458599baf`
- `ad3948d72ea8550a338a600ae87a1bdd1968b066`
- `020d76f957ec9f4ec0b18035a2d012cd3fd376c2`
- Creates MASTERASSETIDS field with 3 values
**Log Output:**
```
PPR Environment - Multiple tracking IDs detected: 3
Parsed 3 tracking IDs: BqB8vo, SfUQ7m, laRJo0
PPR - Multiple master assets detected: BqB8vo, SfUQ7m, laRJo0
Using primary master BqB8vo for metadata, linking 3 total masters
PPR - Added MASTERASSETIDS field with 3 master IDs
Added FERRERO.MASTERASSETIDS field with 3 master asset ID(s): fc5c389776516bb58044c7d4bf479da458599baf, ad3948d72ea8550a338a600ae87a1bdd1968b066, 020d76f957ec9f4ec0b18035a2d012cd3fd376c2
```
---
## Test Commands
### Dry Run (Recommended First)
```bash
python scripts/a2_to_a3_upload_polling.py --dryrun
```
Check the JSON output for:
```json
{
  "id": "FERRERO.MASTERASSETIDS",
  "parent_table_id": "FERRERO.TABULAR.FIELD.MASTERASSETIDS",
  "type": "com.artesia.metadata.MetadataTableField",
  "values": [
    {"value": {"field_value": {"value": "fc5c389776516bb58044c7d4bf479da458599baf"}}},
    {"value": {"field_value": {"value": "ad3948d72ea8550a338a600ae87a1bdd1968b066"}}},
    {"value": {"field_value": {"value": "020d76f957ec9f4ec0b18035a2d012cd3fd376c2"}}}
  ]
}
```
### Real Upload to PPR
```bash
python scripts/a2_to_a3_upload_polling.py
```
Then verify in PPR DAM:
1. Search for the uploaded asset
2. Open metadata
3. Check "Master Asset IDs" tabular field
4. Should show multiple rows
---
## Error Handling
**Missing Master Assets:**
- If one tracking ID is not found in database, it's skipped with warning
- Derivative still uploads with available master links
- Log message: `Master asset not found for tracking ID: xyz - skipping`
**PROD Environment with Multiple IDs:**
- Uses FIRST tracking ID only
- Logs warning: `PROD Environment - Multiple tracking IDs not supported, using first ID only`
- Works as backward compatible (no errors)
---
## Current Environment Check
Your `.env` file shows:
```
DAM_BASE_URL=https://ppr.dam.ferrero.com/otmmapi
```
**PPR Environment** - Multiple tracking IDs are **ENABLED**
---
## Agency Tool Requirements
To use multiple tracking IDs, the Agency tool needs to:
1. Concatenate tracking IDs with `+` delimiter
2. Example: `tracking_id_1 + "+" + tracking_id_2 + "+" + tracking_id_3`
3. Place in filename: `{job}_{brand}_{...}_{tracking_ids}.{ext}`
**Most derivatives will still use single tracking ID** - this is only for special cases where one derivative references multiple masters.
---
## Production Safety
✅ **PROD is Protected:**
- Environment detection prevents multiple IDs in PROD
- If multiple IDs accidentally used, only first ID is processed
- No breaking changes to PROD behavior
- Fully backward compatible
---
## Ready to Test! 🚀
Your PPR environment is now ready to test multiple tracking IDs.
1. Create test file with multiple IDs
2. Upload to Box: `DAM-UPLOAD/1234568/`
3. Run with `--dryrun` first
4. Verify JSON shows multiple values
5. Real upload and check in PPR DAM


@ -3,7 +3,7 @@
**Complete automated workflow for Ferrero DAM Content Scaling**
**Version:** 2.1
**Last Updated:** March 31, 2026
**Last Updated:** April 16, 2026
**Status:** ✅ Production Ready & Fully Tested
---
@ -965,13 +965,20 @@ Each file defines: MVP fields, filename update rules, forced values, defaults, a
`config/asset_type_mappings.yaml` maps 3-letter codes from the naming tool to DAM domain values (e.g., `EHI` -> `heroimage`, `EOL` -> `externallegalopinion`).
**Last updated:** April 16, 2026 per Scaling Agencies Metadata List. 38 asset types mapped (was 39). Changes:
- **Removed:** CID, ECB, EBS, EOP, EUG, EWB, FPO, PKI, PRI
- **Added:** EAN, ESI, NTB, PIR, PKC, PKT, SCP, SNC, UPI
- **Changed:** DAT DAM code updated from `digitalassettoolkit` to `digitalasset`
### Asset Representation Template
`config/asset_representation_template.json` is the reference template for folder-only mode (`-N` flag uploads). It contains the full field metadata structure that the DAM API requires for asset creation. This template was provided by the client and should be updated if the DAM metadata model changes.
### Asset Type Overrides (EOL Example)
### Asset Type Overrides (EOL / LTD)
Certain asset types trigger field overrides configured in the field mappings file. For example, **EOL (External Legal Opinion)** overrides:
Certain asset types trigger field overrides configured in the field mappings file. Currently configured for both PPR and PROD:
**EOL (External Legal Opinion)**
- Agency Name = "-"
- Production House = "-"
- Main Languages = "Global"
@ -979,7 +986,9 @@ Certain asset types trigger field overrides configured in the field mappings fil
- Licensing = "No"
- Validity dates removed
These overrides are applied after all other field processing and take final precedence.
**LTD (Licensing Translation Document)** — supports the EOL workflow with translated license claims. Same overrides as EOL, plus a fixed Description: `"Translation of License claim - For approval purposes only"`. Currently mapped to the same DAM-side code (`externallegalopinion`) as a placeholder pending client confirmation.
These overrides are applied after all other field processing and take final precedence. An empty-string override removes the field; a non-empty override targeting a field that isn't in `mvp_fields` will be appended as a simple string field.
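The override semantics can be illustrated with a small sketch. It assumes a simplified field shape (a dict with `id`, `type`, and `value` keys) and hypothetical field IDs; the real DAM field structure is richer, and `apply_asset_type_overrides` is an illustrative name rather than the project's function.

```python
def apply_asset_type_overrides(mvp_fields, overrides):
    """Apply overrides last: '' removes a field, other values replace or append."""
    result = [dict(field) for field in mvp_fields]  # copy so the input is untouched
    for field_id, value in overrides.items():
        if value == '':
            # Empty-string override: drop the field entirely.
            result = [f for f in result if f['id'] != field_id]
            continue
        for field in result:
            if field['id'] == field_id:
                field['value'] = value
                break
        else:
            # Field not present yet: append it as a simple string field.
            result.append({'id': field_id, 'type': 'string', 'value': value})
    return result


# Hypothetical field IDs, for illustration only.
fields = [
    {'id': 'AGENCY_NAME', 'type': 'string', 'value': 'Some Agency'},
    {'id': 'VALIDITY_END', 'type': 'string', 'value': '12/31/2026'},
]
overrides = {'AGENCY_NAME': '-', 'VALIDITY_END': '', 'DESCRIPTION': 'Fixed text'}
updated = apply_asset_type_overrides(fields, overrides)
```

Here the agency name is replaced, the validity field is removed, and the description is appended because it was not already present.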
---
@ -1432,8 +1441,8 @@ PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrer
---
**Version:** 2.0 - Production Ready
**Last Updated:** November 5, 2025
**Version:** 2.1 - Production Ready
**Last Updated:** April 16, 2026
**Repository:** bitbucket.org:zlalani/ferrero-opentext.git
🚀 **Ready to deploy!**


@ -45,6 +45,119 @@ Checks once, runs any due tasks, then exits. This is what cron would call.
---
## Off-Hours Configuration
### Overview
The orchestrator automatically reduces task frequency during off-hours to minimize system load during low-activity periods.
**What changes during off-hours:**
- All tasks run less frequently (only at 0 and 30 minute marks)
- Example: A 3-minute task normally runs at minutes 0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, etc.
- During off-hours: Runs only at minutes 0 and 30 (every 30 minutes)
- Daily Report (7 PM) remains unchanged
**Off-hours definition:**
- Late night: 10 PM (22:00) to 5 AM (05:00) every day
- All day Saturday (00:00-23:59)
- All day Sunday (00:00-23:59)
### Configuration
**Location:** `scripts/orchestrator-prod.py` lines ~88-107
```python
OFF_HOURS_CONFIG = {
    'enabled': True,           # Set to False to disable
    'extra_minutes': 30,       # Minutes to add during off-hours
    'late_night_start': 22,    # Start hour (22 = 10 PM)
    'late_night_end': 5,       # End hour (5 = 5 AM)
    'weekend_days': [5, 6],    # Saturday=5, Sunday=6
    'exempt_tasks': [
        'Daily Report'         # Tasks that ignore off-hours
    ]
}
```
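How these settings translate into an off-hours decision can be sketched as below. This is an assumption about the logic, not the orchestrator's actual code: `is_off_hours` and `task_uses_off_hours` are hypothetical names, and the weekday numbering relies on Python's `datetime.weekday()` (Monday=0, Saturday=5, Sunday=6), which matches the `weekend_days` values above.

```python
from datetime import datetime

OFF_HOURS_CONFIG = {
    'enabled': True,
    'extra_minutes': 30,
    'late_night_start': 22,
    'late_night_end': 5,
    'weekend_days': [5, 6],
    'exempt_tasks': ['Daily Report'],
}


def is_off_hours(now, config=OFF_HOURS_CONFIG):
    """True on configured weekend days or inside the late-night window."""
    if not config['enabled']:
        return False
    if now.weekday() in config['weekend_days']:
        return True
    start, end = config['late_night_start'], config['late_night_end']
    if start > end:
        # Window wraps past midnight, e.g. 22:00 to 05:00.
        return now.hour >= start or now.hour < end
    return start <= now.hour < end


def task_uses_off_hours(task_name, now, config=OFF_HOURS_CONFIG):
    """Exempt tasks keep their normal cadence even during off-hours."""
    return is_off_hours(now, config) and task_name not in config['exempt_tasks']


monday_afternoon = datetime(2026, 2, 2, 14, 0)   # a Monday
monday_night = datetime(2026, 2, 2, 23, 0)
saturday_afternoon = datetime(2026, 1, 31, 14, 0)  # a Saturday
```

The wrap-around branch is what lets the same function handle both the default 22-to-5 window and a same-day window such as midnight to 6 AM.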
### Examples
**Business Hours (Monday 2 PM):**
```
A1→A2: Runs every 3 minutes (0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, ...)
A4 Box: Runs every 10 minutes (0, 10, 20, 30, 40, 50)
```
**Off-Hours (Monday 11 PM or Saturday):**
```
A1→A2: Runs every 30 minutes (0, 30)
A4 Box: Runs every 30 minutes (0, 30)
All tasks: Only run at minutes 0 and 30
```
### Customization
#### Change off-hours timing
Edit `orchestrator-prod.py`:
```python
# Late night only from midnight to 6 AM
'late_night_start': 0,
'late_night_end': 6,
# Include only Sunday as weekend
'weekend_days': [6], # 6 = Sunday
```
#### Disable off-hours completely
```python
OFF_HOURS_CONFIG = {
    'enabled': False,  # Turns off all off-hours logic
    # ... rest unchanged
}
```
#### Exempt specific tasks
```python
'exempt_tasks': [
    'Daily Report',
    'A4 Webhook Monitor'  # This task will run at normal cadence even in off-hours
]
```
### Monitoring
Check orchestrator logs to see current mode:
```bash
# Watch for mode changes
tail -f logs/orchestrator.log | grep "MODE"
# Output examples:
# Orchestrator tick: 2026-02-02 14:00:00 [NORMAL MODE]
# Orchestrator tick: 2026-02-02 22:00:00 [OFF-HOURS MODE]
# Adding 30 minutes to all task intervals
```
### Testing
```bash
# Test without affecting production
python scripts/orchestrator-prod.py --force
# Look for these log messages:
# [OFF-HOURS MODE] or [NORMAL MODE]
# "Adding 30 minutes to all task intervals"
# "Task 'A1->A2' due (off-hours: 3min + 30min cadence)"
```
---
## Logs
- **Orchestrator logs**: `logs/orchestrator.log`


@ -2,18 +2,15 @@
# Frontend naming tool uses 3-letter codes (EHI, IMG, TVC, etc.)
# DAM uses descriptive lowercase codes (heroimage, keyvisual, tvc, etc.)
# This file maps between them
# Updated: 2026-04-16 per Scaling Agencies Metadata List
# E-Commerce Asset Types
ECA: aplus # E-COMM: A+
ECB: backpackshot # E-COMM: Back Packshot
EBS: beautyshot # E-COMM: Beauty shot
EBR: brandstore # E-COMM: Brand Store
EEM: emedia # E-COMM: E-Media
EHI: heroimage # E-COMM: Hero Image
EIL: ingredientslist # E-COMM: Ingredients List
EOP: outofpack # E-COMM: Out Of Pack
EUG: ugc # E-COMM: UGC
EWB: whybuy # E-COMM: Why Buy
ECA: aplus # A+ content (E-COMM)
EBR: brandstore # Brand Store (E-COMM)
EEM: emedia # E-Media (E-COMM)
EHI: heroimage # Hero Image (E-COMM)
EIL: ingredientslist # Ingredients List
ESI: secondaryimage # Secondary image (E-COMM)
# Standard Asset Types
3RT: coretoys # 3D Real Toys
@ -22,31 +19,36 @@ BBK: brandbook # Brand Book
BRC: brandcharacter # Brand Character
BSG: brandsignature # Brand Signature
CKV: campaignkeyvisual # Campaign Key Visual
CID: CreativeIdea # Creative Idea
DAT: digitalassettoolkit # Digital Assets/Toolkit
FLA: flyerartworks # Flyer Artworks
DAT: digitalasset # Digital Asset
EAN: eancodeclaim # EAN CODE - claim
FLA: flyerartworks # Trade Leaflet
FNT: font # Font
GDT: gadget # Gadget
GDT: gadget # Gadget / Prize
GRG: groupguidelines # Group Guidelines
IMG: keyvisual # Immagine Guida / Front of Pack Image (was FPO)
FPO: keyvisual # Front of Pack Image (alias for IMG)
IMG: keyvisual # Immagine Guida/Product and Key Ingredients
LGL: localguidelines # Local Guidelines
LOG: ferrerologo # Logo
MLF: marketingleaflet # Marketing Leaflet
PAW: packartworks # Pack Artworks
PKI: packshot # Pack Images (was packshot)
MLF: marketingleaflet # Toys Marketing Leaflet
NTB: nutritionalclaim # Nutritional table
PAW: packartworks # Pack Artwork
PIR: prepinstructionclaim # Prep. Instruction and recipes
PKC: packcurendering # Pack CU Rendering
PKT: packturendering # Pack TU/SU Rendering
POS: posm # POS Material
PDM: productdemo # Product Demo
PRI: productimages # Product Images
QRC: qrcode # QR code
QRC: qrcode # QR Code
SCP: sizecomparisonclaim # Size comparison picture
SNC: certificationsustainabilityclaim # Certification/sustainability/nutritional claim
SND: sound # Sound
SIP: internalproperties # Styleguide Internal Properties
SGL: licenseshighlights # Styleguide Licenses
TVC: tvc # TVC
VIE: visualidentityelements # Visual Identity Elements
UPI: unwrappedproductimage # Unwrapped Product Images
VIE: visualidentityelements # Brand Visual Identity Elements
# External Legal Opinion
EOL: externallegalopinion # External Legal Opinion (triggers field overrides)
LTD: licensingtranslationdocument # Licensing Translation Document - License claim translations (triggers field overrides)
# Note: If a 3-letter code is not in this mapping, it will be passed through as-is
# and may fail DAM validation if the code doesn't exist in DAM's domain
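
The pass-through behaviour described in the note can be sketched as follows. This is a minimal illustration, assuming the mapping is loaded into a plain dict; `to_dam_asset_type` is a hypothetical name and only a few entries from the file are reproduced.

```python
# A handful of entries from the mapping file, for illustration only.
ASSET_TYPE_MAP = {
    'EHI': 'heroimage',
    'IMG': 'keyvisual',
    'TVC': 'tvc',
    'EOL': 'externallegalopinion',
}


def to_dam_asset_type(code, mapping=ASSET_TYPE_MAP):
    """Return (dam_code, was_mapped); unmapped codes are passed through unchanged."""
    if code in mapping:
        return mapping[code], True
    return code, False


print(to_dam_asset_type('EHI'))
print(to_dam_asset_type('XYZ'))  # not in the mapping: passed through as-is
```

Returning the `was_mapped` flag lets a caller log a warning before DAM validation rejects an unknown code.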


@ -80,11 +80,15 @@ retry:
notifications:
enabled: true
smtp:
server: ${SMTP_SERVER}
port: ${SMTP_PORT}
user: ${SMTP_USER}
password: ${SMTP_PASSWORD}
sender_email: ${SENDER_EMAIL}
server: ${SMTP_SERVER:-}
port: ${SMTP_PORT:-587}
user: ${SMTP_USER:-}
password: ${SMTP_PASSWORD:-}
sender_email: ${SENDER_EMAIL:-}
mailgun:
api_key: ${MAILGUN_API_KEY:-}
domain: ${MAILGUN_DOMAIN:-}
sender_email: ${MAILGUN_SENDER_EMAIL:-}
recipients:
success:
- ${REPORT_EMAILS}
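
The `${VAR:-default}` placeholders follow the POSIX shell convention of falling back to a default when the variable is unset or empty. How this project's config loader expands them is not shown in the diff; the sketch below is one possible implementation, and `expand_placeholders` is a hypothetical name.

```python
import re

# Matches ${NAME} and ${NAME:-default}.
_PLACEHOLDER = re.compile(r'\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}')


def expand_placeholders(text, env):
    """Replace placeholders with values from `env`, using the default when unset or empty."""
    def _substitute(match):
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        if value:
            return value
        return default or ''
    return _PLACEHOLDER.sub(_substitute, text)


print(expand_placeholders('port: ${SMTP_PORT:-587}', {}))
print(expand_placeholders('port: ${SMTP_PORT:-587}', {'SMTP_PORT': '2525'}))
```

With empty defaults such as `${SMTP_SERVER:-}`, a missing variable expands to an empty string instead of leaving a literal placeholder in the config.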


@ -85,7 +85,22 @@ asset_type_overrides:
FERRERO.MARKETING.FIELD.AGENCY NAME: "-"
FERRERO.MARKET.PROD_COMPANY: "-"
MAIN_LANGUAGES: "Global"
FERRERO.MARKET.FIELD.IPRIGHT: "Yes"
FERRERO.MARKET.FIELD.IPRIGHT: "No"
FERRERO.MARKET.FIELD.LICENSIN: "No"
FERRERO.FIELD.ASSET VALIDITY START PERIOD: "" # Remove validity dates for EOL
FERRERO.FIELD.ASSET VALIDITY END PERIOD: "" # Remove validity dates for EOL
FERRERO.FIELD.CREATIVEX LINK: "" # Remove CreativeX URL for EOL
FERRERO.TAB.FIELD.CREATIVEX: "" # Remove CreativeX score for EOL
ARTESIA.FIELD.ASSET DESCRIPTION: "Legal Studio Name"
LTD: # Licensing Translation Document - License claim translations supporting EOL
FERRERO.MARKETING.FIELD.AGENCY NAME: "-"
FERRERO.MARKET.PROD_COMPANY: "-"
MAIN_LANGUAGES: "Global"
FERRERO.MARKET.FIELD.IPRIGHT: "No"
FERRERO.MARKET.FIELD.LICENSIN: "No"
FERRERO.FIELD.ASSET VALIDITY START PERIOD: "" # Remove validity dates for LTD
FERRERO.FIELD.ASSET VALIDITY END PERIOD: "" # Remove validity dates for LTD
FERRERO.FIELD.CREATIVEX LINK: "" # Remove CreativeX URL for LTD
FERRERO.TAB.FIELD.CREATIVEX: "" # Remove CreativeX score for LTD
ARTESIA.FIELD.ASSET DESCRIPTION: "Translation of License claim - For approval purposes only"


@ -76,3 +76,31 @@ defaults:
FERRERO.MARKETING.FIELD.VIDEO_POST_PROD_COMPANY: "Oliver Marketing Ltd"
FERRERO.MARKETING.FIELD.AUDIO_POST_PROD_COMPANY: "Oliver Marketing Ltd"
FERRERO.MARKET.PROD_COMPANY: "-" # Production House
# Asset type overrides (keyed by 3-letter asset type code)
# Applied AFTER normal field updates and forced values
# Overrides specific fields when a matching asset type is detected in the filename
asset_type_overrides:
  EOL: # External Legal Opinion - selected as asset type in naming tool
    FERRERO.MARKETING.FIELD.AGENCY NAME: "-"
    FERRERO.MARKET.PROD_COMPANY: "-"
    MAIN_LANGUAGES: "Global"
    FERRERO.MARKET.FIELD.IPRIGHT: "No"
    FERRERO.MARKET.FIELD.LICENSIN: "No"
    FERRERO.FIELD.ASSET VALIDITY START PERIOD: "" # Remove validity dates for EOL
    FERRERO.FIELD.ASSET VALIDITY END PERIOD: "" # Remove validity dates for EOL
    FERRERO.FIELD.CREATIVEX LINK: "" # Remove CreativeX URL for EOL
    FERRERO.TAB.FIELD.CREATIVEX: "" # Remove CreativeX score for EOL
    ARTESIA.FIELD.ASSET DESCRIPTION: "Legal Studio Name"
  LTD: # Licensing Translation Document - License claim translations supporting EOL
    FERRERO.MARKETING.FIELD.AGENCY NAME: "-"
    FERRERO.MARKET.PROD_COMPANY: "-"
    MAIN_LANGUAGES: "Global"
    FERRERO.MARKET.FIELD.IPRIGHT: "No"
    FERRERO.MARKET.FIELD.LICENSIN: "No"
    FERRERO.FIELD.ASSET VALIDITY START PERIOD: "" # Remove validity dates for LTD
    FERRERO.FIELD.ASSET VALIDITY END PERIOD: "" # Remove validity dates for LTD
    FERRERO.FIELD.CREATIVEX LINK: "" # Remove CreativeX URL for LTD
    FERRERO.TAB.FIELD.CREATIVEX: "" # Remove CreativeX score for LTD
    ARTESIA.FIELD.ASSET DESCRIPTION: "Translation of License claim - For approval purposes only"
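
The override rules above could be applied along these lines. This is a sketch under stated assumptions, not the extractor's real code: `apply_asset_type_overrides` is a hypothetical name, fields are modelled as a flat dict, and an empty string is interpreted as "remove the field" based on the inline comments.

```python
def apply_asset_type_overrides(fields, asset_type, overrides):
    """Return a copy of `fields` with the overrides for `asset_type` applied last."""
    result = dict(fields)
    for field_id, value in overrides.get(asset_type, {}).items():
        if value == "":
            # Assumption: an empty override means the field is dropped entirely.
            result.pop(field_id, None)
        else:
            result[field_id] = value
    return result


overrides = {
    'EOL': {
        'MAIN_LANGUAGES': 'Global',
        'FERRERO.MARKET.FIELD.IPRIGHT': 'No',
        'FERRERO.FIELD.CREATIVEX LINK': '',
    }
}
fields = {
    'MAIN_LANGUAGES': 'Italian',
    'FERRERO.FIELD.CREATIVEX LINK': 'https://example.invalid/cx',  # placeholder URL
}
print(apply_asset_type_overrides(fields, 'EOL', overrides))
print(apply_asset_type_overrides(fields, 'TVC', overrides))  # no overrides: unchanged
```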


@ -51,6 +51,7 @@ CREATE TABLE IF NOT EXISTS master_assets (
global_master_campaign_id VARCHAR(50),
global_master_folder_id VARCHAR(255),
local_campaign_id VARCHAR(50),
global_master_tracking_id VARCHAR(6),
-- Workflow Information
upload_directory VARCHAR(1000),
@ -198,7 +199,7 @@ CREATE TABLE IF NOT EXISTS creativex_scores (
-- Timestamps
extracted_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
status VARCHAR(50) DEFAULT 'active', -- 'active', 'superseded', 'master-cx-score'
status VARCHAR(50) DEFAULT 'active', -- 'active', 'superseded', 'master-cx-score' (A1 local masters), 'b1-master-cx-score' (B1 global masters)
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
@ -221,6 +222,7 @@ CREATE INDEX IF NOT EXISTS idx_master_assets_created_at ON master_assets(created
CREATE INDEX IF NOT EXISTS idx_master_assets_global_master ON master_assets(global_master_campaign_id);
CREATE INDEX IF NOT EXISTS idx_master_assets_local_campaign ON master_assets(local_campaign_id);
CREATE INDEX IF NOT EXISTS idx_master_assets_opentext_local ON master_assets(opentext_id, local_campaign_id);
CREATE INDEX IF NOT EXISTS idx_master_assets_global_master_tracking ON master_assets(global_master_tracking_id);
-- derivative_assets indexes
CREATE INDEX IF NOT EXISTS idx_derivative_tracking_id ON derivative_assets(tracking_id);


@ -0,0 +1,32 @@
-- Migration 003: Add A1 retry tracking to campaign_status table
-- Purpose: Prevent infinite error emails for empty A1 campaigns
-- Date: January 31, 2026
\echo 'Adding A1 retry tracking fields to campaign_status table...'
ALTER TABLE campaign_status
ADD COLUMN IF NOT EXISTS a1_retry_count INTEGER DEFAULT 0,
ADD COLUMN IF NOT EXISTS a1_last_retry_at TIMESTAMP,
ADD COLUMN IF NOT EXISTS a1_permanently_failed BOOLEAN DEFAULT FALSE,
ADD COLUMN IF NOT EXISTS a1_failure_reason TEXT;
\echo 'Fields added successfully'
-- Create index for faster queries
CREATE INDEX IF NOT EXISTS idx_campaign_status_a1_failed ON campaign_status(a1_permanently_failed);
\echo 'Index created'
-- Add comments for documentation
COMMENT ON COLUMN campaign_status.a1_retry_count IS 'Number of times A1→A2 processing attempted with empty folder';
COMMENT ON COLUMN campaign_status.a1_last_retry_at IS 'Timestamp of last retry attempt';
COMMENT ON COLUMN campaign_status.a1_permanently_failed IS 'TRUE if campaign failed all 3 retry attempts';
COMMENT ON COLUMN campaign_status.a1_failure_reason IS 'Description of why campaign was marked as permanently failed';
\echo ''
\echo '============================================================'
\echo 'Migration 003 complete!'
\echo '============================================================'
\echo 'Added fields: a1_retry_count, a1_last_retry_at, a1_permanently_failed, a1_failure_reason'
\echo 'Purpose: Track A1 empty folder retries (max 3 attempts)'
\echo '============================================================'
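
The four columns added by this migration support a simple retry counter. The sketch below models that counter in memory to show how the columns relate; it is not the project's `increment_a1_retry` implementation. `increment_retry` and `MAX_A1_RETRIES` are hypothetical names, and the `mark_failed_at_max` flag mirrors the keyword argument used by the A1→A2 script.

```python
from datetime import datetime, timezone

MAX_A1_RETRIES = 3  # "max 3 attempts" per the migration notes


def increment_retry(row, reason, mark_failed_at_max=True, now=None):
    """Return an updated copy of a campaign_status row after one more empty-folder attempt."""
    updated = dict(row)
    updated['a1_retry_count'] = row.get('a1_retry_count', 0) + 1
    updated['a1_last_retry_at'] = now or datetime.now(timezone.utc)
    if mark_failed_at_max and updated['a1_retry_count'] >= MAX_A1_RETRIES:
        updated['a1_permanently_failed'] = True
        updated['a1_failure_reason'] = reason
    return updated


row = {}
for _ in range(3):
    row = increment_retry(row, 'No master assets found')
print(row['a1_retry_count'], row.get('a1_permanently_failed', False))
```

Passing `mark_failed_at_max=False` keeps counting without ever setting `a1_permanently_failed`, which matches the "never auto-fail" handling of empty folders.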


@ -0,0 +1,13 @@
-- Migration 004: Add global_master_tracking_id column to master_assets
-- Purpose: Links local campaign assets (A1→A2) back to their global master (B1→B2)
-- by storing the M-prefixed tracking ID from the B1 record
-- Date: 2026-03-21
ALTER TABLE master_assets
ADD COLUMN IF NOT EXISTS global_master_tracking_id VARCHAR(6);
-- Index for lookups
CREATE INDEX IF NOT EXISTS idx_master_assets_global_master_tracking
ON master_assets(global_master_tracking_id);
\echo 'Migration 004 complete: Added global_master_tracking_id to master_assets'


@ -0,0 +1,14 @@
-- Migration 005: Document new 'b1-master-cx-score' status value in creativex_scores
-- Purpose: B1→B2 global master CreativeX scores are now persisted to creativex_scores
-- with status='b1-master-cx-score' so they can be queried directly without
-- joining through master_assets. No DDL change needed (status is VARCHAR(50)
-- and accepts arbitrary values); this migration exists for documentation only.
-- Date: 2026-04-29
-- Existing status values:
-- 'active' - currently-valid A2 scoring extraction (versioned)
-- 'superseded' - older A2 scoring extraction replaced by a newer one
-- 'master-cx-score' - A1→A2 local master reference score
-- 'b1-master-cx-score' - B1→B2 global master reference score (NEW)
\echo 'Migration 005 complete: b1-master-cx-score status documented (no schema change)'


@ -4,9 +4,8 @@ import sys
import psycopg2
from dotenv import load_dotenv
# Load env vars from current directory
script_dir = os.path.dirname(os.path.abspath(__file__))
load_dotenv(os.path.join(script_dir, '.env'))
# Load env vars
load_dotenv('/Users/daveporter/Desktop/CODING-2024/Ferrero-Opentext/Python-Version/.env')
try:
conn = psycopg2.connect(


@ -50,6 +50,11 @@ logging.basicConfig(
logger = logging.getLogger('A1toA2Box')
# Empty A1 folders are an expected client workflow (folder created before assets uploaded).
# Skip silently and send a single warning email at this poll count to flag genuinely-stuck
# campaigns without spamming. At ~3-min poll cadence, 20 polls ≈ 1 hour.
EMPTY_FOLDER_WARNING_THRESHOLD = 20
def extract_creativex_from_dam_metadata(asset_metadata):
"""
Extract CreativeX score and URL from DAM asset metadata if present
@ -171,6 +176,15 @@ def process_campaign(campaign, dam, box, db, notifier, config):
logger.info("Processing campaign: {} ({})".format(campaign_name, campaign_number))
logger.info("=" * 60)
# CHECK RETRY STATUS FIRST
retry_status = db.get_a1_retry_status(campaign_id)
if retry_status and retry_status['permanently_failed']:
logger.warning("Campaign {} is marked as permanently failed - skipping".format(campaign_number))
logger.info("Failure reason: {}".format(retry_status.get('failure_reason', 'Unknown')))
logger.info("To retry this campaign, manually reset it using database.reset_a1_retry()")
return {'success': False, 'processed': 0, 'failed': 0, 'skipped': True}
total_assets = 0
try:
# Get master assets
@ -180,17 +194,38 @@ def process_campaign(campaign, dam, box, db, notifier, config):
logger.info("Found {} master assets".format(total_assets))
if total_assets == 0:
logger.warning("No master assets found in Master Assets folder")
# Send email notification about empty campaign (keep error notifications)
notifier.send_email(
template_name='a1_to_a2_no_assets',
recipients=config['notifications']['recipients']['errors'],
data={
'campaign_name': campaign_name,
'campaign_id': campaign_id,
'campaign_number': campaign_number
}
# Empty folders are expected when a campaign manager creates the campaign
# before uploading assets. Track the count for visibility but never auto-fail
# — keep retrying every poll until assets appear (or status changes in DAM).
retry_result = db.increment_a1_retry(
campaign_id=campaign_id,
campaign_number=campaign_number,
campaign_name=campaign_name,
reason="No master assets found in Master Assets folder",
mark_failed_at_max=False
)
if not retry_result['success']:
logger.error("Failed to update retry counter")
retry_count = retry_result.get('retry_count', 0)
logger.info("No master assets yet (poll {}) - skipping until assets appear".format(retry_count))
# Send a single warning email when the campaign has been empty for ~1 hour
# so genuinely-stuck campaigns still surface, without spamming on every poll.
if retry_count == EMPTY_FOLDER_WARNING_THRESHOLD:
logger.warning("Campaign has been empty for {} polls - sending one-time warning".format(retry_count))
notifier.send_email(
template_name='a1_to_a2_no_assets_warning',
recipients=config['notifications']['recipients']['errors'],
data={
'campaign_name': campaign_name,
'campaign_id': campaign_id,
'campaign_number': campaign_number,
'poll_count': retry_count
}
)
return {'success': False, 'processed': 0, 'failed': 0}
# Track results
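
The one-time warning in this hunk depends on an equality check rather than a greater-or-equal check, which is why only a single email goes out. A minimal sketch of that decision follows; `should_send_empty_folder_warning` is a hypothetical helper, and the 3-minute poll interval is taken from the comment next to the threshold.

```python
EMPTY_FOLDER_WARNING_THRESHOLD = 20
POLL_INTERVAL_MINUTES = 3  # approximate cadence stated in the source comment


def should_send_empty_folder_warning(retry_count, threshold=EMPTY_FOLDER_WARNING_THRESHOLD):
    """True only on the exact poll that reaches the threshold, so the email is sent once."""
    return retry_count == threshold


# 20 polls at ~3 minutes each is roughly one hour.
minutes_until_warning = EMPTY_FOLDER_WARNING_THRESHOLD * POLL_INTERVAL_MINUTES
warnings_sent = sum(should_send_empty_folder_warning(n) for n in range(1, 101))
print(minutes_until_warning, warnings_sent)
```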
@ -219,6 +254,11 @@ def process_campaign(campaign, dam, box, db, notifier, config):
# 1. Extract Global Campaign Reference (needed for tracking ID lookup)
global_ref = db.extract_global_campaign_reference(asset, campaign_number)
# 1b. Look up matching B1→B2 global master by opentext_id
global_master_tid = db.find_global_master_by_opentext_id(asset_id)
if global_master_tid:
logger.info("Linked to global master: {} -> {}".format(asset_name, global_master_tid))
# 2. Find existing tracking ID or generate new one
# Handles re-processing: if campaign was reset to A1 after adding new masters,
# existing assets keep their tracking IDs, new assets get new IDs
@ -250,7 +290,8 @@ def process_campaign(campaign, dam, box, db, notifier, config):
upload_folder_id=final_folder_id,
global_master_campaign_id=global_ref['global_master_campaign_id'],
global_master_folder_id=global_ref['global_master_folder_id'],
local_campaign_id=global_ref['local_campaign_id']
local_campaign_id=global_ref['local_campaign_id'],
global_master_tracking_id=global_master_tid
)
if db_result['success']:
@ -296,7 +337,8 @@ def process_campaign(campaign, dam, box, db, notifier, config):
upload_folder_id=final_folder_id,
global_master_campaign_id=global_ref['global_master_campaign_id'],
global_master_folder_id=global_ref['global_master_folder_id'],
local_campaign_id=global_ref['local_campaign_id']
local_campaign_id=global_ref['local_campaign_id'],
global_master_tracking_id=global_master_tid
)
if db_result['success']:
@ -369,6 +411,9 @@ def process_campaign(campaign, dam, box, db, notifier, config):
if status_result['success']:
logger.info("✓ Status updated successfully")
# RESET retry counter on success
db.reset_a1_retry(campaign_id)
# Record campaign status in database
logger.info("Recording campaign status in database...")
db.record_campaign_status(
@ -430,7 +475,9 @@ def process_campaign(campaign, dam, box, db, notifier, config):
'asset_count': len(processed_assets),
'new_asset_count': len(new_assets),
'existing_asset_count': len(existing_assets),
'processed_assets': processed_assets
'processed_assets': processed_assets,
'new_assets': new_assets,
'existing_assets': existing_assets
},
attachments=attachments
)
@ -474,20 +521,66 @@ def process_campaign(campaign, dam, box, db, notifier, config):
except Exception as e:
logger.error("Campaign processing failed: {}".format(str(e)))
# Send error notification for this specific campaign failure
try:
notifier.send_email(
template_name='upload_failed',
recipients=config['notifications']['recipients']['errors'],
data={
'filename': "Campaign: {}".format(campaign_name),
'tracking_id': campaign_number,
'error': str(e)
}
# Check if this is a "folder not found" or "no assets" error - use retry logic
error_str = str(e).lower()
is_folder_issue = 'folder not found' in error_str or 'no assets' in error_str or 'assets folder' in error_str
if is_folder_issue:
logger.warning("Detected folder/assets issue - applying retry logic")
# Increment retry counter
retry_result = db.increment_a1_retry(
campaign_id=campaign_id,
campaign_number=campaign_number,
campaign_name=campaign_name,
reason=str(e)
)
except Exception as email_error:
logger.error("Failed to send error email: {}".format(str(email_error)))
if not retry_result['success']:
logger.error("Failed to update retry counter")
is_permanently_failed = retry_result.get('permanently_failed', False)
retry_count = retry_result.get('retry_count', 0)
# Determine which email template to use
if is_permanently_failed:
# Send FINAL failure email (after 3 attempts)
template_name = 'a1_to_a2_permanently_failed'
else:
# Send standard retry notification
template_name = 'a1_to_a2_no_assets_retry'
# Send email notification
try:
notifier.send_email(
template_name=template_name,
recipients=config['notifications']['recipients']['errors'],
data={
'campaign_name': campaign_name,
'campaign_id': campaign_id,
'campaign_number': campaign_number,
'retry_count': retry_count,
'max_retries': 3,
'is_permanently_failed': is_permanently_failed
}
)
except Exception as email_error:
logger.error("Failed to send error email: {}".format(str(email_error)))
else:
# Other errors - send generic failure notification
try:
notifier.send_email(
template_name='upload_failed',
recipients=config['notifications']['recipients']['errors'],
data={
'filename': "Campaign: {}".format(campaign_name),
'tracking_id': campaign_number,
'error': str(e)
}
)
except Exception as email_error:
logger.error("Failed to send error email: {}".format(str(email_error)))
return {'success': False, 'processed': 0, 'failed': total_assets}
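
The branching in this exception handler reduces to choosing one of three email templates. The sketch below isolates that decision as a pure function so it can be tested without a database or notifier; `choose_failure_template` is a hypothetical name, while the marker strings and template names are copied from the hunk.

```python
FOLDER_ISSUE_MARKERS = ('folder not found', 'no assets', 'assets folder')


def choose_failure_template(error_message, permanently_failed):
    """Pick the notification template for a failed campaign (illustrative only)."""
    text = error_message.lower()
    if any(marker in text for marker in FOLDER_ISSUE_MARKERS):
        if permanently_failed:
            return 'a1_to_a2_permanently_failed'
        return 'a1_to_a2_no_assets_retry'
    return 'upload_failed'


print(choose_failure_template('Master Assets folder not found', False))
print(choose_failure_template('Master Assets folder not found', True))
print(choose_failure_template('Connection reset by peer', False))
```

Substring matching on the error text is fragile if the upstream messages change, so a dedicated exception type may be a more robust signal.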
@ -553,10 +646,30 @@ def main():
db.close()
sys.exit(0)
# Exclude permanently-failed campaigns so they don't consume processing slots
eligible_campaigns = []
skipped_failed = []
for campaign in campaigns:
retry_status = db.get_a1_retry_status(campaign['asset_id'])
if retry_status and retry_status['permanently_failed']:
skipped_failed.append(campaign.get('campaign_id', 'N/A'))
else:
eligible_campaigns.append(campaign)
if skipped_failed:
logger.info("Excluding {} permanently-failed campaign(s): {}".format(
len(skipped_failed), ", ".join(skipped_failed)
))
if not eligible_campaigns:
logger.info("No eligible A1 campaigns to process - exiting")
db.close()
sys.exit(0)
# Process UP TO 2 campaigns
campaigns_to_process = campaigns[:2]
logger.info("Found {} A1 campaigns - processing {} campaign(s)".format(
len(campaigns), len(campaigns_to_process)
campaigns_to_process = eligible_campaigns[:2]
logger.info("Found {} A1 campaigns ({} eligible) - processing {} campaign(s)".format(
len(campaigns), len(eligible_campaigns), len(campaigns_to_process)
))
logger.info("")
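
The filtering step in `main()` can be summarised as: drop permanently-failed campaigns first, then take at most two. A minimal sketch, assuming the retry lookup is passed in as a callable; `select_campaigns` is a hypothetical name.

```python
def select_campaigns(campaigns, is_permanently_failed, limit=2):
    """Split campaigns into (to_process, skipped), filtering before applying the limit."""
    eligible, skipped = [], []
    for campaign in campaigns:
        if is_permanently_failed(campaign['asset_id']):
            skipped.append(campaign)
        else:
            eligible.append(campaign)
    return eligible[:limit], skipped


campaigns = [{'asset_id': 'a'}, {'asset_id': 'b'}, {'asset_id': 'c'}, {'asset_id': 'd'}]
failed_ids = {'a'}
to_process, skipped = select_campaigns(campaigns, lambda asset_id: asset_id in failed_ids)
print([c['asset_id'] for c in to_process], [c['asset_id'] for c in skipped])
```

Filtering before slicing is the point of the change: with the old `campaigns[:2]`, a failed campaign at the head of the list would occupy one of the two processing slots on every run.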


@ -97,12 +97,12 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, not
tracking_ids = parsed.get('tracking_ids', [tracking_id]) # Get all IDs or fallback to single
has_multiple_masters = parsed.get('has_multiple_masters', False)
# Load all master assets (PPR: multiple, PROD: single)
# Load all master assets (supports multiple masters in both PPR and PROD)
master_assets = []
master_opentext_ids = []
if has_multiple_masters:
logger.info("PPR - Multiple master assets detected: {}".format(', '.join(tracking_ids)))
logger.info("Multiple master assets detected: {}".format(', '.join(tracking_ids)))
for tid in tracking_ids:
master = db.get_master_asset(tid)
if not master:
@ -128,6 +128,7 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, not
master_opentext_ids = [master_asset['opentext_id']]
# CHECK: Warn if Master Tracking ID is used (starts with uppercase M)
if tracking_id.startswith('M'):
logger.warning("Detected Master Tracking ID in Version/Derivative upload folder: {}".format(tracking_id))
@ -185,7 +186,47 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, not
# If legacy single platform exists, add it to list
if not platforms and data_obj.get('ferrero_mapped_platform'):
platforms = [data_obj.get('ferrero_mapped_platform')]
# Fallback: Handle new CreativeX API format (no 'data' wrapper)
# Maps API channel/publisher back to DAM platform names
if not platforms and isinstance(full_data, dict) and 'channel' in full_data:
api_channel = full_data.get('channel', '')
api_publisher = full_data.get('publisher', '')
CHANNEL_TO_DAM = {
'google_ads': 'Google',
'dv360': 'DV360',
'tiktok_paid': 'TikTok',
'snapchat_paid': 'Snap',
'pinterest': 'Pinterest',
'twitter_paid': 'Twitter',
'amazon_paid': 'Amazon',
}
FB_PUBLISHER_TO_DAM = {
'facebook': 'FB - Feed',
'audience_network': 'Audience Network - An Classic',
'messenger': 'Messenger - Inbox',
}
IG_PUBLISHER_TO_DAM = {
'instagram': 'IG - Feed',
}
if api_channel in CHANNEL_TO_DAM:
platforms = [CHANNEL_TO_DAM[api_channel]]
elif api_channel == 'facebook_paid' and api_publisher in FB_PUBLISHER_TO_DAM:
platforms = [FB_PUBLISHER_TO_DAM[api_publisher]]
elif api_channel == 'instagram_paid' and api_publisher in IG_PUBLISHER_TO_DAM:
platforms = [IG_PUBLISHER_TO_DAM[api_publisher]]
elif api_channel == 'facebook_paid':
platforms = ['FB - Feed']
elif api_channel == 'instagram_paid':
platforms = ['IG - Feed']
if platforms:
logger.info("CreativeX: Mapped API channel '{}'/publisher '{}' to DAM platform '{}'".format(
api_channel, api_publisher, platforms[0]))
box_metadata = {
'score': creativex_data['quality_score'],
'url': creativex_data['creativex_url'],
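
The channel/publisher fallback added in this hunk can be read as a single lookup function. The sketch below restates it in isolation, using the same mapping tables; `map_to_dam_platform` is a hypothetical name and returning `None` for unknown channels is an assumption about how a caller would detect "no match".

```python
CHANNEL_TO_DAM = {
    'google_ads': 'Google',
    'dv360': 'DV360',
    'tiktok_paid': 'TikTok',
    'snapchat_paid': 'Snap',
    'pinterest': 'Pinterest',
    'twitter_paid': 'Twitter',
    'amazon_paid': 'Amazon',
}
PUBLISHER_TO_DAM = {
    'facebook_paid': {
        'facebook': 'FB - Feed',
        'audience_network': 'Audience Network - An Classic',
        'messenger': 'Messenger - Inbox',
    },
    'instagram_paid': {'instagram': 'IG - Feed'},
}
CHANNEL_DEFAULTS = {'facebook_paid': 'FB - Feed', 'instagram_paid': 'IG - Feed'}


def map_to_dam_platform(channel, publisher=''):
    """Map a CreativeX API channel/publisher pair to a DAM platform name, or None."""
    if channel in CHANNEL_TO_DAM:
        return CHANNEL_TO_DAM[channel]
    by_publisher = PUBLISHER_TO_DAM.get(channel, {})
    if publisher in by_publisher:
        return by_publisher[publisher]
    # Unknown publisher on a known Meta channel falls back to the feed placement.
    return CHANNEL_DEFAULTS.get(channel)


print(map_to_dam_platform('dv360'))
print(map_to_dam_platform('facebook_paid', 'messenger'))
print(map_to_dam_platform('facebook_paid', 'unknown'))
```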
@ -196,12 +237,12 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, not
))
creativex_found = True
else:
# Use default values when no CreativeX score found
# Use default values when no CreativeX score found - no URL sent
box_metadata = {
'score': '0',
'url': 'https://app.creativex.com/preflight/pretests'
'url': ''
}
logger.warning("No CreativeX score found for: {} - Using default values (Score: 0, Placeholder URL)".format(
logger.warning("No CreativeX score found for: {} - Using default values (Score: 0, No URL)".format(
filename
))
creativex_found = False
@ -213,7 +254,19 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, not
# 5. Get clean filename
clean_filename = parser.strip_upload_components(filename)
# 6. Build MVP asset representation with CreativeX data from database
# 6. Look up pre-upload metadata override saved by the naming tool's editor.
# The naming tool stores filename without extension, so strip it here.
filename_no_ext = os.path.splitext(filename)[0]
override = db.get_override_metadata(filename_no_ext)
override_fields = None
if override:
override_fields = override.get('override_fields')
logger.info("Found pre-upload override (id={}) for {}: {} field(s)".format(
override.get('id'), filename_no_ext,
len(override_fields) if override_fields else 0
))
# 7. Build MVP asset representation with CreativeX data from database
asset_rep = mvp_extractor.build_mvp_asset_representation(
master_metadata=master_asset['full_metadata'],
clean_filename=clean_filename,
@ -221,7 +274,8 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, not
box_metadata=box_metadata, # Pass CreativeX data from database
tracking_mode=tracking_mode, # Pass tracking mode for folder-only handling
master_opentext_id=master_asset['opentext_id'], # Primary master DAM ID
master_opentext_ids=master_opentext_ids # All master IDs (PPR: multiple, PROD: single)
master_opentext_ids=master_opentext_ids, # All master IDs (multiple or single)
override_fields=override_fields # Pre-upload edits from naming tool
)
# DRYRUN MODE: Display full asset representation and exit
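
Two details of the override lookup are easy to get wrong: the lookup key is the filename without its extension, and (per the commit message) empty editor values leave the inherited value alone. The sketch below illustrates both. It is not the project's `_apply_override_fields`; `override_lookup_key` and `merge_override_fields` are hypothetical names, the filename is made up, and fields are modelled as a flat dict.

```python
import os


def override_lookup_key(filename):
    """Strip only the final extension, matching how the naming tool stores filenames."""
    return os.path.splitext(filename)[0]


def merge_override_fields(fields, override_fields):
    """Apply user edits on top of inherited fields, skipping empty editor values."""
    merged = dict(fields)
    for field_id, value in (override_fields or {}).items():
        if value in (None, ''):
            continue  # empty edit: keep the inherited value
        merged[field_id] = value
    return merged


print(override_lookup_key('Campaign_Asset_v1.final.mp4'))
inherited = {'MAIN_LANGUAGES': 'Italian', 'FERRERO.MARKET.PROD_COMPANY': '-'}
edits = {'MAIN_LANGUAGES': 'German', 'FERRERO.MARKET.PROD_COMPANY': ''}
print(merge_override_fields(inherited, edits))
```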
@ -246,10 +300,10 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, not
logger.info(" URL: {}".format(box_metadata.get('url')))
logger.info("")
# PPR ONLY: Register master asset IDs in lookup domain (even in dryrun for testing)
# Register master asset IDs in lookup domain (even in dryrun for testing)
# This API call is safe - it only adds values to the lookup table, doesn't create assets
if master_opentext_ids:
logger.info("PPR Domain Registration Test:")
logger.info("Domain Registration Test:")
registration_result = dam.register_master_asset_ids_for_ppr(master_opentext_ids)
if registration_result.get('skipped'):
logger.info(" Skipped (not PPR environment)")
@ -270,7 +324,7 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, not
'clean_filename': clean_filename,
'creativex_found': creativex_found,
'creativex_score': box_metadata.get('score', '0'),
'creativex_url': box_metadata.get('url', 'https://app.creativex.com/preflight/pretests'),
'creativex_url': box_metadata.get('url', ''),
'dryrun': True
}
@ -292,7 +346,7 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, not
)
logger.info("Will upload to: 01. Final Assets/{}".format(subfolder_path))
# PPR ONLY: Register master asset IDs in lookup domain before upload
# Register master asset IDs in lookup domain before upload
# OpenText API requires domain values to exist before they can be used in asset creation
if master_opentext_ids:
dam.register_master_asset_ids_for_ppr(master_opentext_ids)
@ -314,6 +368,10 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, not
filename=clean_filename
)
# Mark pre-upload override as applied (only after confirmed DAM upload success).
if override:
db.mark_override_applied(filename_no_ext)
# 9. Delete file from Box after successful upload (unless --keep-files flag set)
if keep_files:
logger.info("--keep-files flag set - File kept in Box: {}".format(filename))
@ -338,7 +396,7 @@ def process_box_file(file_info, dam, box, db, parser, mvp_extractor, config, not
'clean_filename': clean_filename,
'creativex_found': creativex_found,
'creativex_score': box_metadata.get('score', '0'),
'creativex_url': box_metadata.get('url', 'https://app.creativex.com/preflight/pretests'),
'creativex_url': box_metadata.get('url', ''),
'subfolder_path': subfolder_path # Add subfolder path to result
}


@ -52,61 +52,57 @@ logger = logging.getLogger('A4Box')
def generate_and_upload_csv(db, box, config):
"""
Generate CSV of all live campaigns and upload to Box
Generate the combined live-campaigns CSV (A-series + B-series) and upload
to Box. OMG's automation treats each new file as a full replacement of
its live list, so we always emit the complete list under one filename.
"""
try:
logger.info("Generating live campaigns CSV...")
# 1. Get all live campaigns from DB
campaigns = db.get_all_live_campaigns()
if not campaigns:
logger.warning("No live campaigns found to report")
# Even if empty, we might want to upload an empty CSV to clear the list?
# For now, let's upload it even if empty to reflect that no campaigns are live.
logger.info("Found {} live campaigns".format(len(campaigns)))
# 2. Generate CSV file
timestamp = datetime.now(timezone.utc).strftime('%Y-%m-%d_%H%M%S_UTC')
csv_filename = 'live_campaigns_{}.csv'.format(timestamp)
csv_path = os.path.join('temp', csv_filename)
os.makedirs('temp', exist_ok=True)
with open(csv_path, 'w', newline='') as csvfile:
fieldnames = ['code', 'description']
writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
writer.writeheader()
for camp in campaigns:
writer.writerow({
'code': "{}-{}".format(camp['campaign_number'], camp['campaign_name']),
'description': camp['campaign_name']
})
logger.info("Generated CSV: {}".format(csv_path))
# 3. Upload to Box
folder_id = config['box'].get('live_campaigns_folder_id')
if not folder_id:
logger.error("Box live_campaigns_folder_id not configured")
return False
upload_result = box.upload_file(
file_path=csv_path,
folder_id=folder_id,
target_filename=csv_filename
)
logger.info("Uploaded CSV to Box: {} (File ID: {})".format(
csv_filename, upload_result['file_id']
))
# Clean up
os.remove(csv_path)
return True
except Exception as e:
logger.error("Failed to generate/upload CSV: {}".format(str(e)))
return False
@ -149,11 +145,9 @@ def process_campaign(campaign, dam, box, db, notifier, config):
webhook_sent=True # Mark as processed
)
# Generate and upload updated CSV
# This will now exclude the campaign we just marked as NO
logger.info("Generating and uploading updated live campaigns CSV...")
csv_success = generate_and_upload_csv(db, box, config)
if csv_success:
logger.info("✓ CSV report uploaded successfully")
else:


@ -10,8 +10,10 @@ Compatible with Python 3.6+
import sys
import os
import time
import csv
import logging
import argparse
from datetime import datetime, timezone
# Add shared library to path
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))
@ -52,6 +54,136 @@ logging.basicConfig(
logger = logging.getLogger('B1toB2')
def _walk_metadata_elements(elements):
    """Recursively yield every element in nested metadata_element_list arrays.
    Categories and tables both nest fields underneath them, so a flat walk
    misses anything below the top level."""
    for e in elements or []:
        if not isinstance(e, dict):
            continue
        yield e
        nested = e.get('metadata_element_list')
        if isinstance(nested, list):
            for sub in _walk_metadata_elements(nested):
                yield sub
def extract_creativex_from_dam_metadata(asset_metadata):
    """
    Extract CreativeX score and URL from DAM asset metadata if present.
    Walks the metadata_element_list recursively because the score field
    (FERRERO.TAB.FIELD.CREATIVEX) is nested at depth 2 under its parent
    table FERRERO.TABULAR.FIELD.CREATIVEX, not at the top level.
    """
    try:
        top = (asset_metadata or {}).get('metadata', {}).get('metadata_element_list', [])
        cx = {'score': None, 'url': None}
        for element in _walk_metadata_elements(top):
            element_id = element.get('id')
            if element_id == 'FERRERO.TAB.FIELD.CREATIVEX':
                values = element.get('values', [])
                if values:
                    value_obj = values[0].get('value', {})
                    if isinstance(value_obj, dict):
                        field_value = value_obj.get('field_value', {})
                        if isinstance(field_value, dict):
                            score = field_value.get('value')
                            if score:
                                cx['score'] = str(score)
            elif element_id == 'FERRERO.FIELD.CREATIVEX LINK':
                value_obj = element.get('value', {})
                if isinstance(value_obj, dict):
                    nested_value = value_obj.get('value', {})
                    if isinstance(nested_value, dict):
                        url = nested_value.get('value')
                        if url:
                            cx['url'] = url
        return cx
    except Exception as e:
        logger.warning("Failed to extract CreativeX from metadata: {}".format(str(e)))
        return {'score': None, 'url': None}
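
To see why the recursive walk matters, the sketch below runs a walker over a small nested structure and compares it with a flat scan of the top level. The sample data is hypothetical: its shape is inferred from the keys read by the extractor above, and `CATEGORY.EXAMPLE` is a made-up category ID. The walker uses `yield from`, which is available on the Python 3.6+ baseline this script targets.

```python
def walk(elements):
    """Yield every dict element, descending into nested metadata_element_list arrays."""
    for element in elements or []:
        if not isinstance(element, dict):
            continue
        yield element
        nested = element.get('metadata_element_list')
        if isinstance(nested, list):
            yield from walk(nested)


# Hypothetical metadata: category -> table -> score field (depth 2).
sample = [{
    'id': 'CATEGORY.EXAMPLE',
    'metadata_element_list': [
        {
            'id': 'FERRERO.TABULAR.FIELD.CREATIVEX',
            'metadata_element_list': [
                {'id': 'FERRERO.TAB.FIELD.CREATIVEX',
                 'values': [{'value': {'field_value': {'value': 'DV360^100'}}}]},
            ],
        },
        {'id': 'FERRERO.FIELD.CREATIVEX LINK'},
    ],
}]

flat_ids = [e.get('id') for e in sample]
all_ids = [e.get('id') for e in walk(sample)]
print(flat_ids)
print(all_ids)
```

A flat scan sees only the category, while the recursive walk also reaches the table, the score field, and the link field.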
def generate_and_upload_csv(db, box, config):
    """
    Generate the combined live-campaigns CSV (A-series + B-series) and upload
    to Box. OMG's automation treats each new file as a full replacement of
    its live list, so we always emit the complete list under one filename.
    """
    try:
        logger.info("Generating live campaigns CSV...")
        campaigns = db.get_all_live_campaigns()
        if not campaigns:
            logger.warning("No live campaigns found to report")
        logger.info("Found {} live campaigns".format(len(campaigns)))
        timestamp = datetime.now(timezone.utc).strftime('%Y-%m-%d_%H%M%S_UTC')
        csv_filename = 'live_campaigns_{}.csv'.format(timestamp)
        csv_path = os.path.join('temp', csv_filename)
        os.makedirs('temp', exist_ok=True)
        with open(csv_path, 'w', newline='') as csvfile:
            fieldnames = ['code', 'description']
            writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
            writer.writeheader()
            for camp in campaigns:
                writer.writerow({
                    'code': "{}-{}".format(camp['campaign_number'], camp['campaign_name']),
                    'description': camp['campaign_name']
                })
        logger.info("Generated CSV: {}".format(csv_path))
        folder_id = config['box'].get('live_campaigns_folder_id')
        if not folder_id:
            logger.error("Box live_campaigns_folder_id not configured")
            return False
        upload_result = box.upload_file(
            file_path=csv_path,
            folder_id=folder_id,
            target_filename=csv_filename
        )
        logger.info("Uploaded CSV to Box: {} (File ID: {})".format(
            csv_filename, upload_result['file_id']
        ))
        os.remove(csv_path)
        return True
    except Exception as e:
        logger.error("Failed to generate/upload CSV: {}".format(str(e)))
        return False
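The two-column layout written by the function above can be sketched without any Box or database dependency. `render_live_campaigns_csv` and the sample rows are hypothetical; the row keys are assumed to match what `get_all_live_campaigns()` returns, based only on how the function above reads them.

```python
import csv
import io


def render_live_campaigns_csv(campaigns):
    """Render the code/description CSV into a string instead of a temp file."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=['code', 'description'])
    writer.writeheader()
    for camp in campaigns:
        writer.writerow({
            'code': "{}-{}".format(camp['campaign_number'], camp['campaign_name']),
            'description': camp['campaign_name'],
        })
    return buffer.getvalue()


# Hypothetical rows; the second name contains a comma to show csv quoting.
rows = [
    {'campaign_number': 'C000000078', 'campaign_name': 'CONTENT SCALING'},
    {'campaign_number': 'C000000101', 'campaign_name': 'Spring, Launch'},
]
output = render_live_campaigns_csv(rows)
lines = output.splitlines()
header_only = render_live_campaigns_csv([])
```

An empty campaign list still yields a header-only file, which fits the "full replacement" behaviour described in the docstring: uploading it would presumably clear the downstream live list.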
def format_cx_score_for_display(raw_score):
    """DAM stores the CreativeX score as a tabular cell that concatenates
    platform and score with a caret, e.g. 'DV360^100'. Convert to
    '100 (DV360)' for human-readable email output. Returns the raw value
    unchanged if it doesn't match the expected pattern."""
    if not raw_score:
        return raw_score
    if '^' in raw_score:
        platform, _, score = raw_score.partition('^')
        platform = platform.strip()
        score = score.strip()
        if platform and score:
            return "{} ({})".format(score, platform)
    return raw_score
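A few worked inputs make the caret convention concrete. The function body below is copied from the diff so the snippet runs on its own; the input values are illustrative, not taken from real DAM data.

```python
def format_cx_score_for_display(raw_score):
    """Turn 'PLATFORM^SCORE' into 'SCORE (PLATFORM)'; pass anything else through."""
    if not raw_score:
        return raw_score
    if '^' in raw_score:
        platform, _, score = raw_score.partition('^')
        platform = platform.strip()
        score = score.strip()
        if platform and score:
            return "{} ({})".format(score, platform)
    return raw_score


# Illustrative inputs: well-formed, padded, plain score, and malformed cells.
examples = ['DV360^100', ' DV360 ^ 87 ', '92', '^100', 'DV360^', None]
formatted = [format_cx_score_for_display(value) for value in examples]
```

Malformed cells (missing platform or missing score) come back unchanged. The function expects a string or `None`; the extractor already wraps the score in `str()`, so that holds for the call sites shown in this diff.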
def process_campaign(campaign, dam, box, db, notifier, config):
"""
Process single campaign - download all master assets
@ -103,6 +235,7 @@ def process_campaign(campaign, dam, box, db, notifier, config):
return {'success': False, 'processed': 0, 'failed': total_assets}
# Process each asset
skipped_count = 0
for asset in master_assets:
asset_id = asset['asset_id']
asset_name = asset.get('name', 'unknown')
@ -117,7 +250,7 @@ def process_campaign(campaign, dam, box, db, notifier, config):
# SAFEGUARD: Check if it's a folder (should be handled by dam_client, but double check)
asset_type = asset.get('asset_type', {})
type_name = asset_type.get('name', '') if isinstance(asset_type, dict) else str(asset_type)
if 'folder' in type_name.lower():
logger.warning("Skipping item identified as folder: {} (Type: {})".format(asset_name, type_name))
continue
@ -128,6 +261,37 @@ def process_campaign(campaign, dam, box, db, notifier, config):
logger.warning("Skipping item with no extension (likely folder/container): {}".format(asset_name))
continue
# SKIP CHECK: If this asset was already processed (exists in DB), skip re-downloading
existing_tracking_id = db.find_global_master_by_opentext_id(asset_id)
if existing_tracking_id:
existing_asset = db.get_master_asset(existing_tracking_id)
if existing_asset and existing_asset.get('box_url'):
skipped_count += 1
logger.info("⏭ Already processed: {} -> {} (skipping)".format(asset_name, existing_tracking_id))
cx = extract_creativex_from_dam_metadata(existing_asset.get('full_metadata') or {})
if cx['score'] or cx['url']:
db.store_creativex_score(
filename=asset_name,
creativex_id='',
creativex_url=cx['url'] or '',
quality_score=cx['score'] or '',
box_file_id=existing_asset.get('box_file_id', ''),
full_extraction_data={'master_metadata': True, 'source': 'b1_to_b2', 'data': cx},
tracking_id=existing_tracking_id,
status='b1-master-cx-score'
)
processed_assets.append({
'asset_id': asset_id,
'asset_name': asset_name,
'tracking_id': existing_tracking_id,
'box_file_id': existing_asset.get('box_file_id', ''),
'box_url': existing_asset.get('box_url', ''),
'creativex_score': format_cx_score_for_display(cx['score']),
'creativex_url': cx['url'],
'is_existing': True
})
continue
# 1. Download from DAM
file_path = dam.download_asset(
asset_id,
@ -161,12 +325,29 @@ def process_campaign(campaign, dam, box, db, notifier, config):
)
if db_result['success']:
cx = extract_creativex_from_dam_metadata(asset)
if cx['score']:
logger.info("CreativeX score on master {}: {}".format(asset_name, cx['score']))
if cx['score'] or cx['url']:
db.store_creativex_score(
filename=asset_name,
creativex_id='',
creativex_url=cx['url'] or '',
quality_score=cx['score'] or '',
box_file_id=box_result['file_id'],
full_extraction_data={'master_metadata': True, 'source': 'b1_to_b2', 'data': cx},
tracking_id=tracking_id,
status='b1-master-cx-score'
)
processed_assets.append({
'asset_id': asset_id,
'asset_name': asset_name,
'tracking_id': tracking_id,
'box_file_id': box_result['file_id'],
'box_url': box_result['url']
'box_url': box_result['url'],
'creativex_score': format_cx_score_for_display(cx['score']),
'creativex_url': cx['url'],
'is_existing': False
})
logger.info("✓ Success: {} -> {}".format(asset_name, tracking_id))
else:
@ -186,10 +367,16 @@ def process_campaign(campaign, dam, box, db, notifier, config):
# CHECK: All assets processed successfully?
all_done = len(processed_assets) == total_assets
# Split new vs existing for reporting
new_assets = [a for a in processed_assets if not a.get('is_existing')]
existing_assets = [a for a in processed_assets if a.get('is_existing')]
logger.info("")
logger.info("Campaign {} Results:".format(campaign_id))
logger.info(" Total: {}".format(total_assets))
logger.info(" Successful: {}".format(len(processed_assets)))
logger.info(" Skipped (already done): {}".format(skipped_count))
logger.info(" New this run: {}".format(len(new_assets)))
logger.info(" Failed: {}".format(len(failed_assets)))
logger.info(" All Done: {}".format("YES" if all_done else "NO"))
logger.info("")
@ -203,6 +390,28 @@ def process_campaign(campaign, dam, box, db, notifier, config):
if status_result['success']:
logger.info("✓ Status updated successfully")
# Record campaign status in database — marks it as LIVE so the
# global CSV picks it up. B4 closure (or A4 with prior B-status)
# later flips this to NO.
logger.info("Recording campaign status in database (Live: YES, status B2)...")
db.record_campaign_status(
campaign_id=campaign_id,
campaign_number=campaign_number,
campaign_name=campaign_name,
live_campaign='YES',
status='B2',
webhook_sent=False # B-series workflow doesn't send a webhook
)
# Regenerate and upload the combined live campaigns CSV to Box.
# Box automation forwards it to OMG as a full-list replacement.
logger.info("Generating and uploading live campaigns CSV...")
csv_success = generate_and_upload_csv(db, box, config)
if csv_success:
logger.info("✓ CSV report uploaded successfully")
else:
logger.error("✗ CSV report generation/upload failed")
# NOTE: B1→B2 workflow does NOT send webhook (only email notification)
# Webhook is only used for A1→A2 workflow
@ -215,7 +424,7 @@ def process_campaign(campaign, dam, box, db, notifier, config):
os.makedirs("temp")
with open(csv_path, 'w', newline='') as csvfile:
fieldnames = ['Filename', 'Tracking ID', 'Campaign Number']
fieldnames = ['Filename', 'Tracking ID', 'Campaign Number', 'Status']
writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
writer.writeheader()
@ -223,7 +432,8 @@ def process_campaign(campaign, dam, box, db, notifier, config):
writer.writerow({
'Filename': asset['asset_name'],
'Tracking ID': asset['tracking_id'],
'Campaign Number': campaign_number
'Campaign Number': campaign_number,
'Status': 'Existing' if asset.get('is_existing') else 'New'
})
logger.info("Generated CSV report: {}".format(csv_path))
@ -242,7 +452,11 @@ def process_campaign(campaign, dam, box, db, notifier, config):
'campaign_id': campaign_id,
'campaign_number': campaign_number,
'asset_count': len(processed_assets),
'processed_assets': processed_assets
'new_asset_count': len(new_assets),
'existing_asset_count': len(existing_assets),
'processed_assets': processed_assets,
'new_assets': new_assets,
'existing_assets': existing_assets
},
attachments=attachments
)
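The new/existing bookkeeping added in this file reduces to a flag on each processed asset. A minimal sketch, with made-up asset names and tracking IDs:

```python
# Hypothetical processed_assets entries; only the keys used below are included.
processed_assets = [
    {'asset_name': 'hero.mp4', 'tracking_id': 'M000001', 'is_existing': True},
    {'asset_name': 'cutdown.mp4', 'tracking_id': 'M000002', 'is_existing': False},
    {'asset_name': 'still.jpg', 'tracking_id': 'M000003'},  # flag absent: counted as new
]

new_assets = [a for a in processed_assets if not a.get('is_existing')]
existing_assets = [a for a in processed_assets if a.get('is_existing')]

# Value written to the per-campaign CSV 'Status' column for each asset.
status_column = ['Existing' if a.get('is_existing') else 'New' for a in processed_assets]
```

Because both lists are built with `.get('is_existing')`, an entry that lacks the flag falls into `new_assets`, so the two lists always partition `processed_assets`.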

@ -0,0 +1,283 @@
#!/usr/bin/env python3
"""
B4 Box Uploader
Monitors campaigns with status B4 (Global - Not Going Live)
Updates status in DB to live_campaign='NO'
Generates and uploads updated GLOBAL CSV of live campaigns to Box.
Mirrors a4_box_uploader.py for the global (B-series) workflow.
"""
import sys
import os
import time
import logging
import argparse
import csv
from datetime import datetime, timezone
# Add shared library to path
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))
from shared.config_loader import load_config
from shared.dam_client import DAMClient
from shared.box_client import BoxClient
from shared.database import Database
from shared.notifier import Notifier
# Setup logging with rotation
from logging.handlers import RotatingFileHandler
# Create logs directory if it doesn't exist
os.makedirs('logs', exist_ok=True)
os.makedirs('logs/backup', exist_ok=True)
# Configure logging with rotation
log_handler = RotatingFileHandler(
'logs/b4_box.log',
maxBytes=10*1024*1024, # 10MB per file
backupCount=28
)
log_handler.setLevel(logging.INFO)
log_handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.INFO)
console_handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
logging.basicConfig(
level=logging.INFO,
handlers=[log_handler, console_handler]
)
logger = logging.getLogger('B4Box')
def generate_and_upload_csv(db, box, config):
    """
    Generate the combined live-campaigns CSV (A-series + B-series) and upload
    to Box. OMG's automation treats each new file as a full replacement of
    its live list, so we always emit the complete list under one filename.
    """
    try:
        logger.info("Generating live campaigns CSV...")
        campaigns = db.get_all_live_campaigns()
        if not campaigns:
            logger.warning("No live campaigns found to report")
        logger.info("Found {} live campaigns".format(len(campaigns)))
        timestamp = datetime.now(timezone.utc).strftime('%Y-%m-%d_%H%M%S_UTC')
        csv_filename = 'live_campaigns_{}.csv'.format(timestamp)
        csv_path = os.path.join('temp', csv_filename)
        os.makedirs('temp', exist_ok=True)
        with open(csv_path, 'w', newline='') as csvfile:
            fieldnames = ['code', 'description']
            writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
            writer.writeheader()
            for camp in campaigns:
                writer.writerow({
                    'code': "{}-{}".format(camp['campaign_number'], camp['campaign_name']),
                    'description': camp['campaign_name']
                })
        logger.info("Generated CSV: {}".format(csv_path))
        folder_id = config['box'].get('live_campaigns_folder_id')
        if not folder_id:
            logger.error("Box live_campaigns_folder_id not configured")
            return False
        upload_result = box.upload_file(
            file_path=csv_path,
            folder_id=folder_id,
            target_filename=csv_filename
        )
        logger.info("Uploaded CSV to Box: {} (File ID: {})".format(
            csv_filename, upload_result['file_id']
        ))
        os.remove(csv_path)
        return True
    except Exception as e:
        logger.error("Failed to generate/upload CSV: {}".format(str(e)))
        return False
def process_campaign(campaign, dam, box, db, notifier, config):
"""
Process B4 campaign - mark not-live and regenerate the global CSV.
"""
campaign_id = campaign['asset_id']
campaign_name = campaign['campaign_name']
campaign_number = campaign.get('campaign_id') or 'UNKNOWN'
logger.info("=" * 60)
logger.info("Processing B4 campaign: {} ({})".format(campaign_name, campaign_number))
logger.info("=" * 60)
try:
campaign_check = db.check_campaign_processed(campaign_id)
if campaign_check['exists'] and campaign_check['webhook_sent']:
logger.info("Campaign already processed")
logger.info(" Processed at: {}".format(campaign_check['webhook_sent_at']))
logger.info(" Status: {}".format(campaign_check['status']))
logger.info(" Live Campaign: {}".format(campaign_check['live_campaign']))
logger.info("Skipping to avoid duplicate processing")
return {'success': True, 'processed': False, 'already_processed': True}
logger.info("Recording campaign status in database (Live: NO)...")
db.record_campaign_status(
campaign_id=campaign_id,
campaign_number=campaign_number,
campaign_name=campaign_name,
live_campaign='NO',
status='B4',
webhook_sent=True
)
logger.info("Generating and uploading updated live campaigns CSV...")
csv_success = generate_and_upload_csv(db, box, config)
if csv_success:
logger.info("✓ CSV report uploaded successfully")
else:
logger.error("✗ CSV report generation/upload failed")
notifier.send_email(
template_name='a4_webhook_sent', # Reuse template — conveys "closure processed"
recipients=config['notifications']['recipients']['success'],
data={
'campaign_name': campaign_name,
'campaign_id': campaign_id,
'campaign_number': campaign_number,
'webhook_url': 'CSV Uploaded to Box (Global)'
}
)
return {'success': True, 'processed': True}
except Exception as e:
logger.error("Campaign processing failed: {}".format(str(e)))
return {'success': False, 'processed': False}
def main():
"""Main polling loop"""
parser = argparse.ArgumentParser(description='Ferrero B4 Box Uploader')
parser.add_argument('--auth-pfx', action='store_true',
help='Use mTLS certificate authentication (Legacy APIM)')
parser.add_argument('--auth-pfx-v2', action='store_true',
help='Use mTLS V2 (Hybrid) authentication')
args = parser.parse_args()
logger.info("=" * 60)
logger.info("Ferrero B4 Box Uploader Starting")
auth_mode = 'oauth'
if args.auth_pfx_v2:
auth_mode = 'mtls_v2'
logger.info("Authentication: mTLS V2 (Hybrid)")
elif args.auth_pfx:
auth_mode = 'mtls'
logger.info("Authentication: mTLS Certificate (Legacy)")
else:
logger.info("Authentication: OAuth2 (default)")
logger.info("=" * 60)
config = load_config('config/config.yaml')
dam = DAMClient(config, auth_mode=auth_mode)
box = BoxClient(config)
db = Database(config)
notifier = Notifier(config)
logger.info("Testing connections...")
if not dam.test_connection():
logger.error("DAM connection failed - exiting")
sys.exit(1)
if not box.test_connection():
logger.error("Box connection failed - exiting")
sys.exit(1)
if not db.test_connection():
logger.error("Database connection failed - exiting")
sys.exit(1)
logger.info("All connections OK")
logger.info("")
try:
logger.info("Searching for B4 campaigns...")
campaigns = dam.search_campaigns(status='B4')
if not campaigns:
logger.info("No B4 campaigns found - exiting")
db.close()
sys.exit(0)
logger.info("Found {} B4 campaign(s) - processing all".format(len(campaigns)))
logger.info("")
processed_count = 0
failed_count = 0
already_processed_count = 0
for campaign in campaigns:
result = process_campaign(campaign, dam, box, db, notifier, config)
if result['success']:
if result.get('processed'):
processed_count += 1
if result.get('already_processed'):
already_processed_count += 1
else:
failed_count += 1
logger.info("")
logger.info("=" * 60)
logger.info("B4 Box Uploader Summary")
logger.info("=" * 60)
logger.info("Total campaigns found: {}".format(len(campaigns)))
logger.info("Processed (CSV updated): {}".format(processed_count))
logger.info("Already processed: {}".format(already_processed_count))
logger.info("Failed: {}".format(failed_count))
logger.info("=" * 60)
db.close()
if failed_count == 0:
sys.exit(0)
elif processed_count > 0:
sys.exit(0)
else:
sys.exit(1)
except Exception as e:
logger.critical("Script error: {}".format(str(e)))
notifier.send_email(
template_name='upload_failed',
recipients=config['notifications']['recipients']['critical'],
data={
'filename': 'B4 Box Uploader',
'tracking_id': 'N/A',
'error': str(e)
}
)
db.close()
sys.exit(1)
if __name__ == '__main__':
main()
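The exit-code policy at the end of `main()` is easy to misread, so here is a small sketch of it as a pure function. `exit_code` is a hypothetical helper, not part of the script.

```python
def exit_code(processed_count, failed_count):
    """Return 1 only when at least one campaign failed and none were processed."""
    if failed_count == 0:
        return 0
    if processed_count > 0:
        return 0
    return 1


# (processed, failed) -> exit status for a few representative runs.
outcomes = {
    (0, 0): exit_code(0, 0),  # nothing to do
    (3, 0): exit_code(3, 0),  # clean run
    (2, 1): exit_code(2, 1),  # partial success still exits 0
    (0, 1): exit_code(0, 1),  # only failures
}
```

Campaigns skipped as already processed do not increment `processed_count`, so a run with one failure and only skipped campaigns exits 1 under this policy.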

@ -0,0 +1,203 @@
#!/usr/bin/env python3
"""
One-shot backfill: Populate creativex_scores with status='b1-master-cx-score'
for B1→B2 global masters already in master_assets that don't yet have a row.
Identification rule:
tracking_id LIKE 'M%' AND local_campaign_id IS NULL AND status = 'active'
B1→B2 stores masters without local_campaign_id; A1→A2 always sets it, so this
cleanly separates global from local masters that share the M-prefix.
The CX score is read out of master_assets.full_metadata JSONB. Rows where the
DAM metadata has no CreativeX score AND no URL are reported but skipped.
db.store_creativex_score(..., status='b1-master-cx-score') already dedupes by
tracking_id, so re-running is safe.
Usage:
python scripts/backfill_b1_creativex_scores.py # apply
python scripts/backfill_b1_creativex_scores.py --dry-run # preview only
"""
import sys
import os
import argparse
import logging
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))
from shared.config_loader import load_config
from shared.database import Database
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger('B1CXBackfill')
def _walk_metadata_elements(elements):
"""Recursively yield every element in nested metadata_element_list arrays."""
for e in elements or []:
if not isinstance(e, dict):
continue
yield e
nested = e.get('metadata_element_list')
if isinstance(nested, list):
for sub in _walk_metadata_elements(nested):
yield sub
def extract_creativex_from_dam_metadata(asset_metadata):
"""Mirror of the extractor in b1_to_b2_download.py — duplicated here
to keep the backfill script self-contained (avoids triggering
b1_to_b2_download's module-level logging setup on import).
Walks recursively: the score field is at depth 2 (nested inside
FERRERO.TABULAR.FIELD.CREATIVEX, which lives inside a category)."""
try:
top = (asset_metadata or {}).get('metadata', {}).get('metadata_element_list', [])
cx = {'score': None, 'url': None}
for element in _walk_metadata_elements(top):
element_id = element.get('id')
if element_id == 'FERRERO.TAB.FIELD.CREATIVEX':
values = element.get('values', [])
if values:
value_obj = values[0].get('value', {})
if isinstance(value_obj, dict):
field_value = value_obj.get('field_value', {})
if isinstance(field_value, dict):
score = field_value.get('value')
if score:
cx['score'] = str(score)
elif element_id == 'FERRERO.FIELD.CREATIVEX LINK':
value_obj = element.get('value', {})
if isinstance(value_obj, dict):
nested = value_obj.get('value', {})
if isinstance(nested, dict):
url = nested.get('value')
if url:
cx['url'] = url
return cx
except Exception as e:
logger.warning('Failed to extract CreativeX from metadata: %s', e)
return {'score': None, 'url': None}
def fetch_b1_masters(db):
    """Return active global (B1) masters: M-prefixed rows with no local_campaign_id."""
    conn = db.get_connection()
    cursor = None  # keeps the finally block safe if conn.cursor() raises
    try:
        cursor = conn.cursor()
        cursor.execute("""
            SELECT tracking_id, original_filename, file_extension,
                   full_metadata, description
            FROM master_assets
            WHERE tracking_id LIKE 'M%'
              AND local_campaign_id IS NULL
              AND status = 'active'
            ORDER BY created_at
        """)
        rows = cursor.fetchall()
        return [
            {
                'tracking_id': r[0],
                'filename': (r[1] or '') + (r[2] or ''),
                'full_metadata': r[3] if isinstance(r[3], dict) else (r[3] or {}),
                'box_file_id': Database.parse_box_info_from_description(r[4]).get('box_file_id') or '',
            }
            for r in rows
        ]
    finally:
        if cursor is not None:
            cursor.close()
        db.put_connection(conn)
def existing_cx_tracking_ids(db):
    """Return set of tracking_ids that already have a b1-master-cx-score row."""
    conn = db.get_connection()
    cursor = None  # keeps the finally block safe if conn.cursor() raises
    try:
        cursor = conn.cursor()
        cursor.execute("""
            SELECT DISTINCT tracking_id
            FROM creativex_scores
            WHERE status = 'b1-master-cx-score'
              AND tracking_id IS NOT NULL
        """)
        return {row[0] for row in cursor.fetchall()}
    finally:
        if cursor is not None:
            cursor.close()
        db.put_connection(conn)
def main():
parser = argparse.ArgumentParser(description='Backfill B1 master CreativeX scores')
parser.add_argument('--dry-run', action='store_true',
help='Report what would be inserted without touching the DB')
args = parser.parse_args()
config = load_config('config/config.yaml')
db = Database(config)
if not db.test_connection():
logger.error('Database connection failed')
sys.exit(1)
masters = fetch_b1_masters(db)
already_have = existing_cx_tracking_ids(db)
logger.info('Scanned %d B1 global masters in master_assets', len(masters))
logger.info('Existing b1-master-cx-score rows: %d', len(already_have))
inserted = 0
skipped_no_cx = 0
skipped_already = 0
for m in masters:
if m['tracking_id'] in already_have:
skipped_already += 1
continue
cx = extract_creativex_from_dam_metadata(m['full_metadata'])
if not (cx['score'] or cx['url']):
skipped_no_cx += 1
logger.debug('No CX in metadata for %s (%s)', m['tracking_id'], m['filename'])
continue
if args.dry_run:
logger.info('[DRY-RUN] Would insert: %s | %s | score=%s url=%s',
m['tracking_id'], m['filename'], cx['score'], cx['url'])
inserted += 1
continue
result = db.store_creativex_score(
filename=m['filename'],
creativex_id='',
creativex_url=cx['url'] or '',
quality_score=cx['score'] or '',
box_file_id=m['box_file_id'],
full_extraction_data={'master_metadata': True, 'source': 'b1_backfill', 'data': cx},
tracking_id=m['tracking_id'],
status='b1-master-cx-score'
)
if result.get('success'):
if result.get('already_exists'):
# Race or stale already_have set — count as already
skipped_already += 1
else:
inserted += 1
logger.info('Inserted: %s | %s | score=%s', m['tracking_id'], m['filename'], cx['score'])
else:
logger.error('Failed for %s: %s', m['tracking_id'], result.get('error'))
logger.info('=' * 60)
logger.info('Backfill summary%s:', ' (DRY-RUN)' if args.dry_run else '')
logger.info(' Scanned B1 masters: %d', len(masters))
logger.info(' Already had CX row: %d', skipped_already)
logger.info(' No CX in metadata: %d', skipped_no_cx)
logger.info(' %s: %d', 'Would insert' if args.dry_run else 'Inserted', inserted)
logger.info('=' * 60)
db.close()
if __name__ == '__main__':
main()
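The identification rule and the re-run safety described in the module docstring can be sketched in plain Python. `is_b1_global_master` and `plan_backfill` are hypothetical helpers, and the tracking IDs and the `archived` status are made up. The real script additionally skips rows whose metadata has no CreativeX score or URL, which this sketch leaves out.

```python
def is_b1_global_master(row):
    """Plain-Python reading of the SQL rule: M-prefix, no local campaign, active."""
    return (
        (row.get('tracking_id') or '').startswith('M')
        and row.get('local_campaign_id') is None
        and row.get('status') == 'active'
    )


def plan_backfill(rows, already_have):
    """Return the tracking IDs that would be inserted on this run."""
    return [
        row['tracking_id']
        for row in rows
        if is_b1_global_master(row) and row['tracking_id'] not in already_have
    ]


rows = [
    {'tracking_id': 'M0001', 'local_campaign_id': None, 'status': 'active'},
    {'tracking_id': 'M0002', 'local_campaign_id': 'C000000078', 'status': 'active'},  # local master
    {'tracking_id': 'M0003', 'local_campaign_id': None, 'status': 'archived'},
    {'tracking_id': 'D0004', 'local_campaign_id': None, 'status': 'active'},  # wrong prefix
    {'tracking_id': 'M0005', 'local_campaign_id': None, 'status': 'active'},
]

first_run = plan_backfill(rows, already_have=set())
second_run = plan_backfill(rows, already_have=set(first_run))
```

The second run plans nothing because every eligible ID is already in `already_have`, which mirrors why re-running the backfill is described as safe.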

@ -0,0 +1,117 @@
#!/usr/bin/env python3
"""
Campaign Status Check - Read-only lookup of a campaign's current status on the DAM
Searches all A#/B# statuses for a campaign by number or partial name and prints
the current status. Makes no changes.
Compatible with Python 3.6+
"""
import sys
import os
import logging
import argparse
# Add shared library to path
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))
from shared.config_loader import load_config
from shared.dam_client import DAMClient
from scripts.update_campaign_status import find_campaign_by_identifier
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger('CheckStatus')
def main():
parser = argparse.ArgumentParser(
description='Check the current status of a campaign on the DAM (read-only)',
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
# Check campaign C000000078 (dev environment, OAuth)
python scripts/check_campaign_status.py --camp C000000078
# Check by partial name
python scripts/check_campaign_status.py --camp "CONTENT SCALING"
# Production environment with mTLS V2
python scripts/check_campaign_status.py --camp C000000078 --auth-pfx-v2 --env prod
"""
)
parser.add_argument('--camp', type=str, required=True,
help='Campaign number (e.g., C000000078) or partial campaign name')
parser.add_argument('--auth-pfx', action='store_true',
help='Use mTLS certificate authentication (Legacy APIM)')
parser.add_argument('--auth-pfx-v2', action='store_true',
help='Use mTLS V2 (Hybrid) authentication')
parser.add_argument('--env', type=str, choices=['dev', 'prod'], default='dev',
help='Environment: dev (default) or prod')
args = parser.parse_args()
auth_mode = 'oauth'
if args.auth_pfx_v2:
auth_mode = 'mtls_v2'
elif args.auth_pfx:
auth_mode = 'mtls'
os.environ['ENV'] = args.env
print("")
print("=" * 70)
print("Ferrero Campaign Status Check")
print("=" * 70)
print("Campaign Identifier: {}".format(args.camp))
print("Environment: {}".format(args.env.upper()))
if auth_mode == 'mtls_v2':
print("Authentication: mTLS V2 (Hybrid)")
elif auth_mode == 'mtls':
print("Authentication: mTLS Certificate (Legacy)")
else:
print("Authentication: OAuth2 (default)")
print("=" * 70)
print("")
config = load_config('config/config.yaml')
dam = DAMClient(config, auth_mode=auth_mode)
logger.info("Testing DAM connection...")
if not dam.test_connection():
logger.error("DAM connection failed - exiting")
sys.exit(1)
logger.info("DAM connection OK")
print("")
campaigns = find_campaign_by_identifier(dam, args.camp)
if not campaigns:
print("")
print("=" * 70)
print("No campaigns found matching: {}".format(args.camp))
print("=" * 70)
print("")
print("Searched statuses: A1, A2, A3, A4, A5, A6, B1, B2")
print("Try:")
print(" - Exact campaign number: C000000078")
print(" - Partial campaign name: CONTENT SCALING")
sys.exit(1)
print("")
print("=" * 70)
print("Found {} matching campaign(s)".format(len(campaigns)))
print("=" * 70)
print("")
for i, campaign in enumerate(campaigns, 1):
print("{}. {}".format(i, campaign.get('campaign_name', 'Unknown')))
print(" Campaign Number: {}".format(campaign.get('campaign_id', 'N/A')))
print(" Current Status: {}".format(campaign['current_status']))
print(" DAM Asset ID: {}".format(campaign.get('asset_id', 'N/A')))
print("")
if __name__ == '__main__':
main()
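The flag precedence used here and in the B4 uploader (`--auth-pfx-v2` wins over `--auth-pfx`, OAuth is the fallback) can be checked in isolation. `resolve_auth_mode` is a hypothetical wrapper around the same argparse setup.

```python
import argparse


def resolve_auth_mode(argv):
    """Map the two mTLS flags to an auth_mode string; V2 takes precedence."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--auth-pfx', action='store_true')
    parser.add_argument('--auth-pfx-v2', action='store_true')
    args = parser.parse_args(argv)
    if args.auth_pfx_v2:
        return 'mtls_v2'
    if args.auth_pfx:
        return 'mtls'
    return 'oauth'


modes = {
    'none': resolve_auth_mode([]),
    'legacy': resolve_auth_mode(['--auth-pfx']),
    'v2': resolve_auth_mode(['--auth-pfx-v2']),
    'both': resolve_auth_mode(['--auth-pfx', '--auth-pfx-v2']),
}
```

Passing both flags is not an error; the V2 branch is simply checked first.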

@ -0,0 +1,162 @@
#!/usr/bin/env python3
"""
Diagnostic: Inspect what metadata B1 global masters actually carry in
master_assets.full_metadata, so we can tell why the CX backfill found 0.
Two checks:
1. Top-level keys of full_metadata (does the structure even contain
metadata.metadata_element_list?).
2. Across a larger sample, count occurrences of any element_id that
looks CX/score/quality-related (case-insensitive). This surfaces the
actual element IDs used by client B1 masters, in case they differ
from the A1 IDs the extractor expects.
Read-only. Safe to run any time.
Usage:
python scripts/diagnose_b1_master_metadata.py
python scripts/diagnose_b1_master_metadata.py --sample 200
"""
import sys
import os
import json
import argparse
import logging
from collections import Counter
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))
from shared.config_loader import load_config
from shared.database import Database
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger('B1MetaDiag')
CX_HINTS = ('creativex', 'cx', 'score', 'quality')
def walk_elements(elements, depth=0):
    """Recursively yield (depth, element) for every element in a nested
    metadata_element_list. Categories and tables both contain nested
    metadata_element_list arrays, so flat iteration misses everything below
    the top level."""
    for e in elements or []:
        if not isinstance(e, dict):
            continue
        yield depth, e
        nested = e.get('metadata_element_list')
        if isinstance(nested, list):
            for sub in walk_elements(nested, depth + 1):
                yield sub
def main():
parser = argparse.ArgumentParser()
parser.add_argument('--sample', type=int, default=100,
help='How many B1 masters to scan for element-ID counts (default 100)')
parser.add_argument('--show-full', type=int, default=2,
help='How many sample full_metadata blobs to dump in full (default 2)')
args = parser.parse_args()
config = load_config('config/config.yaml')
db = Database(config)
if not db.test_connection():
sys.exit(1)
conn = db.get_connection()
try:
cursor = conn.cursor()
cursor.execute("""
SELECT tracking_id, original_filename, full_metadata
FROM master_assets
WHERE tracking_id LIKE 'M%%'
AND local_campaign_id IS NULL
AND status = 'active'
ORDER BY created_at DESC
LIMIT %s
""", (args.sample,))
rows = cursor.fetchall()
finally:
cursor.close()
db.put_connection(conn)
logger.info('Sampled %d B1 global masters', len(rows))
# 1. Top-level structure check
top_key_counter = Counter()
has_meta_list = 0
empty_full_meta = 0
for r in rows:
full = r[2] if isinstance(r[2], dict) else (r[2] or {})
if not full:
empty_full_meta += 1
continue
for k in full.keys():
top_key_counter[k] += 1
meta = full.get('metadata')
if isinstance(meta, dict) and isinstance(meta.get('metadata_element_list'), list):
has_meta_list += 1
logger.info('=' * 60)
logger.info('Top-level keys present in full_metadata (count of rows containing the key):')
for k, c in top_key_counter.most_common():
logger.info(' %-30s %d', k, c)
logger.info('Rows with empty full_metadata: %d', empty_full_meta)
logger.info('Rows with metadata.metadata_element_list: %d', has_meta_list)
logger.info('=' * 60)
# 2. Recursive hunt for CX-flavored element IDs (nested metadata_element_list)
id_counter = Counter()
cx_id_depth = {} # eid -> depth at which it was first seen
cx_id_counter = Counter()
rows_with_cx_hint = 0
max_depth_seen = 0
for r in rows:
full = r[2] if isinstance(r[2], dict) else (r[2] or {})
top_list = (full.get('metadata') or {}).get('metadata_element_list') or []
row_had_hint = False
for depth, e in walk_elements(top_list):
if depth > max_depth_seen:
max_depth_seen = depth
eid = (e.get('id') or '').strip()
if not eid:
continue
id_counter[eid] += 1
lower = eid.lower()
if any(h in lower for h in CX_HINTS):
cx_id_counter[eid] += 1
cx_id_depth.setdefault(eid, depth)
row_had_hint = True
if row_had_hint:
rows_with_cx_hint += 1
logger.info('Distinct element_ids seen across sample (any depth): %d', len(id_counter))
logger.info('Max nesting depth observed: %d', max_depth_seen)
logger.info('Rows containing at least one CX-flavored element_id: %d / %d',
rows_with_cx_hint, len(rows))
logger.info('-' * 60)
if cx_id_counter:
logger.info('CX/score/quality-flavored element_ids found (id @ depth, count):')
for eid, c in cx_id_counter.most_common():
logger.info(' %-50s @depth %d %d', eid, cx_id_depth[eid], c)
else:
logger.info('NO CX/score/quality-flavored element_ids found at any depth.')
logger.info('Likely: client B1 masters were uploaded before CX scoring ran on them.')
logger.info('=' * 60)
# 3. Dump first few full blobs verbatim for manual inspection
if args.show_full > 0:
logger.info('First %d full_metadata blobs (truncated to 4KB each):', args.show_full)
for r in rows[:args.show_full]:
full = r[2] if isinstance(r[2], dict) else (r[2] or {})
blob = json.dumps(full, indent=2, default=str)
if len(blob) > 4096:
blob = blob[:4096] + '\n... [truncated]'
logger.info('--- %s (%s) ---\n%s', r[0], r[1], blob)
db.close()
if __name__ == '__main__':
main()
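The element-ID hunt in step 2 boils down to a depth-aware walk plus a substring match against `CX_HINTS`. A minimal sketch on a hand-built element list; `count_cx_ids` is a hypothetical helper and the `EXAMPLE.*` IDs are made up.

```python
from collections import Counter

CX_HINTS = ('creativex', 'cx', 'score', 'quality')


def walk_with_depth(elements, depth=0):
    """Yield (depth, element) for every dict at any nesting level."""
    for element in elements or []:
        if not isinstance(element, dict):
            continue
        yield depth, element
        nested = element.get('metadata_element_list')
        if isinstance(nested, list):
            yield from walk_with_depth(nested, depth + 1)


def count_cx_ids(element_list):
    """Count IDs containing any hint and record the depth where each first appears."""
    counts = Counter()
    first_depth = {}
    for depth, element in walk_with_depth(element_list):
        element_id = (element.get('id') or '').strip()
        if element_id and any(hint in element_id.lower() for hint in CX_HINTS):
            counts[element_id] += 1
            first_depth.setdefault(element_id, depth)
    return counts, first_depth


sample = [
    {'id': 'EXAMPLE.CATEGORY', 'metadata_element_list': [
        {'id': 'FERRERO.TABULAR.FIELD.CREATIVEX', 'metadata_element_list': [
            {'id': 'FERRERO.TAB.FIELD.CREATIVEX'},
        ]},
        {'id': 'EXAMPLE.FIELD.TITLE'},
    ]},
    {'id': 'FERRERO.FIELD.CREATIVEX LINK'},
]
counts, first_depth = count_cx_ids(sample)
```

The `'cx'` hint is deliberately loose: any ID containing those two letters matches, which suits a diagnostic whose job is to surface candidates for a human to inspect.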

@ -75,6 +75,12 @@ TASKS = [
'interval_minutes': 10,
'args': ['--auth-pfx-v2'] # Production uses mTLS V2
},
{
'name': 'B4 Box Uploader',
'script': 'scripts/b4_box_uploader.py',
'interval_minutes': 10,
'args': ['--auth-pfx-v2'] # Production uses mTLS V2
},
{
'name': 'Daily Report',
'script': 'scripts/daily_report.py',
@ -84,9 +90,77 @@ TASKS = [
}
]
# ==========================================
# OFF-HOURS CONFIGURATION
# ==========================================
# Off-hours definition
OFF_HOURS_CONFIG = {
'enabled': True, # Set to False to disable off-hours slowdown
'extra_minutes': 30, # Minutes to add to intervals during off-hours
# Late night: 10 PM (22:00) to 5 AM (05:00) every day
'late_night_start': 22, # Hour (0-23)
'late_night_end': 5, # Hour (0-23)
# Weekend: All day Saturday and Sunday
'weekend_days': [5, 6], # 0=Monday, 5=Saturday, 6=Sunday
# Tasks exempt from off-hours slowdown (always run at normal cadence)
'exempt_tasks': [
'Daily Report' # Task name to exclude (runs at 7 PM regardless)
]
}
LOCK_DIR = 'locks'
STATE_FILE = 'orchestrator_state.json'
# ==========================================
# OFF-HOURS DETECTION
# ==========================================
def is_off_hours(now=None):
    """
    Determine if current time is in off-hours period
    Args:
        now: datetime object (defaults to current time)
    Returns:
        bool: True if in off-hours, False otherwise
    """
    if not OFF_HOURS_CONFIG['enabled']:
        return False
    if now is None:
        now = datetime.now()
    current_hour = now.hour
    current_weekday = now.weekday()  # 0=Monday, 6=Sunday
    # Check if weekend (all day Saturday or Sunday)
    if current_weekday in OFF_HOURS_CONFIG['weekend_days']:
        logger.debug("Off-hours: Weekend (day {})".format(current_weekday))
        return True
    # Check if late night
    late_night_start = OFF_HOURS_CONFIG['late_night_start']
    late_night_end = OFF_HOURS_CONFIG['late_night_end']
    if late_night_start > late_night_end:
        # Wraps around midnight (e.g., 22:00 to 5:00)
        is_late_night = current_hour >= late_night_start or current_hour < late_night_end
    else:
        # Same day range (e.g., 1:00 to 5:00)
        is_late_night = late_night_start <= current_hour < late_night_end
    if is_late_night:
        logger.debug("Off-hours: Late night (hour {})".format(current_hour))
        return True
    logger.debug("Business hours (hour {}, weekday {})".format(current_hour, current_weekday))
    return False
# ==========================================
# CORE CLASSES
# ==========================================
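Review note: the late-night check in `is_off_hours` has to handle a window that wraps past midnight (22:00 to 05:00). The sketch below isolates that comparison so it can be exercised on its own. The helper name `in_hour_window` and the keyword defaults are illustrative assumptions, not names from this change.

```python
from datetime import datetime


def in_hour_window(hour, start, end):
    """Return True if hour is in [start, end); the window may wrap past midnight."""
    if start > end:
        # Wrapping window, e.g. 22 -> 5: late evening OR early morning
        return hour >= start or hour < end
    # Same-day window, e.g. 1 -> 5
    return start <= hour < end


def is_off_hours(now, weekend_days=(5, 6), start=22, end=5):
    """Simplified stand-in for the diff's is_off_hours (no config dict, no logging)."""
    if now.weekday() in weekend_days:  # 0=Monday ... 6=Sunday
        return True
    return in_hour_window(now.hour, start, end)
```

Note that the end hour is exclusive, so 05:00 itself already counts as business hours on a weekday.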
@ -177,22 +251,55 @@ class TaskRunner:
now = datetime.now()
current_hour = now.hour
current_minute = now.minute
logger.info(f"Orchestrator tick: {now.strftime('%Y-%m-%d %H:%M:%S')}")
# Determine if we're in off-hours
in_off_hours = is_off_hours(now)
if in_off_hours:
logger.info("=" * 80)
logger.info("Orchestrator tick: {} [OFF-HOURS MODE]".format(now.strftime('%Y-%m-%d %H:%M:%S')))
logger.info("Adding {} minutes to all task intervals".format(OFF_HOURS_CONFIG['extra_minutes']))
logger.info("=" * 80)
else:
logger.info("Orchestrator tick: {} [NORMAL MODE]".format(now.strftime('%Y-%m-%d %H:%M:%S')))
for task in TASKS:
# Check for specific hour schedule
task_name = task['name']
# Check for specific hour schedule (e.g., Daily Report at 7 PM)
if 'run_at_hour' in task:
target_hour = task['run_at_hour']
# Run only at the top of the hour (minute 0)
if current_hour == target_hour and current_minute == 0:
logger.info("Scheduled task '{}' due at {}:00".format(task_name, target_hour))
self.run_task(task)
continue
# Standard interval check
interval = task.get('interval_minutes', 5)
if interval > 0 and current_minute % interval == 0:
self.run_task(task)
# Standard interval check with off-hours adjustment
base_interval = task.get('interval_minutes', 5)
# Check if task is exempt from off-hours slowdown
is_exempt = task_name in OFF_HOURS_CONFIG['exempt_tasks']
# In off-hours, skip non-exempt tasks unless they match the extended interval
if in_off_hours and not is_exempt:
# Task should run if:
# 1. Current minute matches base interval (normal check)
# 2. AND we're at a 30-minute boundary (0 or 30)
if base_interval > 0:
matches_interval = current_minute % base_interval == 0
at_boundary = current_minute % 30 == 0
if matches_interval and at_boundary:
logger.info("Task '{}' due (off-hours: {}min + 30min cadence)".format(
task_name, base_interval
))
self.run_task(task)
else:
# Normal business hours OR exempt task
if base_interval > 0 and current_minute % base_interval == 0:
logger.info("Task '{}' due ({}min interval)".format(task_name, base_interval))
self.run_task(task)
def main():
parser = argparse.ArgumentParser(description='Ferrero Orchestrator')
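Review note: as far as this hunk shows, the off-hours branch does not add `extra_minutes` to the interval. A non-exempt task fires only when the current minute satisfies both `minute % base_interval == 0` and `minute % 30 == 0`, and `extra_minutes` appears only in the log message. The sketch below enumerates the minutes at which a task would fire under that rule. The function name `off_hours_run_minutes` is a hypothetical helper, not part of the change.

```python
def off_hours_run_minutes(base_interval, boundary=30):
    """Minutes of the hour at which a non-exempt task fires during off-hours.

    Mirrors the two modulo checks in the hunk above; boundary=30 matches the
    literal used there.
    """
    if base_interval <= 0:
        return []
    return [
        minute
        for minute in range(60)
        if minute % base_interval == 0 and minute % boundary == 0
    ]


if __name__ == "__main__":
    for interval in (5, 10, 15, 20):
        print(interval, off_hours_run_minutes(interval))
```

Under this rule a 10-minute task runs twice an hour, while a 20-minute task runs only at minute 0, because 20 and 30 first coincide at 60.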

@ -75,6 +75,12 @@ TASKS = [
'interval_minutes': 10,
'args': [] # Temporarily using OAuth instead of --auth-pfx-v2
},
{
'name': 'B4 Box Uploader',
'script': 'scripts/b4_box_uploader.py',
'interval_minutes': 10,
'args': [] # Temporarily using OAuth instead of --auth-pfx-v2
},
{
'name': 'Daily Report',
'script': 'scripts/daily_report.py',

@ -583,6 +583,9 @@ class DAMClient:
# If extension has spaces in it, it's not a real extension
elif ' ' in ext:
is_folder = True
# Numeric-only extension = version number (e.g. "WND_PCS 2026 2.0"), not a file
elif ext[1:].isdigit():
is_folder = True
else:
# Has an extension-like string, but not in our known list
# Could be an uncommon file type - assume it's a file to be safe
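Review note: the new `ext[1:].isdigit()` branch treats a purely numeric "extension" as part of a versioned folder name. The sketch below shows the effect, assuming `ext` comes from something equivalent to `os.path.splitext` (the line that computes it is not in this hunk). The function name is illustrative.

```python
import os


def looks_like_version_suffix(name):
    """Return True when the text after the last dot is all digits.

    Assumption: ext is derived like os.path.splitext, so it includes the
    leading dot and ext[1:] is the part after it.
    """
    _, ext = os.path.splitext(name)
    return bool(ext) and ext[1:].isdigit()
```

One side effect worth keeping in mind: names such as `archive.001` also match this test, so a split-archive part would be classified as a folder unless its extension is already in the known-extension list checked earlier.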
@ -1301,11 +1304,11 @@ class DAMClient:
def register_master_asset_ids_for_ppr(self, master_asset_ids):
"""
Register all master asset IDs in the lookup domain (PPR only).
Register all master asset IDs in the lookup domain.
Call this before creating an asset that references these IDs.
The OpenText DAM API does not support creating new domain values during
asset creation. In PPR, we must first add each master asset ID to the
asset creation. We must first add each master asset ID to the
FERRERO_MASTER_ASSET_ID domain value table before the create asset call.
Args:
@ -1314,16 +1317,11 @@ class DAMClient:
Returns:
dict with success, registered_ids, failed_ids
"""
# Only for PPR environment
if 'ppr' not in self.base_url.lower():
logger.debug("Not PPR environment - skipping master asset ID domain registration")
return {'success': True, 'skipped': True}
if not master_asset_ids:
return {'success': True, 'registered_ids': [], 'failed_ids': []}
logger.info("=" * 60)
logger.info("PPR: Registering {} master asset ID(s) in lookup domain".format(len(master_asset_ids)))
logger.info("Registering {} master asset ID(s) in lookup domain".format(len(master_asset_ids)))
logger.info(" IDs: {}".format(', '.join(master_asset_ids)))
logger.info("=" * 60)
@ -1337,11 +1335,11 @@ class DAMClient:
else:
failed.append({'id': master_id, 'error': result.get('error')})
logger.info("PPR: Domain registration complete - {}/{} succeeded".format(
logger.info("Domain registration complete - {}/{} succeeded".format(
len(registered), len(master_asset_ids)))
if failed:
logger.warning("PPR: Failed to register: {}".format(
logger.warning("Failed to register: {}".format(
', '.join([f['id'] for f in failed])))
# Return success even if some failed (better to try the upload and see)
@ -1385,14 +1383,10 @@ class DAMClient:
current_folder_id = existing
logger.info("Found existing folder: {} (ID: {})".format(folder_name, current_folder_id))
else:
# Create it
new_id = self._create_folder(current_folder_id, folder_name)
if new_id:
current_folder_id = new_id
logger.info("Created folder: {} (ID: {})".format(folder_name, current_folder_id))
else:
logger.error("Failed to create folder: {}".format(folder_name))
return base_folder_id # Return base folder if creation fails
# Folder doesn't exist - DAM doesn't allow folder creation via API
# Upload to parent folder instead
logger.warning("Folder '{}' not found in DAM. DAM does not allow folder creation. Files will be uploaded to parent folder.".format(folder_name))
return current_folder_id # Return current parent folder instead of trying to create
return current_folder_id
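Review note: with folder creation removed, path resolution now stops at the deepest folder that already exists and returns it as the upload target. A minimal sketch of that behaviour follows, using a nested dict as a hypothetical stand-in for the DAM folder lookup (`tree`, `resolve_folder` and the IDs are all illustrative).

```python
def resolve_folder(tree, base_id, path_parts):
    """Walk path_parts under base_id and return the deepest existing folder ID.

    tree maps a folder ID to {child_name: child_id}. On the first missing
    child the walk stops and the current parent is returned, mirroring the
    "upload to parent folder instead" fallback.
    """
    current = base_id
    for name in path_parts:
        child = tree.get(current, {}).get(name)
        if child is None:
            return current  # missing folder: fall back to the parent
        current = child
    return current


if __name__ == "__main__":
    tree = {"root": {"2026": "f1"}, "f1": {"Campaign": "f2"}}
    print(resolve_folder(tree, "root", ["2026", "Missing", "Deeper"]))
```

Because the walk returns early, any path segments after the first missing folder are ignored, so files meant for different missing subfolders end up together in the same parent.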

@ -148,7 +148,45 @@ class Database:
cursor.close()
self.put_connection(conn)
def store_master_asset(self, tracking_id, opentext_id, asset_data, box_file_id, box_url, upload_folder_id, global_master_campaign_id=None, global_master_folder_id=None, local_campaign_id=None):
def find_global_master_by_opentext_id(self, opentext_id):
"""
Look up a B1B2 global master asset by opentext_id.
Returns the M-prefixed tracking ID if a matching global master exists.
Args:
opentext_id: DAM asset ID to search for
Returns:
str: M-prefixed tracking ID if found, None otherwise
"""
conn = self.get_connection()
try:
cursor = conn.cursor()
cursor.execute("""
SELECT tracking_id FROM master_assets
WHERE opentext_id = %s
AND tracking_id LIKE 'M%%'
AND status = 'active'
LIMIT 1
""", (opentext_id,))
row = cursor.fetchone()
if row:
logger.info("Found global master tracking ID {} for opentext_id {}".format(
row[0], opentext_id
))
return row[0]
else:
logger.debug("No global master found for opentext_id {}".format(opentext_id))
return None
finally:
cursor.close()
self.put_connection(conn)
def store_master_asset(self, tracking_id, opentext_id, asset_data, box_file_id, box_url, upload_folder_id, global_master_campaign_id=None, global_master_folder_id=None, local_campaign_id=None, global_master_tracking_id=None):
"""
Store master asset with FULL metadata in JSONB column
@ -162,6 +200,7 @@ class Database:
global_master_campaign_id: Global master campaign ID (from GLOBAL CAMPAIGN REFERENCE)
global_master_folder_id: Global master folder ID
local_campaign_id: Local campaign ID (immediate campaign this asset belongs to)
global_master_tracking_id: M-prefixed tracking ID from B1B2 global master (if found)
Returns:
dict with success boolean
@ -190,9 +229,10 @@ class Database:
tracking_id, opentext_id, original_filename, file_extension,
file_size_bytes, mime_type, upload_directory,
description, full_metadata, status,
global_master_campaign_id, global_master_folder_id, local_campaign_id
global_master_campaign_id, global_master_folder_id, local_campaign_id,
global_master_tracking_id
) VALUES (
%s, %s, %s, %s, %s, %s, %s, %s, %s, 'active', %s, %s, %s
%s, %s, %s, %s, %s, %s, %s, %s, %s, 'active', %s, %s, %s, %s
)
ON CONFLICT (tracking_id) DO UPDATE SET
upload_directory = EXCLUDED.upload_directory,
@ -201,6 +241,7 @@ class Database:
global_master_campaign_id = EXCLUDED.global_master_campaign_id,
global_master_folder_id = EXCLUDED.global_master_folder_id,
local_campaign_id = EXCLUDED.local_campaign_id,
global_master_tracking_id = EXCLUDED.global_master_tracking_id,
updated_at = CURRENT_TIMESTAMP
""", (
tracking_id,
@ -214,7 +255,8 @@ class Database:
full_metadata_json,
global_master_campaign_id,
global_master_folder_id,
local_campaign_id
local_campaign_id,
global_master_tracking_id
))
conn.commit()
@ -588,7 +630,7 @@ class Database:
cursor.close()
self.put_connection(conn)
def increment_a1_retry(self, campaign_id, campaign_number, campaign_name, reason):
def increment_a1_retry(self, campaign_id, campaign_number, campaign_name, reason, mark_failed_at_max=True):
"""
Increment A1 retry counter and mark as permanently failed if max attempts reached
@ -597,6 +639,9 @@ class Database:
campaign_number: Campaign number (e.g., C000000078)
campaign_name: Campaign name
reason: Description of failure (e.g., "No master assets found")
mark_failed_at_max: If True (default), set a1_permanently_failed=True at MAX_RETRIES.
Set False for empty-folder polling where the campaign is expected
to eventually receive assets and should keep retrying silently.
Returns:
dict with success, retry_count, permanently_failed
@ -617,7 +662,7 @@ class Database:
row = cursor.fetchone()
current_count = (row[0] or 0) if row else 0
new_count = current_count + 1
is_permanently_failed = new_count >= MAX_RETRIES
is_permanently_failed = mark_failed_at_max and new_count >= MAX_RETRIES
# Insert or update campaign status with retry tracking
cursor.execute("""
@ -769,6 +814,41 @@ class Database:
import json
full_json = json.dumps(full_extraction_data) if isinstance(full_extraction_data, dict) else full_extraction_data
# B1→B2 global masters: dedup by tracking_id so re-runs and previously-downloaded
# assets don't create duplicate rows.
if status == 'b1-master-cx-score':
cursor.execute("""
SELECT id FROM creativex_scores
WHERE tracking_id = %s AND status = 'b1-master-cx-score'
LIMIT 1
""", (tracking_id,))
if cursor.fetchone():
logger.debug("B1 master CreativeX score already recorded for tracking {}, skipping insert".format(tracking_id))
return {'success': True, 'is_update': False, 'already_exists': True}
cursor.execute("""
INSERT INTO creativex_scores (
filename, creativex_id, creativex_url, quality_score,
box_file_id, full_extraction_data, tracking_id, status
) VALUES (%s, %s, %s, %s, %s, %s, %s, %s)
""", (
filename,
creativex_id,
creativex_url,
quality_score,
box_file_id,
full_json,
tracking_id,
'b1-master-cx-score'
))
conn.commit()
logger.info("Stored B1 master CreativeX score: {} (Tracking: {}, Score: {})".format(
filename, tracking_id, quality_score
))
return {'success': True, 'is_update': False, 'version_number': 1}
# Handle master-cx-score differently (no versioning, just reference storage)
if status == 'master-cx-score':
# Simple insert for master score reference (no versioning)
@ -800,33 +880,52 @@ class Database:
}
# For 'active' status - use soft delete versioning
# Step 1: Check if filename already exists with status='active'
# Also count total versions for this filename
cursor.execute("""
SELECT id, quality_score FROM creativex_scores
WHERE filename = %s AND status = 'active'
""", (filename,))
# Strip timestamp suffix (e.g. _2026-03-13-05-53-36) from filename
# so re-scored assets supersede previous versions regardless of timestamp
import re
dot_idx = filename.rfind('.')
name_part = filename[:dot_idx] if dot_idx >= 0 else filename
ext = filename[dot_idx:] if dot_idx >= 0 else ''
base_filename = re.sub(r'_\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}$', '', name_part) + ext
existing = cursor.fetchone()
# Step 1: Check if this base asset already exists with status='active'
# Use LIKE pattern to match any timestamp variant of the same base filename
if base_filename != filename:
# Filename has a timestamp - match base pattern with any/no timestamp
like_pattern = base_filename.replace(ext, '') + '%' + ext
cursor.execute("""
SELECT id, quality_score, filename FROM creativex_scores
WHERE filename LIKE %s AND status = 'active'
""", (like_pattern,))
else:
# No timestamp in filename - still match variants that do have one
like_pattern = name_part + '%' + ext
cursor.execute("""
SELECT id, quality_score, filename FROM creativex_scores
WHERE filename LIKE %s AND status = 'active'
""", (like_pattern,))
# Count total versions (including superseded)
existing = cursor.fetchall()
# Count total versions (including superseded) for the base asset
cursor.execute("""
SELECT COUNT(*) FROM creativex_scores
WHERE filename = %s
""", (filename,))
WHERE filename LIKE %s
""", (like_pattern,))
total_versions = cursor.fetchone()[0]
if existing:
# Step 2: Mark existing record(s) as 'superseded'
# Step 2: Mark all existing active records as 'superseded'
cursor.execute("""
UPDATE creativex_scores
SET status = 'superseded'
WHERE filename = %s AND status = 'active'
""", (filename,))
WHERE filename LIKE %s AND status = 'active'
""", (like_pattern,))
logger.info("Superseded previous CreativeX score for: {} (old score: {})".format(
filename, existing[1]
superseded_filenames = [row[2] for row in existing]
logger.info("Superseded {} previous CreativeX score(s) for base asset: {} (old filenames: {})".format(
len(existing), base_filename, superseded_filenames
))
# Step 3: Insert new 'active' record
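Review note: the versioning change matches on a base filename with the timestamp suffix stripped, then uses a `LIKE` pattern of the form `stem%ext`. The sketch below reproduces the stripping and pattern building, and adds a small `LIKE`-to-regex translator purely to illustrate what such a pattern matches. `split_base`, `like_pattern` and `like_to_regex` are hypothetical helpers, and the filenames are made up.

```python
import re

TIMESTAMP_SUFFIX = re.compile(r'_\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}$')


def split_base(filename):
    """Return (stem without timestamp suffix, extension including the dot)."""
    dot = filename.rfind('.')
    stem = filename[:dot] if dot >= 0 else filename
    ext = filename[dot:] if dot >= 0 else ''
    return TIMESTAMP_SUFFIX.sub('', stem), ext


def like_pattern(filename):
    """Build the stem%ext pattern used to find earlier versions."""
    stem, ext = split_base(filename)
    return stem + '%' + ext


def like_to_regex(pattern):
    """Translate SQL LIKE wildcards to a regex (illustration only, no ESCAPE support)."""
    parts = []
    for ch in pattern:
        if ch == '%':
            parts.append('.*')   # any run of characters
        elif ch == '_':
            parts.append('.')    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile('^' + ''.join(parts) + '$', re.DOTALL)
```

Since `%` matches any run of characters and `_` matches any single character in SQL `LIKE`, the pattern for `IT_NUT_15s.mp4` would also match an unrelated file such as `IT_NUT_15s_v2.mp4`. Whether that can occur depends on the naming convention in use.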
@ -852,8 +951,9 @@ class Database:
version_number = total_versions + 1
if existing:
logger.info("Updated CreativeX score: {} (Score: {} -> {}, Version: {})".format(
filename, existing[1], quality_score, version_number
old_scores = [row[1] for row in existing]
logger.info("Updated CreativeX score: {} (Old scores: {} -> {}, Version: {})".format(
filename, old_scores, quality_score, version_number
))
else:
logger.info("Stored new CreativeX score: {} (Score: {}, Version: {})".format(
@ -974,33 +1074,114 @@ class Database:
def get_all_live_campaigns(self):
"""
Get all live campaigns for CSV report
Returns:
list of dicts with campaign_number, campaign_name
Get all live campaigns (A-series local + B-series global) for the
single combined CSV that OMG ingests as a full replacement list.
"""
conn = self.get_connection()
try:
cursor = conn.cursor()
cursor.execute("""
SELECT campaign_number, campaign_name
FROM campaign_status
SELECT campaign_number, campaign_name
FROM campaign_status
WHERE live_campaign = 'YES'
AND (status LIKE 'A%' OR status LIKE 'B%')
ORDER BY campaign_number DESC
""")
rows = cursor.fetchall()
campaigns = []
for row in rows:
campaigns.append({
'campaign_number': row[0],
'campaign_name': row[1]
})
return campaigns
finally:
cursor.close()
self.put_connection(conn)
def get_override_metadata(self, filename_without_ext):
"""
Look up pre-upload metadata override saved by the naming tool.
Returns the latest unapplied override row for this filename, or None.
If the override_metadata table doesn't exist (e.g., on a dev DB where the
naming tool migration hasn't been run), returns None — upload behaviour
falls back to today's defaults.
"""
conn = self.get_connection()
try:
cursor = conn.cursor()
cursor.execute("""
SELECT id, tracking_id, override_fields
FROM override_metadata
WHERE filename = %s
AND applied_to_upload = FALSE
ORDER BY created_at DESC
LIMIT 1
""", (filename_without_ext,))
row = cursor.fetchone()
if not row:
return None
override_fields = row[2] if isinstance(row[2], dict) else json.loads(row[2])
return {
'id': row[0],
'tracking_id': row[1],
'override_fields': override_fields,
}
except psycopg2.errors.UndefinedTable:
conn.rollback()
logger.warning("override_metadata table does not exist - skipping override lookup")
return None
except Exception as e:
conn.rollback()
logger.error("Failed to query override_metadata for '{}': {}".format(
filename_without_ext, str(e)
))
return None
finally:
cursor.close()
self.put_connection(conn)
def mark_override_applied(self, filename_without_ext):
"""
Mark a pre-upload override row as applied after a successful DAM upload.
Only updates rows that are currently applied_to_upload = FALSE.
"""
conn = self.get_connection()
try:
cursor = conn.cursor()
cursor.execute("""
UPDATE override_metadata
SET applied_to_upload = TRUE,
applied_at = CURRENT_TIMESTAMP
WHERE filename = %s
AND applied_to_upload = FALSE
""", (filename_without_ext,))
updated = cursor.rowcount
conn.commit()
if updated:
logger.info("Marked {} override row(s) as applied for '{}'".format(
updated, filename_without_ext
))
return updated
except psycopg2.errors.UndefinedTable:
conn.rollback()
return 0
except Exception as e:
conn.rollback()
logger.error("Failed to mark override applied for '{}': {}".format(
filename_without_ext, str(e)
))
return 0
finally:
cursor.close()
self.put_connection(conn)
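Review note: the two new methods are meant to be used as a pair, with the override looked up before the asset representation is built and marked as applied only after the upload is confirmed. The sketch below shows that ordering against an in-memory stand-in. `FakeDB`, `upload_with_override` and the `upload` callable are hypothetical and only imitate the method names from this change.

```python
class FakeDB:
    """In-memory stand-in for the two override methods (illustration only)."""

    def __init__(self, rows):
        # rows: filename -> {'override_fields': dict, 'applied': bool}
        self.rows = rows

    def get_override_metadata(self, name):
        row = self.rows.get(name)
        if not row or row['applied']:
            return None
        return {'override_fields': row['override_fields']}

    def mark_override_applied(self, name):
        row = self.rows.get(name)
        if row and not row['applied']:
            row['applied'] = True
            return 1
        return 0


def upload_with_override(db, upload, filename_without_ext):
    """Look up the override, upload, then mark applied only on success."""
    override = db.get_override_metadata(filename_without_ext)
    fields = override['override_fields'] if override else None
    result = upload(filename_without_ext, fields)
    if override and result.get('success'):
        db.mark_override_applied(filename_without_ext)
    return result
```

Marking after success means a failed upload leaves the row unapplied, so the next polling run picks up the same override again.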

@ -34,7 +34,7 @@ class FilenameParser:
# YouTube
'YTA', 'YTB', 'YTS',
# Other platforms
'AMZ', 'DV3', 'GOO', 'PIN', 'SNA', 'TIK', 'TWI', 'VOD',
'AMZ', 'DV3', 'GOO', 'PIN', 'SNA', 'SPT', 'TIK', 'TWI', 'VOD',
]
def __init__(self, dam_base_url=None):

@ -13,6 +13,36 @@ from shared.config_loader import load_country_code_mappings
logger = logging.getLogger('MetadataExtractorMVP')
# Editor field name -> DAM metadata field ID.
# Mirrors the canonical mapping in the naming tool's public-v2/Database.php
# so that pre-upload overrides saved via the metadata editor are applied to
# the matching DAM fields on upload.
OVERRIDE_FIELD_MAP = {
'validity_start': 'FERRERO.FIELD.ASSET VALIDITY START PERIOD',
'validity_end': 'FERRERO.FIELD.ASSET VALIDITY END PERIOD',
'marketing_tag': 'MARKETING_TAG',
'agency_name': 'FERRERO.MARKETING.FIELD.AGENCY NAME',
'spot_version': 'FERRERO.MARKETING.FIELD.SPOT_VERSION',
'director_name': 'FERRERO.MARKETING.FIELD.DIRECTOR_NAME',
'video_post_prod_company': 'FERRERO.MARKETING.FIELD.VIDEO_POST_PROD_COMPANY',
'video_post_prod_contact': 'FERRERO.MARKETING.FIELD.VID_POST_PROD_CONTACT',
'audio_post_prod_company': 'FERRERO.MARKETING.FIELD.AUDIO_POST_PROD_COMPANY',
'audio_post_prod_contact': 'FERRERO.MARKETING.FIELD.AUDIO_POST_PROD_CONTACT',
'video_type': 'FERRERO.MARKET.FIELD.TYPE_VID',
'ip_rights': 'FERRERO.MARKET.FIELD.IPRIGHT',
'production_company': 'FERRERO.MARKET.PROD_COMPANY',
'licensing': 'FERRERO.MARKET.FIELD.LICENSIN',
'buyout': 'FERRERO.MARKET.FIELD.BUYOUT',
'ferrero_property': 'FERRERO.MARKET.FIELD.FERRERO PROPERTY',
'video_status': 'FERRERO.MARKET.VID_N_STAT',
'license': 'FERRERO.MARKET.FIELD.LICENSE',
'creativex_score': 'FERRERO.TAB.FIELD.CREATIVEX',
'creativex_link': 'FERRERO.FIELD.CREATIVEX LINK',
}
DATE_OVERRIDE_FIELDS = {'validity_start', 'validity_end'}
class MetadataExtractorMVP:
def __init__(self, field_mappings):
"""
@ -113,7 +143,7 @@ class MetadataExtractorMVP:
return extracted_fields
def build_mvp_asset_representation(self, master_metadata, clean_filename, parsed_filename, box_metadata=None, tracking_mode='full', master_opentext_id=None, master_opentext_ids=None):
def build_mvp_asset_representation(self, master_metadata, clean_filename, parsed_filename, box_metadata=None, tracking_mode='full', master_opentext_id=None, master_opentext_ids=None, override_fields=None):
"""
Build asset representation with MVP fields + updates from filename
@ -124,6 +154,10 @@ class MetadataExtractorMVP:
box_metadata: Optional Box metadata
tracking_mode: 'full' (inherit all metadata) or 'folder_only' (only use folder)
master_opentext_id: Optional DAM Asset ID of master asset (for derivative tracking)
override_fields: Optional dict of pre-upload metadata overrides keyed by
editor field name (e.g. {'validity_end': '...', 'ip_rights': 'Yes'}).
Applied after master/filename/forced values but before asset-type
overrides so EOL/LTD compliance still wins. Empty values are skipped.
Returns:
Asset representation dict ready for upload
@ -156,13 +190,21 @@ class MetadataExtractorMVP:
# Add empty required fields that DAM expects (even if empty) - folder-only mode needs these
mvp_fields = self._add_empty_required_fields(mvp_fields)
# Apply asset type overrides (e.g., EOL) - takes final precedence over forced values/defaults
mvp_fields = self._apply_asset_type_overrides(mvp_fields, parsed_filename)
# Update CreativeX fields from Box metadata if provided
if box_metadata:
mvp_fields = self._update_creativex_fields(mvp_fields, box_metadata)
# Apply pre-upload metadata overrides from the naming tool's editor.
# Runs after master/filename/forced/default/CreativeX values so it wins
# over them, but before asset_type_overrides so EOL/LTD compliance rules
# still take final precedence.
if override_fields:
mvp_fields = self._apply_override_fields(mvp_fields, override_fields)
# Apply asset type overrides (e.g., EOL, LTD) - takes final precedence over
# forced values, defaults, and CreativeX (LTD removes CreativeX entirely).
mvp_fields = self._apply_asset_type_overrides(mvp_fields, parsed_filename)
# Add MASTERASSETIDS field with all master IDs
# Priority: Use master_opentext_ids if provided (multiple IDs), otherwise fall back to single master_opentext_id
if master_opentext_ids and len(master_opentext_ids) > 0:
@ -403,7 +445,15 @@ class MetadataExtractorMVP:
break
if not field_found:
logger.warning("Asset type override field '{}' not found in MVP fields - skipping".format(field_id))
# Field not present yet (e.g. description has no subject_title from filename).
# Append as a simple string field so the override still takes effect. Tabular
# / domained overrides aren't supported here — they should already be in
# mvp_fields via _add_missing_fields.
mvp_fields.append({
'id': field_id,
'value': {'value': {'type': 'string', 'value': override_value}}
})
logger.info("Asset type override: {} = {} (added missing field)".format(field_id, override_value))
return mvp_fields
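Review note: the ordering in `build_mvp_asset_representation` amounts to a layered merge in which later layers win: inherited values first, then the user's pre-upload overrides, then asset-type overrides. The sketch below models that with flat dicts, which is a simplification of the real list-of-field-dicts structure. The function name, field keys and values are illustrative assumptions.

```python
def resolve_fields(master, forced, creativex, user_override, asset_type_override):
    """Merge metadata layers so that later layers take precedence.

    Simplified model:
    - user overrides skip empty values and fields that are not already present
      (the real code logs a warning and skips those);
    - asset-type overrides always apply and may add a missing field.
    """
    fields = {}
    for layer in (master, forced, creativex):
        fields.update(layer)
    for key, value in user_override.items():
        if value in (None, ''):
            continue  # empty editor value: keep the inherited value
        if key not in fields:
            continue  # field not present: skipped
        fields[key] = value
    fields.update(asset_type_override)  # EOL/LTD rules win last
    return fields
```

With this ordering a user-set validity end date replaces the inherited one, but an asset-type rule for the same field still overrides the user's edit.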
@ -866,6 +916,23 @@ class MetadataExtractorMVP:
if 'FERRERO.FIELD.STATE' in fields_by_id:
set_domained_value(fields_by_id['FERRERO.FIELD.STATE'], 'Local')
# MAIN_LANGUAGES (tabular field — populate values array from language_code)
if parsed_filename.get('language_code') and 'MAIN_LANGUAGES' in fields_by_id:
language = parsed_filename['language_code'].upper()
fields_by_id['MAIN_LANGUAGES']['values'] = [
{
'cascading_domain_value': False,
'domain_value': True,
'is_locked': False,
'value': {
'expired_value': False,
'field_value': {'type': 'string', 'value': language},
'type': 'com.artesia.metadata.DomainValue'
}
}
]
logger.info("Set MAIN_LANGUAGES (folder-only mode): {}".format(language))
# VALIDITY DATES (Start = Today, End = Today + 1 Year)
try:
today = datetime.now()
@ -895,6 +962,72 @@ class MetadataExtractorMVP:
return field['value']['value']['field_value'].get('value')
return None
def _apply_override_fields(self, mvp_fields, override_fields):
"""
Apply pre-upload metadata overrides from the naming tool.
For each non-empty entry in override_fields, map the editor field name
to its DAM field ID via OVERRIDE_FIELD_MAP and write the value into the
matching field in mvp_fields. Empty strings are skipped (treat as
"user didn't set this, leave inherited value alone"). Validity dates
from the editor arrive as ISO 8601 strings and are normalised to the
MM/DD/YYYY format DAM expects.
"""
if not override_fields:
return mvp_fields
applied = 0
for editor_field, raw_value in override_fields.items():
if raw_value is None or raw_value == '':
continue
dam_field_id = OVERRIDE_FIELD_MAP.get(editor_field)
if not dam_field_id:
logger.debug("Override: no DAM mapping for editor field '{}' - skipping".format(editor_field))
continue
value = raw_value
if editor_field in DATE_OVERRIDE_FIELDS:
value = self._normalize_iso_date(raw_value)
if not value:
continue
target = None
for field in mvp_fields:
if field.get('id') == dam_field_id:
target = field
break
if target is None:
logger.warning("Override: field {} (DAM id {}) not present in mvp_fields - skipping".format(
editor_field, dam_field_id
))
continue
if editor_field in DATE_OVERRIDE_FIELDS:
self._set_date_field_value(target, value)
else:
self._set_field_value(target, value)
logger.info("Override applied: {} ({}) = {}".format(editor_field, dam_field_id, value))
applied += 1
if applied:
logger.info("Applied {} pre-upload override field(s) from naming tool".format(applied))
return mvp_fields
def _normalize_iso_date(self, iso_str):
"""Convert an ISO 8601 date string (with or without time/timezone) to MM/DD/YYYY."""
if not iso_str:
return None
try:
date_part = iso_str.split('T')[0]
dt = datetime.strptime(date_part, '%Y-%m-%d')
return dt.strftime('%m/%d/%Y')
except Exception as e:
logger.warning("Could not normalize override date '{}': {}".format(iso_str, str(e)))
return None
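Review note: `_normalize_iso_date` keeps only the date part before the `T` and reformats it. A standalone sketch of the same conversion is shown below. It is written as a free function and catches `ValueError` rather than `Exception`, both of which are choices made for this example and not taken from the change.

```python
from datetime import datetime


def normalize_iso_date(iso_str):
    """Convert 'YYYY-MM-DD' or 'YYYY-MM-DDTHH:MM:SS...' to 'MM/DD/YYYY'.

    Returns None for empty or unparseable input.
    """
    if not iso_str:
        return None
    date_part = iso_str.split('T')[0]
    try:
        return datetime.strptime(date_part, '%Y-%m-%d').strftime('%m/%d/%Y')
    except ValueError:
        return None
```

Two consequences of splitting on `T`: any time zone offset is discarded, so the date is taken as written with no conversion, and a space-separated timestamp such as `2026-06-19 10:00:00` does not parse and yields `None`.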
def _set_field_value(self, field, value):
"""Set field value handling different structures"""
import json

@ -18,7 +18,7 @@ class Notifier:
self.config = config
self.enabled = config['notifications']['enabled']
# SMTP configuration (preferred method)
# SMTP configuration
smtp_config = config['notifications'].get('smtp', {})
self.smtp_server = smtp_config.get('server')
self.smtp_port = smtp_config.get('port', 587)
@ -26,6 +26,12 @@ class Notifier:
self.smtp_password = smtp_config.get('password')
self.sender_email = smtp_config.get('sender_email')
# Mailgun API configuration (preferred over SMTP when configured)
mailgun_config = config['notifications'].get('mailgun', {})
self.mailgun_api_key = mailgun_config.get('api_key')
self.mailgun_domain = mailgun_config.get('domain')
self.mailgun_sender = mailgun_config.get('sender_email') or self.sender_email
self.recipients = config['notifications']['recipients']
self.webhook_config = config.get('webhooks', {})
@ -43,8 +49,8 @@ class Notifier:
logger.info("Notifications disabled, skipping email")
return
if not self.smtp_server or not self.smtp_user:
logger.warning("SMTP not configured, skipping email")
if not self.mailgun_api_key and (not self.smtp_server or not self.smtp_user):
logger.warning("Neither Mailgun API nor SMTP configured, skipping email")
return
try:
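Review note: the updated guard skips the email only when neither Mailgun nor SMTP is usable, and the constructor comment says Mailgun is preferred when configured. The sketch below expresses that selection as a pure function. The sending code is not part of this hunk, so `pick_transport` is hypothetical, and the SMTP `user` key name is an assumption.

```python
def pick_transport(notifications):
    """Return 'mailgun', 'smtp', or None for a notifications config dict.

    Assumptions: Mailgun counts as configured when api_key is set (matching
    the guard in the hunk above), and the SMTP username lives under 'user'.
    """
    mailgun = notifications.get('mailgun', {})
    smtp = notifications.get('smtp', {})
    if mailgun.get('api_key'):
        return 'mailgun'  # preferred when configured
    if smtp.get('server') and smtp.get('user'):
        return 'smtp'
    return None  # neither configured: skip the email
```

The guard as shown checks only `api_key`, so a config with a key but no `domain` would pass it. Whether the send path validates the domain is not visible here.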
@ -60,24 +66,59 @@ class Notifier:
<div style="background-color: #d4edda; border-left: 4px solid #28a745; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>Campaign:</strong> {{ campaign_name }} ({{ campaign_number }})</p>
<p style="margin: 5px 0 0 0;"><strong>Assets Downloaded:</strong> {{ asset_count }}</p>
<p style="margin: 5px 0 0 0;"><strong>Total Assets:</strong> {{ asset_count }}
{% if existing_asset_count and existing_asset_count > 0 %}
({{ existing_asset_count }} previously downloaded, <strong>{{ new_asset_count }} new this run</strong>)
{% endif %}
</p>
<p style="margin: 5px 0 0 0;"><strong>Status Updated:</strong> A1 → A2</p>
</div>
<h3 style="margin-top: 30px; color: #333;">Processed Assets:</h3>
{% for asset in processed_assets %}
<div style="border: 1px solid #ddd; margin: 15px 0; padding: 15px; background-color: #fafafa; border-radius: 4px;">
<div style="background-color: #28a745; color: white; padding: 10px 15px; margin: -15px -15px 15px -15px; border-radius: 4px 4px 0 0;">
<strong>{{ asset.asset_name }}</strong>
{% if new_assets is defined %}
{% if new_assets|length > 0 %}
<h3 style="margin-top: 30px; color: #28a745;">🆕 New This Run ({{ new_assets|length }}):</h3>
{% for asset in new_assets %}
<div style="border: 1px solid #ddd; margin: 15px 0; padding: 15px; background-color: #fafafa; border-radius: 4px;">
<div style="background-color: #28a745; color: white; padding: 10px 15px; margin: -15px -15px 15px -15px; border-radius: 4px 4px 0 0;">
<strong>{{ asset.asset_name }}</strong>
</div>
<div style="padding: 10px; background-color: white; border-radius: 4px;">
<p style="margin: 5px 0;"><span style="font-weight: bold;">Tracking ID:</span> <code>{{ asset.tracking_id }}</code></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box File ID:</span> {{ asset.box_file_id }}</p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box URL:</span> <a href="{{ asset.box_url }}">{{ asset.box_url }}</a></p>
{% if asset.folder_path %}<p style="margin: 5px 0;"><span style="font-weight: bold;">DAM Path:</span> {{ asset.folder_path }}</p>{% endif %}
</div>
</div>
<div style="padding: 10px; background-color: white; border-radius: 4px;">
<p style="margin: 5px 0;"><span style="font-weight: bold;">Tracking ID:</span> <code>{{ asset.tracking_id }}</code></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box File ID:</span> {{ asset.box_file_id }}</p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box URL:</span> <a href="{{ asset.box_url }}">{{ asset.box_url }}</a></p>
{% if asset.folder_path %}<p style="margin: 5px 0;"><span style="font-weight: bold;">DAM Path:</span> {{ asset.folder_path }}</p>{% endif %}
{% endfor %}
{% endif %}
{% if existing_assets is defined and existing_assets|length > 0 %}
<h3 style="margin-top: 30px; color: #666;">📁 Previously Downloaded ({{ existing_assets|length }}):</h3>
<div style="border: 1px solid #ddd; padding: 10px 15px; background-color: #f5f5f5; border-radius: 4px;">
<p style="margin: 0 0 8px 0; color: #666; font-size: 13px;">These files were already in Box from an earlier run and were skipped.</p>
<ul style="margin: 5px 0 0 0; padding-left: 20px; color: #555;">
{% for asset in existing_assets %}
<li style="margin: 3px 0;">{{ asset.asset_name }} <code style="color: #888; font-size: 11px;">({{ asset.tracking_id }})</code></li>
{% endfor %}
</ul>
</div>
</div>
{% endfor %}
{% endif %}
{% else %}
<h3 style="margin-top: 30px; color: #333;">Processed Assets:</h3>
{% for asset in processed_assets %}
<div style="border: 1px solid #ddd; margin: 15px 0; padding: 15px; background-color: #fafafa; border-radius: 4px;">
<div style="background-color: #28a745; color: white; padding: 10px 15px; margin: -15px -15px 15px -15px; border-radius: 4px 4px 0 0;">
<strong>{{ asset.asset_name }}</strong>
</div>
<div style="padding: 10px; background-color: white; border-radius: 4px;">
<p style="margin: 5px 0;"><span style="font-weight: bold;">Tracking ID:</span> <code>{{ asset.tracking_id }}</code></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box File ID:</span> {{ asset.box_file_id }}</p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box URL:</span> <a href="{{ asset.box_url }}">{{ asset.box_url }}</a></p>
{% if asset.folder_path %}<p style="margin: 5px 0;"><span style="font-weight: bold;">DAM Path:</span> {{ asset.folder_path }}</p>{% endif %}
</div>
</div>
{% endfor %}
{% endif %}
<div style="background-color: #d4edda; border-left: 4px solid #28a745; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong> Complete:</strong> All assets downloaded from DAM and uploaded to Box with tracking IDs.</p>
@ -111,7 +152,7 @@ class Notifier:
"""
},
'a2_to_a3_batch_complete': {
'subject': "A2→A3 Batch Upload Complete - {{ successful_count }}/{{ total_files }} Successful",
'subject': "A2→A3 Batch Upload Complete - {successful_count}/{total_files} Successful",
'html': """
<div style="font-family: Arial, sans-serif; max-width: 900px; margin: 0 auto;">
<div style="background-color: {% if failed_count == 0 %}#28a745{% else %}#ff9800{% endif %}; color: white; padding: 20px; text-align: center; border-radius: 8px 8px 0 0;">
@ -300,7 +341,7 @@ class Notifier:
<p style="margin: 5px 0 0 0;"><strong>Default Values Used:</strong></p>
<ul style="margin: 5px 0 0 20px; padding: 0;">
<li>Score: 0</li>
<li>URL: https://app.creativex.com/preflight/pretests</li>
<li>URL: None (no CreativeX URL sent)</li>
</ul>
<p style="margin: 10px 0 0 0; font-size: 12px; color: #666;">
<em>To add CreativeX score: Upload PDF report to Box folder 350605024645 and run creativex_scoring_storing.py</em>
@ -326,24 +367,61 @@ class Notifier:
<div style="background-color: #e3f2fd; border-left: 4px solid #1976d2; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>Campaign:</strong> {{ campaign_name }} ({{ campaign_number }})</p>
<p style="margin: 5px 0 0 0;"><strong>Campaign Type:</strong> Global Masters</p>
<p style="margin: 5px 0 0 0;"><strong>Assets Downloaded:</strong> {{ asset_count }}</p>
<p style="margin: 5px 0 0 0;"><strong>Total Assets:</strong> {{ asset_count }}
{% if existing_asset_count and existing_asset_count > 0 %}
({{ existing_asset_count }} previously downloaded, <strong>{{ new_asset_count }} new this run</strong>)
{% endif %}
</p>
<p style="margin: 5px 0 0 0;"><strong>Status Updated:</strong> B1 → B2</p>
</div>
<h3 style="margin-top: 30px; color: #333;">Processed Assets:</h3>
{% for asset in processed_assets %}
<div style="border: 1px solid #ddd; margin: 15px 0; padding: 15px; background-color: #fafafa; border-radius: 4px;">
<div style="background-color: #1976d2; color: white; padding: 10px 15px; margin: -15px -15px 15px -15px; border-radius: 4px 4px 0 0;">
<strong>{{ asset.asset_name }}</strong>
{% if new_assets is defined %}
{% if new_assets|length > 0 %}
<h3 style="margin-top: 30px; color: #1976d2;">🆕 New This Run ({{ new_assets|length }}):</h3>
{% for asset in new_assets %}
<div style="border: 1px solid #ddd; margin: 15px 0; padding: 15px; background-color: #fafafa; border-radius: 4px;">
<div style="background-color: #1976d2; color: white; padding: 10px 15px; margin: -15px -15px 15px -15px; border-radius: 4px 4px 0 0;">
<strong>{{ asset.asset_name }}</strong>
</div>
<div style="padding: 10px; background-color: white; border-radius: 4px;">
<p style="margin: 5px 0;"><span style="font-weight: bold;">Tracking ID:</span> <code>{{ asset.tracking_id }}</code></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box File ID:</span> {{ asset.box_file_id }}</p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box URL:</span> <a href="{{ asset.box_url }}">{{ asset.box_url }}</a></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">CreativeX Score:</span> {% if asset.creativex_score %}{{ asset.creativex_score }}{% if asset.creativex_url %} (<a href="{{ asset.creativex_url }}">View on CreativeX</a>){% endif %}{% else %}<span style="color: #999;">No CreativeX Score</span>{% endif %}</p>
{% if asset.folder_path %}<p style="margin: 5px 0;"><span style="font-weight: bold;">DAM Path:</span> {{ asset.folder_path }}</p>{% endif %}
</div>
</div>
<div style="padding: 10px; background-color: white; border-radius: 4px;">
<p style="margin: 5px 0;"><span style="font-weight: bold;">Tracking ID:</span> <code>{{ asset.tracking_id }}</code></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box File ID:</span> {{ asset.box_file_id }}</p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box URL:</span> <a href="{{ asset.box_url }}">{{ asset.box_url }}</a></p>
{% if asset.folder_path %}<p style="margin: 5px 0;"><span style="font-weight: bold;">DAM Path:</span> {{ asset.folder_path }}</p>{% endif %}
{% endfor %}
{% endif %}
{% if existing_assets is defined and existing_assets|length > 0 %}
<h3 style="margin-top: 30px; color: #666;">📁 Previously Downloaded ({{ existing_assets|length }}):</h3>
<div style="border: 1px solid #ddd; padding: 10px 15px; background-color: #f5f5f5; border-radius: 4px;">
<p style="margin: 0 0 8px 0; color: #666; font-size: 13px;">These files were already in Box from an earlier run and were skipped.</p>
<ul style="margin: 5px 0 0 0; padding-left: 20px; color: #555;">
{% for asset in existing_assets %}
<li style="margin: 3px 0;">{{ asset.asset_name }} <code style="color: #888; font-size: 11px;">({{ asset.tracking_id }})</code> &mdash; <span style="font-size: 12px;">CreativeX: {% if asset.creativex_score %}{{ asset.creativex_score }}{% else %}<span style="color: #999;">none</span>{% endif %}</span></li>
{% endfor %}
</ul>
</div>
</div>
{% endfor %}
{% endif %}
{% else %}
<h3 style="margin-top: 30px; color: #333;">Processed Assets:</h3>
{% for asset in processed_assets %}
<div style="border: 1px solid #ddd; margin: 15px 0; padding: 15px; background-color: #fafafa; border-radius: 4px;">
<div style="background-color: #1976d2; color: white; padding: 10px 15px; margin: -15px -15px 15px -15px; border-radius: 4px 4px 0 0;">
<strong>{{ asset.asset_name }}</strong>
</div>
<div style="padding: 10px; background-color: white; border-radius: 4px;">
<p style="margin: 5px 0;"><span style="font-weight: bold;">Tracking ID:</span> <code>{{ asset.tracking_id }}</code></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box File ID:</span> {{ asset.box_file_id }}</p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box URL:</span> <a href="{{ asset.box_url }}">{{ asset.box_url }}</a></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">CreativeX Score:</span> {% if asset.creativex_score %}{{ asset.creativex_score }}{% if asset.creativex_url %} (<a href="{{ asset.creativex_url }}">View on CreativeX</a>){% endif %}{% else %}<span style="color: #999;">No CreativeX Score</span>{% endif %}</p>
{% if asset.folder_path %}<p style="margin: 5px 0;"><span style="font-weight: bold;">DAM Path:</span> {{ asset.folder_path }}</p>{% endif %}
</div>
</div>
{% endfor %}
{% endif %}
<div style="background-color: #e3f2fd; border-left: 4px solid #1976d2; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong> Complete:</strong> All Global Master assets downloaded from DAM and uploaded to Box with tracking IDs.</p>
@@ -378,6 +456,7 @@ class Notifier:
<div style="padding: 10px; background-color: white; border-radius: 4px;">
<p style="margin: 5px 0;"><span style="font-weight: bold;">Tracking ID:</span> <code>{{ asset.tracking_id }}</code></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">Box URL:</span> <a href="{{ asset.box_url }}">{{ asset.box_url }}</a></p>
<p style="margin: 5px 0;"><span style="font-weight: bold;">CreativeX Score:</span> {% if asset.creativex_score %}{{ asset.creativex_score }}{% if asset.creativex_url %} (<a href="{{ asset.creativex_url }}">View on CreativeX</a>){% endif %}{% else %}<span style="color: #999;">No CreativeX Score</span>{% endif %}</p>
{% if asset.folder_path %}<p style="margin: 5px 0;"><span style="font-weight: bold;">DAM Path:</span> {{ asset.folder_path }}</p>{% endif %}
</div>
</div>
@@ -590,6 +669,125 @@ class Notifier:
</div>
"""
},
'a1_to_a2_no_assets_retry': {
'subject': "⚠️ No Assets Found (Attempt {retry_count}/3) - Campaign {campaign_name}",
'html': """
<div style="font-family: Arial, sans-serif; max-width: 900px; margin: 0 auto;">
<div style="background-color: #ff9800; color: white; padding: 20px; text-align: center; border-radius: 8px 8px 0 0;">
<h1 style="margin: 0;"> No Master Assets Found (Retry {{ retry_count }}/{{ max_retries }})</h1>
</div>
<div style="background-color: #fff3cd; border-left: 4px solid #ffc107; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>Campaign:</strong> {{ campaign_name }} ({{ campaign_number }})</p>
<p style="margin: 5px 0 0 0;"><strong>Campaign ID:</strong> {{ campaign_id }}</p>
<p style="margin: 5px 0 0 0;"><strong>Status:</strong> A1</p>
<p style="margin: 5px 0 0 0;"><strong>Retry Attempt:</strong> {{ retry_count }} of {{ max_retries }}</p>
</div>
<div style="padding: 20px; background-color: #f8f9fa; border-radius: 4px; margin: 20px 0;">
<h3 style="color: #ff9800; margin-top: 0;">Campaign Set to A1 but No Assets Found</h3>
<p>The Master Assets folder was searched (including subfolders) but no assets were found.</p>
<p>This campaign is set to status A1 but appears to have no master assets ready for download.</p>
</div>
<div style="background-color: #fff3cd; border-left: 4px solid #ffc107; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>📌 What Happens Next:</strong></p>
<ul style="margin: 10px 0;">
<li>This is attempt <strong>{{ retry_count }}</strong> of <strong>{{ max_retries }}</strong></li>
<li>System will retry automatically on next run (every 3 minutes)</li>
{% if retry_count < max_retries %}
<li><strong>{{ max_retries - retry_count }} attempt(s) remaining</strong> before marking as permanently failed</li>
{% else %}
<li style="color: #d32f2f;"><strong>WARNING: This is the final attempt!</strong> Next failure will mark campaign as permanently failed.</li>
{% endif %}
<li>Please verify assets exist in Master Assets folder</li>
</ul>
</div>
<p style="color: #666; font-size: 12px; margin-top: 20px;">A1→A2 script will retry automatically. No action needed unless this persists.</p>
</div>
"""
},
'a1_to_a2_no_assets_warning': {
'subject': "⚠️ Campaign in A1 with no assets yet - {campaign_name}",
'html': """
<div style="font-family: Arial, sans-serif; max-width: 900px; margin: 0 auto;">
<div style="background-color: #ff9800; color: white; padding: 20px; text-align: center; border-radius: 8px 8px 0 0;">
<h1 style="margin: 0;"> Campaign in A1 with No Assets Yet</h1>
</div>
<div style="background-color: #fff3cd; border-left: 4px solid #ffc107; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>Campaign:</strong> {{ campaign_name }} ({{ campaign_number }})</p>
<p style="margin: 5px 0 0 0;"><strong>Campaign ID:</strong> {{ campaign_id }}</p>
<p style="margin: 5px 0 0 0;"><strong>Status:</strong> A1</p>
<p style="margin: 5px 0 0 0;"><strong>Polls with empty folder:</strong> {{ poll_count }}</p>
</div>
<div style="padding: 20px; background-color: #f8f9fa; border-radius: 4px; margin: 20px 0;">
<h3 style="color: #ff9800; margin-top: 0;">Master Assets Folder Has Been Empty for ~1 Hour</h3>
<p>This campaign has been at status A1 for roughly an hour with no master assets in the folder.</p>
<p>This is often expected: the folder may have been created before assets were uploaded, and the system will keep checking automatically.</p>
<p>This is a <strong>one-time warning</strong>; no further emails will be sent for this campaign.</p>
</div>
<div style="background-color: #e3f2fd; border-left: 4px solid #1976d2; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>📌 Action only needed if:</strong></p>
<ul style="margin: 10px 0;">
<li>You expected assets to be uploaded already</li>
<li>The campaign was set to A1 by mistake (change the status in DAM)</li>
</ul>
<p style="margin: 10px 0 0 0;">Otherwise no action is needed; processing will start automatically as soon as assets appear in the Master Assets folder.</p>
</div>
<p style="color: #666; font-size: 12px; margin-top: 20px;">A1→A2 script will continue to check silently every 3 minutes.</p>
</div>
"""
},
'a1_to_a2_permanently_failed': {
'subject': "❌ PERMANENTLY FAILED - Campaign {campaign_name} (No Assets After 3 Attempts)",
'html': """
<div style="font-family: Arial, sans-serif; max-width: 900px; margin: 0 auto;">
<div style="background-color: #d32f2f; color: white; padding: 20px; text-align: center; border-radius: 8px 8px 0 0;">
<h1 style="margin: 0;"> CAMPAIGN PERMANENTLY FAILED</h1>
</div>
<div style="background-color: #ffebee; border-left: 4px solid #d32f2f; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>Campaign:</strong> {{ campaign_name }} ({{ campaign_number }})</p>
<p style="margin: 5px 0 0 0;"><strong>Campaign ID:</strong> {{ campaign_id }}</p>
<p style="margin: 5px 0 0 0;"><strong>Status:</strong> A1</p>
<p style="margin: 5px 0 0 0;"><strong>Failed Attempts:</strong> {{ retry_count }} / {{ max_retries }}</p>
</div>
<div style="padding: 20px; background-color: #f8f9fa; border-radius: 4px; margin: 20px 0;">
<h3 style="color: #d32f2f; margin-top: 0;">Campaign Marked as Permanently Failed</h3>
<p>After {{ max_retries }} consecutive attempts, the system was unable to find any master assets in the Master Assets folder.</p>
<p><strong>This campaign will no longer be processed automatically.</strong></p>
</div>
<div style="background-color: #ffebee; border-left: 4px solid #d32f2f; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>🔧 Required Actions:</strong></p>
<ol style="margin: 10px 0;">
<li>Verify the campaign should actually be in A1 status</li>
<li>Check if Master Assets folder exists and contains files</li>
<li>If this is a mistake, change campaign status to something else</li>
<li>If assets need to be added, add them to Master Assets folder</li>
<li><strong>Once fixed, manually reset the retry counter</strong></li>
</ol>
</div>
<div style="background-color: #e3f2fd; border-left: 4px solid #1976d2; padding: 15px; margin: 20px 0;">
<p style="margin: 0;"><strong>💡 How to Reset This Campaign:</strong></p>
<p style="margin: 10px 0; padding: 15px; background-color: white; border-radius: 4px;">
To reset the status and retry this campaign, please contact support at: <br>
<strong><a href="mailto:optical@oliver.agency" style="color: #1976d2;">optical@oliver.agency</a></strong>
</p>
<p style="margin: 5px 0 0 0; font-size: 12px; color: #666;">Support will reset the retry counter and investigate the issue.</p>
</div>
<p style="color: #666; font-size: 12px; margin-top: 20px;">Automated processing stopped. Manual intervention required.</p>
</div>
"""
},
'b1_to_b2_no_assets': {
'subject': "⚠️ No Assets Found - Global Campaign {campaign_name}",
'html': """
@@ -894,59 +1092,105 @@ class Notifier:
            html_content = jinja_template.render(data)
            subject = template['subject'].format(**data)
            # 2. Create MIME message
            if attachments:
                # Use MIMEMultipart for attachments
                message = MIMEMultipart()
                message['From'] = self.sender_email
                message['To'] = ", ".join(recipients) if isinstance(recipients, list) else recipients
                message['Subject'] = subject
                # Attach HTML body
                message.attach(MIMEText(html_content, "html"))
                # Attach files
                from email.mime.base import MIMEBase
                from email import encoders
                import os
                for file_path in attachments:
                    try:
                        if os.path.exists(file_path):
                            with open(file_path, "rb") as attachment:
                                part = MIMEBase("application", "octet-stream")
                                part.set_payload(attachment.read())
                            encoders.encode_base64(part)
                            filename = os.path.basename(file_path)
                            part.add_header(
                                "Content-Disposition",
                                f"attachment; filename= {filename}",
                            )
                            message.attach(part)
                            logger.info("Attached file: {}".format(filename))
                        else:
                            logger.warning("Attachment not found: {}".format(file_path))
                    except Exception as e:
                        logger.error("Failed to attach file {}: {}".format(file_path, str(e)))
            else:
                # Use standard MIMEText for simple emails
                message = MIMEText(html_content, "html")
                message['From'] = self.sender_email
                message['To'] = ", ".join(recipients) if isinstance(recipients, list) else recipients
                message['Subject'] = subject
            # 2. Send via Mailgun API or SMTP
            recipient_list = recipients if isinstance(recipients, list) else [recipients]
            # 3. Send via SMTP
            with smtplib.SMTP(self.smtp_server, self.smtp_port) as server:
                server.starttls()
                server.login(self.smtp_user, self.smtp_password)
                server.send_message(message)
            if self.mailgun_api_key and self.mailgun_domain:
                self._send_via_mailgun_api(recipient_list, subject, html_content, attachments)
            else:
                self._send_via_smtp(recipient_list, subject, html_content, attachments)
            logger.info("Email sent to {} (Template: {})".format(recipients, template_name))
        except Exception as e:
            logger.error("Failed to send email: {}".format(str(e)))
    def _send_via_mailgun_api(self, recipient_list, subject, html_content, attachments=None):
        """Send email via Mailgun REST API - sends one request per recipient for reliable delivery"""
        import os
        url = "https://api.mailgun.net/v3/{}/messages".format(self.mailgun_domain)
        # Normalize: split any comma-separated strings into individual addresses
        normalized = []
        for r in recipient_list:
            for addr in r.split(','):
                addr = addr.strip()
                if addr:
                    normalized.append(addr)
        for recipient in normalized:
            files = []
            try:
                if attachments:
                    for file_path in attachments:
                        if os.path.exists(file_path):
                            files.append(("attachment", (os.path.basename(file_path), open(file_path, "rb"))))
                        else:
                            logger.warning("Attachment not found: {}".format(file_path))
                data = {
                    "from": self.mailgun_sender,
                    "to": [recipient],
                    "subject": subject,
                    "html": html_content,
                }
                response = requests.post(
                    url,
                    auth=("api", self.mailgun_api_key),
                    data=data,
                    files=files if files else None,
                )
                response.raise_for_status()
                logger.info("Mailgun API sent to {}: {}".format(recipient, response.json()))
            except Exception as e:
                logger.error("Mailgun API failed for {}: {}".format(recipient, str(e)))
            finally:
                for _, file_tuple in files:
                    file_tuple[1].close()
    def _send_via_smtp(self, recipient_list, subject, html_content, attachments=None):
        """Send email via SMTP"""
        import os
        from email.mime.base import MIMEBase
        from email import encoders
        if attachments:
            message = MIMEMultipart()
            message['From'] = self.sender_email
            message['To'] = ", ".join(recipient_list)
            message['Subject'] = subject
            message.attach(MIMEText(html_content, "html"))
            for file_path in attachments:
                try:
                    if os.path.exists(file_path):
                        with open(file_path, "rb") as attachment:
                            part = MIMEBase("application", "octet-stream")
                            part.set_payload(attachment.read())
                        encoders.encode_base64(part)
                        filename = os.path.basename(file_path)
                        part.add_header(
                            "Content-Disposition",
                            "attachment; filename= {}".format(filename),
                        )
                        message.attach(part)
                        logger.info("Attached file: {}".format(filename))
                    else:
                        logger.warning("Attachment not found: {}".format(file_path))
                except Exception as e:
                    logger.error("Failed to attach file {}: {}".format(file_path, str(e)))
        else:
            message = MIMEText(html_content, "html")
            message['From'] = self.sender_email
            message['To'] = ", ".join(recipient_list)
            message['Subject'] = subject
        with smtplib.SMTP(self.smtp_server, self.smtp_port) as server:
            server.starttls()
            server.login(self.smtp_user, self.smtp_password)
            server.send_message(message)
    def send_webhook(self, url, payload):
        """
        url: Webhook URL
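
Review note on the `a2_to_a3_batch_complete` subject change: the hunk above renders subjects with `template['subject'].format(**data)` rather than Jinja, which appears to be why the subject placeholders moved from `{{ ... }}` to `{...}`. The sketch below is illustrative only; the counts are made-up values and the variable names are not from the repository. It shows how `str.format` treats doubled braces as escaped literal braces, so the Jinja-style subject is never filled in.

```python
# Hypothetical values; only the placeholder names come from the template.
data = {"successful_count": 7, "total_files": 9}

jinja_style = "A2→A3 Batch Upload Complete - {{ successful_count }}/{{ total_files }} Successful"
format_style = "A2→A3 Batch Upload Complete - {successful_count}/{total_files} Successful"

# str.format reads "{{" and "}}" as literal braces, so no substitution happens here.
old_subject = jinja_style.format(**data)

# Single braces are real replacement fields, so the counts are filled in.
new_subject = format_style.format(**data)

print(old_subject)
print(new_subject)
```

The same constraint would apply to any other subject line rendered through `.format(**data)`: single-brace fields only, and every field name must be present in `data` or `format` raises `KeyError`.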

View file

@@ -0,0 +1,88 @@
#!/usr/bin/env python3
"""
Quick test: Send via Mailgun API with multiple recipients
to diagnose daily report delivery issue.
"""
import os
import sys
import requests
# Load from environment (same as production)
api_key = os.environ.get('MAILGUN_API_KEY')
domain = os.environ.get('MAILGUN_DOMAIN')
sender = os.environ.get('MAILGUN_SENDER_EMAIL') or os.environ.get('SENDER_EMAIL')
if not api_key or not domain:
    print("ERROR: MAILGUN_API_KEY and MAILGUN_DOMAIN must be set")
    sys.exit(1)
print("Using domain: {}".format(domain))
print("Using sender: {}".format(sender))
print("API key: {}...{}".format(api_key[:8], api_key[-8:]))
print()
# Try both US and EU endpoints
endpoints = [
    ("US", "https://api.mailgun.net/v3/{}/messages".format(domain)),
    ("EU", "https://api.eu.mailgun.net/v3/{}/messages".format(domain)),
]
# First, find which endpoint works
working_url = None
for region, url in endpoints:
    print("Testing {} endpoint: {}".format(region, url))
    test_data = {
        "from": sender,
        "to": ["nick.viljoen@oliver.agency"],
        "subject": "Mailgun Endpoint Test - {} Region".format(region),
        "html": "<p>Testing {} endpoint</p>".format(region),
    }
    resp = requests.post(url, auth=("api", api_key), data=test_data)
    print(" Status: {}".format(resp.status_code))
    print(" Response: {}".format(resp.text[:500]))
    if resp.status_code == 200:
        working_url = url
        print(" >>> {} endpoint works!".format(region))
        break
    print()
if not working_url:
    print("\nERROR: Neither US nor EU endpoint accepted the API key.")
    print("Check that MAILGUN_API_KEY is correct and the domain is verified.")
    sys.exit(1)
print()
print("=" * 60)
print("Using working endpoint: {}".format(working_url))
print("=" * 60)
# --- Test 1: Comma-separated string in list (how daily report currently sends) ---
print()
print("TEST 1: Comma-separated string in list (current daily report format)")
data1 = {
    "from": sender,
    "to": ["nick.viljoen@oliver.agency,daveporter@oliver.agency"],
    "subject": "Mailgun Test 1 - Comma-Separated in List",
    "html": "<h2>Test 1</h2><p>Comma-separated string in list. If you see this, the current format works.</p>",
}
resp1 = requests.post(working_url, auth=("api", api_key), data=data1)
print(" Status: {}".format(resp1.status_code))
print(" Response: {}".format(resp1.text[:500]))
# --- Test 2: Multiple recipients as separate list items (proper format) ---
print()
print("TEST 2: Separate list items (proper format)")
data2 = {
    "from": sender,
    "to": ["nick.viljoen@oliver.agency", "daveporter@oliver.agency"],
    "subject": "Mailgun Test 2 - Separate List Items",
    "html": "<h2>Test 2</h2><p>Separate list items. If you see this, the split format works.</p>",
}
resp2 = requests.post(working_url, auth=("api", api_key), data=data2)
print(" Status: {}".format(resp2.status_code))
print(" Response: {}".format(resp2.text[:500]))
print()
print("=" * 60)
print("DONE - Check inboxes for both tests")
print("=" * 60)
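
Review note: the two tests above compare a comma-separated string inside a list against separate list items. A minimal sketch of the normalization that `_send_via_mailgun_api` performs before sending is shown below, pulled out as a standalone function so it can be checked without calling Mailgun. The function name `normalize_recipients` and the `example.com` addresses are assumptions for illustration, not part of the repository.

```python
def normalize_recipients(recipients):
    """Flatten a string or list of strings into individual email addresses.

    Hypothetical helper mirroring the split-and-strip loop in
    _send_via_mailgun_api; it does not validate address syntax.
    """
    if isinstance(recipients, str):
        recipients = [recipients]
    normalized = []
    for entry in recipients:
        # One entry may hold several comma-separated addresses.
        for addr in entry.split(","):
            addr = addr.strip()
            if addr:  # skip empty fragments from trailing commas or blanks
                normalized.append(addr)
    return normalized


print(normalize_recipients(["a@example.com, b@example.com", "c@example.com"]))
```

Keeping this step as a pure function would let both recipient formats from the test script be covered by a unit test instead of by sending real mail.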

View file

@@ -1,148 +0,0 @@
#!/usr/bin/env python3
"""
Test script to verify MASTERASSETIDS field implementation
Shows master assets and their potential derivatives
"""
import os
import sys
import psycopg2
from dotenv import load_dotenv
# Load env vars from current directory
script_dir = os.path.dirname(os.path.abspath(__file__))
load_dotenv(os.path.join(script_dir, '.env'))
try:
    conn = psycopg2.connect(
        host=os.getenv('DB_HOST', 'localhost'),
        port=os.getenv('DB_PORT', '5437'),
        database='ferrero_tracking',
        user=os.getenv('DB_USER'),
        password=os.getenv('DB_PASSWORD')
    )
    cursor = conn.cursor()
    print("=" * 80)
    print("MASTERASSETIDS FIELD TESTING REPORT")
    print("=" * 80)
    # 1. Show master assets available for testing
    print("\n📋 MASTER ASSETS (Available for Testing)")
    print("-" * 80)
    cursor.execute("""
        SELECT
            tracking_id,
            opentext_id,
            local_campaign_id,
            original_filename,
            created_at
        FROM master_assets
        ORDER BY created_at DESC
        LIMIT 10
    """)
    print(f"{'Tracking ID':<12} {'OpenText ID':<45} {'Campaign':<15} {'Filename':<30}")
    print("-" * 80)
    for row in cursor.fetchall():
        tracking_id, opentext_id, campaign_id, filename, created_at = row
        filename_short = (filename[:27] + '...') if filename and len(filename) > 30 else filename or 'N/A'
        print(f"{tracking_id:<12} {opentext_id:<45} {campaign_id:<15} {filename_short:<30}")
    # 2. Show derivative assets (if any exist)
    print("\n\n📦 DERIVATIVE ASSETS (Uploaded from Agency)")
    print("-" * 80)
    cursor.execute("""
        SELECT
            da.tracking_id,
            da.dam_asset_id,
            da.derivative_filename,
            ma.opentext_id as master_opentext_id,
            ma.local_campaign_id,
            da.created_at
        FROM derivative_assets da
        LEFT JOIN master_assets ma ON da.tracking_id = ma.tracking_id
        ORDER BY da.created_at DESC
        LIMIT 10
    """)
    derivative_rows = cursor.fetchall()
    if derivative_rows:
        print(f"{'Tracking ID':<12} {'Derivative DAM ID':<45} {'Master DAM ID (should be in MASTERASSETIDS)':<50}")
        print("-" * 80)
        for row in derivative_rows:
            tracking_id, dam_asset_id, filename, master_opentext_id, campaign_id, created_at = row
            print(f"{tracking_id:<12} {dam_asset_id or 'N/A':<45} {master_opentext_id or 'N/A':<50}")
    else:
        print("(No derivative assets found)")
        print("\n Derivatives are created when Agency returns localized assets (A2→A3 flow)")
    # 3. Show campaigns ready for testing
    print("\n\n🧪 CAMPAIGNS READY FOR TESTING")
    print("-" * 80)
    cursor.execute("""
        SELECT
            cs.campaign_number,
            cs.campaign_name,
            cs.status,
            COUNT(ma.id) as master_count,
            MAX(cs.updated_at) as last_updated
        FROM campaign_status cs
        LEFT JOIN master_assets ma ON cs.campaign_number = ma.local_campaign_id
        WHERE cs.status IN ('A2', 'A3')
        GROUP BY cs.campaign_number, cs.campaign_name, cs.status
        ORDER BY last_updated DESC
    """)
    test_campaigns = cursor.fetchall()
    if test_campaigns:
        print(f"{'Campaign':<15} {'Status':<8} {'Master Assets':<15} {'Campaign Name':<40}")
        print("-" * 80)
        for row in test_campaigns:
            campaign_num, campaign_name, status, count, last_updated = row
            print(f"{campaign_num:<15} {status:<8} {count:<15} {campaign_name[:37]}")
    else:
        print("(No campaigns in A2 or A3 status)")
    # 4. Get a sample tracking ID for testing
    print("\n\n🔬 TEST SCENARIO")
    print("-" * 80)
    cursor.execute("""
        SELECT tracking_id, opentext_id, local_campaign_id, original_filename
        FROM master_assets
        ORDER BY created_at DESC
        LIMIT 1
    """)
    sample = cursor.fetchone()
    if sample:
        tracking_id, opentext_id, campaign_id, filename = sample
        print(f"Sample Master Asset for Testing:")
        print(f" Tracking ID: {tracking_id}")
        print(f" OpenText ID: {opentext_id}")
        print(f" Campaign: {campaign_id}")
        print(f" Filename: {filename or 'N/A'}")
        print(f"\nTo test MASTERASSETIDS field:")
        print(f" 1. Upload a derivative file to Box with tracking ID: {tracking_id}")
        print(f" 2. Run: python scripts/a2_to_a3_upload_polling.py --dryrun")
        print(f" 3. Check for FERRERO.MASTERASSETIDS field with value: {opentext_id}")
        print(f"\nNote: Field is only active in PPR environment (ppr.dam.ferrero.com)")
    # 5. Environment check
    print("\n\n🌍 ENVIRONMENT CONFIGURATION")
    print("-" * 80)
    dam_url = os.getenv('DAM_BASE_URL', 'Not configured')
    print(f"DAM Base URL: {dam_url}")
    if 'ppr.dam.ferrero.com' in dam_url:
        print("Environment: PPR (MASTERASSETIDS field is ENABLED ✅)")
    elif 'dam.ferrero.com' in dam_url:
        print("Environment: PROD (MASTERASSETIDS field is DISABLED ⚠️)")
    else:
        print("Environment: Unknown")
    print("\n" + "=" * 80)
    conn.close()
except Exception as e:
    print(f"❌ Error: {e}")
    sys.exit(1)

View file

@@ -1,94 +0,0 @@
#!/usr/bin/env python3
"""
Test script to demonstrate MASTERASSETIDS field with multiple master asset IDs
This creates a test JSON structure showing how multiple master assets would be linked
"""
import json
# Example: A localized asset (derivative) that references TWO master assets
# Master 1: fc5c389776516bb58044c7d4bf479da458599baf (tracking: BqB8vo)
# Master 2: ad3948d72ea8550a338a600ae87a1bdd1968b066 (tracking: SfUQ7m)
test_field_structure = {
'id': 'FERRERO.MASTERASSETIDS',
'parent_table_id': 'FERRERO.TABULAR.FIELD.MASTERASSETIDS',
'type': 'com.artesia.metadata.MetadataTableField',
'values': [
# First master asset ID
{
'cascading_domain_value': False,
'domain_value': True,
'is_locked': False,
'value': {
'field_value': {
'type': 'string',
'value': 'fc5c389776516bb58044c7d4bf479da458599baf'
},
'type': 'com.artesia.metadata.DomainValue'
}
},
# Second master asset ID
{
'cascading_domain_value': False,
'domain_value': True,
'is_locked': False,
'value': {
'field_value': {
'type': 'string',
'value': 'ad3948d72ea8550a338a600ae87a1bdd1968b066'
},
'type': 'com.artesia.metadata.DomainValue'
}
},
# Third master asset ID (optional)
{
'cascading_domain_value': False,
'domain_value': True,
'is_locked': False,
'value': {
'field_value': {
'type': 'string',
'value': '020d76f957ec9f4ec0b18035a2d012cd3fd376c2'
},
'type': 'com.artesia.metadata.DomainValue'
}
}
]
}
print("=" * 80)
print("MULTIPLE MASTER ASSET IDS - TEST STRUCTURE")
print("=" * 80)
print()
print("Field ID:", test_field_structure['id'])
print("Parent Table:", test_field_structure['parent_table_id'])
print("Number of Master Asset IDs:", len(test_field_structure['values']))
print()
print("Master Asset IDs:")
for i, value_obj in enumerate(test_field_structure['values'], 1):
    master_id = value_obj['value']['field_value']['value']
    print(f" {i}. {master_id}")
print()
print("Full JSON Structure:")
print("-" * 80)
print(json.dumps(test_field_structure, indent=2))
print()
print("=" * 80)
print("TESTING NOTES")
print("=" * 80)
print()
print("To test if DAM accepts multiple IDs:")
print("1. Check if FERRERO.TABULAR.FIELD.MASTERASSETIDS schema allows multiple rows")
print("2. Verify with DAM admin if field has 'Allow Multiple Values' enabled")
print("3. Test upload with this structure to PPR environment")
print()
print("Current Implementation:")
print(" - Code adds ONE master ID (from tracking ID lookup)")
print(" - Supports Many-to-Many relationship conceptually")
print(" - Array structure ready for multiple values")
print()
print("To enable multiple IDs in production:")
print(" - Agency tool needs to send list of master tracking IDs")
print(" - Database schema needs multiple master references")
print(" - Code modification needed to look up multiple masters")
print()