banner_studio/FRUSTRATION_LIST.md
Simeon Schecter 988a47c797 Initial commit: Day 1 + Day 2 of the vertical slice
Day 1 (monorepo + Node layout engine):
- Turborepo + pnpm workspaces with apps/web, apps/render-worker, and
  packages for types, layout-engine, prompts, api-lib.
- @banner-studio/types: BannerSpec contract, every layer kind, ResolvedLayer,
  zod schemas mirroring each interface.
- @banner-studio/layout-engine: Dropflow WASM wrapper, text measurement,
  shrink-to-fit, push_siblings, resolveLayout. Snapshot-tested.

Day 2 (browser parity + AI pipeline):
- Layout engine ./browser subpath: same resolveLayout in the browser via
  Dropflow WASM build. Quarantined wasm-locator import (dropflow 0.5.1
  exports gap).
- Cross-group push_siblings bug fix: deltas now thread through group
  recursion via a shared accumulator; regression test added.
- DEMO_TEMPLATE_300x250 promoted to packages/layout-engine/src/templates/.
- @banner-studio/prompts: versioned extract + generate prompts with
  zod-defined tool schemas (claude-sonnet-4-6, forced tool-use).
- @banner-studio/api-lib: CSV feed loader, extract/generate/route-node/
  assemble agents, orchestrator returning fully-resolved BannerSpec.
  Generate agent retries on character-limit overflow.
- apps/web (Next.js 14 App Router): /api/generate route, /parity diff page,
  promise-singleton browser engine init.
- feeds/demo.csv with five hand-authored rows of varied length.
- SLICE_DEVIATIONS.md documents the five intentional gaps from
  ARCHITECTURE.md with V1 reversal paths.

Verified end-to-end: POST /api/generate against the live Claude API
returns three resolved BannerSpecs and two honestly-skipped rows
(overflow after two attempts). 26 unit + integration tests passing.
2026-05-15 10:25:21 -04:00

FRUSTRATION_LIST.md

The UX contract for the banner production platform, expressed as failure modes — concrete moments where existing tools (Celtra, Bannerify, Pencil, Eversail, Flexitive, Google Web Designer) make designers, producers, and traffickers want to throw their laptops.

Every entry below is something this tool must not do. The "rule" line is the architectural or interaction rule that prevents it. When a build decision is in tension with one of these rules, the rule wins unless explicitly overruled in writing.

Format:

  • N. Title
  • Failure mode — one sentence.
  • Concrete moment — where the user encounters it.
  • Our rule — what prevents it in this tool.

1. Text overflow is silent

  • Failure mode — Generated copy overflows the text container, the tool accepts it, and nobody notices until the banner is exported or already live.
  • Concrete moment — A trafficker uploads to CM360, the headline is clipped at the third word, and Slack lights up.
  • Our rule — Overflow is a first-class state in the layout engine. Every ResolvedLayer carries layout_log.overflow_triggered. The review UI renders any artboard with overflow in a visually unmistakable state (red border + overflow badge). Export is blocked while any artboard has overflow.
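
A minimal sketch of the export gate this rule implies. Only `layout_log.overflow_triggered` comes from the rule above; the `Artboard` shape and the function names are assumptions made for illustration.

```typescript
// Hypothetical shapes: only layout_log.overflow_triggered is named in the rule.
interface ResolvedLayer {
  id: string;
  layout_log: { overflow_triggered: boolean };
}

interface Artboard {
  size: string; // e.g. "300x250"
  layers: ResolvedLayer[];
}

// Sizes of every artboard with at least one overflowing layer.
function overflowingArtboards(artboards: Artboard[]): string[] {
  return artboards
    .filter((a) => a.layers.some((l) => l.layout_log.overflow_triggered))
    .map((a) => a.size);
}

// Export is allowed only when no artboard overflows.
function checkOverflowGate(artboards: Artboard[]): { allowed: boolean; blockedBy: string[] } {
  const blockedBy = overflowingArtboards(artboards);
  return { allowed: blockedBy.length === 0, blockedBy };
}
```

The review UI could drive the red border and overflow badge from the same `blockedBy` list, so the visual state and the export gate cannot disagree.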

2. Regeneration destroys edits

  • Failure mode — A human edits the headline, regenerates for a typo elsewhere, and the headline reverts.
  • Concrete moment — Producer fixed "Up to 40% Off" to "Up To 40% Off" per client style, regenerated, the fix is gone, and they have to remember every edit they ever made.
  • Our rule — Human overrides live in human_overrides keyed by field_path and survive regeneration via the merge algorithm in ARCHITECTURE.md Part 7. The override merge happens server-side; the UI never re-applies anything.
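
The authoritative merge algorithm is the one in ARCHITECTURE.md Part 7, which is not reproduced here. The sketch below only illustrates the idea of overrides keyed by `field_path` surviving a regeneration; the flat field map, the `HumanOverride` shape, and the handling of paths that no longer exist are all assumptions.

```typescript
// Hypothetical shapes; the real algorithm is specified in ARCHITECTURE.md Part 7.
interface HumanOverride {
  value: string;
  author: string;
  edited_at: string; // ISO timestamp
}

type GeneratedFields = Record<string, string>; // field_path -> AI value
type HumanOverrides = Record<string, HumanOverride>; // field_path -> human value

// Apply overrides on top of freshly generated fields. Inputs are not mutated.
function mergeOverrides(generated: GeneratedFields, overrides: HumanOverrides): GeneratedFields {
  const merged: GeneratedFields = { ...generated };
  for (const path of Object.keys(overrides)) {
    const override = overrides[path];
    // Assumption: an override whose field_path is absent from the new
    // generation is skipped rather than re-created.
    if (override !== undefined && path in merged) {
      merged[path] = override.value;
    }
  }
  return merged;
}
```

Keeping `author` and `edited_at` on each override would also give the review grid what it needs to mark overridden fields.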

3. The user cannot tell which fields are overridden

  • Failure mode — Overrides exist but are invisible, so the user cannot distinguish AI output from human work.
  • Concrete moment — Producer is reviewing version 4, wants to know which copy is original, which was edited, and by whom; the tool offers no answer.
  • Our rule — Every overridden field renders with a persistent visual marker (override pip + author + timestamp on hover). The marker is part of the review grid's default state, not a toggle.

4. Template changes silently break live campaigns

  • Failure mode — A designer edits the template; campaigns built on it break or drift on the next regeneration, with no notification.
  • Concrete moment — Designer narrows the headline slot by 12px to fix one campaign. The next morning, three other campaigns regenerate with overflow.
  • Our rule — Template edits to fields used by active campaigns trigger an interrupt modal listing affected campaigns. The modal cannot be auto-dismissed. Producers on those campaigns receive a notification with a diff.

5. Export produces files that fail ad server QA

  • Failure mode — The tool exports, the trafficker uploads, CM360 rejects, the designer learns from the trafficker, not the tool.
  • Concrete moment — File weight 187KB on a 150KB initial-load profile, click tag misconfigured, GSAP loaded as a sibling script instead of CDN — discovered five hours after handoff.
  • Our rule — QA gates run inline during export and block on any blocking failure. The export button surfaces gate results before producing a zip. The trafficking sheet is generated from the same QA pass, not separately.
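
One way to picture a single QA pass feeding both the export decision and the trafficking sheet. The gate names, the `ExportCandidate` fields, and the row format are hypothetical, chosen only to mirror the failure described above.

```typescript
// Hypothetical shapes for illustration; gate names and fields are assumptions.
type Severity = "blocking" | "warning";

interface GateResult {
  gate: string;
  passed: boolean;
  severity: Severity;
  detail: string;
}

interface ExportCandidate {
  weight_kb: number;
  weight_limit_kb: number;
  click_tag_wired: boolean;
}

type Gate = (c: ExportCandidate) => GateResult;

const weightGate: Gate = (c) => ({
  gate: "initial-load-weight",
  passed: c.weight_kb <= c.weight_limit_kb,
  severity: "blocking",
  detail: `${c.weight_kb}KB against a ${c.weight_limit_kb}KB limit`,
});

const clickTagGate: Gate = (c) => ({
  gate: "click-tag",
  passed: c.click_tag_wired,
  severity: "blocking",
  detail: c.click_tag_wired ? "click tag wired" : "click tag missing or unwired",
});

interface QaReport {
  results: GateResult[];
  can_export: boolean;
}

// Run every gate; export is blocked if any blocking gate failed.
function runQaGates(c: ExportCandidate, gates: Gate[]): QaReport {
  const results = gates.map((g) => g(c));
  const can_export = results.every((r) => r.passed || r.severity !== "blocking");
  return { results, can_export };
}

// The trafficking sheet is derived from the same report, not a second pass.
function traffickingSheetRows(report: QaReport): string[] {
  return report.results.map((r) => `${r.gate}: ${r.passed ? "pass" : "fail"} (${r.detail})`);
}
```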

6. Artboard navigation is disorienting at scale

  • Failure mode — 12 sizes × 3 variants = 36 artboards, and the user cannot find their place.
  • Concrete moment — Producer scrolls past row 14, loses the artboard they were editing, can no longer recall which size had the issue.
  • Our rule — Artboard grid has fixed spatial layout per template (sizes on Y, variants on X). Active artboard is always pinned in view. Cmd+G opens an artboard picker with type-to-jump. Breadcrumb at the top names the active artboard at all times.

7. Font loading is unpredictable

  • Failure mode — Designer sets a font, it looks correct in the canvas, the export falls back to Arial because the font wasn't embedded.
  • Concrete moment — CD reviews export, sees that Helvetica Neue silently became Arial in three banners, asks "did you even check?"
  • Our rule — Fonts are embedded in the render worker image, not relied on from the OS. The browser preview loads the same font files via the same loader. If a font cannot be loaded, the canvas refuses to render the layer and shows a missing-font badge — never silently substitutes.

8. Undo is missing or unreliable

  • Failure mode — Ctrl+Z works in the canvas but not after generation, or works for most actions but not text edits.
  • Concrete moment — Designer deletes a layer, hits Ctrl+Z, the layer comes back but in the wrong z-index.
  • Our rule — Undo is unlimited, session-scoped, and consistent across every editable surface (canvas, review grid edits, property panel). It works in text edit mode. Behavior is specified in INTERACTION_STANDARDS.md and is treated as a correctness property, not a feature.

9. The canvas fights you with bad snapping

  • Failure mode — Snapping helps 80% of the time and betrays you 20% — snaps to the wrong guide, can't be overridden mid-drag.
  • Concrete moment — Designer is trying to place an element 6px from the safe zone; canvas snaps to 8px because of a smart guide they can't see.
  • Our rule — Snapping is opt-in by gesture, not always-on. Hold to disable snapping mid-drag. Snap targets are visible before they engage. Snap distance is fixed at 4px.

10. Copy/paste behavior is inconsistent

  • Failure mode — Paste lands in the wrong place, wrong artboard, or strips styles.
  • Concrete moment — Designer copies a styled text layer from the 300x250, pastes onto the 728x90, gets a layer at (0,0) with default font.
  • Our rule — Paste rules are codified in INTERACTION_STANDARDS.md. Paste goes to the active artboard at the source layer's relative coordinates, preserving every style property. Cmd+Shift+V pastes in place.

11. Preview doesn't match export

  • Failure mode — What you see in the tool is not what comes out of the export pipeline.
  • Concrete moment — Headline wraps to two lines in the canvas, one line in the rendered HTML5; CD signed off on the wrong layout.
  • Our rule — Browser preview and Playwright render use the same Dropflow WASM module to compute layout. Resolved coordinates are baked into the spec at generation time. The render worker treats the spec as authoritative, not as a hint. This is the keystone (CLAUDE.md "The Text Group System").

12. The AI feels like a black box

  • Failure mode — Generation completes, output appears, the user cannot tell why this copy or this asset.
  • Concrete moment — Producer sees a headline they don't like. Was it the brand voice profile? The character limit? The brief tone field? No way to know.
  • Our rule — BannerSpec.ai_reasoning is populated by the orchestration service and rendered as a side panel in the review UI. It explains copy rationale, asset selection, variant selection, and animation choice in plain language. The panel is present by default, not hidden behind a toggle.

13. Generation is a blocking spinner

  • Failure mode — User clicks Generate, stares at a spinner for 60 seconds, has no idea if it's working or stuck.
  • Concrete moment — Producer waits, refreshes, waits again, assumes broken, starts over — discovers the original run finished and is now lost.
  • Our rule — Generation is async. The producer can keep working on other artboards or campaigns. Progress is specific ("Generating copy for 300×250…", "Validating character counts…"), not a generic spinner. Banners populate in the grid as they complete.

14. Character limit discoverability is poor

  • Failure mode — The user does not know there's a character limit until copy is rejected, or doesn't know what it is.
  • Concrete moment — Designer enters preview copy that fits at 28px, exports, finds the AI-generated 50-character version overflows because no one mentioned a limit existed.
  • Our rule — Character limits are set at template design time with a live visual simulator (ghost box of max content volume per artboard). Limits are surfaced in every UI that touches copy: template builder, review grid, AI reasoning panel. The AI validator rejects copy over limit and retries once before flagging.

15. Brand-rule violations slip through

  • Failure mode — AI generates copy that violates brand voice rules (banned words, exclamation marks for a no-exclamation brand, disallowed superlatives).
  • Concrete moment — High-fashion client opens the review, finds "AMAZING DEALS!!" in three banners; brand voice is "restrained, declarative, no exclamation marks."
  • Our rule — CopyConstraints (banned words, allow_exclamation_marks, allow_superlatives) are enforced programmatically after generation, not via prompt-only. Violations trigger a retry; second violation flags the banner for human attention with the violated rule named.
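
A rough sketch of post-generation enforcement with one retry. The `banned_words` field name, the superlative word list, and the synchronous `generate` callback are assumptions; the real generate agent calls the model asynchronously.

```typescript
// Field names follow the rule above; banned_words is an assumed spelling.
interface CopyConstraints {
  banned_words: string[];
  allow_exclamation_marks: boolean;
  allow_superlatives: boolean;
}

// Hypothetical word list; a real brand profile would supply its own.
const SUPERLATIVES = ["best", "greatest", "finest"];

// Returns the name of every violated rule, or an empty list if the copy is clean.
function findViolations(copy: string, c: CopyConstraints): string[] {
  const words: string[] = copy.toLowerCase().match(/[a-z']+/g) ?? [];
  const violations: string[] = [];
  for (const banned of c.banned_words) {
    if (words.indexOf(banned.toLowerCase()) !== -1) violations.push(`banned_word:${banned}`);
  }
  if (!c.allow_exclamation_marks && copy.indexOf("!") !== -1) {
    violations.push("allow_exclamation_marks");
  }
  if (!c.allow_superlatives && words.some((w) => SUPERLATIVES.indexOf(w) !== -1)) {
    violations.push("allow_superlatives");
  }
  return violations;
}

type Outcome =
  | { status: "ok"; copy: string }
  | { status: "flagged"; copy: string; violations: string[] };

// Two attempts in total: the first generation plus one retry.
function generateWithEnforcement(generate: (attempt: number) => string, c: CopyConstraints): Outcome {
  let copy = "";
  let violations: string[] = [];
  for (let attempt = 1; attempt <= 2; attempt++) {
    copy = generate(attempt);
    violations = findViolations(copy, c);
    if (violations.length === 0) return { status: "ok", copy };
  }
  return { status: "flagged", copy, violations };
}
```

The `violations` list on a flagged outcome is what lets the review UI name the violated rule instead of showing a generic failure.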

16. Click tags are subtly wrong

  • Failure mode — Click tag is implemented for the wrong ad server, or the wrong variable name, or the click area doesn't cover the banner.
  • Concrete moment — Trafficker uploads to CM360, clicks don't track. Discovered a week into the flight.
  • Our rule — Click tag implementation is owned by the ad server profile (AdServerProfile.click_tag_implementation). The export pipeline emits the correct pattern for that profile. The click area is computed to cover the full artboard minus any explicitly-exempt regions. A QA gate verifies the click tag exists and is wired before zip composition.

17. Weight calculation surprises at export

  • Failure mode — Banner is built, exported, found to be 230KB on a 150KB profile, designer has no idea what to remove.
  • Concrete moment — Designer used a 2.3MB hero image, didn't notice, finds out at export and has to redo the asset selection across 12 artboards.
  • Our rule — Weight is estimated continuously in the canvas (ArtboardSpec.estimated_weight_kb) using asset file sizes + GSAP-exempt scoring. The header bar shows current weight vs. profile limit. Weight overrun is a soft warning during edit, a blocking gate at export.
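
A sketch of how the soft-warning versus blocking-gate split might look. Treating "GSAP-exempt" as a per-file `exempt` flag is an assumption, as are the type and function names; only `estimated_weight_kb` is named in the rule.

```typescript
// Hypothetical file record; "exempt" models files the profile does not count
// (for example a CDN-hosted GSAP build). This reading is an assumption.
interface WeightedFile {
  name: string;
  size_kb: number;
  exempt: boolean;
}

// Sum of all non-exempt file sizes.
function estimateWeightKb(files: WeightedFile[]): number {
  return files.filter((f) => !f.exempt).reduce((sum, f) => sum + f.size_kb, 0);
}

type Phase = "edit" | "export";
type WeightStatus = "ok" | "warning" | "blocked";

// Overrun is a soft warning while editing and a blocking gate at export.
function weightStatus(estimated_weight_kb: number, limit_kb: number, phase: Phase): WeightStatus {
  if (estimated_weight_kb <= limit_kb) return "ok";
  return phase === "edit" ? "warning" : "blocked";
}
```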

18. Asset version drift

  • Failure mode — An asset is replaced in the library; campaigns built against the old version silently use the new one or break.
  • Concrete moment — Marketing updates the hero shot, three live campaigns now show a different product, no one was notified.
  • Our rule — Assets are content-addressed. BannerSpec references asset_id + a content hash. Replacing an asset creates a new version; existing campaigns keep their pinned hash until explicitly updated. Updating triggers the same change-notification flow as template edits (Failure 4).
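
To make the pinning behaviour concrete, here is a small in-memory sketch. The `AssetLibrary` class, its method names, and the hash strings are hypothetical; the rule above only commits to `asset_id` plus a content hash.

```typescript
// Hypothetical in-memory library; hashes are supplied by the caller.
interface AssetVersion {
  asset_id: string;
  content_hash: string;
  uri: string;
}

interface AssetRef {
  asset_id: string;
  content_hash: string;
}

class AssetLibrary {
  private versions: AssetVersion[] = [];

  // Replacing an asset adds a version; earlier versions are never overwritten.
  publish(version: AssetVersion): void {
    this.versions.push(version);
  }

  // Resolves exactly the pinned version, regardless of later uploads.
  resolve(ref: AssetRef): AssetVersion | undefined {
    return this.versions.find(
      (v) => v.asset_id === ref.asset_id && v.content_hash === ref.content_hash
    );
  }

  // Most recently published version, used only when a campaign opts in to update.
  latest(asset_id: string): AssetVersion | undefined {
    return this.versions.filter((v) => v.asset_id === asset_id).pop();
  }
}
```

A campaign that wants the new hero shot would swap its `AssetRef` for the hash of `latest(...)`, which is the point where the change-notification flow would fire.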

19. Confirmation dialogs everywhere

  • Failure mode — Tool asks "are you sure?" for routine actions, training users to click Yes reflexively, defeating the purpose.
  • Concrete moment — Designer deletes a layer, dismisses confirmation; deletes another, dismisses; deletes the wrong one, dismisses confirmation; can't recover.
  • Our rule — No confirmation dialogs for destructive actions that undo can reverse. Undo is the safety net. Confirmations only appear for irreversible cross-system actions (publishing to ad server, archiving a campaign).

20. Save state anxiety

  • Failure mode — "Unsaved changes" indicator, save buttons, dirty state — for a tool designers leave open all day.
  • Concrete moment — Browser tab crashes; designer hadn't saved in 40 minutes; everything is lost.
  • Our rule — Auto-save on every state change. The template builder has no save button. Version history is the safety net, not user discipline.

21. AI tries to be a designer

  • Failure mode — AI moves elements, changes layouts, picks assets "creatively," producing surprises the designer must clean up.
  • Concrete moment — AI decides the headline would "look better" repositioned, breaks the brand template, every banner now requires manual correction.
  • Our rule — Positioning verbatim from CLAUDE.md and confirmed by the stakeholder meeting: AI manages the process, not the design. AI generates copy and routes data into slots. Asset selection is a deterministic metadata query against variant group selection rules — AI never "chooses" an asset creatively.

22. Cross-artboard editing is unclear

  • Failure mode — Edit on the master artboard — does it propagate? Edit on a child — does it stick? No one can tell.
  • Concrete moment — Designer fixes a typo on the 300x250, exports, finds the 728x90 still has the typo because they didn't realize edits don't propagate.
  • Our rule — Edits on the master artboard propagate to children that haven't been independently overridden. Each child shows a "synced with master" indicator that flips to "diverged" if independently edited. The propagation behavior is shown in the UI before the edit commits.
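
The propagation algorithm is not yet specified in ARCHITECTURE.md, so the following is only one possible reading: divergence is tracked per field on each child, and a child counts as "diverged" as soon as any field has been edited independently. All names here are hypothetical.

```typescript
// Hypothetical child record; per-field divergence tracking is an assumption.
interface ChildArtboard {
  size: string;
  fields: Record<string, string>;
  diverged_fields: string[]; // field paths edited directly on this child
}

type SyncState = "synced with master" | "diverged";

function syncState(child: ChildArtboard): SyncState {
  return child.diverged_fields.length === 0 ? "synced with master" : "diverged";
}

// Shown in the UI before the edit commits: which children will change.
function previewMasterEdit(
  children: ChildArtboard[],
  field_path: string
): { will_update: string[]; will_skip: string[] } {
  const will_update: string[] = [];
  const will_skip: string[] = [];
  for (const child of children) {
    const target = child.diverged_fields.indexOf(field_path) === -1 ? will_update : will_skip;
    target.push(child.size);
  }
  return { will_update, will_skip };
}

// Applies the master edit to every child that has not diverged on this field.
function commitMasterEdit(children: ChildArtboard[], field_path: string, value: string): ChildArtboard[] {
  return children.map((child) =>
    child.diverged_fields.indexOf(field_path) === -1
      ? { ...child, fields: { ...child.fields, [field_path]: value } }
      : child
  );
}
```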

23. Async generation feedback is too coarse

  • Failure mode — Generation status is binary: "generating" or "done." No insight into which stage failed or why.
  • Concrete moment — One banner out of 12 fails. Producer sees 11 done and one stuck. No way to know if it's retrying, dead, or rate-limited.
  • Our rule — Each banner shows its current pipeline stage (extract → generate → route → assemble) and its last AI call status. Failure includes the failed stage, the reason, and a one-click retry that doesn't re-run the successful stages.
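
A sketch of a retry that resumes from the failed stage. The stage names come from the rule above; the `BannerRun` record and the synchronous `StageRunner` callback are simplifying assumptions (the real agents are asynchronous).

```typescript
const STAGES = ["extract", "generate", "route", "assemble"] as const;
type Stage = (typeof STAGES)[number];

// Hypothetical per-banner run record.
interface BannerRun {
  completed: Stage[];
  failure?: { stage: Stage; reason: string };
}

// Runs one stage and throws on failure. Synchronous here for brevity.
type StageRunner = (stage: Stage) => void;

// Runs every stage not yet completed, stopping at the first failure.
// Passing a failed run back in retries without re-running successful stages.
function resumePipeline(run: BannerRun, runStage: StageRunner): BannerRun {
  const completed = [...run.completed];
  for (const stage of STAGES) {
    if (completed.indexOf(stage) !== -1) continue; // already succeeded
    try {
      runStage(stage);
      completed.push(stage);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      return { completed, failure: { stage, reason } };
    }
  }
  return { completed };
}
```

The returned `failure.stage` and `failure.reason` are the two values the grid would show next to the one-click retry.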

24. Approval workflow loses context

  • Failure mode — CD approves one version, producer pushes a later version, no one knows which version is the approved one.
  • Concrete moment — Trafficker exports the active version, ships it, CD says "that's not what I approved."
  • Our rule — Approvals are bound to a specific version_id, not the campaign. Export is offered against the approved version by default. Exporting a non-approved version requires an explicit override and is logged.
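
As an illustration of approvals bound to a `version_id`, the sketch below gates export on the approved version and logs any override. The request shape, the `override_reason` field, and the audit log format are assumptions.

```typescript
// Hypothetical shapes; only version_id is named in the rule above.
interface Approval {
  version_id: string;
  approved_by: string;
}

interface ExportRequest {
  version_id: string;
  override_reason?: string;
}

type ExportDecision =
  | { allowed: true; overridden: boolean }
  | { allowed: false; reason: string };

// Approved version exports freely; anything else needs an explicit, logged override.
function decideExport(
  req: ExportRequest,
  approval: Approval | undefined,
  audit_log: string[]
): ExportDecision {
  if (approval !== undefined && approval.version_id === req.version_id) {
    return { allowed: true, overridden: false };
  }
  if (req.override_reason === undefined || req.override_reason.trim() === "") {
    return { allowed: false, reason: `version ${req.version_id} is not the approved version` };
  }
  audit_log.push(`override export of ${req.version_id}: ${req.override_reason}`);
  return { allowed: true, overridden: true };
}
```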

25. Conflict resolution UI is buried

  • Failure mode — Conflicts exist (override vs. new generation) but the UI surfaces them in a list the user has to find.
  • Concrete moment — Producer regenerates, doesn't notice the conflicts panel, exports anyway with the wrong values.
  • Our rule — Unresolved conflicts block export. The review grid surfaces conflicted banners with a distinct state (yellow border + conflict count). Resolution is inline per banner, not a separate screen. (Conflict resolution UX in detail: V2 — flagged in ANTI_PATTERNS.md.)

26. Touchpad gestures conflict with canvas

  • Failure mode — Two-finger scroll pans the page instead of the canvas; pinch zooms the browser instead of the artboard.
  • Concrete moment — Designer on a MacBook zooms in on a small element, the entire app zooms, layout breaks.
  • Our rule — Canvas captures pinch and two-finger pan gestures when the cursor is over it. Browser zoom is disabled inside the editor. Touchpad behavior is specified in INTERACTION_STANDARDS.md.

27. The review grid feels like a spreadsheet

  • Failure mode — Reviewing 36 banners is a data-entry experience: small thumbnails, dense controls, no calm.
  • Concrete moment — CD opens the review, sees a wall of postage stamps with dropdowns, hands it back to the producer.
  • Our rule — Review grid is designed for judgment, not data entry. Banners are rendered at usable size with calm spacing. Controls are recessive until an artboard is selected. The "Feels Like" benchmark from the source document: "The review interface should feel like a lightbox."

28. Animation preview is separate from layout preview

  • Failure mode — Static preview shows layout; you have to enter a "play" mode to see animation; they look different.
  • Concrete moment — Designer approves a static layout, hits play, finds the headline fades in late and overlaps the CTA.
  • Our rule — Animation runs in place on the canvas. There is no separate preview mode. A scrubber controls timeline position. Looped playback can be toggled but does not require entering a different state.

29. Re-running on a stale brief

  • Failure mode — Brief was updated; previous generation used the old brief; UI doesn't make clear which brief produced which version.
  • Concrete moment — Producer regenerates after updating the offer end date; banners still show the old date because the regeneration silently used a cached brief.
  • Our rule — Each CampaignVersion records the brief snapshot used to generate it. Editing the brief invalidates the active generation and surfaces a "regenerate to apply" prompt. Stale versions are clearly marked.
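
A minimal sketch of brief snapshots and staleness detection. Modelling the brief as a flat string map and comparing it field by field are assumptions; `CampaignVersion` is named in the rule, but its fields here are hypothetical.

```typescript
// Assumption: the brief is a flat map of string fields.
type Brief = Record<string, string>;

interface CampaignVersion {
  version_id: string;
  brief_snapshot: Brief;
}

// Copy the brief at generation time so later edits cannot alter the record.
function createVersion(version_id: string, brief: Brief): CampaignVersion {
  return { version_id, brief_snapshot: { ...brief } };
}

// A version is stale when the current brief differs from its snapshot.
function isStale(version: CampaignVersion, current: Brief): boolean {
  const currentKeys = Object.keys(current);
  const snapshotKeys = Object.keys(version.brief_snapshot);
  if (currentKeys.length !== snapshotKeys.length) return true;
  return currentKeys.some((k) => version.brief_snapshot[k] !== current[k]);
}
```

When `isStale` returns true, the UI would mark the version and surface the "regenerate to apply" prompt.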

30. No way to compare versions

  • Failure mode — Version history exists but the user cannot see what changed between version 3 and version 4.
  • Concrete moment — CD says "go back to the version with the green hero" — producer has no way to scan versions visually.
  • Our rule — Version history is a visual timeline of thumbnails per banner. Hovering shows the diff (copy changes, override author, regenerate reason). Restoring a prior version is a single action and creates a new version (no destructive rollback).
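
Non-destructive restore can be sketched as an append to the history. The `BannerVersion` shape, the numeric ids, and the `restored_from` field are hypothetical.

```typescript
// Hypothetical version record with numeric, increasing ids.
interface BannerVersion {
  version_id: number;
  fields: Record<string, string>;
  restored_from?: number;
}

// Restoring appends a copy of the target as a new version; nothing is removed.
function restoreVersion(history: BannerVersion[], target_id: number): BannerVersion[] {
  const target = history.find((v) => v.version_id === target_id);
  if (target === undefined) throw new Error(`unknown version ${target_id}`);
  const next_id = history.reduce((max, v) => Math.max(max, v.version_id), 0) + 1;
  const restored: BannerVersion = {
    version_id: next_id,
    fields: { ...target.fields },
    restored_from: target_id,
  };
  return [...history, restored];
}
```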

Conflicts to resolve before build

  • Conflict-resolution UX scope. Failure 25 names conflict resolution as a slice-relevant concern (block export on conflict). VERTICAL_SLICE.md explicitly excludes conflict resolution from the slice. Resolution: in the slice, the export path simply does not encounter conflicts (no human edits → no overrides → no conflicts). The blocking rule applies from V1 forward, not in the slice. This is a scope reconciliation, not a contradiction.
  • Cross-artboard edit propagation (Failure 22). ARCHITECTURE.md Part 4 says "one artboard is the design master" (Artboard.master: boolean) but does not specify the propagation algorithm or the divergence state. The propagation rule above is a UX commitment that must be matched by an architectural decision (either documented as an algorithm in ARCHITECTURE.md Part 5 or added to the override-merge logic in Part 7). Flag before V1 build of the template builder.
  • Asset versioning (Failure 18). ARCHITECTURE.md Part 4 defines Asset with no version field and no content hash. Pinning campaigns to a specific asset version requires either an asset_version_id on the asset or a content hash on the ResolvedLayer. Choose before the asset library is built.