diff --git a/ANIMATION_V1.md b/ANIMATION_V1.md new file mode 100644 index 0000000..7fed899 --- /dev/null +++ b/ANIMATION_V1.md @@ -0,0 +1,822 @@ +# ANIMATION_V1.md + +> **Status:** V1 design specification. Not implemented in the vertical slice. +> The slice ships three preset animations (`fade_in`, `hold`, `fade_out`) as a +> forward-pointer to this system. See SLICE_DEVIATIONS.md for the deltas. +> +> This document supersedes the animation discussion in ARCHITECTURE.md. + +The animation system is the product. Without it this is a static-image +templating tool, and there is no shortage of those. Every architectural +decision in this document is downstream of one premise: **a motion-design-led +creative team must be able to adopt this platform without feeling they have +accepted reduced expressive range.** + +This document is prescriptive. It specifies the JSON schema, the preset +library, the easing matrix, the mask system, the per-character animation +contract, the bounding-box math, and the QA gates that enforce all of the +above. It is meant to be implementable as written. + +--- + +## 1. Goals and Constraints + +### Goals + +1. **Designer-authored, AI-scaled.** Designers author templates with named + animation presets per layer. The AI orchestrator composes presets into a + per-banner timeline using a brand-voice-aware rationale, and never invents + new motion. +2. **Two runtimes, one contract.** A single JSON schema drives both the Konva + browser preview and the Playwright-headless final render. What the + reviewer sees is what ships. +3. **Production-grade motion vocabulary.** Translation, scale, rotation, + opacity. Masks (geometric and image). Per-character text animation. All + four transforms with industry-idiomatic easing. +4. **Composable, not procedural.** A timeline is an ordered list of blocks. + Each block applies one preset to one layer. The AI's job is to choose + presets and order blocks — never to emit raw keyframes. 
+ +### Hard constraints (these define the rails) + +These come from the Round 1 research pass and are non-negotiable. Every +primitive in this document is designed to fit inside them. + +- **Chrome Heavy Ad Intervention.** An ad is unloaded if it exceeds: + - 4 MB cumulative network bandwidth + - 15 s of main-thread CPU within any rolling 30 s window + - 60 s of total main-thread CPU over the page lifetime +- **Composite properties only.** Animations use `transform` (translate, + scale, rotate) and `opacity` exclusively. Animating `width`, `height`, + `top`, `left`, `margin`, or `letter-spacing` forces layout recalculation + and burns the CPU budget. The schema cannot express these. +- **Duration.** 15 s default, 30 s hard ceiling (CM360). Loops global, + 3× maximum, must fit inside the duration ceiling. +- **Initial load weight.** 150 KB zip cap (CM360 / IAB LEAN). Polite load + may add up to 2.2 MB. +- **CDN-hosted runtime.** GSAP loads from `s0.2mdn.net/ads/studio/cached_libs/` + and does not count against the 150 KB cap. +- **GSAP free tier only.** No Club GreenSock plugins. SplitText is replaced + by the MIT-licensed SplitType library. + +### Out of scope for V1 + +- Keyframe authoring UI. Designers compose presets in template code; no + visual timeline editor. +- Custom easing curves. The GSAP stock catalog is the V1 surface. +- Lottie import or export. We do not ingest After Effects work. +- Interaction triggers (hover, click-to-play state changes). Timelines are + time-anchored only. +- Audio, video, expandable, or rich-media formats. + +--- + +## 2. The Schema + +### 2.1 Why a custom schema (and not Lottie) + +Lottie is the industry standard for vector animation interchange. It is also +the wrong fit for V1. The Round 2 research pass landed on a custom schema for +three reasons, all load-bearing: + +1. 
**AI authoring.** Lottie's bezier-tangent representation (`i.x`, `o.y` + floating-point arrays) forces the orchestrator to emit cubic-bezier math + instead of semantic strings like `"power2.out"`. This is the exact + failure mode that hallucinations love. +2. **Diff legibility.** A version diff that reads + `"ease": "power2.out", "preset": "scale_pop"` is auditable. A diff that + reads `"i": {"x": [0.25], "y": [0.1]}` is not. The version service + depends on humans being able to read diffs. +3. **Composite-only enforcement.** Lottie intrinsically mixes composite + (`p`, `s`, `r`, `o`) with non-composite (`fc`, `sw`, vector paths, + mattes, expressions). Stripping the non-composite half at export + time is a translation layer we would have to write, maintain, and + defend against drift. + +The schema described below is inspired by Lottie's declarative timeline +shape, but every field maps 1:1 to a GSAP call. The orchestrator produces +JSON; the runtime hands it to GSAP without transformation. + +### 2.2 Top-level shape + +```jsonc +{ + "version": "1.0.0", + "meta": { + "duration": 15.0, // seconds, ≤ 30 + "loops": 1, // 1 = play once, 2 = play+repeat, 3 = play+repeat+repeat. Max 3. + "fps_target": 60 + }, + "blocks": [ + // one entry per animation event, ordered by `start` + ] +} +``` + +`blocks` is ordered, time-anchored, and never overlaps in semantics: each +block describes one preset applied to one layer over one interval. Multiple +blocks may run simultaneously (different layers animating in parallel is +the common case), but the timeline reads top-to-bottom in `start` order. + +### 2.3 Block shape + +```jsonc +{ + "id": "block_hero_in", // stable id for diffs and overrides + "layer_id": "hero_image", // references BannerSpec layer + "preset": "scale_up_fade", // see Section 3 + "start": 0.0, // seconds, relative to timeline 0 + "duration": 0.8, // seconds, overrides preset default + "ease": "power2.out", // overrides preset default. Optional. 
+ "stagger": null, // see Section 6 for per-character blocks + "mask_ref": null // see Section 5 for animated masks +} +``` + +Three rules govern the block: + +- **`preset` is required.** No anonymous keyframes. If a designer needs + a motion that isn't in the preset library, the answer is to add it to + the library, not to inline keyframes. +- **`duration` and `ease` are optional overrides.** The preset's defaults + are the right answer 95% of the time; the override is for the 5% where + a designer has a specific reason. +- **`start` is absolute, not relative.** Block N+1 does not implicitly + begin when block N ends. This sounds verbose, but it makes the diff + trivial: editing block 3's duration doesn't ripple into block 4's + start time. + +### 2.4 Worked example + +A 300×250 banner with hero image, headline, and CTA. Hero scales in, +headline reveals per-character, CTA pops: + +```jsonc +{ + "version": "1.0.0", + "meta": { "duration": 4.0, "loops": 1, "fps_target": 60 }, + "blocks": [ + { + "id": "hero_in", + "layer_id": "hero", + "preset": "scale_up_fade", + "start": 0.0, + "duration": 0.8 + }, + { + "id": "headline_in", + "layer_id": "headline", + "preset": "fade_up_chars", + "start": 0.5, + "duration": 0.6, + "stagger": 0.02 + }, + { + "id": "cta_in", + "layer_id": "cta", + "preset": "scale_pop", + "start": 1.4, + "duration": 0.6 + }, + { + "id": "cta_pulse", + "layer_id": "cta", + "preset": "pulse_gentle", + "start": 2.2, + "duration": 1.5 + } + ] +} +``` + +A reviewer reading this diff knows exactly what changed. The orchestrator +emitting this JSON has four lookups and three integer-ish numbers to +produce per block. The runtime translates each block to one `gsap.fromTo()` +call against the preset's defined start and end states. + +### 2.5 Loops + +`meta.loops` is a global counter. `loops: 3` plays the entire timeline +three times; the cumulative duration must fit inside the IAB / CM360 +ceiling (3 × `meta.duration` ≤ 30 s). 
Per-block loops are not supported +in V1 — they invite nested-infinite-loop bugs that an AI generator will +ship with surprising regularity. + +For "always-on" effects that read as loops (the gentle pulse on a CTA, +the float on a product image), use the preset's built-in yoyo (Section 3). +The preset, not the meta loop counter, owns that behavior. + +### 2.6 Where the schema lives in the BannerSpec + +The animation timeline attaches to each `Artboard`: + +```ts +interface Artboard { + artboard_id: string; + width: number; + height: number; + layers: ResolvedLayer[]; + animation: AnimationTimeline; // ← this document +} +``` + +Each artboard has its own timeline because animation choices vary per size +(a 728×90 leaderboard cannot afford the same staggered entrance as a +300×600 half-page). The orchestrator emits one timeline per size as part +of its per-size copy decision. Rationale lands in the existing +`ai_reasoning.animation_rationale` field shared across sizes. + +--- + +## 3. The Preset Library + +25 named presets, organized by category. Every preset compiles to one or +more GSAP `fromTo` calls against composite properties only. Every preset +defines a start state, an end state, a default ease, and a default duration. +Designers reference presets by name; the AI orchestrator selects by name. + +The selection criteria for inclusion: + +- Maps to composite transforms or opacity only. +- Renders identically in Konva and in headless Chromium. +- Is a pattern that appears in production-quality display advertising, + not just motion-design tutorial reels. +- Does not require a Club GreenSock plugin. + +### 3.1 Entrance presets + +| Preset | Start | End | Ease | Duration | Use | +|---|---|---|---|---|---| +| `fade_in` | opacity 0 | opacity 1 | `power1.inOut` | 0.8 s | Backgrounds, disclaimers, logos. | +| `slide_in_left` | x −100%, op 0 | x 0, op 1 | `power2.out` | 0.8 s | Hero imagery, body copy. 
| +| `slide_in_right` | x 100%, op 0 | x 0, op 1 | `power2.out` | 0.8 s | Side-panel reveals. | +| `slide_in_up` | y 100%, op 0 | y 0, op 1 | `power2.out` | 0.8 s | CTAs, bottom-anchored copy. | +| `slide_in_down` | y −100%, op 0 | y 0, op 1 | `power2.out` | 0.8 s | Top-anchored headers, badges. | +| `scale_up_fade` | scale 0.8, op 0 | scale 1, op 1 | `power2.out` | 0.8 s | Hero products, centered logos. | +| `scale_down_fade` | scale 1.2, op 0 | scale 1, op 1 | `power2.out` | 1.0 s | Lifestyle backgrounds settling in. | +| `scale_pop` | scale 0.5, op 0 | scale 1, op 1 | `back.out(1.7)` | 0.6 s | Badges, CTA buttons, price circles. | + +### 3.2 Exit presets + +| Preset | Start | End | Ease | Duration | Use | +|---|---|---|---|---|---| +| `fade_out` | opacity 1 | opacity 0 | `power1.inOut` | 0.6 s | Scene transitions, legacy copy. | +| `slide_out_left` | x 0, op 1 | x −100%, op 0 | `power2.in` | 0.6 s | Sweep clearing the frame. | +| `slide_out_right` | x 0, op 1 | x 100%, op 0 | `power2.in` | 0.6 s | Sweep clearing the frame. | + +### 3.3 Emphasis presets + +These are yoyo presets — they animate from base state to peak state and +return. The block's `duration` field is the full there-and-back time. + +| Preset | State change | Ease | Duration | Use | +|---|---|---|---|---| +| `pulse_gentle` | scale 1 ↔ 1.05 | `sine.inOut` | 1.5 s | Sustained CTA attention. | +| `pulse_strong` | scale 1 ↔ 1.15 | `power2.inOut` | 0.8 s | Urgent promotional badges. | +| `float_vertical` | y 0 ↔ −10 px | `sine.inOut` | 2.0 s | Floating product imagery. | + +### 3.4 Typography presets + +All typography presets require text splitting. The `stagger` field on the +block controls the interval between consecutive characters/words/lines. +Splitting happens at runtime via SplitType — see Section 6. + +| Preset | Start | End | Ease | Stagger | Split | Use | +|---|---|---|---|---|---|---| +| `fade_up_chars` | y 20, op 0 | y 0, op 1 | `power2.out` | 0.02 s | chars | Premium headlines. 
| +| `fade_up_words` | y 20, op 0 | y 0, op 1 | `power2.out` | 0.04 s | words | Subheads, longer copy. | +| `typewriter` | op 0 | op 1 | `steps(1)` | 0.04 s | chars | Tech, narrative, informative. | +| `scramble_chars` | text random | text final | `linear` | 0.03 s | chars | Cyber, high-tech promos. | +| `scale_pop_chars` | scale 0, op 0 | scale 1, op 1 | `back.out(1.5)` | 0.04 s | chars | Bold, energetic typography. | + +### 3.5 Mask presets + +Masks animate the visibility of a layer through a moving clip shape. See +Section 5 for the full mask system; these presets reference it. + +| Preset | Mask shape | Animates | Ease | Duration | Use | +|---|---|---|---|---|---| +| `mask_wipe_right` | rectangle | clip 0% → 100% width | `power2.inOut` | 1.0 s | Revealing new background. | +| `mask_wipe_up` | rectangle | clip 0% → 100% height | `power2.inOut` | 1.0 s | Rising imagery reveal. | +| `mask_circle_out` | circle | r 0 → max | `power3.inOut` | 1.2 s | Cinematic scene transitions. | +| `mask_text_reveal` | layer bbox | y 100% → 0% | `power3.out` | 0.8 s | Text rising from invisible floor. | + +### 3.6 List/stagger presets + +For multi-element layers (carousels, icon rows, bulleted lists). Stagger +applies to the layer's children, not to characters. + +| Preset | Start | End | Ease | Stagger | Use | +|---|---|---|---|---|---| +| `stagger_slide_in` | x −50 px, op 0 | x 0, op 1 | `power2.out` | 0.08 s | Bullets, multi-product rows. | +| `stagger_pop_up` | scale 0.8, op 0 | scale 1, op 1 | `back.out(1.5)` | 0.08 s | Social icons, logo lockups. | + +### 3.7 Explicit exclusions + +Presets in this list are common in tutorial-grade libraries but are not +shipped in V1. The rationale is recorded so the next person who asks +"why not `jello`?" has an answer. + +- **`jello`, `rubberBand`, `wobble`, `tada`, `headShake`, `swing`** — multi-axis + skew/scale combos that read as amateur in premium display. Animate.css + ships them; production-quality ads don't. 
+- **`shake_horizontal` and other shake patterns** — would require GSAP's + `rough` ease, which is excluded in Section 4. Shake patterns read as + rendering errors more often than as deliberate emphasis. +- **`blur_in`** — animates `filter: blur(Npx)`. Filter is a composite-eligible + property in Chromium, but it forces repaint, not just compositor work, + and burns the CPU budget faster than transforms. The use case (premium + reveals) is better served by `scale_down_fade` and `mask_circle_out`. + Blur belongs in video, not display. +- **Drop-shadow and box-shadow animation** — same repaint cost as blur. + If a layer needs a dimensional shadow that "appears," bake the shadow + into a transparent asset and animate that asset's opacity. +- **All Club GreenSock plugins** — `MorphSVG`, `SplitText`, `Physics2D`, + `DrawSVG`, `MotionPath`, `CustomEase`. V1 stays on the free GSAP tier + with SplitType as the SplitText replacement. + +--- + +## 4. Easing + +Easing is the difference between mechanical motion and motion that reads +as designed. The V1 surface is the GSAP stock ease catalog, restricted to +the curves that actually appear in production display advertising. + +### 4.1 Principles + +- **Entrance eases out.** Elements arriving on stage decelerate so the eye + can catch them. +- **Exit eases in.** Elements leaving accelerate away to clear the visual + field. +- **Emphasis uses `sine.inOut` or `power*.inOut`.** Symmetric eases for + symmetric (yoyo) motion. +- **Linear is reserved for typewriters, progress bars, and continuous + panning.** Almost never the right answer for entrances or exits. 
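The principles above can be expressed as a small advisory check. This is a minimal sketch, not part of the specified schema: the names `easeDirection`, `followsPrinciple`, and the `Category` type are hypothetical, and the check deliberately ignores the slow-reveal fades, which default to `power1.inOut` in the preset library.

```typescript
// Hypothetical helper types; the spec does not define these names.
type Direction = "in" | "out" | "inOut" | "none";
type Category = "entrance" | "exit" | "emphasis";

// Extract the direction suffix from a GSAP-style ease string.
// "back.out(1.7)" -> "out", "sine.inOut" -> "inOut", "linear" -> "none".
function easeDirection(ease: string): Direction {
  // Drop a trailing parameter list such as "(1.7)" or "(1, 0.5)".
  const name = ease.replace(/\(.*\)$/, "");
  const suffix = name.split(".")[1];
  if (suffix === "in" || suffix === "out" || suffix === "inOut") {
    return suffix;
  }
  return "none";
}

// True when the ease matches the principle for its category:
// entrances ease out, exits ease in, emphasis is symmetric.
function followsPrinciple(category: Category, ease: string): boolean {
  const expected: Record<Category, Direction> = {
    entrance: "out",
    exit: "in",
    emphasis: "inOut",
  };
  return easeDirection(ease) === expected[category];
}
```

A check like this would most plausibly run as a lint hint in review tooling rather than as an export gate, since a designer override may break the principle on purpose.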
+ +### 4.2 Easing matrix + +| Category | Default | Alt 1 | Alt 2 | +|---|---|---|---| +| Entrance (translation) | `power2.out` | `power3.out` | `expo.out` | +| Entrance (scale) | `power2.out` | `back.out(1.5)` | `elastic.out(1, 0.5)` | +| Exit (translation) | `power2.in` | `power3.in` | `expo.in` | +| Exit (scale) | `power2.in` | `back.in(1.2)` | `power1.in` | +| Emphasis (pulse, float) | `sine.inOut` | `power1.inOut` | `power2.inOut` | +| Slow reveal (fade) | `power1.inOut` | `linear` | `sine.out` | +| Snappy state change | `expo.inOut` | `power4.inOut` | `steps(1)` | +| Per-character stagger | `power2.out` | `back.out(1.2)` | `linear` | +| Mask reveal | `power3.inOut` | `power2.inOut` | `expo.inOut` | + +### 4.3 Excluded eases + +The GSAP catalog includes curves that are rarely or never used in +production display. They are hidden from the V1 surface to keep the +orchestrator's decision space small. + +- **`bounce.in` / `bounce.out`** — gravity-simulation cartoon physics. Reads + as children's brand or casual gaming. `back` covers the use case with + more taste. +- **`slow`** — cinematic speed-ramping. Lingers in the middle of the + transition. Wrong for 15-second display. +- **`rough`** — randomized jitter. Reads as a rendering error. +- **`circ`** — mathematically rigid circular arc. The `power*` family + feels more physical. + +The accessible palette is: `power1`–`power4`, `expo`, `sine`, `back`, +`elastic`, `linear`, `steps`. Each with `.in`, `.out`, `.inOut` variants +where applicable. ~24 distinct eases — enough range for production motion +design, small enough that the AI doesn't get decision fatigue. + +--- + +## 5. Masks + +Masks are first-class in V1. They unlock the "reveal" patterns that read as +premium motion (logo-shaped reveals, circular wipes, text rising from an +invisible floor) and are the difference between "ad" and "ad you remember." + +### 5.1 The mask as a layer property + +A mask is not a standalone timeline layer. 
It is a property on the masked +layer. Co-locating the mask with the layer it masks keeps z-index +relationships implicit and keeps the diff readable. + +```jsonc +{ + "layer_id": "hero_image", + "mask": { + "type": "clip-path", // "clip-path" | "image" + "geometry": "circle", // for clip-path: "rect" | "circle" | "polygon" + "asset_id": null, // for type: "image", references the mask asset + "animation": { + "preset": "mask_circle_out", + "duration": 1.2, + "ease": "power3.inOut" + } + } +} +``` + +Blocks reference an animated mask via `mask_ref` pointing to the +masked layer's id. The block describes the layer's animation; the mask's +animation is internal to the mask object. + +### 5.2 clip-path vs. mask-image: hit-testing decides + +CSS gives us two mask primitives. They are not interchangeable. + +- **`clip-path`** — actually clips the element's geometry. Pointer events + outside the clip do not fire. +- **`mask-image`** — modifies the alpha channel only. The element's + bounding box still receives pointer events, including in the "invisible" + region. + +For animated reveals over interactive elements — which is the most common +case, because CTAs are interactive and CTAs are the most-revealed thing — +**`clip-path` is mandatory.** `mask-image` lets users click an "invisible" +button that's mid-reveal, fires the click handler, and ships the wrong +analytics. This is a real bug we will not have because the schema doesn't +let us write it: `mask.type: "clip-path"` is the default and any geometric +mask animation uses it. + +`mask-image` is reserved for non-interactive layers (background imagery, +decorative elements) where alpha-channel masking is the only way to +achieve the effect (e.g., a brushstroke wipe, a textured reveal). 
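The hit-testing rule above lends itself to a simple validation pass. The sketch below assumes a hypothetical `interactive` flag on the layer, which this document does not define; the `MaskedLayer` shape and the function name `checkMaskHitTesting` are likewise illustrative only.

```typescript
type MaskType = "clip-path" | "image";

// Illustrative shape only: `interactive` is an assumed flag, not a
// field defined by the BannerSpec in this document.
interface MaskedLayer {
  layer_id: string;
  interactive: boolean;
  mask?: { type: MaskType };
}

// Returns an error message when an interactive layer uses an alpha
// mask (mask-image), which would still receive pointer events in its
// invisible region. Returns null when the layer is acceptable.
function checkMaskHitTesting(layer: MaskedLayer): string | null {
  if (layer.mask === undefined) {
    return null; // unmasked layers have nothing to check
  }
  if (layer.interactive && layer.mask.type === "image") {
    return `layer "${layer.layer_id}": interactive layers must use clip-path masks`;
  }
  return null;
}
```

For example, a CTA layer with `mask.type: "image"` would be reported, while the same mask on a decorative background would pass.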
+ +### 5.3 Konva ↔ HTML parity + +Each mask type renders in both runtimes: + +| Mask type | Konva | HTML | +|---|---|---| +| `clip-path` rect | `Group.clipFunc` with a rect path | CSS `clip-path: inset(...)` | +| `clip-path` circle | `Group.clipFunc` with an arc | CSS `clip-path: circle(...)` | +| `clip-path` polygon | `Group.clipFunc` with a polygon | CSS `clip-path: polygon(...)` | +| `image` | `globalCompositeOperation: 'source-in'` on cached group | CSS `mask-image: url(...)` | + +The known parity risk is `mask-image` SVG scaling. Konva masks are pixel- +exact; CSS `mask-size` and `mask-position` can drift if the underlying +layer's box model changes between preview and render. The parity test +suite (Section 9) renders each mask preset in both runtimes and diffs the +output PNG. Drift > 2px on any sampled frame fails the test. + +### 5.4 Animated mask geometry + +The four mask presets in Section 3.5 cover the patterns that show up in +production display: + +- **Linear wipe** (`mask_wipe_right`, `mask_wipe_up`): rectangular + clip-path animating one edge from 0% to 100%. +- **Circular reveal** (`mask_circle_out`): clip-path circle with radius + animating from 0 to a value large enough to cover the layer's + bounding box. +- **Text rising from a floor** (`mask_text_reveal`): clip-path rect + fixed at the layer's bbox; text inside animates `y: 100% → 0%`. The + mask itself doesn't animate — the masked content does. + +Mask geometry that requires designer authorship (logo-shaped masks, +custom polygon shapes) is supported by `mask.geometry: "polygon"` with +a path string supplied by the template, but is not part of any preset. +The orchestrator does not invent mask shapes. + +--- + +## 6. Per-Character Text Animation + +Per-character animation is the difference between "templated copy" and +"designed message." It is also the single most performance-sensitive +primitive in the V1 system. This section is precise because it has to be. 
+ +### 6.1 The split happens at runtime, in the DOM, via SplitType + +For the headless Chromium render (the one that ships), text splitting +happens in the DOM at runtime using SplitType. The reasons are +non-negotiable: + +- **Accessibility.** Screen readers must read the headline as one + coherent string, not as 47 phonetic letters. SplitType supports the + ARIA pattern (see 6.2). Canvas-rendered text destroys accessibility + outright. +- **DOM-based QA.** Ad-server review bots parse the DOM to verify text + content. Canvas text is invisible to them and triggers rejection. +- **Text selection and SEO.** Native browser text selection works on + DOM text. Canvas text does not select. + +For the Konva preview (the one designers and reviewers see), per-character +positioning is computed at spec-resolution time from Dropflow's glyph +positions. Konva renders each character as a `Konva.Text` node positioned to +match Dropflow's output. This preserves the preview-render parity +contract — the preview is positionally identical to what SplitType +produces at runtime, sub-pixel kerning and ligatures aside. + +The division of labor: + +- **Dropflow (spec time, both runtimes):** computes glyph positions used to + drive Konva preview rendering and to validate the assumption that + SplitType will produce the same layout. +- **SplitType (render time, Chromium only):** does the actual DOM split + for the GSAP-driven animation. + +### 6.2 The accessibility contract + +Splitting text into per-character spans destroys its semantic continuity. +The compilation engine must apply the standard ARIA mitigation, and the +schema does not let the designer or the orchestrator forget: + +For every layer where the block's `preset` is in Section 3.4 (typography +presets requiring char-split), the export pipeline must: + +1. Set `aria-label="<original text>"` on the parent layer + element, where the value is the layer's unsplit text. +2. Set `aria-hidden="true"` on every SplitType-generated child span. + +This is automated, not designer-authored. 
A QA gate (Section 9) verifies +that every char-split layer in the exported HTML has the parent label +and the hidden children before the export passes. + +### 6.3 The performance ceiling + +Per-character animation creates one DOM node per character. Every node +becomes its own GPU composite layer when animated via transform. Mobile +browsers have a hard ceiling on simultaneous composite layers, and +breaching it causes jank visible in the final render. + +**The V1 ceiling: 150 simultaneously animated character nodes.** + +This is not a recommendation; it's a QA gate. If a banner's exported +timeline has more than 150 char-split nodes animating in a single +30-second window, the export fails. + +Implications for the orchestrator: + +- A 60-char headline with `fade_up_chars` consumes 60 of the budget. +- A second char-split element (a 40-char subhead with `fade_up_chars`) + brings the total to 100. +- A third char-split element is risky. The orchestrator should prefer + `fade_up_words` (5–10 nodes) for subheads and reserve char-splits + for headlines. + +Body copy and legal disclaimers must never use char-split. The orchestrator +selects `fade_up_words` or `fade_up_lines` for any layer whose resolved +text exceeds 80 characters, regardless of designer preset choice. + +### 6.4 Concrete per-character parameters + +The defaults below are baked into the preset library and should not be +overridden lightly. They reflect production-grade motion design, not +arbitrary timings: + +| Pattern | Stagger | Ease | Distance / scale | Max chars | +|---|---|---|---|---| +| Typewriter | 0.03–0.05 s | `steps(1)` | n/a | 80 | +| Fade-up | 0.02 s | `power2.out` | 15–20 px | 80 | +| Scale-pop | 0.04 s | `back.out(1.5)` | scale 0 → 1 | 50 | +| Scramble | 0.03 s | `linear` | n/a (text mutation) | 40 | + +--- + +## 7. The Animated Bounding Box + +The most consequential and least obvious part of the animation system is +the asset-sizing problem. 
+ +### 7.1 The problem + +A hero image animating with `scale_down_fade` starts at `scale: 1.2` and +ends at `scale: 1.0`. The source image must be sized for the largest +frame — `1.2 × layer_width` — not the final frame. Ship the source at the +final-frame size and the first frame is upscaled, blurry, and visibly +broken. + +The same logic applies to: + +- **Translation** (`slide_in_*`): source must cover the layer's position at + every keyframe, including the offscreen start. +- **Rotation**: a rotated rectangle's axis-aligned bounding box grows with + the angle. A 600×600 image rotated 15° needs a ~735×735 source + (600 × (cos 15° + sin 15°) ≈ 734.8). +- **Compound transforms** (scale + rotate + translate, e.g. an entrance + that combines `scale_up_fade` with a small rotation): the bounding box + is the union of every transformed corner at every sampled time. + +### 7.2 The math + +For a layer with rectangular bounding box, given a transform with +translation `(tx, ty)`, scale `(sx, sy)`, and rotation `θ` at time `t`: + +For each corner `(cx, cy)` of the base rectangle: + +``` +x'(t) = tx(t) + sx(t) · (cx · cos(θ(t)) − cy · sin(θ(t))) +y'(t) = ty(t) + sy(t) · (cx · sin(θ(t)) + cy · cos(θ(t))) +``` + +The layer's axis-aligned bounding box at time `t` is the min/max of +`x'`/`y'` over the four corners. The animation's bounding box is the +union of every per-`t` bounding box. + +Analytical solution of the extrema is computationally expensive and +brittle across compound transforms. V1 uses **discrete sampling**: 30 +samples per second of animated duration, taking the union of all sampled +bounding boxes. At 30 fps over a 1.0-second animation, that's 30 +4-corner evaluations, a negligible cost. Sampling can miss extrema that +fall between samples, so the union may under-approximate the true bbox +by at most one sample's worth of motion, which is well +within tolerance for source-asset sizing. 
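The sampling approach in 7.2 can be sketched as follows. This is an illustration under stated assumptions rather than the `unionBoundingBox()` signature from 7.3: the `Transform` type, the `transformAt` callback, and the function name `sampledUnionBoundingBox` are hypothetical, corners are taken relative to the layer centre (assumed transform origin), angles are in radians, and both endpoints of the interval are sampled.

```typescript
// Hypothetical types for illustration; not the spec's ResolvedLayer
// or AnimationBlock.
interface Transform { tx: number; ty: number; sx: number; sy: number; theta: number }
interface BoundingBox { minX: number; minY: number; maxX: number; maxY: number }

function sampledUnionBoundingBox(
  width: number,
  height: number,
  transformAt: (t: number) => Transform, // transform state at time t (seconds)
  duration: number,
  fps: number = 30
): BoundingBox {
  const hw = width / 2;
  const hh = height / 2;
  // Base rectangle corners, relative to the assumed centre origin.
  const corners: Array<[number, number]> = [[-hw, -hh], [hw, -hh], [hw, hh], [-hw, hh]];
  const steps = Math.max(1, Math.ceil(duration * fps));
  const box: BoundingBox = { minX: Infinity, minY: Infinity, maxX: -Infinity, maxY: -Infinity };

  for (let i = 0; i <= steps; i++) {
    const t = (i / steps) * duration;
    const { tx, ty, sx, sy, theta } = transformAt(t);
    const cos = Math.cos(theta);
    const sin = Math.sin(theta);
    for (const [cx, cy] of corners) {
      // Same corner equations as in 7.2.
      const x = tx + sx * (cx * cos - cy * sin);
      const y = ty + sy * (cx * sin + cy * cos);
      box.minX = Math.min(box.minX, x);
      box.maxX = Math.max(box.maxX, x);
      box.minY = Math.min(box.minY, y);
      box.maxY = Math.max(box.maxY, y);
    }
  }
  return box;
}
```

As a usage example, a 300×250 layer whose scale runs linearly from 1.2 to 1.0 (easing ignored for simplicity) yields a box 360 wide and 300 tall, which is the size the source asset would need to cover.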
+ +### 7.3 `unionBoundingBox()` — one function, many call sites + +A single function in `packages/layout-engine`: + +```ts +function unionBoundingBox( + layer: ResolvedLayer, + blocks: AnimationBlock[], + fps: number = 30 +): BoundingBox; +``` + +It is called from: + +- **Smart asset selection** (`packages/api-lib/asset-selection`): picks + source crop sized for the union bbox, not the layer rect. +- **Render worker** (`packages/render-worker`): fails fast if the + source asset's dimensions are smaller than the union bbox. +- **QA gate** (`packages/qa-gates`): flags upscaled frames before export, + with the field path of the offending layer. + +### 7.4 What changes in the BannerSpec + +A resolved layer gains a `required_source_size` field: + +```ts +interface ResolvedLayer { + // ... existing fields + required_source_size?: { width: number; height: number }; +} +``` + +The orchestrator does not populate this field. The layout engine +computes it after resolving the layer's animation blocks. The render +worker reads it and fails the render if the actual source is smaller. +The asset selector reads it when requesting a crop from the asset +service. + +--- + +## 8. Library and Runtime Mechanics + +### 8.1 GSAP via the Google CDN + +GSAP 3 loads from `https://s0.2mdn.net/ads/studio/cached_libs/gsap_3.9.1_min.js`. +This URL is whitelisted by CM360, which means GSAP's bundle does not count +against the 150 KB initial-load cap. The HTML export injects exactly this +URL into the document head. Local-relative paths or other CDNs (including +unpkg, jsDelivr) cause CM360 to flag the library as a 4th-party call and +count its weight, which reliably fails the 150 KB gate. + +### 8.2 SplitType in place of SplitText + +The MIT-licensed SplitType library (~2 KB minified) replaces GSAP's +Club-only SplitText. SplitType is bundled into the exported HTML, not +loaded from a CDN, because no major ad CDN whitelists it. 
It is small +enough to fit within the 150 KB budget alongside the rest of the banner. +The export pipeline applies the aria-label/aria-hidden contract from 6.2 +in the same pass that calls SplitType. + +### 8.3 No `localStorage`, no `sessionStorage` + +CM360 rejects creatives that reference browser storage APIs. The export +pipeline's static-analysis pass scans the compiled JS bundle for any +reference to `localStorage` or `sessionStorage` and fails the export +if found. None of our code uses these APIs; this gate exists to catch +a third-party dependency drift. + +### 8.4 `will-change: transform` on animated layers + +The export pipeline annotates every layer that has at least one +animation block with `will-change: transform` (and `opacity` if +applicable). This promotes the layer to its own GPU compositor layer +ahead of time, preventing first-frame jank from late-binding the layer +when GSAP starts animating it. + +### 8.5 `prefers-reduced-motion` + +The exported HTML includes a `@media (prefers-reduced-motion: reduce)` +block that snaps every animation to its end state instantly. This serves +two purposes: + +- **Accessibility:** users who have requested reduced motion see the + final composition immediately. +- **Headless PNG capture:** Playwright launches with + `--force-prefers-reduced-motion`, which deterministically forces the + banner to its final-frame state. Capturing the static backup PNG + becomes a single screenshot with no `waitForTimeout`, which removes + the largest source of flakiness in the render pipeline. + +--- + +## 9. QA Gates + +Every gate below runs at export time. An export that fails any gate is +held — the spec is not corrupt, but it's not shippable, and the review +UI flags it for human resolution. + +### 9.1 Schema gates + +- **G1 — Composite-only.** Every block's preset is in the V1 library. + Inline keyframes are not expressible in the schema, so this gate is + enforced by the schema's TypeScript types, not by a runtime check. 
+- **G2 — Duration ceiling.** `meta.duration × meta.loops ≤ 30`. +- **G3 — Block ordering.** Blocks are sorted by `start`. No block's + `start + duration` exceeds `meta.duration`. +- **G4 — Loop count.** `meta.loops` is in `{1, 2, 3}`. + +### 9.2 Performance gates + +- **G5 — Char-split ceiling.** Sum of animated char-split nodes across + all simultaneously-running blocks in any 30-second window ≤ 150. +- **G6 — Weight budget.** Final zipped HTML ≤ 150 KB initial load. +- **G7 — No storage APIs.** Static analysis finds no reference to + `localStorage` or `sessionStorage`. +- **G8 — GSAP via CDN.** The exported HTML loads GSAP from + `s0.2mdn.net`, not from a local path. + +### 9.3 Asset gates + +- **G9 — Source size.** Every layer's actual source asset dimensions are + ≥ the `required_source_size` computed by `unionBoundingBox`. Fail + with the layer id and the deficit. +- **G10 — Crossorigin.** Every image element has `crossorigin="anonymous"` + (required for Konva canvas reads in the preview). + +### 9.4 Accessibility gates + +- **G11 — Aria contract.** Every layer with a char-split preset has + `aria-label` set to the original text on the parent and + `aria-hidden="true"` on every child span. + +### 9.5 Parity gates + +- **G12 — Konva ↔ Playwright pixel diff.** For each preset, the parity + test suite renders a reference banner in both runtimes and compares + PNG output. Per-frame pixel diff > 2 px on any sampled keyframe + fails the build (not the export — this is a development-time gate, + not a per-banner gate). + +--- + +## 10. Open Questions + +These were debated during the research passes and design and are +deferred — either to a later V1 iteration or to V2. Tracking them here +so they don't get lost. + +- **Interaction triggers.** V1 timelines are time-anchored. The Bannerflow + pattern of decoupling animation from text content supports localization + but not user interaction. 
When V2 introduces hover, click-to-expand,
+  or in-banner video, the schema gains an event-trigger model that block
+  references can hook into.
+- **Custom eases.** GSAP's `CustomEase` is Club-only. If a designer ever
+  needs a custom curve, V2 either licenses Club GreenSock or implements
+  cubic-bezier eases via the free GSAP `power.in/out` family with bespoke
+  control points. The schema field is already `string`; this is purely an
+  authoring decision, not a schema change.
+- **Mask shape authoring.** V1 supports rect, circle, and designer-supplied
+  polygons. Logo-shaped masks (SVG paths) work as `mask.geometry: "polygon"`,
+  but there is no authoring UI; designers paste path strings into the
+  template. V2 ships a small editor for this.
+- **Per-block loops.** Not in V1. If a use case for per-block loops
+  emerges (a single icon that pulses 5× while the rest of the timeline
+  plays through once), it's expressible as multiple back-to-back blocks
+  in V1. Native per-block looping is deferred.
+- **The 150-character ceiling on char-split.** This is the conservative
+  number from research. A production benchmarking pass on real hardware
+  (low-end Android, mid-tier iPhone) may push it higher or lower. Treat
+  the current ceiling as a placeholder; gate G5 should be recalibrated
+  once empirical numbers exist.
+
+---
+
+## 11. What This Replaces and What It Defers
+
+This document replaces:
+
+- The animation section of `ARCHITECTURE.md`. The original sketch
+  predates the preset library and the bounding-box concept.
+- The implicit `fade_in / hold / fade_out` preset set in the vertical
+  slice. Those three names are kept (mapped to `fade_in`, no-op,
+  `fade_out`) but every other preset in the slice is replaced by an
+  explicit entry from Section 3.
+
+This document does not specify:
+
+- The timeline authoring UI (V2).
+- The Figma sync path for animation specs (V2+).
+- The trafficking-sheet representation of animation choices (V1, but
+  documented separately in `TRAFFICKING_V1.md` when that exists).
+
+The next implementation step after this document is approved is to
+flesh out `packages/types` with the schema in Section 2.2/2.3 and to
+seed `packages/layout-engine` with `unionBoundingBox()`. Everything
+else (preset library implementation, QA gates, runtime mechanics)
+hangs off those two artifacts.
diff --git a/ANIMATION_V1_DESIGN_DECISIONS.md b/ANIMATION_V1_DESIGN_DECISIONS.md
new file mode 100644
index 0000000..2204816
--- /dev/null
+++ b/ANIMATION_V1_DESIGN_DECISIONS.md
@@ -0,0 +1,44 @@
+# Animation Engine V1 Design Decisions and Architecture Specifications
+
+## Introduction and Architectural Imperative
+
+The transition from static programmatic creative to high-performance, dynamic HTML5 animation requires an architecture that balances the rigorous performance constraints of the browser with the programmatic flexibility required by dynamic creative optimization (DCO). The primary objective of the V1 Animation Engine is to provide a highly structured, semantically readable, and performant pipeline capable of being authored by an artificial intelligence agent and executed flawlessly across dual runtimes: Konva.js for design-time canvas previews and Playwright-driven Chromium for final headless rendering via GSAP 3.
+
+The architecture must strictly adhere to the Chrome Heavy Ad Intervention thresholds: specifically, remaining under 4 MB of network payload, 15 seconds of main-thread CPU usage within any 30-second window, and 60 seconds of total CPU time.
Consequently, all design decisions regarding data serialization, preset definition, easing curves, masking methodologies, and text splitting must inherently favor operations that offload compositing to the Graphics Processing Unit (GPU) while minimizing Document Object Model (DOM) layout recalculations and repaints.
+
+This report provides exhaustive, prescriptive recommendations for the core V1 design decisions, establishing the foundational schema, the curated motion library, the implementation physics, and the specific bounds of execution.
+
+## Section 1: Schema Decision and Specification
+
+To engineer a programmatic animation data structure capable of AI-agent generation, dual-runtime execution, version control via JSON diffing, and human readability, the foundational serialization format must be meticulously selected. The evaluation centers on Lottie JSON, the dotLottie container, and a custom JSON architecture.
+
+### 1.1 Competitive Schema Assessment
+
+Lottie has established itself as the ubiquitous standard for vector animation interchange, fundamentally designed to export complex motion graphics directly from Adobe After Effects via the Bodymovin plugin. It operates as a structured JSON representation of the After Effects render model. However, its architecture is heavily optimized for complex shape layer morphing, intricate path interpolation, and deeply nested pre-compositions. This design philosophy stands in direct opposition to the discrete, composited transform manipulations required by programmatic ad engines governed by strict web performance constraints.
+
+The dotLottie format is an evolutionary optimization of the Lottie ecosystem. It addresses the payload size issue by compressing the Lottie JSON and bundling external assets, such as base64 encoded images and custom fonts, into a single .lottie ZIP container.
While dotLottie significantly reduces file size and improves memory efficiency during transport, it does not alter the underlying Lottie JSON data structure; it merely wraps it. Therefore, the structural limitations and extreme verbosity of the Lottie schema persist entirely unaltered inside the archive.
+
+For the specific use case of an AI agent programmatically generating animations, the Lottie ecosystem presents critical structural failure modes:
+
+- **AI Authoring Complexity and Token Overhead:** Lottie’s schema requires deep nesting and spatial interpolation logic. Generating a simple easing curve requires an AI agent to accurately output multidimensional cubic-bezier coordinates (the i and o vectors) rather than emitting a semantic string like "power2.out". This drastically increases the token payload and the probability of hallucinatory or mathematically invalid easing parameters.
+- **Version Control and Diff Readability:** Lottie outputs are notoriously verbose and mathematically dense. A single basic transform keyframe introduces nested arrays for ks (transform properties), p (position), s (scale), r (rotation), and o (opacity), utilizing ambiguous single-letter keys. A human reviewer attempting to audit an AI-generated JSON diff to understand why a button animation changed would find the Lottie diff functionally impenetrable.
+- **Extraneous Data and Ad Compliance:** The Lottie format is designed to accommodate features that actively trigger the Chrome Heavy Ad Intervention threshold. Features such as layer styles, vector path morphing, complex mathematical expressions, and matte compositing (tt attributes like Alpha or Luma) consume vast amounts of CPU time.
To use Lottie safely in an ad environment, a secondary compiler would be required to actively strip forbidden properties from the payload.
+
+### 1.2 The Lottie Schema Internals
+
+A forensic examination of the Lottie specification reveals the exact shape of its data structures, which underscores its incompatibility with a lightweight, AI-driven GSAP pipeline.
+
+In the Lottie specification, keyframe properties are stored in arrays organized by ascending frame number (t). A standard property animation in Lottie requires traversing the ks (transform) object, which holds distinct sub-objects for anchor point (a), position (p), scale (s), rotation (r), and opacity (o).
+
+A concrete JSON shape for an animated transform property utilizing bezier easing in Lottie (here rotation, r, is the keyframed channel) appears as follows:
+
+    {
+      "ty": 4,
+      "nm": "Hero CTA Button",
+      "ks": {
+        "o": { "a": 0, "k": 100 },
+        "r": { "a": 1, "k": [
+          {
+            "t": 0,
+            "s": ,
+            "e": ,
+            "i": { "x": [0.25], "y": },
+            "o": { "x": [0.25], "y": }
+          },
+          { "t": 30 }
+        ]},
+        "p": { "a": 0, "k": },
+        "s": { "a": 0, "k": }
+      }
+    }
+
+The easing in Lottie is defined mathematically by the i (in tangent) and o (out tangent) objects, which dictate the handles of a cubic-bezier curve. The x property represents the time component mapped between 0 and 1, while the y property represents the value interpolation. For a fluid ease-in-out effect, the tangents are typically shifted toward the center of the grid, represented in the JSON as {"o": {"x": [0.333], "y": }, "i": {"x": [0.667], "y": }}. Expecting an AI agent to reliably output and manipulate these floating-point arrays for standard UI transitions is a misuse of large language model capabilities.
+
+Furthermore, an evaluation of Lottie's text animator system reveals severe friction with modern web DOM practices. The Lottie schema uses a complex TextAnimatorDataProperty that introduces "range selectors," which apply property modifications to subsets of text based on percentages, character indices, or words.
A text animator dictionary dictates a start frame, an end frame, an offset, and an abstract shape parameter such as Square, Ramp Up, or Triangle.
+
+### 1.3 Limitations of Lottie for the V1 Architecture
+
+The Lottie schema fails the V1 constraints on three primary architectural fronts:
+
+- **Baking of Non-Composite Properties:** Lottie natively intertwines composite and non-composite properties. It bakes fill colors (fc), stroke widths (sw), and exact vector paths (v, i, o bezier points for shapes) deeply within its shapes and text arrays. While the V1 GSAP engine is mandated strictly to animate composite CSS properties (transform and opacity) to ensure hardware acceleration, Lottie intrinsically mixes layout, style, and composite data.
+- **Text Animation Friction:** Lottie’s text animation ecosystem relies entirely on index-based or percentage-based range selectors functioning within an isolated canvas or SVG wrapper. This is a fundamentally poor fit for the V1 environment, which utilizes SplitType to divide text into discrete HTML DOM nodes. GSAP animates discrete DOM nodes using highly efficient, array-based staggering algorithms (stagger: 0.05). Lottie, conversely, attempts to simulate DOM nodes by rendering individual glyph paths and mathematically calculating their offsets inside the JSON.
+- **Performance Overhead:** Because Lottie was designed to mirror After Effects, it supports features like Mattes (tt attributes like Alpha and Luma), blending modes (bm), and complex expressions.
If ingested into a web ad, the runtime parser must evaluate all these nodes, which taxes the main thread and introduces significant risk of breaching the 60-second total CPU time limit of Chrome Heavy Ad Intervention.
+
+### 1.4 Schema Recommendation
+
+Given the exhaustive analysis of Lottie's structure, the definitive recommendation for V1 is Option D: A custom JSON schema inspired by Lottie's declarative timeline logic, but engineered specifically for programmatic AI generation and GSAP/Playwright execution.
+
+Adopting a strict subset of Lottie is a half-measure. It would require building a complex translation layer to convert nested bezier objects back into GSAP easing strings, and to map abstract range selectors back into SplitType DOM staggers. By designing a custom schema, the architecture empowers the AI agent to utilize semantic reasoning, emitting concise payloads that map cleanly to the BannerSpec and are natively understood by the GSAP 3 runtime.
+
+The minimal schema shape must strictly define the timeline structure, the target DOM layers, the valid composite properties, and semantic easing strings. The schema leverages an array of discrete animation blocks rather than deeply nested property timelines.
+
+Minimal Schema Definition for V1:
+
+    {
+      "version": "1.0.0",
+      "meta": {
+        "duration": 15.0,
+        "loops": 2,
+        "maxDurationEnforced": true
+      },
+      "timeline":
+    }
+
+This structure achieves several critical goals: it is highly token-efficient for language models, it maps 1:1 with GSAP's gsap.fromTo() signature, it enforces the separation of composite properties (y, scale, opacity) from layout properties, and the JSON diffs are immediately legible to a human reviewer.
+
+### 1.5 Authoring Workflow for V1
+
+For the V1 architecture, the path of least resistance and highest reliability is for this custom schema to be generated entirely by the AI authoring agent, relying on a predefined library of templates and presets.
It will not be authored by human designers exporting from After Effects via Bodymovin.
+
+Attempting to force motion designers to use After Effects to generate web ad code introduces an insurmountable mapping problem. Designers will inevitably utilize unsupported AE features, such as drop shadows, layer blending, and track mattes, that will fail silently or crash the custom parser when exported.
+
+Instead, human designers will define the initial visual state, layout constraints, and typography parameters in a static UI or template format. The AI agent will subsequently populate the "timeline" array, utilizing the curated preset list (defined in Section 2) to breathe life into the static spec. Human review will consist of analyzing a fast, highly readable JSON diff, where semantic keys ("ease": "power2.out", "preset": "scale_pop") provide instant comprehension of the AI's intent.
+
+## Section 2: Preset Library Curation
+
+A core requirement for scaled programmatic animation is a highly curated library of standard motion presets. Bespoke, freeform keyframing introduces unnecessary cognitive load, increases the risk of LLM hallucinations, and threatens to breach performance limits. An explicit taxonomy of presets constrains the AI to outputting reliable, pre-validated motion code.
+
+### 2.1 Survey of Industry Preset Libraries
+
+An exhaustive analysis of standard industry platforms reveals the following taxonomy of animation primitives and naming conventions:
+
+- **Animate.css:** This widely used open-source CSS library focuses on verbose, highly descriptive naming conventions for standard @keyframes. The library ships with a massive catalog of presets, including bounce, flash, pulse, rubberBand, shakeX, shakeY, headShake, swing, tada, wobble, jello, heartBeat, and various directional entries like slideInUp and fadeIn.
While popular, many of its presets are unsuitable for professional ad design.
+- **Motion One:** Engineered for modern UI and React applications, Motion One employs highly functional categorizations geared toward digital product design rather than classical timeline animation. Its presets and examples include enter/exit, scramble text, parallax, spring, reveal, and stagger from center.
+- **Celtra:** Celtra utilizes a specialized scene-based architecture, typically dividing animations into structured intros, outros, and sustaining interaction states. It relies on high-level transitioning components such as Miniscroller and Interscroller rather than granular nested keyframes, prioritizing modular design toolkits.
+- **Bannerflow:** Focused heavily on dynamic creative optimization (DCO) and localized programmatic delivery, Bannerflow ships with fundamental HTML5 timeline transitions. Its primary motion primitives are centered around functional entrances and exits: fade, slide, zoom, and bounce.
+- **Google Web Designer (GWD):** GWD ships with a "Quick Animations" panel embedded directly via custom web components. These are primarily basic fades, slides, and expansions that are triggered by internal JavaScript timeline events (gotoAndPlay) mapped to user interactions or automated markers.
+
+### 2.2 Curated V1 Preset Library
+
+Drawing from the industry survey, the following table defines the prescriptive curated list of 27 named presets to be supported natively in the V1 schema. Every preset in this matrix has been designed to map directly to GSAP composite properties (transforms and opacity), ensuring hardware-accelerated 60fps rendering without triggering layout thrashing on the browser's main thread.
This table is ready for direct ingestion into the ANIMATION_V1.md design document.
+
+| Preset Name | Category | Start State | End State | Default Ease | Duration | Split Required | Mask Required | Typical Ad Use Case |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| fade_in | Entrance | opacity: 0 | opacity: 1 | power1.inOut | 0.8s | No | No | Backgrounds, disclaimers, logos. |
+| fade_out | Exit | opacity: 1 | opacity: 0 | power1.inOut | 0.6s | No | No | Scene transitions, legacy copy. |
+| slide_in_left | Entrance | x: -100%, opacity: 0 | x: 0, opacity: 1 | power2.out | 0.8s | No | No | Product imagery, standard body copy. |
+| slide_in_right | Entrance | x: 100%, opacity: 0 | x: 0, opacity: 1 | power2.out | 0.8s | No | No | Product imagery, side-panel reveals. |
+| slide_in_up | Entrance | y: 100%, opacity: 0 | y: 0, opacity: 1 | power2.out | 0.8s | No | No | CTAs, bottom-anchored legal text. |
+| slide_in_down | Entrance | y: -100%, opacity: 0 | y: 0, opacity: 1 | power2.out | 0.8s | No | No | Top-anchored headers, hanging badges. |
+| slide_out_left | Exit | x: 0, opacity: 1 | x: -100%, opacity: 0 | power2.in | 0.6s | No | No | Frame sweeping, clearing the scene. |
+| slide_out_right | Exit | x: 0, opacity: 1 | x: 100%, opacity: 0 | power2.in | 0.6s | No | No | Frame sweeping, clearing the scene. |
+| scale_up_fade | Entrance | scale: 0.8, opacity: 0 | scale: 1, opacity: 1 | power2.out | 0.8s | No | No | Hero product shots, center logos. |
+| scale_down_fade | Entrance | scale: 1.2, opacity: 0 | scale: 1, opacity: 1 | power2.out | 1.0s | No | No | Lifestyle background imagery setting. |
+| scale_pop | Entrance | scale: 0.5, opacity: 0 | scale: 1, opacity: 1 | back.out(1.7) | 0.6s | No | No | Badges, CTA buttons, pricing circles. |
+| blur_in | Entrance | blur(10px), opacity: 0 | blur(0), opacity: 1 | power2.out | 1.0s | No | No | Premium luxury brand reveals. |
+| pulse_gentle | Emphasis | scale: 1 | scale: 1.05 (yoyo) | sine.inOut | 1.5s | No | No | Sustaining passive attention on a CTA. |
+| pulse_strong | Emphasis | scale: 1 | scale: 1.15 (yoyo) | power2.inOut | 0.8s | No | No | Urgent promotional badges (e.g., "-50%"). |
+| float_vertical | Emphasis | y: 0 | y: -10 (yoyo) | sine.inOut | 2.0s | No | No | Floating product elements (shoes, cans). |
+| shake_horizontal | Emphasis | x: 0 | x: 5, -5, 5, 0 | rough | 0.5s | No | No | Attention grabbing for expiring deals. |
+| fade_up_chars | Typo | y: 20, opacity: 0 | y: 0, opacity: 1 | power2.out | 0.6s | Yes (chars) | No | Premium, sophisticated headlines. |
+| fade_up_words | Typo | y: 20, opacity: 0 | y: 0, opacity: 1 | power2.out | 0.6s | Yes (words) | No | Sub-headlines, longer promotional copy. |
+| typewriter | Typo | opacity: 0 | opacity: 1 | steps(1) | 0.05s | Yes (chars) | No | Tech, narrative, or informative copy. |
+| scramble_chars | Typo | text: random | text: original | linear | 1.0s | Yes (chars) | No | Cyber-themed, high-tech promotions. |
+| scale_pop_chars | Typo | scale: 0, opacity: 0 | scale: 1, opacity: 1 | back.out(1.5) | 0.4s | Yes (chars) | No | Bold promotional or energetic typography. |
+| mask_wipe_right | Mask | clip-path: 0% | clip-path: 100% | power2.inOut | 1.0s | No | Yes | Revealing a new background organically. |
+| mask_wipe_up | Mask | clip-path: 0% | clip-path: 100% | power2.inOut | 1.0s | No | Yes | Rising imagery reveal or scene transition. |
+| mask_circle_out | Mask | clip: circle(0) | clip: circle(100%) | power3.inOut | 1.2s | No | Yes | Dynamic, cinematic scene transitions. |
+| mask_text_reveal | Mask | y: 100% | y: 0% | power3.out | 0.8s | Yes (lines) | Yes | Modern typography rising from an invisible floor. |
+| stagger_slide_in | List | x: -50px, opacity: 0 | x: 0, opacity: 1 | power2.out | 0.5s | Yes (nodes) | No | Bulleted lists, multi-product carousels. |
+| stagger_pop_up | List | scale: 0.8, opacity: 0 | scale: 1, opacity: 1 | back.out(1.5) | 0.5s | Yes (nodes) | No | Staggering multiple social icons or logos. |
+
+### 2.3 Excluded Presets and Failure Modes
+
+Certain animation presets ubiquitous in novice motion design tools and CSS libraries like Animate.css must be explicitly excluded from the V1 specification. Shipping these presets inevitably leads to programmatic failures, rendering bottlenecks, and brand degradation.
+
+- **Layout Animators:** Any preset that mandates the animation of dimensional or positional properties, such as width, height, padding, margin, top, left, or letter-spacing, is strictly forbidden. Modifying these properties dynamically forces the browser rendering engine to recalculate the entire DOM layout geometry, update the render tree, and repaint the pixels. This process, known as "layout thrashing," severely degrades performance, causes dropped frames on mobile devices, and directly triggers Chrome Heavy Ad limits.
+- **Amateur Motion Primitives:** Presets such as jello, rubberBand, wobble, and tada, heavily featured in Animate.css, rely on exaggerated, multi-axis scaling and skewed rotation.
While technically achievable via composite transforms, they present a cheap, unprofessional aesthetic that detracts from high-end corporate brand guidelines and are generally rejected by premium ad networks.
+- **Non-Composite Filter Animations:** Heavy utilization of CSS box-shadow or drop-shadow within animation keyframes forces continuous GPU repaints and introduces significant input lag, particularly on the constrained graphical processing units of mobile phones rendering HTML5 banners. If a dimensional shadow must be animated, it should be baked into a static transparent PNG asset, and the engine should simply fade its opacity.
+- **Club GreenSock Exclusive Plugins:** Any preset relying on GSAP’s premium plugins (e.g., MorphSVG, SplitText, Physics2D) is excluded to maintain licensing compliance and minimize the initial payload weight. V1 utilizes the MIT-licensed SplitType as the direct, compliant alternative for all typographic staggering operations.
+
+## Section 4: Competitive Schema Analysis
+
+Understanding the data serialization patterns of existing enterprise programmatic ad platforms provides critical intelligence on architectural pitfalls, historical evolution, and modern best practices. A deep review of Celtra, Bannerflow, and Google Web Designer reveals distinctly different approaches to timeline management and creative deployment.
+
+### 4.1 Platform Approaches to Data Structuring
+
+**Celtra (The Scene-Based Matrix):**
+Celtra targets high-end rich media, complex display units, and programmatic video. Interacting with its export API surfaces creative data as structured JSON arrays. Crucially, Celtra approaches animation not as a single global timeline spanning the entire duration of the ad, but as discrete interactive "scenes". Research indicates that they limit scene timeline editing to a maximum of 10 seconds to organically enforce performance best practices and prevent designers from creating bloated files.
Animations are built functionally, focusing on transitioning objects in (intros) and out (outros) of the viewport. Through "Master Templates," a single animation logic map can populate across hundreds of variants using a dynamic data feed, injecting different assets into the pre-defined timeline logic.
+
+**Bannerflow (Dynamic Parameterization):**
+Bannerflow relies heavily on HTML5 automation and Dynamic Creative Optimization (DCO) to serve creatives at massive scale. Under the hood, Bannerflow's timeline editor generates native CSS and JavaScript keyframe animations on the fly. It focuses deeply on real-time rendering, translation management across global campaigns, and DCO feed integration. Instead of exporting monolithic, heavily nested animation files, it serializes parameters that are interpreted at runtime to apply core transitions like fades, slides, and zooms. The animation data is fundamentally decoupled from the text content to allow for instantaneous language localization.
+
+**Google Web Designer (The DOM-Heavy Approach):**
+Google Web Designer (GWD) operates much closer to the metal of the browser's native DOM API. It saves authoring files in an internal .gwd XML/JSON structure and exports pure, unadulterated HTML files featuring custom DOM elements and highly verbose, hard-coded CSS @keyframes. GWD manages the timeline entirely via CSS, triggering distinct transitions by swapping classes or firing discrete JavaScript timeline events (such as gotoAndPlay) attached to user interactions. This creates extremely heavy, brittle stylesheets that are exceptionally difficult to read, version control, or modify programmatically.
+
+### 4.2 Architectural Takeaways for V1
+
+Synthesizing the competitive landscape yields three fundamental architectural mandates for the V1 Schema:
+
+- **Adopt Celtra’s Scene/Block Logic:** The V1 schema must aggressively avoid attempting to maintain a single 15-second monolithic timeline array containing hundreds of staggered, absolute-positioned keyframes.
Instead, V1 should logically group animations into distinct phases or blocks. This makes AI generation highly modular and resilient. If the AI agent is instructed to alter the entrance animation of the CTA button, it edits the specific parameter block without being forced to recalculate the absolute start times for every subsequent animation in the entire file.
+- **Avoid GWD’s CSS Verbosity:** Generating native CSS @keyframes dynamically results in massive code bloat. Each minor animation adjustment requires declaring a unique keyframe block in the