ANIMATION_V1: design spec for the V1 animation system

Specifies the V1 animation system end-to-end. Authored after two
Deep Research passes (preserved as ANIMATION_V1_RESEARCH.md and
ANIMATION_V1_DESIGN_DECISIONS.md for provenance).

ANIMATION_V1.md covers:
- Hard constraints: Chrome Heavy Ad Intervention (4MB / 15s burst /
  60s total CPU), composite-only animation, 150KB initial-load cap,
  GSAP via s0.2mdn.net CDN, free-tier only.
- Custom JSON schema (not Lottie) — block-based timeline, absolute
  start times, preset references only, no inline keyframes. Designed
  for AI authoring and human-readable diffs.
- 25-preset library across entrance / exit / emphasis / typography /
  mask / list categories. Each preset specifies start state, end
  state, default ease, default duration, and split/mask requirements.
- 9-category easing matrix using GSAP stock eases; bounce, slow,
  rough, and circ excluded from the V1 surface.
- Mask system: mask is a property on the masked layer (not a
  standalone layer). clip-path mandatory over interactive elements
  to prevent ghost-click failures. Konva ↔ HTML parity table.
- Per-character animation: SplitType at render time, Dropflow at
  spec time, automated aria-label / aria-hidden contract, 150-node
  ceiling enforced by QA gate.
- Animated bounding-box math: discrete sampling at 30 fps,
  unionBoundingBox() called from asset selection, render worker,
  and QA gate. Adds required_source_size to ResolvedLayer.
- 12 QA gates (G1-G12) covering schema, performance, asset,
  accessibility, and parity.

ARCHITECTURE.md updates:
- Forward-notes section at the top pointing to ANIMATION_V1.md and
  RESOLVED_FEED.md, matching the existing Part 7 forward-note style.
- Inline forward note in the Part 3 animation stack block.
- Old content preserved as historical record.

Decisions baked in (resolved during draft):
- Loops are global (max 3), not per-block. Per-block loops invite
  nested-infinite-loop bugs in AI-generated specs.
- Block triggers are time-anchored only. Event/interaction triggers
  wait for V2 rich media.
- blur_in and shake_horizontal dropped from the 27-preset research
  list. Blur is a video pattern; shake reads as a rendering error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Simeon Schecter 2026-05-18 20:12:58 -04:00
parent aab2af84cc
commit e51686a3d4
4 changed files with 921 additions and 0 deletions

ANIMATION_V1.md Normal file

@ -0,0 +1,822 @@
# ANIMATION_V1.md
> **Status:** V1 design specification. Not implemented in the vertical slice.
> The slice ships three preset animations (`fade_in`, `hold`, `fade_out`) as a
> forward-pointer to this system. See SLICE_DEVIATIONS.md for the deltas.
>
> This document supersedes the animation discussion in ARCHITECTURE.md.

The animation system is the product. Without it this is a static-image
templating tool, and there is no shortage of those. Every architectural
decision in this document is downstream of one premise: **a motion-design-led
creative team must be able to adopt this platform without feeling they have
accepted reduced expressive range.**

This document is prescriptive. It specifies the JSON schema, the preset
library, the easing matrix, the mask system, the per-character animation
contract, the bounding-box math, and the QA gates that enforce all of the
above. It is meant to be implementable as written.

---
## 1. Goals and Constraints
### Goals
1. **Designer-authored, AI-scaled.** Designers author templates with named
animation presets per layer. The AI orchestrator composes presets into a
per-banner timeline using a brand-voice-aware rationale, and never invents
new motion.
2. **Two runtimes, one contract.** A single JSON schema drives both the Konva
browser preview and the Playwright-headless final render. What the
reviewer sees is what ships.
3. **Production-grade motion vocabulary.** Translation, scale, rotation,
opacity. Masks (geometric and image). Per-character text animation. All
four properties with industry-idiomatic easing.
4. **Composable, not procedural.** A timeline is an ordered list of blocks.
Each block applies one preset to one layer. The AI's job is to choose
presets and order blocks — never to emit raw keyframes.
### Hard constraints (these define the rails)
These come from the Round 1 research pass and are non-negotiable. Every
primitive in this document is designed to fit inside them.
- **Chrome Heavy Ad Intervention.** An ad is unloaded if it exceeds:
  - 4 MB cumulative network bandwidth
  - 15 s of main-thread CPU within any rolling 30 s window
  - 60 s of total main-thread CPU over the page lifetime
- **Composite properties only.** Animations use `transform` (translate,
scale, rotate) and `opacity` exclusively. Animating `width`, `height`,
`top`, `left`, `margin`, or `letter-spacing` forces layout recalculation
and burns the CPU budget. The schema cannot express these.
- **Duration.** 15 s default, 30 s hard ceiling (CM360). Loops global,
3× maximum, must fit inside the duration ceiling.
- **Initial load weight.** 150 KB zip cap (CM360 / IAB LEAN). Polite load
may add up to 2.2 MB.
- **CDN-hosted runtime.** GSAP loads from `s0.2mdn.net/ads/studio/cached_libs/`
and does not count against the 150 KB cap.
- **GSAP free tier only.** No Club GreenSock plugins. SplitText is replaced
by the MIT-licensed SplitType library.
### Out of scope for V1
- Keyframe authoring UI. Designers compose presets in template code; no
visual timeline editor.
- Custom easing curves. The GSAP stock catalog is the V1 surface.
- Lottie import or export. We do not ingest After Effects work.
- Interaction triggers (hover, click-to-play state changes). Timelines are
time-anchored only.
- Audio, video, expandable, or rich-media formats.
---
## 2. The Schema
### 2.1 Why a custom schema (and not Lottie)
Lottie is the industry standard for vector animation interchange. It is also
the wrong fit for V1. The Round 2 research pass landed on a custom schema for
three reasons, all load-bearing:
1. **AI authoring.** Lottie's bezier-tangent representation (`i.x`, `o.y`
floating-point arrays) forces the orchestrator to emit cubic-bezier math
instead of semantic strings like `"power2.out"`. This is the exact
failure mode that hallucinations love.
2. **Diff legibility.** A version diff that reads
`"ease": "power2.out", "preset": "scale_pop"` is auditable. A diff that
reads `"i": {"x": [0.25], "y": [0.1]}` is not. The version service
depends on humans being able to read diffs.
3. **Composite-only enforcement.** Lottie intrinsically mixes composite
(`p`, `s`, `r`, `o`) with non-composite (`fc`, `sw`, vector paths,
mattes, expressions). Stripping the non-composite half at export
time is a translation layer we would have to write, maintain, and
defend against drift.

The schema described below is inspired by Lottie's declarative timeline
shape, but every field maps 1:1 to a GSAP call. The orchestrator produces
JSON; the runtime hands it to GSAP without transformation.
### 2.2 Top-level shape
```jsonc
{
"version": "1.0.0",
"meta": {
"duration": 15.0, // seconds, ≤ 30
"loops": 1, // 1 = play once, 2 = play+repeat, 3 = play+repeat+repeat. Max 3.
"fps_target": 60
},
"blocks": [
// one entry per animation event, ordered by `start`
]
}
```
`blocks` is ordered, time-anchored, and never overlaps in semantics: each
block describes one preset applied to one layer over one interval. Multiple
blocks may run simultaneously (different layers animating in parallel is
the common case), but the timeline reads top-to-bottom in `start` order.
### 2.3 Block shape
```jsonc
{
"id": "block_hero_in", // stable id for diffs and overrides
"layer_id": "hero_image", // references BannerSpec layer
"preset": "scale_up_fade", // see Section 3
"start": 0.0, // seconds, relative to timeline 0
"duration": 0.8, // seconds, overrides preset default
"ease": "power2.out", // overrides preset default. Optional.
"stagger": null, // see Section 6 for per-character blocks
"mask_ref": null // see Section 5 for animated masks
}
```
Three rules govern the block:
- **`preset` is required.** No anonymous keyframes. If a designer needs
a motion that isn't in the preset library, the answer is to add it to
the library, not to inline keyframes.
- **`duration` and `ease` are optional overrides.** The preset's defaults
are the right answer 95% of the time; the override is for the 5% where
a designer has a specific reason.
- **`start` is absolute, not relative.** Block N+1 does not implicitly
begin when block N ends. This sounds verbose, but it makes the diff
trivial: editing block 3's duration doesn't ripple into block 4's
start time.
### 2.4 Worked example
A 300×250 banner with hero image, headline, and CTA. Hero scales in,
headline reveals per-character, CTA pops:
```jsonc
{
"version": "1.0.0",
"meta": { "duration": 4.0, "loops": 1, "fps_target": 60 },
"blocks": [
{
"id": "hero_in",
"layer_id": "hero",
"preset": "scale_up_fade",
"start": 0.0,
"duration": 0.8
},
{
"id": "headline_in",
"layer_id": "headline",
"preset": "fade_up_chars",
"start": 0.5,
"duration": 0.6,
"stagger": 0.02
},
{
"id": "cta_in",
"layer_id": "cta",
"preset": "scale_pop",
"start": 1.4,
"duration": 0.6
},
{
"id": "cta_pulse",
"layer_id": "cta",
"preset": "pulse_gentle",
"start": 2.2,
"duration": 1.5
}
]
}
```
A reviewer reading this diff knows exactly what changed. The orchestrator
emitting this JSON has four lookups and three integer-ish numbers to
produce per block. The runtime translates each block to one `gsap.fromTo()`
call against the preset's defined start and end states.
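
The translation from block to tween can be sketched as a preset lookup plus two optional overrides. The sketch below is illustrative only: `PresetDef`, `PRESETS`, `compileBlock`, and the `#layer_id` selector convention are assumptions rather than names from the codebase, and it returns a plain descriptor instead of calling GSAP so that it runs without the library.

```ts
// Hypothetical preset record; field names are assumptions for illustration.
interface PresetDef {
  from: Record<string, number | string>;
  to: Record<string, number | string>;
  ease: string;
  duration: number;
}

interface AnimationBlock {
  id: string;
  layer_id: string;
  preset: string;
  start: number;      // absolute seconds from timeline 0
  duration?: number;  // optional override
  ease?: string;      // optional override
}

// Two entries copied from the Section 3.1 table.
const PRESETS: Record<string, PresetDef> = {
  scale_up_fade: { from: { scale: 0.8, opacity: 0 }, to: { scale: 1, opacity: 1 }, ease: "power2.out", duration: 0.8 },
  scale_pop: { from: { scale: 0.5, opacity: 0 }, to: { scale: 1, opacity: 1 }, ease: "back.out(1.7)", duration: 0.6 },
};

interface FromToCall {
  target: string;                               // selector for the layer element (assumed convention)
  fromVars: Record<string, number | string>;
  toVars: Record<string, number | string>;
  position: number;                             // absolute timeline position in seconds
}

function compileBlock(block: AnimationBlock): FromToCall {
  const preset = PRESETS[block.preset];
  if (!preset) {
    throw new Error(`Unknown preset "${block.preset}" in block "${block.id}"`);
  }
  return {
    target: `#${block.layer_id}`,
    fromVars: { ...preset.from },
    toVars: {
      ...preset.to,
      // Block-level overrides win; otherwise fall back to the preset defaults.
      duration: block.duration !== undefined ? block.duration : preset.duration,
      ease: block.ease !== undefined ? block.ease : preset.ease,
    },
    position: block.start,
  };
}

const call = compileBlock({ id: "cta_in", layer_id: "cta", preset: "scale_pop", start: 1.4 });
// With GSAP loaded, the descriptor maps onto the timeline method:
//   timeline.fromTo(call.target, call.fromVars, call.toVars, call.position);
```

Because `start` is passed through as the absolute position, editing one block's duration never shifts another block's tween.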
### 2.5 Loops
`meta.loops` is a global counter. `loops: 3` replays the entire timeline
three times; the cumulative duration must fit inside the IAB / CM360
ceiling (3 × `meta.duration` ≤ 30 s). Per-block loops are not supported
in V1 — they invite nested-infinite-loop bugs that an AI generator will
ship with surprising regularity.

For "always-on" effects that read as loops (the gentle pulse on a CTA,
the float on a product image), use the preset's built-in yoyo (Section 3).
The preset, not the meta loop counter, owns that behavior.
### 2.6 Where the schema lives in the BannerSpec
The animation timeline attaches to each `Artboard`:
```ts
interface Artboard {
artboard_id: string;
width: number;
height: number;
layers: ResolvedLayer[];
animation: AnimationTimeline; // ← this document
}
```
Each artboard has its own timeline because animation choices vary per size
(a 728×90 leaderboard cannot afford the same staggered entrance as a
300×600 half-page). The orchestrator emits one timeline per size as part
of its per-size copy decision. Rationale lands in the existing
`ai_reasoning.animation_rationale` field shared across sizes.

---
## 3. The Preset Library
25 named presets, organized by category. Every preset compiles to one or
more GSAP `fromTo` calls against composite properties only. Every preset
defines a start state, an end state, a default ease, and a default duration.
Designers reference presets by name; the AI orchestrator selects by name.
The selection criteria for inclusion:
- Maps to composite transforms or opacity only.
- Renders identically in Konva and in headless Chromium.
- Is a pattern that appears in production-quality display advertising,
not just motion-design tutorial reels.
- Does not require a Club GreenSock plugin.
### 3.1 Entrance presets
| Preset | Start | End | Ease | Duration | Use |
|---|---|---|---|---|---|
| `fade_in` | opacity 0 | opacity 1 | `power1.inOut` | 0.8 s | Backgrounds, disclaimers, logos. |
| `slide_in_left` | x -100%, op 0 | x 0, op 1 | `power2.out` | 0.8 s | Hero imagery, body copy. |
| `slide_in_right` | x 100%, op 0 | x 0, op 1 | `power2.out` | 0.8 s | Side-panel reveals. |
| `slide_in_up` | y 100%, op 0 | y 0, op 1 | `power2.out` | 0.8 s | CTAs, bottom-anchored copy. |
| `slide_in_down` | y -100%, op 0 | y 0, op 1 | `power2.out` | 0.8 s | Top-anchored headers, badges. |
| `scale_up_fade` | scale 0.8, op 0 | scale 1, op 1 | `power2.out` | 0.8 s | Hero products, centered logos. |
| `scale_down_fade` | scale 1.2, op 0 | scale 1, op 1 | `power2.out` | 1.0 s | Lifestyle backgrounds settling in. |
| `scale_pop` | scale 0.5, op 0 | scale 1, op 1 | `back.out(1.7)` | 0.6 s | Badges, CTA buttons, price circles. |
### 3.2 Exit presets
| Preset | Start | End | Ease | Duration | Use |
|---|---|---|---|---|---|
| `fade_out` | opacity 1 | opacity 0 | `power1.inOut` | 0.6 s | Scene transitions, legacy copy. |
| `slide_out_left` | x 0, op 1 | x -100%, op 0 | `power2.in` | 0.6 s | Sweep clearing the frame. |
| `slide_out_right` | x 0, op 1 | x 100%, op 0 | `power2.in` | 0.6 s | Sweep clearing the frame. |
### 3.3 Emphasis presets
These are yoyo presets — they animate from base state to peak state and
return. The block's `duration` field is the full there-and-back time.
| Preset | State change | Ease | Duration | Use |
|---|---|---|---|---|
| `pulse_gentle` | scale 1 ↔ 1.05 | `sine.inOut` | 1.5 s | Sustained CTA attention. |
| `pulse_strong` | scale 1 ↔ 1.15 | `power2.inOut` | 0.8 s | Urgent promotional badges. |
| `float_vertical` | y 0 ↔ 10 px | `sine.inOut` | 2.0 s | Floating product imagery. |
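
One way to honor the "full there-and-back time" rule in GSAP is to run a tween of half the block duration with `repeat: 1` and `yoyo: true`, both standard GSAP tween options. The helper below is a hypothetical sketch (`yoyoVars` is not a name from the codebase) and only builds the vars object, so it runs without GSAP loaded:

```ts
interface YoyoVars {
  scale?: number;
  y?: number;
  duration: number; // seconds for ONE leg (base -> peak)
  ease: string;
  repeat: number;
  yoyo: boolean;
}

// blockDuration is the block's full there-and-back time, per Section 3.3.
function yoyoVars(
  peak: { scale?: number; y?: number },
  blockDuration: number,
  ease: string
): YoyoVars {
  return {
    ...peak,
    duration: blockDuration / 2, // forward leg; the yoyo repeat supplies the return leg
    ease,
    repeat: 1,
    yoyo: true,
  };
}

const pulse = yoyoVars({ scale: 1.05 }, 1.5, "sine.inOut");
// gsap.to("#cta", pulse) would then scale 1 -> 1.05 -> 1 over 1.5 s in total.
```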
### 3.4 Typography presets
All typography presets require text splitting. The `stagger` field on the
block controls the interval between consecutive characters/words/lines.
Splitting happens at runtime via SplitType — see Section 6.
| Preset | Start | End | Ease | Stagger | Split | Use |
|---|---|---|---|---|---|---|
| `fade_up_chars` | y 20, op 0 | y 0, op 1 | `power2.out` | 0.02 s | chars | Premium headlines. |
| `fade_up_words` | y 20, op 0 | y 0, op 1 | `power2.out` | 0.04 s | words | Subheads, longer copy. |
| `typewriter` | op 0 | op 1 | `steps(1)` | 0.04 s | chars | Tech, narrative, informative. |
| `scramble_chars` | text random | text final | `linear` | 0.03 s | chars | Cyber, high-tech promos. |
| `scale_pop_chars` | scale 0, op 0 | scale 1, op 1 | `back.out(1.5)` | 0.04 s | chars | Bold, energetic typography. |
### 3.5 Mask presets
Masks animate the visibility of a layer through a moving clip shape. See
Section 5 for the full mask system; these presets reference it.
| Preset | Mask shape | Animates | Ease | Duration | Use |
|---|---|---|---|---|---|
| `mask_wipe_right` | rectangle | clip 0% → 100% width | `power2.inOut` | 1.0 s | Revealing new background. |
| `mask_wipe_up` | rectangle | clip 0% → 100% height | `power2.inOut` | 1.0 s | Rising imagery reveal. |
| `mask_circle_out` | circle | r 0 → max | `power3.inOut` | 1.2 s | Cinematic scene transitions. |
| `mask_text_reveal` | layer bbox | y 100% → 0% | `power3.out` | 0.8 s | Text rising from invisible floor. |
### 3.6 List/stagger presets
For multi-element layers (carousels, icon rows, bulleted lists). Stagger
applies to the layer's children, not to characters.
| Preset | Start | End | Ease | Stagger | Use |
|---|---|---|---|---|---|
| `stagger_slide_in` | x 50 px, op 0 | x 0, op 1 | `power2.out` | 0.08 s | Bullets, multi-product rows. |
| `stagger_pop_up` | scale 0.8, op 0 | scale 1, op 1 | `back.out(1.5)` | 0.08 s | Social icons, logo lockups. |
### 3.7 Explicit exclusions
Presets in this list are common in tutorial-grade libraries but are not
shipped in V1. The rationale is recorded so the next person who asks
"why not `jello`?" has an answer.
- **`jello`, `rubberBand`, `wobble`, `tada`, `headShake`, `swing`** — multi-axis
skew/scale combos that read as amateur in premium display. Animate.css
ships them; production-quality ads don't.
- **`shake_horizontal` and other shake patterns** — would require GSAP's
`rough` ease, which is excluded in Section 4. Shake patterns read as
rendering errors more often than as deliberate emphasis.
- **`blur_in`** — animates `filter: blur(Npx)`. Filter is a composite-eligible
property in Chromium, but it forces repaint, not just compositor work,
and burns the CPU budget faster than transforms. The use case (premium
reveals) is better served by `scale_down_fade` and `mask_circle_out`.
Blur belongs in video, not display.
- **Drop-shadow and box-shadow animation** — same repaint cost as blur.
If a layer needs a dimensional shadow that "appears," bake the shadow
into a transparent asset and animate that asset's opacity.
- **All Club GreenSock plugins** — `MorphSVG`, `SplitText`, `Physics2D`,
`DrawSVG`, `MotionPath`, `CustomEase`. V1 stays on the free GSAP tier
with SplitType as the SplitText replacement.
---
## 4. Easing
Easing is the difference between mechanical motion and motion that reads
as designed. The V1 surface is the GSAP stock ease catalog, restricted to
the curves that actually appear in production display advertising.
### 4.1 Principles
- **Entrance eases out.** Elements arriving on stage decelerate so the eye
can catch them.
- **Exit eases in.** Elements leaving accelerate away to clear the visual
field.
- **Emphasis uses `sine.inOut` or `power*.inOut`.** Symmetric eases for
symmetric (yoyo) motion.
- **Linear is reserved for typewriters, progress bars, and continuous
panning.** Almost never the right answer for entrances or exits.
### 4.2 Easing matrix
| Category | Default | Alt 1 | Alt 2 |
|---|---|---|---|
| Entrance (translation) | `power2.out` | `power3.out` | `expo.out` |
| Entrance (scale) | `power2.out` | `back.out(1.5)` | `elastic.out(1, 0.5)` |
| Exit (translation) | `power2.in` | `power3.in` | `expo.in` |
| Exit (scale) | `power2.in` | `back.in(1.2)` | `power1.in` |
| Emphasis (pulse, float) | `sine.inOut` | `power1.inOut` | `power2.inOut` |
| Slow reveal (fade) | `power1.inOut` | `linear` | `sine.out` |
| Snappy state change | `expo.inOut` | `power4.inOut` | `steps(1)` |
| Per-character stagger | `power2.out` | `back.out(1.2)` | `linear` |
| Mask reveal | `power3.inOut` | `power2.inOut` | `expo.inOut` |
### 4.3 Excluded eases
The GSAP catalog includes curves that are rarely or never used in
production display. They are hidden from the V1 surface to keep the
orchestrator's decision space small.
- **`bounce.in` / `bounce.out`** — gravity-simulation cartoon physics. Reads
as children's brand or casual gaming. `back` covers the use case with
more taste.
- **`slow`** — cinematic speed-ramping. Lingers in the middle of the
transition. Wrong for 15-second display.
- **`rough`** — randomized jitter. Reads as a rendering error.
- **`circ`** — mathematically rigid circular arc. The `power*` family
feels more physical.
The accessible palette is: `power1`–`power4`, `expo`, `sine`, `back`,
`elastic`, `linear`, `steps`. Each with `.in`, `.out`, `.inOut` variants
where applicable. ~24 distinct eases — enough range for production motion
design, small enough that the AI doesn't get decision fatigue.
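
A validator for this palette fits in a few lines. The sketch below is an assumption about how such a check could look (the name `isAllowedEase` is hypothetical). It accepts `linear`, `steps(n)`, and the eight families that take `.in`, `.out`, or `.inOut`, and it allows numeric arguments only on `back` and `elastic`:

```ts
const FAMILIES_WITH_VARIANTS = [
  "power1", "power2", "power3", "power4", "expo", "sine", "back", "elastic",
];

function isAllowedEase(ease: string): boolean {
  if (ease === "linear") return true;
  if (/^steps\(\d+\)$/.test(ease)) return true;

  // family.variant, optionally followed by a parenthesized argument list
  const m = /^([a-z]+\d?)\.(in|out|inOut)(\(([^)]*)\))?$/.exec(ease);
  if (!m) return false;
  const [, family, , , args] = m;
  if (FAMILIES_WITH_VARIANTS.indexOf(family) === -1) return false;

  if (args !== undefined) {
    // Only back and elastic are parameterized, e.g. back.out(1.7), elastic.out(1, 0.5).
    if (family !== "back" && family !== "elastic") return false;
    return args.split(",").every((a) => a.trim() !== "" && isFinite(Number(a)));
  }
  return true;
}

// Excluded curves such as "bounce.out", "circ.in", and "rough" are rejected.
const accepted = ["power2.out", "back.out(1.7)", "steps(1)"].every(isAllowedEase);
```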

---
## 5. Masks
Masks are first-class in V1. They unlock the "reveal" patterns that read as
premium motion (logo-shaped reveals, circular wipes, text rising from an
invisible floor) and are the difference between "ad" and "ad you remember."
### 5.1 The mask as a layer property
A mask is not a standalone timeline layer. It is a property on the masked
layer. Co-locating the mask with the layer it masks keeps z-index
relationships implicit and keeps the diff readable.
```jsonc
{
"layer_id": "hero_image",
"mask": {
"type": "clip-path", // "clip-path" | "image"
"geometry": "circle", // for clip-path: "rect" | "circle" | "polygon"
"asset_id": null, // for type: "image", references the mask asset
"animation": {
"preset": "mask_circle_out",
"duration": 1.2,
"ease": "power3.inOut"
}
}
}
```
Blocks reference an animated mask via `mask_ref` pointing to the
masked layer's id. The block describes the layer's animation; the mask's
animation is internal to the mask object.
### 5.2 clip-path vs. mask-image: hit-testing decides
CSS gives us two mask primitives. They are not interchangeable.
- **`clip-path`** — actually clips the element's geometry. Pointer events
outside the clip do not fire.
- **`mask-image`** — modifies the alpha channel only. The element's
bounding box still receives pointer events, including in the "invisible"
region.

For animated reveals over interactive elements — which is the most common
case, because CTAs are interactive and CTAs are the most-revealed thing —
**`clip-path` is mandatory.** `mask-image` lets users click an "invisible"
button that's mid-reveal, fires the click handler, and ships the wrong
analytics. This is a real bug we will not have because the schema doesn't
let us write it: `mask.type: "clip-path"` is the default and any geometric
mask animation uses it.

`mask-image` is reserved for non-interactive layers (background imagery,
decorative elements) where alpha-channel masking is the only way to
achieve the effect (e.g., a brushstroke wipe, a textured reveal).
### 5.3 Konva ↔ HTML parity
Each mask type renders in both runtimes:
| Mask type | Konva | HTML |
|---|---|---|
| `clip-path` rect | `Group.clipFunc` with a rect path | CSS `clip-path: inset(...)` |
| `clip-path` circle | `Group.clipFunc` with an arc | CSS `clip-path: circle(...)` |
| `clip-path` polygon | `Group.clipFunc` with a polygon | CSS `clip-path: polygon(...)` |
| `image` | `globalCompositeOperation: 'source-in'` on cached group | CSS `mask-image: url(...)` |
The known parity risk is `mask-image` SVG scaling. Konva masks are pixel-
exact; CSS `mask-size` and `mask-position` can drift if the underlying
layer's box model changes between preview and render. The parity test
suite (Section 9) renders each mask preset in both runtimes and diffs the
output PNG. Drift > 2px on any sampled frame fails the test.
### 5.4 Animated mask geometry
The four mask presets in Section 3.5 cover the patterns that show up in
production display:
- **Linear wipe** (`mask_wipe_right`, `mask_wipe_up`): rectangular
clip-path animating one edge from 0% to 100%.
- **Circular reveal** (`mask_circle_out`): clip-path circle with radius
animating from 0 to a value large enough to cover the layer's
bounding box.
- **Text rising from a floor** (`mask_text_reveal`): clip-path rect
fixed at the layer's bbox; text inside animates `y: 100% → 0%`. The
mask itself doesn't animate — the masked content does.

Mask geometry that requires designer authorship (logo-shaped masks,
custom polygon shapes) is supported by `mask.geometry: "polygon"` with
a path string supplied by the template, but is not part of any preset.
The orchestrator does not invent mask shapes.
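
For `mask_circle_out`, "a value large enough to cover the layer's bounding box" has a closed form when the circle is centered on the layer: half the diagonal. The sketch below assumes that centering (the text above does not state the circle's origin), and `circleCoverRadius` and `circleClipPath` are hypothetical helper names:

```ts
// Smallest radius of a circle centered on the layer that covers the whole
// width x height rectangle: half the diagonal.
function circleCoverRadius(width: number, height: number): number {
  return Math.sqrt(width * width + height * height) / 2;
}

// CSS clip-path value for a circle centered on the element.
function circleClipPath(radiusPx: number): string {
  return `circle(${radiusPx.toFixed(1)}px at 50% 50%)`;
}

// A 300x250 layer: the reveal animates from circle(0.0px ...) to circle(195.3px ...).
const rMax = circleCoverRadius(300, 250);
const startClip = circleClipPath(0);
const endClip = circleClipPath(rMax);
```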

---
## 6. Per-Character Text Animation
Per-character animation is the difference between "templated copy" and
"designed message." It is also the single most performance-sensitive
primitive in the V1 system. This section is precise because it has to be.
### 6.1 The split happens at runtime, in the DOM, via SplitType
For the headless Chromium render (the one that ships), text splitting
happens in the DOM at runtime using SplitType. The reasons are non-
negotiable:
- **Accessibility.** Screen readers must read the headline as one
coherent string, not as 47 phonetic letters. SplitType supports the
ARIA pattern (see 6.2). Canvas-rendered text destroys accessibility
outright.
- **DOM-based QA.** Ad-server review bots parse the DOM to verify text
content. Canvas text is invisible to them and triggers rejection.
- **Text selection and SEO.** Native browser text selection works on
DOM text. Canvas text does not select.

For the Konva preview (the one designers and reviewers see), per-character
positioning is computed at spec-resolution time from Dropflow's glyph
positions. Konva renders each character as a `Konva.Text` node positioned to
match Dropflow's output. This preserves the preview-render parity
contract — the preview is positionally identical to what SplitType
produces at runtime, sub-pixel kerning and ligatures aside.
The split-of-labor:
- **Dropflow (spec time, both runtimes):** computes glyph positions used to
drive Konva preview rendering and to validate the assumption that
SplitType will produce the same layout.
- **SplitType (render time, Chromium only):** does the actual DOM split
for the GSAP-driven animation.
### 6.2 The accessibility contract
Splitting text into per-character spans destroys its semantic continuity.
The compilation engine must apply the standard ARIA mitigation, and the
schema does not let the designer or the orchestrator forget:
For every layer where the block's `preset` is in Section 3.4 (typography
presets requiring char-split), the export pipeline must:
1. Set `aria-label="<original text content>"` on the parent layer
element.
2. Set `aria-hidden="true"` on every SplitType-generated child span.

This is automated, not designer-authored. A QA gate (Section 9) verifies
that every char-split layer in the exported HTML has the parent label
and the hidden children before the export passes.
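
The two-step contract can be sketched as a string transform. This is illustrative only: `renderSplitChars` is a hypothetical helper, the `char` class name is an assumption about the split output, and a real pipeline would operate on the DOM nodes SplitType produces rather than on strings.

```ts
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wraps each character in a hidden span and labels the parent with the
// original string, so assistive technology reads one coherent phrase.
// Note: split("") works per UTF-16 code unit, which is fine for a sketch
// but would break surrogate pairs such as emoji.
function renderSplitChars(text: string): string {
  const spans = text
    .split("")
    .map((ch) => `<span class="char" aria-hidden="true">${escapeHtml(ch)}</span>`)
    .join("");
  return `<div aria-label="${escapeHtml(text)}">${spans}</div>`;
}

const html = renderSplitChars("Hi");
```

The QA gate can then assert the inverse: one `aria-label` on the parent whose value equals the original text, and no child span without `aria-hidden="true"`.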
### 6.3 The performance ceiling
Per-character animation creates one DOM node per character. Every node
becomes its own GPU composite layer when animated via transform. Mobile
browsers have a hard ceiling on simultaneous composite layers, and
breaching it causes jank visible in the final render.

**The V1 ceiling: 150 simultaneously animated character nodes.**

This is not a recommendation; it's a QA gate. If a banner's exported
timeline has more than 150 char-split nodes animating in a single
30-second window, the export fails.

Implications for the orchestrator:
- A 60-char headline with `fade_up_chars` consumes 60 of the budget.
- A second char-split element (a 40-char subhead with `fade_up_chars`)
brings the total to 100.
- A third char-split element is risky. The orchestrator should prefer
`fade_up_words` (5–10 nodes) for subheads and reserve char-splits
for headlines.

Body copy and legal disclaimers must never use char-split. The orchestrator
selects `fade_up_words` for any layer whose resolved
text exceeds 80 characters, regardless of designer preset choice.
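
The budget arithmetic and the 80-character fallback can be sketched as below. Two assumptions are baked in: because a timeline never exceeds 30 s, the whole timeline is treated as a single window and the budget is a plain sum, and the names `charSplitTotal`, `passesCharCeiling`, and `pickTypographyPreset` are hypothetical.

```ts
const CHAR_NODE_CEILING = 150;
const LONG_TEXT_THRESHOLD = 80;
// Presets from Section 3.4 whose Split column is "chars".
const CHAR_SPLIT_PRESETS = ["fade_up_chars", "typewriter", "scramble_chars", "scale_pop_chars"];

interface SplitBlock {
  id: string;
  preset: string;
  nodeCount: number; // DOM nodes the split produces for this layer
}

// Sum of nodes across all char-split blocks in the timeline.
function charSplitTotal(blocks: SplitBlock[]): number {
  let total = 0;
  for (const b of blocks) {
    if (CHAR_SPLIT_PRESETS.indexOf(b.preset) !== -1) total += b.nodeCount;
  }
  return total;
}

function passesCharCeiling(blocks: SplitBlock[]): boolean {
  return charSplitTotal(blocks) <= CHAR_NODE_CEILING;
}

// Long copy is demoted to a word split regardless of the requested preset.
function pickTypographyPreset(requested: string, textLength: number): string {
  const isCharSplit = CHAR_SPLIT_PRESETS.indexOf(requested) !== -1;
  return isCharSplit && textLength > LONG_TEXT_THRESHOLD ? "fade_up_words" : requested;
}

const headlineAndSubhead: SplitBlock[] = [
  { id: "headline_in", preset: "fade_up_chars", nodeCount: 60 },
  { id: "subhead_in", preset: "fade_up_chars", nodeCount: 40 },
];
const used = charSplitTotal(headlineAndSubhead); // 100 of the 150-node budget
```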
### 6.4 Concrete per-character parameters
The defaults below are baked into the preset library and should not be
overridden lightly. They reflect production-grade motion design, not
arbitrary timings:
| Pattern | Stagger | Ease | Distance / scale | Max chars |
|---|---|---|---|---|
| Typewriter | 0.03–0.05 s | `steps(1)` | n/a | 80 |
| Fade-up | 0.02 s | `power2.out` | 15–20 px | 80 |
| Scale-pop | 0.04 s | `back.out(1.5)` | scale 0 → 1 | 50 |
| Scramble | 0.03 s | `linear` | n/a (text mutation) | 40 |
---
## 7. The Animated Bounding Box
The most consequential and least obvious part of the animation system is
the asset-sizing problem.
### 7.1 The problem
A hero image animating with `scale_down_fade` starts at `scale: 1.2` and
ends at `scale: 1.0`. The source image must be sized for the largest
frame — `1.2 × layer_width` — not the final frame. Ship the source at the
final-frame size and the first frame is upscaled, blurry, and visibly
broken.
The same logic applies to:
- **Translation** (`slide_in_*`): source must cover the layer's position at
every keyframe, including the offscreen start.
- **Rotation**: a rotated rectangle's axis-aligned bounding box grows with
the angle. A 600×600 image rotated 15° needs a ~735×735 source.
- **Compound transforms** (scale + rotate + translate, e.g. an entrance
that combines `scale_up_fade` with a small rotation): the bounding box
is the union of every transformed corner at every sampled time.
### 7.2 The math
For a layer with rectangular bounding box, given a transform with
translation `(tx, ty)`, scale `(sx, sy)`, and rotation `θ` at time `t`:
For each corner `(cx, cy)` of the base rectangle:
```
x'(t) = tx(t) + sx(t) · (cx · cos(θ(t)) - cy · sin(θ(t)))
y'(t) = ty(t) + sy(t) · (cx · sin(θ(t)) + cy · cos(θ(t)))
```
The layer's axis-aligned bounding box at time `t` is the min/max of
`x'`/`y'` over the four corners. The animation's bounding box is the
union of every per-`t` bounding box.

Analytical solution of the extrema is computationally expensive and
brittle across compound transforms. V1 uses **discrete sampling**: 30
samples per second of animated duration, taking the union of all sampled
bounding boxes. At 30 fps over a 1.0-second animation, that's 30
4-corner evaluations, a negligible cost. Because sampling can miss an
extremum that falls between two samples, the union under-approximates
the true bbox by at most one sample's worth of motion, which is well
within tolerance for source-asset sizing.
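
The corner formula and the sampling loop can be sketched as follows. This is not the `unionBoundingBox()` implementation: `sampleUnionBBox` is a hypothetical helper that takes a caller-supplied `transformAt(t)` function, assumes the transform origin is the layer's center, and samples both endpoints of the interval.

```ts
interface Transform { tx: number; ty: number; sx: number; sy: number; theta: number } // theta in radians
interface BoundingBox { minX: number; minY: number; maxX: number; maxY: number }

// Union of the axis-aligned bounding boxes of a (2*halfW x 2*halfH)
// rectangle centered on the transform origin, sampled at `fps`.
function sampleUnionBBox(
  halfW: number,
  halfH: number,
  transformAt: (t: number) => Transform,
  duration: number,
  fps: number = 30
): BoundingBox {
  const corners: Array<[number, number]> = [
    [-halfW, -halfH], [halfW, -halfH], [halfW, halfH], [-halfW, halfH],
  ];
  const box: BoundingBox = { minX: Infinity, minY: Infinity, maxX: -Infinity, maxY: -Infinity };
  const steps = Math.max(1, Math.ceil(duration * fps));
  for (let i = 0; i <= steps; i++) {
    const t = (i / steps) * duration;
    const { tx, ty, sx, sy, theta } = transformAt(t);
    const cos = Math.cos(theta);
    const sin = Math.sin(theta);
    for (const [cx, cy] of corners) {
      // Same corner equations as above.
      const x = tx + sx * (cx * cos - cy * sin);
      const y = ty + sy * (cx * sin + cy * cos);
      box.minX = Math.min(box.minX, x);
      box.maxX = Math.max(box.maxX, x);
      box.minY = Math.min(box.minY, y);
      box.maxY = Math.max(box.maxY, y);
    }
  }
  return box;
}

// scale_down_fade on a 300x250 layer: scale runs 1.2 -> 1.0 over 1.0 s
// (interpolated linearly here for brevity; the preset uses power2.out).
const bbox = sampleUnionBBox(150, 125, (t) => {
  const s = 1.2 - 0.2 * t;
  return { tx: 0, ty: 0, sx: s, sy: s, theta: 0 };
}, 1.0);
const required = { width: bbox.maxX - bbox.minX, height: bbox.maxY - bbox.minY };
// required is approximately 360 x 300, i.e. 1.2x the layer rect.
```

An overshooting ease such as `back.out` is handled by the same loop, since the overshoot frames are sampled like any others.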
### 7.3 `unionBoundingBox()` — one function, many call sites
A single function in `packages/layout-engine`:
```ts
function unionBoundingBox(
layer: ResolvedLayer,
blocks: AnimationBlock[],
fps: number = 30
): BoundingBox;
```
It is called from:
- **Smart asset selection** (`packages/api-lib/asset-selection`): picks
source crop sized for the union bbox, not the layer rect.
- **Render worker** (`packages/render-worker`): fails fast if the
source asset's dimensions are smaller than the union bbox.
- **QA gate** (`packages/qa-gates`): flags upscaled frames before export,
with the field path of the offending layer.
### 7.4 What changes in the BannerSpec
A resolved layer gains a `required_source_size` field:
```ts
interface ResolvedLayer {
// ... existing fields
required_source_size?: { width: number; height: number };
}
```
The orchestrator does not populate this field. The layout engine
computes it after resolving the layer's animation blocks. The render
worker reads it and fails the render if the actual source is smaller.
The asset selector reads it when requesting a crop from the asset
service.

---
## 8. Library and Runtime Mechanics
### 8.1 GSAP via the Google CDN
GSAP 3 loads from `https://s0.2mdn.net/ads/studio/cached_libs/gsap_3.9.1_min.js`.
This URL is whitelisted by CM360, which means GSAP's bundle does not count
against the 150 KB initial-load cap. The HTML export injects exactly this
URL into the document head. Local-relative paths or other CDNs (including
unpkg, jsDelivr) cause CM360 to flag the library as a 4th-party call and
count its weight, which reliably fails the 150 KB gate.
### 8.2 SplitType in place of SplitText
The MIT-licensed SplitType library (~2 KB minified) replaces GSAP's
Club-only SplitText. SplitType is bundled into the exported HTML, not
loaded from a CDN, because no major ad CDN whitelists it. It is small
enough to fit within the 150 KB budget alongside the rest of the banner.
The export pipeline applies the aria-label/aria-hidden contract from 6.2
in the same pass that calls SplitType.
### 8.3 No `localStorage`, no `sessionStorage`
CM360 rejects creatives that reference browser storage APIs. The export
pipeline's static-analysis pass scans the compiled JS bundle for any
reference to `localStorage` or `sessionStorage` and fails the export
if found. None of our code uses these APIs; this gate exists to catch
a third-party dependency drift.
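
A plain-text version of that scan can be sketched in a few lines. `findStorageReferences` is a hypothetical name, and a text scan like this one is an assumption about the approach: it will miss obfuscated access such as `window["local" + "Storage"]`, which an AST-based pass would be needed to catch.

```ts
// Returns one entry per textual reference to a browser storage API.
function findStorageReferences(bundleSource: string): string[] {
  const hits: string[] = [];
  const pattern = /\b(localStorage|sessionStorage)\b/g;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(bundleSource)) !== null) {
    hits.push(`${m[1]} at offset ${m.index}`);
  }
  return hits;
}

const dirty = findStorageReferences('var v = window.localStorage.getItem("k");');
const clean = findStorageReferences('gsap.to("#cta", { opacity: 1 });');
// The export fails when the returned list is non-empty.
```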
### 8.4 `will-change: transform` on animated layers
The export pipeline annotates every layer that has at least one
animation block with `will-change: transform` (and `opacity` if
applicable). This promotes the layer to its own GPU compositor layer
ahead of time, preventing first-frame jank from late-binding the layer
when GSAP starts animating it.
### 8.5 `prefers-reduced-motion`
The exported HTML includes a `@media (prefers-reduced-motion: reduce)`
block that snaps every animation to its end state instantly. This serves
two purposes:
- **Accessibility:** users who have requested reduced motion see the
final composition immediately.
- **Headless PNG capture:** Playwright launches with
`--force-prefers-reduced-motion`, which deterministically forces the
banner to its final-frame state. Capturing the static backup PNG
becomes a single screenshot with no `waitForTimeout`, which removes
the largest source of flakiness in the render pipeline.
---
## 9. QA Gates
Every gate below runs at export time. An export that fails any gate is
held — the spec is not corrupt, but it's not shippable, and the review
UI flags it for human resolution.
### 9.1 Schema gates
- **G1 — Composite-only.** Every block's preset is in the V1 library.
Inline keyframes are not expressible in the schema, so this gate is
enforced by the schema's TypeScript types, not by a runtime check.
- **G2 — Duration ceiling.** `meta.duration × meta.loops ≤ 30`.
- **G3 — Block ordering.** Blocks are sorted by `start`. No block's
`start + duration` exceeds `meta.duration`.
- **G4 — Loop count.** `meta.loops` is in `{1, 2, 3}`.
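
G2, G3, and G4 are simple enough to sketch as one pure function. Only `meta.duration`, `meta.loops`, `start`, and `duration` come from the gate definitions above; the interface names, the `GateFailure` shape, and the assumption that durations are in seconds are illustrative.

```typescript
// Hypothetical shapes for illustration; the real schema lives in Section 2.
interface TimelineMetaSketch {
  duration: number; // length of one pass, assumed to be in seconds
  loops: number;
}

interface TimelineBlockSketch {
  start: number;
  duration: number;
}

interface GateFailure {
  gate: "G2" | "G3" | "G4";
  message: string;
}

function checkSchemaGates(
  meta: TimelineMetaSketch,
  blocks: TimelineBlockSketch[],
): GateFailure[] {
  const failures: GateFailure[] = [];

  // G2: total play time across all loops stays within the ceiling.
  const total = meta.duration * meta.loops;
  if (total > 30) {
    failures.push({ gate: "G2", message: `total ${total} exceeds 30` });
  }

  // G3: blocks are sorted by start and none runs past the end of a pass.
  let previousStart = Number.NEGATIVE_INFINITY;
  blocks.forEach((block, index) => {
    if (block.start < previousStart) {
      failures.push({ gate: "G3", message: `block ${index} is out of order` });
    }
    if (block.start + block.duration > meta.duration) {
      failures.push({ gate: "G3", message: `block ${index} overruns meta.duration` });
    }
    previousStart = block.start;
  });

  // G4: loop count is one of 1, 2, or 3.
  if (meta.loops !== 1 && meta.loops !== 2 && meta.loops !== 3) {
    failures.push({ gate: "G4", message: `loops ${meta.loops} not in {1, 2, 3}` });
  }
  return failures;
}
```

Returning every failure rather than stopping at the first gives the review UI a complete list to show in one pass.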
### 9.2 Performance gates
- **G5 — Char-split ceiling.** Sum of animated char-split nodes across
all simultaneously-running blocks in any 30-second window ≤ 150.
- **G6 — Weight budget.** Final zipped HTML ≤ 150 KB initial load.
- **G7 — No storage APIs.** Static analysis finds no reference to
`localStorage` or `sessionStorage`.
- **G8 — GSAP via CDN.** The exported HTML loads GSAP from
`s0.2mdn.net`, not from a local path.
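
One way to evaluate the "simultaneously-running" part of G5 is a sweep over block start and end times. The sketch below computes the peak number of char-split nodes animating at once within a single pass of the timeline. The `charNodes` field is a hypothetical per-block count, and treating a block that ends at time `t` as no longer overlapping one that starts at `t` is an assumption.

```typescript
// Hypothetical block shape; `charNodes` is 0 for blocks without a char split.
interface CharSplitBlockSketch {
  start: number;
  duration: number;
  charNodes: number;
}

function peakConcurrentCharNodes(blocks: CharSplitBlockSketch[]): number {
  const events: { time: number; delta: number }[] = [];
  for (const block of blocks) {
    if (block.charNodes === 0) continue;
    events.push({ time: block.start, delta: block.charNodes });
    events.push({ time: block.start + block.duration, delta: -block.charNodes });
  }
  // At equal times, process ends (negative delta) before starts so that
  // back-to-back blocks are not counted as overlapping.
  events.sort((a, b) => a.time - b.time || a.delta - b.delta);

  let current = 0;
  let peak = 0;
  for (const event of events) {
    current += event.delta;
    peak = Math.max(peak, current);
  }
  return peak;
}
```

A gate built on this would fail when the returned peak exceeds 150.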
### 9.3 Asset gates
- **G9 — Source size.** Every layer's actual source asset dimensions are
≥ the `required_source_size` computed by `unionBoundingBox`. Fail
with the layer id and the deficit.
- **G10 — Crossorigin.** Every image element has `crossorigin="anonymous"`
(required for Konva canvas reads in the preview).
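
The G9 comparison and its failure payload can be sketched as follows. The `PixelSize` shape is an assumption about how `required_source_size` is represented (width and height in pixels); the function and type names are illustrative.

```typescript
// Assumed representation of a size in pixels.
interface PixelSize {
  width: number;
  height: number;
}

interface SourceSizeFailure {
  layerId: string;
  deficit: PixelSize; // how many pixels short the asset is on each axis
}

// Returns null when the asset is large enough, otherwise the layer id and deficit.
function checkSourceSize(
  layerId: string,
  actual: PixelSize,
  required: PixelSize,
): SourceSizeFailure | null {
  const deficit: PixelSize = {
    width: Math.max(0, required.width - actual.width),
    height: Math.max(0, required.height - actual.height),
  };
  if (deficit.width === 0 && deficit.height === 0) return null;
  return { layerId, deficit };
}
```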
### 9.4 Accessibility gates
- **G11 — Aria contract.** Every layer with a char-split preset has
`aria-label` set to the original text on the parent and
`aria-hidden="true"` on every child span.
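
A sketch of the G11 check over a plain data snapshot of one split layer. The `SplitTextSnapshot` shape is hypothetical; it exists so the rule can be expressed without a DOM.

```typescript
// Hypothetical snapshot of a char-split layer's accessibility attributes.
interface SplitTextSnapshot {
  originalText: string;               // text before splitting
  ariaLabel: string | null;           // aria-label on the parent element
  childAriaHidden: (string | null)[]; // aria-hidden value of each child span
}

function satisfiesAriaContract(snapshot: SplitTextSnapshot): boolean {
  // The parent must carry the original, unsplit text as its label.
  if (snapshot.ariaLabel !== snapshot.originalText) return false;
  // Every per-character span must be hidden from assistive technology.
  return snapshot.childAriaHidden.every((value) => value === "true");
}
```

In a browser, the snapshot could be filled from `element.getAttribute("aria-label")` and from `getAttribute("aria-hidden")` on each child of the split element.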
### 9.5 Parity gates
- **G12 — Konva ↔ Playwright pixel diff.** For each preset, the parity
test suite renders a reference banner in both runtimes and compares
PNG output. Per-frame pixel diff > 2 px on any sampled keyframe
fails the build (not the export — this is a development-time gate,
not a per-banner gate).
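
A sketch of the comparison step, assuming both runtimes' PNGs have already been decoded into RGBA byte buffers of equal dimensions. It reads the "2 px" threshold as a count of differing pixels per frame, which is one interpretation of the gate, not a confirmed one; `channelTolerance` is an added illustrative parameter.

```typescript
// Counts pixels whose RGBA channels differ by more than `channelTolerance`.
// Both buffers are assumed to be decoded RGBA data (4 bytes per pixel).
function countDifferingPixels(
  a: Uint8Array,
  b: Uint8Array,
  channelTolerance: number = 0,
): number {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new Error("buffers must be equal-length RGBA data");
  }
  let differing = 0;
  for (let i = 0; i < a.length; i += 4) {
    for (let c = 0; c < 4; c++) {
      if (Math.abs(a[i + c] - b[i + c]) > channelTolerance) {
        differing++; // count each pixel once, then move to the next pixel
        break;
      }
    }
  }
  return differing;
}
```

Under the stated interpretation, a sampled frame would fail when the count exceeds 2.
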
---
## 10. Open Questions
These questions were debated during the research passes and the design
draft and are deferred, either to a later V1 iteration or to V2. They
are tracked here so they don't get lost.
- **Interaction triggers.** V1 timelines are time-anchored. The Bannerflow
pattern of decoupling animation from text content supports localization
but not user interaction. When V2 introduces hover, click-to-expand,
or in-banner video, the schema gains an event-trigger model that block
references can hook into.
- **Custom eases.** GSAP's `CustomEase` is Club-only. If a designer ever
needs a custom curve, V2 either licenses Club GreenSock or implements
cubic-bezier eases via the free GSAP `power.in/out` family with bespoke
control points. The schema field is already `string`; this is purely an
authoring decision, not a schema change.
- **Mask shape authoring.** V1 supports rect, circle, and designer-supplied
polygons. Logo-shaped masks (SVG paths) work as `mask.geometry: "polygon"`,
but there is no authoring UI — designers paste path strings into the
template. V2 ships a small editor for this.
- **Per-block loops.** Not in V1. If a use case for per-block loops
emerges (a single icon that pulses 5× while the rest of the timeline
plays through once), it's expressible as multiple back-to-back blocks
in V1. Native per-block looping waits.
- **The 150-character ceiling on char-split.** This is the conservative
number from research. A production benchmarking pass on real hardware
(low-end Android, mid-tier iPhone) may push it higher or lower. Treat
the current ceiling as a placeholder; gate G5 should be calibrated
against empirical results.
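
The per-block loops workaround mentioned above, expanding one block into back-to-back copies, can be sketched as follows. The block shape and the `pulse` preset name in the usage note are hypothetical.

```typescript
// Hypothetical block shape; `start` and `duration` follow the schema's names.
interface LoopableBlockSketch {
  layer: string;
  preset: string;
  start: number;
  duration: number;
}

// Expands one block into `times` consecutive copies with absolute start times.
function repeatBlock(
  block: LoopableBlockSketch,
  times: number,
): LoopableBlockSketch[] {
  const copies: LoopableBlockSketch[] = [];
  for (let i = 0; i < times; i++) {
    copies.push({ ...block, start: block.start + i * block.duration });
  }
  return copies;
}
```

For example, `repeatBlock({ layer: "icon", preset: "pulse", start: 1, duration: 0.5 }, 5)` yields five blocks starting at 1, 1.5, 2, 2.5, and 3. The expanded blocks remain subject to G3, so the last copy must still end within `meta.duration`.
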
---
## 11. What This Replaces and What It Defers
This document replaces:
- The animation section of `ARCHITECTURE.md`. The original sketch
predates the preset library and the bounding-box concept.
- The implicit `fade_in / hold / fade_out` preset set in the vertical
slice. Those three names are kept (mapped to `fade_in`, no-op,
`fade_out`) but every other preset in the slice is replaced by an
explicit entry from Section 3.
This document does not specify:
- The timeline authoring UI (V2).
- The Figma sync path for animation specs (V2+).
- The trafficking-sheet representation of animation choices (V1, but
documented separately in `TRAFFICKING_V1.md` when that exists).
The next implementation step after this document is approved is to
flesh out `packages/types` with the schema in Section 2.2/2.3 and to
seed `packages/layout-engine` with `unionBoundingBox()`. Everything
else (preset library implementation, QA gates, runtime mechanics)
hangs off those two artifacts.

ANIMATION_V1_RESEARCH.md Normal file

File diff suppressed because one or more lines are too long

@@ -7,6 +7,29 @@
---
## Forward Notes (May 2026)
Two V1 design documents have been authored since this architecture was
written and should be read alongside it. Where the documents conflict,
the newer documents win.
- **`RESOLVED_FEED.md`** reshapes the copy-override portion of Part 7.
Copy edits flow through a sparse per-product-per-size feed upstream
of generation, not through patches applied downstream. See the
inline forward note in Part 7 for the specific scope.
- **`ANIMATION_V1.md`** supersedes the animation discussion in Parts 3,
4, and 6 of this document. It specifies the JSON schema, the 25-preset
library, the easing matrix, the mask system, the per-character
animation contract, the `unionBoundingBox` math for animated asset
sizing, and the 12 QA gates that enforce all of the above. The
`animation_presets: AnimationPreset[]` field on Template, the
`TimelineSpec` on ArtboardSpec, and the `animation_max_duration_ms`
/ `animation_max_loops` fields on AdServerProfile in Part 4 are
replaced by the schema in `ANIMATION_V1.md` §2. The
`animation_rationale` field on `BannerSpec.ai_reasoning` is retained.
---
## Part 1: Research Summary — Key Findings
### What the Research Confirmed
@@ -179,6 +202,14 @@ const STACK = {
// Whitelisted by major ad servers (not counted in weight budget)
// Guarantees frame synchronization across browser environments
// API mirrors the timeline data model in the spec
//
// Forward note (May 2026): The animation system is specified in
// detail in ANIMATION_V1.md — JSON schema, 25-preset library,
// easing matrix, mask system, per-character contract, bounding-box
// math, and QA gates. GSAP loads from s0.2mdn.net (CDN-whitelisted,
// weight-exempt) as gsap_3.9.1_min.js. SplitText (Club GreenSock)
// is replaced by the MIT-licensed SplitType library for
// per-character text animation.
},
assets: {