Simeon Schecter ccbdb47162 Day 5+6 of the vertical slice: multi-size + per-row strips

Expands the slice from a single 300x250 banner to four IAB sizes
(300x600, 300x250, 728x90, 160x600) driven by a designer-authored
TypeSystem and a per-row strip review surface.

Layout engine
- TypeSystem with role-based typography (headline/subheadline/cta/legal)
  and piecewise size-class derivation: half_page / rectangle /
  leaderboard / skyscraper / mobile_banner.
- resolveLayout now derives per-size font/leading from the role +
  artboard size, then clamps to a legibility floor and emits a
  constraint_signal when copy does not fit at the floor.
- Four reference templates with character constraints per size.

AI pipeline (Shape B)
- One extract + one generate per feed row; generate returns per-size
  copy keyed by artboard_id plus a shared rationale block.
- Constraint-signal retry: orchestrator tightens per-(artboard, field)
  limits and re-calls generate before giving up.
- orchestrateRow returns specs[] + rationale + constraint_signals.

Review UI
- /review renders one strip per feed row, all four sizes side-by-side
  at true pixel dimensions, synced on a single GSAP master timeline.
- AiReasoningDrawer shows a per-size copy table, shared rationale, and
  any constraint signals that fired.
- /api/generate response grouped by row; /api/export accepts the same
  shape and writes exports/row-N/artboard_id.zip.

Render worker
- render-to-zip / render-many accept optional subdir + filename
  overrides so multi-size exports can be grouped by feed row.

Docs
- VERTICAL_SLICE and BUILD_SEQUENCE updated for the multi-size scope.
- RESOLVED_FEED.md documents the V1 Resolved Creative Feed proposal.
- SLICE_DEVIATIONS.md records where the slice diverges from V1.

Tests: 56 pass (28 layout-engine + 14 api-lib + 14 render-worker).
Web app: tsc clean, next build succeeds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-05-18 14:22:26 -04:00

VERTICAL_SLICE.md

A 7-day buildable end-to-end demo that proves the thesis: humans design, AI scales, output is production-grade HTML5. This document replaces PHASE_1_BRIEF.md for the short-timeline build. The full BUILD_SEQUENCE.md remains the plan for extending the slice to V1 afterward.

What this is

The smallest possible system that does the real thing. Not a mockup, not a static demo — a working pipeline where a CSV feed and a designed template produce a real HTML5 banner that opens in a browser and animates correctly.

The thesis being proven:

The text group system works. Dropflow runs in both the browser preview and the render worker and produces identical layout.
The four-agent AI pipeline produces typed, validated banner specs from a structured feed.
The output is real HTML5, not a screenshot. It animates with GSAP, it has a working click tag, and it could be uploaded to an ad server.

Everything else is deliberately deferred.

Slice scope — what's in

One template, hand-built in code, no template builder UI.

Four artboards: 300x600 (reference), 300x250, 728x90, 160x600.
One TypeSystem authored on the 300x600 reference, declaring the four roles (headline, subheadline, cta, legal) with base sizes, weights, line-heights, tracking, colors, and legibility floors. The other three sizes derive their typography from this system via the type-scale formula.
Per-artboard layer composition: each size can declare which layers it includes (e.g., the 728x90 may drop the subheadline; the 160x600 may collapse headline+subheadline into headline only).
One text group per artboard containing the included text layers with push_siblings cascade.
One smart asset slot for a hero image (no variant groups — one asset per row, per-size crop via focal point).
One static logo (PNG, no variant logic).
One GSAP animation preset (fade in, hold, fade out), with per-size timing scale.
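A minimal sketch of what the TypeSystem and per-artboard layer composition could look like as TypeScript constants. Every field name and value here is an assumption made for illustration; the real interface lives in packages/types and may differ.

```typescript
// Hypothetical shape: field names are illustrative, not copied from packages/types.
type Role = "headline" | "subheadline" | "cta" | "legal";

interface RoleStyle {
  baseSizePx: number;        // size on the 300x600 reference artboard
  weight: number;
  lineHeight: number;        // unitless multiplier
  trackingEm: number;
  color: string;
  legibilityFloorPx: number; // derived sizes are never rendered below this
}

interface TypeSystem {
  referenceArtboard: string;
  fontFamily: string;
  roles: Record<Role, RoleStyle>;
}

// Placeholder values only; a designer would author the real numbers.
const demoTypeSystem: TypeSystem = {
  referenceArtboard: "300x600",
  fontFamily: "Inter",
  roles: {
    headline:    { baseSizePx: 40, weight: 700, lineHeight: 1.1, trackingEm: -0.01, color: "#111111", legibilityFloorPx: 14 },
    subheadline: { baseSizePx: 20, weight: 400, lineHeight: 1.3, trackingEm: 0,     color: "#333333", legibilityFloorPx: 11 },
    cta:         { baseSizePx: 16, weight: 600, lineHeight: 1.0, trackingEm: 0.02,  color: "#ffffff", legibilityFloorPx: 11 },
    legal:       { baseSizePx: 9,  weight: 400, lineHeight: 1.2, trackingEm: 0,     color: "#666666", legibilityFloorPx: 8 },
  },
};

// Per-artboard layer composition: which roles each size includes (assumed choices).
const includedRoles: Record<string, Role[]> = {
  "300x600": ["headline", "subheadline", "cta", "legal"],
  "300x250": ["headline", "subheadline", "cta", "legal"],
  "728x90":  ["headline", "cta", "legal"], // drops the subheadline
  "160x600": ["headline", "cta", "legal"], // headline only, no subheadline
};
```

Keeping the composition table separate from the TypeSystem means a size can drop a layer without touching the typography the other sizes inherit.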

One CSV feed.

3–5 rows. Columns: headline, subheadline, hero_image_url, cta_text, click_url, hero_focal_x, hero_focal_y.
Hand-authored, no validation rigor beyond "does it parse."
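The "does it parse" bar can be sketched as a small parser that checks the seven columns are present and the focal point is numeric. FeedRow, splitCsvLine, and parseFeed are hypothetical names, and the focal point is assumed to be normalized to 0..1. Quoted fields that span multiple lines are not handled.

```typescript
// Hypothetical row type; the column names come from the feed description above.
interface FeedRow {
  headline: string;
  subheadline: string;
  hero_image_url: string;
  cta_text: string;
  click_url: string;
  hero_focal_x: number; // assumed normalized 0..1
  hero_focal_y: number;
}

const COLUMNS = [
  "headline", "subheadline", "hero_image_url", "cta_text",
  "click_url", "hero_focal_x", "hero_focal_y",
] as const;

// Split one CSV line, honoring double-quoted fields and "" escapes.
function splitCsvLine(line: string): string[] {
  const cells: string[] = [];
  let cell = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (inQuotes) {
      if (ch === '"' && line.charAt(i + 1) === '"') { cell += '"'; i++; }
      else if (ch === '"') inQuotes = false;
      else cell += ch;
    } else if (ch === '"') inQuotes = true;
    else if (ch === ",") { cells.push(cell); cell = ""; }
    else cell += ch;
  }
  cells.push(cell);
  return cells;
}

function parseFeed(csv: string): FeedRow[] {
  const lines = csv.split(/\r?\n/).filter((l) => l.trim() !== "");
  const header = splitCsvLine(lines[0] ?? "");
  for (const col of COLUMNS) {
    if (!header.includes(col)) throw new Error(`missing column: ${col}`);
  }
  return lines.slice(1).map((line, n) => {
    const cells = splitCsvLine(line);
    const get = (col: string): string => cells[header.indexOf(col)] ?? "";
    const fx = Number(get("hero_focal_x"));
    const fy = Number(get("hero_focal_y"));
    if (Number.isNaN(fx) || Number.isNaN(fy)) {
      throw new Error(`row ${n + 1}: focal point is not numeric`);
    }
    return {
      headline: get("headline"),
      subheadline: get("subheadline"),
      hero_image_url: get("hero_image_url"),
      cta_text: get("cta_text"),
      click_url: get("click_url"),
      hero_focal_x: fx,
      hero_focal_y: fy,
    };
  });
}
```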

One AI pipeline, size-aware.

Four agents: extract → generate → route → assemble.
Extract runs once per row, producing an ExtractedContext.
Generate runs once per row and returns structured copy variants for all four sizes simultaneously (Shape B), respecting per-size character limits and the brand voice ("confident, modern, no exclamation marks").
Route node maps hero_image_url and focal point to the asset slot. No variant group logic.
Assemble produces a BannerSpec with four ArtboardSpecs, one per size.
Programmatic post-validation: per-size character count check. If a size overflows after shrink-to-fit hits the floor, the layout engine emits a constraint_signal and Generate is re-invoked with the tighter limit for that size. One retry, then flag as failed.
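The one-retry rule above can be sketched as follows. All names (ConstraintSignal fields, Limits, tightenLimits, generateWithRetry) are assumptions for illustration, and the generate and layout functions are injected so the sketch makes no claim about the real agent or engine signatures.

```typescript
// Illustrative names throughout; the real orchestrator types are not shown in this document.
interface ConstraintSignal {
  artboard_id: string;        // e.g. "728x90"
  field: string;              // e.g. "headline"
  max_chars_at_floor: number; // assumed: how many characters fit at the legibility floor
}

type Limits = Record<string, number>; // keyed by "artboard_id/field"
type Copy = Record<string, string>;   // same keys

const limitKey = (artboardId: string, field: string): string => `${artboardId}/${field}`;

// Pure step: tighten only the limits that a signal complained about.
function tightenLimits(limits: Limits, signals: ConstraintSignal[]): Limits {
  const next: Limits = { ...limits };
  for (const s of signals) {
    const k = limitKey(s.artboard_id, s.field);
    next[k] = Math.min(next[k] ?? s.max_chars_at_floor, s.max_chars_at_floor);
  }
  return next;
}

// One generate call, one retry with tightened limits, then flag as failed.
async function generateWithRetry(
  generate: (limits: Limits) => Promise<Copy>,
  layout: (copy: Copy) => ConstraintSignal[],
  limits: Limits,
): Promise<{ copy: Copy; signals: ConstraintSignal[]; failed: boolean }> {
  const first = await generate(limits);
  const firstSignals = layout(first);
  if (firstSignals.length === 0) return { copy: first, signals: [], failed: false };

  const retry = await generate(tightenLimits(limits, firstSignals));
  const retrySignals = layout(retry);
  return { copy: retry, signals: retrySignals, failed: retrySignals.length > 0 };
}
```

Only the sizes that overflowed get a tighter limit, so a leaderboard overflow does not shorten the half-page copy.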

One review UI — per-row strip.

Each feed row is a horizontal strip showing all four sizes at true pixel dimensions, side by side.
All four sizes in a row animate together via a synchronized GSAP timeline ("play all" per row, or "play all" across the grid).
Click any banner in a row to see the AI reasoning panel — including the editorial decisions across sizes.
No editing, no version control, no conflict resolution, no approval workflow.

One render path.

Playwright runs locally (not in Docker yet).
Produces four HTML5 zips per row (one per size), each with: index.html, GSAP CDN script tag, the resolved spec inlined as JS, one IAB click tag, assets, a backup PNG.

One ad server profile.

IAB Standard. CM360 deferred.

That's the slice. Same "ones" as before, but the template is now a four-artboard system with a designer-authored type system, and the AI adapts copy per surface.

Slice scope — what's out

Deliberately, to make the timeline real:

Template builder UI (templates are TypeScript objects in the slice)
Asset library (assets are URLs hardcoded in the CSV)
Variant groups, logo lockups, selection rules (multi-size hero crops use a focal point only)
Character limit simulator UI
Inline editing in the review UI
Version service, deltas, snapshots
Human overrides, conflict resolution
Approval workflow, role-based access
Auth (single-user, localhost)
Docker for the render worker
CM360 profile, Amazon DSP, Xandr
Trafficking sheet generator
QA gates as a separate package (one inline weight check is enough)
Brief mode (feed mode only)
Field locking, locked copy
Figma anything
Hosted deployment (runs on localhost)

When the slice is working, the next decision is which of these to add first. That decision is informed by what people say when they see the demo, not by guessing now.

Stack — minimum viable

Same locked decisions as V1, slimmed down:

Next.js 14 monorepo with apps/web, apps/api, apps/render-worker. Same structure as the full plan but most directories empty.
Packages: types, layout-engine only. prompts inlined in the api for now.
No tRPC. Plain Next.js API routes. tRPC is V1; the slice doesn't need the type bridge yet.
No Zustand. React useState is fine for the review grid.
No Drizzle. No database. Feed rows go straight into the AI pipeline, spec output goes straight to render. Add the DB when you extend to V1.
Anthropic SDK for AI calls.
Dropflow WASM in both the browser canvas and the render worker — this is non-negotiable, it's the thesis.
Konva for the canvas preview in the review grid.
Playwright for rendering, local install, not Docker.
GSAP loaded from CDN in the rendered HTML5.

Skipping the database is the biggest concession. It means no version history, no persistence across restarts, no real workflow. For the demo, this is fine — the demo runs end-to-end in one session.

Day-by-day plan

Numbers are realistic upper bounds for focused full-day sessions. If you finish a day early, advance the next day's work.

Day 1 — Foundation and the layout engine

The two highest-risk pieces. Build them first so if Dropflow doesn't behave the way the research says, you find out on day 1, not day 4.

Morning (3-4 hours):

Scaffold the monorepo. apps/web, apps/api, apps/render-worker as Next.js or Node entry points. packages/types, packages/layout-engine.
packages/types: implement BannerSpec, Template, Artboard, TextLayer, SmartAssetLayer, GroupLayer, TextBehaviorRules, PushSiblingRule, ResolvedLayer, TypographySpec. Skip everything campaign-related, version-related, Figma-related. ~10 interfaces total, copied straight from architecture doc Part 4.

Afternoon (3-4 hours):

packages/layout-engine: Dropflow WASM bootstrap. Get measureText(text, typography, maxWidth) working in Node. Three unit tests.
shrinkToFit function with tests.
applyPushSiblings function with tests.
resolveLayout top-level entry point — takes a BannerSpec and a copy map, returns ResolvedLayer[].
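A minimal sketch of shrinkToFit, assuming a step-down search with an injected measure function. The Measure signature here is simplified and hypothetical; it is not the measureText(text, typography, maxWidth) signature above, and roughMeasure is a stand-in for Dropflow that exists only so the example runs.

```typescript
// Simplified, hypothetical measure signature; the real one takes a typography object.
type Measure = (text: string, fontSizePx: number, maxWidthPx: number) => { heightPx: number };

interface FitResult { fontSizePx: number; fits: boolean }

// Step the font size down from startPx until the text fits, stopping at floorPx.
// stepPx must be positive.
function shrinkToFit(
  text: string,
  startPx: number,
  floorPx: number,
  maxWidthPx: number,
  maxHeightPx: number,
  measure: Measure,
  stepPx = 1,
): FitResult {
  let size = Math.max(startPx, floorPx);
  for (;;) {
    if (measure(text, size, maxWidthPx).heightPx <= maxHeightPx) {
      return { fontSizePx: size, fits: true };
    }
    if (size <= floorPx) return { fontSizePx: floorPx, fits: false };
    size = Math.max(floorPx, size - stepPx);
  }
}

// Stand-in for Dropflow: assumes every glyph is 0.5em wide and leading is 1.2.
const roughMeasure: Measure = (text, fontSizePx, maxWidthPx) => {
  const lines = Math.max(1, Math.ceil((text.length * fontSizePx * 0.5) / maxWidthPx));
  return { heightPx: lines * fontSizePx * 1.2 };
};
```

A result with fits set to false is the case where the caller would surface an overflow rather than keep shrinking.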

End-of-day check: A fixture BannerSpec with a text group resolves cleanly. layout_log is populated. All tests pass in Node.
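The push_siblings cascade can be pictured as a running vertical shift. This sketch assumes the group is a top-to-bottom stack and that siblings are only pushed down, never pulled up; StackItem and its fields are hypothetical and not the real PushSiblingRule.

```typescript
// Hypothetical stack item; items are assumed to be sorted top to bottom.
interface StackItem {
  id: string;
  y: number;              // designed top position
  designedHeight: number; // height in the template
  resolvedHeight: number; // height after text measurement
}

interface PlacedItem { id: string; y: number; height: number }

// When a layer grows past its designed height, every later sibling moves down by the overflow.
function applyPushSiblings(items: StackItem[]): PlacedItem[] {
  let shift = 0;
  return items.map((item) => {
    const placed = { id: item.id, y: item.y + shift, height: item.resolvedHeight };
    // Assumption: shrinking layers do not pull siblings up.
    shift += Math.max(0, item.resolvedHeight - item.designedHeight);
    return placed;
  });
}
```

For example, a headline that wraps from one line to two pushes both the subheadline and the CTA down by the same amount, so their designed spacing is preserved.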

If Dropflow doesn't work in Node, stop. This is the thesis. Fix it before continuing.

Day 2 — Browser parity and the AI pipeline

Morning (3-4 hours):

Get the same layout engine module loading in Next.js. WASM bundling is the gotcha here — next.config.js needs to be configured to serve WASM correctly. Verify by running the same fixture spec from Day 1 in the browser console and confirming the same ResolvedLayer[] output.
This is the "browser-Node parity" proof. If the numbers don't match, find out why before continuing.
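One way to make the parity proof mechanical is to dump the resolved layers from both environments as JSON and diff them field by field. LayerBox is a hypothetical, flattened stand-in for ResolvedLayer, and diffLayers is an illustrative helper, not part of the layout engine.

```typescript
// Hypothetical flattened view of a resolved layer.
interface LayerBox {
  id: string;
  x: number;
  y: number;
  width: number;
  height: number;
  fontSize: number;
}

// Returns a human-readable list of mismatches; an empty list means parity.
function diffLayers(nodeOut: LayerBox[], browserOut: LayerBox[], epsilon = 0): string[] {
  const problems: string[] = [];
  if (nodeOut.length !== browserOut.length) {
    problems.push(`layer count ${nodeOut.length} vs ${browserOut.length}`);
  }
  const byId = new Map(browserOut.map((l) => [l.id, l] as const));
  for (const a of nodeOut) {
    const b = byId.get(a.id);
    if (!b) {
      problems.push(`${a.id}: missing in browser output`);
      continue;
    }
    for (const k of ["x", "y", "width", "height", "fontSize"] as const) {
      if (Math.abs(a[k] - b[k]) > epsilon) problems.push(`${a.id}.${k}: ${a[k]} vs ${b[k]}`);
    }
  }
  return problems;
}
```

Starting with epsilon at 0 keeps the check strict; loosening it is a decision to make only after understanding why the numbers differ.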

Afternoon (3-4 hours):

Hand-author the template as a TypeScript constant in apps/api/templates/demo-300x250.ts. One artboard, one text group with headline+subheadline, one smart asset slot, GSAP timeline preset, hardcoded typography.
Hand-author the feed CSV. 5 rows, real product copy.
apps/api/services/ai-orchestration/: implement the four agents.
- Extract agent: prompt parses a CSV row into a typed ExtractedContext (real Claude API call, real prompt).
- Generate agent: prompt rewrites the copy against character limits and brand voice (real call).
- Route node: trivial mapping, no AI. Maps hero_image_url to the asset slot.
- Assemble agent: produces the final BannerSpec (one more real call, or inline if simple enough).
Programmatic post-validation: character count check, retry once if over limit.

End-of-day check: A CSV row goes in, a typed BannerSpec comes out. AI reasoning fields are populated. Run it on all 5 rows.

Day 3 — Review grid and the canvas preview

Morning (3-4 hours):

apps/web/app/review/page.tsx: triggers the AI pipeline on page load with the 5 feed rows, gets back 5 BannerSpecs.
Render each spec to a Konva canvas. Use the same resolveLayout from Day 1. The canvas reads the resolved layers and draws text + image + shapes.
Layout grid: 5 banners in a single row or a 2-3 grid.
Click a banner → side panel shows the AI reasoning fields and the resolved spec JSON (pretty-printed).

Afternoon (3-4 hours):

GSAP timeline on the canvas preview. Each banner animates per the timeline preset on load. A "play all" button that synchronizes them.
Polish: loading state while AI is generating, error states if a row fails validation twice (shows the row with a "skipped — copy too long" label).

End-of-day check: A producer-shaped user could see this and immediately understand what the product does. They see 5 banners generated from 5 feed rows, animating, with reasoning visible.

Day 4 — Render worker and HTML5 export

Morning (3-4 hours):

apps/render-worker/: Playwright local. A function that takes a BannerSpec, opens a static HTML runtime template in Playwright, fills it with the spec data, waits for fonts and first paint, takes a backup PNG screenshot.
Build the HTML5 runtime template: a static index.html with GSAP loaded from CDN, an inline <script> slot for the spec JSON, an inline <script> for the GSAP timeline assembly, IAB-standard click tag pattern (var clickTag = ""; <element onclick="window.open(clickTag, '_blank')">).
The same Dropflow module computes the resolved layout at runtime in the rendered HTML, OR (simpler) the resolved layout is baked into the spec at AI-generation time and the runtime just renders coordinates. Pick the second for the slice — it's simpler and proves the same thing because the layout was already computed by the same engine.
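A rough sketch of assembling index.html for the baked-layout option, written as a TypeScript string builder. buildIndexHtml, the SPEC global, and the banner element id are hypothetical; the GSAP script URL is passed in rather than assumed, and gsapSrc is treated as trusted input.

```typescript
// Hypothetical builder: inlines the resolved spec and the click tag into a static shell.
function buildIndexHtml(spec: unknown, clickUrl: string, gsapSrc: string): string {
  // Escape "<" so copy containing "</script>" cannot terminate the inline script.
  const safeJson = (value: unknown): string =>
    JSON.stringify(value).replace(/</g, "\\u003c");

  return [
    "<!DOCTYPE html>",
    '<html><head><meta charset="utf-8">',
    `<script src="${gsapSrc}"></script>`,
    // Click tag pattern from the runtime template description above.
    `<script>var clickTag = ${safeJson(clickUrl)};</script>`,
    "</head><body>",
    `<div id="banner" onclick="window.open(clickTag, '_blank')"></div>`,
    // The runtime reads SPEC and draws precomputed coordinates; no layout happens here.
    `<script>var SPEC = ${safeJson(spec)};</script>`,
    "</body></html>",
  ].join("\n");
}
```

Because the layout is already resolved, the runtime script only has to position elements and build the GSAP timeline from the spec.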

Afternoon (3-4 hours):

Zip composition: index.html, the inlined spec, the backup PNG, any local image assets, written to a zip on disk.
"Export" button in the review UI: triggers the render worker for all 5 banners, produces 5 zips, drops them in a ./exports/ directory.
Manual verification: unzip one, open index.html in a browser, confirm it animates, confirm clicking the CTA opens the click URL.

End-of-day check: Real HTML5 zips on disk. One of them opens in a browser and animates correctly. The click tag works.

Day 5 — Design system + 300x600 reference template

Morning (3-4 hours):

Implement TypeSystem interface in packages/types.
Implement type-scale.ts module in packages/layout-engine with deriveTypeSpec(system, target) and the size-class piecewise formula.
Wire the shrink-then-constrain loop in resolve-layout.ts: shrink to 85% of derived size, then emit constraint_signal if still overflowing at the legibility floor.
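A sketch of what a piecewise size-class derivation might look like. The thresholds, the scale factors, and the function names classifySize and deriveFontSize are all assumptions for illustration; the actual formula in type-scale.ts is not reproduced in this document.

```typescript
type SizeClass = "half_page" | "rectangle" | "leaderboard" | "skyscraper" | "mobile_banner";

// Assumed thresholds: classify by aspect ratio first, then by height.
function classifySize(width: number, height: number): SizeClass {
  const aspect = width / height;
  if (aspect >= 4) return height <= 60 ? "mobile_banner" : "leaderboard";
  if (aspect <= 0.35) return "skyscraper";
  if (height >= 500) return "half_page";
  return "rectangle";
}

// Assumed scale factors relative to the 300x600 reference.
const SCALE: Record<SizeClass, number> = {
  half_page: 1.0,
  rectangle: 0.8,
  skyscraper: 0.65,
  leaderboard: 0.6,
  mobile_banner: 0.45,
};

// Scale the reference size for the target class, then clamp to the legibility floor.
function deriveFontSize(baseSizePx: number, floorPx: number, target: SizeClass): number {
  return Math.max(floorPx, Math.round(baseSizePx * SCALE[target]));
}
```

With these placeholder numbers, a 40px reference headline becomes 24px on the leaderboard, and any role whose scaled size would drop under its floor is held at the floor instead.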

Afternoon (3-4 hours):

Author the 300x600 half-page template against the TypeSystem. Hero treatment, logo position, headline/subhead/CTA placement, GSAP timeline. This is the design-system reference — the visual standard the other three sizes inherit from.
First render. React, adjust, lock the design system.

End-of-day check: 300x600 banners render with type derived from the system. One row generates a polished half-page banner.

Day 6 — The other three sizes + size-aware AI

Morning (3-4 hours):

Author 300x250 by adapting the 300x600 design.
Author 728x90 (the hard one — vertical space is tight; may drop the subheadline).
Author 160x600 (narrow; constraints stress-test the formula).

Afternoon (3-4 hours):

Upgrade the AI pipeline: Generate agent emits structured per-size copy variants. Orchestrator handles the constraint-signal retry.
Update the post-validation logic for per-size character counts.
Run the pipeline against the demo CSV. All four sizes should generate from each row.

End-of-day check: Four banners per row, all sizes generated from one Generate call per row, all fitting their constraints.

Day 7 — Per-row strip review UI, multi-size render, polish, demo

Morning (3-4 hours):

Rebuild /review as per-row strips. Each feed row shows all four sizes at true pixel dimensions. GSAP timelines synchronized per row.
Reasoning panel reframed to show editorial decisions across sizes (which words survived the leaderboard cut, why).
Update the render worker to loop over sizes per row, organizing exports as exports/row-N/SIZE.zip.

Afternoon (3-4 hours):

Visual polish on the strip layout. Dieter Rams adjacent.
Record the demo. Voiceover: one campaign intent → four surfaces → AI adapts copy per size. 90 seconds, tight cut.
Update README and technical summary.

End-of-day check: Working demo of four-size adaptation, a recording, a written summary.

Slim file structure

banner-studio/
├── apps/
│   ├── web/                          Next.js — review grid
│   │   ├── app/
│   │   │   ├── review/page.tsx
│   │   │   └── layout.tsx
│   │   └── lib/
│   │       └── layout-bridge.ts      Calls @banner-studio/layout-engine in browser
│   ├── api/                          Next.js API routes (in apps/web for the slice)
│   │   └── app/api/
│   │       ├── generate/route.ts     Feed → 5 BannerSpecs
│   │       └── export/route.ts       BannerSpec → zip on disk
│   ├── api-lib/                      Shared backend logic (not its own app)
│   │   ├── ai-orchestration/
│   │   │   ├── extract-agent.ts
│   │   │   ├── generate-agent.ts
│   │   │   ├── route-node.ts
│   │   │   ├── assemble-agent.ts
│   │   │   └── orchestrator.ts
│   │   ├── templates/
│   │   │   └── demo-300x250.ts       The hand-authored template
│   │   ├── prompts/
│   │   │   ├── extract.ts
│   │   │   ├── generate.ts
│   │   │   └── assemble.ts
│   │   └── claude-client.ts
│   └── render-worker/                Local Playwright, not Docker
│       ├── render.ts                 BannerSpec → zip on disk
│       └── runtime-template.html     The HTML5 banner shell
├── packages/
│   ├── types/                        ~10 interfaces only
│   │   └── src/index.ts
│   └── layout-engine/                Dropflow + shrink-to-fit + push-siblings
│       └── src/
│           ├── dropflow-wrapper.ts
│           ├── shrink-to-fit.ts
│           ├── push-siblings.ts
│           ├── resolve-layout.ts
│           └── index.ts
├── feeds/
│   └── demo.csv                      5 rows
├── exports/                          Output zips land here
├── package.json
├── CLAUDE.md                         Updated for slice scope (see below)
└── README.md

Running apps/api as a separate Node service is V1. For the slice, the API routes live inside the Next.js web app, which means one less process to manage.

CLAUDE.md addendum for the slice

When you start the slice build, prepend this to CLAUDE.md before the existing content:

## SLICE SCOPE — read this first

You are building a vertical slice of the platform, not V1. The architecture document
and BUILD_SEQUENCE.md describe the full V1. This slice is deliberately narrower.

What's in the slice: one template (hardcoded in TypeScript, four artboards sharing
one TypeSystem), one feed format (CSV), one AI pipeline (four agents, real Claude
calls), one review UI (per-row strips, Konva canvas preview, no editing), one render
path (Playwright local), one ad server profile (IAB).

What's out: template builder UI, asset library, variant groups, version control,
human overrides, conflict resolution, approval workflow, auth, Docker, CM360,
trafficking sheet, brief mode, Figma. None of these exist in the slice.

When in doubt: build less, ship the demo, extend after. If you find yourself adding
a feature not in VERTICAL_SLICE.md, stop and confirm.

The full V1 plan is preserved in BUILD_SEQUENCE.md for after the slice ships.

The rest of CLAUDE.md stays as-is — locked stack decisions, what not to do, the text group system as the keystone, the service ownership table (relevant even though services are collapsed in the slice — it shapes how the modules talk to each other inside the same app).

Where this will probably go wrong

Dropflow WASM in Next.js. Bundling WASM for Next.js is notoriously fiddly. Two patterns work: copy the WASM file to /public and load via fetch, or configure webpack's experiments.asyncWebAssembly in next.config.js. Try the second first. Budget half a day for this if it fights you, and don't let it eat more than the Day 2 morning.
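The second pattern amounts to flipping one webpack experiment flag. The helper below is a sketch with a hypothetical name (withAsyncWasm) and a deliberately loose config type; whether Dropflow's WASM loads cleanly with only this flag is untested here.

```typescript
// Loose stand-in for the webpack config object that Next.js passes to its webpack hook.
interface WebpackLikeConfig {
  experiments?: Record<string, unknown>;
  [key: string]: unknown;
}

// Returns a copy of the config with async WebAssembly enabled,
// preserving any experiments that were already set.
function withAsyncWasm(config: WebpackLikeConfig): WebpackLikeConfig {
  return {
    ...config,
    experiments: { ...(config.experiments ?? {}), asyncWebAssembly: true },
  };
}

// Intended use from next.config.js (the webpack hook must return the config):
//   module.exports = { webpack: (config) => withAsyncWasm(config) };
```

If this route fights back, the /public plus fetch pattern sidesteps the bundler entirely at the cost of loading the WASM file by hand.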

Playwright local fonts. If the fonts on your machine differ from the fonts the canvas preview uses, your preview won't match your render. Use system-available fonts only in the slice template (Inter, Helvetica Neue, system-ui) and confirm Playwright sees the same ones. Save the font Docker setup for V1.

The four-agent pipeline taking too many Claude calls. Each feed row = 3 Claude calls (extract, generate, assemble; route is no-AI), plus one more generate call whenever a constraint signal triggers the retry. 5 rows = 15 calls per generation before retries. With Opus this gets slow and rate-limit-prone. Use Sonnet 4 across all three agents in the slice. Quality is good enough for the demo. Save the model-per-agent tiering for V1.

Wanting to add the template builder. The pull will be strong. Resist. The slice's value is that a producer-shaped person sees what the product does without seeing how it's built. The template builder is the second most impressive piece of V1, but it's also where weeks go.

Underestimating Day 7 polish. "Visual polish on the strip layout" sounds like an afternoon. It is not. The difference between "localhost test page" and "I would show this to a CD" is real work. Give it a full day. If you have to cut, cut Day 4's zip composition polish or one of the AI agents' prompt iteration, not Day 7.

Type scale calibration on the constrained sizes. The 728x90 and 160x600 are where the formula gets stress-tested. If the legibility floor pushes the system into emitting constraint signals on every row, the AI ends up writing very short copy and the demo feels less impressive. Tune the floors so they trigger on edge cases, not the common case. Budget time on Day 6 morning to iterate on the size-class math.

Hero image cropping across aspect ratios. A wide landscape hero shot fails in 160x600 tall format. Pick demo hero images with centered subjects and breathable composition that survive aggressive cropping in multiple directions. The focal-point hint in the CSV is the primary lever; use it.
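The focal-point lever can be sketched as a cover-style crop: take the largest rectangle with the artboard's aspect ratio, center it on the focal point, and clamp it to the image bounds. focalCrop is a hypothetical helper and the focal coordinates are assumed to be normalized to 0..1.

```typescript
interface CropRect { x: number; y: number; width: number; height: number }

// Largest crop with the slot's aspect ratio, centered on the focal point where possible.
function focalCrop(
  imgW: number,
  imgH: number,
  slotW: number,
  slotH: number,
  focalX: number, // assumed normalized 0..1
  focalY: number,
): CropRect {
  const slotAspect = slotW / slotH;
  let cropW = imgW;
  let cropH = imgH;
  if (imgW / imgH > slotAspect) cropW = imgH * slotAspect; // image too wide: trim the sides
  else cropH = imgW / slotAspect;                           // image too tall: trim top and bottom

  const clamp = (v: number, lo: number, hi: number): number => Math.min(hi, Math.max(lo, v));
  return {
    x: clamp(focalX * imgW - cropW / 2, 0, imgW - cropW),
    y: clamp(focalY * imgH - cropH / 2, 0, imgH - cropH),
    width: cropW,
    height: cropH,
  };
}
```

For a 1200x800 landscape image in the 160x600 slot, only a strip about 213px wide survives, which is why centered subjects and an accurate focal point matter so much for the skyscraper.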

After the slice

If the demo lands, the decision tree opens up. Three plausible next moves:

Extend to V1. Resume BUILD_SEQUENCE.md at Phase 4 (canvas) since Phases 1–3 are mostly done in the slice. Roughly 4-6 weeks of focused work to a complete V1.
Productize the slice. Add auth, hosting, and a real template builder UI. Skip the full V1 spec. Smaller scope, faster to a usable thing for one client, less defensible long-term.
Hold the slice as a reference build. Ship the demo internally, document what was proven, move on. The slice stands as evidence the architecture works; whether it becomes a larger product is a separate decision driven by what conversations the demo opens up.

The slice is designed so all three of these remain open. Don't decide which until the demo is in front of real eyes.
