Social Reporting V2
V2 is the production TikTok social-listening tool that replaced V1 in-place at
https://optical-dev.oliver.solutions/social-reports/. It takes a brand brief
in, runs a 10-stage scrape → analyse → synthesise pipeline, and produces a
React dashboard plus a single-file claude.ai HTML bundle for handover.
V2 exists to fix three concrete things V1 got wrong:
- Asset linking. V1 joined transcripts/comments/covers to videos by URL
  string. Different Apify actors return slightly different URL forms, so a
  single normalisation drift silently nulled the asset and trends ended up
  citing the wrong video. V2 keys everything by canonical TikTok numeric id
  (extractTikTokId) and is loud about drift.
- Hashtag scrape junk. V1 had no engagement floor. Reports decayed under
  low-quality hashtag noise. V2 has per-brief min_likes, min_plays,
  min_stl_pct knobs applied both Apify-side and locally.
- Single-user app. V1 was one shared DASH_USER/DASH_PASS login. V2 has
  Azure AD SSO, real users, teams, roles, and super-admin elevation.
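To illustrate the asset-linking fix, here is a minimal sketch of keying assets by numeric id instead of URL string. The real extractTikTokId is not shown in this README, so the regex, the accepted URL forms, and the joinById helper below are all assumptions made for illustration.

```typescript
// Hypothetical sketch: the real extractTikTokId may accept other URL forms.
export function extractTikTokId(input: string): string | null {
  const trimmed = input.trim();
  // Input is already a bare numeric id.
  if (/^\d{15,20}$/.test(trimmed)) return trimmed;
  // Path form .../video/<id>, followed by a slash, query, fragment, or end.
  const m = trimmed.match(/\/video\/(\d{15,20})(?:[/?#]|$)/);
  return m?.[1] ?? null;
}

// Hypothetical helper: join assets to selected videos by canonical id and
// return unmatched assets as "drift" so the caller can log them loudly
// instead of silently dropping them.
export function joinById<A extends { url: string }>(
  videoIds: string[],
  assets: A[],
): { matched: Map<string, A>; drift: A[] } {
  const wanted = new Set(videoIds);
  const matched = new Map<string, A>();
  const drift: A[] = [];
  for (const asset of assets) {
    const id = extractTikTokId(asset.url);
    if (id !== null && wanted.has(id)) matched.set(id, asset);
    else drift.push(asset);
  }
  return { matched, drift };
}
```

Because both sides reduce to the same numeric id, differences such as a trailing slash or a query string no longer break the join.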
Architecture
                                        +-------------------+
                                        |  Azure AD (SSO)   |
                                        +---------+---------+
                                                  | OIDC tokens
                                                  v
+---------+        +------------+         +-------+-------+         +-----------------+
| Browser | -----> | Apache 2.4 | ------> |    app-v2     | ------> |  Anthropic API  |
| (SPA)   |  HTTPS | (vhost,    |  :3457  |    Node 20    |  HTTPS  |  (Claude CLI)   |
|         | <----- | /social-   | <-----  |  TypeScript   | <-----  +-----------------+
+---------+        |  reports)  |         |               |
                   +------------+         |               | ------> +-----------------+
                         |                |               |  HTTPS  |      Apify      |
                         v                |               | <-----  | (TikTok actors) |
                 shared optical-dev       |               |         +-----------------+
                                          +---+---+-------+
                                              |   |
                                       :5437  |   | bind-mount
                                              v   v
                                         +-----+ +-------------+
                                         | db- | | ../briefs/  |
                                         | v2  | | (host fs)   |
                                         |Pg16 | | per-report  |
                                         +-----+ |  artefacts  |
                                                 +-------------+
+----------------------------------------------------------------+
| Compose project: social-reporting-v2 (CLAUDE.md compose-name |
| policy — separate name from V1 to avoid container/volume |
| collision on the shared optical-dev host) |
+----------------------------------------------------------------+
| Component | Where | Why |
|---|---|---|
| app-v2 container | Dockerfile.v2, port 3457 | Single Node process: HTTP API + SPA static host + spawned pipeline child |
| db-v2 container | postgres:16-alpine, port 5437 | Separate DB so V2 can be torn down without touching V1's data |
| Apache vhost | shared optical-dev | /social-reports/ alias points at 127.0.0.1:3457 |
| briefs/ host dir | ../briefs/ mounted into the container | Pipeline writes per-report artefacts here; React dashboard reads from here at build time; survives container rebuilds |
| Operator SPA | operator-app/dist/ | Vite build inlined into the same container, served at /social-reports/ |
| Per-report dashboard SPA | templates/dashboard_template/dist/ | One bundle, parameterised by report id at runtime, served at /api/reports/:id/dashboard/ |
Repo layout
v2/
db/init.sql # forward-only schema (users, teams, briefs, reports, videos, video_assets, manifest_checks, trends, ...)
deploy/ # setup-v2.sh, deploy-v2.sh, cutover-in-place.sh, rollback-to-v1.sh
Dockerfile.v2 # two-stage: builds operator-app + dashboard SPA, then runs server
docker-compose.v2.yml # name: social-reporting-v2 (mandatory)
docker-compose.v2.prod.yml # prod overrides
operator-app/ # React 18 + Vite + TS + Tailwind: login, briefs, reports, teams, admin, help
server/ # HTTP API: routes/, db/, auth/, middleware/, schemas/
pipeline/ # 10-stage TS pipeline: cli.ts + stages/stage_N_*.ts + lib/
templates/dashboard_template/ # per-report dashboard scaffold (React + Recharts), built per-report
examples/ # demo briefs (Dove, etc.)
The pipeline
brief.json
|
v
+----------+ Stage 1 (Claude) ---> seeds.json
| seed | anchor / discovery / edge hashtags
| | + handles + search terms
+----------+
|
v
+----------+ Stage 2 (Apify, 4 actors in parallel) ---> pass1/pass1_videos.json
| pass1 | each seed -> hashtag/profile/search actor ---> pass1/spend_log.json
| scrape | engagement floor applied (min_likes, min_plays, min_stl_pct) pass1/raw/<run_id>.json
+----------+ soft-cap at 50% of brief.budget_usd (each actor's raw dump)
|
v
+----------+ Stage 3 (filter) ---> pass2/selected_video_ids.json
| recipe | match recipe A/B/C/D from brief.business_question ---> pass2/selection_rules.json
| select | apply filter expression to pass1
+----------+
|
v
+----------+ Stage 4 (Apify + ffmpeg + translate, 8 in flight per video) ---> enriched/<id>/
| pass2 | bulk TIKTOK_TRANSCRIPTS for selection metadata.json
| enrich | bulk TIKTOK_COMMENTS for selection cover.jpg
| | per-video: download mp4, ffmpeg frames, translate to en transcript.json
| | joins by canonical id (extractTikTokId), drift logged loudly comments.json
| | bundle.json is the LAST write per video (Stage 6 reads only it) frames/0001.jpg ...
+----------+ bundle.json
|
v
+----------+ Stage 5 (manifest gate, HARD) ---> manifest.json
| validate | walks selected x asset_kinds, checks file exists +
| | non-zero + Zod-valid + content-valid (transcript >=1 word,
| | comments >=5, frames >=1 jpg, cover >=10 KB, bundle.json valid)
| | on coverage<100 with --drop-failing: backfill from pass1
| | next-best ranks; if STILL <100 after 1 round, throws HardGateError
+----------+
| coverage == 100%
v
+----------+ Stage 6 (Claude per video, 8 concurrent) ---> analysis/<id>.json
| analyse | rubric: per-video JSON (hook, visual, audio, narrative,
| | audience, paid_or_organic) — Zod-validated
+----------+
|
v
+----------+ Stage 7 (Claude single call) ---> atomic_insights.json
| insights | rubric: extract atoms (hook patterns, visual motifs,
| | audio motifs, narrative arcs) across the set
+----------+
|
v
+----------+ Stage 8a (Claude) ---> categories.json
| trends | Stage 8b (Claude) ---> trends.json (with relevance: core|peripheral)
| + 8c | Stage 8b.5: per-trend relevance scoring (Claude)
| lenses | Stage 8c — lens artefacts: Hooks Library, Visual Vernacular,
| | Audio Atlas, Sentiment Map (4 small Claude calls)
+----------+
|
v
+----------+ Stage 9 ---> qa/paid_organic_review.json
| qa | no-Claude programmatic gates (paid/organic distribution
| | + coverage + manifest invariants); HALTS HERE awaiting
| | CM + Strategist sign-offs (two-different-humans gate)
+----------+
| both sign-offs landed
v
+----------+ Stage 10 ---> outputs/dataset_v2.json
| build | ---> dashboard/dist (vite build of templates/dashboard_template
| | with dataset_v2.json + per-id covers copied in)
| | ---> outputs/dashboard.html (single-file claude.ai bundle,
| | covers base64-inlined, capped at 3 MB)
| | ---> compare/* (only if brief.prior_report_id set; MoM compare per V3 §16)
+----------+
|
v
Report ready
Each stage writes a .state/stage{N}.done sentinel containing an inputs hash.
Reruns skip a stage if the hash matches; --force invalidates.
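The sentinel mechanism can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's actual code: the function names, the use of SHA-256, and storing the bare hex digest as the sentinel body are all hypothetical. Only the .state/stage{N}.done path and the skip-on-match behaviour come from the text above.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Hash stage inputs deterministically by sorting top-level keys first.
// (Assumption: SHA-256 over a JSON encoding; nested key order is not normalised.)
export function hashInputs(inputs: Record<string, unknown>): string {
  const canonical = JSON.stringify(
    Object.keys(inputs).sort().map((key) => [key, inputs[key]]),
  );
  return createHash("sha256").update(canonical).digest("hex");
}

function sentinelPath(reportDir: string, stage: number): string {
  return join(reportDir, ".state", `stage${stage}.done`);
}

// A stage is skipped only when a sentinel exists and its hash matches;
// force bypasses the check, mirroring --force.
export function shouldSkipStage(
  reportDir: string,
  stage: number,
  inputs: Record<string, unknown>,
  force = false,
): boolean {
  if (force) return false;
  const path = sentinelPath(reportDir, stage);
  if (!existsSync(path)) return false;
  return readFileSync(path, "utf8").trim() === hashInputs(inputs);
}

// Written as the last step of a successful stage.
export function markStageDone(
  reportDir: string,
  stage: number,
  inputs: Record<string, unknown>,
): void {
  mkdirSync(join(reportDir, ".state"), { recursive: true });
  writeFileSync(sentinelPath(reportDir, stage), hashInputs(inputs));
}
```

Editing the brief changes the inputs hash for the affected stages, so those stages rerun while stages with unchanged inputs are skipped.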
Multi-tenancy & auth
+-------------------+
| Azure AD tenant |
+---------+---------+
| OIDC redirect
v
+----------+ /api/sso/token-exchange +-----------+
| /login | ---------------------------------> | server.ts |
+----------+ +-----+-----+
|
v
+----------+----------+
| upsertUserFromSso() | matches azure_oid -> users row
+----------+----------+
|
v
+----------+----------+
| ensureUserHasTeam() | creates personal team on first sign-in
+----------+----------+
|
v
signSession(HMAC)
|
v
cookie -> /api/me
+-------+ +--------------+ +---------+ +----------+
| users | --1:N--| memberships |--N:1--> | teams | --1:N--| briefs |
+-------+ +--------------+ +---------+ +----+----+
role enum |
(owner/admin/ v 1:N
editor/viewer) +----+----+
| reports |
+---------+
Single auth gate is require-team-role.ts: super-admin bypass → membership
lookup → role check. Brief and report routes resolve brief_id → team_id → membership; viewer for reads, editor for mutations.
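The gate order described above (super-admin bypass, then membership lookup, then role check) can be sketched as a pure function. The types, the canAccess name, and the numeric role ranking are assumptions for illustration; the actual require-team-role.ts middleware is not reproduced here.

```typescript
type Role = "owner" | "admin" | "editor" | "viewer";

// Assumed privilege ordering: a higher rank satisfies any lower requirement.
const ROLE_RANK: Record<Role, number> = { viewer: 0, editor: 1, admin: 2, owner: 3 };

interface User {
  id: string;
  isSuperAdmin: boolean;
}

interface Membership {
  userId: string;
  teamId: string;
  role: Role;
}

// Hypothetical gate: returns true when the user may act on the team's
// resources at the required role level.
export function canAccess(
  user: User,
  teamId: string,
  required: Role,
  memberships: Membership[],
): boolean {
  // 1. Super-admin bypass.
  if (user.isSuperAdmin) return true;
  // 2. Membership lookup for this user and team.
  const membership = memberships.find(
    (m) => m.userId === user.id && m.teamId === teamId,
  );
  if (!membership) return false;
  // 3. Role check.
  return ROLE_RANK[membership.role] >= ROLE_RANK[required];
}
```

Under this sketch a read route would pass "viewer" as the required role and a mutating route would pass "editor", after resolving brief_id to team_id.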
BOOTSTRAP_SUPER_ADMIN_EMAIL env var promotes one named user to super-admin
on first SSO sign-in. Sticky after that.
Password fallback (ALLOW_PASSWORD_FALLBACK) is off by default in prod —
emergency-only.
Operating
Routine deploy
ssh optical-dev
cd /opt/social-reporting
git pull
./v2/deploy/deploy-v2.sh
The script chowns briefs/ to uid 1000 (the in-container user), rebuilds the
stack via docker compose -p social-reporting-v2 ... up -d --build, waits
for /api/health, and reloads Apache.
Debugging
docker compose -p social-reporting-v2 logs --tail 300 -f app-v2
docker logs social-reporting-v2-app-v2-1 2>&1 | grep -E '\[run|error'
docker compose -p social-reporting-v2 exec db-v2 psql -U srv2_user social_reporting_v2
Cancelling a run
The run page has a Cancel button while non-terminal. It SIGTERMs the whole
process group (tsx + Claude CLI + ffmpeg + Apify polls all stop together) and
marks the row failed. Already-completed stages are preserved on disk via
.state/stage{N}.done sentinels, so "Cancel + edit brief + Force re-run"
works without re-paying for finished stages.
If the server has restarted since the run was triggered, the child handle is no longer in scope. Cancel still works: it marks the row failed with "no running process — likely orphaned by a server restart". The server also sweeps such orphans on boot.
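A minimal sketch of the cancel path, assuming the pipeline child was spawned with Node's `detached: true` option so that it leads its own process group. The cancelRun name, the CancelOutcome type, and the injectable kill parameter are hypothetical; only the SIGTERM-the-group behaviour and the orphan case come from the text above.

```typescript
import type { ChildProcess } from "node:child_process";

type KillFn = (pid: number, signal: NodeJS.Signals) => void;

export type CancelOutcome = "signalled" | "orphaned";

// Hypothetical cancel helper. `kill` is injectable so the logic can be
// exercised without spawning a real process.
export function cancelRun(
  child: ChildProcess | null,
  kill: KillFn = (pid, signal) => {
    process.kill(pid, signal);
  },
): CancelOutcome {
  // No handle (for example after a server restart) or the child already
  // exited: nothing to signal, the caller just marks the row failed.
  if (
    child === null ||
    child.pid === undefined ||
    child.exitCode !== null ||
    child.signalCode !== null
  ) {
    return "orphaned";
  }
  // On POSIX a negative pid addresses the whole process group, so tsx and
  // every grandchild it started receive SIGTERM together.
  kill(-child.pid, "SIGTERM");
  return "signalled";
}
```

In both outcomes the caller marks the row failed; "orphaned" only changes the message recorded with it.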
Cutover / rollback
V1 source still lives in agents/social-listening/ for rollback. Apache
points at one stack at a time (V1 = port 3456, V2 = port 3457). Switching is
an alias change + reload. See v2/deploy/cutover-in-place.sh and
v2/deploy/rollback-to-v1.sh.
Common pitfalls
- geo: "UK" is invalid for Apify. Apify uses ISO country codes, so the correct value is GB. The brief schema auto-normalises UK -> GB (and Stage 2 normalises again as belt-and-braces). Briefs created before this fix may need a re-save.
- APIFY_LIVE_APPROVED must be true in the container env to run real scrapes. Without it the actor wrapper returns { status: 'DRY_RUN' } and Stage 2 throws upfront so you don't wonder where the videos went.
- Pass-1 budget cap is 50% of brief.budget_usd. Stage 4 used to inherit that cap and skip every actor; it now releases the soft cap and stays bounded by the hard ceiling (95% of budget).
- Compose name policy. The compose file MUST start with name: social-reporting-v2. Without it, on the shared optical-dev server it would collapse onto the parent-directory project name and stomp V1's containers and volumes.
- Cost events are persisted by cli.ts. Stages must NOT register their own onApifyCost callback: that overwrites the CLI's DB writer and silently drops every Apify cost row. (This bit us once on Stage 2.)
- Stage 4/6 concurrency. Both default to 8 in flight; override with the STAGE4_CONCURRENCY / STAGE6_CONCURRENCY env vars when constrained.
- MoM compare fails loudly. Setting prior_report_id to a non-existent report id makes Stage 10 throw rather than silent-skip. By design (V3 §16).
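The two budget bounds in the pitfalls above work out as simple arithmetic. The sketch below is an assumption about how such a check could look, not the pipeline's actual code; the Phase type and both function names are hypothetical. Only the 50% soft cap and the 95% hard ceiling come from the text.

```typescript
// "pass1" is the Stage 2 scrape; "pass2" is Stage 4 enrichment.
type Phase = "pass1" | "pass2";

// Spend ceiling in USD for a phase, as a share of brief.budget_usd.
export function spendCeilingUsd(budgetUsd: number, phase: Phase): number {
  // Pass 1 is soft-capped at 50%; pass 2 is bounded only by the 95% hard ceiling.
  return budgetUsd * (phase === "pass1" ? 0.5 : 0.95);
}

// True when an actor run with the given estimated cost still fits.
export function canSpend(
  spentUsd: number,
  nextCostUsd: number,
  budgetUsd: number,
  phase: Phase,
): boolean {
  return spentUsd + nextCostUsd <= spendCeilingUsd(budgetUsd, phase);
}
```

With a 100 USD budget and 50 USD already spent in pass 1, a 5 USD actor run is refused under the pass-1 cap but allowed in pass 2. Applying the pass-1 cap to Stage 4 is the old skip-every-actor behaviour the pitfall describes.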
Why it's shaped this way
Three deliberate choices worth knowing:
- Filesystem is the source of truth for pipeline artefacts. The DB holds relational state (users, teams, briefs, reports, cost events, manifest counts, trend metadata) but the actual videos / transcripts / comments / frames / analyses live under briefs/<report_id>/. This means a Postgres wipe doesn't lose a finished report, and Stage 4's per-video bundle.json is the contract Stage 6 reads: the analysis stage doesn't talk to the DB.
- One Node process for both HTTP and pipeline. The server spawns the pipeline as a detached child of itself, holding the ChildProcess handle so it can SIGTERM the whole process group on Cancel. There's no message bus or queue: single replica, one pipeline at a time. The guarding flag (runningChild) is a process-local mutex.
- Per-stage idempotency via .state/stage{N}.done sentinels. This is what makes "Retry" cheap and "Force re-run" possible. Each stage writes a sentinel containing the inputs hash; the runner skips on match. It also makes Cancel + edit-brief + Force re-run safe without throwing away already-paid-for work.
The asset-linking fix is the headline change but the day-to-day reliability comes from the manifest gate and the sentinels — together they mean a failed run is resumable, not abandoned.
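As an illustration of the manifest gate, the content-validity thresholds listed for Stage 5 can be expressed as a small check. This is a sketch under assumptions: the AssetFacts shape and function names are hypothetical, 10 KB is taken as 10 * 1024 bytes, and the backfill round that precedes the throw is omitted. HardGateError is the error name used in the pipeline description.

```typescript
// Hypothetical per-video facts gathered by walking enriched/<id>/.
interface AssetFacts {
  transcriptWords: number;
  commentCount: number;
  frameCount: number;
  coverBytes: number;
  bundleValid: boolean;
}

// Names of the asset kinds that fail the Stage 5 thresholds.
export function failedChecks(a: AssetFacts): string[] {
  const failures: string[] = [];
  if (a.transcriptWords < 1) failures.push("transcript");
  if (a.commentCount < 5) failures.push("comments");
  if (a.frameCount < 1) failures.push("frames");
  if (a.coverBytes < 10 * 1024) failures.push("cover"); // assumed KB = 1024 bytes
  if (!a.bundleValid) failures.push("bundle");
  return failures;
}

export class HardGateError extends Error {}

// Percentage of selected videos whose assets all pass.
export function coveragePct(selected: Record<string, AssetFacts>): number {
  const entries = Object.values(selected);
  if (entries.length === 0) return 0;
  const passing = entries.filter((a) => failedChecks(a).length === 0).length;
  return (passing / entries.length) * 100;
}

// Hard gate: anything below 100% coverage stops the run.
export function assertFullCoverage(selected: Record<string, AssetFacts>): void {
  const pct = coveragePct(selected);
  if (pct < 100) {
    throw new HardGateError(`manifest coverage ${pct.toFixed(1)}% is below 100%`);
  }
}
```

Failing before Stage 6 means no Claude calls are spent on videos with missing or invalid assets.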