28 commits

Author SHA1 Message Date
DJP
17a635099a Retire V1 source from main; V2 in v2/ is the new app
V1's running deployment at /opt/social-reporting on the server stays put
until cutover; V1's source is preserved on the v1-archive branch and via
git history. From this commit forward, all work targets v2/.

The new root README points contributors at v2/ and documents the rollback
path (deploy/rollback-to-v1.sh) for the cutover.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 17:39:35 -04:00
DJP
b89e8b511e Add V2: multi-team social-reporting platform with manifest-gated linking
V2 lives entirely under v2/ and is built around three asks the team raised
about V1: per-video assets sometimes drifted onto the wrong trend, hashtag
scrapes returned junk that wasn't filterable per-client, and there was no
multi-user model behind Microsoft SSO.

Highlights:
- Stable TikTok numeric-id key for every per-video asset; URL form drift is
  logged loudly to drift_log.jsonl and never silently nulls assets. Stage 5
  manifest hard-gates Stage 6 if any selected video is missing any required
  asset; --drop-failing auto-backfills from the next-best recipe candidates.
- Per-brief engagement floor (min_likes / min_plays / min_stl_pct), applied
  at Apify scrape time and re-validated locally; spend_log.json records
  raw_returned vs kept_after_floor per scrape.
- Users + teams + memberships with owner/admin/editor/viewer roles; SSO
  upserts a user keyed on Azure oid, auto-creates a personal team, and a
  super-admin is bootstrapped via BOOTSTRAP_SUPER_ADMIN_EMAIL on first
  sign-in. Phase A integration test: 16/16 pass.
- 10-stage TS pipeline (brief → seed → scrape1 → select → scrape2 →
  validate → analyse → insights → trends → qa → build) wired through one
  CLI; each stage idempotent + resumable from disk via .state sentinels.
  §4.5 rubrics shipped under prompts/ and loaded into Claude calls.
- React 18 + Vite + TS + Tailwind operator SPA: brief intake form,
  team management, super-admin user list, help/FAQ ported from V1.
- Separate Docker Compose project (name: social-reporting-v2, port 3457,
  Postgres 5437) with deploy/setup-v2.sh, deploy-v2.sh, rollback-to-v1.sh
  scripts that take over V1's /social-reports URL and let us roll back.

Verification: 62 unit tests pass (auth/session, ids extractor with full URL
fixture, engagement floor, recipes, manifest, linking-fix, MoM compare).
Live smoke run on a Dove brief: 1400 raw → 253 kept (82% culled) → 21
fully-bundled videos → 25 editorial trends across 8 brief-driven categories,
with drift=0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 17:39:07 -04:00
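
Note on b89e8b511e: the per-brief engagement floor and the raw_returned / kept_after_floor counters can be pictured as a plain filter over scraped rows. This is a minimal sketch, not the V2 code. The `Video` and `EngagementFloor` shapes, the `culled_pct` field, and the function name `applyFloor` are assumptions made for illustration; min_stl_pct is left out because the commit message does not define how it is computed.

```typescript
// Hypothetical shapes; the real V2 types are not shown in the commit message.
interface Video {
  id: string;
  likes: number;
  plays: number;
}

interface EngagementFloor {
  min_likes: number;
  min_plays: number;
}

interface FloorResult {
  kept: Video[];
  raw_returned: number;      // rows returned by the scrape
  kept_after_floor: number;  // rows that passed the floor
  culled_pct: number;        // rounded share of rows dropped (assumed field)
}

// Re-validate scraped rows locally against the brief's floor.
function applyFloor(videos: Video[], floor: EngagementFloor): FloorResult {
  const kept = videos.filter(
    (v) => v.likes >= floor.min_likes && v.plays >= floor.min_plays
  );
  const raw = videos.length;
  const culledPct =
    raw === 0 ? 0 : Math.round(((raw - kept.length) / raw) * 100);
  return {
    kept,
    raw_returned: raw,
    kept_after_floor: kept.length,
    culled_pct: culledPct,
  };
}
```

With the smoke-run figures from the message, 1400 raw and 253 kept gives (1400 - 253) / 1400 ≈ 0.819, which rounds to the 82% culled quoted above.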
Vadym Samoilenko
7a70283e5b Fix frontend not being copied to /var/www/html on deploy
- Replace cp frontend/* with cp -r frontend/. to copy all files reliably
- Add mkdir -p as safety net in deploy.sh
- Add apache2 reload after frontend copy in deploy.sh
- setup.sh now copies entire frontend dir instead of hardcoded filenames

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 18:37:27 +01:00
Vadym Samoilenko
01bea84691 Add Azure AD SSO via MSAL.js SPA flow
- Self-host msal-browser.min.js v5.6.3 (UMD, 244KB, no CDN dependency)
- login.html: SSO button + redirect callback handler + password form fallback
- config.js: MSAL config (tenant, client ID, redirect URI) + __SSO_ENABLED flag
- server.ts: POST /api/sso/token-exchange — validates Azure ID token using Node
  crypto (JWKS fetch + 24h cache + RSA-SHA256 sig verify), issues sl_session cookie
- server.ts: /api/auth now returns user name/email/authMethod from session
- server.ts: CSP updated with login.microsoftonline.com for connect-src + frame-src
- docker-compose.yml: pass AZURE_TENANT_ID + AZURE_CLIENT_ID to container
- deploy/setup.sh: add Azure AD vars to .env template

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 18:18:57 +01:00
DJP
f9321e86d1 Add help tab with brief guide, tips, and FAQ
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-13 11:34:44 -04:00
DJP
6cea40c34d Add report context/vision free text field to brief
Optional textarea lets users provide strategic guidance like objectives,
competitive context, and focus areas. Injected into Claude prompts at
stages 2, 4, 6, and 8 so all agents can produce more focused output.
Backward compatible — empty context changes nothing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-13 11:20:13 -04:00
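
Note on 6cea40c34d: the "empty context changes nothing" guarantee can be sketched as a small prompt helper. The function name `withReportContext` and the wording of the injected heading are assumptions for illustration; the commit message only states that the optional text is injected into the Claude prompts at stages 2, 4, 6, and 8.

```typescript
// Append the optional brief context to a stage prompt.
// An empty or whitespace-only context returns the prompt unchanged,
// which is what keeps older briefs backward compatible.
function withReportContext(basePrompt: string, reportContext?: string): string {
  const trimmed = (reportContext ?? "").trim();
  if (trimmed === "") {
    return basePrompt;
  }
  // The heading text below is a placeholder, not the project's actual wording.
  return basePrompt + "\n\nReport context from the brief:\n" + trimmed;
}
```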
DJP
a66866a5b8 Add quick deploy script for routine updates
bash /opt/social-reporting/deploy/deploy.sh

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-13 10:44:04 -04:00
DJP
568cf1d40d Add per-brief Apify budget with platform splitting
- Add apifyBudget field to ClientBrief (default $10)
- Budget split: 70% discovery (evenly across platforms), 30% enrichment
- Per-platform soft cap prevents one platform hogging the budget
- Budget input field added to both frontend and dashboard forms
- Saved briefs preserve budget setting
- Fix Claude Vision 5MB limit: filter oversized thumbnails before batching
- Fix Docker: ensure node user can write to volume-mounted dirs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-13 10:36:30 -04:00
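
Note on 568cf1d40d: the 70/30 split with an even per-platform share works out as below. This is a sketch under assumptions: the function name `splitBudget`, the use of integer cents, and treating the even share as the per-platform soft cap are illustrative choices, not taken from the code.

```typescript
interface BudgetSplit {
  discoveryCents: number;
  enrichmentCents: number;
  perPlatformCapCents: Record<string, number>;
}

// Split a per-brief Apify budget: 70% discovery, 30% enrichment,
// with discovery divided evenly across the requested platforms.
// Amounts are kept in integer cents to avoid floating-point drift.
function splitBudget(apifyBudgetUsd: number, platforms: string[]): BudgetSplit {
  const totalCents = Math.round(apifyBudgetUsd * 100);
  const discoveryCents = Math.round(totalCents * 0.7);
  const enrichmentCents = totalCents - discoveryCents;
  const share =
    platforms.length > 0 ? Math.floor(discoveryCents / platforms.length) : 0;
  const perPlatformCapCents: Record<string, number> = {};
  for (const platform of platforms) {
    perPlatformCapCents[platform] = share;
  }
  return { discoveryCents, enrichmentCents, perPlatformCapCents };
}
```

For the default $10 budget and two platforms, this gives $7.00 for discovery ($3.50 per platform) and $3.00 for enrichment.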
DJP
42fcc36018 Fix comments, visual language, and date filtering
- Fix TikTok comments actor input: `videos` → `videoUrls` (wrong field name)
- Fix TikTok transcripts actor input: `videos` → `videoUrls` (wrong field name)
- Allow HTTP URLs for thumbnails (TikTok CDN uses HTTP)
- Add date filtering to profile scrapers (TikTok + Instagram)
- Keep videos with unparseable dates instead of dropping them
- Lower visual language threshold from 5 to 3 thumbnails
- Increase thumbnail timeout from 5s to 10s
- Add logging for failed thumbnail downloads

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 11:20:07 -04:00
DJP
dfc2a38861 Security hardening: fix 17 audit findings (C2-C7, H1-H4, H6-H8, M1-M5, M7)
Critical: restrict CORS, move Apify token to Auth header, add path traversal
validation, prompt injection delimiters, require production credentials.
High: security headers, cookie hardening, rate limiting, XSS fixes, error sanitization.
Medium: SSRF prevention, body size limit, Docker non-root, DB creds from env.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 10:59:48 -04:00
DJP
d85e16e95d Add comprehensive security audit report
25 findings across 4 severity levels with prioritized remediation roadmap.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 10:43:08 -04:00
DJP
f2d6f56831 Report quality overhaul: 11 feedback items
1. Remove Desk Research (Stage 7 skipped, sources removed from report)
2. Fix comments scraping: increase cap to 2000, handle alt field names
3. Dynamic stats bar: hide zero-value stats instead of showing "0 Comments"
4. Prompt improvements: enforce timeliness, comment-based insights, creator spotlight algorithm (2-10 videos, exclude >50% dominance)
5. Date filtering: pass date params to Apify actors (oldestCreateTime, onlyPostsNewerThan, uploadDate) + log filter counts
6. Pullquotes: 3-4 generated editorial dividers between sections
7. Thumbnails: download top 50 coverUrl as base64, store on EnrichedVideo
8. Visual Language section: 5 batches of 10 through Claude Vision, synthesized into 5-6 visual codes with thumbnail cards
9. Sticky navigation bar with anchor links to all sections
10. New types: VisualCode, thumbnailUrl on Video, thumbnailBase64 on EnrichedVideo, pullquotes/visualCodes on ReportJSON

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 09:52:08 -04:00
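
Note on f2d6f56831: items 7 and 8 describe taking the top 50 thumbnails and sending them to Claude Vision as 5 batches of 10. A minimal sketch of that batching step follows. The helper name `toBatches` is hypothetical, and the input is assumed to be ranked already, since the message does not say how "top" is decided.

```typescript
// Take the first `limit` items of an already-ranked list and
// cut them into consecutive batches of `batchSize`.
function toBatches<T>(items: T[], limit: number, batchSize: number): T[][] {
  if (batchSize <= 0) {
    throw new Error("batchSize must be positive");
  }
  const top = items.slice(0, limit);
  const batches: T[][] = [];
  for (let i = 0; i < top.length; i += batchSize) {
    batches.push(top.slice(i, i + batchSize));
  }
  return batches;
}
```

Calling `toBatches(thumbnails, 50, 10)` yields at most 5 batches. Fewer thumbnails simply produce fewer batches or a shorter final one.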
DJP
3dcdf0cc69 Add project README with architecture, setup, and deployment docs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 14:13:24 -04:00
DJP
2429deff72 Round cost displays to nearest cent
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 13:43:41 -04:00
DJP
4e16367d2d Fix brief loading: remove stale jsonPreview refs, add Export button, rename Load
- Fixed null reference error when loading JSON files (removed deleted jsonPreview element refs)
- Added Export button to download saved briefs as JSON files
- Renamed "Load & Run" to just "Load" per user feedback

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 13:34:34 -04:00
DJP
010d304c2a Add saved briefs feature: server-side storage with dedicated tab
- Backend: GET/POST/DELETE /api/briefs endpoints storing JSON files in briefs/ dir
- Frontend: new Saved Briefs tab with cards showing client details, Load & Run, Delete
- Save Current Brief button on Pipeline tab persists form to server
- Both standalone dashboard and static frontend updated

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 13:30:15 -04:00
DJP
5f8d84f5c5 Add delete runs, bulk clear, and report download to dashboard
- Delete individual runs (with confirmation)
- Bulk remove all failed or completed runs
- Download report as HTML file (Content-Disposition: attachment)
- View + Download buttons in history table
- Backend: DELETE /api/runs/:id and DELETE /api/runs?status=failed|completed
- Backend: GET /report/:id/download serves with attachment header
- Updated both frontend/index.html and dashboard/index.html

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 13:23:56 -04:00
DJP
2473c22318 Fix Apache config: remove ProxyTimeout from Location block
ProxyTimeout is not allowed in <Location> context. Removed it from the
block; the server-level ProxyTimeout directive set above already applies.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 12:18:02 -04:00
DJP
ce916cd658 Fix broken Unicode in Claude API calls + stabilize SSE proxy
- Sanitize unpaired surrogates from scraped text before JSON.stringify
  (Instagram captions often contain broken emoji causing JSON parse errors)
- Update Apache SSE proxy config: longer timeout, disable output filter
  buffering to prevent connection drops and repeated reconnects

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 12:15:07 -04:00
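
Note on ce916cd658: an unpaired surrogate is a UTF-16 high surrogate (U+D800 to U+DBFF) with no low surrogate after it, or a low surrogate (U+DC00 to U+DFFF) with no high surrogate before it. The sketch below shows one way to sanitize them before serializing. The function name `stripLoneSurrogates` is hypothetical, and replacing with U+FFFD is an assumption; the actual fix may drop the code units instead.

```typescript
// Replace unpaired UTF-16 surrogates with U+FFFD, leaving valid
// surrogate pairs (for example, intact emoji) untouched.
function stripLoneSurrogates(input: string): string {
  let out = "";
  for (let i = 0; i < input.length; i++) {
    const code = input.charCodeAt(i);
    const isHigh = code >= 0xd800 && code <= 0xdbff;
    const isLow = code >= 0xdc00 && code <= 0xdfff;
    if (isHigh) {
      const next = i + 1 < input.length ? input.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        // Valid pair: copy both code units and skip the low half.
        out += input.charAt(i) + input.charAt(i + 1);
        i++;
      } else {
        out += "\uFFFD";
      }
    } else if (isLow) {
      // A low surrogate reached here has no preceding high surrogate.
      out += "\uFFFD";
    } else {
      out += input.charAt(i);
    }
  }
  return out;
}
```

Applied to each scraped caption before `JSON.stringify`, this keeps the request body free of lone surrogate escapes.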
DJP
087d1bb23b Fix SSE reconnect loop: only POST /run once per pipeline start
EventSource auto-reconnects on connection drop, which re-fires the
'connected' event. The handler was POSTing /run on every reconnect,
causing multiple parallel pipeline runs and runaway Apify costs.

Added pipelineStarted guard so /run only fires on first connect.
Fixed in both frontend/index.html and dashboard/index.html.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 11:56:28 -04:00
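
Note on 087d1bb23b: the guard amounts to a one-shot wrapper around the 'connected' handler. The sketch below is a simplified illustration. `createConnectedHandler` and `postRun` are hypothetical names, and the real handler lives inline in the HTML pages; only the pipelineStarted flag is named in the commit message.

```typescript
// Build a 'connected' handler that starts the pipeline only once,
// no matter how many times EventSource reconnects and re-fires it.
function createConnectedHandler(postRun: () => void): () => void {
  let pipelineStarted = false;
  return function onConnected(): void {
    if (pipelineStarted) {
      return; // reconnect: the run was already requested
    }
    pipelineStarted = true;
    postRun();
  };
}
```

In the browser this would be attached with something like `source.addEventListener("connected", handler)`, where `postRun` issues the POST to /run.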
DJP
9d15356a76 Fix Stage 5: correct actor input fields + add error resilience
- TikTok transcripts/comments actor expects 'videos' not 'videoUrls'
- Wrap all enrichment actor calls in safeRunActor so failures skip
  instead of crashing the pipeline

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 11:53:45 -04:00
DJP
57c4d3f0df Fix Apify budget: run scrapers sequentially instead of parallel
Promise.all() launched all platform scrapers simultaneously, so multiple
expensive runs started before any costs were tracked. Budget check only
saw totals after each run finished, allowing $7+ overspend on a $5 limit.

Now Stage 3 and Stage 5 run each scraper sequentially so the budget
gate can cut off between calls.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 11:53:05 -04:00
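
Note on 57c4d3f0df: the difference from Promise.all() is that each scraper's cost is known before the next one starts, so the gate can stop between calls. A minimal sketch follows, with assumed names: the `Scraper` shape, `runScrapersSequentially`, and the idea that each run resolves to its cost in USD are illustrative, not taken from the code.

```typescript
interface Scraper {
  platform: string;
  run: () => Promise<number>; // resolves to the cost of the run in USD (assumed)
}

interface SequentialResult {
  spentUsd: number;
  ran: string[];
  skipped: string[];
}

// Run scrapers one at a time and check the budget before each start.
// A single run can still overshoot, but no further run begins afterwards.
async function runScrapersSequentially(
  scrapers: Scraper[],
  budgetUsd: number
): Promise<SequentialResult> {
  let spentUsd = 0;
  const ran: string[] = [];
  const skipped: string[] = [];
  for (const scraper of scrapers) {
    if (spentUsd >= budgetUsd) {
      skipped.push(scraper.platform);
      continue;
    }
    spentUsd += await scraper.run();
    ran.push(scraper.platform);
  }
  return { spentUsd, ran, skipped };
}
```

With three scrapers costing $3 each and a $5 limit, the first two run (total $6) and the third is skipped. Launching all three in parallel would have spent $9 before any total was visible.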
DJP
247da45297 Fix port mapping: use env vars, bind localhost, remove duplicate ports
DB_PORT defaults to 5436, DASHBOARD_PORT defaults to 3456.
Prod override no longer redeclares ports (was causing duplicates).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 11:34:46 -04:00
DJP
9b9203355b Change prod Postgres port from 5435 to 5436 to avoid conflict
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 11:32:54 -04:00
DJP
c5c40aa4e5 Add server deployment: Apache proxy, static frontend, deploy script
- Static frontend (index.html, login.html, config.js) for Apache serving
- JSON-based auth API endpoints (/api/login, /api/auth, /api/logout)
- Apache config with ProxyPass for /social-reports path
- deploy/setup.sh for Ubuntu + Apache + Docker deployment
- docker-compose.prod.yml binds ports to 127.0.0.1 only
- Configurable API base URL via frontend/config.js

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 11:23:47 -04:00
DJP
ae981e8cb4 Add login auth, video embeds, and report serving
- Cookie-based session auth with login page (DASH_USER/DASH_PASS env vars)
- Serve generated reports via /report/:id route with View Report button
- YouTube iframe and Instagram native embeds in HTML reports
- Supporting videos grid per trend with platform icons
- Logout link in dashboard header

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 11:08:08 -04:00
DJP
50e1675b10 Initial commit: Social Listening Pipeline
8-stage TypeScript pipeline with Apify scraping, Claude AI analysis,
real-time dashboard with SSE, PostgreSQL cost tracking, and Apify budget controls.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-02 22:18:02 -04:00
Dave Porter
d7b43dff99 Initial commit
2026-04-03 02:15:40 +00:00