V2 lives entirely under v2/ and is built around three asks the team raised about V1: per-video assets sometimes drifted onto the wrong trend, hashtag scrapes returned junk that wasn't filterable per-client, and there was no multi-user model behind Microsoft SSO.

Highlights:
- Stable TikTok numeric-id key for every per-video asset; URL form drift is logged loudly to drift_log.jsonl and never silently nulls assets. Stage 5 manifest hard-gates Stage 6 if any selected video is missing any required asset; --drop-failing auto-backfills from the next-best recipe candidates.
- Per-brief engagement floor (min_likes / min_plays / min_stl_pct), applied at Apify scrape time and re-validated locally; spend_log.json records raw_returned vs kept_after_floor per scrape.
- Users + teams + memberships with owner/admin/editor/viewer roles; SSO upserts a user keyed on Azure oid, auto-creates a personal team, and a super-admin is bootstrapped via BOOTSTRAP_SUPER_ADMIN_EMAIL on first sign-in. Phase A integration test: 16/16 pass.
- 10-stage TS pipeline (brief → seed → scrape1 → select → scrape2 → validate → analyse → insights → trends → qa → build) wired through one CLI; each stage idempotent + resumable from disk via .state sentinels. §4.5 rubrics shipped under prompts/ and loaded into Claude calls.
- React 18 + Vite + TS + Tailwind operator SPA: brief intake form, team management, super-admin user list, help/FAQ ported from V1.
- Separate Docker Compose project (name: social-reporting-v2, port 3457, Postgres 5437) with deploy/setup-v2.sh, deploy-v2.sh, rollback-to-v1.sh scripts that take over V1's /social-reports URL and let us roll back.

Verification: 62 unit tests pass (auth/session, ids extractor with full URL fixture, engagement floor, recipes, manifest, linking-fix, MoM compare). Live smoke run on a Dove brief: 1400 raw → 253 kept (82% culled) → 21 fully-bundled videos → 25 editorial trends across 8 brief-driven categories, with drift=0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
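The stable TikTok numeric-id key mentioned in the highlights could look roughly like the sketch below. This is not the v2 extractor itself, which is not shown on this page: the function names, the `/video/<digits>` pattern, and the in-memory `drift` list (standing in for drift_log.jsonl) are assumptions for illustration. The id is kept as a string because TikTok video ids are typically 19 digits long, which exceeds `Number.MAX_SAFE_INTEGER`.

```typescript
// Hypothetical helper, not the project's actual extractor.
// Assumes the canonical URL form https://www.tiktok.com/@user/video/<digits>.
const VIDEO_ID_PATTERN = /\/video\/(\d+)/;

function extractTikTokId(url: string): string | null {
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch {
    return null; // not a parseable URL at all
  }
  const match = VIDEO_ID_PATTERN.exec(parsed.pathname);
  return match ? match[1] : null;
}

// Key every URL by its numeric id; URLs without an id are collected
// as drift instead of being silently dropped.
function keyByVideoId(urls: string[]): { byId: Map<string, string>; drift: string[] } {
  const byId = new Map<string, string>();
  const drift: string[] = [];
  for (const url of urls) {
    const id = extractTikTokId(url);
    if (id === null) {
      drift.push(url);
    } else {
      byId.set(id, url);
    }
  }
  return { byId, drift };
}
```

Short links (for example on vm.tiktok.com) carry no numeric id in the path, so under this sketch they land in `drift` and would need to be resolved before keying.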
| Name |
|---|
| agents/social-listening |
| db |
| deploy |
| frontend |
| v2 |
| .gitignore |
| DEVELOPER_BRIEF.md |
| DEVELOPER_BRIEF_V2.md |
| docker-compose.prod.yml |
| docker-compose.yml |
| Dockerfile |
| package-lock.json |
| package.json |
| README.md |
| SECURITY_AUDIT.md |
| tsconfig.json |
Social Listening Pipeline
Automated social media research tool that scrapes TikTok, Instagram, and YouTube via Apify, analyses content with Claude AI, and generates client-ready HTML reports.
Architecture
frontend/                  Static frontend (served by Apache)
agents/social-listening/
  dashboard/               Node.js backend (HTTP + SSE on port 3456)
  stages/                  8-stage pipeline
  briefs/                  Saved client briefs (JSON)
  outputs/                 Generated reports
deploy/                    Apache config + setup script
Pipeline Stages
| Stage | Name | Description |
|---|---|---|
| 1 | Brief Validation | Validates and normalises the client brief |
| 2 | Strategy Review | AI reviews strategy, suggests up to 3 extra hashtags |
| 3 | Discovery Scrape | Scrapes TikTok/Instagram/YouTube via Apify |
| 4 | Data Review | AI analyses scraped content for trends |
| 5 | Enrichment Scrape | Fetches transcripts and extra metadata |
| 6 | Pre-Report Review | AI refines findings before report generation |
| 7 | Desk Research | Web search for additional context |
| 8 | Report Generation | Produces final HTML report with video embeds |
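As a rough illustration of how eight stages like these could be chained, the sketch below runs stage objects in ascending order and hands each stage's output to the next. The `Stage` interface, the `PipelineState` shape, and the `onProgress` callback are hypothetical; the real modules under stages/ are not shown in this README and may be structured differently.

```typescript
// Hypothetical types, assumed for illustration only.
type PipelineState = Record<string, unknown>;

interface Stage {
  id: number;
  name: string;
  run: (state: PipelineState) => Promise<PipelineState>;
}

async function runPipeline(
  stages: Stage[],
  initial: PipelineState,
  onProgress: (stage: Stage) => void = () => {},
): Promise<PipelineState> {
  let state = initial;
  // Run in ascending stage order; a thrown error stops the run at that stage.
  for (const stage of [...stages].sort((a, b) => a.id - b.id)) {
    onProgress(stage);
    state = await stage.run(state);
  }
  return state;
}
```

A progress callback of this kind is one place where a dashboard could hook in to report the currently running stage.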
Key Features
- Real-time dashboard with SSE progress updates and live cost tracking
- Apify budget control (APIFY_COST_LIMIT): stops scraping when limit is reached
- Saved briefs: save/load client briefs server-side with a dedicated tab
- Run history: view, download, and delete past pipeline runs with cost breakdowns
- Video embeds: YouTube iframes, Instagram native embeds, TikTok links in reports
- Auth: cookie-based session auth with HMAC-signed tokens
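To make the HMAC-signed session token idea concrete, here is a minimal sketch using Node's built-in `node:crypto` module. The token layout (`payload.signature`, both base64url), the payload fields `u` and `exp`, and the function names are assumptions; the dashboard's actual token format is not documented in this README.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical token layout: base64url(JSON payload) + "." + base64url(HMAC-SHA256).
function signSession(username: string, expiresAtMs: number, secret: string): string {
  const payload = Buffer.from(JSON.stringify({ u: username, exp: expiresAtMs })).toString("base64url");
  const signature = createHmac("sha256", secret).update(payload).digest("base64url");
  return `${payload}.${signature}`;
}

// Returns the username for a valid, unexpired token, otherwise null.
function verifySession(token: string, secret: string, nowMs: number = Date.now()): string | null {
  const parts = token.split(".");
  if (parts.length !== 2) return null;
  const [payload, signature] = parts;
  const expected = createHmac("sha256", secret).update(payload).digest();
  const given = Buffer.from(signature, "base64url");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  try {
    const { u, exp } = JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
    return typeof u === "string" && typeof exp === "number" && exp > nowMs ? u : null;
  } catch {
    return null; // payload was not valid JSON
  }
}
```

In a setup like this one, the secret would plausibly come from `SESSION_SECRET`, and the signed string would be stored as the cookie value. The signature is compared with `timingSafeEqual` rather than `===` to avoid leaking information through comparison timing.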
Prerequisites
- Docker & Docker Compose
- Node.js 20+ (for local development)
- Apify API token
- Anthropic API key
Environment Variables
Copy .env.example or create .env in the project root:
APIFY_TOKEN=your_apify_token
ANTHROPIC_API_KEY=your_anthropic_key
APIFY_LIVE_APPROVED=true
APIFY_COST_LIMIT=5
TEST_MODE=false
DASHBOARD_PORT=3456
DATABASE_URL=postgres://social:social@db:5432/social_listening
DASH_USER=admin
DASH_PASS=changeme
SESSION_SECRET=random_secret_here
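The variables above arrive as strings, so the application has to parse and validate them somewhere. The sketch below shows one possible way to do that; the `AppConfig` shape and the `loadConfig` helper are hypothetical and are not taken from the project's code. It treats every listed variable except `DASH_USER` and `DASH_PASS` as required, which is an assumption.

```typescript
// Hypothetical config shape, assumed for illustration.
interface AppConfig {
  apifyToken: string;
  anthropicApiKey: string;
  apifyLiveApproved: boolean;
  apifyCostLimit: number;
  testMode: boolean;
  dashboardPort: number;
  databaseUrl: string;
  sessionSecret: string;
}

type Env = Record<string, string | undefined>;

function required(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
}

function requiredNumber(env: Env, name: string): number {
  const parsed = Number(required(env, name));
  if (!Number.isFinite(parsed)) {
    throw new Error(`${name} must be a number`);
  }
  return parsed;
}

function loadConfig(env: Env): AppConfig {
  return {
    apifyToken: required(env, "APIFY_TOKEN"),
    anthropicApiKey: required(env, "ANTHROPIC_API_KEY"),
    // Only the literal string "true" enables a flag in this sketch.
    apifyLiveApproved: env.APIFY_LIVE_APPROVED === "true",
    apifyCostLimit: requiredNumber(env, "APIFY_COST_LIMIT"),
    testMode: env.TEST_MODE === "true",
    dashboardPort: requiredNumber(env, "DASHBOARD_PORT"),
    databaseUrl: required(env, "DATABASE_URL"),
    sessionSecret: required(env, "SESSION_SECRET"),
  };
}
```

Calling `loadConfig(process.env)` once at startup would surface a missing or malformed variable immediately rather than partway through a run.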
Running Locally
# Start PostgreSQL + app via Docker
docker compose up -d
# Dashboard available at http://localhost:3456
Or without Docker:
npm install
# Start the dashboard server
npm run dashboard
# Run pipeline directly (CLI)
npm run pipeline # dry run
npm run pipeline:test # test mode
npm run pipeline:live # live Apify scraping
Production Deployment
The app is designed to run behind Apache on an Ubuntu server:
- Backend: Docker containers at /opt/social-reporting
- Frontend: Static files at /var/www/html/social-reporting
- URL: https://your-domain.com/social-reports/
# On the server
cd /opt/social-reporting
git pull
cp frontend/* /var/www/html/social-reporting/
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d --build
See deploy/apache-social-reports.conf for the Apache reverse proxy config and deploy/setup.sh for first-time setup.
Tech Stack
- Runtime: TypeScript (ESM) via tsx
- Backend: Node.js HTTP server with SSE
- Database: PostgreSQL (via postgres npm package)
- Scraping: Apify REST API
- AI: Anthropic Claude API (Messages API)
- Frontend: Vanilla HTML/CSS/JS with Montserrat font
- Deploy: Docker Compose + Apache reverse proxy
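For readers unfamiliar with SSE, the wire format is simple enough to produce by hand from a plain Node.js HTTP server. The sketch below formats a single event frame; the `progress` event name and the payload fields are hypothetical, since the dashboard's real event schema is not listed in this README.

```typescript
// Response headers an SSE endpoint typically sends before streaming frames.
const SSE_HEADERS: Record<string, string> = {
  "Content-Type": "text/event-stream",
  "Cache-Control": "no-cache",
  Connection: "keep-alive",
};

// Formats one SSE frame: an "event:" line, a "data:" line, then a blank line.
// JSON.stringify never emits raw newlines, so a single "data:" line suffices.
function formatSseEvent(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Example frame with a hypothetical event name and payload.
const frame = formatSseEvent("progress", { stage: 3, cost: 1.25 });
```

A handler would write `SSE_HEADERS` once with `res.writeHead(200, SSE_HEADERS)` and then call `res.write(frame)` for each update, keeping the response open for the duration of the run.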