Add project README with architecture, setup, and deployment docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
DJP 2026-04-07 14:13:24 -04:00
parent 2429deff72
commit 3dcdf0cc69

README.md

@ -0,0 +1,112 @@
# Social Listening Pipeline
Automated social media research tool that scrapes TikTok, Instagram, and YouTube via Apify, analyses content with Claude AI, and generates client-ready HTML reports.
## Architecture
```
frontend/ Static frontend (served by Apache)
agents/social-listening/
dashboard/ Node.js backend (HTTP + SSE on port 3456)
stages/ 8-stage pipeline
briefs/ Saved client briefs (JSON)
outputs/ Generated reports
deploy/ Apache config + setup script
```
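The dashboard backend pushes progress to the browser over SSE. As a rough sketch of the wire format only (not the project's actual code), one progress update could be serialised as below. The `StageProgress` shape and the `formatSseEvent` name are hypothetical.

```typescript
// Hypothetical shape of a progress update; field names are illustrative only.
interface StageProgress {
  stage: number; // pipeline stage number, 1-8
  status: "running" | "done" | "error";
  cost: number; // running cost so far, in whatever unit the dashboard tracks
}

// Serialise one Server-Sent Events frame: an "event:" line, a "data:" line,
// and a blank line that terminates the frame.
function formatSseEvent(eventName: string, payload: StageProgress): string {
  return `event: ${eventName}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Example: the text a server would write to an open SSE response.
const frame = formatSseEvent("progress", { stage: 3, status: "running", cost: 0.42 });
console.log(frame);
```

On the browser side, an `EventSource` listener registered for the `progress` event would receive the JSON string in `event.data`.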
### Pipeline Stages
| Stage | Name | Description |
|-------|------|-------------|
| 1 | Brief Validation | Validates and normalises the client brief |
| 2 | Strategy Review | AI reviews strategy, suggests up to 3 extra hashtags |
| 3 | Discovery Scrape | Scrapes TikTok/Instagram/YouTube via Apify |
| 4 | Data Review | AI analyses scraped content for trends |
| 5 | Enrichment Scrape | Fetches transcripts and extra metadata |
| 6 | Pre-Report Review | AI refines findings before report generation |
| 7 | Desk Research | Web search for additional context |
| 8 | Report Generation | Produces final HTML report with video embeds |
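The stages in the table run in order, each consuming the previous stage's output. A minimal sketch of that sequencing follows, assuming each stage is an async function. `StageFn`, `runPipeline`, and the `onProgress` callback are hypothetical names, not the repository's API.

```typescript
// Stage names copied from the table above; everything else is illustrative.
const STAGE_NAMES = [
  "Brief Validation",
  "Strategy Review",
  "Discovery Scrape",
  "Data Review",
  "Enrichment Scrape",
  "Pre-Report Review",
  "Desk Research",
  "Report Generation",
] as const;

// Assumption: a stage takes the previous stage's output and returns its own.
type StageFn = (input: unknown) => Promise<unknown>;

// Run the stages sequentially, reporting the 1-based stage number before each.
async function runPipeline(
  stages: StageFn[],
  brief: unknown,
  onProgress: (stage: number, name: string) => void,
): Promise<unknown> {
  let data = brief;
  for (let i = 0; i < stages.length; i++) {
    const name = i < STAGE_NAMES.length ? STAGE_NAMES[i] : `Stage ${i + 1}`;
    onProgress(i + 1, name);
    data = await stages[i](data);
  }
  return data;
}

// Demo with two stub stages standing in for the real ones.
const stubStages: StageFn[] = [
  async (brief) => ({ brief, validated: true }),
  async (state) => ({ ...(state as object), reviewed: true }),
];
runPipeline(stubStages, { client: "example" }, (n, name) =>
  console.log(`stage ${n}: ${name}`),
).then((result) => console.log(result));
```

Because a rejected stage promise propagates out of `runPipeline`, a caller can stop the run and report the failing stage in one place.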
### Key Features
- **Real-time dashboard** with SSE progress updates and live cost tracking
- **Apify budget control** (`APIFY_COST_LIMIT`) — stops scraping when limit is reached
- **Saved briefs** — save/load client briefs server-side with a dedicated tab
- **Run history** — view, download, and delete past pipeline runs with cost breakdowns
- **Video embeds** — YouTube iframes, Instagram native embeds, TikTok links in reports
- **Auth** — cookie-based session auth with HMAC-signed tokens
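To illustrate the HMAC-signed session tokens mentioned in the list above, here is a minimal sketch using Node's built-in `node:crypto`. The token layout (`user.expiry.signature`) and the `signToken` / `verifyToken` names are assumptions for illustration; the project's real format may differ.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumed layout: "<user>.<expiresAtMs>.<hex HMAC-SHA256 of the first two parts>".
function signToken(user: string, expiresAt: number, secret: string): string {
  const payload = `${user}.${expiresAt}`;
  const sig = createHmac("sha256", secret).update(payload).digest("hex");
  return `${payload}.${sig}`;
}

// Returns the user name if the signature matches and the token has not
// expired, otherwise null.
function verifyToken(token: string, secret: string, now: number = Date.now()): string | null {
  const sigStart = token.lastIndexOf(".");
  if (sigStart < 0) return null;
  const payload = token.slice(0, sigStart);
  const expected = createHmac("sha256", secret).update(payload).digest("hex");

  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  const given = Buffer.from(token.slice(sigStart + 1), "utf8");
  const wanted = Buffer.from(expected, "utf8");
  if (given.length !== wanted.length || !timingSafeEqual(given, wanted)) return null;

  const expiryStart = payload.lastIndexOf(".");
  if (expiryStart < 0) return null;
  const expiresAt = Number(payload.slice(expiryStart + 1));
  if (!Number.isFinite(expiresAt) || expiresAt <= now) return null;
  return payload.slice(0, expiryStart);
}

// Example: issue a one-hour token and check it.
const demoToken = signToken("admin", Date.now() + 60 * 60 * 1000, "random_secret_here");
console.log(verifyToken(demoToken, "random_secret_here"));
```

The signature is checked before the expiry is parsed, so a tampered payload is rejected without its contents being trusted.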
## Prerequisites
- Docker & Docker Compose
- Node.js 20+ (for local development)
- Apify API token
- Anthropic API key
## Environment Variables
Copy `.env.example` to `.env`, or create `.env` in the project root:
```env
APIFY_TOKEN=your_apify_token
ANTHROPIC_API_KEY=your_anthropic_key
APIFY_LIVE_APPROVED=true
APIFY_COST_LIMIT=5
TEST_MODE=false
DASHBOARD_PORT=3456
DATABASE_URL=postgres://social:social@db:5432/social_listening
DASH_USER=admin
DASH_PASS=changeme
SESSION_SECRET=random_secret_here
```
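As a sketch of how these variables might be read and how `APIFY_COST_LIMIT` could gate scraping, consider the following. The `PipelineConfig` shape, the choice of which variables are required, the fallback values (taken from the example above), and the helper names are all assumptions, not the project's actual loader.

```typescript
// Hypothetical typed view of a subset of the variables listed above.
interface PipelineConfig {
  apifyCostLimit: number;
  liveApproved: boolean;
  testMode: boolean;
  dashboardPort: number;
}

// Parse a plain key/value map (for example process.env) into a typed config.
function parseConfig(env: Record<string, string | undefined>): PipelineConfig {
  // Assumption: these three secrets must always be present.
  const required = ["APIFY_TOKEN", "ANTHROPIC_API_KEY", "SESSION_SECRET"];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  // Assumption: fall back to the values shown in the example .env.
  const apifyCostLimit = Number(env.APIFY_COST_LIMIT ?? "5");
  if (!Number.isFinite(apifyCostLimit) || apifyCostLimit < 0) {
    throw new Error("APIFY_COST_LIMIT must be a non-negative number");
  }
  const dashboardPort = Number(env.DASHBOARD_PORT ?? "3456");
  if (!Number.isInteger(dashboardPort) || dashboardPort <= 0) {
    throw new Error("DASHBOARD_PORT must be a positive integer");
  }

  return {
    apifyCostLimit,
    liveApproved: env.APIFY_LIVE_APPROVED === "true",
    testMode: env.TEST_MODE === "true",
    dashboardPort,
  };
}

// Budget check: true once the accumulated spend has reached the limit.
function isOverBudget(spent: number, config: PipelineConfig): boolean {
  return spent >= config.apifyCostLimit;
}

const demoConfig = parseConfig({
  APIFY_TOKEN: "your_apify_token",
  ANTHROPIC_API_KEY: "your_anthropic_key",
  SESSION_SECRET: "random_secret_here",
  APIFY_COST_LIMIT: "5",
});
console.log(demoConfig, isOverBudget(4.2, demoConfig));
```

Failing fast on missing or malformed values at startup keeps a misconfigured run from reaching the scraping stages.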
## Running Locally
```bash
# Start PostgreSQL + app via Docker
docker compose up -d
# Dashboard available at http://localhost:3456
```
Or without Docker:
```bash
npm install
# Start the dashboard server
npm run dashboard
# Run pipeline directly (CLI)
npm run pipeline # dry run
npm run pipeline:test # test mode
npm run pipeline:live # live Apify scraping
```
## Production Deployment
The app is designed to run behind Apache on an Ubuntu server:
- **Backend**: Docker containers at `/opt/social-reporting`
- **Frontend**: Static files at `/var/www/html/social-reporting`
- **URL**: `https://your-domain.com/social-reports/`
```bash
# On the server
cd /opt/social-reporting
git pull
cp frontend/* /var/www/html/social-reporting/
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d --build
```
See `deploy/apache-social-reports.conf` for the Apache reverse proxy config and `deploy/setup.sh` for first-time setup.
## Tech Stack
- **Runtime**: TypeScript (ESM) via `tsx`
- **Backend**: Node.js HTTP server with SSE
- **Database**: PostgreSQL (via `postgres` npm package)
- **Scraping**: Apify REST API
- **AI**: Anthropic Claude API (Messages API)
- **Frontend**: Vanilla HTML/CSS/JS with Montserrat font
- **Deploy**: Docker Compose + Apache reverse proxy