DJP 26c766cf43 Security hardening: fix critical auth, RBAC, and injection vulnerabilities

- C1: Add authentication to file serving route + canonical path traversal check + nosniff header
- C2: DEV_BYPASS_AUTH now only works when Entra ID credentials are not configured
- H1: Add requireAuth() + assertOrgAccess() to 9 unprotected routes (upload, feedback, annotations, color-probes, reviews)
- H2: Add org-scoping to 4 routes (automations, users, skills)
- H3: SSRF protection on webhook URLs — HTTPS only, private/internal IPs blocked
- H6: API key uses timingSafeEqual, phantom fallback removed, supports X-Org-Id header
- M1: CRON_SECRET moved from query string to Authorization Bearer header
- Extend assertOrgAccess() to support 10 model types (was 3)
- npm audit fix: 17 vulnerabilities reduced to 4
- Add SECURITY-REVIEW.md with full findings report

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-04-07 20:48:05 -04:00

12 KiB

Raw Permalink Blame History

HP Prod Tracker — Security Code Review

Date: 2026-04-07 Reviewed by: Claude Code (automated deep review) Scope: Full codebase — 342 files, ~117K LOC, 62 API routes

Executive Summary

This codebase has significant security gaps that must be addressed before any production or external-facing deployment. The most critical issues are:

No authentication on file serving — uploaded media is publicly accessible
DEV_BYPASS_AUTH has no production guard — misconfiguration grants full admin access
11 API routes lack RBAC and org-scoping — cross-org data access possible
SSRF via webhook automation — internal network scanning possible
17 npm vulnerabilities (11 high severity) including prototype pollution in xlsx and lodash

The service layer and Prisma integration are generally well-structured. The core architecture is sound, but the authorization layer has inconsistent enforcement.

CRITICAL Findings

C1: Unauthenticated File Serving (Public Access to All Uploads)

File: src/app/api/uploads/[...path]/route.ts Severity: CRITICAL

The GET handler has zero authentication. Anyone with a URL can download any uploaded revision image, video, or HLS segment. The path traversal check (relativePath.includes("..")) is also weak — should use path.resolve() canonicalization instead.

Impact: All confidential pre-release product imagery and videos are publicly accessible to anyone who can guess the URL pattern (/api/uploads/revisions/{uuid}/{type}_{timestamp}.ext).

Fix: Add requireAuth() + org-scoped access check. Replace string-contains .. check with canonical path validation.

C2: DEV_BYPASS_AUTH Has No Production Guard

Files: src/middleware.ts:6, src/lib/api-utils.ts:8, src/app/(app)/layout.tsx:13 Severity: CRITICAL

DEV_BYPASS_AUTH=true bypasses ALL authentication and grants ADMIN role. There is no NODE_ENV !== "production" guard. If this env var is set in production (misconfiguration, copied .env), the entire app is wide open.

Impact: Complete auth bypass — any request gets full admin access.

Fix: Add && process.env.NODE_ENV !== "production" guard, or better, strip the env var from production Docker images entirely.

Note: This guard was intentionally removed during our local development setup. It MUST be restored before any external deployment.

HIGH Findings

H1: 9 API Routes Use getAuthSession() Without RBAC or Org-Scoping

Severity: HIGH

These routes authenticate the user but perform no permission check and no org-scope verification. Any authenticated user from any org can access/modify these resources:

Route	Methods	Impact
`stages/[stageId]/revisions/[revisionId]/upload`	POST, DELETE	Upload/delete files on any revision
`stages/[stageId]/feedback`	GET, POST	Read/create feedback on any stage
`feedback/[itemId]`	PATCH, DELETE	Modify/delete any feedback item
`revisions/[revisionId]/annotations`	GET, POST	List/create annotations on any revision
`revisions/[revisionId]/annotations/[annotationId]`	PATCH, DELETE	Modify/delete any annotation
`revisions/[revisionId]/color-probes`	GET, POST, DELETE	Access any color probes
`revisions/[revisionId]/color-probes/[probeId]`	PATCH, DELETE	Modify/delete any probe
`reviews`	GET, POST	Create reviews (service does scope reads)
`reviews/[sessionId]`	GET, PATCH, DELETE	Read/modify/delete any review session

Fix: Replace getAuthSession() with requireAuth(PERMISSION) and add assertOrgAccess() checks. Extend assertOrgAccess() to support revision, annotation, colorProbe, feedbackItem, and reviewSession models.

H2: 4 Routes Have RBAC but Missing Org-Scoping on Parameterized IDs

Severity: HIGH

Route	Methods	Impact
`automations/[ruleId]`	GET, PATCH, DELETE	Cross-org automation rule access
`automations/[ruleId]/executions`	GET	Cross-org execution log leakage
`users/[userId]`	PATCH	Change user role in another org
`users/[userId]/skills`	GET, POST, DELETE	Cross-org skill manipulation

Fix: Add org-scope verification for the target resource ID in each route.

H3: SSRF via Webhook URL in Automation Actions

File: src/lib/automation/action-executor.ts:250-336 Severity: HIGH

The send_webhook action accepts arbitrary URLs with no validation. Attackers with AUTOMATION_MANAGE permission can:

Probe internal network services (cloud metadata, localhost services)
Exfiltrate project data to external servers

Fix: Validate URL scheme (HTTPS only), block private/internal IP ranges, consider domain allowlist.

H4: MIME Type Validation Uses Client-Supplied file.type Only

File: src/lib/services/upload-service.ts:64-66, 206-209 Severity: HIGH

No server-side magic byte verification. An attacker can upload arbitrary files (HTML, SVG with scripts) by setting Content-Type to image/png.

Fix: Use file-type npm package for magic byte validation. Add X-Content-Type-Options: nosniff header on file serving responses.

H5: Chat Tool Executor Has No Org-Scoping on Mutations

File: src/lib/chat/tool-executor.ts:399, 465-503 Severity: HIGH

Several tool operations (list_deliverables, advance_stage, assign_artist, create_revision, etc.) pass user-supplied IDs to services without verifying the entity belongs to the user's org. Through chat, users can manipulate resources in other organizations.

Fix: Add org-scope checks in the tool executor before every mutation.

H6: API Key Auth Auto-Escalates to First Admin in First Org

File: src/lib/api-utils.ts:26-71 Severity: HIGH

prisma.organization.findFirst() with no ordering is nondeterministic. The API key holder gets ADMIN on whichever org Prisma returns first. The phantom fallback (lines 58-70) creates a fake admin with hardcoded organizationId: "api-org".

Fix: Create a dedicated API service account. Require X-Org-Id header. Remove phantom fallback. Use crypto.timingSafeEqual() for key comparison.

MEDIUM Findings

M1: CRON_SECRET in Query String

File: src/app/api/cron/deadlines/route.ts:12

Secret leaks to access logs, proxy logs, browser history.

Fix: Move to Authorization: Bearer <secret> header.

M2: 500MB Video Buffered in Memory

File: src/lib/services/upload-service.ts:231

file.arrayBuffer() loads entire video into Node.js heap. 4 concurrent 500MB uploads = 2GB RAM = OOM crash.

Fix: Stream directly to disk using file.stream() piped to fs.createWriteStream().

M3: 8 API Routes Missing Zod Validation

Routes using ad-hoc manual validation instead of Zod schemas:

automations POST/PATCH
chat POST
workload PATCH
skills POST
search/semantic POST
users/[userId]/skills POST
pipelines/[pipelineId]/duplicate POST
stages/[stageId] PATCH (date override bypasses validation)

Fix: Create Zod schemas for each.

M4: Automation Action Params Not Validated

File: src/lib/automation/action-executor.ts:359-386

validateActions only checks type and params exists, not param contents. Invalid status strings, arbitrary notification text, unvalidated roles all pass through.

Fix: Add per-action-type param validation schemas.

M5: z.any() in Validators Allows Arbitrary JSON Storage

Field	Validator	Line
`customStatuses`	`pipeline-template.ts`	23
`conditions.value`	`notification-rule.ts`	17
`attachments`	`revision.ts`	6

Fix: Replace with typed schemas.

M6: allowDangerousEmailAccountLinking Enabled

File: src/lib/auth.ts:16

Currently safe (single Entra ID provider), but becomes account takeover if another OAuth provider is added.

Fix: Add prominent warning comment. Disable flag if adding providers.

M7: Prompt Injection via Chat Driving Unconfirmed Mutations

File: src/app/api/chat/route.ts + tool-executor.ts

User messages drive LLM tool calls with up to 10 iterations. MUTATION_TOOLS are flagged but never actually require confirmation.

Fix: Enforce server-side confirmation for mutations. Add RBAC to tool executor.

M8: requireAuth() Leaks Permission Names

File: src/lib/rbac/require-auth.ts:51

Error message: Missing permission: PROJECT_UPDATE — reveals permission system to attackers.

Fix: Return generic "Forbidden" message.

LOW Findings

L1: API Key Timing Attack

=== comparison on API key (middleware.ts:15, api-utils.ts:29). Use crypto.timingSafeEqual().

L2: Health Endpoint Exposes Infrastructure Details

/api/health returns DB status, pgvector version, org count, DEV_BYPASS_AUTH state. Should be behind auth or return minimal info.

L3: No Rate Limiting on Any Endpoint

Includes the Claude API proxy (/api/chat). No throttling on login attempts, file uploads, or search.

L4: Prompt Injection in Search Summarization

Project names interpolated into Ollama prompt (semantic-search-service.ts:408-416). Low impact — read-only local model.

Dependency Vulnerabilities (npm audit)

17 vulnerabilities: 6 moderate, 11 high

Package	Severity	Issue	Fix Available?
`xlsx` (SheetJS)	HIGH	Prototype pollution + ReDoS	NO — no fix, consider replacing with `exceljs`
`lodash` (via chevrotain/prisma-ast)	HIGH	Prototype pollution in `_.unset`/`_.omit`, code injection in `_.template`	Via `npm audit fix`
`next@16.1.6`	MODERATE	HTTP smuggling, CSRF bypass, disk cache DoS	Via `npm audit fix`
`hono` (via prisma)	HIGH	9 vulnerabilities (XSS, SSRF, prototype pollution, etc.)	Via `npm audit fix`
`@hono/node-server` (via prisma)	HIGH	Auth bypass for static paths	Via `npm audit fix`
`defu`	HIGH	Prototype pollution via `__proto__`	Via `npm audit fix`
`effect` (via prisma)	HIGH	AsyncLocalStorage context contamination	Via `npm audit fix`
`flatted`	HIGH	Unbounded recursion DoS + prototype pollution	Via `npm audit fix`
`picomatch`	HIGH	Method injection + ReDoS	Via `npm audit fix`
`brace-expansion`	MODERATE	Process hang via zero-step sequence	Via `npm audit fix`

Key Concern: `xlsx@0.18.5`

This package has known prototype pollution vulnerabilities with no fix available. The app already uses exceljs as well. Consider removing xlsx and using exceljs exclusively.

Key Concern: NextAuth `5.0.0-beta.30`

This is a beta version of the authentication framework. Beta software may have undiscovered vulnerabilities and does not receive the same security scrutiny as stable releases.

Hardcoded Secrets Scan

Result: No hardcoded secrets found in source code.

All credentials are properly externalized via environment variables. The .env.example file contains placeholder values, not real credentials. Docker Compose uses ${DB_PASSWORD:-postgres} with a weak default — acceptable for dev but should be changed in production.

Architecture Notes

What's done well:

Service layer is cleanly separated from route handlers
Prisma ORM prevents most SQL injection (except raw queries for pgvector)
FFmpeg uses execFile (not exec) — no command injection
assertOrgAccess() is fail-closed (denies if model type not recognized)
File paths use server-generated UUIDs + timestamps, not user-supplied filenames
No secrets in source code

What needs work:

RBAC enforcement is inconsistent — opt-in pattern means new routes easily miss it
assertOrgAccess() only supports 3 entity types — needs extending
No rate limiting anywhere
No CSP headers
No request size limits beyond the 500MB video cap

Priority Fix Order

Immediate (before any deployment):
- C1: Add auth to file serving route
- C2: Restore NODE_ENV guard on DEV_BYPASS_AUTH
- H1: Add RBAC + org-scoping to 9 unprotected routes
Before external access:
- H3: SSRF protection on webhooks
- H4: Server-side MIME validation
- H5: Org-scoping on chat tool mutations
- H6: Fix API key auth (dedicated service account)
- Run npm audit fix
Soon after:
- All MEDIUM findings
- Replace xlsx with exceljs
- Add rate limiting (at minimum on /api/chat and /api/auth)
- Add CSP headers

12 KiB Raw Permalink Blame History