dow-prod-tracker/SECURITY-REVIEW.md
DJP 26c766cf43 Security hardening: fix critical auth, RBAC, and injection vulnerabilities
- C1: Add authentication to file serving route + canonical path traversal check + nosniff header
- C2: DEV_BYPASS_AUTH now only works when Entra ID credentials are not configured
- H1: Add requireAuth() + assertOrgAccess() to 9 unprotected routes (upload, feedback, annotations, color-probes, reviews)
- H2: Add org-scoping to 4 routes (automations, users, skills)
- H3: SSRF protection on webhook URLs — HTTPS only, private/internal IPs blocked
- H6: API key uses timingSafeEqual, phantom fallback removed, supports X-Org-Id header
- M1: CRON_SECRET moved from query string to Authorization Bearer header
- Extend assertOrgAccess() to support 10 model types (was 3)
- npm audit fix: 17 vulnerabilities reduced to 4
- Add SECURITY-REVIEW.md with full findings report

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 20:48:05 -04:00

12 KiB

HP Prod Tracker — Security Code Review

Date: 2026-04-07 Reviewed by: Claude Code (automated deep review) Scope: Full codebase — 342 files, ~117K LOC, 62 API routes


Executive Summary

This codebase has significant security gaps that must be addressed before any production or external-facing deployment. The most critical issues are:

  1. No authentication on file serving — uploaded media is publicly accessible
  2. DEV_BYPASS_AUTH has no production guard — misconfiguration grants full admin access
  3. 11 API routes lack RBAC and org-scoping — cross-org data access possible
  4. SSRF via webhook automation — internal network scanning possible
  5. 17 npm vulnerabilities (11 high severity) including prototype pollution in xlsx and lodash

The service layer and Prisma integration are generally well-structured. The core architecture is sound, but the authorization layer has inconsistent enforcement.


CRITICAL Findings

C1: Unauthenticated File Serving (Public Access to All Uploads)

File: src/app/api/uploads/[...path]/route.ts Severity: CRITICAL

The GET handler has zero authentication. Anyone with a URL can download any uploaded revision image, video, or HLS segment. The path traversal check (relativePath.includes("..")) is also weak — should use path.resolve() canonicalization instead.

Impact: All confidential pre-release product imagery and videos are publicly accessible to anyone who can guess the URL pattern (/api/uploads/revisions/{uuid}/{type}_{timestamp}.ext).

Fix: Add requireAuth() + org-scoped access check. Replace string-contains .. check with canonical path validation.


C2: DEV_BYPASS_AUTH Has No Production Guard

Files: src/middleware.ts:6, src/lib/api-utils.ts:8, src/app/(app)/layout.tsx:13 Severity: CRITICAL

DEV_BYPASS_AUTH=true bypasses ALL authentication and grants ADMIN role. There is no NODE_ENV !== "production" guard. If this env var is set in production (misconfiguration, copied .env), the entire app is wide open.

Impact: Complete auth bypass — any request gets full admin access.

Fix: Add && process.env.NODE_ENV !== "production" guard, or better, strip the env var from production Docker images entirely.

Note: This guard was intentionally removed during our local development setup. It MUST be restored before any external deployment.


HIGH Findings

H1: 9 API Routes Use getAuthSession() Without RBAC or Org-Scoping

Severity: HIGH

These routes authenticate the user but perform no permission check and no org-scope verification. Any authenticated user from any org can access/modify these resources:

Route Methods Impact
stages/[stageId]/revisions/[revisionId]/upload POST, DELETE Upload/delete files on any revision
stages/[stageId]/feedback GET, POST Read/create feedback on any stage
feedback/[itemId] PATCH, DELETE Modify/delete any feedback item
revisions/[revisionId]/annotations GET, POST List/create annotations on any revision
revisions/[revisionId]/annotations/[annotationId] PATCH, DELETE Modify/delete any annotation
revisions/[revisionId]/color-probes GET, POST, DELETE Access any color probes
revisions/[revisionId]/color-probes/[probeId] PATCH, DELETE Modify/delete any probe
reviews GET, POST Create reviews (service does scope reads)
reviews/[sessionId] GET, PATCH, DELETE Read/modify/delete any review session

Fix: Replace getAuthSession() with requireAuth(PERMISSION) and add assertOrgAccess() checks. Extend assertOrgAccess() to support revision, annotation, colorProbe, feedbackItem, and reviewSession models.

H2: 4 Routes Have RBAC but Missing Org-Scoping on Parameterized IDs

Severity: HIGH

Route Methods Impact
automations/[ruleId] GET, PATCH, DELETE Cross-org automation rule access
automations/[ruleId]/executions GET Cross-org execution log leakage
users/[userId] PATCH Change user role in another org
users/[userId]/skills GET, POST, DELETE Cross-org skill manipulation

Fix: Add org-scope verification for the target resource ID in each route.

H3: SSRF via Webhook URL in Automation Actions

File: src/lib/automation/action-executor.ts:250-336 Severity: HIGH

The send_webhook action accepts arbitrary URLs with no validation. Attackers with AUTOMATION_MANAGE permission can:

  • Probe internal network services (cloud metadata, localhost services)
  • Exfiltrate project data to external servers

Fix: Validate URL scheme (HTTPS only), block private/internal IP ranges, consider domain allowlist.

H4: MIME Type Validation Uses Client-Supplied file.type Only

File: src/lib/services/upload-service.ts:64-66, 206-209 Severity: HIGH

No server-side magic byte verification. An attacker can upload arbitrary files (HTML, SVG with scripts) by setting Content-Type to image/png.

Fix: Use file-type npm package for magic byte validation. Add X-Content-Type-Options: nosniff header on file serving responses.

H5: Chat Tool Executor Has No Org-Scoping on Mutations

File: src/lib/chat/tool-executor.ts:399, 465-503 Severity: HIGH

Several tool operations (list_deliverables, advance_stage, assign_artist, create_revision, etc.) pass user-supplied IDs to services without verifying the entity belongs to the user's org. Through chat, users can manipulate resources in other organizations.

Fix: Add org-scope checks in the tool executor before every mutation.

H6: API Key Auth Auto-Escalates to First Admin in First Org

File: src/lib/api-utils.ts:26-71 Severity: HIGH

prisma.organization.findFirst() with no ordering is nondeterministic. The API key holder gets ADMIN on whichever org Prisma returns first. The phantom fallback (lines 58-70) creates a fake admin with hardcoded organizationId: "api-org".

Fix: Create a dedicated API service account. Require X-Org-Id header. Remove phantom fallback. Use crypto.timingSafeEqual() for key comparison.


MEDIUM Findings

M1: CRON_SECRET in Query String

File: src/app/api/cron/deadlines/route.ts:12

Secret leaks to access logs, proxy logs, browser history.

Fix: Move to Authorization: Bearer <secret> header.

M2: 500MB Video Buffered in Memory

File: src/lib/services/upload-service.ts:231

file.arrayBuffer() loads entire video into Node.js heap. 4 concurrent 500MB uploads = 2GB RAM = OOM crash.

Fix: Stream directly to disk using file.stream() piped to fs.createWriteStream().

M3: 8 API Routes Missing Zod Validation

Routes using ad-hoc manual validation instead of Zod schemas:

  • automations POST/PATCH
  • chat POST
  • workload PATCH
  • skills POST
  • search/semantic POST
  • users/[userId]/skills POST
  • pipelines/[pipelineId]/duplicate POST
  • stages/[stageId] PATCH (date override bypasses validation)

Fix: Create Zod schemas for each.

M4: Automation Action Params Not Validated

File: src/lib/automation/action-executor.ts:359-386

validateActions only checks type and params exists, not param contents. Invalid status strings, arbitrary notification text, unvalidated roles all pass through.

Fix: Add per-action-type param validation schemas.

M5: z.any() in Validators Allows Arbitrary JSON Storage

Field Validator Line
customStatuses pipeline-template.ts 23
conditions.value notification-rule.ts 17
attachments revision.ts 6

Fix: Replace with typed schemas.

M6: allowDangerousEmailAccountLinking Enabled

File: src/lib/auth.ts:16

Currently safe (single Entra ID provider), but becomes account takeover if another OAuth provider is added.

Fix: Add prominent warning comment. Disable flag if adding providers.

M7: Prompt Injection via Chat Driving Unconfirmed Mutations

File: src/app/api/chat/route.ts + tool-executor.ts

User messages drive LLM tool calls with up to 10 iterations. MUTATION_TOOLS are flagged but never actually require confirmation.

Fix: Enforce server-side confirmation for mutations. Add RBAC to tool executor.

M8: requireAuth() Leaks Permission Names

File: src/lib/rbac/require-auth.ts:51

Error message: Missing permission: PROJECT_UPDATE — reveals permission system to attackers.

Fix: Return generic "Forbidden" message.


LOW Findings

L1: API Key Timing Attack

=== comparison on API key (middleware.ts:15, api-utils.ts:29). Use crypto.timingSafeEqual().

L2: Health Endpoint Exposes Infrastructure Details

/api/health returns DB status, pgvector version, org count, DEV_BYPASS_AUTH state. Should be behind auth or return minimal info.

L3: No Rate Limiting on Any Endpoint

Includes the Claude API proxy (/api/chat). No throttling on login attempts, file uploads, or search.

L4: Prompt Injection in Search Summarization

Project names interpolated into Ollama prompt (semantic-search-service.ts:408-416). Low impact — read-only local model.


Dependency Vulnerabilities (npm audit)

17 vulnerabilities: 6 moderate, 11 high

Package Severity Issue Fix Available?
xlsx (SheetJS) HIGH Prototype pollution + ReDoS NO — no fix, consider replacing with exceljs
lodash (via chevrotain/prisma-ast) HIGH Prototype pollution in _.unset/_.omit, code injection in _.template Via npm audit fix
next@16.1.6 MODERATE HTTP smuggling, CSRF bypass, disk cache DoS Via npm audit fix
hono (via prisma) HIGH 9 vulnerabilities (XSS, SSRF, prototype pollution, etc.) Via npm audit fix
@hono/node-server (via prisma) HIGH Auth bypass for static paths Via npm audit fix
defu HIGH Prototype pollution via __proto__ Via npm audit fix
effect (via prisma) HIGH AsyncLocalStorage context contamination Via npm audit fix
flatted HIGH Unbounded recursion DoS + prototype pollution Via npm audit fix
picomatch HIGH Method injection + ReDoS Via npm audit fix
brace-expansion MODERATE Process hang via zero-step sequence Via npm audit fix

Key Concern: xlsx@0.18.5

This package has known prototype pollution vulnerabilities with no fix available. The app already uses exceljs as well. Consider removing xlsx and using exceljs exclusively.

Key Concern: NextAuth 5.0.0-beta.30

This is a beta version of the authentication framework. Beta software may have undiscovered vulnerabilities and does not receive the same security scrutiny as stable releases.


Hardcoded Secrets Scan

Result: No hardcoded secrets found in source code.

All credentials are properly externalized via environment variables. The .env.example file contains placeholder values, not real credentials. Docker Compose uses ${DB_PASSWORD:-postgres} with a weak default — acceptable for dev but should be changed in production.


Architecture Notes

What's done well:

  • Service layer is cleanly separated from route handlers
  • Prisma ORM prevents most SQL injection (except raw queries for pgvector)
  • FFmpeg uses execFile (not exec) — no command injection
  • assertOrgAccess() is fail-closed (denies if model type not recognized)
  • File paths use server-generated UUIDs + timestamps, not user-supplied filenames
  • No secrets in source code

What needs work:

  • RBAC enforcement is inconsistent — opt-in pattern means new routes easily miss it
  • assertOrgAccess() only supports 3 entity types — needs extending
  • No rate limiting anywhere
  • No CSP headers
  • No request size limits beyond the 500MB video cap

Priority Fix Order

  1. Immediate (before any deployment):

    • C1: Add auth to file serving route
    • C2: Restore NODE_ENV guard on DEV_BYPASS_AUTH
    • H1: Add RBAC + org-scoping to 9 unprotected routes
  2. Before external access:

    • H3: SSRF protection on webhooks
    • H4: Server-side MIME validation
    • H5: Org-scoping on chat tool mutations
    • H6: Fix API key auth (dedicated service account)
    • Run npm audit fix
  3. Soon after:

    • All MEDIUM findings
    • Replace xlsx with exceljs
    • Add rate limiting (at minimum on /api/chat and /api/auth)
    • Add CSP headers