Commit graph

147 commits

Author SHA1 Message Date
Vadym Samoilenko
de4c862372 Fix signOut redirect: include basePath /hp-prod-tracker in redirectTo 2026-04-16 19:14:12 +01:00
Vadym Samoilenko
80114a65c8 Add sign-out button to sidebar
Exits the app session only (no Microsoft global logout).
Auth.js signOut() deletes the DB session and clears the cookie,
then redirects to /login.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 19:06:10 +01:00
Vadym Samoilenko
250796dd0c Replace Auth.js OAuth with MSAL.js SPA browser flow
- Token exchange now happens entirely in the browser via @azure/msal-browser
  (PKCE, no client_secret — correct for Azure SPA registrations)
- Browser stays on /hp-prod-tracker/login throughout; the /api/auth/callback
  URL never appears in the address bar
- New /api/auth/sso route validates the id_token (jose + Azure JWKS),
  creates User/Account/Session in Prisma, and sets the authjs session cookie
- Auth.js retained only for session reading (auth()) and signOut()
- Fix dev bypass safety gate: use NODE_ENV !== production instead of
  absence of AUTH_MICROSOFT_ENTRA_ID_SECRET
- Rename env vars: AUTH_MICROSOFT_ENTRA_ID_ID → AZURE_CLIENT_ID,
  AUTH_MICROSOFT_ENTRA_ID_TENANT_ID → AZURE_TENANT_ID, remove AUTH_URL
- Remove /api/auth Apache proxy rule (no longer needed)
- Delete OAuthRelay.tsx, add MsalLogin.tsx

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 18:49:43 +01:00
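
The dev bypass safety gate described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the helper name `isDevBypassAllowed`, the `Env` type, and the assumption that the flag is enabled with the string "true" are all hypothetical.

```typescript
// Hypothetical helper: the name and the "true" flag value are assumptions.
type Env = Record<string, string | undefined>;

function isDevBypassAllowed(env: Env): boolean {
  // The bypass has to be requested explicitly...
  const requested = env.DEV_BYPASS_AUTH === "true";
  // ...and is refused whenever the app runs in production mode,
  // regardless of which SSO credentials happen to be configured.
  return requested && env.NODE_ENV !== "production";
}
```

Keying the gate on NODE_ENV rather than on the absence of a secret matters here because SPA registrations have no client secret, so "secret missing" would be true in production as well.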
Vadym Samoilenko
6701946092 Fix OAuthRelay: relay on code-only, drop state check
Azure SPA returns ?code&session_state (no OAuth state). Auth.js also omits
state from the authorization URL when using PKCE. Two fixes:
- OAuthRelay: trigger on `code` alone, forward all params as-is
- auth.ts: checks: ["pkce"] — removes state requirement Auth.js would fail on

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 15:42:06 +01:00
Vadym Samoilenko
cadba79f55 deploy.sh: replace SECRET check with AZURE_REDIRECT_URI
SPA registration has no client_secret; check the new required var instead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 15:30:52 +01:00
Vadym Samoilenko
17fc539d19 Configure SSO for Azure SPA registration: PKCE without client_secret
- Override authorization redirect_uri to match Azure SPA portal registration
  (login page URL instead of Auth.js callback URL)
- Custom token.request: public client PKCE exchange — no client_secret sent
- Add OAuthRelay client component: forwards ?code&state from login page to
  /api/auth/callback/microsoft-entra-id via window.location.replace
- Add AZURE_REDIRECT_URI env var to docker-compose.yml and .env.example
- Remove AUTH_MICROSOFT_ENTRA_ID_SECRET (SPA registrations don't issue secrets)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 15:25:57 +01:00
Vadym Samoilenko
bf0bee9c28 Fix SSO: use /api/auth (no basePath) as OAuth redirect_uri
next-auth v5 beta.30 cannot reliably pass the /hp-prod-tracker prefix
through OAuth redirect_uri — redirectProxyUrl is silently ignored.

Instead: AUTH_URL=https://…/api/auth (matches basePath exactly), Auth.js
sends consistent redirect_uri in both authorization and token exchange,
Apache proxies /api/auth → :3001 before the OliVAS /api/ rule.

Azure must have https://optical-dev.oliver.solutions/api/auth/callback/microsoft-entra-id registered.
Server .env: AUTH_URL=https://optical-dev.oliver.solutions/api/auth

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 15:54:37 +01:00
Vadym Samoilenko
f5b091ceea Fix auth error redirect: include basePath in signIn page path
Auth.js constructs server-side redirects from origin only, ignoring the
Next.js basePath. Explicitly including /hp-prod-tracker in pages.signIn
ensures errors redirect to /hp-prod-tracker/login instead of /login.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 15:33:47 +01:00
Vadym Samoilenko
c412ad7bdf Fix health check URL to include basePath
App is served under /hp-prod-tracker basePath, so the health endpoint
is at /hp-prod-tracker/api/health not /api/health.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 15:29:43 +01:00
Vadym Samoilenko
1b07542a31 Fix SSO token exchange: restore redirectProxyUrl alongside explicit redirect_uri
authorization.params.redirect_uri fixes the authorization request URI.
redirectProxyUrl fixes the token exchange URI (beta.30 uses it there).
Both are needed. AUTH_URL must now include /api/auth suffix on the server.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 15:22:01 +01:00
Vadym Samoilenko
1950ecc7d6 Restore Apache step: add Include to /etc/apache2 on deploy
On first deploy replaces the old inline hp-prod-tracker block in
optical-dev.oliver.solutions.conf with an Include pointing to
apache/hp-prod-tracker.conf. Idempotent — skips if Include already present.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 15:16:30 +01:00
Vadym Samoilenko
b7d50ad715 Move Apache config to apache/hp-prod-tracker.conf, remove auto-management
Apache config on this server is managed manually in optical-dev.oliver.solutions.conf
(same pattern as cc-dashboard). Deploy script no longer touches Apache.
Config moved to apache/hp-prod-tracker.conf matching amazon-transcreation pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 15:15:44 +01:00
Vadym Samoilenko
63818bc6e2 Rewrite deploy.sh following ppt-tool pattern
Numbered steps matching server conventions: prerequisites install,
git pull with SSH auto-switch, .env validation, docker compose build,
postgres + health-check waits, idempotent Apache Include management,
UFW firewall. Apache step replaces old inline block with a canonical
Include pointing to deploy/apache.conf.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 15:13:24 +01:00
Vadym Samoilenko
548a9d8ef5 Add Apache config snippet and wire into deploy script
deploy/apache.conf: canonical Apache proxy config for hp-prod-tracker —
adds WebSocket passthrough and 500 MB upload limit missing from the
current inline config. deploy.sh now replaces the inline block with an
Include directive on each deploy so the config stays in source control.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 15:11:00 +01:00
Vadym Samoilenko
7e32bbc430 Add idempotent deploy script
Handles initial deploy and updates: git pull via SSH, docker compose
rebuild, health check with timeout, pre-flight .env validation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 15:08:29 +01:00
Vadym Samoilenko
6fd240860c Fix SSO redirect URI by setting authorization.params explicitly
next-auth v5 beta ignores redirectProxyUrl when constructing the
redirect_uri sent to Microsoft — it strips the pathname from AUTH_URL
and uses only the origin. Passing redirect_uri directly in
authorization.params guarantees the /hp-prod-tracker basePath is
included in the callback URL.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 15:04:42 +01:00
DJP
aae25a0959 Fix SSO redirect URI to include basePath via redirectProxyUrl
Auth.js route matching needs basePath="/api/auth" (Next.js strips
/hp-prod-tracker from the internal request). But the OAuth redirect_uri
sent to Microsoft must include the full external path.

Uses redirectProxyUrl to explicitly set the callback URL to
{AUTH_URL}/api/auth/callback/microsoft-entra-id, which includes
the /hp-prod-tracker basePath. Pins basePath="/api/auth" so
AUTH_URL's pathname doesn't override route matching.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-14 13:00:01 -04:00
DJP
f41dfe6024 Pass AUTH_URL through to container for SSO callback
Auth.js needs AUTH_URL to build the correct redirect URI
including the /hp-prod-tracker basePath.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-13 15:39:09 -04:00
DJP
30f804b7ff Fix login redirect missing basePath behind reverse proxy
Use request.nextUrl.clone() instead of new URL("/login", request.url)
so Next.js includes the /hp-prod-tracker basePath in redirects.
Without this, unauthenticated users get sent to /login instead of
/hp-prod-tracker/login.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-13 15:26:51 -04:00
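
The pitfall behind the commit above can be reproduced with the standard URL class alone: an absolute path passed to `new URL(path, base)` replaces the whole path of the base, so any prefix is lost. The sketch below is a plain-URL illustration under that assumption and does not use the Next.js API; the host name and the helper `loginUrlWithBasePath` are hypothetical.

```typescript
// Hypothetical request URL; the host is a placeholder.
const requestUrl = "https://example.test/hp-prod-tracker/dashboard";

// An absolute path replaces the entire path of the base URL,
// so the /hp-prod-tracker prefix disappears.
const naiveLogin = new URL("/login", requestUrl);

// Illustrative alternative: copy the request URL and set the full path explicitly.
function loginUrlWithBasePath(url: string, basePath: string): URL {
  const target = new URL(url);
  target.pathname = `${basePath}/login`;
  target.search = "";
  return target;
}

const prefixedLogin = loginUrlWithBasePath(requestUrl, "/hp-prod-tracker");
```

In the commit itself the prefix is preserved by cloning `request.nextUrl`, which according to the message keeps the basePath in the redirect.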
DJP
58d8459b43 Add README + smart Ollama→Claude escalation for tool calling
README.md:
- Full project overview, tech stack, features, AI architecture
- Deployment guide, data model, RBAC matrix, project structure

provider.ts:
- Reduce Ollama timeout from 180s to 45s (fail fast to Claude)
- Smart escalation: when Ollama responds with 0 tool calls but the
  query likely needed data (keyword match), automatically escalate
  to Claude for reliable tool calling
- Ollama still handles pure conversational queries for free
- Queries needing real data get Claude's reliable tool calling

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 16:09:36 -04:00
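
The escalation rule in the commit above (zero tool calls plus a keyword match) can be sketched like this. The keyword list, the `ProviderReply` shape, and both function names are assumptions made for illustration; the actual list in provider.ts is not shown in this log.

```typescript
// Hypothetical keyword list; the real one is not visible in the commit log.
const DATA_KEYWORDS = ["status", "workload", "assign", "overdue", "revision"];

// Assumed minimal shape of a provider reply.
interface ProviderReply {
  text: string;
  toolCalls: { name: string; input: unknown }[];
}

// True when the message mentions something that usually requires a data lookup.
function looksLikeDataQuery(message: string): boolean {
  const lower = message.toLowerCase();
  return DATA_KEYWORDS.some((kw) => lower.includes(kw));
}

// Escalate only when the first provider answered without calling any tool
// although the question appears to need real data.
function shouldEscalate(userMessage: string, reply: ProviderReply): boolean {
  return reply.toolCalls.length === 0 && looksLikeDataQuery(userMessage);
}
```

With this shape, purely conversational messages stay on the free provider and only suspicious "no tool call" answers are retried on the paid one.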
DJP
697b015675 Dynamic tool selection for Ollama based on user intent
Instead of sending all 12 tools every request, match the user's message
against keyword groups (status, workload, assign, create, advance, revision)
and only send relevant tools. search_entities always included for name
resolution. Falls back to basic query tools if no keywords match.

This cuts the tool definitions from ~12 to ~2-6 per request, significantly
reducing context size for gemma4.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 15:55:07 -04:00
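
A minimal sketch of the keyword-group selection described in the commit above. Only `search_entities` is named in the commit message; every other tool name, the keyword lists, and the fallback set are hypothetical placeholders.

```typescript
// Tool names other than search_entities are hypothetical placeholders.
const TOOL_GROUPS: Record<string, { keywords: string[]; tools: string[] }> = {
  status: { keywords: ["status", "progress"], tools: ["get_status"] },
  workload: { keywords: ["workload", "capacity"], tools: ["get_workload"] },
  assign: { keywords: ["assign"], tools: ["assign_task"] },
};

// Always sent, so names can be resolved whatever the intent is.
const ALWAYS_INCLUDED = ["search_entities"];

// Assumed "basic query tools" used when no keyword group matches.
const FALLBACK_TOOLS = ["get_status", "get_workload"];

function selectTools(message: string): string[] {
  const lower = message.toLowerCase();
  const selected = new Set<string>(ALWAYS_INCLUDED);
  let matched = false;

  for (const group of Object.values(TOOL_GROUPS)) {
    if (group.keywords.some((kw) => lower.includes(kw))) {
      matched = true;
      group.tools.forEach((tool) => selected.add(tool));
    }
  }

  if (!matched) {
    FALLBACK_TOOLS.forEach((tool) => selected.add(tool));
  }
  return Array.from(selected);
}
```

Substring matching is deliberately crude here; it keeps the selection cheap, at the cost of occasional false positives.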
DJP
e99391b824 Reduce Ollama context size for gemma4 reliability
- Filter tools to 12 (from 17) via OLLAMA_TOOL_ALLOWLIST
- Shorten tool descriptions to first sentence only
- Trim system prompt: drop pipeline details and suggestion format, keep Rules
- Reduce num_predict from 4096 to 2048
- Fix system prompt trimming to preserve Rules section (name resolution, mutation flow)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 15:52:48 -04:00
DJP
660caeeafc Add response logging for Ollama to diagnose timeout
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 15:42:32 -04:00
DJP
2c7f85bca3 Flatten Ollama conversation to plain text to fix JSON parse error
Ollama's parser chokes on deeply nested JSON in tool_use/tool_result
structured content blocks. Instead of sending OpenAI-format tool
messages, flatten everything to simple role/content text messages.
Tool results are truncated to 2KB to keep context manageable.

The model still receives tool definitions and can make new tool calls,
but prior tool interactions are shown as plain text in the history.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 15:28:18 -04:00
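
The flattening step in the commit above might look roughly like the sketch below. The block and message types are assumptions modelled on tool_use/tool_result content blocks, the bracketed text format is invented for illustration, and the 2KB cap is approximated by character count rather than bytes.

```typescript
// Assumed shapes of structured content blocks.
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; name: string; input: unknown }
  | { type: "tool_result"; content: string };

interface RichMessage {
  role: "user" | "assistant";
  content: string | Block[];
}

interface FlatMessage {
  role: "user" | "assistant";
  content: string;
}

// Approximates the 2KB cap from the commit message by character count.
const MAX_TOOL_RESULT_CHARS = 2048;

// Render one structured block as a line of plain text.
function flattenBlock(block: Block): string {
  switch (block.type) {
    case "text":
      return block.text;
    case "tool_use":
      return `[called ${block.name} with ${JSON.stringify(block.input)}]`;
    case "tool_result":
      return `[tool result] ${block.content.slice(0, MAX_TOOL_RESULT_CHARS)}`;
  }
}

// Collapse every message to a simple role/content pair.
function flattenMessages(messages: RichMessage[]): FlatMessage[] {
  return messages.map((m) => ({
    role: m.role,
    content:
      typeof m.content === "string"
        ? m.content
        : m.content.map(flattenBlock).join("\n"),
  }));
}
```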
DJP
ddbd0a3fd3 Fix Ollama JSON parse error by sending Content-Length header
Ollama was receiving chunked transfer encoding from Node.js fetch and
failing to parse the JSON body ("can't find closing '}' symbol").
Sending a Buffer with explicit Content-Length forces a single complete
body write.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 15:21:21 -04:00
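
A rough sketch of the idea in the commit above: serialize the JSON once into a Buffer and declare its byte length, so the body is written in one piece. The helper name `buildJsonRequest` and the returned shape are hypothetical, and whether a given fetch implementation honours an explicit Content-Length header depends on the runtime.

```typescript
// Hypothetical helper; neither the name nor the return shape comes from the repository.
function buildJsonRequest(payload: unknown): {
  method: string;
  headers: Record<string, string>;
  body: Buffer;
} {
  // Serialize once so the declared length matches the bytes that are sent.
  const body = Buffer.from(JSON.stringify(payload), "utf8");
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Byte length, not string length: multi-byte characters count more than once.
      "Content-Length": String(body.byteLength),
    },
    body,
  };
}
```

The returned object is meant to be passed as the options argument of the HTTP call. Using the byte length is the important detail, since a non-ASCII character in a chat message would make the string length too small.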
DJP
b55b652c55 Add detailed Ollama logging and increase timeout to 180s
Logs request size, message count, and detailed error info to help
diagnose the "can't find closing '}'" JSON parsing error from Ollama.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 15:20:44 -04:00
DJP
d4fa69957e Switch Ollama chat model to gemma4:latest
Gemma 4 loads successfully, supports tool calling with proper
structured output, and responds in ~100ms after initial load.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 15:11:59 -04:00
DJP
49f301f6f4 Use mistral:latest (7B) for Ollama chat — only model that loads on server
Larger models (mistral-large 122B, qwen3-coder 30B, gpt-oss 20B) all
fail to load due to resource limits. mistral:latest (7.2B) loads and
responds successfully.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 14:40:58 -04:00
DJP
93ab4a0947 Switch Ollama chat model to qwen3-coder:30b (mistral-large too large for server)
mistral-large:latest requires 420GB RAM, server only has 345GB.
qwen3-coder:30b is a 30.5B MoE model that fits in ~20GB with good
tool calling and reasoning capabilities.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 14:38:34 -04:00
DJP
83ce802264 Make Ollama primary AI provider, Claude as paid fallback
- Ollama (internal GPU server) is tried first — free
- If Ollama is down, falls back to Claude API with a browser toast:
  "Ollama unavailable — using Claude (paid API)"
- Provider badge shows which one is active (orange/purple)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 14:06:35 -04:00
DJP
6e19c1f046 Add Ollama as fallback AI provider, remove local Ollama container
- Claude is primary, Ollama (internal GPU server) is automatic fallback
- Provider auto-selects: Claude if API key set, else Ollama if reachable
- Ollama uses mistral-large:latest for chat with full tool calling support
- Removed local Ollama Docker service — uses remote at 10.24.42.219
- Chat panel badge shows "Claude" (purple) or "Ollama" (orange)
- OLLAMA_CHAT_HOST and OLLAMA_CHAT_MODEL env vars for configuration

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 12:30:13 -04:00
DJP
3209a5dbee Prevent chat from exceeding Claude context limit
- Cap conversation history to last 20 messages
- Truncate tool results over 8KB before sending back to Claude
- Trim long assistant messages in client-side history to 2KB

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 12:17:27 -04:00
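
The first two limits in the commit above (last 20 messages, tool results capped at 8KB) can be sketched as follows. The message shape, the "tool" role, the function names, and the "[truncated]" marker are assumptions, and the 8KB cap is approximated by character count.

```typescript
// Assumed message shape; the "tool" role marks tool results.
interface ChatMessage {
  role: "user" | "assistant" | "tool";
  content: string;
}

const MAX_HISTORY = 20;
// Approximates the 8KB cap by character count.
const MAX_TOOL_RESULT_CHARS = 8 * 1024;
const TRUNCATION_MARKER = " [truncated]";

// Cut a string to a maximum length and mark that it was shortened.
function truncate(text: string, maxChars: number): string {
  return text.length <= maxChars
    ? text
    : text.slice(0, maxChars) + TRUNCATION_MARKER;
}

// Keep only the most recent messages and shorten oversized tool results.
function prepareHistory(messages: ChatMessage[]): ChatMessage[] {
  return messages.slice(-MAX_HISTORY).map((m) =>
    m.role === "tool"
      ? { ...m, content: truncate(m.content, MAX_TOOL_RESULT_CHARS) }
      : m
  );
}
```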
DJP
2f1afed855 Pass ANTHROPIC_API_KEY through to Docker container
The env var was in .env but not listed in docker-compose environment
block, so the container never received it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 12:11:18 -04:00
DJP
38bd8ac63d Add safety guardrails to AI chat assistant
- Mutation confirmation: all write operations (create, update, assign)
  now pause and show a confirmation card before executing. Users must
  click Confirm or Cancel.
- RBAC enforcement: Artists blocked from mutations via chat, Producers
  blocked from bulk operations. Only Admins get full access.
- Rate limiting: 20 requests/minute per user on the chat endpoint.
- System prompt updated to not instruct Claude to execute directly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 12:05:14 -04:00
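
The rate limit in the commit above (20 requests per minute per user) could be implemented in several ways. The sketch below assumes a sliding window kept in process memory, which only works for a single app instance; the function name and storage choice are hypothetical.

```typescript
const WINDOW_MS = 60 * 1000;
const MAX_REQUESTS = 20;

// Timestamps of recent requests per user, kept in memory (single process only).
const recentHits = new Map<string, number[]>();

// Returns true and records the hit if the user is under the limit.
function allowRequest(userId: string, now: number = Date.now()): boolean {
  // Drop timestamps that have left the one-minute window.
  const recent = (recentHits.get(userId) ?? []).filter(
    (t) => now - t < WINDOW_MS
  );

  if (recent.length >= MAX_REQUESTS) {
    recentHits.set(userId, recent);
    return false;
  }

  recent.push(now);
  recentHits.set(userId, recent);
  return true;
}
```

Passing `now` explicitly keeps the function easy to test; a route handler would normally call it with the default.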
DJP
277ad85073 Prepend basePath to stored media URLs so assets load under /hp-prod-tracker
upload-service.ts and annotation-service.ts were storing URLs like
/api/uploads/revisions/... in the database. When the app is served at
/hp-prod-tracker, the browser needs /hp-prod-tracker/api/uploads/...
to hit the correct route.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 07:47:08 -04:00
DJP
5785f142fd Fix upload/delete/annotation fetch calls to use apiUrl() for basePath
Three files had hardcoded /api/ URLs that bypassed the basePath prefix,
causing 404s when the app is served under /hp-prod-tracker.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 22:19:41 -04:00
DJP
c1a003570e Fix fetchJson in 17 hooks to use basePath prefix
All hook files had local fetchJson() helpers calling fetch(url) directly,
bypassing the basePath. Now wrapped with apiUrl() so API calls work
under /hp-prod-tracker path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 22:09:41 -04:00
DJP
60ec707814 Add /hp-prod-tracker basePath for path-based hosting
- Set basePath in next.config.ts for serving under /hp-prod-tracker
- Create apiUrl() helper to prepend basePath to fetch calls
- Update all 28 fetch("/api/...") calls across 16 files
- Add GCS storage migration plan doc

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 21:47:30 -04:00
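
A minimal sketch of what an `apiUrl()` helper like the one named in the commit above could look like. The constant `BASE_PATH` and the guard against double-prefixing are assumptions; the repository's version may differ.

```typescript
// Must match basePath in next.config.ts (assumed to be a plain constant here).
const BASE_PATH = "/hp-prod-tracker";

// Prepend the basePath to an API path so fetch calls work under the prefix.
function apiUrl(path: string): string {
  const normalized = path.startsWith("/") ? path : `/${path}`;
  // Assumption: already-prefixed paths are returned unchanged.
  if (normalized === BASE_PATH || normalized.startsWith(`${BASE_PATH}/`)) {
    return normalized;
  }
  return `${BASE_PATH}${normalized}`;
}
```

A call site then changes from `fetch("/api/...")` to `fetch(apiUrl("/api/..."))`, which is the kind of change the later hook and upload fixes apply.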
DJP
26c766cf43 Security hardening: fix critical auth, RBAC, and injection vulnerabilities
- C1: Add authentication to file serving route + canonical path traversal check + nosniff header
- C2: DEV_BYPASS_AUTH now only works when Entra ID credentials are not configured
- H1: Add requireAuth() + assertOrgAccess() to 9 unprotected routes (upload, feedback, annotations, color-probes, reviews)
- H2: Add org-scoping to 4 routes (automations, users, skills)
- H3: SSRF protection on webhook URLs — HTTPS only, private/internal IPs blocked
- H6: API key uses timingSafeEqual, phantom fallback removed, supports X-Org-Id header
- M1: CRON_SECRET moved from query string to Authorization Bearer header
- Extend assertOrgAccess() to support 10 model types (was 3)
- npm audit fix: 17 vulnerabilities reduced to 4
- Add SECURITY-REVIEW.md with full findings report

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 20:48:05 -04:00
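
For item H6 in the commit above, a constant-time API key comparison can be sketched with Node's `timingSafeEqual`. Hashing both values first is one common way to satisfy the function's equal-length requirement; the helper name `apiKeyMatches` is hypothetical and the commit does not say which approach the code takes.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper: compares two keys without leaking where they differ.
function apiKeyMatches(provided: string, expected: string): boolean {
  // timingSafeEqual throws if the inputs differ in length, so compare
  // fixed-length SHA-256 digests instead of the raw strings.
  const a = createHash("sha256").update(provided, "utf8").digest();
  const b = createHash("sha256").update(expected, "utf8").digest();
  return timingSafeEqual(a, b);
}
```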
DJP
4c0e9d32df Dev server deployment: port conflicts, auth bypass, API key, UI fixes
- Remap ports (3001, 5491) to avoid conflicts on shared server
- Remove NODE_ENV guard from DEV_BYPASS_AUTH in middleware, api-utils, layout
- Add API key authentication for external integrations
- Comment out Ollama dependency (optional for dev)
- Fix pipeline graph: topological depth layout for parallel branches
- Fix uploads: move to /data/uploads volume, serve via /api/uploads
- Fix wipe comparison: correct A/B layering, transformOrigin, ResizeObserver fit
- Fix Dockerfile: create /app/public directory for standalone build

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 17:17:54 -04:00
Leivur Djurhuus
010d29656c Clean up deployment config: remove Docker Hub refs, Cloudflare Tunnel
Source code is now on Bitbucket — IT builds from source directly.
Docker Hub and Cloudflare Tunnel are no longer needed. Removed
profiles gate from app service so docker compose up -d works without
flags. Updated .env.example with organized sections and comments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:44:09 -05:00
Leivur Djurhuus
005a7acbe2 Fix Docker image: add prisma + dotenv to runner stage
The standalone Next.js output doesn't include prisma (devDependency)
or dotenv (only used by prisma.config.ts, not app runtime). Install
them explicitly in the runner stage for prisma migrate deploy.
Pin prisma@7.4.2 to avoid npx downloading a non-existent version.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:34:48 -05:00
Leivur Djurhuus
449b248323 Document SSO seed-user linking pattern for next-auth v5
Captures the allowDangerousEmailAccountLinking pattern for linking
pre-seeded users to SSO accounts, org auto-assignment via signIn
event, limbo page for unprovisioned users, and DEV_BYPASS_AUTH
production guard.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 17:25:58 -05:00
Leivur Djurhuus
ffbc5a2e31 Add standalone output for Docker deployment, gitignore deploy dir
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 16:49:58 -05:00
Leivur Djurhuus
fa55dfc25f Add deployment infrastructure: health endpoint, Docker Compose fixes, tunnel
- Add /api/health endpoint checking DB, pgvector, org, templates,
  dev bypass safety, and AUTH_SECRET presence
- Fix Docker Compose app service: AUTH_SECRET, Entra ID env vars,
  AUTH_TRUST_HOST, app health check
- Add Cloudflare Tunnel service for zero-config HTTPS access
- Exclude health endpoint from auth middleware

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 14:54:15 -05:00
Leivur Djurhuus
0eaf809bc6 Add SSO bridge: Microsoft Entra ID auth with seed user linking
Configure Microsoft Entra ID as the sole SSO provider with
allowDangerousEmailAccountLinking to link SSO accounts to existing
seeded user records by email match. Add signIn event for automatic
org assignment by domain. Guard DEV_BYPASS_AUTH against production
use. Add branded pending page for authenticated users without org
membership. Remove Google provider for initial rollout simplicity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 14:52:13 -05:00
Leivur Djurhuus
4149b2cf40 Switch from db push to versioned Prisma migrations
Replace 2 stale migration files with a single baseline migration
capturing the full 40+ model schema. The database was freshly reset
via clean-slate, making this the ideal time to establish migration
history. Dockerfile now runs prisma migrate deploy before app start.
Updated SETUP.md and ROADMAP.md to reference prisma migrate dev
instead of db push.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 14:45:36 -05:00
Leivur Djurhuus
29657aeefd Gitignore database backup files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 14:40:53 -05:00
Leivur Djurhuus
aa20767035 Add clean slate toolkit solution documentation
Documents the purge-and-reseed pattern for transitioning from dev to
production data, including FK-safe deletion order, self-referential FK
handling, and backup/restore procedures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 14:39:37 -05:00
Leivur Djurhuus
dfa067e95f Database cleanup pre rollout 2026-04-06 14:35:56 -05:00