Leivur R. Djurhuus b37c7d0bf4 Refactor code structure for improved readability and maintainability

2026-03-14 13:17:19 -05:00

CLI-Anything Integration: HP CG Production Tracker

Executive Summary

CLI-Anything is an open-source framework (MIT licensed) that generates a command-line interface from an existing codebase, in this case our web application. This CLI layer lets AI agents operate the tracker through structured commands, which opens the way to automating the production pipeline from job intake to delivery.

The goal: shift producers from operators (doing everything manually) to supervisors (approving, overriding, handling exceptions).

Key Findings

What CLI-Anything Does

Auto-generates a production-ready CLI from an existing codebase
7-phase pipeline: analyze source → design commands → implement → plan tests → write tests → document → publish
Outputs a pip-installable package with JSON output mode for AI consumption
Includes auto-generated tests and documentation
Validated across 11 major applications (1,508 passing tests)
Repository: https://github.com/HKUDS/CLI-Anything

Why It Fits Our Project

Our tracker already has a clean service layer with 26 REST API endpoints
Zod validators translate directly to CLI command schemas
Dependency engine and business rules are already codified
Skills, capacity, and workload data already exists in the system
Ollama is already running locally for embeddings (pgvector + nomic-embed-text)

Cost

CLI-Anything: Free (MIT license)
Claude API (default): ~$0.01–0.05 per interaction, highly reliable tool calling
Local LLM (Ollama): Free, already in our Docker Compose stack — serves as offline fallback
Primary cost is implementation time

What Can Be Automated

| Producer Task | Automation Level | How |
| --- | --- | --- |
| Job intake / project creation | Fully automatable | AI parses incoming requests (email, brief docs) and creates projects, deliverables, and pipeline stages |
| Artist assignment | Fully automatable | AI matches skills, capacity, and department data to assign the best available artist |
| Stage progression | Mostly automatable | Dependency engine auto-advances stages as prerequisites are approved; AI triggers downstream work |
| Deadline monitoring & escalation | Fully automatable | Scheduled agent flags overdue items, nudges artists, escalates only true blockers to producers |
| Status reporting | Fully automatable | Agent queries tracker and generates summaries on demand or on schedule |
| Revision cycles | Partially automatable | Agent logs revisions and reassigns artists; creative review still requires human eyes |
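
The "Stage progression" row can be illustrated with a small sketch of how a dependency engine might decide which stages are ready to start. The `PipelineStage` shape, the status names, the stage ids other than Model Prep, and `findReadyStages` are all hypothetical; the tracker's real dependency engine may work differently.

```typescript
// Hypothetical stage shape; the tracker's real model may differ.
type StageStatus = "blocked" | "in_progress" | "approved";

interface PipelineStage {
  id: string;
  status: StageStatus;
  dependsOn: string[]; // ids of prerequisite stages
}

// Return the ids of blocked stages whose prerequisites are all approved.
function findReadyStages(stages: PipelineStage[]): string[] {
  const statusById = new Map<string, StageStatus>();
  for (const s of stages) statusById.set(s.id, s.status);
  return stages
    .filter((s) => s.status === "blocked")
    .filter((s) => s.dependsOn.every((dep) => statusById.get(dep) === "approved"))
    .map((s) => s.id);
}

const exampleStages: PipelineStage[] = [
  { id: "model-prep", status: "approved", dependsOn: [] },
  { id: "lookdev", status: "blocked", dependsOn: ["model-prep"] },
  { id: "render", status: "blocked", dependsOn: ["lookdev"] },
];

const readyStageIds = findReadyStages(exampleStages);
console.log(readyStageIds); // [ 'lookdev' ]
```

Here `render` stays blocked because `lookdev` is not yet approved, so only one stage advances per approval.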

Producer Role: Before vs After

| Today | With Automation |
| --- | --- |
| Create projects manually | Review auto-created projects, approve |
| Assign artists by memory | Review AI-suggested assignments, override if needed |
| Monitor every stage daily | Get alerted only on exceptions and blockers |
| Chase artists for updates | Agent handles nudges and follow-ups |
| Compile status reports | Reports generated automatically |
| Advance stages manually | Pipeline advances itself |

Architecture Decision: Chat Interface

Default: Claude API + Tool Use

Claude interprets natural language and calls tools mapped to our services
Best-in-class accuracy for tool calling, multi-step operations, and ambiguous requests
~$0.01–0.05 per interaction, a negligible cost at 10–30 producers
Estimated monthly cost: ~$20–100 depending on usage volume
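
To sanity-check the monthly figure, the estimate can be written as producers × interactions per day × workdays × cost per interaction. The sketch below does that arithmetic; the usage volumes (interactions per producer per day, workdays per month) are assumptions for illustration, not measurements from our team.

```typescript
// All usage figures are assumptions for illustration, not measurements.
interface UsageAssumptions {
  producers: number;
  interactionsPerProducerPerDay: number;
  workdaysPerMonth: number;
  centsPerInteraction: number; // this document estimates roughly 1 to 5 cents
}

// Work in integer cents to avoid floating-point rounding surprises.
function estimateMonthlyCostUsd(u: UsageAssumptions): number {
  const totalCents =
    u.producers *
    u.interactionsPerProducerPerDay *
    u.workdaysPerMonth *
    u.centsPerInteraction;
  return totalCents / 100;
}

const lowEstimateUsd = estimateMonthlyCostUsd({
  producers: 10,
  interactionsPerProducerPerDay: 10,
  workdaysPerMonth: 20,
  centsPerInteraction: 1,
});
const midEstimateUsd = estimateMonthlyCostUsd({
  producers: 20,
  interactionsPerProducerPerDay: 5,
  workdaysPerMonth: 20,
  centsPerInteraction: 3,
});
const heavyEstimateUsd = estimateMonthlyCostUsd({
  producers: 30,
  interactionsPerProducerPerDay: 10,
  workdaysPerMonth: 20,
  centsPerInteraction: 5,
});

console.log(lowEstimateUsd, midEstimateUsd, heavyEstimateUsd); // 20 60 300
```

Under these assumed volumes, heavy usage by 30 producers lands above the ~$20–100 range, so that range implicitly assumes lighter usage. Real numbers should come from the Anthropic usage dashboard once the chat is live.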

Fallback: Local LLM via Ollama

Activates automatically if the Claude API is unreachable (outage, network issues)
Llama 3 70B or Qwen 2.5 72B running locally via existing Docker Compose stack
Handles most straightforward operations reliably
Ensures producers are never blocked — the chat assistant stays online regardless

Why Claude as Default

Tool-calling accuracy is significantly higher than with local models, which means fewer misrouted commands and fewer confirmation retries
Handles complex multi-step requests out of the box ("create 20 deliverables and assign them based on availability")
The cost is trivial relative to the producer time saved
Ollama remains valuable as a zero-downtime safety net, not a cost-saving measure

Implementation Steps

Phase 1: Generate the CLI

Install CLI-Anything as a Claude Code plugin:

/plugin marketplace add HKUDS/CLI-Anything
/plugin install cli-anything

Run CLI generation against the tracker codebase:

/cli-anything ~/Documents/VScode/hp_prod_tracker

Review generated commands — ensure they map correctly to existing services:
- Project CRUD (create, list, update, archive)
- Deliverable CRUD + bulk creation
- Stage advancement + status updates
- Artist assignment (with skill/capacity awareness)
- Revision logging
- Workload queries
- Excel import/export

Refine coverage for any missing operations:

/cli-anything:refine ~/Documents/VScode/hp_prod_tracker "pipeline dependencies and bulk operations"

Run the auto-generated tests and validate against the real database

Install the CLI locally:

cd hp_prod_tracker/agent-harness && pip install -e .

Phase 2: Build the Chat UI Component

Create a slide-out chat panel using shadcn/ui (Sheet + ScrollArea + Input)
Add a chat icon/button to the app sidebar or top bar
Store chat history per user (new Prisma model or simple local state for V1)
Pass current context (active project, deliverable) into the chat so producers don't have to specify everything
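
One way to pass context is to let the chat route fill in any ids the producer did not mention from whatever is open in the UI. The sketch below assumes hypothetical `ChatContext` and `TargetArgs` shapes and a hypothetical `applyContextDefaults` helper; the ids other than HP-2026-Q2 are made up.

```typescript
// Hypothetical shapes; field names are assumptions, not the tracker's schema.
interface ChatContext {
  activeProjectId?: string;
  activeDeliverableId?: string;
}

interface TargetArgs {
  projectId?: string;
  deliverableId?: string;
}

interface ResolvedTarget {
  projectId: string | undefined;
  deliverableId: string | undefined;
}

// Explicit arguments win; otherwise fall back to what is open in the UI.
function applyContextDefaults(args: TargetArgs, ctx: ChatContext): ResolvedTarget {
  return {
    projectId: args.projectId ?? ctx.activeProjectId,
    deliverableId: args.deliverableId ?? ctx.activeDeliverableId,
  };
}

const uiContext: ChatContext = {
  activeProjectId: "HP-2026-Q2",
  activeDeliverableId: "deliv-7",
};

const fromContext = applyContextDefaults({}, uiContext);
const explicitWins = applyContextDefaults({ projectId: "HP-2026-Q3" }, uiContext);

console.log(fromContext.projectId, explicitWins.projectId); // HP-2026-Q2 HP-2026-Q3
```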

Phase 3: Wire the AI Backend (Claude API — Default)

Install Anthropic SDK: npm install @anthropic-ai/sdk
Create /api/chat route in the Next.js app
Configure Claude API client with API key (stored in environment variables)
Define tools from existing Zod validators and service functions:
- create_project, list_projects, update_project
- create_deliverable, list_deliverables
- assign_artist, remove_assignment
- advance_stage, get_blocked_stages
- get_workload, get_available_artists
- create_revision, list_overdue
- export_excel, import_excel
Implement tool execution handlers that call the existing service layer
Add confirmation flow: for any mutation, show the user what will happen before executing
After execution, invalidate the relevant TanStack Query caches so the UI updates in real time
Test against common producer requests:
- "Create a new project for Pavilion 16, high priority, Q3"
- "Assign Maria to Model Prep on Spectre x360"
- "What's overdue this week?"
- "Mark all Catalog Images for HP-2026-Q2 as delivered"
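
The confirmation flow can be sketched as a tool registry in which each tool declares whether it mutates data. Read-only tools run immediately; mutations return a preview until the producer confirms. This is a minimal sketch under stated assumptions: `ToolSpec`, `toolRegistry`, `handleToolCall`, and the stubbed handlers are hypothetical, and real handlers would call the existing service layer rather than return canned strings.

```typescript
// Hypothetical tool registry; real handlers would call the service layer.
interface ToolSpec {
  mutates: boolean; // true if the tool changes data
  run: (args: Record<string, unknown>) => string;
}

const toolRegistry = new Map<string, ToolSpec>([
  ["list_overdue", { mutates: false, run: () => "2 stages overdue" }],
  [
    "assign_artist",
    {
      mutates: true,
      run: (args: Record<string, unknown>) =>
        `Assigned ${String(args["artist"])} to ${String(args["stage"])}`,
    },
  ],
]);

type ToolOutcome =
  | { kind: "executed"; result: string }
  | { kind: "needs_confirmation"; preview: string }
  | { kind: "unknown_tool"; name: string };

// Mutating tools only run once the producer has confirmed the preview.
function handleToolCall(
  name: string,
  args: Record<string, unknown>,
  confirmed: boolean
): ToolOutcome {
  const tool = toolRegistry.get(name);
  if (!tool) return { kind: "unknown_tool", name };
  if (tool.mutates && !confirmed) {
    return { kind: "needs_confirmation", preview: `${name}(${JSON.stringify(args)})` };
  }
  return { kind: "executed", result: tool.run(args) };
}

const assignArgs = { artist: "Maria", stage: "Model Prep" };
const pendingAssign = handleToolCall("assign_artist", assignArgs, false);
const confirmedAssign = handleToolCall("assign_artist", assignArgs, true);

console.log(pendingAssign.kind, confirmedAssign.kind); // needs_confirmation executed
```

Keeping the `mutates` flag on the tool definition, rather than in the prompt, means the confirmation step does not depend on the model behaving correctly.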

Phase 4: Ollama Fallback Layer

Add a chat-capable Ollama model to docker-compose.yml (e.g., llama3:70b or qwen2.5:72b); confirm that the chosen model supports tool calling, since the fallback relies on it

Create a provider abstraction in the chat API route:

try Claude API → on connection failure → fall back to Ollama

Map the same tool definitions to Ollama's function calling format
Add a health check endpoint that monitors Claude API availability
Log all fallback events so we can track how often Ollama is needed
Ensure producers see a subtle indicator when running in fallback mode (e.g., "Running locally — some complex requests may need to be simplified")
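
The provider abstraction above might look like the following sketch. The `ChatProvider` interface, the stub providers, and the message-based `isConnectionFailure` heuristic are assumptions made for illustration; real providers would wrap the Anthropic SDK and the Ollama HTTP API, and a real connection check should inspect the error types those clients actually throw.

```typescript
// Hypothetical provider interface; real ones would wrap the SDK and Ollama.
interface ChatProvider {
  providerName: string;
  complete: (prompt: string) => Promise<string>;
}

interface ChatReply {
  text: string;
  provider: string;
  usedFallback: boolean; // lets the UI show the fallback indicator
}

// Every fallback is recorded so we can track how often Ollama is needed.
const fallbackLog: string[] = [];

// Assumed heuristic: treat common network error messages as connection failures.
function isConnectionFailure(err: unknown): boolean {
  return err instanceof Error && /ECONNREFUSED|ETIMEDOUT|ENOTFOUND|fetch failed/.test(err.message);
}

async function completeWithFallback(
  primary: ChatProvider,
  fallback: ChatProvider,
  prompt: string
): Promise<ChatReply> {
  try {
    const text = await primary.complete(prompt);
    return { text, provider: primary.providerName, usedFallback: false };
  } catch (err) {
    // Only connectivity problems trigger the fallback; other errors surface.
    if (!isConnectionFailure(err)) throw err;
    fallbackLog.push(`${new Date().toISOString()} ${primary.providerName} unreachable`);
    const text = await fallback.complete(prompt);
    return { text, provider: fallback.providerName, usedFallback: true };
  }
}

// Stubs standing in for the real clients.
const stubClaude: ChatProvider = {
  providerName: "claude",
  complete: async () => {
    throw new Error("connect ECONNREFUSED");
  },
};
const stubOllama: ChatProvider = {
  providerName: "ollama",
  complete: async (prompt: string) => `local reply to: ${prompt}`,
};

completeWithFallback(stubClaude, stubOllama, "What's overdue this week?").then((reply) =>
  console.log(reply.provider, reply.usedFallback) // ollama true
);
```

Rethrowing non-connection errors keeps a malformed request from silently being retried on the weaker local model.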

Phase 5: Automation Agents (Scheduled)

Create scheduled agent scripts (cron or Next.js API routes triggered by cron):
- Deadline monitor: Runs daily, flags overdue stages, sends notifications
- Auto-assignment: When a stage unblocks, suggests or auto-assigns based on skills + capacity
- Stage auto-advance: When all prerequisites are approved, automatically transition downstream stages
- Status digest: Weekly summary per project emailed or posted to Slack
Each agent uses the CLI or service layer directly
Add producer override controls — ability to pause/resume automation per project
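
As an illustration of the deadline monitor, the sketch below flags unfinished stages that are past due and chooses between nudging the artist and escalating to a producer. The `TrackedStage` shape, the state names, the artist name Jon, and the `escalateAfterDays` threshold are hypothetical; the real agent would read stages through the CLI or service layer.

```typescript
// Hypothetical stage shape; field and state names are assumptions.
type StageState = "not_started" | "in_progress" | "approved";

interface TrackedStage {
  id: string;
  assignee: string;
  dueDate: string; // ISO date such as "2026-03-10", parsed as UTC midnight
  state: StageState;
}

interface OverdueFlag {
  id: string;
  assignee: string;
  daysLate: number;
  action: "nudge_artist" | "escalate_to_producer";
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Flag unfinished stages past their due date; escalate only the long-overdue ones.
function flagOverdueStages(
  stages: TrackedStage[],
  today: Date,
  escalateAfterDays: number
): OverdueFlag[] {
  const flags: OverdueFlag[] = [];
  for (const stage of stages) {
    if (stage.state === "approved") continue; // finished work is never overdue
    const daysLate = Math.floor((today.getTime() - Date.parse(stage.dueDate)) / MS_PER_DAY);
    if (daysLate <= 0) continue;
    flags.push({
      id: stage.id,
      assignee: stage.assignee,
      daysLate,
      action: daysLate >= escalateAfterDays ? "escalate_to_producer" : "nudge_artist",
    });
  }
  return flags;
}

const monitoredStages: TrackedStage[] = [
  { id: "model-prep", assignee: "Maria", dueDate: "2026-03-10", state: "in_progress" },
  { id: "catalog-images", assignee: "Jon", dueDate: "2026-03-13", state: "not_started" },
  { id: "render", assignee: "Jon", dueDate: "2026-03-20", state: "in_progress" },
  { id: "intake", assignee: "Maria", dueDate: "2026-03-01", state: "approved" },
];

const overdueFlags = flagOverdueStages(monitoredStages, new Date("2026-03-14T00:00:00Z"), 3);
console.log(overdueFlags.map((f) => `${f.id}:${f.action}`));
// [ 'model-prep:escalate_to_producer', 'catalog-images:nudge_artist' ]
```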

Phase 6: Full Pipeline Automation

Intake automation: Monitor email inbox or Workfront API for new requests → auto-create projects
Smart assignment: Factor in historical performance, current workload trends, and skill match scores
Predictive alerts: Flag projects likely to miss deadlines before they're actually late
Self-healing pipeline: If an artist hasn't started an assigned stage in X days, auto-reassign
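
Smart assignment could start as a simple weighted score over the signals listed above. The sketch below is one possible shape, not a tuned model: the `ArtistSignals` fields, the weights, the sample values, and the artist name Jon are all assumptions for illustration.

```typescript
// Hypothetical per-artist signals, each normalized to the range 0..1.
interface ArtistSignals {
  artistName: string;
  skillMatch: number; // how well skills fit the stage
  freeCapacity: number; // share of capacity currently unassigned
  onTimeRate: number; // historical share of stages delivered on time
}

// Assumed weights; these would need tuning against real assignment history.
const scoreWeights = { skillMatch: 0.5, freeCapacity: 0.3, onTimeRate: 0.2 };

function assignmentScore(a: ArtistSignals): number {
  return (
    scoreWeights.skillMatch * a.skillMatch +
    scoreWeights.freeCapacity * a.freeCapacity +
    scoreWeights.onTimeRate * a.onTimeRate
  );
}

// Return a new array sorted best-first; the input is left untouched.
function rankArtists(artists: ArtistSignals[]): ArtistSignals[] {
  return artists.slice().sort((x, y) => assignmentScore(y) - assignmentScore(x));
}

const candidateArtists: ArtistSignals[] = [
  { artistName: "Maria", skillMatch: 0.9, freeCapacity: 0.25, onTimeRate: 0.95 },
  { artistName: "Jon", skillMatch: 0.7, freeCapacity: 0.8, onTimeRate: 0.8 },
];

const rankedNames = rankArtists(candidateArtists).map((a) => a.artistName);
console.log(rankedNames); // [ 'Jon', 'Maria' ]
```

In this example Jon outranks Maria despite a weaker skill match because he has far more free capacity. The top-ranked artist would still be surfaced as a suggestion for producer approval, in line with the risk mitigations below.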

Risk Considerations

| Risk | Mitigation |
| --- | --- |
| Claude API outage | Automatic fallback to local Ollama model; producers are never blocked |
| Claude API costs spike unexpectedly | Monitor usage via Anthropic dashboard; set billing alerts; ~$20-100/mo expected for 10-30 users |
| Ollama fallback misinterprets a command | Confirmation step before all mutations; undo capability; subtle UI indicator when in fallback mode |
| Producers don't trust the AI | Start with read-only queries (status checks, reports), add mutations gradually |
| Wrong artist assigned automatically | Always surface assignments as suggestions first; let producers approve for first 2-4 weeks |
| Over-automation removes producer oversight | Keep producers in the loop via notifications; require approval for high-impact actions (project creation, bulk operations) |

Success Metrics

Reduction in time producers spend on manual tracker operations (target: 70%+)
Accuracy of AI-driven assignments vs producer overrides (target: 85%+ acceptance rate)
Producer adoption of chat interface (target: daily use within 4 weeks)
Reduction in overdue stages (target: 30%+ improvement from proactive monitoring)
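
The assignment acceptance metric is the share of AI suggestions that producers accept without overriding. A minimal sketch of that calculation follows; the function name and the sample counts are hypothetical.

```typescript
// Share of AI-suggested assignments accepted as-is, as a whole percentage.
function acceptanceRatePercent(accepted: number, overridden: number): number {
  const total = accepted + overridden;
  if (total === 0) return 0; // no suggestions made yet
  return Math.round((accepted / total) * 100);
}

const TARGET_ACCEPTANCE_PERCENT = 85; // target stated in this document

// Example week (made-up counts): 36 suggestions accepted, 4 overridden.
const weeklyAcceptance = acceptanceRatePercent(36, 4);
console.log(weeklyAcceptance, weeklyAcceptance >= TARGET_ACCEPTANCE_PERCENT); // 90 true
```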

References

CLI-Anything: https://github.com/HKUDS/CLI-Anything
Claude API Tool Use: https://docs.anthropic.com/en/docs/build-with-claude/tool-use
Project codebase: ~/Documents/VScode/hp_prod_tracker
Existing implementation plan: ~/Documents/VScode/hp_prod_tracker/IMPLEMENTATION_PLAN.md
Upgrade roadmap: ~/Documents/VScode/hp_prod_tracker/UPGRADE_PLAN.md
