Add files via upload

CLI-Anything plan for producer automation. Step-by-step guide and executive brief.
Leivur R. Djurhuus 2026-03-12 02:43:23 -05:00 committed by GitHub
parent 172533b6d7
commit a4b5bbf5c9


@ -0,0 +1,207 @@
# CLI-Anything Integration: HP CG Production Tracker
## Executive Summary
CLI-Anything is an open-source framework (MIT licensed) that generates a command-line interface (CLI) from an existing codebase, in our case the tracker web application. This CLI layer lets AI agents operate the tracker through structured commands, which opens the way to automating the production pipeline from job intake to delivery.
The goal: shift producers from **operators** (doing everything manually) to **supervisors** (approving, overriding, handling exceptions).
---
## Key Findings
### What CLI-Anything Does
- Auto-generates a production-ready CLI from an existing codebase
- 7-phase pipeline; the main stages are: analyze source → design commands → implement → test → document → publish
- Outputs a pip-installable package with JSON output mode for AI consumption
- Includes auto-generated tests and documentation
- Validated across 11 major applications (1,508 passing tests)
- Repository: https://github.com/HKUDS/CLI-Anything
### Why It Fits Our Project
- Our tracker already has a clean service layer with 26 REST API endpoints
- Zod validators translate directly to CLI command schemas
- Dependency engine and business rules are already codified
- Skills, capacity, and workload data already exist in the system
- Ollama is already running locally for embeddings (pgvector + nomic-embed-text)
### Cost
- CLI-Anything: Free (MIT license)
- Claude API (default): ~$0.01-0.05 per interaction, highly reliable tool calling
- Local LLM (Ollama): Free, already in our Docker Compose stack — serves as offline fallback
- Primary cost is implementation time
---
## What Can Be Automated
| Producer Task | Automation Level | How |
|---|---|---|
| Job intake / project creation | Fully automatable | AI parses incoming requests (email, brief docs) and creates projects, deliverables, and pipeline stages |
| Artist assignment | Fully automatable | AI matches skills, capacity, and department data to assign the best available artist |
| Stage progression | Mostly automatable | Dependency engine auto-advances stages as prerequisites are approved; AI triggers downstream work |
| Deadline monitoring & escalation | Fully automatable | Scheduled agent flags overdue items, nudges artists, escalates only true blockers to producers |
| Status reporting | Fully automatable | Agent queries tracker and generates summaries on demand or on schedule |
| Revision cycles | Partially automatable | Agent logs revisions and reassigns artists; creative review still requires human eyes |
### Producer Role: Before vs After
| Today | With Automation |
|---|---|
| Create projects manually | Review auto-created projects, approve |
| Assign artists by memory | Review AI-suggested assignments, override if needed |
| Monitor every stage daily | Get alerted only on exceptions and blockers |
| Chase artists for updates | Agent handles nudges and follow-ups |
| Compile status reports | Reports generated automatically |
| Advance stages manually | Pipeline advances itself |
---
## Architecture Decision: Chat Interface
### Default: Claude API + Tool Use
- Claude interprets natural language and calls tools mapped to our services
- Best-in-class accuracy for tool calling, multi-step operations, and ambiguous requests
- ~$0.01-0.05 per interaction, a negligible cost at 10-30 producers
- Estimated monthly cost: ~$20-100 depending on usage volume
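As a sanity check on these figures, the monthly range corresponds to roughly 2,000 interactions per month at the quoted per-interaction price. The brief does not state a usage volume, so the producer count and the interactions-per-producer figure below are assumptions chosen for illustration; `monthlyCost` and `CostRange` are hypothetical names.

```typescript
// Rough monthly cost model for the Claude-backed chat assistant.
// The per-interaction price range comes from the brief; the usage volume
// (interactions per producer per month) is an ASSUMPTION for illustration.
interface CostRange {
  low: number;
  high: number;
}

function monthlyCost(
  producers: number,
  interactionsPerProducer: number,
  perInteraction: CostRange,
): CostRange {
  const interactions = producers * interactionsPerProducer;
  return {
    low: interactions * perInteraction.low,
    high: interactions * perInteraction.high,
  };
}

const price: CostRange = { low: 0.01, high: 0.05 };

// Assumed: 20 producers x 100 interactions each = 2,000 interactions per month.
const estimate = monthlyCost(20, 100, price);
console.log(`$${estimate.low.toFixed(2)} to $${estimate.high.toFixed(2)} per month`);
```

Under these assumptions the estimate spans the same range as the brief. Heavier usage scales the bill linearly, which is one reason billing alerts are worth setting.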
### Fallback: Local LLM via Ollama
- Activates automatically if the Claude API is unreachable (outage, network issues)
- Llama 3 70B or Qwen 2.5 72B running locally via existing Docker Compose stack
- Handles most straightforward operations reliably
- Ensures producers are never blocked — the chat assistant stays online regardless
### Why Claude as Default
- Tool calling accuracy is significantly higher than local models — fewer misrouted commands, fewer confirmation retries
- Handles complex multi-step requests out of the box ("create 20 deliverables and assign them based on availability")
- The cost is trivial relative to the producer time saved
- Ollama remains valuable as a zero-downtime safety net, not a cost-saving measure
---
## Implementation Steps
### Phase 1: Generate the CLI
1. Install CLI-Anything as a Claude Code plugin:
```bash
/plugin marketplace add HKUDS/CLI-Anything
/plugin install cli-anything
```
2. Run CLI generation against the tracker codebase:
```bash
/cli-anything ~/Documents/VScode/hp_prod_tracker
```
3. Review generated commands — ensure they map correctly to existing services:
- Project CRUD (create, list, update, archive)
- Deliverable CRUD + bulk creation
- Stage advancement + status updates
- Artist assignment (with skill/capacity awareness)
- Revision logging
- Workload queries
- Excel import/export
4. Refine coverage for any missing operations:
```bash
/cli-anything:refine ~/Documents/VScode/hp_prod_tracker "pipeline dependencies and bulk operations"
```
5. Run the auto-generated tests and validate against the real database
6. Install the CLI locally:
```bash
cd hp_prod_tracker/agent-harness && pip install -e .
```
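Once the CLI is installed, an agent or API route could call it as a subprocess and parse its JSON output. The following is a minimal sketch, not the generated CLI's actual interface: the binary name, subcommand, and `--json` flag in the comment are placeholders until the generated commands have been reviewed, and `runJsonCommand` is a hypothetical helper. A Node one-liner stands in for the CLI so the sketch runs on its own.

```typescript
import { execFileSync } from "node:child_process";

// Run a command and parse its stdout as JSON.
// Nothing here is specific to the generated tracker CLI.
function runJsonCommand<T>(bin: string, args: string[]): T {
  const stdout = execFileSync(bin, args, { encoding: "utf8", timeout: 30_000 });
  try {
    return JSON.parse(stdout) as T;
  } catch {
    throw new Error(`Expected JSON from ${bin}, got: ${stdout.slice(0, 200)}`);
  }
}

// HYPOTHETICAL invocation; binary name, subcommand, and flag are placeholders:
//   runJsonCommand<unknown[]>("hp-tracker", ["--json", "project", "list"]);

// Stand-in so the sketch runs without the generated CLI installed:
// the current Node binary prints a small JSON document.
const demo = runJsonCommand<{ ok: boolean }>(process.execPath, [
  "-e",
  "console.log(JSON.stringify({ ok: true }))",
]);
console.log(demo.ok);
```

Using `execFileSync` with an argument array (rather than a shell string) avoids shell interpolation of user-supplied values such as project names.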
### Phase 2: Build the Chat UI Component
1. Create a slide-out chat panel using shadcn/ui (`Sheet` + `ScrollArea` + `Input`)
2. Add a chat icon/button to the app sidebar or top bar
3. Store chat history per user (new Prisma model or simple local state for V1)
4. Pass current context (active project, deliverable) into the chat so producers don't have to specify everything
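One way to pass context, sketched under assumptions: the chat panel sends the IDs of whatever the producer is currently viewing, and the backend turns them into a short preamble for the model. The `ChatContext` fields and the `contextPreamble` helper are hypothetical and would need to match the tracker's real models.

```typescript
// Context the chat panel could send with each message so the assistant
// can resolve phrases like "this project". Field names are HYPOTHETICAL.
interface ChatContext {
  userName: string;
  activeProjectId?: string;
  activeDeliverableId?: string;
}

// Build a plain-text preamble; fields that are not set are omitted.
function contextPreamble(ctx: ChatContext): string {
  const lines = [`User: ${ctx.userName}`];
  if (ctx.activeProjectId) lines.push(`Active project: ${ctx.activeProjectId}`);
  if (ctx.activeDeliverableId) lines.push(`Active deliverable: ${ctx.activeDeliverableId}`);
  return lines.join("\n");
}

console.log(contextPreamble({ userName: "Maria", activeProjectId: "HP-2026-Q2" }));
```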
### Phase 3: Wire the AI Backend (Claude API — Default)
1. Install Anthropic SDK: `npm install @anthropic-ai/sdk`
2. Create `/api/chat` route in the Next.js app
3. Configure Claude API client with API key (stored in environment variables)
4. Define tools from existing Zod validators and service functions:
- `create_project`, `list_projects`, `update_project`
- `create_deliverable`, `list_deliverables`
- `assign_artist`, `remove_assignment`
- `advance_stage`, `get_blocked_stages`
- `get_workload`, `get_available_artists`
- `create_revision`, `list_overdue`
- `export_excel`, `import_excel`
5. Implement tool execution handlers that call the existing service layer
6. Add confirmation flow: for any mutation, show the user what will happen before executing
7. After execution, invalidate the relevant TanStack Query caches so the UI reflects the change immediately
8. Test against common producer requests:
- "Create a new project for Pavilion 16, high priority, Q3"
- "Assign Maria to Model Prep on Spectre x360"
- "What's overdue this week?"
- "Mark all Catalog Images for HP-2026-Q2 as delivered"
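Steps 4 to 6 can be sketched as a small tool registry with a confirmation gate. The `name`, `description`, and `input_schema` fields follow the shape of Claude tool definitions; everything else (the `mutates` flag, `run`, `executeTool`, and the in-memory array standing in for the service layer) is hypothetical and kept free of SDK calls so the sketch runs on its own.

```typescript
interface ToolDef {
  name: string;
  description: string;
  input_schema: { type: "object"; properties: Record<string, unknown>; required?: string[] };
  mutates: boolean; // local metadata, not part of the API's tool format
  run: (input: Record<string, unknown>) => unknown;
}

type ToolOutcome =
  | { status: "done"; result: unknown }
  | { status: "needs_confirmation"; summary: string }
  | { status: "error"; message: string };

// In-memory stand-in for the existing service layer (HYPOTHETICAL).
const assignments: Array<{ artist: string; stage: string }> = [];

const tools: ToolDef[] = [
  {
    name: "list_overdue",
    description: "List stages that are past their due date",
    input_schema: { type: "object", properties: {} },
    mutates: false,
    run: () => [],
  },
  {
    name: "assign_artist",
    description: "Assign an artist to a pipeline stage",
    input_schema: {
      type: "object",
      properties: { artist: { type: "string" }, stage: { type: "string" } },
      required: ["artist", "stage"],
    },
    mutates: true,
    run: (input) => {
      assignments.push({ artist: String(input["artist"]), stage: String(input["stage"]) });
      return { assigned: true };
    },
  },
];

// Read-only tools run immediately; mutations run only once confirmed.
function executeTool(name: string, input: Record<string, unknown>, confirmed = false): ToolOutcome {
  const tool = tools.find((t) => t.name === name);
  if (!tool) return { status: "error", message: `Unknown tool: ${name}` };
  const missing = (tool.input_schema.required ?? []).filter((key) => !(key in input));
  if (missing.length > 0) return { status: "error", message: `Missing fields: ${missing.join(", ")}` };
  if (tool.mutates && !confirmed) {
    return { status: "needs_confirmation", summary: `${name} ${JSON.stringify(input)}` };
  }
  return { status: "done", result: tool.run(input) };
}

console.log(executeTool("assign_artist", { artist: "Maria", stage: "Model Prep" }).status);
```

In the real route, the `summary` would be rendered in the chat panel, and the second call with `confirmed = true` would happen only after the producer approves it.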
### Phase 4: Ollama Fallback Layer
1. Add a chat-capable Ollama model to `docker-compose.yml` (e.g., `llama3:70b` or `qwen2.5:72b`)
2. Create a provider abstraction in the chat API route:
```
try Claude API → on connection failure → fall back to Ollama
```
3. Map the same tool definitions to Ollama's function calling format
4. Add a health check endpoint that monitors Claude API availability
5. Log all fallback events so we can track how often Ollama is needed
6. Ensure producers see a subtle indicator when running in fallback mode (e.g., "Running locally — some complex requests may need to be simplified")
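The provider abstraction from step 2 might look like the sketch below. Both providers are stubs rather than real Claude or Ollama clients, and `chatWithFallback`, `Provider`, and `ChatReply` are hypothetical names. For brevity this version falls back on any error from the primary provider; a production version would likely fall back only on connection or availability failures and surface other errors directly.

```typescript
interface ChatReply {
  provider: "claude" | "ollama";
  text: string;
}

type Provider = (prompt: string) => Promise<string>;

// Try the primary provider; on failure, record the reason and use the fallback.
async function chatWithFallback(
  prompt: string,
  primary: Provider,
  fallback: Provider,
  onFallback: (reason: string) => void = () => {},
): Promise<ChatReply> {
  try {
    return { provider: "claude", text: await primary(prompt) };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    onFallback(reason); // step 5: log every fallback event
    return { provider: "ollama", text: await fallback(prompt) };
  }
}

// Stub providers (ASSUMPTIONS): one simulates an outage, one answers locally.
const failingClaude: Provider = async () => {
  throw new Error("connection refused");
};
const localOllama: Provider = async (prompt) => `local answer to: ${prompt}`;

const fallbackLog: string[] = [];
const demoReply = chatWithFallback(
  "What's overdue this week?",
  failingClaude,
  localOllama,
  (reason) => fallbackLog.push(reason),
);
demoReply.then((reply) => console.log(reply.provider, fallbackLog));
```

Returning the provider name with each reply gives the UI what it needs to show the fallback indicator from step 6.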
### Phase 5: Automation Agents (Scheduled)
1. Create scheduled agent scripts (cron or Next.js API routes triggered by cron):
- **Deadline monitor**: Runs daily, flags overdue stages, sends notifications
- **Auto-assignment**: When a stage unblocks, suggests or auto-assigns based on skills + capacity
- **Stage auto-advance**: When all prerequisites are approved, automatically transition downstream stages
- **Status digest**: Weekly summary per project emailed or posted to Slack
2. Each agent uses the CLI or service layer directly
3. Add producer override controls — ability to pause/resume automation per project
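The core checks behind the deadline monitor and stage auto-advance agents are simple filters over stage data. The sketch below assumes a simplified `Stage` shape with hypothetical field names and status values; the tracker's real dependency engine and Prisma models would replace these.

```typescript
type StageStatus = "blocked" | "ready" | "in_progress" | "approved";

// Simplified stage record; field names are ASSUMPTIONS for illustration.
interface Stage {
  id: string;
  status: StageStatus;
  dueDate: string; // ISO date, e.g. "2026-03-10"
  dependsOn: string[]; // ids of prerequisite stages
}

// Blocked stages whose prerequisites are all approved (auto-advance candidates).
function stagesToUnblock(stages: Stage[]): Stage[] {
  const byId = new Map(stages.map((s) => [s.id, s] as const));
  return stages.filter(
    (s) =>
      s.status === "blocked" &&
      s.dependsOn.every((id) => byId.get(id)?.status === "approved"),
  );
}

// Stages past their due date that are not yet approved (deadline monitor).
function overdueStages(stages: Stage[], today: Date): Stage[] {
  return stages.filter(
    (s) => s.status !== "approved" && new Date(s.dueDate).getTime() < today.getTime(),
  );
}

const pipeline: Stage[] = [
  { id: "model", status: "approved", dueDate: "2026-03-01", dependsOn: [] },
  { id: "texture", status: "blocked", dueDate: "2026-03-05", dependsOn: ["model"] },
  { id: "render", status: "blocked", dueDate: "2026-03-20", dependsOn: ["texture"] },
];

const today = new Date("2026-03-12");
console.log(stagesToUnblock(pipeline).map((s) => s.id));
console.log(overdueStages(pipeline, today).map((s) => s.id));
```

A per-project pause flag (item 3) would be checked before acting on either result.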
### Phase 6: Full Pipeline Automation
1. **Intake automation**: Monitor email inbox or Workfront API for new requests → auto-create projects
2. **Smart assignment**: Factor in historical performance, current workload trends, and skill match scores
3. **Predictive alerts**: Flag projects likely to miss deadlines before they're actually late
4. **Self-healing pipeline**: If an artist hasn't started an assigned stage in X days, auto-reassign
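A minimal sketch of the self-healing rule, assuming hypothetical `Assignment` and `Artist` shapes: find assignments that have not been started within a configurable number of days, then pick a replacement who has the required skill and the most spare capacity. The idle threshold is a parameter because the plan leaves it unspecified, and the ranking rule is an illustrative choice, not the tracker's actual assignment logic.

```typescript
// Record shapes and field names are ASSUMPTIONS for illustration.
interface Assignment {
  stageId: string;
  artistId: string;
  requiredSkill: string;
  assignedAt: string; // ISO date
  startedAt: string | null; // null until the artist starts work
}

interface Artist {
  id: string;
  skills: string[];
  openStages: number;
  capacity: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Assignments not started within maxIdleDays of being assigned.
function staleAssignments(list: Assignment[], now: Date, maxIdleDays: number): Assignment[] {
  return list.filter(
    (a) =>
      a.startedAt === null &&
      now.getTime() - new Date(a.assignedAt).getTime() > maxIdleDays * DAY_MS,
  );
}

// Another artist with the required skill and the most spare capacity, if any.
function pickReplacement(a: Assignment, artists: Artist[]): Artist | undefined {
  return artists
    .filter((x) => x.id !== a.artistId && x.skills.includes(a.requiredSkill) && x.openStages < x.capacity)
    .sort((p, q) => q.capacity - q.openStages - (p.capacity - p.openStages))[0];
}

const work: Assignment[] = [
  { stageId: "s1", artistId: "a1", requiredSkill: "modeling", assignedAt: "2026-03-01", startedAt: null },
  { stageId: "s2", artistId: "a2", requiredSkill: "lighting", assignedAt: "2026-03-01", startedAt: "2026-03-02" },
];
const roster: Artist[] = [
  { id: "a1", skills: ["modeling"], openStages: 4, capacity: 4 },
  { id: "a2", skills: ["modeling", "lighting"], openStages: 3, capacity: 4 },
  { id: "a3", skills: ["modeling"], openStages: 1, capacity: 4 },
];

const stale = staleAssignments(work, new Date("2026-03-12"), 5);
console.log(stale.map((a) => [a.stageId, pickReplacement(a, roster)?.id]));
```

In line with the risk table, the result would be surfaced to a producer as a suggestion before any automatic reassignment is enabled.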
---
## Risk Considerations
| Risk | Mitigation |
|---|---|
| Claude API outage | Automatic fallback to local Ollama model; producers are never blocked |
| Claude API costs spike unexpectedly | Monitor usage via Anthropic dashboard; set billing alerts; ~$20-100/mo expected for 10-30 users |
| Ollama fallback misinterprets a command | Confirmation step before all mutations; undo capability; subtle UI indicator when in fallback mode |
| Producers don't trust the AI | Start with read-only queries (status checks, reports), add mutations gradually |
| Wrong artist assigned automatically | Always surface assignments as suggestions first; let producers approve for first 2-4 weeks |
| Over-automation removes producer oversight | Keep producers in the loop via notifications; require approval for high-impact actions (project creation, bulk operations) |
---
## Success Metrics
- Reduction in time producers spend on manual tracker operations (target: 70%+)
- Acceptance rate of AI-suggested assignments, i.e. the share accepted without a producer override (target: 85%+)
- Producer adoption of chat interface (target: daily use within 4 weeks)
- Reduction in overdue stages (target: 30%+ improvement from proactive monitoring)
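The assignment metric reduces to a simple ratio once each AI suggestion is logged with the producer's decision. A minimal sketch, assuming a hypothetical `Decision` log; the tracker does not necessarily record decisions in this form today.

```typescript
// Each AI-suggested assignment is logged with the producer's decision.
// The log format is an ASSUMPTION for illustration.
type Decision = "accepted" | "overridden";

// Share of suggestions accepted without override; null when there is no data.
function acceptanceRate(decisions: Decision[]): number | null {
  if (decisions.length === 0) return null;
  const accepted = decisions.filter((d) => d === "accepted").length;
  return accepted / decisions.length;
}

// Example: 17 accepted and 3 overridden suggestions.
const log: Decision[] = [
  ...Array<Decision>(17).fill("accepted"),
  ...Array<Decision>(3).fill("overridden"),
];
const rate = acceptanceRate(log);
console.log(rate !== null && rate >= 0.85 ? "target met" : "below target");
```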
---
## References
- CLI-Anything: https://github.com/HKUDS/CLI-Anything
- Claude API Tool Use: https://docs.anthropic.com/en/docs/build-with-claude/tool-use
- Project codebase: ~/Documents/VScode/hp_prod_tracker
- Existing implementation plan: ~/Documents/VScode/hp_prod_tracker/IMPLEMENTATION_PLAN.md
- Upgrade roadmap: ~/Documents/VScode/hp_prod_tracker/UPGRADE_PLAN.md