Add files via upload
CLI-Anything plan for producer automation. Step-by-step guide and executive brief.
This commit is contained in:
parent
172533b6d7
commit
a4b5bbf5c9
1 changed files with 207 additions and 0 deletions
207
CLI_ANYTHING_IMPLEMENTATION_PLAN.md
Normal file
# CLI-Anything Integration: HP CG Production Tracker

## Executive Summary

CLI-Anything is an open-source framework (MIT licensed) that generates a command-line interface (CLI) from our web application's existing codebase. This CLI layer lets AI agents operate the tracker through structured commands, which opens the way to automating the production pipeline from job intake to delivery.

The goal: shift producers from **operators** (doing everything manually) to **supervisors** (approving, overriding, handling exceptions).

---

## Key Findings

### What CLI-Anything Does

- Auto-generates a production-ready CLI from an existing codebase
- 7-phase pipeline covering source analysis, command design, implementation, testing, documentation, and publishing
- Outputs a pip-installable package with a JSON output mode for AI consumption
- Includes auto-generated tests and documentation
- Validated across 11 major applications (1,508 passing tests)
- Repository: https://github.com/HKUDS/CLI-Anything

### Why It Fits Our Project

- Our tracker already has a clean service layer with 26 REST API endpoints
- Zod validators translate directly to CLI command schemas
- The dependency engine and business rules are already codified
- Skills, capacity, and workload data already exist in the system
- Ollama is already running locally for embeddings (pgvector + nomic-embed-text)

### Cost

- CLI-Anything: Free (MIT license)
- Claude API (default): ~$0.01–0.05 per interaction, highly reliable tool calling
- Local LLM (Ollama): Free, already in our Docker Compose stack; serves as an offline fallback
- Primary cost is implementation time

---

## What Can Be Automated

| Producer Task | Automation Level | How |
|---|---|---|
| Job intake / project creation | Fully automatable | AI parses incoming requests (email, brief docs) and creates projects, deliverables, and pipeline stages |
| Artist assignment | Fully automatable | AI matches skills, capacity, and department data to assign the best available artist |
| Stage progression | Mostly automatable | Dependency engine auto-advances stages as prerequisites are approved; AI triggers downstream work |
| Deadline monitoring & escalation | Fully automatable | Scheduled agent flags overdue items, nudges artists, escalates only true blockers to producers |
| Status reporting | Fully automatable | Agent queries tracker and generates summaries on demand or on schedule |
| Revision cycles | Partially automatable | Agent logs revisions and reassigns artists; creative review still requires human eyes |

### Producer Role: Before vs After

| Today | With Automation |
|---|---|
| Create projects manually | Review auto-created projects, approve |
| Assign artists by memory | Review AI-suggested assignments, override if needed |
| Monitor every stage daily | Get alerted only on exceptions and blockers |
| Chase artists for updates | Agent handles nudges and follow-ups |
| Compile status reports | Reports generated automatically |
| Advance stages manually | Pipeline advances itself |

---

## Architecture Decision: Chat Interface

### Default: Claude API + Tool Use

- Claude interprets natural language and calls tools mapped to our services
- Best-in-class accuracy for tool calling, multi-step operations, and ambiguous requests
- ~$0.01–0.05 per interaction, a negligible cost at 10–30 producers
- Estimated monthly cost: ~$20–100 depending on usage volume
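
As a rough check on the monthly figure, the estimate can be reproduced from a handful of inputs. The sketch below is illustrative only: `estimateMonthlyCostUsd` is a hypothetical helper, and the interactions-per-day and working-days values are assumptions, not measured usage. Only the per-interaction price range comes from the estimate above.

```typescript
// Hypothetical helper: estimates monthly Claude API spend for the chat assistant.
// Every input except the per-interaction price range is an assumption.
interface UsageAssumptions {
  producers: number;               // number of active producers
  interactionsPerDay: number;      // assumed chat interactions per producer per day
  workingDaysPerMonth: number;     // assumed working days in a month
  costCentsPerInteraction: number; // 1 to 5 cents, per the estimate above
}

function estimateMonthlyCostUsd(u: UsageAssumptions): number {
  // Work in whole cents to avoid floating-point drift, then convert to dollars.
  const totalCents =
    u.producers * u.interactionsPerDay * u.workingDaysPerMonth * u.costCentsPerInteraction;
  return totalCents / 100;
}

// 10 producers, 10 interactions a day, 20 working days, 1 cent each
const lowUsage = estimateMonthlyCostUsd({
  producers: 10,
  interactionsPerDay: 10,
  workingDaysPerMonth: 20,
  costCentsPerInteraction: 1,
});

// 20 producers, 10 interactions a day, 20 working days, 2 cents each
const midUsage = estimateMonthlyCostUsd({
  producers: 20,
  interactionsPerDay: 10,
  workingDaysPerMonth: 20,
  costCentsPerInteraction: 2,
});

console.log(lowUsage, midUsage); // 20 80
```

The result scales linearly with each input, so heavier usage at the top of the price range would land above the quoted band. Real usage should be measured before this is treated as a budget.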

### Fallback: Local LLM via Ollama

- Activates automatically if the Claude API is unreachable (outage, network issues)
- Llama 3 70B or Qwen 2.5 72B running locally via the existing Docker Compose stack
- Handles most straightforward operations reliably
- Keeps the chat assistant online during an API outage, so producers are not blocked

### Why Claude as Default

- Tool-calling accuracy is significantly higher than with local models: fewer misrouted commands, fewer confirmation retries
- Handles complex multi-step requests out of the box ("create 20 deliverables and assign them based on availability")
- The cost is trivial relative to the producer time saved
- Ollama remains valuable as a zero-downtime safety net, not a cost-saving measure

---

## Implementation Steps

### Phase 1: Generate the CLI

1. Install CLI-Anything as a Claude Code plugin (these are slash commands entered inside a Claude Code session, not shell commands):
   ```bash
   /plugin marketplace add HKUDS/CLI-Anything
   /plugin install cli-anything
   ```
2. Run CLI generation against the tracker codebase:
   ```bash
   /cli-anything ~/Documents/VScode/hp_prod_tracker
   ```
3. Review generated commands and ensure they map correctly to existing services:
   - Project CRUD (create, list, update, archive)
   - Deliverable CRUD + bulk creation
   - Stage advancement + status updates
   - Artist assignment (with skill/capacity awareness)
   - Revision logging
   - Workload queries
   - Excel import/export
4. Refine coverage for any missing operations:
   ```bash
   /cli-anything:refine ~/Documents/VScode/hp_prod_tracker "pipeline dependencies and bulk operations"
   ```
5. Run the auto-generated tests and validate against the real database
6. Install the CLI locally:
   ```bash
   cd hp_prod_tracker/agent-harness && pip install -e .
   ```
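
Once the CLI is installed, an agent or a Next.js route would typically call it and parse its JSON output. The following is a minimal sketch of that wrapper, not the generated package's actual interface: the command name `hp-tracker-cli`, the `--json` flag, and the `project list` subcommand are all hypothetical placeholders, so check the generated package for the real names. The process runner is injected so the sketch stays self-contained.

```typescript
// Sketch: wrap a generated CLI so its JSON output can be consumed by an agent.
// ASSUMPTIONS: "hp-tracker-cli", "--json", and "project list" are hypothetical.
type Exec = (
  command: string,
  args: string[],
) => Promise<{ stdout: string; exitCode: number }>;

interface CliResult<T> {
  ok: boolean;
  data?: T;
  error?: string;
}

async function runTrackerCli<T>(exec: Exec, args: string[]): Promise<CliResult<T>> {
  const { stdout, exitCode } = await exec("hp-tracker-cli", [...args, "--json"]);
  if (exitCode !== 0) {
    return { ok: false, error: `CLI exited with code ${exitCode}` };
  }
  try {
    return { ok: true, data: JSON.parse(stdout) as T };
  } catch {
    return { ok: false, error: "CLI output was not valid JSON" };
  }
}

// Stubbed runner for illustration; it echoes the arguments it received.
const fakeExec: Exec = async (_command, args) => ({
  stdout: JSON.stringify({ args, projects: [{ id: "p1", name: "Pavilion 16" }] }),
  exitCode: 0,
});

runTrackerCli<{ args: string[]; projects: { id: string; name: string }[] }>(
  fakeExec,
  ["project", "list"],
).then((result) => console.log(result.ok, result.data?.projects.length));
```

In a real deployment the injected `exec` would wrap a process-spawning call such as Node's `child_process.execFile`. Returning a result object instead of throwing keeps malformed output from crashing a scheduled agent.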

### Phase 2: Build the Chat UI Component

1. Create a slide-out chat panel using shadcn/ui (`Sheet` + `ScrollArea` + `Input`)
2. Add a chat icon/button to the app sidebar or top bar
3. Store chat history per user (new Prisma model or simple local state for V1)
4. Pass current context (active project, deliverable) into the chat so producers don't have to specify everything
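
To make the context-passing step concrete, here is one possible shape for the payload the chat panel sends to the backend. All type, field, and function names are hypothetical; the real identifiers would follow the tracker's existing models.

```typescript
// Sketch of the chat request payload. All names here are hypothetical.
interface ChatContext {
  activeProjectId?: string;
  activeDeliverableId?: string;
}

interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

interface ChatRequest {
  messages: ChatMessage[];
  context: ChatContext;
}

// Append the new user message to the history and attach the current UI context.
function buildChatRequest(
  history: ChatMessage[],
  userText: string,
  context: ChatContext,
): ChatRequest {
  return {
    messages: [...history, { role: "user", content: userText }],
    context,
  };
}

// Turn the context into a short hint the backend could add to the system prompt.
function describeContext(context: ChatContext): string {
  const parts: string[] = [];
  if (context.activeProjectId) parts.push(`active project: ${context.activeProjectId}`);
  if (context.activeDeliverableId) parts.push(`active deliverable: ${context.activeDeliverableId}`);
  return parts.length > 0 ? `Current context (${parts.join(", ")})` : "No active context";
}

const request = buildChatRequest([], "Assign Maria to Model Prep", {
  activeProjectId: "HP-2026-Q2",
});
console.log(describeContext(request.context));
```

With the context attached, a request such as "Assign Maria to Model Prep" can be resolved against the project the producer is currently viewing instead of requiring a follow-up question.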

### Phase 3: Wire the AI Backend (Claude API, Default)

1. Install Anthropic SDK: `npm install @anthropic-ai/sdk`
2. Create `/api/chat` route in the Next.js app
3. Configure Claude API client with API key (stored in environment variables)
4. Define tools from existing Zod validators and service functions:
   - `create_project`, `list_projects`, `update_project`
   - `create_deliverable`, `list_deliverables`
   - `assign_artist`, `remove_assignment`
   - `advance_stage`, `get_blocked_stages`
   - `get_workload`, `get_available_artists`
   - `create_revision`, `list_overdue`
   - `export_excel`, `import_excel`
5. Implement tool execution handlers that call the existing service layer
6. Add confirmation flow: for any mutation, show the user what will happen before executing
7. After execution, invalidate the relevant TanStack Query caches so the UI reflects the change immediately
8. Test against common producer requests:
   - "Create a new project for Pavilion 16, high priority, Q3"
   - "Assign Maria to Model Prep on Spectre x360"
   - "What's overdue this week?"
   - "Mark all Catalog Images for HP-2026-Q2 as delivered"
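
The tool definitions and the confirmation flow from the steps above could be sketched as follows. The definitions use the shape the Claude tool-use API expects (`name`, `description`, and a JSON Schema `input_schema`). The input field names and the handler bodies are assumptions for illustration: in the real route the schemas would be derived from the existing Zod validators and the handlers would call the service layer.

```typescript
// Sketch: tool definitions plus a dispatcher that asks for confirmation before
// any mutation. Field names and handler bodies are hypothetical stubs.
interface ToolDefinition {
  name: string;
  description: string;
  input_schema: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
}

interface ToolHandler {
  mutates: boolean; // true if the tool changes tracker data
  run: (input: Record<string, unknown>) => Promise<unknown>;
}

const tools: ToolDefinition[] = [
  {
    name: "list_overdue",
    description: "List stages that are past their due date.",
    input_schema: { type: "object", properties: {} },
  },
  {
    name: "assign_artist",
    description: "Assign an artist to a pipeline stage of a deliverable.",
    input_schema: {
      type: "object",
      properties: {
        artistName: { type: "string" },
        stageName: { type: "string" },
        deliverableName: { type: "string" },
      },
      required: ["artistName", "stageName", "deliverableName"],
    },
  },
];

// Stub handlers; real ones would delegate to the existing service functions.
const handlers: Record<string, ToolHandler> = {
  list_overdue: { mutates: false, run: async () => [] },
  assign_artist: { mutates: true, run: async (input) => ({ assigned: true, ...input }) },
};

type ToolOutcome =
  | { status: "done"; result: unknown }
  | { status: "needs_confirmation"; summary: string }
  | { status: "error"; message: string };

async function executeTool(
  name: string,
  input: Record<string, unknown>,
  confirmed: boolean,
): Promise<ToolOutcome> {
  const handler = handlers[name];
  if (!handler) return { status: "error", message: `Unknown tool: ${name}` };
  // Mutations are held back until the producer has confirmed them in the UI.
  if (handler.mutates && !confirmed) {
    return { status: "needs_confirmation", summary: `${name} ${JSON.stringify(input)}` };
  }
  return { status: "done", result: await handler.run(input) };
}
```

Read-only tools run immediately, while a mutation such as `assign_artist` first returns `needs_confirmation` and only executes when the call is repeated with `confirmed` set to `true`. Keeping the `mutates` flag next to each handler makes it harder to add a new write operation that skips the confirmation step.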

### Phase 4: Ollama Fallback Layer

1. Add a chat-capable Ollama model to `docker-compose.yml` (e.g., `llama3:70b` or `qwen2.5:72b`)
2. Create a provider abstraction in the chat API route:
   ```
   try Claude API → on connection failure → fall back to Ollama
   ```
3. Map the same tool definitions to Ollama's function calling format
4. Add a health check endpoint that monitors Claude API availability
5. Log all fallback events so we can track how often Ollama is needed
6. Ensure producers see a subtle indicator when running in fallback mode (e.g., "Running locally: some complex requests may need to be simplified")
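
The provider abstraction, fallback logging, and fallback indicator described above might look roughly like this. It is a sketch under stated assumptions: `ChatProvider`, `chatWithFallback`, and the `ConnectionError` naming convention are hypothetical, and the providers are stubs where the real code would call the Anthropic SDK and the Ollama HTTP API.

```typescript
// Sketch: try the primary provider, fall back on connection failure, log the event.
// All names are hypothetical; provider internals are stubbed.
interface ChatProvider {
  name: string;
  complete: (prompt: string) => Promise<string>;
}

interface ChatReply {
  text: string;
  provider: string;
  fallback: boolean; // lets the UI show the "running locally" indicator
}

// Connection failures are tagged by name so they can be told apart from other errors.
function connectionError(message: string): Error {
  const err = new Error(message);
  err.name = "ConnectionError";
  return err;
}

function isConnectionError(err: unknown): err is Error {
  return err instanceof Error && err.name === "ConnectionError";
}

const fallbackLog: { at: Date; reason: string }[] = [];

async function chatWithFallback(
  primary: ChatProvider,
  secondary: ChatProvider,
  prompt: string,
): Promise<ChatReply> {
  try {
    const text = await primary.complete(prompt);
    return { text, provider: primary.name, fallback: false };
  } catch (err) {
    // Only connection failures trigger the fallback; other errors
    // (for example a malformed request) are surfaced to the caller.
    if (!isConnectionError(err)) throw err;
    fallbackLog.push({ at: new Date(), reason: err.message });
    const text = await secondary.complete(prompt);
    return { text, provider: secondary.name, fallback: true };
  }
}

// Stub providers for illustration.
const unreachableClaude: ChatProvider = {
  name: "claude",
  complete: async () => {
    throw connectionError("network unreachable");
  },
};
const localOllama: ChatProvider = {
  name: "ollama",
  complete: async (prompt) => `local answer to: ${prompt}`,
};
```

A real implementation would map the SDK's own connection-error type onto this check. Restricting the fallback to connection failures avoids quietly rerouting requests that failed for reasons the local model would not fix.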

### Phase 5: Automation Agents (Scheduled)

1. Create scheduled agent scripts (cron or Next.js API routes triggered by cron):
   - **Deadline monitor**: Runs daily, flags overdue stages, sends notifications
   - **Auto-assignment**: When a stage unblocks, suggests or auto-assigns based on skills + capacity
   - **Stage auto-advance**: When all prerequisites are approved, automatically transition downstream stages
   - **Status digest**: Weekly summary per project emailed or posted to Slack
2. Each agent uses the CLI or service layer directly
3. Add producer override controls: ability to pause/resume automation per project
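
Two of the scheduled checks above, the deadline monitor and stage auto-advance, reduce to simple filters over stage data. The sketch below assumes a hypothetical `Stage` shape and status values; the real ones would come from the tracker's data model and dependency engine. It also shows how a per-project pause could be honored.

```typescript
// Sketch of two scheduled checks. The Stage shape and status values are hypothetical.
type StageStatus = "blocked" | "ready" | "in_progress" | "approved";

interface Stage {
  id: string;
  projectId: string;
  status: StageStatus;
  dueDate: Date;
  dependsOn: string[]; // ids of prerequisite stages
}

// Deadline monitor: stages past their due date that are not yet approved.
function findOverdueStages(stages: Stage[], now: Date): Stage[] {
  return stages.filter(
    (s) => s.status !== "approved" && s.dueDate.getTime() < now.getTime(),
  );
}

// Stage auto-advance: blocked stages whose prerequisites are all approved,
// skipping projects where a producer has paused automation.
function findStagesToUnblock(stages: Stage[], pausedProjectIds: Set<string>): Stage[] {
  const byId = new Map<string, Stage>();
  for (const s of stages) byId.set(s.id, s);
  return stages.filter(
    (s) =>
      s.status === "blocked" &&
      !pausedProjectIds.has(s.projectId) &&
      s.dependsOn.every((id) => byId.get(id)?.status === "approved"),
  );
}

// Example data: "render" depends on "model-prep", which is already approved.
const sampleStages: Stage[] = [
  { id: "model-prep", projectId: "p1", status: "approved", dueDate: new Date("2026-01-05"), dependsOn: [] },
  { id: "render", projectId: "p1", status: "blocked", dueDate: new Date("2026-01-10"), dependsOn: ["model-prep"] },
  { id: "retouch", projectId: "p1", status: "blocked", dueDate: new Date("2026-01-20"), dependsOn: ["render"] },
];

const overdue = findOverdueStages(sampleStages, new Date("2026-01-12"));
const toUnblock = findStagesToUnblock(sampleStages, new Set<string>());
console.log(overdue.map((s) => s.id), toUnblock.map((s) => s.id));
```

Keeping these checks as pure functions over stage data means the same logic can run from a cron script, an API route, or a unit test without touching the database.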

### Phase 6: Full Pipeline Automation

1. **Intake automation**: Monitor email inbox or Workfront API for new requests → auto-create projects
2. **Smart assignment**: Factor in historical performance, current workload trends, and skill match scores
3. **Predictive alerts**: Flag projects likely to miss deadlines before they're actually late
4. **Self-healing pipeline**: If an artist hasn't started an assigned stage in X days, auto-reassign
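
Smart assignment can be thought of as scoring each candidate and suggesting the highest scorer. The sketch below is one possible scoring function, not the tracker's actual logic: the `Artist` shape, the three factors, and the weights (0.5, 0.3, 0.2) are assumptions chosen for illustration and would need tuning against real assignment history.

```typescript
// Sketch of assignment scoring. The Artist shape and the weights are assumptions.
interface Artist {
  name: string;
  skills: string[];
  capacityHours: number; // weekly capacity
  assignedHours: number; // hours already committed this week
  onTimeRate: number;    // historical share of stages delivered on time, 0..1
}

function scoreArtist(artist: Artist, requiredSkills: string[], estimatedHours: number): number {
  const freeHours = artist.capacityHours - artist.assignedHours;
  // No capacity at all, or not enough room for this piece of work.
  if (artist.capacityHours <= 0 || freeHours < estimatedHours) return 0;

  const matched = requiredSkills.filter((s) => artist.skills.indexOf(s) !== -1).length;
  const skillMatch = requiredSkills.length === 0 ? 1 : matched / requiredSkills.length;
  if (skillMatch === 0) return 0; // none of the required skills

  const headroom = freeHours / artist.capacityHours;
  return 0.5 * skillMatch + 0.3 * headroom + 0.2 * artist.onTimeRate;
}

// Returns the best-scoring artist, or undefined if nobody is eligible.
function suggestArtist(
  artists: Artist[],
  requiredSkills: string[],
  estimatedHours: number,
): Artist | undefined {
  let best: Artist | undefined;
  let bestScore = 0;
  for (const artist of artists) {
    const score = scoreArtist(artist, requiredSkills, estimatedHours);
    if (score > bestScore) {
      best = artist;
      bestScore = score;
    }
  }
  return best;
}

const team: Artist[] = [
  { name: "Maria", skills: ["model prep", "lighting"], capacityHours: 40, assignedHours: 10, onTimeRate: 0.9 },
  { name: "Sam", skills: ["retouching"], capacityHours: 40, assignedHours: 5, onTimeRate: 0.95 },
  { name: "Lee", skills: ["model prep"], capacityHours: 40, assignedHours: 38, onTimeRate: 0.8 },
];

console.log(suggestArtist(team, ["model prep"], 8)?.name);
```

Returning `undefined` when no artist is eligible gives the agent a natural point to escalate to a producer instead of forcing a poor match.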

---

## Risk Considerations

| Risk | Mitigation |
|---|---|
| Claude API outage | Automatic fallback to local Ollama model; producers are never blocked |
| Claude API costs spike unexpectedly | Monitor usage via Anthropic dashboard; set billing alerts; ~$20–100/mo expected for 10–30 users |
| Ollama fallback misinterprets a command | Confirmation step before all mutations; undo capability; subtle UI indicator when in fallback mode |
| Producers don't trust the AI | Start with read-only queries (status checks, reports), add mutations gradually |
| Wrong artist assigned automatically | Always surface assignments as suggestions first; let producers approve for first 2–4 weeks |
| Over-automation removes producer oversight | Keep producers in the loop via notifications; require approval for high-impact actions (project creation, bulk operations) |

---

## Success Metrics

- Reduction in time producers spend on manual tracker operations (target: 70%+)
- Accuracy of AI-driven assignments vs producer overrides (target: 85%+ acceptance rate)
- Producer adoption of chat interface (target: daily use within 4 weeks)
- Reduction in overdue stages (target: 30%+ improvement from proactive monitoring)
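
The assignment acceptance rate can be computed directly from logged suggestions. This is a small sketch assuming a hypothetical `SuggestionLog` record that stores the suggested artist alongside the artist who was finally assigned.

```typescript
// Sketch: assignment acceptance rate. The SuggestionLog shape is hypothetical.
interface SuggestionLog {
  suggestedArtist: string;
  finalArtist: string; // artist actually assigned after producer review
}

const ACCEPTANCE_TARGET = 0.85; // target from the metrics above

// Share of suggestions the producer kept unchanged (0 when there is no data).
function acceptanceRate(logs: SuggestionLog[]): number {
  if (logs.length === 0) return 0;
  const accepted = logs.filter((l) => l.suggestedArtist === l.finalArtist).length;
  return accepted / logs.length;
}

function meetsAcceptanceTarget(logs: SuggestionLog[]): boolean {
  return acceptanceRate(logs) >= ACCEPTANCE_TARGET;
}

// Nine accepted suggestions and one override.
const logs: SuggestionLog[] = [];
for (let i = 0; i < 9; i++) logs.push({ suggestedArtist: "Maria", finalArtist: "Maria" });
logs.push({ suggestedArtist: "Sam", finalArtist: "Lee" });

console.log(acceptanceRate(logs), meetsAcceptanceTarget(logs)); // 0.9 true
```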

---

## References

- CLI-Anything: https://github.com/HKUDS/CLI-Anything
- Claude API Tool Use: https://docs.anthropic.com/en/docs/build-with-claude/tool-use
- Project codebase: ~/Documents/VScode/hp_prod_tracker
- Existing implementation plan: ~/Documents/VScode/hp_prod_tracker/IMPLEMENTATION_PLAN.md
- Upgrade roadmap: ~/Documents/VScode/hp_prod_tracker/UPGRADE_PLAN.md