modcomms

Author	SHA1	Message	Date
Vadym Samoilenko	1982d5d76e	feat(knowledge-base): smart resume for interrupted processing jobs On server restart, stale active jobs are automatically resumed rather than failed. Docs already parsed in a prior run are skipped (resume from cache), docs stuck at 'parsing' are reset to 'pending' and re-parsed. - Repository: add get_all_stale_active_jobs() and reset_stuck_parsing_docs() - Service: skip already-parsed docs in _parse_doc(), reset stuck docs on start - Main: recover stale jobs via asyncio.create_task() in lifespan startup Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 10:20:35 +01:00
Vadym Samoilenko	f520aba397	Fix KB distillation fallback and SpecVersion type annotation - knowledge_base_service.py: wrap Gemini distillation call in try/except to fall back to fallback_client/fallback_model if primary times out, matching the fallback behaviour in GeminiService._generate_content() - models.py: fix SpecVersion.source_document_ids ORM type annotation from Mapped[Optional[dict]] to Mapped[Optional[list]] — the field stores a JSON array of document ID strings, not an object Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-16 14:03:25 +00:00
Vadym Samoilenko	060fbeba76	Fix GeminiService client attribute error in knowledge base distillation Replace self.gemini.client with self.gemini.primary_client on line 295 of knowledge_base_service.py. GeminiService only exposes primary_client and fallback_client — there is no client attribute. This caused all processing jobs to fail at the distillation step, which is also why Version History was always blank (no SpecVersion records were ever created). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-16 13:59:59 +00:00
michael	60ac3ab22e	Rewrite distillation prompts to preserve all source document details The previous prompts instructed Gemini to "remove redundancy, marketing fluff, or content not relevant to..." which caused salient details — especially unusual, granular, or edge-case instructions — to be lost from spec output. Rewritten all 5 agent prompts (legal, brand_barclays, brand_barclaycard, channel_best_practices, channel_tech_specs) to: - Reframe the task as "restructure and organise" rather than "distil and filter" - Add a zero-tolerance detail-loss instruction with concrete examples of unconventional rules that must be preserved - Explicitly forbid omitting, summarising away, or paraphrasing specific rules/values/conditions - Allow merging only exact duplicates while keeping all unique content Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-16 08:21:03 -06:00
michael	1800e71229	Fix cache invalidation falling back to static files after reprocessing After processing a new knowledge base spec, invalidate_cache() was clearing the DB spec from the cache without replacing it. The next analysis would then fall back to static prompts/*.md files instead of using the newly generated DB spec. Now invalidate_cache() accepts optional new_spec_content to immediately populate the DB cache, and knowledge_base_service passes the freshly distilled spec content so it's available for the next analysis without a server restart. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 17:56:11 -06:00
michael	de62fa1f87	Show partial parse status in UI when some pages fail - LlamaParse service now returns a ParseResult dataclass with markdown, total page count, and a list of failed pages (page number + error) - Knowledge base service sets status to "partial" (instead of "parsed") when some pages failed, with a descriptive error listing which pages failed and why - Frontend StatusBadge shows "partial parse" in orange for partial status - Error details are shown inline below the document row for both partial and error statuses Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 17:51:52 -06:00
michael	8a9a24ebe6	Parallelize LlamaParse document processing with asyncio.gather Parse documents concurrently (up to 10 at a time via semaphore) instead of serially. Each coroutine uses its own DB session for per-document status updates, while a shared lock serializes job progress increments on the main session to avoid session-sharing issues. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 17:39:01 -06:00
michael	9e2473c3e9	Add Knowledge Base management system for AI agent specs Full-stack implementation enabling UI-driven management of the 5 AI agent knowledge bases (Legal, Brand Barclays, Brand Barclaycard, Channel Best Practices, Channel Tech Specs). Backend: - 4 new DB models: KnowledgeBase, SourceDocument, SpecVersion, ProcessingJob - Migration 006: creates tables, seeds 5 KB rows, imports existing prompts/*.md as v1 specs - KnowledgeBaseRepository with full CRUD for all 4 tables - LlamaParseService for document parsing, KnowledgeBaseService for pipeline orchestration - ReferenceDocsService updated with DB-backed spec loading + cache invalidation - 11 REST endpoints under /api/knowledge-base (list, detail, upload, delete, process, job status, versions, diff, activate) - StorageService extended with KB document storage Frontend: - TypeScript types for all KB entities (KnowledgeBaseListItem, SourceDocument, ProcessingJob, SpecVersion, DiffResult) - ApiService methods for all KB endpoints including multipart file upload - KnowledgeBase component with 3-level UI: agent grid, detail view (documents + versions tabs), diff viewer - Drag-and-drop file upload, processing progress bar with 3s polling, version comparison - KnowledgeBaseIcon + Sidebar nav item with adminOnly filtering Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-12 15:00:36 -06:00

8 commits
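
Illustrative note on commit 8a9a24ebe6: the message describes parsing documents concurrently with asyncio.gather, capped at 10 at a time by a semaphore, with a shared lock serializing job progress increments. The sketch below shows that general pattern only and is not the repository's code. `fake_parse`, `JobProgress`, `parse_all`, and `MAX_CONCURRENT_PARSES` are hypothetical names introduced for illustration, and the per-document DB sessions mentioned in the commit are omitted.

```python
import asyncio
from dataclasses import dataclass

# Assumption: mirrors the "up to 10 at a time" limit from the commit message.
MAX_CONCURRENT_PARSES = 10


@dataclass
class JobProgress:
    """Hypothetical stand-in for the job row updated on the main session."""
    processed: int = 0
    in_flight: int = 0
    peak_in_flight: int = 0


async def fake_parse(doc_id: str) -> str:
    """Hypothetical placeholder for the real parsing call."""
    await asyncio.sleep(0.01)
    return f"# parsed {doc_id}"


async def parse_all(doc_ids: list[str], progress: JobProgress) -> list[str]:
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_PARSES)
    progress_lock = asyncio.Lock()

    async def parse_one(doc_id: str) -> str:
        # The semaphore bounds how many parses run at the same time.
        async with semaphore:
            progress.in_flight += 1
            progress.peak_in_flight = max(progress.peak_in_flight, progress.in_flight)
            try:
                markdown = await fake_parse(doc_id)
            finally:
                progress.in_flight -= 1
        # The lock serializes updates to the shared job progress.
        async with progress_lock:
            progress.processed += 1
        return markdown

    # gather returns results in the same order as the input documents.
    return await asyncio.gather(*(parse_one(d) for d in doc_ids))
```

Calling `asyncio.run(parse_all(ids, JobProgress()))` returns one result per document in input order. In this sketch the lock guards a plain counter; in the commit's description it guards increments made through a shared DB session, where awaits inside the critical section make the serialization matter.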