On server restart, stale active jobs are automatically resumed rather
than failed. Docs already parsed in a prior run are skipped (resumed from
cache); docs stuck at 'parsing' are reset to 'pending' and re-parsed.
- Repository: add get_all_stale_active_jobs() and reset_stuck_parsing_docs()
- Service: skip already-parsed docs in _parse_doc(), reset stuck docs on start
- Main: recover stale jobs via asyncio.create_task() in lifespan startup
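
A minimal sketch of the recovery flow described above, under stated
assumptions: the repository is replaced by a hypothetical in-memory
stand-in, the Job/Doc dataclasses and resume_job()/recover_stale_jobs()
helpers are illustrative names, and the real parse call is a placeholder.
Only get_all_stale_active_jobs() and reset_stuck_parsing_docs() are names
taken from this change; their real signatures may differ.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Doc:
    doc_id: str
    status: str  # 'pending', 'parsing', or 'parsed'


@dataclass
class Job:
    job_id: str
    status: str  # 'active' or 'completed' (assumed status names)
    docs: list = field(default_factory=list)


class InMemoryRepository:
    """Hypothetical stand-in for the real repository layer."""

    def __init__(self, jobs):
        self.jobs = jobs

    def get_all_stale_active_jobs(self):
        # A job still 'active' at startup was interrupted by the restart.
        return [job for job in self.jobs if job.status == "active"]

    def reset_stuck_parsing_docs(self, job):
        # Docs left at 'parsing' never finished, so queue them again.
        stuck = [doc for doc in job.docs if doc.status == "parsing"]
        for doc in stuck:
            doc.status = "pending"
        return len(stuck)


async def resume_job(repo, job):
    """Resume one job: skip parsed docs, re-parse everything else."""
    repo.reset_stuck_parsing_docs(job)
    reparsed = []
    for doc in job.docs:
        if doc.status == "parsed":
            continue  # already parsed in a prior run
        await asyncio.sleep(0)  # placeholder for the real parse call
        doc.status = "parsed"
        reparsed.append(doc.doc_id)
    job.status = "completed"
    return reparsed


async def recover_stale_jobs(repo):
    """Start one task per stale job, then collect which docs were re-parsed."""
    tasks = {
        job.job_id: asyncio.create_task(resume_job(repo, job))
        for job in repo.get_all_stale_active_jobs()
    }
    return {job_id: await task for job_id, task in tasks.items()}
```

The sketch awaits the tasks so the result can be inspected; in a lifespan
startup hook the tasks would more plausibly be left running in the
background so that startup is not blocked.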
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- knowledge_base_service.py: wrap Gemini distillation call in try/except
to fall back to fallback_client/fallback_model if primary times out,
matching the fallback behaviour in GeminiService._generate_content()
- models.py: fix SpecVersion.source_document_ids ORM type annotation from
Mapped[Optional[dict]] to Mapped[Optional[list]]; the field stores a
JSON array of document ID strings, not an object
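
The timeout fallback in the first bullet can be sketched roughly as
follows. This is an illustration only: generate_with_fallback(), its
parameters, and the use of asyncio.wait_for() to produce the timeout are
assumptions, and the exception type the real code catches is not stated
here.

```python
import asyncio


async def generate_with_fallback(primary_call, fallback_call, prompt, timeout_s=30.0):
    """Try the primary model first; on timeout, retry once on the fallback.

    primary_call and fallback_call are hypothetical async callables that
    take a prompt and return generated text.
    """
    try:
        return await asyncio.wait_for(primary_call(prompt), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Primary did not answer in time, so hand the same prompt to the fallback.
        return await fallback_call(prompt)
```

Errors other than a timeout propagate unchanged in this sketch, so a
genuine failure in the primary call is not silently masked.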
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace self.gemini.client with self.gemini.primary_client on line 295 of
knowledge_base_service.py. GeminiService exposes only primary_client and
fallback_client; it has no client attribute. The bad reference caused every
processing job to fail at the distillation step, which is also why Version
History was always blank (no SpecVersion records were ever created).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous prompts instructed Gemini to "remove redundancy, marketing
fluff, or content not relevant to..." which caused salient details
(especially unusual, granular, or edge-case instructions) to be lost
from spec output. Rewrote all 5 agent prompts (legal, brand_barclays,
brand_barclaycard, channel_best_practices, channel_tech_specs) to:
- Reframe the task as "restructure and organise" rather than "distil
and filter"
- Add a zero-tolerance detail-loss instruction with concrete examples
of unconventional rules that must be preserved
- Explicitly forbid omitting, summarising away, or paraphrasing
specific rules/values/conditions
- Allow merging only exact duplicates while keeping all unique content
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
After processing a new knowledge base spec, invalidate_cache() was
clearing the DB spec from the cache without replacing it. The next
analysis would then fall back to static prompts/*.md files instead of
using the newly generated DB spec.
Now invalidate_cache() accepts an optional new_spec_content argument that
immediately repopulates the DB cache, and knowledge_base_service passes
the freshly distilled spec content so it's available for the next
analysis without a server restart.
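
A minimal sketch of the new cache behaviour, assuming a simple in-memory
cache. The SpecCache class, get_spec(), and the attribute names are
hypothetical; only invalidate_cache() and new_spec_content come from this
change.

```python
from typing import Optional


class SpecCache:
    """Hypothetical cache: prefers the DB spec, else the static prompt file."""

    def __init__(self, static_spec: str) -> None:
        self._static_spec = static_spec  # stands in for prompts/*.md content
        self._db_spec: Optional[str] = None

    def invalidate_cache(self, new_spec_content: Optional[str] = None) -> None:
        # Passing content replaces the cached DB spec in one step;
        # passing nothing only clears it, which was the old behaviour.
        self._db_spec = new_spec_content

    def get_spec(self) -> str:
        # Fall back to the static prompt only when no DB spec is cached.
        if self._db_spec is not None:
            return self._db_spec
        return self._static_spec
```

Calling invalidate_cache() with no argument reproduces the old behaviour,
where the next analysis falls back to the static prompt.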
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- LlamaParse service now returns a ParseResult dataclass with markdown,
total page count, and a list of failed pages (page number + error)
- Knowledge base service sets status to "partial" (instead of "parsed")
when some pages failed, with a descriptive error listing which pages
failed and why
- Frontend StatusBadge shows "partial parse" in orange for partial status
- Error details are shown inline below the document row for both partial
and error statuses
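
The result shape and the partial-status rule described above could look
roughly like this. Field names, the FailedPage helper, status_for(), and
the error message format are assumptions; only the ParseResult name, its
three pieces of data, and the "parsed"/"partial" statuses come from this
change. How a fully failed parse maps to "error" is not covered.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class FailedPage:
    page_number: int
    error: str


@dataclass
class ParseResult:
    markdown: str
    total_pages: int
    failed_pages: List[FailedPage] = field(default_factory=list)


def status_for(result: ParseResult) -> Tuple[str, Optional[str]]:
    """Return (status, error_message) for a finished parse."""
    if not result.failed_pages:
        return "parsed", None
    # List each failed page and its error so the UI can show it inline.
    details = "; ".join(
        f"page {page.page_number}: {page.error}" for page in result.failed_pages
    )
    message = (
        f"{len(result.failed_pages)} of {result.total_pages} pages failed ({details})"
    )
    return "partial", message
```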
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Parse documents concurrently (up to 10 at a time via semaphore) instead
of serially. Each coroutine uses its own DB session for per-document
status updates, while a shared lock serializes job progress increments
on the main session to avoid session-sharing issues.
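
A minimal sketch of the concurrency pattern, assuming plain asyncio
primitives. parse_all(), worker(), and the progress dict are hypothetical
names; the per-document DB session is omitted, and the shared counter
stands in for the job progress update on the main session.

```python
import asyncio

MAX_CONCURRENT_PARSES = 10  # the limit named in this change


async def parse_all(doc_ids, parse_one):
    """Parse docs concurrently, at most MAX_CONCURRENT_PARSES at a time.

    parse_one is a hypothetical async callable taking a document ID.
    """
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_PARSES)
    progress_lock = asyncio.Lock()
    progress = {"done": 0, "running": 0, "peak": 0}

    async def worker(doc_id):
        async with semaphore:
            # Track concurrency so the limit can be observed.
            progress["running"] += 1
            progress["peak"] = max(progress["peak"], progress["running"])
            try:
                await parse_one(doc_id)
            finally:
                progress["running"] -= 1
        # Only one coroutine at a time touches the shared job progress.
        async with progress_lock:
            progress["done"] += 1

    await asyncio.gather(*(worker(doc_id) for doc_id in doc_ids))
    return progress
```

In this sketch a failure in any parse_one call propagates out of
asyncio.gather(); real code would more likely record the error against
the document and keep the job running.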
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>