ppt-tool/backend
Vadym Samoilenko 8715fa8bd2 Replace docling+layoutparser+torch with PyMuPDF (~3.5GB → ~80MB)
- docling removed: PDF now parsed by PyMuPDF (fitz), PPTX by python-pptx
- layoutparser removed: already optional with graceful fallback (returns [])
- torch/pytorch index removed: no longer needed by any dependency
- pymupdf added: ~20MB wheel, no ML deps, faster than docling for text extraction
- All existing DOCX parsing kept (python-docx, already working)
- extract_text_from_image_via_vision() unchanged (Gemini API)

Result: api/worker Docker image ~3-4GB lighter, no NVIDIA libs on CPU server

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 20:06:46 +00:00
alembic/versions  Fix migration: move to correct path, update down_revision to c7a3f8e21d4b  2026-03-01 20:10:36 +00:00
api               Add 3 sandbox features: diagrams, mermaid, and template code-gen  2026-03-19 18:47:31 +00:00
assets            Phase 1-2: Foundation + Admin Panel & Client Management  2026-02-26 15:37:17 +00:00
constants         Phase 2: Admin panel, analytics, storage, template pipeline, multi-provider LLM  2026-02-26 23:39:34 +00:00
enums             Phase 1-2: Foundation + Admin Panel & Client Management  2026-02-26 15:37:17 +00:00
migrations        Fix migration: move to correct path, update down_revision to c7a3f8e21d4b  2026-03-01 20:10:36 +00:00
models            Add 3 sandbox features: diagrams, mermaid, and template code-gen  2026-03-19 18:47:31 +00:00
scripts           Phase 2: Admin panel, analytics, storage, template pipeline, multi-provider LLM  2026-02-26 23:39:34 +00:00
services          Replace docling+layoutparser+torch with PyMuPDF (~3.5GB → ~80MB)  2026-03-19 20:06:46 +00:00
static            Phase 1-2: Foundation + Admin Panel & Client Management  2026-02-26 15:37:17 +00:00
tests             Phase 7: Apply design system to all admin pages + fix test stubs  2026-03-01 19:01:52 +00:00
utils             Add 3 sandbox features: diagrams, mermaid, and template code-gen  2026-03-19 18:47:31 +00:00
workers           Increase ARQ job timeout to 90 minutes  2026-02-27 21:48:51 +00:00
.python-version   Phase 1-2: Foundation + Admin Panel & Client Management  2026-02-26 15:37:17 +00:00
alembic.ini       Phase 1-2: Foundation + Admin Panel & Client Management  2026-02-26 15:37:17 +00:00
Dockerfile        Phase 4: Fix critical bugs, improve document parsing, add vision OCR  2026-02-27 14:07:00 +00:00
mcp_server.py     Rebrand Presenton to Oliver DeckForge, pre-configure models, use NanoBanana Pro  2026-02-26 18:17:11 +00:00
openai_spec.json  Phase 1-2: Foundation + Admin Panel & Client Management  2026-02-26 15:37:17 +00:00
pyproject.toml    Replace docling+layoutparser+torch with PyMuPDF (~3.5GB → ~80MB)  2026-03-19 20:06:46 +00:00
server.py         Phase 1-2: Foundation + Admin Panel & Client Management  2026-02-26 15:37:17 +00:00
uv.lock           Phase 1-2: Foundation + Admin Panel & Client Management  2026-02-26 15:37:17 +00:00