ppt-tool/backend/services/docling_service.py
Vadym Samoilenko cf21ba4516 Phase 1-2: Foundation + Admin Panel & Client Management
Phase 1 (Foundation):
- Project restructure (presenton-main → backend/ + frontend/)
- Database schema (8 new models, Alembic config, seed script)
- Auth (Azure AD SSO + dev bypass, JWT sessions, AuthMiddleware)
- RBAC (access_service, rbac_middleware, admin routers)
- Audit logging (fire-and-forget, AuditMiddleware, admin router)
- i18n (react-i18next with 5 namespace files)

Phase 2 (Admin Panel & Client Management):
- Admin panel shell (sidebar layout, role guard, 12 pages)
- Redux admin slice with 18 async thunks
- User management (role changes, deactivation)
- Client management (CRUD, brand config, team management)
- Brand config editor (colors, fonts, logos, voice rules)
- Master deck upload & parser (PPTX → HTML → React pipeline)
- Audit log viewer with filters and CSV/JSON export

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 15:37:17 +00:00

33 lines
1.1 KiB
Python

from docling.document_converter import (
DocumentConverter,
PdfFormatOption,
PowerpointFormatOption,
WordFormatOption,
)
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.datamodel.base_models import InputFormat
class DoclingService:
def __init__(self):
self.pipeline_options = PdfPipelineOptions()
self.pipeline_options.do_ocr = False
self.converter = DocumentConverter(
allowed_formats=[InputFormat.PPTX, InputFormat.PDF, InputFormat.DOCX],
format_options={
InputFormat.DOCX: WordFormatOption(
pipeline_options=self.pipeline_options,
),
InputFormat.PPTX: PowerpointFormatOption(
pipeline_options=self.pipeline_options,
),
InputFormat.PDF: PdfFormatOption(
pipeline_options=self.pipeline_options,
),
},
)
def parse_to_markdown(self, file_path: str) -> str:
result = self.converter.convert(file_path)
return result.document.export_to_markdown()