vault backup: 2026-05-18 19:14:48

Vadym Samoilenko 2026-05-18 19:14:48 +01:00
parent 0c5b1d72af
commit 9bc03db45e
2 changed files with 217 additions and 0 deletions


@ -0,0 +1,118 @@
---
auto_generated: true
manual_updated_at: 2026-05-18
modified: 2026-05-18
---
# MD to Word Converter Developer Manual
## Architecture Overview
The project is a CLI-driven Python application with the following components:
- **`md_to_word_converter.py`**: Main module containing:
  - `MermaidChartHandler`: Class to extract, validate, and render Mermaid charts.
  - `main()` function: CLI entry point that orchestrates the conversion process.
- **`pyproject.toml`**: Project metadata, dependencies, and build configuration using `hatchling`.
## Tech Stack
- **Python**: 3.9+
- **Core Libraries**:
  - `python-docx`: For creating and manipulating Word documents.
  - `markdown`: For parsing Markdown content.
  - `beautifulsoup4`: For parsing HTML fragments if needed.
  - `Pillow`: For image processing (used for Mermaid chart rendering).
- **Build Tool**: `hatchling`
## Local Setup
1. **Clone the repository**:
```bash
git clone <repository-url>
cd md-to-word
```
2. **Create a virtual environment** (recommended):
```bash
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
```
3. **Install dependencies**:
```bash
pip install -e .
```
4. **Verify installation**:
```bash
convert --help
```
## Environment Variables
No environment variables are currently required. Future versions may add some, for example to set a Mermaid CLI path or CSS overrides.
## Key Services & Entry Points
### `MermaidChartHandler`
- **`__init__`**: Creates the temporary working directory and writes the Mermaid CSS file.
- **`_create_mermaid_css`**: Generates CSS for Mermaid styling with Montserrat font and #FFC407 accents.
- **`extract_mermaid_charts`**: Extracts Mermaid code blocks from Markdown.
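
As a rough illustration of what `__init__` and `_create_mermaid_css` are described as doing, the sketch below creates a temp directory and writes a stylesheet into it. It is not the project's actual code: the class name `MermaidCssSketch`, the file name `mermaid.css`, the CSS selectors, the Google Fonts URL, and the `cleanup` method are all assumptions made for the example. Only the Montserrat font and the #FFC407 accent come from this manual.

```python
import shutil
import tempfile
from pathlib import Path

ACCENT = "#FFC407"          # accent colour named in this manual
FONT_FAMILY = "Montserrat"  # font named in this manual
# Assumed URL; the manual only says the font is loaded from Google Fonts.
FONT_URL = "https://fonts.googleapis.com/css2?family=Montserrat&display=swap"

# Selectors are illustrative; the real stylesheet may target other elements.
CSS_TEMPLATE = """\
@import url('{font_url}');

.node rect, .node circle, .node polygon {{
    stroke: {accent};
    stroke-width: 2px;
}}
.nodeLabel, .edgeLabel {{
    font-family: '{font}', sans-serif;
}}
"""


class MermaidCssSketch:
    """Hypothetical stand-in for the CSS-related part of MermaidChartHandler."""

    def __init__(self) -> None:
        # mkdtemp leaves the directory in place until cleanup() is called.
        self.temp_dir = Path(tempfile.mkdtemp(prefix="mermaid_"))
        self.css_path = self._create_mermaid_css()

    def _create_mermaid_css(self) -> Path:
        css = CSS_TEMPLATE.format(font_url=FONT_URL, accent=ACCENT, font=FONT_FAMILY)
        css_path = self.temp_dir / "mermaid.css"
        css_path.write_text(css, encoding="utf-8")
        return css_path

    def cleanup(self) -> None:
        # Remove the temp directory and everything inside it.
        shutil.rmtree(self.temp_dir, ignore_errors=True)
```

Keeping the font and colour in module-level constants would make an offline substitution a one-line change.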
### `main()` Entry Point
Located in `md_to_word_converter.py`, this function:
1. Parses CLI arguments.
2. Reads the input Markdown file.
3. Converts Markdown to HTML.
4. Processes Mermaid charts.
5. Builds the Word document using `python-docx`.
6. Saves the output `.docx` file.
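
The steps above can be sketched roughly as follows. This is a simplified outline and not the module's real `main()`: the function names `convert` and `heading_level` are hypothetical, CLI parsing (step 1) and Mermaid processing (step 4) are left out, and only headings and paragraphs are carried over. The library calls themselves (`markdown.markdown`, `BeautifulSoup`, `Document`) are standard parts of the packages listed under Tech Stack.

```python
from pathlib import Path


def heading_level(tag_name: str) -> int:
    """Map an HTML heading tag such as 'h2' to its numeric level."""
    return int(tag_name[1])


def convert(input_path: str, output_path: str) -> None:
    # Third-party imports live inside the function so this sketch can be
    # imported even where the packages are not installed.
    import markdown
    from bs4 import BeautifulSoup
    from docx import Document

    # Step 2: read the input Markdown file.
    md_text = Path(input_path).read_text(encoding="utf-8")

    # Step 3: convert Markdown to HTML.
    html = markdown.markdown(md_text, extensions=["tables"])

    # Step 4 (Mermaid processing) is omitted from this sketch.

    # Step 5: build the Word document from the HTML elements, in order.
    soup = BeautifulSoup(html, "html.parser")
    doc = Document()
    for element in soup.find_all(["h1", "h2", "h3", "p"]):
        text = element.get_text(strip=True)
        if element.name == "p":
            doc.add_paragraph(text)
        else:
            doc.add_heading(text, level=heading_level(element.name))

    # Step 6: save the output .docx file.
    doc.save(output_path)
```

Lists, tables, and inline formatting would each need their own branch in the loop; they are omitted here to keep the flow of the six steps visible.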
## API Reference
### `MermaidChartHandler.extract_mermaid_charts(md_content: str) -> List[Dict]`
Extracts Mermaid charts from Markdown content.
- **Args**:
  - `md_content` (str): Markdown content as a string.
- **Returns**:
  - List of dictionaries with chart info (code, index, etc.).
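
One plausible way to implement an extractor with this signature is a regular expression over fenced blocks. The sketch below is an assumption about the approach, not the project's code: the regex, the standalone function form, and the `start`/`end` keys are illustrative, while `code` and `index` follow the return description above.

```python
import re
from typing import Dict, List

# Built from single backticks so the fence marker never appears literally here.
FENCE = "`" * 3

# A line starting with the fence plus "mermaid", then the shortest body
# that ends at the next line consisting only of a fence.
MERMAID_BLOCK = re.compile(
    rf"^{FENCE}mermaid[ \t]*\n(.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_mermaid_charts(md_content: str) -> List[Dict]:
    """Return one dict per Mermaid block, in document order."""
    charts = []
    for index, match in enumerate(MERMAID_BLOCK.finditer(md_content)):
        charts.append(
            {
                "index": index,                  # position among the charts
                "code": match.group(1).strip(),  # Mermaid source without fences
                "start": match.start(),          # offsets into md_content
                "end": match.end(),              # (assumed keys, for replacement)
            }
        )
    return charts
```

Recording the character offsets would let a caller swap each block for an image placeholder without searching the text a second time.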
### `main()`
CLI entry point.
- **Usage**: `convert <input> <output>`
- **Arguments**:
  - `input`: Path to input Markdown file.
  - `output`: Path to output Word file.
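
A CLI with these two positional arguments could be declared with the standard library's `argparse`, as in the sketch below. Whether the project actually uses `argparse` is an assumption, and the helper names `build_parser` and `parse_args` are hypothetical.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Declare the two positional arguments documented above."""
    parser = argparse.ArgumentParser(
        prog="convert",
        description="Convert a Markdown file to a Word (.docx) document.",
    )
    parser.add_argument("input", help="Path to input Markdown file.")
    parser.add_argument("output", help="Path to output Word file.")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    # argv=None makes argparse read sys.argv[1:], which is what a CLI wants;
    # passing a list keeps the function easy to call from tests.
    return build_parser().parse_args(argv)
```

With this shape, `parse_args(["README.md", "output.docx"])` yields a namespace whose `input` and `output` attributes hold the two paths.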
## Deployment
### Distributing as a Package
1. Build the package:
```bash
hatch build
```
2. Publish to PyPI:
```bash
hatch publish
```
### Docker Deployment
A Dockerfile can be created to containerize the application. Ensure system dependencies (e.g., Mermaid CLI, Puppeteer) are installed in the Docker image.
## Known Gotchas
- **Mermaid Rendering Dependency**: The tool relies on an external Mermaid CLI. Ensure it is installed and accessible in the system PATH.
- **Font Loading**: Mermaid CSS loads Montserrat from Google Fonts. In offline environments, provide a local font file or use a different font family.
- **Temporary Files**: The tool creates temp files for Mermaid rendering. These are not cleaned up automatically in all cases.
- **Python Version**: Requires Python 3.9+. Earlier versions are not supported because of dependency compatibility.
- **Large Documents**: Performance may degrade with very large Markdown files due to repeated temp file creation and Mermaid rendering.
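
Two of the gotchas above, the Mermaid CLI lookup and temp-file cleanup, can be handled defensively along the following lines. This is a sketch under assumptions: the function names and the `chart.mmd`/`chart.png` file names are hypothetical, and PNG output is assumed. `mmdc` is the executable installed by `@mermaid-js/mermaid-cli`, and `-i`/`-o` are its input and output options.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import List, Optional


def find_mermaid_cli() -> Optional[str]:
    """Return the full path to `mmdc` if it is on PATH, else None."""
    return shutil.which("mmdc")


def prepare_render(code: str, workdir: Path, mmdc: str = "mmdc") -> List[str]:
    """Write the chart source into workdir and return the render command."""
    source = workdir / "chart.mmd"
    source.write_text(code, encoding="utf-8")
    return [mmdc, "-i", str(source), "-o", str(workdir / "chart.png")]


def render_chart(code: str) -> bytes:
    """Render one chart and return the image bytes (assumes PNG output)."""
    cli = find_mermaid_cli()
    if cli is None:
        raise RuntimeError("Mermaid CLI (mmdc) was not found on PATH")
    # TemporaryDirectory deletes the folder and its contents on exit,
    # including when rendering raises.
    with tempfile.TemporaryDirectory() as tmp:
        command = prepare_render(code, Path(tmp), cli)
        subprocess.run(command, check=True)
        return (Path(tmp) / "chart.png").read_bytes()
```

Failing early with a clear message when `mmdc` is missing is easier to diagnose than a document that silently lacks its diagrams.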


@ -0,0 +1,99 @@
---
auto_generated: true
manual_updated_at: 2026-05-18
modified: 2026-05-18
---
# MD to Word Converter User Manual
## What This Tool Does
This tool converts Markdown (`*.md`) documents into Microsoft Word (`*.docx`) documents. It features:
- **Rich formatting preservation**: Headers, lists, bold/italic text, tables, etc., are preserved in the Word output.
- **Mermaid chart rendering**: Inline Mermaid code blocks are extracted, rendered to images, and embedded into the Word document.
- **Custom styling**: Mermaid charts and their nodes are styled with the Montserrat font and a #FFC407 accent color.
## Who Uses It
- Technical writers who need to produce professionally formatted Word documents from Markdown source.
- Developers and analysts who use Mermaid diagrams in their documentation and need them rendered in final deliverables.
- Teams that standardize on Markdown for authoring and Word for distribution or editing by non-technical stakeholders.
## How to Access
### Prerequisites
- Python 3.9 or higher
- pip package manager
### Installation
1. Clone or download the project.
2. Install the package and its dependencies from the project root. This also registers the `convert` command:
```bash
pip install -e .
```
### Running the Tool
The tool provides a command-line interface via the `convert` script.
```bash
convert <input_markdown_file> <output_word_file>
```
Example:
```bash
convert README.md output.docx
```
## Main Workflows
### 1. Basic Conversion
1. Prepare your Markdown file with standard Markdown syntax.
2. If you have Mermaid diagrams, enclose them in ` ```mermaid ` blocks.
3. Run the conversion command as shown above.
4. Open the resulting `.docx` file in Microsoft Word or a compatible editor.
### 2. Converting with Mermaid Diagrams
1. Write your Mermaid diagram inside a code block:
````markdown
```mermaid
graph TD
    A[Start] --> B{Decision}
    B -->|Yes| C[End]
    B -->|No| D[Process]
```
````
2. Run the conversion command.
3. The converter will:
   - Extract the Mermaid code.
   - Render it to an image. This step requires Mermaid CLI to be installed separately on your system.
   - Embed the image into the Word document.
4. Verify that diagrams appear correctly in the output.
### 3. Troubleshooting Common Issues
- **Mermaid diagrams not appearing**: Ensure Mermaid CLI (`@mermaid-js/mermaid-cli`) is installed globally via npm: `npm install -g @mermaid-js/mermaid-cli`. Also check that a headless Chromium-based browser is available, since Mermaid CLI renders through Puppeteer.
- **Font display issues**: The Mermaid CSS specifies the Montserrat font, loaded from Google Fonts. If the machine running the conversion cannot reach Google Fonts, install Montserrat locally or substitute another font.
- **Missing dependencies**: If you see `ImportError`, reinstall the package from the project root with `pip install -e .`, or install the missing libraries individually.
## FAQ
**Q: Does it support all Markdown features?**
A: It supports standard Markdown elements via the `markdown` library and additional formatting via `python-docx`. Complex HTML or custom extensions may not be fully supported.
**Q: Can I customize the output styles?**
A: Currently, custom styling is limited to the Mermaid chart CSS. Global document styles are not yet configurable from the CLI.
**Q: Is the tool free and open source?**
A: Yes, the tool is open source and available under its project license.
**Q: Can I convert multiple files at once?**
A: The current CLI supports one input file per run. Use shell loops or scripts to batch convert.
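
As one way to script the batch conversion mentioned in this answer, the sketch below pairs each Markdown file with a `.docx` target and calls the `convert` command once per file. The function names `plan_batch` and `run_batch` are hypothetical, and the sketch assumes `convert` is on PATH.

```python
import subprocess
from pathlib import Path
from typing import Iterable, List, Tuple


def plan_batch(md_files: Iterable[Path], output_dir: Path) -> List[Tuple[Path, Path]]:
    """Pair each Markdown file with a same-named .docx path in output_dir."""
    return [(md, output_dir / f"{md.stem}.docx") for md in sorted(md_files)]


def run_batch(source_dir: Path, output_dir: Path) -> None:
    """Convert every *.md file in source_dir, one CLI run per file."""
    output_dir.mkdir(parents=True, exist_ok=True)
    for md_file, docx_file in plan_batch(source_dir.glob("*.md"), output_dir):
        # check=True stops the batch at the first failed conversion.
        subprocess.run(["convert", str(md_file), str(docx_file)], check=True)
```

For example, `run_batch(Path("docs"), Path("out"))` would write `out/<name>.docx` for each `docs/<name>.md`.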
**Q: What happens if a Mermaid diagram has syntax errors?**
A: The renderer may fail silently or produce broken images. Check your Mermaid syntax carefully before conversion.
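
Because failures can be silent, a quick check on the rendered file can catch a missing or broken image before it reaches the document. The sketch below assumes the renderer is asked for PNG output, which this manual does not state. The function names are hypothetical, and the check only inspects the standard 8-byte PNG signature, so it cannot detect a diagram that rendered but looks wrong.

```python
from pathlib import Path

# Every valid PNG file starts with these eight bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def is_png(data: bytes) -> bool:
    """True if the data begins with the PNG file signature."""
    return data.startswith(PNG_SIGNATURE)


def rendered_ok(image_path: Path) -> bool:
    """True if the file exists and its first bytes are the PNG signature."""
    if not image_path.is_file():
        return False
    with image_path.open("rb") as handle:
        return is_png(handle.read(len(PNG_SIGNATURE)))
```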