diff --git a/.obsidian/plugins/hoarder-sync/data.json b/.obsidian/plugins/hoarder-sync/data.json index 66a6fa2..69f77ba 100644 --- a/.obsidian/plugins/hoarder-sync/data.json +++ b/.obsidian/plugins/hoarder-sync/data.json @@ -4,7 +4,7 @@ "syncFolder": "Hoarder", "attachmentsFolder": "Hoarder/attachments", "syncIntervalMinutes": 60, - "lastSyncTimestamp": 1779123138328, + "lastSyncTimestamp": 1779126738034, "updateExistingFiles": false, "excludeArchived": true, "onlyFavorites": false, diff --git a/01 Projects/Oliver-ai-bot_2.0/DEVELOPER_MANUAL.md b/01 Projects/Oliver-ai-bot_2.0/DEVELOPER_MANUAL.md new file mode 100644 index 0000000..6500d11 --- /dev/null +++ b/01 Projects/Oliver-ai-bot_2.0/DEVELOPER_MANUAL.md @@ -0,0 +1,141 @@ +--- +auto_generated: true +created: 2026-05-18 +manual_updated_at: 2026-05-18 +modified: 2026-05-18 +name: Developer_Manual +status: active +tags: +- client/oliver +- tech/azure-ad +- tech/celery +- tech/docker +- tech/fastapi +- tech/lamaindex +- tech/nextjs +- tech/postgresql +- tech/python +- tech/rag +- tech/redis +- tech/typescript +- type/sop +type: sop +--- + +# Oliver AI Bot 2.0 (Nexus MVP) - Developer Manual + +## Architecture Overview + +Nexus is a microservices-oriented monorepo structured for scalability and security. It runs in production on a GCE VM using Docker Compose, with heavy processing offloaded to Google Cloud Run. + +### Core Components +1. **Backend (FastAPI/Python):** The central API handling auth, business logic, and orchestration. +2. **Frontend (Next.js 14):** SPA serving the UI and handling client-side routing. +3. **Database (PostgreSQL 16):** Relational storage for users, metadata, and config. +4. **Vector DB (Qdrant):** Stores document embeddings for RAG retrieval. +5. **Cache/Broker (Redis 7):** Manages rate limiting, session storage, and Celery task queues. +6. **Worker Service (Cloud Run):** Stateless Python/FastAPI service for heavy document parsing (LlamaParse integration). 
+ +## Technology Stack + +* **Backend:** Python 3.11, FastAPI, SQLAlchemy (Async), Pydantic Settings. +* **Frontend:** Next.js 14, TypeScript, Tailwind CSS. +* **Infrastructure:** Docker Compose, GCE, Google Cloud Run, Apache (Reverse Proxy). +* **Auth:** Microsoft Entra ID (Azure AD), JWT (HS256), Python-JWT. +* **AI/ML:** OpenAI/Anthropic/Google LLMs, Qdrant (Vector DB), LangChain/LlamaIndex (implied by RAG pipeline). + +## Local Setup + +### Prerequisites +* Docker & Docker Compose +* Python 3.11+ +* Node.js 18+ + +### Steps +1. **Clone Repository:** + ```bash + git clone + cd nexus-mvp + ``` +2. **Environment Configuration:** + ```bash + cp backend/.env.example backend/.env + # Edit .env with your local DB passwords and Azure AD credentials + ``` +3. **Start Infrastructure:** + ```bash + docker-compose up -d db redis qdrant + ``` +4. **Backend Setup:** + ```bash + cd backend + pip install -r requirements.txt + python seed_data.py # Initialize regions/departments + uvicorn app.main:app --reload + ``` +5. **Frontend Setup:** + ```bash + cd frontend + npm install + npm run dev + ``` + +## Environment Variables + +Key variables are defined in `.env.example` and `docker-compose.yml`: + +| Variable | Description | Default/Example | +| :--- | :--- | :--- | +| `ENTRA_CLIENT_ID` | Azure AD App Registration Client ID | Required | +| `ENTRA_CLIENT_SECRET` | Azure AD App Registration Secret | Required | +| `DATABASE_URL` | PostgreSQL connection string | `postgresql://...@db:5432/nexus_db` | +| `REDIS_URL` | Redis connection string | `redis://redis:6379/0` | +| `QDRANT_URL` | Qdrant Vector DB URL | `http://qdrant:6333` | +| `JWT_SECRET` | Secret for signing JWT tokens | Required | +| `OPENAI_API_KEY` | OpenAI API Key for LLMs | Required | +| `LOG_LEVEL` | Logging verbosity | `INFO` | + +## Key Services & Entry Points + +* **`backend/app/main.py`**: FastAPI application entry point. Includes CORS, Rate Limiting, and Auth middleware. 
+* **`backend/app/config.py`**: Pydantic `BaseSettings` class for type-safe environment variable loading. +* **`backend/celery_app.py`**: Celery configuration. Defines workers, queues (`sharepoint`, `default`), and beat schedules (e.g., hourly SharePoint sync, 30-min token refresh). +* **`backend/cloud_run_service.py`**: Standalone FastAPI app deployed to Cloud Run. Handles `POST /process-document` for heavy text extraction/chunking, returning raw chunks for the main backend to embed. +* **`backend/seed_data.py`**: Script to populate initial `Region` and `Department` taxonomy data. + +## API Reference + +### Authentication +* `POST /api/v1/auth/login`: Initiates Entra ID PKCE flow. +* `GET /api/v1/auth/callback`: Handles OAuth2 callback from Azure AD. + +### Documents +* `POST /api/v1/documents/upload`: Uploads file to Qdrant/Postgres. +* `GET /api/v1/documents/search?q=`: Retrieves context for RAG. + +### Health +* `GET /health`: Standard health check for all services. + +## Deployment + +### Production (GCE + Cloud Run) +1. **Backend:** Deploy Docker container to GCE VM. Use Apache as reverse proxy (`/etc/apache2/sites-available/nexus.conf`). +2. **Workers:** Scale Celery workers on GCE VM. +3. **Cloud Run:** Deploy `backend/cloud_run_service.py` as a separate service in Google Cloud Run for elastic scaling during peak document processing. +4. **Database:** Use managed PostgreSQL (Cloud SQL) and Redis (Memorystore) in production instead of Docker containers. + +### Docker Compose +For local/dev/staging, use `docker-compose.yml`. It orchestrates `db`, `redis`, `qdrant`, and `backend` services. + +## Known Gotchas + +1. **Azure AD Redirect URI:** Ensure `ENTRA_REDIRECT_URI` matches the Azure Portal configuration exactly. Defaults to `http://localhost:8000/api/v1/auth/callback` locally. +2. **Qdrant Ports:** Qdrant exposes `6333` (HTTP) and `6334` (gRPC). Ensure firewall rules allow this. +3. **Celery Timeouts:** Task hard limit is 30 mins. 
Heavy documents may fail if they exceed this; consider increasing `task_time_limit` or splitting large files before upload. +4. **JWT Expiration:** Default is 15 minutes. Refresh tokens last 7 days. Ensure frontend handles silent refresh properly. +5. **Environment Consistency:** Never commit `.env`. Use `.env.example` as the source of truth for config keys. + +## Related +- [[01 Projects/sandbox-notebookllamalm-nextjs/DEVELOPER_MANUAL.md]] +- [[01 Projects/ppt-tool/DEVELOPER_MANUAL.md]] +- [[client/oliver]] \ No newline at end of file diff --git a/01 Projects/Oliver-ai-bot_2.0/USER_MANUAL.md b/01 Projects/Oliver-ai-bot_2.0/USER_MANUAL.md new file mode 100644 index 0000000..69b82f5 --- /dev/null +++ b/01 Projects/Oliver-ai-bot_2.0/USER_MANUAL.md @@ -0,0 +1,84 @@ +--- +auto_generated: true +created: 2026-05-18 +manual_updated_at: 2026-05-18 +modified: 2026-05-18 +name: User_Manual +status: active +tags: +- client/oliver +- domain/ai +- tech/azure-ad +- tech/rag +- type/sop +type: sop +--- + +# Oliver AI Bot 2.0 (Nexus MVP) - User Manual + +## What This Tool Does + +**Nexus** is an enterprise-grade AI platform designed for OLIVER Agency and similar organizations. It centralizes artificial intelligence capabilities into a single secure hub, providing three core functions: + +1. **RAG Chat (Retrieval-Augmented Generation):** Ask natural language questions about your company's internal documents. The AI retrieves relevant information from the knowledge base and answers with source citations. +2. **Personal Assistant:** Access your Microsoft 365 data (emails, calendar, OneDrive files, SharePoint) via secure read-only integration. +3. **Knowledge Base Management:** A secure admin interface to upload, organize, and manage corporate documents for AI indexing. + +## Who Uses It + +* **End Users:** Employees needing quick access to company knowledge or personal productivity tools. 
+* **Content Managers:** Administrators responsible for uploading documents, managing the taxonomy (regions/departments), and monitoring system health. +* **Super Administrators:** IT staff with full access to system configuration, user roles, and backend settings. + +## How to Access + +1. **URL:** Navigate to `https://ai-sandbox.oliver.solutions` (or your specific deployment URL). +2. **Authentication:** Log in using your **Microsoft Entra ID (Azure AD)** credentials. + * The system uses a secure **PKCE (Proof Key for Code Exchange)** flow for Single Sign-On (SSO). + * No separate password is required; you use your standard corporate login. +3. **Browser Support:** Optimized for modern browsers (Chrome, Edge, Firefox, Safari). + +## Main Workflows + +### 1. Chatting with Your Knowledge Base + +1. Navigate to the **AI Chat** section. +2. Type your question in natural language (e.g., "What is the policy on remote work?" or "Summarize the Q3 financial report"). +3. **Multi-Language Support:** You can ask questions in any language; the response will be generated in that same language. +4. **Review Sources:** The AI response includes clickable citations linking to the original documents. Click these to view the source material. + +### 2. Managing the Knowledge Base (Admins Only) + +1. Navigate to the **Knowledge Base** admin panel. +2. **Upload Documents:** Click "Upload" and select files (PDF, DOCX, TXT, etc.). +3. **Tag Content:** Assign the document to a specific **Region** (e.g., UK, US, APAC) and **Department** (e.g., HR, IT). This ensures users only see content relevant to their scope. +4. **Processing:** The system automatically extracts text, chunks the content, and generates vector embeddings. This may take a few minutes. +5. **Re-indexing:** If a document is updated, use the "Re-index" button to refresh the data. + +### 3. Using the Personal Assistant + +1. Navigate to the **Assistant** tab. +2. **Calendar:** Ask "What are my meetings today?" 
to view your schedule. +3. **Email:** Ask "Summarize unread emails from [Name]" to get quick insights. +4. **Files:** Ask "Show me recent documents in [Folder Name]" to access OneDrive/SharePoint content. +5. *Note:* All access is read-only. You cannot delete or modify data via the Assistant. + +## Frequently Asked Questions (FAQ) + +**Q: Can I ask questions in different languages?** +A: Yes. Nexus supports multi-language queries and responds in the language you used. + +**Q: How is my data secured?** +A: Nexus uses Microsoft Entra ID for authentication and JWT tokens for session management. Data is stored in encrypted PostgreSQL and Qdrant databases. CORS policies restrict access to trusted origins only. + +**Q: Why is my document taking a long time to process?** +A: Large documents or complex formats (like scanned PDFs) require heavy processing. You will receive a notification when indexing is complete. + +**Q: Can I delete my chat history?** +A: Yes, you can clear individual chats from the chat interface. + +**Q: What happens if my session expires?** +A: Your JWT token expires after 15 minutes (default). Simply refresh the page or log in again. The system will prompt you to re-authenticate if needed. + +## Related +- [[01 Projects/Oliver-ai-bot_2.0/DEVELOPER_MANUAL.md]] \ No newline at end of file diff --git a/01 Projects/amazon-transcreation/DEVELOPER_MANUAL.md b/01 Projects/amazon-transcreation/DEVELOPER_MANUAL.md new file mode 100644 index 0000000..1f265d1 --- /dev/null +++ b/01 Projects/amazon-transcreation/DEVELOPER_MANUAL.md @@ -0,0 +1,146 @@ +--- +auto_generated: true +manual_updated_at: 2026-05-18 +modified: 2026-05-18 +--- + +# Amazon AI Transcreation Platform - Developer Manual + +## Architecture Overview + +The platform consists of: +1. **Frontend**: Next.js 14 application with React components, handling UI/UX, authentication, and real-time job polling. +2. **Backend**: FastAPI application providing REST APIs for Auth, Jobs, and Output management. +3. 
**Database**: PostgreSQL 16 for storing job metadata, user data, and output references. +4. **Cache/Queue**: Redis 7 for task queue management (Celery) and caching. +5. **Task Workers**: Celery workers processing transcreation jobs asynchronously. +6. **LLM Integration**: Anthropic Claude API for AI-driven transcreation. + +``` +Frontend (Next.js) <--> FastAPI Backend <--> Celery Workers + | | | + | | V + | | PostgreSQL + | | Redis + | | + +-------------------------+-----------------+ + | + Claude LLM API +``` + +## Tech Stack + +- **Frontend**: Next.js 14, React, Tailwind CSS, Radix UI Components +- **Backend**: Python, FastAPI, Uvicorn +- **Database**: PostgreSQL 16 (asyncpg) +- **Queue/Caching**: Redis 7, Celery +- **LLM**: Anthropic Claude (`claude-sonnet-4-6`) +- **Authentication**: JWT, Azure AD SSO (optional) +- **Deployment**: Docker, Docker Compose + +## Local Setup + +### Prerequisites +- Docker and Docker Compose +- Python 3.11+ (for local dev if needed) +- Anthropic API Key + +### Steps +1. **Clone the repository**. +2. **Configure Environment**: + ```bash + cp .env.example .env + ``` + Update `.env` with your values (especially `ANTHROPIC_API_KEY` and `JWT_SECRET_KEY`). + +3. **Start Services**: + ```bash + docker-compose up -d + ``` + This starts: + - `db` on port `5492` + - `redis` on port `6389` + - `backend` on port `8040` + - `frontend` (via Next.js dev server in Docker or separate setup) + - `celery_worker` on port `8041` (if exposed) + +4. **Run Migrations** (if applicable): + Ensure database tables are created. Check backend Dockerfile for init scripts. + +5. 
**Access the Platform**: + - Frontend: `http://localhost:3000` (or mapped port) + - Backend API: `http://localhost:8040` + +## Environment Variables + +| Variable | Description | Example | +|----------|-------------|---------| +| `DATABASE_URL` | PostgreSQL connection string | `postgresql+asyncpg://transcreation:transcreation@db:5432/transcreation` | +| `REDIS_URL` | Redis connection string | `redis://redis:6379/0` | +| `ANTHROPIC_API_KEY` | Anthropic API key | `sk-ant-xxxxx` | +| `JWT_SECRET_KEY` | Secret for JWT signing | `CHANGE_ME_TO_A_RANDOM_SECRET` | +| `JWT_ALGORITHM` | JWT algorithm | `HS256` | +| `JWT_EXPIRY_HOURS` | Token expiry | `8` | +| `STORAGE_ROOT` | Path for file storage | `/storage` | +| `LLM_MODEL` | Model to use | `claude-sonnet-4-6` | +| `AZURE_AD_SSO_ENABLED` | Enable Azure AD SSO | `true/false` | +| `AZURE_AD_TENANT_ID` | Azure AD Tenant ID | (if SSO enabled) | +| `AZURE_AD_CLIENT_ID` | Azure AD Client ID | (if SSO enabled) | + +## Key Services & Entry Points + +### Backend (`backend/`) +- **Main App**: `app/main.py` - FastAPI application entry point. +- **Tasks**: `app/tasks/celery_app.py` - Celery app configuration. +- **Database Models**: Likely in `app/models/` (PostgreSQL via SQLAlchemy/asyncpg). +- **Storage**: Files stored in `/storage` (mounted in Docker). + +### Frontend (`frontend/`) +- **Layout**: `src/app/layout.tsx` - Root layout with Amazon Ember font. +- **Auth**: `src/lib/auth.ts` - Authentication utilities (`isAuthenticated`). +- **Components**: Radix UI components in `src/components/ui/`. +- **Pages**: + - `src/app/page.tsx` - Redirects to `/dashboard` or `/login`. + - `src/app/dashboard/` - Job Wizard, Monitor, Review. + +### Docker Compose (`docker-compose.yml`) +- **db**: PostgreSQL 16, healthcheck via `pg_isready`. +- **redis**: Redis 7 Alpine, healthcheck via `redis-cli ping`. +- **backend**: Uvicorn server, mounts `./backend` for live reload. +- **celery_worker**: Celery worker, `--concurrency=4`. 
+ +## API Reference + +- **Auth Endpoints**: `/auth/login`, `/auth/refresh` (JWT-based). +- **Jobs Endpoints**: + - `POST /jobs` - Create new transcreation job. + - `GET /jobs` - List jobs. + - `GET /jobs/{job_id}` - Get job details/status. + - `POST /jobs/{job_id}/approve` - Approve job output. +- **Output Endpoints**: + - `GET /output/{job_id}` - Download final translations. +- **Healthcheck**: + - `GET /health` - Check service health. + +*Note: Exact endpoints depend on implementation in `app/api/`. The frontend polls for job status updates every 3 seconds.* + +## Deployment + +### Production Checklist +1. **Set Production Env Vars**: Ensure all secrets are secure. Disable `--reload` in Uvicorn. +2. **Disable Dev Volumes**: Remove the `./backend:/app` bind mount from the production Compose configuration. +3. **Enable Azure AD SSO**: Set `AZURE_AD_SSO_ENABLED=true` and provide credentials. +4. **Secure Storage**: Ensure `/storage` is on persistent, secure storage. +5. **Monitoring**: Set up logging and monitoring for Celery workers and PostgreSQL. +6. **Reverse Proxy**: Use Nginx/Traefik to handle HTTPS and route traffic to the backend/frontend. + +## Known Gotchas + +- **Port Mappings**: In Docker Compose, ports are remapped (e.g., `5492:5432`). Ensure no conflicts. +- **Volume Mounts**: During dev, `./backend:/app` allows hot-reload. Do not use this in production as it can cause permission issues. +- **JWT Secrets**: Never commit `JWT_SECRET_KEY` to version control. Use secret management tools in production. +- **LLM Costs**: Anthropic API calls can be costly. Monitor usage and set limits if necessary. +- **Celery Concurrency**: The worker is started with `--concurrency=4`. Adjust based on load and available resources. +- **Database Healthcheck**: If DB is slow, adjust `healthcheck` intervals in `docker-compose.yml`. +- **Font Files**: Amazon Ember fonts are embedded in the frontend. Ensure they are licensed for use. 
+- **Redis/PostgreSQL**: Ensure both services are healthy before starting backend/celery. `depends_on` with `condition: service_healthy` helps, but add retries in app code. \ No newline at end of file diff --git a/01 Projects/amazon-transcreation/USER_MANUAL.md b/01 Projects/amazon-transcreation/USER_MANUAL.md new file mode 100644 index 0000000..e05cc9a --- /dev/null +++ b/01 Projects/amazon-transcreation/USER_MANUAL.md @@ -0,0 +1,96 @@ +--- +auto_generated: true +manual_updated_at: 2026-05-18 +modified: 2026-05-18 +--- + +# Amazon AI Transcreation Platform - User Manual + +## What This Tool Does + +The Amazon AI Transcreation Platform is an AI-powered solution designed to adapt Amazon marketing copy across 12 European locales. It replaces the previous manual LibreChat workflow with a structured, one-click multi-locale processing system. + +Key capabilities include: +- **Automated Transcreation**: Uses Claude LLM agents to adapt marketing copy for different regions while maintaining brand voice. +- **Real-time Monitoring**: Track the status of transcreation jobs in real-time via the dashboard. +- **In-App Review**: Review, edit, and approve translated content directly within the platform. +- **Structured Job Management**: Upload source files, select locales, and manage outputs in a unified interface. + +## Who Uses It + +- **Localization Managers**: Oversee translation projects and manage quality across multiple locales. +- **Marketing Teams**: Adapt marketing copy for European markets without manual translation workflows. +- **QA Specialists**: Review AI-generated translations for accuracy and brand compliance. +- **Administrators**: Manage user access and system configurations. + +## How to Access + +1. Navigate to the platform URL (e.g., `http://localhost:3000` or the deployed staging/production URL). +2. If not already logged in, you will be redirected to the **Login** page. +3. Authenticate using your corporate credentials (via Azure AD SSO if enabled). +4. 
Upon successful authentication, you will be redirected to the **Dashboard**. + +## Main Workflows + +### Workflow 1: Creating a Transcreation Job + +1. **Navigate to the Job Wizard**: From the Dashboard, click on "New Job" or navigate to the Job Wizard section. +2. **Upload Source Files**: + - Click "Upload" and select the marketing copy files (e.g., .txt, .docx, or .json). + - Files are stored in `/storage` on the backend. +3. **Select Target Locales**: + - Choose which of the 12 supported European locales to target (e.g., de_DE, fr_FR, es_ES, etc.). + - Select the marketing channels (e.g., Email, Social Media, Product Descriptions). +4. **Configure Settings**: + - Choose the LLM model (default: `claude-sonnet-4-6`). + - Adjust any specific tone or style guidelines if available. +5. **Launch Job**: Click "Start Transcreation". + - The job is queued in Celery and processed with 4 concurrent workers. + - You can monitor progress in the "Monitor" tab. + +### Workflow 2: Monitoring Job Progress + +1. **Go to Monitor**: Navigate to the "Monitor" section from the Dashboard. +2. **View Job Status**: + - Jobs will show statuses: `QUEUED`, `PROCESSING`, `COMPLETED`, or `FAILED`. + - The dashboard polls for status every 3 seconds, so updates appear without a manual refresh. +3. **Check Details**: Click on a job to view details, including progress bars and logs. + +### Workflow 3: Reviewing and Approving Output + +1. **Navigate to Review**: Go to the "Review" section once jobs are completed. +2. **Select Job**: Choose a completed job from the list. +3. **Compare Source and Output**: + - View side-by-side comparisons of source copy and AI-generated translations. + - Use the built-in editor to make manual adjustments if necessary. +4. **Approve or Reject**: + - Click "Approve" to finalize the content for export. + - Click "Reject" to send the job back for reprocessing or manual intervention. + +### Workflow 4: Exporting Final Content + +1. **Navigate to Output**: Go to the "Output" section. +2. 
**Select Approved Jobs**: Choose the jobs you want to export. +3. **Download/Export**: + - Export the final translations in the desired format (e.g., JSON, CSV). + - Files are stored in `/storage/output` by default. + +## FAQ + +### Q: How are translations generated? +A: Translations are generated using the Claude LLM (via Anthropic API) configured in the environment variables. The system adapts marketing copy while preserving brand tone. + +### Q: What if a translation fails? +A: Failed jobs will show a `FAILED` status. Check the job logs in the Monitor tab for details. You can restart the job or manually edit the output in the Review section. + +### Q: Can I customize the AI's tone? +A: Currently, the tone is controlled via the model selection (`claude-sonnet-4-6`). Future updates may include tone presets. + +### Q: How many concurrent jobs can run? +A: The system is configured with 4 concurrent Celery workers. Additional jobs will queue until resources are available. + +### Q: Is my data secure? +A: Yes, the platform supports Azure AD SSO for authentication. Data is stored in secure volumes (`pgdata` for DB, `/storage` for files). + +### Q: How do I access the Admin panel? +A: Admin access is restricted to users with admin roles. Contact your system administrator to grant access. 
\ No newline at end of file diff --git a/01 Projects/enterprise-ai-hub-nexus/DEVELOPER_MANUAL.md b/01 Projects/enterprise-ai-hub-nexus/DEVELOPER_MANUAL.md new file mode 100644 index 0000000..23fdb21 --- /dev/null +++ b/01 Projects/enterprise-ai-hub-nexus/DEVELOPER_MANUAL.md @@ -0,0 +1,149 @@ +--- +auto_generated: true +created: 2026-05-18 +manual_updated_at: 2026-05-18 +modified: 2026-05-18 +name: Developer_Manual +status: active +tags: +- client/oliver +- domain/ai +- domain/security +- tech/azure-ad +- tech/docker +- tech/fastapi +- tech/gemini +- tech/nextjs +- tech/postgresql +- tech/python +- tech/redis +- type/sop +type: sop +--- + +# Enterprise AI Hub Nexus - Developer Manual + +## Architecture Overview +Nexus uses a microservices-based architecture deployed on a Google Compute Engine (GCE) VM with Docker Compose, supplemented by Google Cloud Run for heavy document processing tasks. + +### High-Level Components +1. **Frontend**: Next.js 14 application (SSG/SSR hybrid). +2. **Backend**: FastAPI (Python 3.11) handling business logic, auth, and API routes. +3. **Database**: PostgreSQL 16 for relational data (users, metadata). +4. **Vector DB**: Qdrant for RAG embeddings and semantic search. +5. **Cache/Queue**: Redis for caching and task queues. +6. **Processing Service**: Google Cloud Run microservice for heavy document ingestion. + +## Tech Stack +- **Frontend**: Next.js 14, React, Tailwind CSS, Lucide Icons, Radix UI. +- **Backend**: Python 3.11, FastAPI, SQLAlchemy, Pydantic. +- **Auth**: Microsoft Entra ID (Azure AD) with PKCE, JWT (HS256). +- **Data**: PostgreSQL, Redis, Qdrant. +- **AI**: OpenAI, Anthropic, Google Gemini APIs. + +## Local Setup + +### Prerequisites +- Docker & Docker Compose +- Node.js 18+ +- Python 3.11 + +### Steps +1. **Clone the repository**. +2. **Configure Environment**: + ```bash + cp .env.example .env + # Edit .env with your local/dev values + ``` +3. 
**Start Infrastructure**: + ```bash + docker-compose up -d db redis qdrant + ``` +4. **Install Backend Dependencies**: + ```bash + cd backend + pip install -r requirements.txt + ``` +5. **Run Backend**: + ```bash + uvicorn main:app --reload --host 0.0.0.0 --port 8000 + ``` +6. **Install Frontend Dependencies**: + ```bash + cd frontend + npm install + ``` +7. **Run Frontend**: + ```bash + npm run dev + ``` +8. **Access**: Frontend at `http://localhost:3000`, Backend at `http://localhost:8000`. + +## Environment Variables +Key variables in `.env`: + +| Variable | Description | Example | +|----------|-------------|---------| +| `ENVIRONMENT` | App mode (`development`, `production`) | `development` | +| `DEBUG` | Enable debug mode | `true` | +| `DATABASE_URL` | PostgreSQL connection string | `postgresql://user:pass@db:5432/nexus_db` | +| `REDIS_URL` | Redis connection string | `redis://redis:6379/0` | +| `QDRANT_URL` | Vector DB URL | `http://qdrant:6333` | +| `ENTRA_CLIENT_ID` | Azure AD App Registration Client ID | `uuid-here` | +| `ENTRA_TENANT_ID` | Azure AD Tenant ID | `uuid-here` | +| `ENTRA_REDIRECT_URI` | Callback URL for Auth | `http://localhost:8000/api/v1/auth/callback` | +| `JWT_SECRET` | Secret for signing JWTs | `secure-string` | +| `OPENAI_API_KEY` | OpenAI API Key | `sk-...` | +| `GOOGLE_API_KEY` | Google Gemini API Key | `AIza...` | +| `ANTHROPIC_API_KEY` | Anthropic API Key | `sk-ant-...` | + +## Key Services & Entry Points + +### Backend Entry Point +- **File**: `backend/main.py` (or similar, based on FastAPI structure) +- **Router**: `/api/v1/` +- **Auth Callback**: `/api/v1/auth/callback` + +### Database Migrations +- Use Alembic for SQLAlchemy migrations. +- Run migrations before starting the backend: + ```bash + alembic upgrade head + ``` + +### Qdrant Collection +- Ensure a collection exists in Qdrant (e.g., `nexus_knowledge`) for storing embeddings. 
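The Qdrant note above only says that a collection must exist. As a minimal sketch, assuming Qdrant's REST API is reachable on the HTTP port and that embeddings are 1536-dimensional with cosine distance (the vector size and distance metric are assumptions that depend on the embedding model in use, and the helper name is hypothetical), the request that creates `nexus_knowledge` could be built with the standard library like this:

```python
import json
import urllib.request


def build_create_collection_request(base_url, name, vector_size, distance="Cosine"):
    """Build (but do not send) a PUT /collections/{name} request for Qdrant's REST API."""
    body = {"vectors": {"size": vector_size, "distance": distance}}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/collections/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed values: adjust the URL and vector size to your deployment and model.
request = build_create_collection_request("http://localhost:6333", "nexus_knowledge", 1536)
print(request.get_method(), request.full_url)

# To actually create the collection against a running Qdrant instance:
# urllib.request.urlopen(request, timeout=10)
```

Building the request separately from sending it keeps the sketch runnable without a Qdrant instance; a real setup script would more likely use the official client library.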
+ +## API Reference +- **Base URL**: `http://localhost:8000/api/v1` +- **Auth**: Bearer Token (JWT) required for most endpoints. + +#### Key Endpoints +- `POST /auth/login`: Initiates Entra ID login flow. +- `GET /auth/callback`: Handles OAuth2 redirect. +- `GET /chat`: Retrieve chat history. +- `POST /chat`: Send message and get RAG response. +- `POST /docs/upload`: Upload document to knowledge base. +- `GET /docs`: List documents. + +## Deployment +### Production +1. **Infrastructure**: Deploy on GCE VM (`optical-web-1`). +2. **Reverse Proxy**: Apache serves as the reverse proxy. +3. **Docker Compose**: Manages backend services (db, redis, qdrant, backend). +4. **Cloud Run**: Handles heavy document processing tasks. +5. **Environment**: Set `ENVIRONMENT=production` and `LOG_LEVEL=INFO`. +6. **Security**: + - Never commit `.env`. + - Use strong passwords for DB. + - Ensure CORS origins are restricted to `https://ai-sandbox.oliver.solutions`. + +## Known Gotchas +1. **CORS Errors**: Ensure `CORS_ORIGINS` in `.env` matches the frontend URL exactly. +2. **DB Health Check**: The `docker-compose.yml` includes health checks for DB, Redis, and Qdrant. If services fail to start, check `pg_isready` and `redis-cli ping`. +3. **Entra ID Redirect URI**: Must exactly match the configured URI in the Azure Portal App Registration. +4. **Qdrant Port**: Qdrant exposes both HTTP (6333) and gRPC (6334) ports. The backend uses HTTP. +5. **JWT Expiration**: Default is 15 minutes for access tokens (configured via `JWT_EXPIRATION_MINUTES`), but the README mentions 8-hour lifetime. Verify if refresh tokens are used for extended sessions. 
+ +## Related +- [[01 Projects/Oliver-ai-bot_2.0/DEVELOPER_MANUAL.md]] \ No newline at end of file diff --git a/01 Projects/enterprise-ai-hub-nexus/USER_MANUAL.md b/01 Projects/enterprise-ai-hub-nexus/USER_MANUAL.md new file mode 100644 index 0000000..59ba875 --- /dev/null +++ b/01 Projects/enterprise-ai-hub-nexus/USER_MANUAL.md @@ -0,0 +1,88 @@ +--- +auto_generated: true +created: 2026-05-18 +manual_updated_at: 2026-05-18 +modified: 2026-05-18 +name: User_Manual +status: active +tags: +- client/oliver +- domain/ai +- domain/security +- status/active +- tech/azure-ad +- tech/fastapi +- tech/postgresql +- tech/react +- tech/redis +- type/sop +type: sop +--- + +# Enterprise AI Hub Nexus - User Manual + +## Overview +Enterprise AI Hub Nexus is a secure, enterprise-grade AI platform designed for OLIVER Agency. It provides intelligent knowledge management, a RAG-based chat interface, and integration with Microsoft 365 productivity tools. The platform allows users to interact with company data securely using natural language, with responses contextualized by department and region. + +## Who Uses It? +- **All Employees**: To access company knowledge, ask questions, and get AI-driven answers. +- **Content Managers**: To upload, manage, and index documents in the knowledge base. +- **Super Admins**: To manage system settings, user roles, and security configurations. + +## How to Access +The platform is accessible via the public URL: **https://ai-sandbox.oliver.solutions** + +### Authentication +- **Provider**: Microsoft Entra ID (Azure AD). +- **Flow**: PKCE (Public Client) SPA flow. No client secret is required for the frontend. +- **Login**: Click "Sign In" and use your corporate Microsoft 365 credentials. +- **Token Lifetime**: JWT tokens last for 8 hours. + +## Main Workflows + +### 1. RAG Chat (Ask a Question) +Nexus answers questions based on your company's knowledge base with source citations. + +1. Navigate to the **Chat** tab. +2. 
Type your question in natural language (e.g., "What is the policy on remote work?"). +3. Click **Send**. +4. Review the AI-generated response, which includes: + - The direct answer. + - Source citations (links to the original documents). + - Language matching (ask in any language; reply in the same language). + +### 2. Personal Assistant (Microsoft 365 Integration) +Access read-only data from your Microsoft 365 account. + +1. Navigate to the **Assistant** tab. +2. Request information such as: + - **Emails**: "Show my latest emails from [Sender]." + - **Calendar**: "What are my meetings today?" + - **Files**: "Find documents in OneDrive named [Keyword]." + - **SharePoint**: "List recent files in [Site Name]." +3. *Note*: Integration is **read-only** for security. + +### 3. Knowledge Base Management (Admins & Content Managers) +Manage the documents that fuel the AI's answers. + +1. Navigate to the **Admin** > **Knowledge Base** section. +2. **Upload**: Click "Upload Document" to add new files (PDF, DOCX, TXT, etc.). +3. **Manage**: Edit metadata, tag by department/region, or delete existing documents. +4. **Re-index**: Trigger a re-indexing process to update the vector database with new/updated content. + +## FAQ + +**Q: Why can't I edit documents in SharePoint via Nexus?** +A: The Microsoft Graph integration is configured for **read-only** access to protect data integrity. + +**Q: Is my data shared externally?** +A: No. All data processing happens within the Enterprise AI Hub Nexus infrastructure (GCE VM + Google Cloud Run). External AI APIs (OpenAI, Anthropic) are used solely for inference, not storage. + +**Q: How is my content scoped?** +A: Content is filtered by **Department** and **Region** to ensure you only see relevant information for your team. + +**Q: What if I forget my password?** +A: Since authentication is handled by Microsoft Entra ID, reset your password via the standard Microsoft 365 password reset flow. 
+ +## Related +- [[01 Projects/Oliver-ai-bot_2.0/USER_MANUAL.md]] \ No newline at end of file diff --git a/01 Projects/ford-gechub-sftp/DEVELOPER_MANUAL.md b/01 Projects/ford-gechub-sftp/DEVELOPER_MANUAL.md new file mode 100644 index 0000000..45a3649 --- /dev/null +++ b/01 Projects/ford-gechub-sftp/DEVELOPER_MANUAL.md @@ -0,0 +1,6 @@ +--- +auto_generated: true +manual_updated_at: 2026-05-18 +modified: 2026-05-18 +--- + diff --git a/01 Projects/ford-gechub-sftp/USER_MANUAL.md b/01 Projects/ford-gechub-sftp/USER_MANUAL.md new file mode 100644 index 0000000..181cb0d --- /dev/null +++ b/01 Projects/ford-gechub-sftp/USER_MANUAL.md @@ -0,0 +1,104 @@ +--- +auto_generated: true +manual_updated_at: 2026-05-18 +modified: 2026-05-18 +--- + +# Ford Asset Pack SFTP Transfer System + +## What This Tool Does + +The Ford Asset Pack SFTP Transfer System automates the movement of asset pack ZIP files from Box cloud storage to Ford's GECHUB SFTP servers. It ensures that assets are correctly distributed across Production (PROD), Education (EDU), and Quality Assurance (QA) environments. + +**Core Capabilities:** +- **Automated Monitoring:** Continuously checks specific Box folders for new asset pack ZIP files. +- **Secure Transfer:** Uploads files to designated SFTP servers using dual-factor authentication (Public Key + Password). +- **Record Keeping:** Archives files in Box after successful transfer and logs all events in a local database for reporting. +- **Alerting:** Sends email notifications via Mailgun for successes, failures, or when errors occur. + +## Who Uses It + +This tool is primarily used by: +- **System Administrators:** To configure and maintain the automation daemon and server configurations. +- **DevOps Engineers:** To manage environment variables, SFTP credentials, and Box API keys. +- **Ford Asset Managers:** To verify transfer reports and ensure assets are correctly deployed to GECHUB environments. 
+ +## How to Access + +The system runs as a background service (daemon) on the host machine. + +1. **Daemon Service:** The main transfer process runs via `systemd`. It is designed to start automatically on boot and restart automatically if it crashes. +2. **Web Report Server:** A separate lightweight web interface allows users to view upload history via a browser. + - **URL:** `http://:5050` (default port). + - **Access:** Browse to the URL to see the upload history dashboard. +3. **Manual Testing:** Developers can manually trigger specific components using CLI scripts. + +## Main Workflows + +### 1. Automated Asset Transfer (Daemon Mode) +This workflow runs continuously in the background. + +1. **Detection:** The daemon polls Box folders (PROD, EDU, QA) every 5 minutes (configurable) for new ZIP files. +2. **Download:** Upon detecting a new file, it downloads the ZIP to local temporary storage. +3. **Upload:** It connects to the GECHUB SFTP server using dual-factor authentication and uploads the file. +4. **Archival:** If the upload is successful, the file is archived in Box. +5. **Logging:** The event is recorded in the local SQLite database (`upload_history.db`). +6. **Notification:** An email is sent to the designated recipients if the operation succeeds or fails. + +### 2. Viewing Upload Reports +Users can view historical data about asset transfers. + +1. **Start the Report Server:** + ```bash + python report_server.py + ``` + Ensure the `UPLOAD_HISTORY_DB` environment variable points to the correct SQLite database path. +2. **Open Browser:** Navigate to `http://localhost:5050`. +3. **Filter Data:** Use the date range selectors to filter uploads by environment (PROD, EDU, QA) and status (Success, Failed, etc.). + +### 3. Generating Email Reports +A weekly email report can be generated manually or via Cron. + +1. 
**Manual Execution:** + ```bash + # Default: Last 7 days + python report_email.py + + # Specific Date Range + python report_email.py --start 2026-04-01 --end 2026-04-14 + ``` +2. **Cron Job:** Configure `cron` to run the script weekly (e.g., every Monday at 8 AM UTC). + +### 4. Manual SFTP Testing +Use the test script to verify credentials and connectivity before running the full workflow. + +1. **Create a Test File:** + ```bash + echo "test" > test.zip + ``` +2. **Run Test Script:** + ```bash + python test_sftp_upload.py --env prod --file test.zip + # Or test all environments: + python test_sftp_upload.py --all --file test.zip + ``` + +## FAQ + +**Q: Why is my file not uploading?** +A: Check the `report_email.log` or system logs for errors. Common causes include: +- Incorrect SFTP credentials in `.env`. +- Box API rate limiting. +- Temporary SFTP server outage (the system retries automatically with exponential backoff). + +**Q: How are emails handled during outages?** +A: The system uses rate-limited alerts. If Mailgun or the SFTP server is unavailable, it prevents email spam by coalescing notifications until the service recovers. + +**Q: Where is the data stored?** +A: All upload history is stored in a local SQLite database named `upload_history.db` located in the project directory. + +**Q: Can I change the polling frequency?** +A: Yes, modify the `POLL_INTERVAL_SECONDS` variable in your `.env` file. + +**Q: What happens if the server reboots?** +A: The daemon is configured for `systemd` integration, so it will automatically restart after a reboot. 
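
The FAQ above says the system retries automatically with exponential backoff during a temporary SFTP outage. The following is a minimal sketch of that technique, not the daemon's actual code: `retry_with_backoff`, its parameters, and the delay values are hypothetical names and defaults chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,          # hypothetical default, not taken from the project
    base_delay: float = 1.0,        # seconds to wait after the first failure (assumed)
    max_delay: float = 60.0,        # upper bound on any single wait (assumed)
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts:
                raise
            # Wait base_delay, 2x, 4x, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay)
```

In a real transfer loop, `operation` would wrap the SFTP upload call. The `sleep` parameter is injectable so the schedule can be checked without actually waiting.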
\ No newline at end of file diff --git a/01 Projects/hm_ems_report/DEVELOPER_MANUAL.md b/01 Projects/hm_ems_report/DEVELOPER_MANUAL.md new file mode 100644 index 0000000..cfd453e --- /dev/null +++ b/01 Projects/hm_ems_report/DEVELOPER_MANUAL.md @@ -0,0 +1,161 @@ +--- +auto_generated: true +created: 2026-05-18 +manual_updated_at: 2026-05-18 +modified: 2026-05-18 +name: Developer_Manual +status: active +tags: +- client/hm +- domain/devops +- tech/docker +- tech/flask +- tech/python +- type/sop +type: sop +--- + +# H&M EMS Product Review Tool — Developer Manual + +## 1. Architecture Overview + +The application is a Flask-based Python web service that serves a single-page application (SPA) interface. It is designed to be deployed behind an Apache reverse proxy. + +- **Backend**: Flask (Python) handles API requests, JSON data manipulation, file locking, and authentication. +- **Frontend**: A self-contained HTML/JS/CSS interface (served as `static/index.html`) manages the UI, state, and API calls. +- **Data Storage**: + - **Master JSON**: Campaign data is read from and written to JSON files in `MASTER_JSON_DIR`. + - **Changelog**: Approval/unapproval actions are logged to a separate changelog file. + - **Images**: Static campaign images are served via the filesystem path `IMAGE_BASE_PATH`. + +## 2. Tech Stack + +- **Language**: Python 3.x +- **Web Framework**: Flask +- **Proxy Handling**: Werkzeug `ProxyFix` (for Apache/NGINX reverse proxy support) +- **Concurrency**: Threading (for file locks) +- **Frontend**: Vanilla JavaScript, HTML5, CSS3 (no external JS frameworks) + +## 3. Local Setup + +### 1. Clone and Install +```bash +# Clone the repository +git clone +cd hm-ems-product-review-tool + +# Install dependencies +pip install -r requirements.txt +``` + +### 2. Configure Environment +Copy `.env.example` to `.env` and adjust paths: +```bash +cp .env.example .env +``` + +### 3. Run Locally +```bash +python3 server.py +``` +Open `http://localhost:5000` in your browser. 
+ +### 4. Run with Gunicorn (Production/Staging) +```bash +gunicorn server:app --bind 0.0.0.0:5000 --workers 4 +``` + +## 4. Environment Variables + +| Variable | Default Value | Description | +|----------|---------------|-------------| +| `MASTER_JSON_DIR` | `./Master_Json` | Directory containing campaign JSON files (e.g., `1022A.json`). | +| `IMAGE_BASE_PATH` | `./campaign_images` | Root directory for product images. Structure: `{year}/{campaign}/Automation_LR/`. | +| `PORT` | `5000` | Port the Flask server listens on. | +| `WORKERS` | `4` | Number of Gunicorn workers. | +| `SECRET_KEY` | `hm-ems-secret-key-change-in-production` | Flask secret key for session management. **Must be changed in production.** | +| `DEPLOY_USER` | `vadym.samoilenko` | Deployment user (metadata). | +| `DEPLOY_GROUP` | `vadym.samoilenko` | Deployment group (metadata). | +| `WEB_DIR` | `/var/www/html/hm-ems-report` | Apache web root directory. | +| `APP_DIR` | `/opt/hm_ems_report` | Application root directory. | +| `DATA_DIR` | `/opt/hm-ems-data` | Data root directory. | + +## 5. Key Services & Entry Points + +### `server.py` +- **Entry Point**: `if __name__ == '__main__': app.run()` +- **Authentication**: + - Hardcoded credentials in `server.py` (`admin3_M` / `Pa$$w0rd2026_!`). + - Uses Flask sessions and a `login_required` decorator. + - Handles AJAX 401 responses for API requests. +- **API Endpoints**: + - `GET /api/campaigns`: Returns list of available JSON files. + - `POST /api/load_campaign`: Loads JSON data for a selected campaign, mapping images. + - `POST /api/approve`: Saves edits to the master JSON, locks fields, and logs the change. + - `GET /images/`: Serves campaign images from the filesystem. + +### `html_generator.py` +- **Static Content Generator**: + - Contains `LANGUAGE_DISPLAY_NAMES` mapping for UI labels. + - Provides helper functions `_get_campaign_prefix`, `_get_main_image_filename`, `_get_all_image_filenames` to resolve image paths based on campaign structure. 
+ - *Note*: While named `html_generator`, it primarily serves as a utility module for image/path logic in this context. + +### `static/index.html` +- **Frontend SPA**: + - Handles campaign switching, language selection, inline editing, and approval UI. + - Communicates with backend via `fetch` API calls. + - Implements image preview modal logic. + +## 6. API Reference + +### `GET /api/campaigns` +Returns a JSON list of available campaign files. +```json +["1022A.json", "2023.json"] +``` + +### `POST /api/load_campaign` +Request Body: +```json +{"filename": "1022A.json"} +``` +Returns the full campaign JSON data with pre-computed image paths. + +### `POST /api/approve` +Request Body: +```json +{ + "filename": "1022A.json", + "article_id": "ART123456", + "language": "de-de", + "field": "name", + "value": "Neuer Name" +} +``` +Returns `{"status": "success", "message": "Approved"}` or `{"status": "error", "message": "..."}`. + +## 7. Deployment + +### Apache Configuration +The application is designed to run behind Apache with mod_wsgi. Key considerations: +1. Ensure the `APPLICATION_ROOT` is set in `server.py` (`/hm-ems-report`). +2. Configure Apache VirtualHost to proxy requests to the Gunicorn socket/port. +3. Ensure the `DEPLOY_USER` has read/write permissions to `MASTER_JSON_DIR` and `IMAGE_BASE_PATH`. + +### File Permissions +```bash +chown -R vadym.samoilenko:vadym.samoilenko /opt/hm-ems-data +chmod -R 755 /opt/hm-ems-data +``` + +## 8. Known Gotchas + +1. **Session Timeout**: Flask sessions rely on the `SECRET_KEY`. If changed, all users will be logged out. +2. **Image Path Resolution**: Image paths are relative to `IMAGE_BASE_PATH`. If the directory structure (`year/campaign/Automation_LR`) changes, `_get_all_image_filenames` must be updated. +3. **JSON Concurrency**: The `file_lock` in `server.py` uses a simple threading lock. For high-concurrency environments, consider using a database or more robust file locking mechanisms. +4. 
**Hardcoded Credentials**: The login credentials are hardcoded in `server.py`. For production, migrate to a proper authentication system (e.g., LDAP, OAuth). +5. **Large Campaigns**: Loading very large JSON files may cause timeout issues. Consider pagination or lazy loading in the frontend for campaigns with thousands of articles. +6. **Proxy Headers**: The `ProxyFix` middleware is configured for Apache (`x_for=1, x_proto=1, x_host=1, x_prefix=1`). Adjust these if deploying behind NGINX or other proxies. + +## Related +- [[03 Resources/SOPs/DevOps/Deployment/Generic Flask Guide]] \ No newline at end of file diff --git a/01 Projects/hm_ems_report/USER_MANUAL.md b/01 Projects/hm_ems_report/USER_MANUAL.md new file mode 100644 index 0000000..1c3cb6e --- /dev/null +++ b/01 Projects/hm_ems_report/USER_MANUAL.md @@ -0,0 +1,109 @@ +--- +auto_generated: true +created: 2026-05-18 +manual_updated_at: 2026-05-18 +modified: 2026-05-18 +name: User_Manual +status: active +tags: +- client/hm +- status/active +- tech/flask +- tech/python +- type/sop +type: sop +--- + +# H&M EMS Product Review Tool — User Manual + +## 1. What This Tool Does + +The H&M EMS Product Review Tool is a web-based application that allows client teams to review, edit, and approve product data (names and prices) for multiple international markets. It reads campaign data from JSON files and displays it in a structured, multi-language grid. Key capabilities include: + +- **Reviewing Product Data**: View product names and prices for English (GB) alongside up to 4 other target languages simultaneously. +- **Inline Editing**: Modify product names or prices directly in the interface for any selected language. +- **Approval Workflow**: Approve changes with a single click. Approved items are locked, saved to the master JSON, and logged. +- **Visual Verification**: View product images with a click-to-enlarge feature to verify visual context. 
+- **Campaign Switching**: Easily switch between different JSON campaign files. + +## 2. Who Uses It + +This tool is designed for: +- **Product Data Analysts**: Who need to verify translation accuracy and pricing consistency. +- **Localisation Managers**: Who oversee the approval of content for specific regions. +- **Marketing Teams**: Who review campaign imagery and product positioning. + +## 3. How to Access + +### Prerequisites +- A web browser (Chrome, Firefox, Edge, or Safari). +- Network access to the deployed server. + +### Login +1. Navigate to the application URL (e.g., `http://your-server/hm-ems-report`). +2. You will be presented with a login screen. +3. Enter your credentials: + - **Username**: `admin3_M` + - **Password**: `Pa$$w0rd2026_!` + *Note: In production, these credentials should be changed to match your specific environment settings.* +4. Click **Login** to access the dashboard. + +## 4. Main Workflows + +### Workflow 1: Selecting a Campaign +1. On the main dashboard, locate the **Campaign Selector** (usually a dropdown or list at the top of the page). +2. Click to reveal available JSON campaign files from the `Master_Json/` directory. +3. Select the desired campaign (e.g., `1022A.json` or `2023.json`). +4. The product table will load with data from that specific campaign. + +### Workflow 2: Reviewing and Editing Product Data +1. **Select Languages**: + - Use the language selector to choose up to 4 target languages (e.g., `de-de`, `fr-fr`, `es-es`). + - The **English (GB)** column is fixed and always visible as a reference. +2. **Review Data**: + - Scroll through the table to inspect Product Names and Prices for the selected languages. + - Compare translations against the English reference. +3. **Edit**: + - Click on any editable cell (Product Name or Price) for a non-English language. + - Type the new value. + - Press `Enter` or click outside the cell to save the local change. + +### Workflow 3: Approving Changes +1. 
After verifying and editing the necessary fields: + - Locate the **Approval Column** (typically contains a checkmark icon or button). + - Click the **Tick/Approve button** for the specific article/language pair. +2. **Confirmation**: + - The row or cell will visually indicate it has been approved (e.g., greyed out or locked). + - The change is automatically saved to the master JSON file. + - An entry is added to the changelog file. +3. **Undo**: + - If an approval was made in error, click the **Unapprove** button to unlock the field and revert to the editable state. + +### Workflow 4: Viewing Product Images +1. In each row, locate the **Image Column**. +2. **Multiple Images**: + - If a product has multiple campaign images, thumbnails will be displayed. +3. **Preview**: + - Click on any image thumbnail. + - A popup overlay will appear showing the full-size image for detailed inspection. +4. Close the popup by clicking the **X** or clicking outside the image. + +## 5. FAQ + +**Q: What happens if I leave the page without approving edits?** +A: Edits made in the text fields are only saved to the server when you click the **Approve** button. If you leave the page, unsaved changes will be lost. + +**Q: Can I edit the English (GB) column?** +A: No. The English (GB) column is a fixed reference for comparison and cannot be edited directly in this tool. + +**Q: How do I know my changes have been saved?** +A: Once you click the Approve button, the cell locks, and the row/image typically updates visually to confirm the action. The change is also recorded in the backend changelog. + +**Q: What if the images don't load?** +A: Ensure the `IMAGE_BASE_PATH` is correctly configured in the environment. Check the console for any 404 errors. The tool looks for images in `{year}/{campaign}/Automation_LR/`. + +**Q: Can I select more than 4 languages?** +A: No, the tool is optimized for side-by-side comparison of up to 4 target languages plus the English reference to maintain readability. 
+ +## Related +- [[01 Projects/hm_ems_report/DEVELOPER_MANUAL.md]] \ No newline at end of file diff --git a/01 Projects/oliver-ai-assistant/DEVELOPER_MANUAL.md b/01 Projects/oliver-ai-assistant/DEVELOPER_MANUAL.md new file mode 100644 index 0000000..6781d64 --- /dev/null +++ b/01 Projects/oliver-ai-assistant/DEVELOPER_MANUAL.md @@ -0,0 +1,214 @@ +--- +auto_generated: true +created: 2026-05-18 +manual_updated_at: 2026-05-18 +modified: 2026-05-18 +name: Developer_Manual +status: active +tags: +- client/oliver +- domain/ai +- domain/devops +- tech/celery +- tech/docker +- tech/fastapi +- tech/llamaindex +- tech/nextjs +- tech/postgresql +- tech/python +- tech/react +- tech/redis +- tech/typescript +- type/project +type: project +--- + +# Oliver AI Assistant - Developer Manual + +## Architecture Overview + +Oliver AI Assistant is a full-stack application split into two main components: + +1. **Frontend**: Built with Next.js, React, and Tailwind CSS. Handles the UI, state management, and API interactions. +2. **Backend**: Built with FastAPI (Python). Handles business logic, database interactions, and AI service orchestration. + +### High-Level Flow + +1. **Client** (Next.js) sends HTTP/WebSocket requests to **Backend** (FastAPI). +2. **Backend** processes requests using services like `DocumentProcessor`, `EmbeddingService`, and `VectorStoreFactory`. +3. **Data Layer** uses PostgreSQL for relational data and a Vector Store (e.g., pgvector) for embeddings. +4. **Async Tasks** (Celery) handle heavy lifting like SharePoint sync and file processing in the background via Redis. + +## Tech Stack + +- **Frontend**: Next.js 14+, React, TypeScript, Tailwind CSS, Axios/WebSockets. +- **Backend**: Python 3.11+, FastAPI, SQLAlchemy (Async), Pydantic, Celery, Redis. +- **Database**: PostgreSQL (Main DB + Vector Extensions). +- **Object Storage**: MinIO (for temporary file storage). +- **Task Queue**: Celery + Redis (Broker/Backend). +- **Migrations**: Alembic. 
+ +## Local Setup + +### Prerequisites + +- Python 3.11+ +- Node.js 18+ +- PostgreSQL 14+ +- Redis 7+ +- Docker & Docker Compose (recommended for infrastructure) + +### Step 1: Clone and Install Dependencies + +```bash +# Clone the repository +git clone +cd oliver-ai-assistant + +# Install Backend dependencies +pip install -r backend/requirements.txt + +# Install Frontend dependencies +cd frontend +npm install +cd .. +``` + +### Step 2: Environment Variables + +Copy the `.env.example` files into place for the backend (`backend/.env`) and the frontend (`frontend/.env.local`). Fill in required values: + +```bash +cp backend/.env.example backend/.env +cp frontend/.env.example frontend/.env.local +``` + +Key Variables: +- `DATABASE_URL`: PostgreSQL connection string. +- `REDIS_URL`: Redis connection string (default `redis://localhost:6379/0`). +- `MINIO_ROOT_USER`, `MINIO_ROOT_PASSWORD`: MinIO credentials. +- `OPENAI_API_KEY`: API key for embedding and chat models. +- `SHAREPOINT_CLIENT_ID`, `SHAREPOINT_CLIENT_SECRET`: SharePoint credentials. 
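
Before starting the services, it can help to confirm that the key variables listed above are actually set. The sketch below is one way to do that, under assumptions: `REQUIRED_VARS` and `load_settings` are hypothetical names, and treating every listed variable except `REDIS_URL` as mandatory is a guess rather than the project's documented behaviour.

```python
import os
from typing import Dict, Mapping, Optional

# Assumption: every key variable except REDIS_URL must be set explicitly.
REQUIRED_VARS = (
    "DATABASE_URL",
    "MINIO_ROOT_USER",
    "MINIO_ROOT_PASSWORD",
    "OPENAI_API_KEY",
    "SHAREPOINT_CLIENT_ID",
    "SHAREPOINT_CLIENT_SECRET",
)

# Default taken from the key variables list above.
DEFAULT_REDIS_URL = "redis://localhost:6379/0"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the key variables as a dict, failing fast if any are missing."""
    env = os.environ if env is None else env
    # Empty strings count as missing, since a blank value in .env is usually a mistake.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    settings = {name: env[name] for name in REQUIRED_VARS}
    settings["REDIS_URL"] = env.get("REDIS_URL", DEFAULT_REDIS_URL)
    return settings
```

Calling `load_settings()` with no arguments reads from the process environment; passing a plain dict makes the check easy to exercise in tests.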
+ +### Step 3: Start Infrastructure + +```bash +docker-compose up -d postgres redis minio +``` + +### Step 4: Database Migrations + +```bash +# Run main database migrations +alembic upgrade head + +# Run vector database migrations +cd backend +alembic -c alembic_vectors.ini upgrade head +``` + +### Step 5: Run Services + +#### Backend +```bash +cd backend +uvicorn app.main:app --reload --host 0.0.0.0 --port 8000 +``` + +#### Celery Worker +```bash +cd backend +celery -A app.worker.celery_app worker --loglevel=info -Q default +``` + +#### Frontend +```bash +cd frontend +npm run dev +``` + +## Environment Variables + +| Variable | Description | Required | +|---|---|---| +| `ENV` | Environment mode (development/production) | Yes | +| `DATABASE_URL` | PostgreSQL URI | Yes | +| `REDIS_URL` | Redis URI for Celery/Cache | Yes | +| `MINIO_ENDPOINT` | MinIO host | Yes | +| `MINIO_BUCKET` | Bucket name for file storage | Yes | +| `OPENAI_API_KEY` | Key for AI models | Yes | +| `SECRET_KEY` | JWT Signing Secret | Yes | +| `CORS_ORIGINS` | Comma-separated list of allowed origins | Yes | + +## Key Services & Entry Points + +### Backend (`backend/app/`) + +- **`app/main.py`**: FastAPI application entry point. Manages lifespan (startup/shutdown), middleware setup, and route registration. +- **`app/config.py`**: `RuntimeConfig` class handles loading configuration from DB/Redis with fallback to `.env`. Ensures thread-safe caching. +- **`app/worker.py`**: Celery configuration. Defines tasks like `sync_all_sharepoint`, `cleanup_temp_files`, and scheduled beat jobs. +- **`app/core/document_processor.py`**: Logic for chunking, cleaning, and preparing documents for embedding. +- **`app/core/embedding_service.py`**: Interfaces with AI providers (e.g., OpenAI) to generate vector embeddings. +- **`app/core/vector_store_factory.py`**: Factory pattern to switch between vector store implementations (e.g., pgvector, Pinecone). 
+- **`app/services/sharepoint_sync.py`**: Handles OAuth2 authentication with Microsoft Graph API and folder synchronization logic. + +### Frontend (`frontend/src/`) + +- **`app/`**: Next.js App Router pages. `app/chat/page.tsx` is the main AI interface. +- **`components/`**: Reusable UI components (e.g., `ChatInput`, `MessageBubble`). +- **`lib/api.ts`**: API client functions for making requests to the FastAPI backend. + +## API Reference + +The API follows RESTful conventions and WebSocket for real-time chat. + +### Authentication +- **Endpoint**: `POST /api/auth/login` +- **Body**: `{ "email": "user@example.com", "password": "pass" }` +- **Response**: `{ "access_token": "...", "refresh_token": "..." }` + +### Chat +- **Endpoint**: `WebSocket /ws/chat` or `POST /api/chat` +- **Payload**: `{ "message": "What is Oliver?", "session_id": "..." }` +- **Response**: `{ "reply": "Oliver is...", "tokens_used": 120 }` + +### Admin +- **Endpoint**: `GET /api/admin/users` +- **Auth**: Admin Bearer Token required. +- **Response**: List of user objects. + +## Deployment + +### Backend +The backend is configured for standalone deployment via `next.config.js` and `pyproject.toml`. Use Docker: + +```dockerfile +FROM python:3.11-slim +WORKDIR /app +COPY backend/. /app +RUN pip install --no-cache-dir -r requirements.txt +CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"] +``` + +### Frontend +Build the static site: + +```bash +cd frontend +npm run build +npm run export # Or next build for standalone output +``` + +Deploy via Nginx or a static site host (Vercel/Netlify) for the frontend, and a cloud provider (AWS ECS, Azure App Service) for the backend. + +## Known Gotchas + +1. **Vector Indexing**: Ensure the `pgvector` extension is enabled in PostgreSQL before running migrations. If the database is empty, initial sync tasks may fail silently if not monitored. +2. **Celery Beat**: The `beat_schedule` in `app/worker.py` runs every 4 hours for sync. 
Ensure the Celery Beat container is running alongside the worker. +3. **Redis Limits**: Default Redis memory limits may need adjustment if storing large numbers of session tokens or config caches. +4. **CORS**: When deploying, update `CORS_ORIGINS` in the backend `.env` to include your frontend's domain to avoid preflight failures. +5. **Timezones**: Celery tasks use UTC. Ensure server time is synchronized or use `enable_utc=True` in Celery config. + +## Related +- [[01 Projects/Oliver-ai-bot_2.0/DEVELOPER_MANUAL.md]] +- [[01 Projects/enterprise-ai-hub-nexus/DEVELOPER_MANUAL.md]] \ No newline at end of file diff --git a/01 Projects/oliver-ai-assistant/USER_MANUAL.md b/01 Projects/oliver-ai-assistant/USER_MANUAL.md new file mode 100644 index 0000000..fbfc3aa --- /dev/null +++ b/01 Projects/oliver-ai-assistant/USER_MANUAL.md @@ -0,0 +1,88 @@ +--- +auto_generated: true +created: 2026-05-18 +manual_updated_at: 2026-05-18 +modified: 2026-05-18 +name: User_Manual +status: active +tags: +- client/oliver +- domain/ai +- tech/azure-ad +- tech/rag +- type/sop +type: sop +--- + +# Oliver AI Assistant - User Manual + +## What This Tool Does + +Oliver AI Assistant is an intelligent workplace assistant designed to help users manage information, automate workflows, and retrieve knowledge efficiently. Key capabilities include: + +- **AI-Powered Chat**: Natural language interface for asking questions about your documents and data. +- **SharePoint Integration**: Automatic synchronization of files and folders from Microsoft SharePoint to a centralized knowledge base. +- **Vector Search**: Semantic search capabilities allowing you to find relevant information based on meaning, not just keywords. +- **Document Processing**: Upload, process, and index documents for quick retrieval. +- **Productivity Tools**: Assists with drafting, summarizing, and organizing tasks. 
+ +## Who Uses It + +- **Employees/Staff**: To quickly find answers to questions, access company documents, and automate repetitive data tasks. +- **Knowledge Managers**: To oversee the synchronization of content from SharePoint and ensure data integrity. +- **Administrators**: To manage users, configure system settings, and monitor system health and analytics. + +## How to Access + +1. **URL**: Navigate to the deployed Oliver AI Assistant URL provided by your IT department (e.g., `https://oliver.yourcompany.com`). +2. **Login**: Use your corporate SSO credentials to log in. +3. **Interface**: You will see a main dashboard with: + - A chat input area at the bottom. + - A sidebar showing recent conversations or shared resources. + - A navigation menu for Admin settings (if permitted). + +## Main Workflows + +### Workflow 1: Chatting with Oliver + +1. **Start a Conversation**: Type a question or command in the chat box at the bottom of the screen. +2. **Send Message**: Press `Enter` or click the Send arrow. +3. **Wait for Response**: Oliver processes your request, potentially querying the vector database or API services. +4. **Review Results**: Read the AI-generated response, which may include text, links, or summaries. +5. **Follow Up**: Ask clarifying questions to refine the answer. + +### Workflow 2: Sharing and Syncing Documents + +1. **Upload**: Click the "Upload" button in the chat or dedicated documents section. +2. **Select Files**: Choose PDFs, DOCX, TXT, or other supported formats from your local device. +3. **Process**: The system will automatically chunk and embed the document. +4. **Query**: Ask Oliver about the content immediately after processing. Example: "What are the key points in the document I just uploaded?" + +### Workflow 3: Managing SharePoint Sync (Admins) + +1. **Navigate to Admin**: Click the Admin icon in the sidebar. +2. **Go to Sync Settings**: Select "SharePoint Sync" from the admin menu. +3. 
**Configure Folder**: Enter the SharePoint folder URL and credentials if not already set. +4. **Trigger Sync**: Click "Sync Now" to force a manual synchronization. +5. **Monitor Status**: Check the sync log for errors or completion status. + +## FAQ + +### Q: Why is Oliver taking so long to respond? +A: Complex queries involving large document sets or heavy processing may take longer. For immediate responses, try rephrasing your question to be more specific. + +### Q: Can I delete a chat history? +A: Yes, navigate to the conversation list in the sidebar, hover over the conversation, and click the delete icon. + +### Q: Is my data secure? +A: Yes, Oliver AI Assistant uses encrypted connections (HTTPS) and stores data in secure, isolated environments. Sensitive configurations are encrypted at rest. + +### Q: How do I troubleshoot sync errors? +A: Check the "Sync Logs" in the Admin panel. If errors persist, ensure your SharePoint credentials are valid and the folder permissions are correctly set. + +### Q: What file types are supported? +A: Currently, we support PDF, DOCX, TXT, MD, and CSV files. Contact support for additional format requests. 
+ +## Related +- [[01 Projects/Oliver-ai-bot_2.0/USER_MANUAL.md]] +- [[01 Projects/enterprise-ai-hub-nexus/USER_MANUAL.md]] \ No newline at end of file diff --git a/01 Projects/pdf-accessibility/DEVELOPER_MANUAL.md b/01 Projects/pdf-accessibility/DEVELOPER_MANUAL.md new file mode 100644 index 0000000..34686a3 --- /dev/null +++ b/01 Projects/pdf-accessibility/DEVELOPER_MANUAL.md @@ -0,0 +1,159 @@ +--- +auto_generated: true +created: 2026-05-18 +manual_updated_at: 2026-05-18 +modified: 2026-05-18 +name: Developer_Manual +status: active +tags: +- client/oliver +- domain/accessibility +- domain/security +- tech/azure-ad +- tech/claude +- tech/docker +- tech/postgresql +- tech/python +type: sop +--- + +# PDF Accessibility Checker Developer Manual + +## Architecture Overview + +The application is a microservice-oriented system comprising: +1. **Web Frontend**: HTML/CSS/JS interface for upload, visualization, and result inspection. +2. **Backend API (Python)**: Handles file processing, AI analysis coordination, database interactions, and API authentication. +3. **Database (PostgreSQL)**: Stores job metadata, analysis results, and user configuration. +4. **External Services**: + - **Anthropic Claude 3.5 Sonnet**: Used for AI-powered image alt-text generation and analysis. + - **Google Cloud Vision**: Optional OCR and text-in-image detection. + - **Cloud Run**: Optional external service for offloading PDF processing tasks. + +## Tech Stack + +- **Backend**: Python 3, FastAPI/Flask (implied by REST API structure), veraPDF for validation. +- **Frontend**: Vanilla JavaScript, HTML5, CSS3 (Tailwind-like utility classes). +- **Database**: PostgreSQL 16. +- **Containerization**: Docker & Docker Compose. +- **Authentication**: Azure AD (MSAL) for UI access, API Key for REST access. +- **AI**: Anthropic API, Google Cloud Vision API. + +## Local Setup + +### Prerequisites +- Docker and Docker Compose. +- Python 3.9+ (for local script execution without Docker). 
+- API Keys for Anthropic and optionally Google Cloud. + +### Step-by-Step Setup + +1. **Clone the Repository**: + ```bash + git clone + cd pdf-accessibility-checker + ``` + +2. **Environment Configuration**: + ```bash + cp .env.example .env + ``` + Edit `.env` and fill in: + - `ANTHROPIC_API_KEY`: Required for AI image analysis. + - `GOOGLE_API_KEY` or `GOOGLE_APPLICATION_CREDENTIALS`: Optional, for enhanced OCR. + - `DB_PASSWORD`: Set a strong password for PostgreSQL. + - `AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, `AZURE_REDIRECT_URI`: For Azure AD login. + - `CLOUD_RUN_URL`: Set if deploying to GCP Cloud Run; leave empty for local processing. + +3. **Start Services**: + ```bash + docker-compose up -d + ``` + This starts the web app on port 8000 and PostgreSQL on port 5432. + +4. **Verify Installation**: + - Web UI: `http://localhost:8000` + - API: `http://localhost:8000/api/` (requires API key in headers). + +5. **Run Tests**: + ```bash + # Activate virtual environment first if not using Docker + pip install -r requirements.txt + pytest + ``` + There are 31 automated tests covering core functionality. + +## Environment Variables + +| Variable | Required | Description | +|----------|----------|-------------| +| `ANTHROPIC_API_KEY` | Yes | API key for Anthropic Claude. | +| `GOOGLE_API_KEY` | No | Google Cloud Vision API key (optional). | +| `GOOGLE_APPLICATION_CREDENTIALS` | No | Path to Google credentials JSON (alternative to API key). | +| `DEV_MODE` | No | Set to `true` to bypass Azure AD auth for local dev. | +| `DB_HOST` | Yes | PostgreSQL host (default `postgres`). | +| `DB_PORT` | Yes | PostgreSQL port (default `5432`). | +| `DB_NAME` | Yes | Database name (default `pdf_checker`). | +| `DB_USER` | Yes | Database user (default `pdf_checker`). | +| `DB_PASSWORD` | Yes | Database password. | +| `CLOUD_RUN_URL` | No | URL for external Cloud Run service. | +| `GCP_SA_KEY_PATH` | No | Path to GCP Service Account key for Cloud Run auth. 
| +| `GCS_BUCKET_NAME` | No | GCS bucket for storing page images. | +| `RETENTION_HOURS` | No | Hours to keep uploaded PDFs (default 24). | +| `RESULTS_RETENTION_HOURS` | No | Hours to keep result JSONs (default 720). | +| `AZURE_TENANT_ID` | Yes | Azure AD Tenant ID. | +| `AZURE_CLIENT_ID` | Yes | Azure AD Client ID. | +| `AZURE_REDIRECT_URI` | Yes | Azure AD Redirect URI. | + +## Key Services & Entry Points + +- **`js/batch.js`**: Handles multi-file drag-and-drop and upload logic. +- **`js/page-viewer.js`**: Renders the visual page inspector with SVG markers for issues. +- **`js/results.js`**: Displays scores, severity counts, and manages issue filtering/dismissal logic. +- **`js/upload.js`**: Handles single-file upload and polling for analysis status. +- **`docker-compose.yml`**: Defines the web and postgres services. +- **`.env.example`**: Template for configuration. + +## API Reference + +### Authentication +- **UI**: Uses Azure AD via MSAL. +- **REST**: Requires an API key in the `Authorization` header (e.g., `Bearer `). + +### Endpoints +- `POST /upload`: Uploads a PDF for analysis. Returns a `job_id`. +- `GET /results/{job_id}`: Retrieves analysis results. Poll this endpoint until `status` is `completed`. +- `GET /api/image?job_id=&page=`: Retrieves the rendered page image for the visual inspector. + +### Payload Example (Results) +```json +{ + "accessibility_score": 85, + "severity_counts": { "critical": 0, "error": 2, "warning": 5, "info": 1, "success": 10 }, + "issues": [ ... ], + "page_images": { "1": "https://...", "2": "https://..." }, + "wcag_compliance": { "1.1.1": "A", "1.4.3": "AA" }, + "dismissed_indices": [], + "score_breakdown": { "adjusted": false } +} +``` + +## Deployment + +1. **Environment**: Set `DEV_MODE=false` and use production DB credentials. +2. **Secrets**: Never commit `.env`. Use secrets management (e.g., AWS Secrets Manager, GCP Secret Manager) for API keys. +3. 
**Cloud Run**: Deploy the Python service to Cloud Run using `CLOUD_RUN_URL` and `GCP_SA_KEY_PATH` for authentication. +4. **Database**: Use a managed PostgreSQL instance (e.g., Cloud SQL, Azure Database). +5. **Storage**: Configure a GCS bucket for image storage if using Cloud Run. + +## Known Gotchas + +1. **Duplicate Files**: The frontend (`js/batch.js`) prevents adding duplicate files based on name and size. Ensure the backend also validates duplicates. +2. **AI Costs**: AI analysis via Anthropic and Google Cloud incurs costs. Monitor usage. +3. **Polling**: The frontend polls for analysis status. Apply exponential backoff to the polling interval if processing takes a long time. +4. **Retention**: Ensure `RETENTION_HOURS` and `RESULTS_RETENTION_HOURS` are set correctly to manage storage costs. +5. **Auth Bypass**: Never leave `DEV_MODE=true` in production. +6. **Imports**: A critical import bug was previously fixed in the remediation module. Ensure all imports are relative and correct if modifying Python code. + +## Related +- [[01 Projects/ppt-tool/DEVELOPER_MANUAL.md]] \ No newline at end of file diff --git a/01 Projects/pdf-accessibility/USER_MANUAL.md b/01 Projects/pdf-accessibility/USER_MANUAL.md new file mode 100644 index 0000000..4b172f9 --- /dev/null +++ b/01 Projects/pdf-accessibility/USER_MANUAL.md @@ -0,0 +1,87 @@ +--- +auto_generated: true +created: 2026-05-18 +manual_updated_at: 2026-05-18 +modified: 2026-05-18 +name: User_Manual +status: active +tags: +- domain/accessibility +- domain/ai +- type/sop +type: sop +--- + +# PDF Accessibility Checker User Manual + +## What This Tool Does + +The **PDF Accessibility Checker** is an enterprise-grade tool that validates PDF documents against **WCAG 2.1 Level A & AA** standards. It combines traditional PDF structural analysis with AI-powered image recognition to provide comprehensive accessibility compliance reports. + +### Key Capabilities +- **Automated WCAG Validation**: Checks 30+ accessibility criteria automatically. 
+- **AI-Powered Image Analysis**: Uses AI to analyze images within PDFs for missing or inappropriate alt text. +- **Visual Inspector**: View specific pages with visual markers indicating exactly where accessibility issues exist. +- **Auto-Remediation**: Provides suggestions and automated fixes for common issues. +- **Batch Processing**: Upload and analyze multiple PDFs simultaneously. +- **Compliance Reporting**: Generates detailed scores, severity counts (Critical, Error, Warning, Info, Success), and WCAG conformance levels. + +## Who Uses It +- **Document Creators**: Ensure their PDFs are accessible before distribution. +- **Compliance Officers**: Verify that documents meet regulatory standards (WCAG 2.1). +- **Quality Assurance Teams**: Integrate accessibility checks into their document publishing workflows. + +## How to Access + +### Web Interface +Access the application via your organization's internal URL. The interface provides: +1. **Upload Section**: Drag-and-drop or click to upload single or batch PDF files. +2. **Results Dashboard**: Displays accessibility scores, severity breakdowns, and detailed issue lists. +3. **Visual Page Inspector**: Interactive viewer to see issues mapped to specific pages. + +### API Access +The tool also exposes a REST API for integration with other systems. API access requires authentication via an API key provided in the request headers. + +### Command Line Interface +For advanced users, the application can be run via command line. See the developer section for setup instructions. + +## Main Workflows + +### Workflow 1: Single PDF Analysis +1. **Upload**: Drag and drop a PDF file into the upload area or click to browse. +2. **Processing**: Wait for the analysis to complete (status updates in real-time). +3. **Review**: View the overall accessibility score and severity breakdown. +4. **Inspect**: Use the "Visual Page Inspector" to navigate through pages and see highlighted issues. +5. 
**Remediate**: Review the list of issues, filter by severity or WCAG criterion, and apply auto-remediation suggestions where applicable. +6. **Download**: Export the full accessibility report (JSON/HTML) if needed. + +### Workflow 2: Batch PDF Analysis +1. **Switch Mode**: Click the "Batch Upload" tab to switch from single-file mode. +2. **Select Files**: Drag and drop up to 10 PDF files (max 50MB each) into the batch upload area. +3. **Process**: Click "Analyze All" to start processing. Track per-file status in the batch list. +4. **Review**: Iterate through each file's results using the same interface as single-file analysis. + +### Workflow 3: Filtering and Dismissing Issues +1. **Filter Issues**: Use the filter dropdown to show only Critical, Error, Warning, or Info issues. +2. **Dismiss Issues**: For false positives, click the "Dismiss" button next to an issue. This updates the score and allows you to re-calculate the score excluding dismissed items. +3. **Override Checks**: Override specific AI-driven checks if you have domain-specific knowledge that contradicts the automated analysis. + +## FAQ + +**Q: How accurate is the AI image analysis?** +A: The AI provides approximately 95% automated coverage of accessibility requirements. While highly accurate, it is recommended to review AI-generated alt text suggestions for context-specific accuracy. + +**Q: What is the file size limit?** +A: Individual PDFs can be up to 50MB. Batch uploads support up to 10 files simultaneously. + +**Q: Are my PDFs stored permanently?** +A: No. Uploaded PDFs are automatically deleted after 24 hours (configurable via `RETENTION_HOURS`). Results JSON files are kept for 30 days (configurable via `RESULTS_RETENTION_HOURS`). + +**Q: What does the Accessibility Score represent?** +A: The score is a weighted metric based on the number and severity of issues found. A perfect score indicates no critical errors, though some warnings may remain. 
The exact calculation considers Critical, Error, Warning, and Info issues. + +**Q: Can I use this tool for compliance with ISO 14289-1 (PDF/UA-1)?** +A: Yes, the tool integrates veraPDF for enhanced PDF/UA-1 validation alongside WCAG 2.1 checks. + +## Related +- [[01 Projects/pdf-accessibility/DEVELOPER_MANUAL.md]] \ No newline at end of file diff --git a/01 Projects/presenton/DEVELOPER_MANUAL.md b/01 Projects/presenton/DEVELOPER_MANUAL.md new file mode 100644 index 0000000..f30d2ac --- /dev/null +++ b/01 Projects/presenton/DEVELOPER_MANUAL.md @@ -0,0 +1,115 @@ +--- +auto_generated: true +created: 2026-05-18 +manual_updated_at: 2026-05-18 +modified: 2026-05-18 +name: Developer_Manual +status: active +tags: +- client/oliver +- domain/ai +- status/active +- tech/docker +- tech/fastapi +- tech/nextjs +- tech/node +- tech/python +- tech/rag +- tech/react +- tech/sqlite +- type/sop +type: sop +--- + +# Presenton Developer Manual + +## Architecture Overview +Presenton is structured as a monorepo with two main server components: +1. **FastAPI Server (`servers/fastapi`)**: Handles business logic, AI model integration, file processing, and API endpoints. +2. **Next.js Frontend (`servers/nextjs`)**: Serves the web UI, manages user state, and communicates with the FastAPI backend. + +The application is containerized with Docker Compose, and GPU support is available via the `production-gpu` service. + +## Tech Stack +- **Backend**: Python 3, FastAPI, Uvicorn +- **Frontend**: Next.js (React), Tailwind CSS +- **Database**: SQLite (default), configurable via `DATABASE_URL` +- **AI Integration**: OpenAI SDK, Google Generative AI, Anthropic SDK, Ollama +- **Containerization**: Docker, Docker Compose + +## Local Setup + +### 1. Environment Configuration +Copy `.env.example` to `.env` in the root directory. 
Required variables: +- `LLM`: Provider (`openai`, `google`, `anthropic`, `ollama`) +- `*_API_KEY`: API keys for the selected provider +- `*_MODEL`: Specific model identifier +- `DATABASE_URL`: Connection string for the database + +### 2. Development Mode +Run the application in dev mode with hot reloading: +```bash +node start.js --dev +``` +This starts: +- FastAPI on `http://127.0.0.1:8000` +- Next.js on `http://127.0.0.1:3000` + +### 3. Production Deployment +```bash +docker-compose up -d +``` +Or for GPU support: +```bash +docker-compose up -d production-gpu +``` + +## Environment Variables +| Variable | Description | Example | +|----------|-------------|---------| +| `LLM` | AI provider | `openai`, `ollama` | +| `OPENAI_API_KEY` | OpenAI API key | `sk-...` | +| `OPENAI_MODEL` | OpenAI model | `gpt-4.1` | +| `GOOGLE_API_KEY` | Google API key | `AIza...` | +| `ANTHROPIC_API_KEY` | Anthropic API key | `sk-ant-...` | +| `OLLAMA_URL` | Local Ollama URL | `http://localhost:11434` | +| `COMFYUI_URL` | ComfyUI URL for image gen | `http://localhost:8188` | +| `CAN_CHANGE_KEYS` | Allow key updates via UI | `true` | +| `DISABLE_ANONYMOUS_TRACKING` | Disable analytics | `true` | + +## Key Services & Entry Points +- **`start.js`**: Bootstraps both servers and initializes user config. +- **`servers/fastapi/server.py`**: Main FastAPI server entry. Use `--port` and `--reload` args. +- **`servers/fastapi/mcp_server.py`**: Generates an MCP server from OpenAPI spec for tool calling. +- **`servers/fastapi/constants/llm.py`**: Default model configurations. +- **`servers/fastapi/constants/documents.py`**: Supported file MIME types for uploads. + +## API Reference +The FastAPI server exposes REST endpoints at `/api`. Key features: +- **Presentations**: CRUD operations for presentation drafts. +- **AI Generation**: POST requests to trigger slide generation. +- **Templates**: Upload and manage custom HTML/Tailwind templates. 
+- **MCP Integration**: A separate MCP server runs on port `8001` (default) for agent-based access. + +To test endpoints, use the Swagger UI at `http://127.0.0.1:8000/docs` when running locally. + +## Deployment +### Docker +- `docker-compose.yml` defines `production` (CPU) and `production-gpu` services. +- Volumes map `./app_data` to persist data. +- Images can be built from `Dockerfile` or pulled from `ghcr.io/presenton/presenton:latest`. + +### Custom Builds +1. Ensure all `npm` dependencies are installed in `servers/nextjs`. +2. Ensure Python dependencies are installed in `servers/fastapi`. +3. Build the Docker image: `docker build -t presenton .` + +## Known Gotchas +- **Ollama Connectivity**: Ensure `OLLAMA_URL` is accessible from inside the container. +- **GPU Support**: Requires NVIDIA container toolkit and `nvidia` driver in Docker Compose. +- **Port Conflicts**: Ensure ports 5000 (web), 8000 (API), and 3000 (dev frontend) are free. +- **File Upload Limits**: Large PPTX/PDF files may require adjusting backend timeout settings. +- **Config Persistence**: User configs are stored in `userConfig.json` inside `APP_DATA_DIRECTORY`. If `CAN_CHANGE_KEYS=false`, keys are fixed at startup. + +## Related +- [[01 Projects/ppt-tool/DEVELOPER_MANUAL.md]] \ No newline at end of file diff --git a/01 Projects/presenton/USER_MANUAL.md b/01 Projects/presenton/USER_MANUAL.md new file mode 100644 index 0000000..57c8634 --- /dev/null +++ b/01 Projects/presenton/USER_MANUAL.md @@ -0,0 +1,88 @@ +--- +auto_generated: true +created: 2026-05-18 +manual_updated_at: 2026-05-18 +modified: 2026-05-18 +name: User_Manual +status: active +tags: +- client/oliver +- domain/ai +- tech/docker +- tech/fastapi +- tech/sqlite +- type/sop +type: sop +--- + +# Presenton User Manual + +## What This Tool Does +Presenton is an open-source AI-powered presentation generator. It allows users to create professional presentations from text prompts or existing documents (PPTX, PDF, DOCX). 
It supports various AI models (OpenAI, Google Gemini, Anthropic, Ollama) and offers custom template creation using HTML and Tailwind CSS. + +## Who Uses It +- **Professionals**: Quickly create slides for business meetings, reports, or pitches. +- **Educators**: Generate lecture slides or study materials. +- **Developers**: Integrate presentation generation into workflows via API or MCP. +- **Privacy-Conscious Users**: Run the entire application locally on their device. + +## How to Access +Presenton is a self-hosted web application. You can access it locally via your web browser at `http://localhost:5000` after starting the server. + +### Prerequisites +- **Docker** and **Docker Compose**: For easy local deployment. +- **LLM API Keys**: Valid keys for OpenAI, Google, or Anthropic (or an Ollama instance running locally). + +### Quick Start +1. Clone the repository. +2. Create a `.env` file in the root directory with your API keys: + ```env + OPENAI_API_KEY=your_key_here + OPENAI_MODEL=gpt-4.1 + LLM=openai + DATABASE_URL=sqlite:///./app_data/presenton.db + ``` +3. Run the application: + ```bash + docker-compose up -d + ``` +4. Open `http://localhost:5000` in your browser. + +## Main Workflows + +### 1. Generate a Presentation from Scratch +1. Navigate to the **New Presentation** screen. +2. Select **AI Template Generation** or **Prompt to Presentation**. +3. Enter your topic or upload a document (PPTX, PDF, DOCX) for context. +4. Choose an AI model (e.g., GPT-4, Gemini, Llama 3 via Ollama). +5. Click **Generate**. The AI will create slides based on your input. +6. Preview and download your presentation. + +### 2. Use Custom Templates +1. Go to **Templates**. +2. Select **Create New Template**. +3. Design your slide layout using HTML and Tailwind CSS. +4. Save the template and use it in new presentations. + +### 3. Upload Existing Presentation +1. Click **Upload File**. +2. Select a `.pptx` file. +3. 
Presenton will analyze the design and generate a new presentation with a similar style. + +## FAQ + +**Q: Can I use my own local models?** +A: Yes. Set `LLM=ollama` in your environment variables and provide `OLLAMA_URL` and `OLLAMA_MODEL`. + +**Q: Where is my data stored?** +A: All user data is stored locally in the `./app_data` volume mapped in your Docker Compose setup. No data is sent to external servers unless required by your chosen AI provider. + +**Q: How do I change the port?** +A: Modify the `ports` section in `docker-compose.yml` (e.g., `- "8080:80"`). + +**Q: Is there an API?** +A: Yes. The FastAPI server exposes endpoints for programmatic access. See the Developer Manual for details. + +## Related +- [[01 Projects/presenton/DEVELOPER_MANUAL.md]] +- [[01 Projects/ppt-tool/USER_MANUAL.md]] \ No newline at end of file diff --git a/99 Daily/2026-05-18.md b/99 Daily/2026-05-18.md index 7d00536..6a20638 100644 --- a/99 Daily/2026-05-18.md +++ b/99 Daily/2026-05-18.md @@ -70,3 +70,6 @@ tags: [daily] - 18:37 (1min) — session ended | `.vault-agent` - 18:43 — session ended | `pimco-charts` - 18:45 — session ended | `.vault-agent` +- 18:49 — session ended | `.vault-agent` +- 18:52 — session ended | `.vault-agent` +- 18:55 — session ended | `pimco-charts`