# FORGE AI: Unified Generative AI Platform

**FORGE AI** is an enterprise-grade, microservices-based platform designed to unify the world's most powerful generative AI models into a single, cohesive workflow for various media types. It provides a robust backend for orchestration and a modern, responsive frontend for creative professionals.

---

## 🌟 Executive Summary

Instead of managing subscription islands (Runway, Midjourney, Topaz, ElevenLabs), **FORGE AI** brings them all together. It allows for complex workflows like generating an image with **DALL-E 3**, upscaling it with **Topaz**, extending it into a video with **Google Veo**, and adding a voiceover via **ElevenLabs**, all within one interface.

---

## Comprehensive Feature Matrix

### 1. 🎬 Video Generation (Multi-Provider)

The video module abstracts complexity between different providers, handling authentication, file upload, polling, and result retrieval automatically.

| Provider | Model | Capabilities | Optimal Use Case |
| :--- | :--- | :--- | :--- |
| **Runway** | **Gen-4 Turbo** | Image-to-Video | **High-Fidelity Animation**. Best for animating static marketing assets. Features **Smart Cropping** (auto-resize to 1280x768). |
| **Runway** | **Veo 3 / 3.1** | Text/Image-to-Video | **Versatile Generation**. Native 720p/1080p, 8-second clips. Good for general-purpose stock footage. |
| **Google** | **Veo Native** | Text-to-Video | **Enterprise Scale**. Direct Vertex AI integration for scalable generation. |

> **Key Feature**: **Smart Aspect Ratio Handling**. The backend automatically detects the required aspect ratio for the selected model (e.g., Gen-4's strict 1280:768 requirement) and resizes/crops input images on the fly to prevent API errors.

### 2. 🖼️ Image Generation (The "Omni-Model" Engine)

Access the latest models without switching tools.

* **OpenAI**:
  * **GPT-Image-1**: The latest efficient model. Supports transparent backgrounds and variable quality.
  * **DALL-E 3**: HD quality, vivid/natural styles.
* **Google Imagen**:
  * **Imagen 4.0 (Standard/Ultra/Fast)**: Supports "Prompt Enhance" and "Person Generation" safety filters.
* **Stability AI**:
  * **SD3.5 / SDXL**: Advanced control with **Negative Prompts** and **Image-to-Image** strength sliders.
* **Nano Banana (Gemini)**:
  * **Gemini 2.5 Flash / 3 Pro**: High speed, supports up to 4K resolution and 21:9 aspect ratios.
* **Flux**: Black Forest Labs Flux Pro integration for photorealistic outputs.
* **Ideogram**: Version 2 integration, excellent for typography.

### 3. Image Utilities

* **Professional Upscaling**:
  * Integrated with **Topaz Photo AI SDK**.
  * Capabilities: **Face Recovery**, **Denoising**, 2x/4x scaling.
* **Background Removal**:
  * **Clipping Magic**: High-precision removal.
  * **Bria AI**: Fast, commercially safe removal.

### 4. 🔊 Audio Intelligence

* **Text-to-Speech**:
  * **ElevenLabs Integration**: Multilingual V2 model. High-quality voice synthesis.
  * Configurable stability, similarity boost, and style.
* **Voice-to-Text**:
  * **OpenAI Whisper**: Industry-leading transcription accuracy.
  * Supports multiple languages and timestamp generation.
* **Sound Effects**:
  * AI-generated SFX for video post-production.

### 5. 📝 Text & Utilities

* **Subtitle Processor**:
  * Automated subtitle generation (VTT/SRT).
  * **"Burn-in" Capability**: Hardcodes subtitles directly onto video frames using FFmpeg.
* **Prompt Studio**:
  * Uses LLMs (Gemini/GPT-4) to refine simple prompts into detailed, artistic descriptions.
  * Style presets: Cinematic, Anime, Photorealistic, etc.
* **Markdown/Mermaid**:
  * Renders technical diagrams (Flowcharts, Gantt) from text descriptions.

---

## 🏗️ Technical Architecture

FORGE AI is built on a **Microservices Architecture** using **Docker Compose**.

### Backend (`forge-backend`)

* **Framework**: **FastAPI** (Python 3.11). High-performance, async-first API.
* **Task Queue**: **Celery** with **Redis**.
  Handles long-running jobs cleanly (video generation can take minutes) by decoupling HTTP requests from processing.
* **Database ORM**: **SQLAlchemy**. Interface for PostgreSQL.
* **Validation**: **Pydantic**. Strict data contracts keep API requests and responses reliable.
* **File Handling**: Direct handling of binary assets, streaming uploads to disk storage.

### Frontend (`forge-frontend`)

* **Framework**: **Next.js 14** (App Router). Server-Side Rendering (SSR) for performance.
* **UI Library**: **React** + **Tailwind CSS**.
* **Components**: Custom design system using **ShadCN/UI** primitives (Dialogs, Selects, Toasts).
* **State Management**: React Hooks for polling job status and updating UI progress bars.

### Data Persistence

* **PostgreSQL 16**:
  * `jobs`: Stores parameters, status, and API metadata for every request.
  * `assets`: Tracks file paths, MIME types, and metadata (width/height/duration).
  * `users`: Authentication and profile data.
* **Redis**: In-memory message broker for Celery and caching layer.
* **Docker Volumes**:
  * `postgres_data`: Persistent DB storage.
  * `assets_data`: Shared volume for storing generated media files.

---

## ⚡️ Scalability & Performance

FORGE AI is architected to handle high user concurrency (e.g., 200+ simultaneous users) without degrading API performance.

* **Asynchronous Job Queue**: All heavy compute tasks (Video Generation, Upscaling) are offloaded to **Celery** workers via **Redis**. The API responds immediately with a `Queued` status, so the interface stays responsive while the job runs.
* **Horizontal Scaling**:
  * The `forge-worker` service can be scaled horizontally to process more jobs in parallel.
  * Command: `docker-compose up -d --scale worker=3` (starts 3 worker containers; the name before `=` must match the worker service name in your Compose file).
* **Fault Tolerance**: If a worker crashes or a provider API call fails, the job status is still tracked in Postgres, so broken jobs can be automatically retried or flagged.

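
The queue-and-retry flow described above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not FORGE AI's actual worker code: an in-memory `queue.Queue` stands in for the Redis broker, a plain dict stands in for the `jobs` table, and `submit`, `worker`, `MAX_ATTEMPTS`, and the status strings are hypothetical names chosen for the example. It covers a failing job handler only; recovering from a crashed worker process is outside the sketch.

```python
import queue
import threading
import uuid

MAX_ATTEMPTS = 3  # hypothetical retry budget, not a FORGE AI setting

jobs = {}               # stand-in for the Postgres `jobs` table
broker = queue.Queue()  # stand-in for the Redis broker used by Celery


def submit(action, handler):
    """Record a job, enqueue it, and return its id without waiting for the work."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"action": action, "status": "queued", "attempts": 0, "result": None}
    broker.put((job_id, handler))
    return job_id


def worker():
    """Run queued jobs until a None sentinel arrives, retrying failed handlers."""
    while True:
        item = broker.get()
        if item is None:
            broker.task_done()
            return
        job_id, handler = item
        job = jobs[job_id]
        job["status"] = "processing"
        job["attempts"] += 1
        try:
            job["result"] = handler()
            job["status"] = "completed"
        except Exception as exc:
            if job["attempts"] < MAX_ATTEMPTS:
                job["status"] = "queued"
                broker.put((job_id, handler))  # re-queue for another attempt
            else:
                job["status"] = "failed"       # out of attempts: flag the job
                job["result"] = str(exc)
        finally:
            broker.task_done()


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_render():
        # Fails on the first call, then succeeds, to exercise the retry path.
        calls["n"] += 1
        if calls["n"] == 1:
            raise RuntimeError("provider timeout")
        return "clip.mp4"

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    job_id = submit("video.generate", flaky_render)  # returns immediately
    broker.join()                                    # wait for the queue to drain
    print(jobs[job_id])
    broker.put(None)                                 # stop the worker
    thread.join()
```

Because `submit` only writes a record and enqueues work, the caller gets a job id back right away and can poll the stored status, which is the same division of labour the API and the Celery workers share. Running more `worker` threads against the same queue is the in-process analogue of scaling the worker service.
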

---

## 💾 Database Schema Overview

| Table | Description | Key Fields |
| :--- | :--- | :--- |
| `jobs` | The central unit of work. | `id`, `module`, `action`, `status` (pending/failed/completed), `input_data` (JSON), `api_provider` |
| `assets` | Files managed by the system. | `id`, `file_path`, `thumbnail_path`, `mime_type`, `source_job_id` |
| `users` | User accounts. | `id`, `email`, `hashed_password`, `role` |

---

## Configuration (.env)

The system is highly configurable. Key variables include:

### API Keys (Critical)

* `RUNWAY_API_KEY`: For Gen-4 Turbo / Veo 3.
* `GOOGLE_API_KEY` / `GOOGLE_PROJECT_ID`: For Imagen / Vertex AI Veo.
* `OPENAI_API_KEY`: For DALL-E, GPT, Whisper.
* `ELEVENLABS_API_KEY`: For TTS.
* `TOPAZ_API_KEY`: For Upscaling.

### Infrastructure

* `DATABASE_URL`: Postgres connection string.
* `REDIS_URL`: Redis connection string.
* `CELERY_BROKER_URL`: Usually same as Redis.

---

## 🚀 Deployment & Operation

### Development Mode

```bash
# Start the full stack
docker-compose up -d --build

# View logs
docker-compose logs -f forge-backend
```

### Access Points

* **UI**: `http://localhost:3000`
* **API Documentation (Swagger UI)**: `http://localhost:8000/docs`
* **Database (Internal)**: Port `5432`

---

## 💾 Database Storage & Access

### PostgreSQL Database Storage

The database files are stored in a Docker volume for persistence across container restarts.

**Docker Volume:** `forge-ai_postgres_data`

**Physical Location:** `/var/lib/docker/volumes/forge-ai_postgres_data/_data`

*Note: On macOS, this is inside Docker Desktop's Linux VM and managed automatically.*

### Database Connection Info

- **Host:** `localhost` (from your Mac)
- **Port:** `5452` (mapped from container's 5432)
- **Database:** `forge_ai`
- **User:** `forge_user`
- **Password:** `forge_secure_password_2024`

### Accessing the Database

**1. Via Docker exec (Interactive SQL):**

```bash
docker exec -it forge-postgres psql -U forge_user -d forge_ai
```

**2. Backup the Database:**

```bash
docker exec forge-postgres pg_dump -U forge_user forge_ai > backup.sql
```

**3. Restore from Backup:**

```bash
cat backup.sql | docker exec -i forge-postgres psql -U forge_user -d forge_ai
```

**4. View Volume Information:**

```bash
docker volume inspect forge-ai_postgres_data
```

### Other Storage Locations

- **Uploaded Files/Assets:** `./storage` (in your project directory)
- **Redis Data:** `forge-ai_redis_data` Docker volume
- **Celery Task Queue:** Managed by Redis (in-memory)

---

## ⚠️ Troubleshooting Common Issues

### "Gen-4 Turbo - Validation Failed: Ratio"

* **Cause**: Runway's Gen-4 Turbo API is extremely strict about input image resolution (must be 1280x768).
* **Solution**: The backend now includes a **Smart Crop** pre-processor. It automatically resizes and crops your input to the exact pixel dimensions required. You do not need to manually edit images.

### "422 Unprocessable Entity"

* **Cause**: Usually missing required fields in the API payload.
* **Solution**: We recently relaxed the `prompt` requirement for Image-to-Video jobs. Ensure your frontend is sending the correct structure (refresh browser to clear cache).

---

## © 2025 FORGE AI Platforms

*Unified Creativity.*