
FORGE AI

A unified AI platform for creative media generation, processing, and management.

Features

Image

  • Generate - AI image generation with multiple providers (OpenAI DALL-E, Google Gemini/Imagen, Leonardo AI, Bria AI, Stability AI)
  • Upscale - Enhance image resolution with Topaz Labs AI
  • Remove Background - Remove backgrounds from images

Video

  • Generate - AI video generation
  • Upscale - Enhance video resolution with Topaz Labs AI
  • Subtitles - Generate and add subtitles to videos

Audio

  • Text to Speech - Convert text to natural-sounding speech (ElevenLabs)
  • Voice to Text - Transcribe audio/video to text (OpenAI Whisper)
  • Sound Effects - Generate AI sound effects (ElevenLabs)

Text

  • Prompt Studio - AI-powered prompt enhancement and generation
  • Alt Text Generator - Generate accessible alt text for images

Tech Stack

  • Frontend: Next.js 15, React 19, TypeScript, TailwindCSS
  • Backend: FastAPI, Python 3.11
  • Database: PostgreSQL 16
  • Cache: Redis
  • Task Queue: Celery
  • Containerization: Docker Compose

Quick Start

Prerequisites

  • Docker and Docker Compose
  • API keys for the services you want to use (OpenAI, Google AI, ElevenLabs, etc.)

Setup

  1. Clone the repository:

     git clone <repo-url>
     cd forge-ai

  2. Copy the example environment file:

     cp .env.example .env

  3. Configure your API keys in .env:

     # Required for basic functionality
     OPENAI_API_KEY=your-openai-key

     # Optional - for additional providers
     GOOGLE_AI_API_KEY=your-google-ai-key
     ELEVENLABS_API_KEY=your-elevenlabs-key
     LEONARDO_API_KEY=your-leonardo-key
     BRIA_API_KEY=your-bria-key
     STABILITY_API_KEY=your-stability-key
     ANTHROPIC_API_KEY=your-anthropic-key

  4. Start the application:

     docker compose up -d

  5. Access the application:

Test Accounts

Admin User

  • Email: test@forge.ai
  • Password: password123
  • Role: Admin (full access including admin panel)

You can also create new accounts via the signup page.

Architecture

forge-ai/
├── frontend/          # Next.js frontend application
│   ├── app/           # App router pages
│   ├── components/    # React components
│   └── lib/           # Utilities and API client
├── backend/           # FastAPI backend
│   └── app/
│       ├── api/       # API routes
│       ├── models/    # SQLAlchemy models
│       ├── schemas/   # Pydantic schemas
│       └── services/  # Business logic
├── docker/            # Docker configuration
│   ├── init.sql       # Database initialization
│   └── *.dockerfile   # Service Dockerfiles
└── storage/           # File storage (mounted volume)

API Providers

Image Generation

| Provider      | Models                                   | Features                          |
|---------------|------------------------------------------|-----------------------------------|
| OpenAI        | DALL-E 3, DALL-E 2                       | Text to image                     |
| Google Gemini | Imagen 3, Gemini 2.0 Flash (Nano Banana) | Text to image, iterative editing  |
| Leonardo AI   | Multiple models with style presets       | Text to image, style control      |
| Bria AI       | Bria 2.3, Bria Fast                      | Text to image, fast generation    |
| Stability AI  | Stable Diffusion 3                       | Text to image                     |

Audio Generation

| Provider       | Features                                     |
|----------------|----------------------------------------------|
| ElevenLabs     | Text-to-speech, voice cloning, sound effects |
| OpenAI Whisper | Speech-to-text transcription                 |

Admin Panel

The admin panel is available at /admin to users with the admin role:

  • Dashboard - System stats and recent activity
  • Users - User management
  • Reports - Usage analytics
  • Audit Logs - System audit trail
  • Voices - ElevenLabs voice management

Development

Running locally without Docker

Backend:

cd backend
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8020

Frontend:

cd frontend
npm install
npm run dev

Environment Variables

See .env.example for all available configuration options.
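
Before starting the stack, it can help to confirm which provider keys are actually set. The script below is a minimal sketch and is not part of the repository: the key names come from the Setup section, the required/optional split mirrors the comments in that example (an assumption about what the backend enforces), and the helpers `parse_env` and `missing_keys` are hypothetical. The parser only handles simple KEY=value lines.

```python
from pathlib import Path
from typing import Dict, List

# Key names taken from the Setup section of this README. Treating only
# OPENAI_API_KEY as required is an assumption based on the comments there.
REQUIRED_KEYS = ["OPENAI_API_KEY"]
OPTIONAL_KEYS = [
    "GOOGLE_AI_API_KEY",
    "ELEVENLABS_API_KEY",
    "LEONARDO_API_KEY",
    "BRIA_API_KEY",
    "STABILITY_API_KEY",
    "ANTHROPIC_API_KEY",
]


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of quotes, if present.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: Dict[str, str], keys: List[str]) -> List[str]:
    """Return keys that are absent, empty, or still a 'your-...' placeholder."""
    return [k for k in keys if not env.get(k) or env[k].startswith("your-")]


if __name__ == "__main__":
    env_file = Path(".env")
    env = parse_env(env_file.read_text()) if env_file.exists() else {}
    print("Missing required keys:", missing_keys(env, REQUIRED_KEYS))
    print("Unset optional keys:", missing_keys(env, OPTIONAL_KEYS))
```

Run it from the repository root, next to .env. Placeholder values copied unchanged from .env.example are reported as missing.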

Troubleshooting

Common Issues

Login not working:

  • Ensure the database is initialized with test data
  • Check that bcrypt==4.0.1 is installed (for passlib compatibility)
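
The bcrypt check can be scripted with the standard library. This is a minimal sketch, not a tool shipped with the project: `installed_version` and `pin_status` are hypothetical helper names, and the script has to run in the same Python environment as the backend (for example, inside the backend container) to give a meaningful answer.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

EXPECTED_BCRYPT = "4.0.1"  # the pin named in the troubleshooting note above


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def pin_status(found: Optional[str], expected: str) -> str:
    """Classify the installed version as 'missing', 'ok', or 'mismatch'."""
    if found is None:
        return "missing"
    return "ok" if found == expected else "mismatch"


if __name__ == "__main__":
    found = installed_version("bcrypt")
    status = pin_status(found, EXPECTED_BCRYPT)
    print(f"bcrypt installed: {found}, expected: {EXPECTED_BCRYPT}, status: {status}")
```

A status of "mismatch" or "missing" suggests reinstalling the pinned version with pip install bcrypt==4.0.1.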

API calls failing:

  • Verify your API keys are configured correctly
  • Check backend logs: docker compose logs backend

File uploads/downloads not working:

  • Ensure the storage volume is mounted correctly
  • Check file permissions in /app/storage
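
One way to check both points at once is to attempt a real write in the storage directory. The sketch below is an illustration under assumptions, not project code: `check_storage` is a hypothetical helper, and /app/storage is the path mentioned above as seen from inside the backend container.

```python
import os
import tempfile


def check_storage(path: str) -> dict:
    """Report whether path exists, is a directory, and accepts a test write."""
    result = {
        "exists": os.path.exists(path),
        "is_dir": os.path.isdir(path),
        "writable": False,
    }
    if not result["is_dir"]:
        return result
    try:
        # Create and immediately remove a temporary file. A real write is a
        # more direct test than a permission-bit check on some mounts.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
        result["writable"] = True
    except OSError:
        pass
    return result


if __name__ == "__main__":
    print(check_storage("/app/storage"))
```

If "exists" is false, the volume is probably not mounted; if "writable" is false, the process user likely lacks write permission on the directory.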

License

Proprietary - All rights reserved.