Initial commit - FORGE AI unified platform

Features:
- Image generation (OpenAI, Gemini, Leonardo, Bria, Stability, Flux)
- Nano Banana iterative editing
- Video generation and upscaling
- Audio TTS, STT, sound effects (ElevenLabs)
- Text prompt studio and alt text
- User authentication with JWT/cookies
- Admin panel with voice management
- Job queue with Celery
- PostgreSQL + Redis backend
- Next.js 15 + FastAPI architecture

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
DJP 2025-12-09 20:39:00 -05:00
commit 7a804e896d
89 changed files with 17262 additions and 0 deletions

88
.env.example Normal file

@ -0,0 +1,88 @@
# FORGE AI Environment Configuration
# Copy this to .env and fill in your values
# =============================================================================
# DATABASE
# =============================================================================
POSTGRES_USER=forge_user
POSTGRES_PASSWORD=forge_secure_password_2024
POSTGRES_DB=forge_ai
DATABASE_URL=postgresql://forge_user:forge_secure_password_2024@postgres:5432/forge_ai
# =============================================================================
# REDIS
# =============================================================================
REDIS_URL=redis://redis:6379
# =============================================================================
# APPLICATION
# =============================================================================
APP_NAME=FORGE AI
APP_VERSION=1.0.0
DEBUG=false
SECRET_KEY=your-super-secret-key-change-in-production
# =============================================================================
# STORAGE
# =============================================================================
STORAGE_PATH=/app/storage
# =============================================================================
# AI API KEYS
# =============================================================================
# OpenAI (DALL-E, GPT-4 Vision)
OPENAI_API_KEY=sk-your-openai-api-key
# Stability AI (Stable Diffusion)
STABILITY_API_KEY=sk-your-stability-api-key
# Leonardo AI
LEONARDO_API_KEY=your-leonardo-api-key
# Ideogram
IDEOGRAM_API_KEY=your-ideogram-api-key
# Flux/Black Forest Labs
FLUX_API_KEY=your-flux-api-key
# Google AI (Gemini, Imagen, Veo)
GOOGLE_API_KEY=your-google-api-key
GOOGLE_PROJECT_ID=your-gcp-project-id
# Runway ML
RUNWAY_API_KEY=your-runway-api-key
# ElevenLabs (Text-to-Speech)
ELEVENLABS_API_KEY=your-elevenlabs-api-key
# DeepL (Translation)
DEEPL_API_KEY=your-deepl-api-key
# Topaz Labs (Image/Video Upscaling)
TOPAZ_API_KEY=your-topaz-api-key
# Clipping Magic (Background Removal) - Alternative
CLIPPING_MAGIC_API_KEY=your-clipping-magic-api-key
# Bria AI (Background Removal)
BRIA_API_KEY=your-bria-api-key
# =============================================================================
# GOOGLE CLOUD (Optional - for GCS storage)
# =============================================================================
GCS_BUCKET_NAME=forge-ai-assets
GOOGLE_APPLICATION_CREDENTIALS=/app/credentials/gcs-service-account.json
# =============================================================================
# AZURE AD (SSO - Optional)
# =============================================================================
AZURE_CLIENT_ID=your-azure-client-id
AZURE_CLIENT_SECRET=your-azure-client-secret
AZURE_TENANT_ID=your-azure-tenant-id
# =============================================================================
# CELERY (Background Jobs)
# =============================================================================
CELERY_BROKER_URL=redis://redis:6379/0
CELERY_RESULT_BACKEND=redis://redis:6379/0

69
.gitignore vendored Normal file

@ -0,0 +1,69 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
venv/
ENV/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
# Node
node_modules/
npm-debug.log*
yarn-debug.log*
yarn-error.log*
.pnpm-debug.log*
# Next.js
.next/
out/
*.tsbuildinfo
next-env.d.ts
# Storage
storage/
# Environment
.env
.env.local
.env.*.local
# IDE
.vscode/
.idea/
*.swp
*.swo
*~
# OS
.DS_Store
Thumbs.db
# Docker volumes
postgres_data/
redis_data/
# Logs
*.log
logs/
# Testing
.pytest_cache/
.coverage
htmlcov/

174
README.md Normal file

@ -0,0 +1,174 @@
# FORGE AI
A unified AI platform for creative media generation, processing, and management.
## Features
### Image
- **Generate** - AI image generation with multiple providers (OpenAI DALL-E, Google Gemini/Imagen, Leonardo AI, Bria AI, Stability AI)
- **Upscale** - Enhance image resolution with Topaz Labs AI
- **Remove Background** - Remove backgrounds from images
### Video
- **Generate** - AI video generation
- **Upscale** - Enhance video resolution with Topaz Labs AI
- **Subtitles** - Generate and add subtitles to videos
### Audio
- **Text to Speech** - Convert text to natural-sounding speech (ElevenLabs)
- **Voice to Text** - Transcribe audio/video to text (OpenAI Whisper)
- **Sound Effects** - Generate AI sound effects (ElevenLabs)
### Text
- **Prompt Studio** - AI-powered prompt enhancement and generation
- **Alt Text Generator** - Generate accessible alt text for images
## Tech Stack
- **Frontend**: Next.js 15, React 19, TypeScript, TailwindCSS
- **Backend**: FastAPI, Python 3.11
- **Database**: PostgreSQL 16
- **Cache**: Redis
- **Task Queue**: Celery
- **Containerization**: Docker Compose
## Quick Start
### Prerequisites
- Docker and Docker Compose
- API Keys for services you want to use (OpenAI, Google AI, ElevenLabs, etc.)
### Setup
1. Clone the repository:
```bash
git clone <repo-url>
cd forge-ai
```
2. Copy the example environment file:
```bash
cp .env.example .env
```
3. Configure your API keys in `.env`:
```bash
# Required for basic functionality
OPENAI_API_KEY=your-openai-key
# Optional - for additional providers
GOOGLE_API_KEY=your-google-api-key
ELEVENLABS_API_KEY=your-elevenlabs-key
LEONARDO_API_KEY=your-leonardo-key
BRIA_API_KEY=your-bria-key
STABILITY_API_KEY=your-stability-key
ANTHROPIC_API_KEY=your-anthropic-key
```
4. Start the application:
```bash
docker compose up -d
```
5. Access the application:
- **Frontend**: http://localhost:3020
- **API**: http://localhost:8020
- **API Docs**: http://localhost:8020/docs
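
Before starting the containers, it can help to confirm that `.env` no longer holds values copied unchanged from `.env.example`. The following is a minimal sketch, assuming placeholders follow the `your-...` / `sk-your-...` pattern used in `.env.example`; the helper names `parse_env` and `find_placeholders` are hypothetical and not part of the repository:

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def find_placeholders(values: dict[str, str]) -> list[str]:
    """Return the keys whose values still look like template placeholders."""
    # Assumption: placeholders start with "your-" or "sk-your-", as in .env.example
    return sorted(
        key for key, value in values.items()
        if value.startswith(("your-", "sk-your-"))
    )
```

Reading the file with `pathlib.Path(".env").read_text()` and passing the result through both helpers lists the keys that still need real values. Keys for providers you do not plan to use can stay as placeholders.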
## Test Accounts
### Admin User
- **Email**: test@forge.ai
- **Password**: password123
- **Role**: Admin (full access including admin panel)
You can also create new accounts via the signup page.
## Architecture
```
forge-ai/
├── frontend/ # Next.js frontend application
│ ├── app/ # App router pages
│ ├── components/ # React components
│ └── lib/ # Utilities and API client
├── backend/ # FastAPI backend
│ └── app/
│ ├── api/ # API routes
│ ├── models/ # SQLAlchemy models
│ ├── schemas/ # Pydantic schemas
│ └── services/ # Business logic
├── docker/ # Docker configuration
│ ├── init.sql # Database initialization
│ └── *.dockerfile # Service Dockerfiles
└── storage/ # File storage (mounted volume)
```
## API Providers
### Image Generation
| Provider | Models | Features |
|----------|--------|----------|
| OpenAI | DALL-E 3, DALL-E 2 | Text to image |
| Google Gemini | Imagen 3, Gemini 2.0 Flash (Nano Banana) | Text to image, iterative editing |
| Leonardo AI | Multiple models with style presets | Text to image, style control |
| Bria AI | Bria 2.3, Bria Fast | Text to image, fast generation |
| Stability AI | Stable Diffusion 3 | Text to image |
### Audio Generation
| Provider | Features |
|----------|----------|
| ElevenLabs | Text-to-speech, voice cloning, sound effects |
| OpenAI Whisper | Speech-to-text transcription |
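
The image provider table above can also be expressed as data, which is useful when a client needs to list the providers that support a given feature. This is a sketch only: `IMAGE_PROVIDERS` and `providers_with` are illustrative names, and the feature strings are copied from the table rather than taken from the backend code.

```python
# Feature sets copied from the "Image Generation" table (lowercased)
IMAGE_PROVIDERS = {
    "OpenAI": {"text to image"},
    "Google Gemini": {"text to image", "iterative editing"},
    "Leonardo AI": {"text to image", "style control"},
    "Bria AI": {"text to image", "fast generation"},
    "Stability AI": {"text to image"},
}


def providers_with(feature: str) -> list[str]:
    """Return provider names, sorted, that list the given feature."""
    wanted = feature.strip().lower()
    return sorted(
        name for name, features in IMAGE_PROVIDERS.items()
        if wanted in features
    )
```

For example, `providers_with("iterative editing")` selects only the Google Gemini entry, which matches the Nano Banana row in the table.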
## Admin Panel
The admin panel is accessible at `/admin` for users with admin role:
- **Dashboard** - System stats and recent activity
- **Users** - User management
- **Reports** - Usage analytics
- **Audit Logs** - System audit trail
- **Voices** - ElevenLabs voice management
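
Access to the admin panel comes down to a role check on the authenticated user. A framework-free sketch of that check follows, assuming the role names `admin` and `super_admin` that appear in `backend/app/api/v1/admin.py`; `CurrentUser`, `Forbidden`, and `require_admin` are hypothetical names used only for illustration:

```python
from dataclasses import dataclass

# Assumption: these are the only roles that grant admin access
ADMIN_ROLES = {"admin", "super_admin"}


class Forbidden(Exception):
    """Raised when a user lacks the role needed for an admin route."""


@dataclass
class CurrentUser:
    email: str
    role: str
    is_active: bool = True


def require_admin(user: CurrentUser) -> CurrentUser:
    """Return the user if they may use the admin panel, else raise Forbidden."""
    if not user.is_active or user.role not in ADMIN_ROLES:
        raise Forbidden("Admin access required")
    return user
```

In a FastAPI backend the same logic would typically live in a dependency that raises an HTTP 403 response instead of a custom exception.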
## Development
### Running locally without Docker
**Backend:**
```bash
cd backend
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8020
```
**Frontend:**
```bash
cd frontend
npm install
npm run dev
```
### Environment Variables
See `.env.example` for all available configuration options.
## Troubleshooting
### Common Issues
**Login not working:**
- Ensure the database is initialized with test data
- Check that bcrypt==4.0.1 is installed (for passlib compatibility)
**API calls failing:**
- Verify your API keys are configured correctly
- Check backend logs: `docker compose logs backend`
**File uploads/downloads not working:**
- Ensure the storage volume is mounted correctly
- Check file permissions in `/app/storage`
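
The storage checks above can be scripted. A minimal sketch, assuming the storage root is the `STORAGE_PATH` value from `.env.example` (`/app/storage` inside the container); `check_storage` is a hypothetical helper, not part of the codebase:

```python
import os
import tempfile


def check_storage(path: str) -> list[str]:
    """Return a list of problems found with the storage directory (empty if OK)."""
    if not os.path.isdir(path):
        return [f"{path} does not exist or is not a directory"]
    problems = []
    try:
        # Creating (and auto-deleting) a temp file proves the process can write here
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError as exc:
        problems.append(f"{path} is not writable: {exc}")
    return problems
```

Running `check_storage(os.environ.get("STORAGE_PATH", "/app/storage"))` inside the backend container reports a missing mount or a permissions problem before any upload is attempted.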
## License
Proprietary - All rights reserved.

38
backend/Dockerfile Normal file

@ -0,0 +1,38 @@
# FORGE AI Backend - Python FastAPI
FROM python:3.11-slim
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV PYTHONPATH=/app
# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
curl \
ffmpeg \
libpq-dev \
libmagic1 \
&& rm -rf /var/lib/apt/lists/*
# Set work directory
WORKDIR /app
# Copy requirements first for caching
COPY requirements.txt .
# Install Python dependencies
RUN pip install --no-cache-dir --upgrade pip && \
pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY . .
# Create storage directories
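# Brace expansion is a bash feature; RUN uses /bin/sh, so list each directory explicitly
RUN mkdir -p /app/storage/images /app/storage/videos /app/storage/audio \
    /app/storage/documents /app/storage/temp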
# Expose port
EXPOSE 8000
# Run the application
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]

2
backend/app/__init__.py Normal file

@ -0,0 +1,2 @@
# FORGE AI Backend
__version__ = "1.0.0"


@ -0,0 +1 @@
"""API Package"""


@ -0,0 +1,13 @@
"""API v1 Router"""
from fastapi import APIRouter
from app.api.v1 import auth, users, jobs, assets, modules, admin
router = APIRouter()
# Include all routers
router.include_router(auth.router, prefix="/auth", tags=["Authentication"])
router.include_router(users.router, prefix="/users", tags=["Users"])
router.include_router(jobs.router, prefix="/jobs", tags=["Jobs"])
router.include_router(assets.router, prefix="/assets", tags=["Assets"])
router.include_router(modules.router, prefix="/modules", tags=["Modules"])
router.include_router(admin.router, tags=["Admin"])

510
backend/app/api/v1/admin.py Normal file

@ -0,0 +1,510 @@
"""Admin API routes - Admin only access"""
from fastapi import APIRouter, Depends, HTTPException, Query
from sqlalchemy.orm import Session
from sqlalchemy import func, desc
from datetime import datetime, timedelta
from typing import Optional
from app.database import get_db
from app.models.user import User
from app.models.job import Job
from app.models.usage import UsageLog
from app.schemas.user import UserResponse
router = APIRouter(prefix="/admin", tags=["admin"])
from app.api.v1.auth import get_current_user
def get_current_admin_user(current_user: User = Depends(get_current_user)) -> User:
    """Dependency that allows only authenticated admin or super_admin users"""
    # Resolve the caller from the JWT (header or cookie) instead of loading the
    # first admin row, which would grant admin access to every request
    if current_user.role not in ('admin', 'super_admin'):
        raise HTTPException(status_code=403, detail="Admin access required")
    return current_user
@router.get("/stats")
async def get_admin_stats(
    db: Session = Depends(get_db),
    admin: User = Depends(get_current_admin_user)
):
    """Get admin dashboard statistics"""
    today = datetime.utcnow().date()
    total_users = db.query(func.count(User.id)).scalar()
    active_users = db.query(func.count(User.id)).filter(User.is_active == True).scalar()
    total_jobs = db.query(func.count(Job.id)).scalar()
    jobs_today = db.query(func.count(Job.id)).filter(
        func.date(Job.created_at) == today
    ).scalar()
    # Failed jobs created today
    failed_jobs = db.query(func.count(Job.id)).filter(
        func.date(Job.created_at) == today,
        Job.status == 'failed'
    ).scalar()
    # Calculate average processing time (seconds) for completed jobs
    avg_time_result = db.query(
        func.avg(
            func.extract('epoch', Job.completed_at) - func.extract('epoch', Job.created_at)
        )
    ).filter(
        Job.status == 'completed',
        Job.completed_at.isnot(None)
    ).scalar()
    avg_processing_time = round(avg_time_result or 0, 1)
    # Estimate API costs from usage logs since the first day of the current month
    total_cost = db.query(func.sum(UsageLog.estimated_cost_usd)).filter(
        func.date(UsageLog.created_at) >= today.replace(day=1)
    ).scalar() or 0
    return {
        "totalUsers": total_users,
        "activeUsers": active_users,
        "totalJobs": total_jobs,
        "jobsToday": jobs_today,
        "failedJobs": failed_jobs,
        "avgProcessingTime": avg_processing_time,
        "apiCosts": round(total_cost, 2)
    }
@router.get("/activity")
async def get_recent_activity(
    limit: int = Query(10, le=50),
    db: Session = Depends(get_db),
    admin: User = Depends(get_current_admin_user)
):
    """Get recent system activity"""
    # Get recent jobs with user info
    recent_jobs = db.query(Job, User).join(
        User, Job.user_id == User.id
    ).order_by(desc(Job.created_at)).limit(limit).all()
    # Map job status to a human-readable verb
    action_map = {
        'pending': 'Started',
        'processing': 'Processing',
        'completed': 'Completed',
        'failed': 'Failed'
    }
    items = []
    for job, user in recent_jobs:
        action = f"{action_map.get(job.status, 'Created')} {job.module.replace('_', ' ')}"
        items.append({
            "id": str(job.id),
            "user": user.email,
            "action": action,
            "module": job.module,
            "time": _format_relative_time(job.created_at)
        })
    return {"items": items}
@router.get("/users")
async def list_users(
    page: int = Query(1, ge=1),
    limit: int = Query(20, le=100),
    role: Optional[str] = None,
    db: Session = Depends(get_db),
    admin: User = Depends(get_current_admin_user)
):
    """List all users (admin only)"""
    query = db.query(User)
    if role:
        query = query.filter(User.role == role)
    total = query.count()
    users = query.order_by(desc(User.created_at)).offset((page - 1) * limit).limit(limit).all()
    return {
        "items": [
            {
                "id": str(u.id),
                "email": u.email,
                "name": u.display_name,
                "role": u.role,
                "is_active": u.is_active,
                "created_at": u.created_at.isoformat(),
                "last_login": u.last_login_at.isoformat() if u.last_login_at else None
            }
            for u in users
        ],
        "total": total,
        "page": page,
        "limit": limit
    }
@router.patch("/users/{user_id}")
async def update_user(
    user_id: str,
    role: Optional[str] = None,
    is_active: Optional[bool] = None,
    db: Session = Depends(get_db),
    admin: User = Depends(get_current_admin_user)
):
    """Update user role or status (admin only)"""
    user = db.query(User).filter(User.id == user_id).first()
    if not user:
        raise HTTPException(status_code=404, detail="User not found")
    if role is not None:
        # Reject unknown roles instead of silently ignoring them
        if role not in ['user', 'admin', 'super_admin']:
            raise HTTPException(status_code=400, detail="Invalid role")
        # Only super_admin can create other super_admins
        if role == 'super_admin' and admin.role != 'super_admin':
            raise HTTPException(status_code=403, detail="Only super admins can create super admins")
        user.role = role
    if is_active is not None:
        user.is_active = is_active
    db.commit()
    db.refresh(user)
    return {"message": "User updated", "user_id": str(user.id)}
@router.get("/reports")
async def get_usage_reports(
    # The query parameter is still called "range"; the alias avoids shadowing the builtin
    range_: str = Query("7d", alias="range"),
    db: Session = Depends(get_db),
    admin: User = Depends(get_current_admin_user)
):
    """Get usage reports and analytics"""
    days = {"7d": 7, "30d": 30, "90d": 90, "365d": 365}.get(range_, 7)
    start_date = datetime.utcnow() - timedelta(days=days)
    # Usage over time
    usage_query = db.query(
        func.date(Job.created_at).label('date'),
        func.count(Job.id).label('jobs')
    ).filter(
        Job.created_at >= start_date
    ).group_by(
        func.date(Job.created_at)
    ).order_by(
        func.date(Job.created_at)
    ).all()
    # Cost is a flat placeholder estimate of $0.15 per job
    usage_over_time = [
        {"date": str(row.date), "jobs": row.jobs, "cost": row.jobs * 0.15}
        for row in usage_query
    ]
    # Module breakdown
    module_query = db.query(
        Job.module,
        func.count(Job.id).label('count')
    ).filter(
        Job.created_at >= start_date
    ).group_by(Job.module).all()
    total_jobs = sum(m.count for m in module_query)
    module_breakdown = [
        {
            "module": m.module.replace('_', ' ').title(),
            "count": m.count,
            "percentage": round(m.count / total_jobs * 100 if total_jobs > 0 else 0)
        }
        for m in module_query
    ]
    # Top users
    top_users_query = db.query(
        User.id,
        User.email,
        func.count(Job.id).label('job_count')
    ).join(
        Job, Job.user_id == User.id
    ).filter(
        Job.created_at >= start_date
    ).group_by(User.id, User.email).order_by(
        desc(func.count(Job.id))
    ).limit(10).all()
    top_users = [
        {
            "user_id": str(u.id),
            "user_email": u.email,
            "job_count": u.job_count,
            "total_cost": round(u.job_count * 0.15, 2)
        }
        for u in top_users_query
    ]
    return {
        "usage_over_time": usage_over_time,
        "module_breakdown": module_breakdown,
        "top_users": top_users,
        "totals": {
            "totalJobs": total_jobs,
            "totalCost": round(total_jobs * 0.15, 2),
            "avgJobsPerDay": round(total_jobs / days, 1) if days > 0 else 0
        }
    }
@router.get("/audit-logs")
async def get_audit_logs(
    page: int = Query(1, ge=1),
    limit: int = Query(50, le=100),
    severity: Optional[str] = None,
    action: Optional[str] = None,
    db: Session = Depends(get_db),
    admin: User = Depends(get_current_admin_user)
):
    """Get audit logs"""
    # For now, generate from job history - in production would use dedicated audit table
    query = db.query(Job, User).join(User, Job.user_id == User.id)
    if action:
        if 'failed' in action:
            query = query.filter(Job.status == 'failed')
        elif 'completed' in action:
            query = query.filter(Job.status == 'completed')
    # Severity is derived from job status: failed jobs are errors, everything else is info
    if severity == 'error':
        query = query.filter(Job.status == 'failed')
    elif severity == 'info':
        query = query.filter(Job.status != 'failed')
    total = query.count()
    results = query.order_by(desc(Job.created_at)).offset((page - 1) * limit).limit(limit).all()
    items = []
    for job, user in results:
        # Use separate names so the loop does not overwrite the query parameters
        item_severity = 'error' if job.status == 'failed' else 'info'
        item_action = f"job.{job.status}"
        items.append({
            "id": str(job.id),
            "user_id": str(user.id),
            "user_email": user.email,
            "action": item_action,
            "resource_type": "job",
            "resource_id": str(job.id),
            "details": {
                "module": job.module,
                "error": job.error_message if job.error_message else None
            },
            "ip_address": "192.168.1.100",  # Placeholder
            "created_at": job.created_at.isoformat(),
            "severity": item_severity
        })
    return {
        "items": items,
        "total": total,
        "page": page,
        "limit": limit
    }
def _format_relative_time(dt: datetime) -> str:
    """Format datetime as relative time string"""
    now = datetime.utcnow()
    # total_seconds() covers the whole interval; timedelta.seconds holds only the
    # sub-day remainder, so timestamps older than a day would be reported as recent
    seconds = int((now - dt).total_seconds())
    if seconds < 60:
        return "Just now"
    elif seconds < 3600:
        mins = seconds // 60
        return f"{mins} min{'s' if mins > 1 else ''} ago"
    elif seconds < 86400:
        hours = seconds // 3600
        return f"{hours} hour{'s' if hours > 1 else ''} ago"
    else:
        days = seconds // 86400
        return f"{days} day{'s' if days > 1 else ''} ago"
# ============== VOICE MANAGEMENT ==============
@router.get("/voices")
async def get_voices(
    db: Session = Depends(get_db),
    admin: User = Depends(get_current_admin_user)
):
    """Get all ElevenLabs voices including custom cloned voices"""
    import httpx
    from app.config import settings
    if not settings.elevenlabs_api_key:
        raise HTTPException(status_code=500, detail="ElevenLabs API key not configured")
    async with httpx.AsyncClient(timeout=30) as client:
        response = await client.get(
            "https://api.elevenlabs.io/v1/voices",
            headers={"xi-api-key": settings.elevenlabs_api_key}
        )
        response.raise_for_status()
        data = response.json()
    # Keep only the fields the admin UI needs
    voices = []
    for voice in data.get("voices", []):
        voices.append({
            "voice_id": voice.get("voice_id"),
            "name": voice.get("name"),
            "category": voice.get("category"),
            "description": voice.get("description"),
            "labels": voice.get("labels", {}),
            "preview_url": voice.get("preview_url"),
            "available_for_tiers": voice.get("available_for_tiers", []),
            "settings": voice.get("settings"),
            "sharing": voice.get("sharing"),
            "high_quality_base_model_ids": voice.get("high_quality_base_model_ids", []),
            "samples": voice.get("samples", [])
        })
    return {
        "voices": voices,
        "total": len(voices)
    }
# Registered before "/voices/{voice_id}" so that "models" is not captured as a voice ID
@router.get("/voices/models")
async def get_voice_models(
    db: Session = Depends(get_db),
    admin: User = Depends(get_current_admin_user)
):
    """Get available TTS models from ElevenLabs"""
    import httpx
    from app.config import settings
    if not settings.elevenlabs_api_key:
        raise HTTPException(status_code=500, detail="ElevenLabs API key not configured")
    async with httpx.AsyncClient(timeout=30) as client:
        response = await client.get(
            "https://api.elevenlabs.io/v1/models",
            headers={"xi-api-key": settings.elevenlabs_api_key}
        )
        response.raise_for_status()
        return response.json()
@router.get("/voices/{voice_id}")
async def get_voice_details(
    voice_id: str,
    db: Session = Depends(get_db),
    admin: User = Depends(get_current_admin_user)
):
    """Get detailed information about a specific voice"""
    import httpx
    from app.config import settings
    if not settings.elevenlabs_api_key:
        raise HTTPException(status_code=500, detail="ElevenLabs API key not configured")
    async with httpx.AsyncClient(timeout=30) as client:
        response = await client.get(
            f"https://api.elevenlabs.io/v1/voices/{voice_id}",
            headers={"xi-api-key": settings.elevenlabs_api_key}
        )
        if response.status_code == 404:
            raise HTTPException(status_code=404, detail="Voice not found")
        response.raise_for_status()
        return response.json()
@router.post("/voices/clone")
async def clone_voice(
    name: str,
    description: Optional[str] = None,
    files: list = None,
    labels: Optional[dict] = None,
    db: Session = Depends(get_db),
    admin: User = Depends(get_current_admin_user)
):
    """Clone a voice using audio samples (Instant Voice Cloning)"""
    from app.config import settings
    if not settings.elevenlabs_api_key:
        raise HTTPException(status_code=500, detail="ElevenLabs API key not configured")
    # For now, return instructions - actual implementation requires file upload
    return {
        "message": "Voice cloning requires audio file upload",
        "instructions": {
            "endpoint": "POST /api/v1/admin/voices/clone-with-files",
            "required": ["name", "files (audio samples)"],
            "optional": ["description", "labels"],
            "notes": [
                "Upload 1-25 audio samples (max 10MB each)",
                "Supported formats: mp3, wav, m4a, ogg, flac",
                "Minimum sample length: 30 seconds combined",
                "Best results: clear speech, no background noise"
            ]
        }
    }
@router.delete("/voices/{voice_id}")
async def delete_voice(
    voice_id: str,
    db: Session = Depends(get_db),
    admin: User = Depends(get_current_admin_user)
):
    """Delete a custom voice (only works for cloned voices)"""
    import httpx
    from app.config import settings
    if not settings.elevenlabs_api_key:
        raise HTTPException(status_code=500, detail="ElevenLabs API key not configured")
    async with httpx.AsyncClient(timeout=30) as client:
        response = await client.delete(
            f"https://api.elevenlabs.io/v1/voices/{voice_id}",
            headers={"xi-api-key": settings.elevenlabs_api_key}
        )
        if response.status_code == 404:
            raise HTTPException(status_code=404, detail="Voice not found")
        if response.status_code == 400:
            raise HTTPException(status_code=400, detail="Cannot delete premade voices")
        response.raise_for_status()
        return {"message": f"Voice {voice_id} deleted successfully"}
@router.patch("/voices/{voice_id}/settings")
async def update_voice_settings(
    voice_id: str,
    name: Optional[str] = None,
    description: Optional[str] = None,
    labels: Optional[dict] = None,
    db: Session = Depends(get_db),
    admin: User = Depends(get_current_admin_user)
):
    """Update voice name, description or labels"""
    import httpx
    from app.config import settings
    if not settings.elevenlabs_api_key:
        raise HTTPException(status_code=500, detail="ElevenLabs API key not configured")
    # Send only the fields the caller actually provided
    payload = {}
    if name:
        payload["name"] = name
    if description:
        payload["description"] = description
    if labels:
        payload["labels"] = labels
    if not payload:
        raise HTTPException(status_code=400, detail="No updates provided")
    # NOTE: confirm the HTTP method and body encoding of the upstream edit
    # endpoint against the ElevenLabs API reference before relying on this call
    async with httpx.AsyncClient(timeout=30) as client:
        response = await client.patch(
            f"https://api.elevenlabs.io/v1/voices/{voice_id}/edit",
            headers={
                "xi-api-key": settings.elevenlabs_api_key,
                "Content-Type": "application/json"
            },
            json=payload
        )
        if response.status_code == 404:
            raise HTTPException(status_code=404, detail="Voice not found")
        response.raise_for_status()
        return {"message": f"Voice {voice_id} updated successfully"}


@ -0,0 +1,267 @@
"""Asset API Routes"""
from fastapi import APIRouter, Depends, HTTPException, UploadFile, File, Form, Query
from fastapi.responses import FileResponse
from sqlalchemy.orm import Session
from sqlalchemy import desc
from typing import List, Optional
from uuid import UUID, uuid4
import os
import shutil
from PIL import Image
import io
from app.database import get_db
from app.models.asset import Asset
from app.models.user import User
from app.schemas.asset import AssetCreate, AssetResponse
from app.config import settings
router = APIRouter()
THUMBNAIL_SIZE = (256, 256)
THUMBNAIL_QUALITY = 85
def get_file_type(mime_type: str) -> str:
    """Determine file type from mime type"""
    if mime_type.startswith("image/"):
        return "image"
    elif mime_type.startswith("video/"):
        return "video"
    elif mime_type.startswith("audio/"):
        return "audio"
    else:
        return "document"
def generate_thumbnail(file_path: str, file_type: str, asset_id: str) -> Optional[str]:
    """Generate a thumbnail for an asset; returns the thumbnail path or None"""
    try:
        thumbnail_dir = os.path.join(settings.storage_path, "thumbnails")
        os.makedirs(thumbnail_dir, exist_ok=True)
        thumbnail_path = os.path.join(thumbnail_dir, f"{asset_id}.jpg")
        if file_type == "image":
            with Image.open(file_path) as img:
                img.thumbnail(THUMBNAIL_SIZE, Image.Resampling.LANCZOS)
                # Convert to RGB if necessary (for PNG with alpha)
                if img.mode in ('RGBA', 'LA', 'P'):
                    img = img.convert('RGB')
                img.save(thumbnail_path, 'JPEG', quality=THUMBNAIL_QUALITY)
            return thumbnail_path
        elif file_type == "video":
            # For video, we'd use ffmpeg - placeholder for now
            # Could extract first frame with: ffmpeg -i input.mp4 -vframes 1 -f image2 output.jpg
            return None
    except Exception as e:
        print(f"Failed to generate thumbnail: {e}")
    return None
@router.get("/", response_model=List[AssetResponse])
def get_assets(
    skip: int = 0,
    limit: int = 50,
    file_type: Optional[str] = None,
    module: Optional[str] = None,
    db: Session = Depends(get_db)
):
    """Get all assets with optional filtering"""
    query = db.query(Asset)
    if file_type:
        query = query.filter(Asset.file_type == file_type)
    if module:
        query = query.filter(Asset.source_module == module)
    assets = query.order_by(Asset.created_at.desc()).offset(skip).limit(limit).all()
    return assets
@router.get("/library")
def get_asset_library(
    file_types: Optional[str] = Query(None, description="Comma-separated file types: image,video,audio"),
    search: Optional[str] = None,
    page: int = Query(1, ge=1),
    limit: int = Query(20, le=100),
    db: Session = Depends(get_db)
):
    """Get user's asset library with thumbnails for selection in tools"""
    # Get test user for now
    user = db.query(User).filter(User.email == "test@forge.ai").first()
    query = db.query(Asset).filter(Asset.is_temporary == False)
    if user:
        query = query.filter(Asset.user_id == user.id)
    if file_types:
        types = [t.strip() for t in file_types.split(",")]
        query = query.filter(Asset.file_type.in_(types))
    if search:
        query = query.filter(Asset.original_filename.ilike(f"%{search}%"))
    total = query.count()
    assets = query.order_by(desc(Asset.created_at)).offset((page - 1) * limit).limit(limit).all()
    return {
        "items": [
            {
                "id": str(a.id),
                "filename": a.original_filename or a.stored_filename,
                "file_type": a.file_type,
                "mime_type": a.mime_type,
                "width": a.width,
                "height": a.height,
                "thumbnail_url": f"/api/v1/assets/{a.id}/thumbnail" if a.thumbnail_path else None,
                "file_url": f"/api/v1/assets/{a.id}/download",
                "created_at": a.created_at.isoformat(),
                "source_module": a.source_module
            }
            for a in assets
        ],
        "total": total,
        "page": page,
        "limit": limit,
        # Ceiling division: number of pages needed for `total` items
        "pages": (total + limit - 1) // limit
    }
@router.get("/{asset_id}/thumbnail")
def get_asset_thumbnail(asset_id: UUID, db: Session = Depends(get_db)):
    """Get asset thumbnail for fast preview"""
    asset = db.query(Asset).filter(Asset.id == asset_id).first()
    if not asset:
        raise HTTPException(status_code=404, detail="Asset not found")
    # If thumbnail exists, serve it
    if asset.thumbnail_path and os.path.exists(asset.thumbnail_path):
        return FileResponse(asset.thumbnail_path, media_type="image/jpeg")
    # Generate thumbnail on-demand if it doesn't exist
    if asset.file_type == "image" and os.path.exists(asset.file_path):
        thumbnail_path = generate_thumbnail(asset.file_path, asset.file_type, str(asset.id))
        if thumbnail_path:
            asset.thumbnail_path = thumbnail_path
            db.commit()
            return FileResponse(thumbnail_path, media_type="image/jpeg")
    # Fallback: serve original (not ideal but works)
    if os.path.exists(asset.file_path):
        return FileResponse(asset.file_path, media_type=asset.mime_type)
    raise HTTPException(status_code=404, detail="Thumbnail not available")
@router.get("/{asset_id}", response_model=AssetResponse)
def get_asset(asset_id: UUID, db: Session = Depends(get_db)):
    """Get asset by ID"""
    asset = db.query(Asset).filter(Asset.id == asset_id).first()
    if not asset:
        raise HTTPException(status_code=404, detail="Asset not found")
    return asset
@router.get("/{asset_id}/download")
def download_asset(asset_id: UUID, db: Session = Depends(get_db)):
    """Download an asset file"""
    asset = db.query(Asset).filter(Asset.id == asset_id).first()
    if not asset:
        raise HTTPException(status_code=404, detail="Asset not found")
    file_path = asset.file_path
    if not os.path.exists(file_path):
        raise HTTPException(status_code=404, detail="File not found on disk")
    return FileResponse(
        file_path,
        filename=asset.original_filename or asset.stored_filename,
        media_type=asset.mime_type
    )
@router.post("/upload", response_model=AssetResponse)
async def upload_asset(
    file: UploadFile = File(...),
    project_id: Optional[str] = Form(None),
    source_module: Optional[str] = Form(None),
    db: Session = Depends(get_db)
):
    """Upload a new asset"""
    # Validate project_id before touching the disk so a malformed value
    # returns 400 instead of an unhandled ValueError
    project_uuid = None
    if project_id:
        try:
            project_uuid = UUID(project_id)
        except ValueError:
            raise HTTPException(status_code=400, detail="Invalid project_id")
    # Get test user
    user = db.query(User).filter(User.email == "test@forge.ai").first()
    # Determine file type (clients may omit the content type)
    mime_type = file.content_type or "application/octet-stream"
    file_type = get_file_type(mime_type)
    # Generate unique ID and filename
    asset_id = uuid4()
    ext = os.path.splitext(file.filename)[1] if file.filename else ""
    stored_filename = f"{asset_id}{ext}"
    # Determine storage path
    storage_dir = os.path.join(settings.storage_path, f"{file_type}s")
    os.makedirs(storage_dir, exist_ok=True)
    file_path = os.path.join(storage_dir, stored_filename)
    # Save file
    with open(file_path, "wb") as buffer:
        shutil.copyfileobj(file.file, buffer)
    # Get file size
    file_size = os.path.getsize(file_path)
    # Get image dimensions if applicable
    width = None
    height = None
    if file_type == "image":
        try:
            with Image.open(file_path) as img:
                width, height = img.size
        except Exception:
            pass
    # Generate thumbnail
    thumbnail_path = generate_thumbnail(file_path, file_type, str(asset_id))
    # Create asset record
    asset = Asset(
        id=asset_id,
        user_id=user.id if user else None,
        project_id=project_uuid,
        original_filename=file.filename,
        stored_filename=stored_filename,
        file_path=file_path,
        thumbnail_path=thumbnail_path,
        file_type=file_type,
        mime_type=mime_type,
        file_size_bytes=file_size,
        width=width,
        height=height,
        source_module=source_module
    )
    db.add(asset)
    db.commit()
    db.refresh(asset)
    return asset
@router.delete("/{asset_id}")
def delete_asset(asset_id: UUID, db: Session = Depends(get_db)):
    """Delete an asset"""
    asset = db.query(Asset).filter(Asset.id == asset_id).first()
    if not asset:
        raise HTTPException(status_code=404, detail="Asset not found")
    # Delete the file and its thumbnail from disk so no orphaned files remain
    for path in (asset.file_path, asset.thumbnail_path):
        if path and os.path.exists(path):
            os.remove(path)
    # Delete from database
    db.delete(asset)
    db.commit()
    return {"message": "Asset deleted"}

261
backend/app/api/v1/auth.py Normal file

@ -0,0 +1,261 @@
"""Authentication API Routes"""
from fastapi import APIRouter, Depends, HTTPException, status, Response, Cookie
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from sqlalchemy.orm import Session
from jose import JWTError, jwt
from datetime import datetime, timedelta
from typing import Optional
from uuid import UUID
from app.database import get_db
from app.models.user import User
from app.schemas.user import (
UserSignUp, UserLogin, UserResponse, TokenResponse,
PasswordChange, UserUpdate
)
from app.config import settings
router = APIRouter()
security = HTTPBearer(auto_error=False)
# JWT Settings from config
SECRET_KEY = settings.jwt_secret_key
ALGORITHM = settings.jwt_algorithm
ACCESS_TOKEN_EXPIRE_MINUTES = settings.jwt_expire_minutes
def create_access_token(data: dict, expires_delta: Optional[timedelta] = None) -> str:
    """Create a JWT access token"""
    to_encode = data.copy()
    expire = datetime.utcnow() + (expires_delta or timedelta(minutes=ACCESS_TOKEN_EXPIRE_MINUTES))
    to_encode.update({"exp": expire})
    return jwt.encode(to_encode, SECRET_KEY, algorithm=ALGORITHM)


def verify_token(token: str) -> Optional[dict]:
    """Verify a JWT token and return the payload, or None if it is invalid or expired"""
    try:
        return jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
    except JWTError:
        return None
async def get_current_user(
    credentials: Optional[HTTPAuthorizationCredentials] = Depends(security),
    access_token: Optional[str] = Cookie(None),
    db: Session = Depends(get_db)
) -> User:
    """Get the current authenticated user from JWT token"""
    token = None
    # Check Authorization header first
    if credentials:
        token = credentials.credentials
    # Fall back to cookie
    elif access_token:
        token = access_token
    if not token:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Not authenticated",
            headers={"WWW-Authenticate": "Bearer"},
        )
    payload = verify_token(token)
    if not payload:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid or expired token",
            headers={"WWW-Authenticate": "Bearer"},
        )
    user_id = payload.get("sub")
    if not user_id:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid token payload",
        )
    # The "sub" claim is a string; parse it back into a UUID and reject
    # malformed values before they reach the query
    try:
        user_uuid = UUID(str(user_id))
    except ValueError:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid token payload",
        )
    user = db.query(User).filter(User.id == user_uuid).first()
    if not user:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="User not found",
        )
    if not user.is_active:
        raise HTTPException(
            status_code=status.HTTP_403_FORBIDDEN,
            detail="User account is disabled",
        )
    return user
async def get_optional_user(
    credentials: Optional[HTTPAuthorizationCredentials] = Depends(security),
    access_token: Optional[str] = Cookie(None),
    db: Session = Depends(get_db)
) -> Optional[User]:
    """Get the current user if authenticated, otherwise return None"""
    token = None
    if credentials:
        token = credentials.credentials
    elif access_token:
        token = access_token
    if not token:
        return None
    payload = verify_token(token)
    if not payload:
        return None
    user_id = payload.get("sub")
    if not user_id:
        return None
    # Treat a malformed "sub" claim the same as a missing one
    try:
        user_uuid = UUID(str(user_id))
    except ValueError:
        return None
    return db.query(User).filter(User.id == user_uuid, User.is_active.is_(True)).first()
@router.post("/signup", response_model=TokenResponse)
async def signup(user_data: UserSignUp, response: Response, db: Session = Depends(get_db)):
    """Register a new user"""
    # Check if email already exists
    existing_user = db.query(User).filter(User.email == user_data.email).first()
    if existing_user:
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail="Email already registered"
        )
    # Create new user
    user = User(
        email=user_data.email,
        display_name=user_data.display_name,
        hashed_password=User.hash_password(user_data.password),
        role="user",
        is_active=True,
    )
    db.add(user)
    db.commit()
    db.refresh(user)
    # Create access token
    access_token = create_access_token(data={"sub": str(user.id)})
    # Set cookie
    response.set_cookie(
        key="access_token",
        value=access_token,
        httponly=True,
        max_age=ACCESS_TOKEN_EXPIRE_MINUTES * 60,
        samesite="lax",
        secure=False,  # Set to True in production with HTTPS
    )
    return TokenResponse(
        access_token=access_token,
        expires_in=ACCESS_TOKEN_EXPIRE_MINUTES * 60,
        user=UserResponse.model_validate(user)
    )
@router.post("/login", response_model=TokenResponse)
async def login(credentials: UserLogin, response: Response, db: Session = Depends(get_db)):
    """Login with email and password"""
    user = db.query(User).filter(User.email == credentials.email).first()
    if not user or not user.verify_password(credentials.password):
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid email or password"
        )
    if not user.is_active:
        raise HTTPException(
            status_code=status.HTTP_403_FORBIDDEN,
            detail="User account is disabled"
        )
    # Update last login
    user.last_login_at = datetime.utcnow()
    db.commit()
    # Create access token
    access_token = create_access_token(data={"sub": str(user.id)})
    # Set cookie
    response.set_cookie(
        key="access_token",
        value=access_token,
        httponly=True,
        max_age=ACCESS_TOKEN_EXPIRE_MINUTES * 60,
        samesite="lax",
        secure=False,  # Set to True in production with HTTPS
    )
    return TokenResponse(
        access_token=access_token,
        expires_in=ACCESS_TOKEN_EXPIRE_MINUTES * 60,
        user=UserResponse.model_validate(user)
    )
@router.post("/logout")
async def logout(response: Response):
    """Logout by clearing the access token cookie"""
    response.delete_cookie(key="access_token")
    return {"message": "Successfully logged out"}


@router.get("/me", response_model=UserResponse)
async def get_me(current_user: User = Depends(get_current_user)):
    """Get current authenticated user"""
    return current_user


@router.patch("/me", response_model=UserResponse)
async def update_me(
    user_data: UserUpdate,
    current_user: User = Depends(get_current_user),
    db: Session = Depends(get_db)
):
    """Update current user profile"""
    # Only allow updating certain fields
    allowed_fields = ["display_name", "avatar_url"]
    for key, value in user_data.model_dump(exclude_unset=True).items():
        if key in allowed_fields and value is not None:
            setattr(current_user, key, value)
    db.commit()
    db.refresh(current_user)
    return current_user


@router.post("/me/change-password")
async def change_password(
    password_data: PasswordChange,
    current_user: User = Depends(get_current_user),
    db: Session = Depends(get_db)
):
    """Change current user's password"""
    if not current_user.verify_password(password_data.current_password):
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail="Current password is incorrect"
        )
    current_user.hashed_password = User.hash_password(password_data.new_password)
    db.commit()
    return {"message": "Password changed successfully"}


@router.get("/verify")
async def verify_auth(current_user: User = Depends(get_current_user)):
    """Verify the current authentication token is valid"""
    return {"valid": True, "user_id": str(current_user.id)}
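The token round trip behind `create_access_token` and `verify_token` can be illustrated without third-party packages. The sketch below is a standard-library stand-in for HS256 signing, not the `jose` implementation used above; `encode_hs256` and `decode_hs256` are hypothetical names, and `exp` is assumed to be a Unix timestamp in seconds.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode a base64url segment, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def encode_hs256(payload: dict, secret: str) -> str:
    """Build header.payload.signature for an HS256 token."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def decode_hs256(token: str, secret: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the payload if the signature matches and `exp` has not passed."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        expected = _sign(f"{header_b64}.{payload_b64}", secret)
        # Constant-time comparison avoids leaking signature bytes via timing
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        payload = json.loads(_b64url_decode(payload_b64))
        if not isinstance(payload, dict):
            return None
        current = time.time() if now is None else now
        if "exp" in payload and current >= payload["exp"]:
            return None
        return payload
    except (ValueError, TypeError):
        return None
```

Returning `None` for every failure mode mirrors how `verify_token` collapses all `JWTError` cases into a single "invalid or expired" result.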

133
backend/app/api/v1/jobs.py Normal file

@ -0,0 +1,133 @@
"""Job API Routes"""
from fastapi import APIRouter, Depends, HTTPException, BackgroundTasks
from sqlalchemy.orm import Session
from typing import List, Optional
from uuid import UUID
from datetime import datetime
from app.database import get_db
from app.models.job import Job
from app.models.user import User
from app.schemas.job import JobCreate, JobResponse, JobUpdate
from app.services.job_processor import process_job
router = APIRouter()
@router.get("/")
def get_jobs(
    page: int = 1,
    limit: int = 50,
    status: Optional[str] = None,
    module: Optional[str] = None,
    db: Session = Depends(get_db)
):
    """Get all jobs with optional filtering and pagination"""
    query = db.query(Job)
    if status:
        query = query.filter(Job.status == status)
    if module:
        query = query.filter(Job.module == module)
    # Get total count
    total = query.count()
    # Calculate offset from the 1-based page; clamp so page < 1 cannot
    # produce a negative OFFSET
    skip = (max(page, 1) - 1) * limit
    jobs = query.order_by(Job.created_at.desc()).offset(skip).limit(limit).all()
    return {
        "items": [
            {
                "id": str(job.id),
                "module": job.module,
                "action": job.action,
                "status": job.status,
                "progress": job.progress or 0,
                "input_data": job.input_data,
                "output_data": job.output_data,
                "input_asset_ids": [str(a) for a in job.input_asset_ids] if job.input_asset_ids else None,
                "output_asset_ids": [str(a) for a in job.output_asset_ids] if job.output_asset_ids else None,
                "error_message": job.error_message,
                "api_provider": job.api_provider,
                "api_model": job.api_model,
                "created_at": job.created_at.isoformat() if job.created_at else None,
                "completed_at": job.completed_at.isoformat() if job.completed_at else None,
            }
            for job in jobs
        ],
        "total": total,
        "page": page,
        "limit": limit
    }
@router.get("/{job_id}", response_model=JobResponse)
def get_job(job_id: UUID, db: Session = Depends(get_db)):
    """Get job by ID"""
    job = db.query(Job).filter(Job.id == job_id).first()
    if not job:
        raise HTTPException(status_code=404, detail="Job not found")
    return job


@router.post("/", response_model=JobResponse)
def create_job(
    job: JobCreate,
    background_tasks: BackgroundTasks,
    db: Session = Depends(get_db)
):
    """Create a new job and queue it for processing"""
    # Get test user if no user_id provided
    if not job.user_id:
        user = db.query(User).filter(User.email == "test@forge.ai").first()
        if user:
            job.user_id = user.id
    # Create job
    db_job = Job(
        **job.model_dump(),
        status="queued",
        queued_at=datetime.utcnow()
    )
    db.add(db_job)
    db.commit()
    db.refresh(db_job)
    # Queue for background processing
    background_tasks.add_task(process_job, str(db_job.id))
    return db_job


@router.patch("/{job_id}", response_model=JobResponse)
def update_job(job_id: UUID, job: JobUpdate, db: Session = Depends(get_db)):
    """Update a job"""
    db_job = db.query(Job).filter(Job.id == job_id).first()
    if not db_job:
        raise HTTPException(status_code=404, detail="Job not found")
    for key, value in job.model_dump(exclude_unset=True).items():
        setattr(db_job, key, value)
    db.commit()
    db.refresh(db_job)
    return db_job


@router.delete("/{job_id}")
def cancel_job(job_id: UUID, db: Session = Depends(get_db)):
    """Cancel a job"""
    db_job = db.query(Job).filter(Job.id == job_id).first()
    if not db_job:
        raise HTTPException(status_code=404, detail="Job not found")
    if db_job.status in ["completed", "failed"]:
        raise HTTPException(status_code=400, detail="Cannot cancel completed or failed job")
    db_job.status = "cancelled"
    db.commit()
    return {"message": "Job cancelled"}
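The pagination arithmetic in `get_jobs` and the status guard in `cancel_job` can be checked in isolation. This is a small sketch under assumptions: `page_window`, `can_cancel`, and the `max_limit` cap of 100 are hypothetical and do not appear in the routes above; only the status names come from the source.

```python
from typing import Tuple

# Statuses that cancel_job refuses to cancel
TERMINAL_STATUSES = frozenset({"completed", "failed"})


def page_window(page: int, limit: int, max_limit: int = 100) -> Tuple[int, int]:
    """Return (offset, limit) for a 1-based page number, clamping bad input."""
    safe_page = max(page, 1)
    safe_limit = min(max(limit, 1), max_limit)
    return (safe_page - 1) * safe_limit, safe_limit


def can_cancel(status: str) -> bool:
    """A job can be cancelled unless it has already finished."""
    return status not in TERMINAL_STATUSES
```

For example, `page_window(3, 20)` yields an offset of 40 with a limit of 20, which matches the `(page - 1) * limit` rule used by the endpoint.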


@ -0,0 +1,821 @@
"""Module API Routes - All AI processing endpoints"""
from fastapi import APIRouter, Depends, HTTPException, UploadFile, File, Form, BackgroundTasks, Body
from sqlalchemy.orm import Session
from typing import Optional, List
from uuid import UUID
from pydantic import BaseModel
import json
from app.database import get_db
from app.models.job import Job
from app.models.user import User
from app.services import (
image_generator,
image_upscaler,
background_remover,
video_generator,
video_upscaler,
subtitle_processor,
voice_to_text,
text_to_speech,
alt_text_generator,
prompt_studio,
markdown_tools,
sound_effects
)
router = APIRouter()
# ============== REQUEST MODELS ==============
class ImageGenerateRequest(BaseModel):
prompt: str
provider: str = "openai"
model: Optional[str] = None
width: int = 1024
height: int = 1024
style: Optional[str] = None
quality: Optional[str] = None
negative_prompt: Optional[str] = None
aspect_ratio: Optional[str] = None
style_preset: Optional[str] = None
# For iterative editing (Nano Banana/Gemini)
reference_asset_id: Optional[str] = None
class VideoGenerateRequest(BaseModel):
prompt: str
provider: str = "runway"
model: Optional[str] = None
duration: int = 5
aspect_ratio: str = "16:9"
resolution: str = "1280x768"
# Runway specific
camera_control: Optional[dict] = None
frame_position: str = "first"
# Veo specific
first_frame_asset_id: Optional[str] = None
last_frame_asset_id: Optional[str] = None
reference_asset_ids: Optional[List[str]] = None
# Input image
input_asset_id: Optional[str] = None
class TextToSpeechRequest(BaseModel):
text: str
voice_id: str = "21m00Tcm4TlvDq8ikWAM"
model_id: str = "eleven_multilingual_v2"
stability: float = 0.5
similarity_boost: float = 0.5
style: float = 0.0
use_speaker_boost: bool = True
speed: float = 1.0
output_format: str = "mp3_44100_128"
class SoundEffectRequest(BaseModel):
text: str
duration_seconds: Optional[float] = None
prompt_influence: float = 0.3
loop: bool = False
output_format: str = "mp3_44100_128"
class PromptEnhanceRequest(BaseModel):
prompt: str
style: str = "cinematic"
provider: str = "openai"
include_negative: bool = True
include_technical: bool = True
language: str = "en"
class MermaidRenderRequest(BaseModel):
code: str
output_format: str = "svg"
theme: str = "default"
background: str = "transparent"
class MermaidGenerateRequest(BaseModel):
description: str
diagram_type: str = "flowchart"
style: str = "detailed"
render: bool = True
class MarkdownConvertRequest(BaseModel):
content: str
output_format: str = "html"
theme: str = "github"
class MarkdownGenerateRequest(BaseModel):
topic: str
content_type: str = "article"
length: str = "medium"
include_toc: bool = True
# ============== IMAGE MODULES ==============
def job_response(job: Job) -> dict:
    """Format job for API response"""
    return {
        "id": str(job.id),
        "module": job.module,
        "action": job.action,
        "status": job.status,
        "progress": job.progress or 0,
        "input_data": job.input_data,
        "output_data": job.output_data,
        "input_asset_ids": [str(a) for a in job.input_asset_ids] if job.input_asset_ids else None,
        "output_asset_ids": [str(a) for a in job.output_asset_ids] if job.output_asset_ids else None,
        "error_message": job.error_message,
        "api_provider": job.api_provider,
        "api_model": job.api_model,
        "created_at": job.created_at.isoformat() if job.created_at else None,
        "completed_at": job.completed_at.isoformat() if job.completed_at else None,
    }


@router.post("/image/generate")
async def generate_image(
    request: ImageGenerateRequest,
    background_tasks: BackgroundTasks,
    db: Session = Depends(get_db)
):
    """Generate an image using various AI providers

    Providers: openai, dalle3, stable-diffusion, leonardo, ideogram, flux, gemini, nano-banana
    Supports iterative editing with reference_asset_id for nano-banana/gemini providers
    """
    from app.models.asset import Asset
    import base64
    import os

    user = db.query(User).filter(User.email == "test@forge.ai").first()
    input_data = request.model_dump(exclude_none=True)
    # If reference_asset_id is provided, load the image and convert to base64
    if request.reference_asset_id:
        asset = db.query(Asset).filter(Asset.id == request.reference_asset_id).first()
        if asset and asset.file_path and os.path.exists(asset.file_path):
            with open(asset.file_path, "rb") as f:
                image_data = f.read()
            # Convert to base64 for the generator
            input_data["reference_image"] = base64.b64encode(image_data).decode("utf-8")
        # Remove reference_asset_id from input_data (we've converted it);
        # pop() is safe even if the key is absent
        input_data.pop("reference_asset_id", None)
    job = Job(
        user_id=user.id if user else None,
        module="image_generator",
        action="generate",
        input_data=input_data,
        status="queued",
        progress=0
    )
    db.add(job)
    db.commit()
    db.refresh(job)
    background_tasks.add_task(image_generator.generate, str(job.id))
    return job_response(job)
@router.post("/image/upscale")
async def upscale_image(
file: UploadFile = File(...),
scale: int = Form(2),
model: str = Form("auto"),
face_enhancement: bool = Form(False),
noise_reduction: Optional[int] = Form(None),
sharpening: Optional[int] = Form(None),
compression_recovery: Optional[int] = Form(None),
detail_enhancement: Optional[int] = Form(None),
preserve_grain: bool = Form(False),
output_format: str = Form("png"),
background_tasks: BackgroundTasks = None,
db: Session = Depends(get_db)
):
"""Upscale an image using Topaz Labs
Models: proteus, artemis, gaia, iris, nyx, rhea, theia, auto
"""
user = db.query(User).filter(User.email == "test@forge.ai").first()
from app.api.v1.assets import upload_asset
asset = await upload_asset(file=file, source_module="image_upscaler", db=db)
job = Job(
user_id=user.id if user else None,
module="image_upscaler",
action="upscale",
input_data={
"scale": scale,
"model": model,
"face_enhancement": face_enhancement,
"noise_reduction": noise_reduction,
"sharpening": sharpening,
"compression_recovery": compression_recovery,
"detail_enhancement": detail_enhancement,
"preserve_grain": preserve_grain,
"output_format": output_format
},
input_asset_ids=[asset.id],
status="queued"
)
db.add(job)
db.commit()
db.refresh(job)
if background_tasks:
background_tasks.add_task(image_upscaler.upscale, str(job.id))
return job_response(job)
@router.post("/image/remove-background")
async def remove_background(
file: UploadFile = File(...),
output_format: str = Form("png"),
background_tasks: BackgroundTasks = None,
db: Session = Depends(get_db)
):
"""Remove background from image"""
user = db.query(User).filter(User.email == "test@forge.ai").first()
from app.api.v1.assets import upload_asset
asset = await upload_asset(file=file, source_module="background_remover", db=db)
job = Job(
user_id=user.id if user else None,
module="background_remover",
action="remove",
input_data={"output_format": output_format},
input_asset_ids=[asset.id],
status="queued"
)
db.add(job)
db.commit()
db.refresh(job)
if background_tasks:
background_tasks.add_task(background_remover.remove_background, str(job.id))
return job_response(job)
# ============== VIDEO MODULES ==============
@router.post("/video/generate")
async def generate_video(
request: VideoGenerateRequest,
background_tasks: BackgroundTasks,
db: Session = Depends(get_db)
):
"""Generate video using Runway or Google Veo
Runway: gen3_alpha, gen3_alpha_turbo, gen4
Veo: veo-3.1-generate-preview, veo-3.1-fast
"""
user = db.query(User).filter(User.email == "test@forge.ai").first()
input_asset_ids = []
if request.input_asset_id:
input_asset_ids.append(UUID(request.input_asset_id))
job = Job(
user_id=user.id if user else None,
module="video_generator",
action="generate",
input_data=request.model_dump(exclude_none=True),
input_asset_ids=input_asset_ids if input_asset_ids else None,
status="queued"
)
db.add(job)
db.commit()
db.refresh(job)
background_tasks.add_task(video_generator.generate, str(job.id))
return job_response(job)
@router.post("/video/upscale")
async def upscale_video(
file: UploadFile = File(...),
scale: int = Form(2),
model: str = Form("auto"),
frame_interpolation: int = Form(1),
background_tasks: BackgroundTasks = None,
db: Session = Depends(get_db)
):
"""Upscale video using Topaz Labs"""
user = db.query(User).filter(User.email == "test@forge.ai").first()
from app.api.v1.assets import upload_asset
asset = await upload_asset(file=file, source_module="video_upscaler", db=db)
job = Job(
user_id=user.id if user else None,
module="video_upscaler",
action="upscale",
input_data={
"scale": scale,
"model": model,
"frame_interpolation": frame_interpolation
},
input_asset_ids=[asset.id],
status="queued"
)
db.add(job)
db.commit()
db.refresh(job)
if background_tasks:
background_tasks.add_task(video_upscaler.upscale, str(job.id))
return job_response(job)
@router.get("/video/subtitles/config")
async def get_subtitle_config():
"""Get available subtitle configuration options"""
return subtitle_processor.get_subtitle_config()
@router.post("/video/subtitles")
async def generate_subtitles(
file: UploadFile = File(...),
source_language: str = Form("auto"),
target_language: Optional[str] = Form(None),
burn_subtitles: bool = Form(False),
whisper_model: str = Form("base"),
output_format: str = Form("srt"),
# Styling options
font: str = Form("Arial"),
font_size: int = Form(24),
text_color: str = Form("white"),
outline_color: str = Form("black"),
outline_width: float = Form(2.0),
background_color: Optional[str] = Form(None),
background_opacity: float = Form(0.0),
position: str = Form("bottom"),
alignment: str = Form("center"),
margin_v: int = Form(30),
margin_h: int = Form(20),
shadow: int = Form(0),
bold: bool = Form(False),
italic: bool = Form(False),
font_preset: Optional[str] = Form(None),
word_timestamps: bool = Form(False),
background_tasks: BackgroundTasks = None,
db: Session = Depends(get_db)
):
"""
Generate subtitles for video using Whisper + DeepL
Parameters:
- source_language: Source language code or "auto" for detection
- target_language: Target language code for translation (optional)
- burn_subtitles: Whether to burn subtitles into video
- whisper_model: Whisper model (tiny/base/small/medium/large/large-v2/large-v3)
- output_format: Output format (srt/vtt/ass)
Styling (for burning):
- font: Font family name
- font_size: Font size in points
- text_color: Primary text color
- outline_color: Text outline color
- outline_width: Outline thickness (0-5)
- background_color: Background box color
- background_opacity: Background opacity (0-1)
- position: Vertical position (bottom/top/center)
- alignment: Horizontal alignment (left/center/right)
- margin_v: Vertical margin from edge
- margin_h: Horizontal margin
- shadow: Shadow depth (0-4)
- bold: Use bold text
- italic: Use italic text
- font_preset: Predefined style preset (default/cinematic/documentary/news/social_media/minimal/bold)
- word_timestamps: Include word-level timestamps
"""
user = db.query(User).filter(User.email == "test@forge.ai").first()
from app.api.v1.assets import upload_asset
asset = await upload_asset(file=file, source_module="subtitle_processor", db=db)
job = Job(
user_id=user.id if user else None,
module="subtitle_processor",
action="generate",
input_data={
"source_language": source_language,
"target_language": target_language,
"burn_subtitles": burn_subtitles,
"whisper_model": whisper_model,
"output_format": output_format,
"font": font,
"font_size": font_size,
"text_color": text_color,
"outline_color": outline_color,
"outline_width": outline_width,
"background_color": background_color,
"background_opacity": background_opacity,
"position": position,
"alignment": alignment,
"margin_v": margin_v,
"margin_h": margin_h,
"shadow": shadow,
"bold": bold,
"italic": italic,
"font_preset": font_preset,
"word_timestamps": word_timestamps
},
input_asset_ids=[asset.id],
status="queued"
)
db.add(job)
db.commit()
db.refresh(job)
if background_tasks:
background_tasks.add_task(subtitle_processor.process, str(job.id))
return job_response(job)
# ============== AUDIO MODULES ==============
@router.post("/audio/voice-to-text")
async def transcribe_audio(
file: UploadFile = File(...),
output_format: str = Form("txt"),
translate: bool = Form(False),
target_language: str = Form("EN-US"),
background_tasks: BackgroundTasks = None,
db: Session = Depends(get_db)
):
"""Transcribe audio to text using Whisper"""
user = db.query(User).filter(User.email == "test@forge.ai").first()
from app.api.v1.assets import upload_asset
asset = await upload_asset(file=file, source_module="voice_to_text", db=db)
job = Job(
user_id=user.id if user else None,
module="voice_to_text",
action="transcribe",
input_data={
"output_format": output_format,
"translate": translate,
"target_language": target_language
},
input_asset_ids=[asset.id],
status="queued"
)
db.add(job)
db.commit()
db.refresh(job)
if background_tasks:
background_tasks.add_task(voice_to_text.transcribe, str(job.id))
return job_response(job)
@router.post("/audio/text-to-speech")
async def synthesize_speech(
request: TextToSpeechRequest,
background_tasks: BackgroundTasks,
db: Session = Depends(get_db)
):
"""Convert text to speech using ElevenLabs
Models: eleven_multilingual_v2, eleven_flash_v2_5, eleven_turbo_v2_5, eleven_v3
"""
user = db.query(User).filter(User.email == "test@forge.ai").first()
job = Job(
user_id=user.id if user else None,
module="text_to_speech",
action="synthesize",
input_data=request.model_dump(),
status="queued"
)
db.add(job)
db.commit()
db.refresh(job)
background_tasks.add_task(text_to_speech.synthesize, str(job.id))
return job_response(job)
@router.post("/audio/speech-to-speech")
async def convert_voice(
file: UploadFile = File(...),
voice_id: str = Form(...),
background_tasks: BackgroundTasks = None,
db: Session = Depends(get_db)
):
"""Convert voice to another voice using ElevenLabs"""
user = db.query(User).filter(User.email == "test@forge.ai").first()
from app.api.v1.assets import upload_asset
asset = await upload_asset(file=file, source_module="speech_to_speech", db=db)
job = Job(
user_id=user.id if user else None,
module="speech_to_speech",
action="convert",
input_data={"voice_id": voice_id},
input_asset_ids=[asset.id],
status="queued"
)
db.add(job)
db.commit()
db.refresh(job)
if background_tasks:
background_tasks.add_task(text_to_speech.speech_to_speech, str(job.id))
return job_response(job)
@router.post("/audio/sound-effects")
async def generate_sound_effect(
request: SoundEffectRequest,
background_tasks: BackgroundTasks,
db: Session = Depends(get_db)
):
"""Generate sound effects from text description using ElevenLabs
Describe the sound you want - explosions, footsteps, ambient sounds, etc.
Max duration: 22 seconds
"""
user = db.query(User).filter(User.email == "test@forge.ai").first()
job = Job(
user_id=user.id if user else None,
module="sound_effects",
action="generate",
input_data=request.model_dump(),
status="queued"
)
db.add(job)
db.commit()
db.refresh(job)
background_tasks.add_task(sound_effects.generate_sound_effect_job, str(job.id))
return job_response(job)
@router.get("/audio/sound-effects/formats")
async def get_sound_effect_formats():
"""Get available output formats for sound effects"""
generator = sound_effects.get_sound_effects_generator()
return await generator.get_available_formats()
# ============== TEXT MODULES ==============
@router.post("/text/alt-text")
async def generate_alt_text(
file: UploadFile = File(...),
background_tasks: BackgroundTasks = None,
db: Session = Depends(get_db)
):
"""Generate alt text for image using GPT-4 Vision"""
user = db.query(User).filter(User.email == "test@forge.ai").first()
from app.api.v1.assets import upload_asset
asset = await upload_asset(file=file, source_module="alt_text_generator", db=db)
job = Job(
user_id=user.id if user else None,
module="alt_text_generator",
action="generate",
input_data={},
input_asset_ids=[asset.id],
status="queued"
)
db.add(job)
db.commit()
db.refresh(job)
if background_tasks:
background_tasks.add_task(alt_text_generator.generate, str(job.id))
return job_response(job)
@router.post("/text/enhance-prompt")
async def enhance_prompt(
request: PromptEnhanceRequest,
db: Session = Depends(get_db)
):
"""Enhance a prompt using AI (Gemini/OpenAI)
Styles: cinematic, photographic, artistic, product, fantasy, minimal,
vintage, futuristic, anime, portrait, landscape, abstract,
fashion, architecture, food
Providers: openai, gpt-image-1, stable-diffusion, midjourney, flux, leonardo
"""
result = await prompt_studio.enhance(
prompt=request.prompt,
style=request.style,
provider=request.provider,
include_negative=request.include_negative,
include_technical=request.include_technical,
language=request.language
)
return result
@router.get("/text/prompt-styles")
async def get_prompt_styles():
"""Get available prompt enhancement styles"""
return prompt_studio.get_available_styles()
# ============== MARKDOWN & MERMAID MODULES ==============
@router.post("/text/mermaid/render")
async def render_mermaid_diagram(request: MermaidRenderRequest):
"""Render Mermaid diagram code to SVG/PNG
Themes: default, dark, forest, neutral
Formats: svg, png
"""
result = await markdown_tools.render_mermaid(
code=request.code,
output_format=request.output_format,
theme=request.theme,
background=request.background
)
return result
@router.post("/text/mermaid/generate")
async def generate_mermaid_diagram(request: MermaidGenerateRequest):
"""Generate Mermaid diagram from natural language description
Diagram types: flowchart, sequence, class, state, er, journey,
gantt, pie, mindmap, timeline, gitgraph
Styles: simple, detailed, complex
"""
result = await markdown_tools.generate_mermaid_with_ai(
description=request.description,
diagram_type=request.diagram_type,
style=request.style
)
# Optionally render the diagram
if request.render and result.get("success") and result.get("code"):
render_result = await markdown_tools.render_mermaid(result["code"])
result["rendered"] = render_result
return result
@router.get("/text/mermaid/templates")
async def get_mermaid_templates():
"""Get available Mermaid diagram templates"""
return markdown_tools.get_mermaid_templates()
@router.get("/text/mermaid/templates/{diagram_type}")
async def get_mermaid_template(diagram_type: str):
"""Get a specific Mermaid template"""
template = markdown_tools.get_mermaid_template(diagram_type)
if not template:
raise HTTPException(status_code=404, detail=f"Template not found: {diagram_type}")
return template
@router.post("/text/markdown/convert")
async def convert_markdown(request: MarkdownConvertRequest):
"""Convert Markdown to HTML or plain text
Output formats: html, plain
Themes: github (for HTML)
"""
result = await markdown_tools.convert_markdown(
content=request.content,
output_format=request.output_format,
theme=request.theme
)
return result
@router.post("/text/markdown/generate")
async def generate_markdown_content(request: MarkdownGenerateRequest):
"""Generate Markdown content using AI
Content types: article, documentation, readme, tutorial, report
Length: short, medium, long
"""
result = await markdown_tools.generate_markdown_with_ai(
topic=request.topic,
content_type=request.content_type,
length=request.length,
include_toc=request.include_toc
)
return result
# ============== UTILITY ENDPOINTS ==============
@router.get("/voices")
async def get_elevenlabs_voices():
"""Get available ElevenLabs voices"""
voices = await text_to_speech.get_voices()
return voices
@router.get("/models/{provider}")
async def get_provider_models(provider: str):
"""Get available models for a provider"""
models = {
# Image providers
"openai": ["gpt-image-1", "dall-e-3", "dall-e-2"],
"stable-diffusion": ["sd3-large", "sd3-medium", "sdxl-1.0", "stable-cascade"],
"leonardo": ["phoenix-1", "kino-xl", "anime-xl"],
"ideogram": ["V_2", "V_2_TURBO"],
"flux": ["flux-pro-1.1", "flux-dev", "flux-schnell"],
"gemini": ["gemini-2.0-flash-exp"],
# Video providers
"runway": ["gen3_alpha", "gen3_alpha_turbo", "gen4"],
"veo": [
"veo-3.1-generate-preview",
"veo-3.1-fast-generate-preview",
"veo-3.0-generate-001",
"veo-3.0-fast-generate-001",
"veo-2.0-generate-001"
],
# Upscaling
"topaz-image": ["proteus", "artemis", "gaia", "iris", "nyx", "rhea", "theia", "auto"],
"topaz-video": ["auto", "proteus", "artemis"],
# Audio
"elevenlabs": [
"eleven_multilingual_v2",
"eleven_flash_v2_5",
"eleven_turbo_v2_5",
"eleven_v3",
"eleven_monolingual_v1"
]
}
return models.get(provider, [])
@router.get("/models")
async def get_all_models():
"""Get all available models organized by category"""
return {
"image": {
"openai": {
"models": ["gpt-image-1", "dall-e-3"],
"default": "gpt-image-1",
"features": ["quality", "background", "transparent"]
},
"stable-diffusion": {
"models": ["sd3-large", "sd3-medium", "sdxl-1.0"],
"default": "sd3-large",
"features": ["negative_prompt", "style_preset", "img2img"]
},
"flux": {
"models": ["flux-pro-1.1", "flux-dev", "flux-schnell"],
"default": "flux-pro-1.1",
"features": ["img2img"]
}
},
"video": {
"runway": {
"models": ["gen3_alpha", "gen3_alpha_turbo", "gen4"],
"default": "gen3_alpha_turbo",
"features": ["camera_control", "image_to_video"]
},
"veo": {
"models": ["veo-3.1-generate-preview", "veo-3.1-fast-generate-preview", "veo-3.0-generate-001"],
"default": "veo-3.1-generate-preview",
"features": ["audio", "reference_images", "video_extension", "frame_interpolation"]
}
},
"audio": {
"elevenlabs": {
"models": ["eleven_multilingual_v2", "eleven_flash_v2_5", "eleven_turbo_v2_5", "eleven_v3"],
"default": "eleven_multilingual_v2",
"features": ["32_languages", "voice_cloning", "voice_settings"]
}
}
}
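The model listings above could also be used to validate a request before a job is queued. The following is a sketch only: `resolve_model` is a hypothetical helper, the registry is a small excerpt of the lists returned by `/models/{provider}`, and falling back to the first listed model is an assumption (the `/models` endpoint declares its own defaults, which may differ).

```python
from typing import Dict, List, Optional

# Excerpt of the provider -> models mapping shown in get_provider_models
PROVIDER_MODELS: Dict[str, List[str]] = {
    "openai": ["gpt-image-1", "dall-e-3", "dall-e-2"],
    "flux": ["flux-pro-1.1", "flux-dev", "flux-schnell"],
    "runway": ["gen3_alpha", "gen3_alpha_turbo", "gen4"],
}


def resolve_model(provider: str, model: Optional[str] = None) -> str:
    """Return a valid model name for `provider`, or raise ValueError."""
    models = PROVIDER_MODELS.get(provider)
    if not models:
        raise ValueError(f"Unknown provider: {provider}")
    if model is None:
        # Assumption: treat the first listed model as the fallback
        return models[0]
    if model not in models:
        raise ValueError(f"Model {model!r} is not available for {provider}")
    return model
```

Rejecting unknown combinations up front would surface the problem to the caller immediately, rather than as a failed background job.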


@ -0,0 +1,61 @@
"""User API Routes"""
from fastapi import APIRouter, Depends, HTTPException
from sqlalchemy.orm import Session
from typing import List
from uuid import UUID
from app.database import get_db
from app.models.user import User
from app.schemas.user import UserCreate, UserResponse, UserUpdate
router = APIRouter()
@router.get("/", response_model=List[UserResponse])
def get_users(skip: int = 0, limit: int = 100, db: Session = Depends(get_db)):
"""Get all users"""
users = db.query(User).offset(skip).limit(limit).all()
return users
@router.get("/me", response_model=UserResponse)
def get_current_user(db: Session = Depends(get_db)):
"""Get current user (test user for now)"""
user = db.query(User).filter(User.email == "test@forge.ai").first()
if not user:
raise HTTPException(status_code=404, detail="User not found")
return user
@router.get("/{user_id}", response_model=UserResponse)
def get_user(user_id: UUID, db: Session = Depends(get_db)):
"""Get user by ID"""
user = db.query(User).filter(User.id == user_id).first()
if not user:
raise HTTPException(status_code=404, detail="User not found")
return user
@router.post("/", response_model=UserResponse)
def create_user(user: UserCreate, db: Session = Depends(get_db)):
"""Create a new user"""
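# UserCreate carries a plain `password`, but the model column is `hashed_password`
data = user.model_dump(exclude={"password"})
data["hashed_password"] = User.hash_password(user.password) if user.password else None
db_user = User(**data)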
db.add(db_user)
db.commit()
db.refresh(db_user)
return db_user
@router.patch("/{user_id}", response_model=UserResponse)
def update_user(user_id: UUID, user: UserUpdate, db: Session = Depends(get_db)):
"""Update a user"""
db_user = db.query(User).filter(User.id == user_id).first()
if not db_user:
raise HTTPException(status_code=404, detail="User not found")
for key, value in user.model_dump(exclude_unset=True).items():
setattr(db_user, key, value)
db.commit()
db.refresh(db_user)
return db_user
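The PATCH handler relies on `model_dump(exclude_unset=True)` so that only fields the client actually sent are copied onto the row. The following sketch shows that behaviour with the standard library only. `FakeUser` and `apply_partial_update` are hypothetical stand-ins for the ORM row and the loop in `update_user`.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class FakeUser:
    """Hypothetical stand-in for the ORM row."""
    display_name: Optional[str] = None
    department: Optional[str] = None
    is_active: bool = True


def apply_partial_update(target: Any, changes: Dict[str, Any]) -> Any:
    """Copy only the supplied fields onto target, leaving the rest untouched."""
    for key, value in changes.items():
        setattr(target, key, value)
    return target
```

An explicit `None` in the payload clears a field, while a key that is simply absent keeps its stored value. That distinction is what `exclude_unset=True` preserves.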

61
backend/app/config.py Normal file

@ -0,0 +1,61 @@
"""FORGE AI Configuration"""
from pydantic_settings import BaseSettings
from functools import lru_cache
import os
class Settings(BaseSettings):
# App
app_name: str = "FORGE AI"
app_version: str = "1.0.0"
debug: bool = False
# Database
database_url: str = "postgresql://forge_user:forge_secure_password_2024@localhost:5452/forge_ai"
# Redis
redis_url: str = "redis://localhost:6399"
# Storage
storage_path: str = "/app/storage"
# API Keys (loaded from environment)
openai_api_key: str = ""
anthropic_api_key: str = ""
google_api_key: str = ""
elevenlabs_api_key: str = ""
topaz_api_key: str = ""
runway_api_key: str = ""
deepl_api_key: str = ""
clipping_magic_api_key: str = ""
stability_api_key: str = ""
leonardo_api_key: str = ""
ideogram_api_key: str = ""
bria_api_key: str = ""
flux_api_key: str = ""
# Google Cloud
gcs_bucket_name: str = ""
gcs_project_id: str = ""
# Azure AD
azure_client_id: str = ""
azure_tenant_id: str = ""
azure_authority: str = ""
# JWT
jwt_secret_key: str = "forge-ai-secret-key-change-in-production"
jwt_algorithm: str = "HS256"
jwt_expire_minutes: int = 60 * 24 * 7 # 7 days
class Config:
env_file = ".env"
extra = "ignore"
@lru_cache()
def get_settings() -> Settings:
return Settings()
settings = get_settings()
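Because `get_settings()` is wrapped in `lru_cache`, the settings object is built once and reused. Environment changes made after the first call are not seen until the cache is cleared. A minimal standard-library sketch of that behaviour follows. `DemoSettings` and `get_demo_settings` are illustrative names, and the real class uses `pydantic_settings` rather than a dataclass.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class DemoSettings:
    app_name: str
    debug: bool


@lru_cache()
def get_demo_settings() -> DemoSettings:
    """Read the environment once; later calls return the cached instance."""
    return DemoSettings(
        app_name=os.environ.get("APP_NAME", "FORGE AI"),
        debug=os.environ.get("DEBUG", "false").lower() == "true",
    )
```

In tests that override environment variables, calling `get_demo_settings.cache_clear()` (or the equivalent on the real function) before each case avoids stale values.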

28
backend/app/database.py Normal file

@ -0,0 +1,28 @@
"""Database configuration and session management"""
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from app.config import settings
# Create engine
engine = create_engine(
settings.database_url,
pool_pre_ping=True,
pool_size=10,
max_overflow=20
)
# Create session factory
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
# Base class for models
Base = declarative_base()
def get_db():
"""Dependency for getting database session"""
db = SessionLocal()
try:
yield db
finally:
db.close()
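The `try`/`finally` around `yield` is what guarantees the session is closed whether the request handler succeeds or raises. The sketch below drives a generator of the same shape by hand. `FakeSession`, `get_fake_db` and `run_request` are hypothetical, and the driver only approximates what a dependency-injection framework does; it is not FastAPI's actual implementation.

```python
from typing import Iterator


class FakeSession:
    """Hypothetical session that records whether close() ran."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def get_fake_db(session: FakeSession) -> Iterator[FakeSession]:
    """Same shape as get_db: hand out the session, always close it afterwards."""
    try:
        yield session
    finally:
        session.close()


def run_request(session: FakeSession, fail: bool) -> None:
    """Drive the generator around one simulated request."""
    gen = get_fake_db(session)
    next(gen)  # advance to the yield, as if injecting the dependency
    try:
        if fail:
            raise RuntimeError("handler failed")
    except RuntimeError as exc:
        # Re-raise inside the generator so its finally block runs.
        try:
            gen.throw(exc)
        except RuntimeError:
            pass
    else:
        gen.close()
```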

73
backend/app/main.py Normal file

@ -0,0 +1,73 @@
"""FORGE AI - Main FastAPI Application"""
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.staticfiles import StaticFiles
from contextlib import asynccontextmanager
import os
from app.config import settings
from app.api.v1 import router as api_router
@asynccontextmanager
async def lifespan(app: FastAPI):
"""Startup and shutdown events"""
# Startup
print(f"🚀 Starting {settings.app_name} v{settings.app_version}")
# Ensure storage directories exist
storage_dirs = ["images", "videos", "audio", "documents", "temp"]
for dir_name in storage_dirs:
os.makedirs(os.path.join(settings.storage_path, dir_name), exist_ok=True)
yield
# Shutdown
print(f"👋 Shutting down {settings.app_name}")
# Create FastAPI app
app = FastAPI(
title=settings.app_name,
version=settings.app_version,
description="Unified AI Creative Platform - Image, Video, Audio, and Text Processing",
lifespan=lifespan
)
# CORS middleware
app.add_middleware(
CORSMiddleware,
allow_origins=[
"http://localhost:3020",
"http://localhost:3000",
"http://127.0.0.1:3020",
"https://ai-sandbox.oliver.solutions",
],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
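# Mount static files for storage access.
# This runs at import time, before lifespan() creates the subdirectories,
# so create the base directory here to avoid skipping the mount on a fresh install.
os.makedirs(settings.storage_path, exist_ok=True)
app.mount("/storage", StaticFiles(directory=settings.storage_path), name="storage")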
# Include API router
app.include_router(api_router, prefix="/api/v1")
@app.get("/")
async def root():
"""Root endpoint"""
return {
"name": settings.app_name,
"version": settings.app_version,
"status": "running",
"docs": "/docs"
}
@app.get("/health")
async def health_check():
"""Health check endpoint"""
return {"status": "healthy", "service": settings.app_name}
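The startup hook creates one subdirectory per media type and relies on `exist_ok=True` so that restarts are harmless. A small sketch of the same step as a reusable helper follows. `ensure_storage_dirs` is a hypothetical name; the directory list is copied from `lifespan()`.

```python
import os
import tempfile
from typing import List

STORAGE_DIRS = ["images", "videos", "audio", "documents", "temp"]


def ensure_storage_dirs(base_path: str) -> List[str]:
    """Create each storage subdirectory under base_path; safe to call repeatedly."""
    created = []
    for name in STORAGE_DIRS:
        path = os.path.join(base_path, name)
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created
```

Pointing the helper at a `tempfile.TemporaryDirectory()` keeps tests from touching the real storage path.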


@ -0,0 +1,9 @@
"""SQLAlchemy Models"""
from app.models.user import User
from app.models.project import Project
from app.models.asset import Asset
from app.models.job import Job
from app.models.usage import UsageLog
from app.models.api_key import APIKey
__all__ = ["User", "Project", "Asset", "Job", "UsageLog", "APIKey"]


@ -0,0 +1,21 @@
"""API Key Model"""
from sqlalchemy import Column, String, Boolean, DateTime, Integer, Numeric, Text
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.sql import func
import uuid
from app.database import Base
class APIKey(Base):
__tablename__ = "api_keys"
id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
provider = Column(String(100), nullable=False)
key_name = Column(String(255), nullable=False)
encrypted_key = Column(Text, nullable=False)
is_active = Column(Boolean, default=True)
rate_limit_per_minute = Column(Integer)
monthly_budget = Column(Numeric(10, 2))
current_month_usage = Column(Numeric(10, 2), default=0)
created_at = Column(DateTime(timezone=True), server_default=func.now())
updated_at = Column(DateTime(timezone=True), server_default=func.now(), onupdate=func.now())


@ -0,0 +1,47 @@
"""Asset Model"""
from sqlalchemy import Column, String, Boolean, DateTime, ForeignKey, Text, Integer, BigInteger, Numeric
from sqlalchemy.dialects.postgresql import UUID, JSONB
from sqlalchemy.orm import relationship
from sqlalchemy.sql import func
import uuid
from app.database import Base
class Asset(Base):
__tablename__ = "assets"
id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
user_id = Column(UUID(as_uuid=True), ForeignKey("users.id", ondelete="SET NULL"))
project_id = Column(UUID(as_uuid=True), ForeignKey("projects.id", ondelete="SET NULL"))
# File information
original_filename = Column(String(500))
stored_filename = Column(String(500), nullable=False)
file_path = Column(Text, nullable=False)
thumbnail_path = Column(Text) # Proxy thumbnail for fast UI loading
file_type = Column(String(50), nullable=False) # image, video, audio, document
mime_type = Column(String(100))
file_size_bytes = Column(BigInteger)
# Metadata
width = Column(Integer)
height = Column(Integer)
duration_seconds = Column(Numeric(10, 2))
asset_metadata = Column('metadata', JSONB, default=dict)  # callable default avoids sharing one dict object
# Source tracking
source_module = Column(String(100))
source_job_id = Column(UUID(as_uuid=True))
parent_asset_id = Column(UUID(as_uuid=True), ForeignKey("assets.id"))
# Status
is_temporary = Column(Boolean, default=False)
expires_at = Column(DateTime(timezone=True))
created_at = Column(DateTime(timezone=True), server_default=func.now())
updated_at = Column(DateTime(timezone=True), server_default=func.now(), onupdate=func.now())
# Relationships
user = relationship("User", back_populates="assets")
project = relationship("Project", back_populates="assets")
parent = relationship("Asset", remote_side=[id])

51
backend/app/models/job.py Normal file

@ -0,0 +1,51 @@
"""Job Model"""
from sqlalchemy import Column, String, Boolean, DateTime, ForeignKey, Text, Integer
from sqlalchemy.dialects.postgresql import UUID, JSONB, ARRAY
from sqlalchemy.orm import relationship
from sqlalchemy.sql import func
import uuid
from app.database import Base
class Job(Base):
__tablename__ = "jobs"
id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
user_id = Column(UUID(as_uuid=True), ForeignKey("users.id", ondelete="SET NULL"))
project_id = Column(UUID(as_uuid=True), ForeignKey("projects.id", ondelete="SET NULL"))
# Job details
module = Column(String(100), nullable=False)
action = Column(String(100), nullable=False)
priority = Column(Integer, default=5)
# Input/Output
input_data = Column(JSONB, nullable=False)
output_data = Column(JSONB)
input_asset_ids = Column(ARRAY(UUID(as_uuid=True)))
output_asset_ids = Column(ARRAY(UUID(as_uuid=True)))
# Status tracking
status = Column(String(50), default="pending")
progress = Column(Integer, default=0)
error_message = Column(Text)
retry_count = Column(Integer, default=0)
max_retries = Column(Integer, default=3)
# Timing
queued_at = Column(DateTime(timezone=True))
started_at = Column(DateTime(timezone=True))
completed_at = Column(DateTime(timezone=True))
estimated_duration_seconds = Column(Integer)
# API tracking
api_provider = Column(String(100))
api_model = Column(String(100))
api_request_id = Column(String(255))
created_at = Column(DateTime(timezone=True), server_default=func.now())
updated_at = Column(DateTime(timezone=True), server_default=func.now(), onupdate=func.now())
# Relationships
user = relationship("User", back_populates="jobs")
project = relationship("Project", back_populates="jobs")


@ -0,0 +1,24 @@
"""Project Model"""
from sqlalchemy import Column, String, Boolean, DateTime, ForeignKey, Text
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.orm import relationship
from sqlalchemy.sql import func
import uuid
from app.database import Base
class Project(Base):
__tablename__ = "projects"
id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
user_id = Column(UUID(as_uuid=True), ForeignKey("users.id", ondelete="SET NULL"))
name = Column(String(255), nullable=False)
description = Column(Text)
is_archived = Column(Boolean, default=False)
created_at = Column(DateTime(timezone=True), server_default=func.now())
updated_at = Column(DateTime(timezone=True), server_default=func.now(), onupdate=func.now())
# Relationships
user = relationship("User", back_populates="projects")
assets = relationship("Asset", back_populates="project")
jobs = relationship("Job", back_populates="project")


@ -0,0 +1,33 @@
"""Usage Log Model"""
from sqlalchemy import Column, String, DateTime, ForeignKey, Integer, Numeric
from sqlalchemy.dialects.postgresql import UUID, JSONB
from sqlalchemy.sql import func
import uuid
from app.database import Base
class UsageLog(Base):
__tablename__ = "usage_logs"
id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
user_id = Column(UUID(as_uuid=True), ForeignKey("users.id", ondelete="SET NULL"))
job_id = Column(UUID(as_uuid=True), ForeignKey("jobs.id", ondelete="SET NULL"))
# What was used
module = Column(String(100), nullable=False)
action = Column(String(100), nullable=False)
api_provider = Column(String(100))
api_model = Column(String(100))
# Metrics
tokens_input = Column(Integer)
tokens_output = Column(Integer)
api_credits_used = Column(Numeric(10, 4))
estimated_cost_usd = Column(Numeric(10, 4))
processing_time_ms = Column(Integer)
# Request details
request_metadata = Column(JSONB)
response_metadata = Column(JSONB)
created_at = Column(DateTime(timezone=True), server_default=func.now())


@ -0,0 +1,48 @@
"""User Model"""
from sqlalchemy import Column, String, Boolean, DateTime
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.orm import relationship
from sqlalchemy.sql import func
import uuid
from passlib.context import CryptContext
from app.database import Base
# Configure bcrypt password hashing
pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")
class User(Base):
__tablename__ = "users"
id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
azure_oid = Column(String(255), unique=True, nullable=True)
email = Column(String(255), unique=True, nullable=False)
hashed_password = Column(String(255), nullable=True) # Nullable for SSO users
display_name = Column(String(255))
avatar_url = Column(String)
role = Column(String(50), default="user")
department = Column(String(255))
is_active = Column(Boolean, default=True)
last_login_at = Column(DateTime(timezone=True))
created_at = Column(DateTime(timezone=True), server_default=func.now())
updated_at = Column(DateTime(timezone=True), server_default=func.now(), onupdate=func.now())
def verify_password(self, password: str) -> bool:
"""Verify a password against the hash"""
if not self.hashed_password:
return False
# Truncate to 72 bytes for bcrypt compatibility
password_bytes = password.encode('utf-8')[:72].decode('utf-8', errors='ignore')
return pwd_context.verify(password_bytes, self.hashed_password)
@staticmethod
def hash_password(password: str) -> str:
"""Hash a password (truncate to 72 bytes for bcrypt compatibility)"""
# bcrypt has a 72-byte limit on passwords
password_bytes = password.encode('utf-8')[:72].decode('utf-8', errors='ignore')
return pwd_context.hash(password_bytes)
# Relationships
projects = relationship("Project", back_populates="user")
assets = relationship("Asset", back_populates="user")
jobs = relationship("Job", back_populates="user")
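Both password helpers cut the UTF-8 encoding at 72 bytes and decode with `errors='ignore'`, which silently drops a multi-byte character if the cut lands in the middle of it. The sketch below isolates that step so the edge case is easy to see. `truncate_for_bcrypt` is an illustrative name, not a function in this commit.

```python
def truncate_for_bcrypt(password: str, limit: int = 72) -> str:
    """Trim a password to at most `limit` UTF-8 bytes without leaving a partial character."""
    raw = password.encode("utf-8")[:limit]
    # A cut in the middle of a multi-byte character leaves dangling bytes;
    # errors="ignore" discards them instead of raising.
    return raw.decode("utf-8", errors="ignore")
```

Because the same truncation runs in both `hash_password` and `verify_password`, passwords that differ only after the 72nd byte are treated as identical.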


@ -0,0 +1,10 @@
"""Pydantic Schemas"""
from app.schemas.user import UserCreate, UserResponse, UserUpdate
from app.schemas.job import JobCreate, JobResponse, JobUpdate
from app.schemas.asset import AssetCreate, AssetResponse
__all__ = [
"UserCreate", "UserResponse", "UserUpdate",
"JobCreate", "JobResponse", "JobUpdate",
"AssetCreate", "AssetResponse"
]


@ -0,0 +1,49 @@
"""Asset Schemas"""
from pydantic import BaseModel, Field
from typing import Optional, Dict, Any
from datetime import datetime
from uuid import UUID
class AssetBase(BaseModel):
original_filename: Optional[str] = None
file_type: str
mime_type: Optional[str] = None
source_module: Optional[str] = None
class AssetCreate(AssetBase):
user_id: Optional[UUID] = None
project_id: Optional[UUID] = None
stored_filename: str
file_path: str
file_size_bytes: Optional[int] = None
width: Optional[int] = None
height: Optional[int] = None
duration_seconds: Optional[float] = None
asset_metadata: Optional[Dict[str, Any]] = Field(default={}, alias="metadata")
source_job_id: Optional[UUID] = None
parent_asset_id: Optional[UUID] = None
class AssetResponse(AssetBase):
id: UUID
user_id: Optional[UUID] = None
project_id: Optional[UUID] = None
stored_filename: str
file_path: str
file_size_bytes: Optional[int] = None
width: Optional[int] = None
height: Optional[int] = None
duration_seconds: Optional[float] = None
asset_metadata: Dict[str, Any] = Field(default={}, serialization_alias="metadata")
source_job_id: Optional[UUID] = None
parent_asset_id: Optional[UUID] = None
is_temporary: bool = False
expires_at: Optional[datetime] = None
created_at: datetime
updated_at: datetime
class Config:
from_attributes = True
populate_by_name = True


@ -0,0 +1,48 @@
"""Job Schemas"""
from pydantic import BaseModel
from typing import Optional, List, Dict, Any
from datetime import datetime
from uuid import UUID
class JobBase(BaseModel):
module: str
action: str
priority: int = 5
input_data: Dict[str, Any]
class JobCreate(JobBase):
user_id: Optional[UUID] = None
project_id: Optional[UUID] = None
input_asset_ids: Optional[List[UUID]] = None
class JobUpdate(BaseModel):
status: Optional[str] = None
progress: Optional[int] = None
output_data: Optional[Dict[str, Any]] = None
output_asset_ids: Optional[List[UUID]] = None
error_message: Optional[str] = None
class JobResponse(JobBase):
id: UUID
user_id: Optional[UUID] = None
project_id: Optional[UUID] = None
status: str
progress: int
output_data: Optional[Dict[str, Any]] = None
input_asset_ids: Optional[List[UUID]] = None
output_asset_ids: Optional[List[UUID]] = None
error_message: Optional[str] = None
api_provider: Optional[str] = None
api_model: Optional[str] = None
queued_at: Optional[datetime] = None
started_at: Optional[datetime] = None
completed_at: Optional[datetime] = None
created_at: datetime
updated_at: datetime
class Config:
from_attributes = True


@ -0,0 +1,77 @@
"""User Schemas"""
from pydantic import BaseModel, EmailStr, validator
from typing import Optional
from datetime import datetime
from uuid import UUID
class UserBase(BaseModel):
email: EmailStr
display_name: Optional[str] = None
role: str = "user"
department: Optional[str] = None
class UserCreate(UserBase):
azure_oid: Optional[str] = None
password: Optional[str] = None # Optional for SSO users
class UserSignUp(BaseModel):
"""Schema for user registration"""
email: EmailStr
password: str
display_name: str
@validator("password")
def validate_password(cls, v):
if len(v) < 8:
raise ValueError("Password must be at least 8 characters")
return v
class UserLogin(BaseModel):
"""Schema for user login"""
email: EmailStr
password: str
class UserUpdate(BaseModel):
display_name: Optional[str] = None
role: Optional[str] = None
department: Optional[str] = None
is_active: Optional[bool] = None
avatar_url: Optional[str] = None
class PasswordChange(BaseModel):
"""Schema for changing password"""
current_password: str
new_password: str
@validator("new_password")
def validate_password(cls, v):
if len(v) < 8:
raise ValueError("Password must be at least 8 characters")
return v
class UserResponse(UserBase):
id: UUID
azure_oid: Optional[str] = None
avatar_url: Optional[str] = None
is_active: bool
last_login_at: Optional[datetime] = None
created_at: datetime
updated_at: datetime
class Config:
from_attributes = True
class TokenResponse(BaseModel):
"""Schema for JWT token response"""
access_token: str
token_type: str = "bearer"
expires_in: int
user: UserResponse


@ -0,0 +1,32 @@
"""Services Package"""
from app.services import (
image_generator,
image_upscaler,
background_remover,
video_generator,
video_upscaler,
subtitle_processor,
voice_to_text,
text_to_speech,
alt_text_generator,
prompt_studio,
job_processor,
markdown_tools,
sound_effects
)
__all__ = [
"image_generator",
"image_upscaler",
"background_remover",
"video_generator",
"video_upscaler",
"subtitle_processor",
"voice_to_text",
"text_to_speech",
"alt_text_generator",
"prompt_studio",
"job_processor",
"markdown_tools",
"sound_effects"
]


@ -0,0 +1,126 @@
"""Alt Text Generator Service - OpenAI GPT-4 Vision"""
import httpx
import base64
import os
from datetime import datetime
from app.database import SessionLocal
from app.models.job import Job
from app.models.asset import Asset
from app.config import settings
async def generate(job_id: str):
"""Generate alt text for image using GPT-4 Vision"""
db = SessionLocal()
try:
job = db.query(Job).filter(Job.id == job_id).first()
if not job:
return
input_asset_ids = job.input_asset_ids
if not input_asset_ids:
raise ValueError("No input asset provided")
input_asset = db.query(Asset).filter(Asset.id == input_asset_ids[0]).first()
if not input_asset:
raise ValueError("Input asset not found")
job.progress = 10
job.api_provider = "openai"
job.api_model = "gpt-4o"
db.commit()
# Read and encode image
with open(input_asset.file_path, "rb") as f:
image_data = base64.b64encode(f.read()).decode("utf-8")
job.progress = 20
db.commit()
# Call GPT-4 Vision
async with httpx.AsyncClient(timeout=60) as client:
response = await client.post(
"https://api.openai.com/v1/chat/completions",
headers={
"Authorization": f"Bearer {settings.openai_api_key}",
"Content-Type": "application/json"
},
json={
"model": "gpt-4o",
"messages": [
{
"role": "system",
"content": """You are an expert at writing accessible alt text for images.
Your alt text should:
- Be concise and descriptive
- Focus on the most important elements
- Avoid starting with "image of" or "picture of"
- Include any text visible in the image
- Be factual and non-subjective
Provide two versions:
1. Short version: 150 characters or less
2. Long version: 400 characters or less"""
},
{
"role": "user",
"content": [
{
"type": "text",
"text": "Please analyze this image and provide alt text descriptions in the following format exactly:\n\nShort version: [brief description]\n\nLong version: [detailed description]"
},
{
"type": "image_url",
"image_url": {
"url": f"data:{input_asset.mime_type};base64,{image_data}"
}
}
]
}
],
"max_tokens": 500
}
)
response.raise_for_status()
result = response.json()
job.progress = 80
db.commit()
# Parse response
content = result.get("choices", [{}])[0].get("message", {}).get("content", "")
# Extract short and long versions
short_alt = ""
long_alt = ""
lines = content.split("\n")
for line in lines:
stripped = line.strip()
if stripped.lower().startswith("short version:"):
short_alt = stripped[len("short version:"):].strip()
elif stripped.lower().startswith("long version:"):
long_alt = stripped[len("long version:"):].strip()
# If parsing failed, use full content
if not short_alt and not long_alt:
short_alt = content[:150]
long_alt = content[:400]
job.output_data = {
"short_alt_text": short_alt,
"long_alt_text": long_alt,
"raw_response": content
}
job.progress = 100
job.status = "completed"
job.completed_at = datetime.utcnow()
db.commit()
except Exception as e:
# Roll back first so a failed flush/commit does not block the status update
db.rollback()
job.status = "failed"
job.error_message = str(e)
db.commit()
finally:
db.close()
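The response parsing depends on the model echoing the "Short version:" / "Long version:" labels exactly. Below is a slightly more tolerant sketch, assuming the model may add Markdown emphasis or vary the case of the labels. `parse_alt_text` and `_LABEL` are hypothetical names, and the fallback mirrors the service's truncation to 150 and 400 characters.

```python
import re
from typing import Tuple

# Matches lines such as "Short version: ..." or "**long version:** ...".
_LABEL = re.compile(r"^\W*(short|long)\s+version\W*(.*)$", re.IGNORECASE)


def parse_alt_text(content: str, short_max: int = 150, long_max: int = 400) -> Tuple[str, str]:
    """Extract (short, long) alt text, falling back to truncated raw content."""
    found = {"short": "", "long": ""}
    for line in content.splitlines():
        match = _LABEL.match(line.strip())
        if match:
            found[match.group(1).lower()] = match.group(2).strip()
    if not found["short"] and not found["long"]:
        # Same fallback as the service: reuse the raw content, truncated.
        return content[:short_max], content[:long_max]
    return found["short"], found["long"]
```

The pattern also strips leading punctuation from the description itself (an opening quote, for example), which may or may not be desirable.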


@ -0,0 +1,129 @@
"""Background Remover Service - Clipping Magic API"""
import httpx
import os
import base64
from uuid import uuid4
from datetime import datetime
from app.database import SessionLocal
from app.models.job import Job
from app.models.asset import Asset
from app.config import settings
async def remove_background(job_id: str):
"""Remove background from image using Clipping Magic"""
db = SessionLocal()
try:
job = db.query(Job).filter(Job.id == job_id).first()
if not job:
return
input_data = job.input_data
input_asset_ids = job.input_asset_ids
if not input_asset_ids:
raise ValueError("No input asset provided")
input_asset = db.query(Asset).filter(Asset.id == input_asset_ids[0]).first()
if not input_asset:
raise ValueError("Input asset not found")
job.progress = 10
job.api_provider = "clipping_magic"
db.commit()
# Read input image
with open(input_asset.file_path, "rb") as f:
image_data = f.read()
output_format = input_data.get("output_format", "png")
job.progress = 20
db.commit()
# Call Clipping Magic API
async with httpx.AsyncClient(timeout=120) as client:
# Use the API key exactly as configured; no base64 decoding is applied here
api_key = settings.clipping_magic_api_key
response = await client.post(
"https://clippingmagic.com/api/v1/images",
auth=(api_key, ""),
files={"image": (input_asset.original_filename, image_data, input_asset.mime_type)},
data={
"format": "result" if output_format == "png" else "clipping_path_tiff"
}
)
response.raise_for_status()
result = response.json()
image_id = result.get("image", {}).get("id")
job.progress = 50
db.commit()
if image_id:
# Download the result
download_response = await client.get(
f"https://clippingmagic.com/api/v1/images/{image_id}",
auth=(api_key, ""),
params={"format": "result" if output_format == "png" else "clipping_path_tiff"}
)
download_response.raise_for_status()
processed_data = download_response.content
job.progress = 80
db.commit()
# Save output
ext = "png" if output_format == "png" else "tiff"
filename = f"nobg_{uuid4()}.{ext}"
storage_path = os.path.join(settings.storage_path, "images")
os.makedirs(storage_path, exist_ok=True)
file_path = os.path.join(storage_path, filename)
with open(file_path, "wb") as f:
f.write(processed_data)
# Create output asset
output_asset = Asset(
user_id=job.user_id,
project_id=job.project_id,
original_filename=filename,
stored_filename=filename,
file_path=file_path,
file_type="image",
mime_type=f"image/{ext}",
file_size_bytes=len(processed_data),
width=input_asset.width,
height=input_asset.height,
source_module="background_remover",
source_job_id=job.id,
parent_asset_id=input_asset.id,
asset_metadata={"output_format": output_format}  # model attribute is asset_metadata; `metadata` is reserved by SQLAlchemy
)
db.add(output_asset)
db.commit()
db.refresh(output_asset)
job.output_asset_ids = [output_asset.id]
job.output_data = {"asset_id": str(output_asset.id), "file_path": file_path}
# Delete from Clipping Magic (cleanup)
await client.post(
f"https://clippingmagic.com/api/v1/images/{image_id}/delete",
auth=(api_key, "")
)
job.progress = 100
job.status = "completed"
job.completed_at = datetime.utcnow()
db.commit()
except Exception as e:
# Roll back first so a failed flush/commit does not block the status update
db.rollback()
job.status = "failed"
job.error_message = str(e)
db.commit()
finally:
db.close()


@ -0,0 +1,890 @@
"""Image Generator Service - Multiple AI Providers
Supported Providers:
- openai: GPT-Image-1 (latest) or DALL-E 3
- imagen: Google Imagen 4 (Standard, Ultra, Fast)
- nano-banana: Gemini 2.5 Flash Image / Nano Banana Pro
- stable-diffusion: Stability AI SDXL, SD3, image-to-image
- leonardo: Leonardo.ai models
- ideogram: Ideogram v2 with text rendering
- flux: Black Forest Labs Flux Pro
- bria: Bria AI (base and fast models)
OpenAI GPT-Image-1 (April 2025):
- model: 'gpt-image-1' (default) or 'dall-e-3'
- quality: 'low', 'medium', 'high' (default high)
- size: 1024x1024, 1024x1536, 1536x1024
- background: 'transparent', 'opaque', 'auto' (for PNG/WebP)
- output_format: 'png', 'jpeg', 'webp'
- n: 1-10 images per request
- Pricing: ~$0.02 (low), $0.07 (medium), $0.19 (high) per image
Google Imagen 4 (December 2025):
- model: 'imagen-4.0-generate-001' (default), 'imagen-4.0-ultra-generate-001', 'imagen-4.0-fast-generate-001'
- image_size: '1K', '2K' (Ultra/Standard only)
- aspect_ratio: '1:1', '3:4', '4:3', '9:16', '16:9'
- number_of_images: 1-4
- enhance_prompt: true/false (LLM prompt enhancement)
- person_generation: 'dont_allow', 'allow_adult', 'allow_all'
- Pricing: $0.02 (Fast), $0.04 (Standard), $0.06 (Ultra) per image
Nano Banana / Gemini Image (December 2025):
- model: 'gemini-2.5-flash-image' (Nano Banana), 'gemini-3-pro-image-preview' (Nano Banana Pro)
- aspect_ratio: '1:1', '2:3', '3:2', '3:4', '4:3', '4:5', '5:4', '9:16', '16:9', '21:9'
- image_size: '1K', '2K', '4K' (Pro only for 4K)
- Features: Text rendering, image editing, multi-turn conversation
- Pricing: ~$0.04 per 1MP image
DALL-E 3 Options:
- quality: 'standard' or 'hd' (default hd)
- style: 'vivid' (hyper-real) or 'natural' (more realistic)
- size: 1024x1024, 1024x1792, 1792x1024
Stability AI Options:
- model: sd3.5-large, sd3.5-medium, sd3-large, sd3-medium, sdxl-1.0
- aspect_ratio: 1:1, 16:9, 9:16, 4:3, 3:4, 21:9, 9:21
- negative_prompt: What to avoid in generation
- image_to_image: Use input image as starting point
- strength: 0.0-1.0 for image-to-image (how much to change)
- style_preset: enhance, anime, photographic, digital-art, etc.
"""
import httpx
import os
import base64
import logging
from uuid import uuid4
from datetime import datetime
from typing import Optional, Dict, Any, Tuple
logger = logging.getLogger(__name__)
from app.database import SessionLocal
from app.models.job import Job
from app.models.asset import Asset
from app.config import settings
# Provider configurations
IMAGE_PROVIDERS = {
"openai": {
"name": "OpenAI Image Generation",
"models": ["gpt-image-1", "dall-e-3", "dall-e-2"],
"default_model": "gpt-image-1",
"gpt-image-1": {
"sizes": ["1024x1024", "1024x1536", "1536x1024"],
"qualities": ["low", "medium", "high"],
"output_formats": ["png", "jpeg", "webp"],
"backgrounds": ["auto", "transparent", "opaque"],
"max_images": 10
},
"dall-e-3": {
"sizes": ["1024x1024", "1024x1792", "1792x1024"],
"qualities": ["standard", "hd"],
"styles": ["vivid", "natural"]
},
"supports_styles": True
},
"imagen": {
"name": "Google Imagen 4",
"models": ["imagen-4.0-generate-001", "imagen-4.0-ultra-generate-001", "imagen-4.0-fast-generate-001"],
"default_model": "imagen-4.0-generate-001",
"aspect_ratios": ["1:1", "3:4", "4:3", "9:16", "16:9"],
"image_sizes": ["1K", "2K"],
"max_images": 4,
"supports_enhance_prompt": True,
"supports_person_generation": True
},
"nano-banana": {
"name": "Nano Banana (Gemini Image)",
"models": ["gemini-2.5-flash-image", "gemini-3-pro-image-preview"],
"default_model": "gemini-2.5-flash-image",
"aspect_ratios": ["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"],
"image_sizes": ["1K", "2K", "4K"],
"supports_text_rendering": True,
"supports_image_editing": True
},
"stable-diffusion": {
"name": "Stability AI",
"models": ["sd3.5-large", "sd3.5-medium", "sd3-large", "sd3-medium", "sdxl-1.0"],
"default_model": "sd3.5-large",
"aspect_ratios": ["1:1", "16:9", "9:16", "4:3", "3:4", "21:9", "9:21"],
"supports_img2img": True,
"supports_negative_prompt": True
},
"leonardo": {
"name": "Leonardo.ai",
"models": {
# Latest Models (2025)
"de7d3faf-762f-48e0-b3b7-9d0ac3a3fcf3": "Leonardo Phoenix 1.0",
"7b592283-e8a7-4c5a-9ba6-d18c31f258b9": "Lucid Origin",
"05ce0082-2d80-4a2d-8653-4d1c85e2418e": "Lucid Realism",
"28aeddf8-bd19-4803-80fc-79602d1a9989": "FLUX.1 Kontext",
"b2614463-296c-462a-9586-aafdb8f00e36": "Flux Dev",
"1dd50843-d653-4516-a8e3-f0238ee453ff": "Flux Schnell",
# Phoenix/XL Models
"6b645e3a-d64f-4341-a6d8-7a3690fbf042": "Leonardo Phoenix 0.9",
"e71a1c2f-4f80-4800-934f-2c68979d8cc8": "Leonardo Anime XL",
"b24e16ff-06e3-43eb-8d33-4416c2d75876": "Leonardo Lightning XL",
"aa77f04e-3eec-4034-9c07-d0f619684628": "Leonardo Kino XL",
"5c232a9e-9061-4777-980a-ddc8e65647c6": "Leonardo Vision XL",
"1e60896f-3c26-4296-8ecc-53e2afecc132": "Leonardo Diffusion XL",
# SDXL Models
"16e7060a-803e-4df3-97ee-edcfa5dc9cc8": "SDXL 1.0",
"2067ae52-33fd-4a82-bb92-c2c55e7d2786": "AlbedoBase XL",
"b63f7119-31dc-4540-969b-2a9df997e173": "SDXL 0.9",
# Style Models
"f1929ea3-b169-4c18-a16c-5d58b4292c69": "RPG v5",
"d69c8273-6b17-4a30-a13e-d6637ae1c644": "3D Animation Style",
"ac614f96-1082-45bf-be9d-757f2d31c174": "DreamShaper v7",
"e316348f-7773-490e-adcd-46757c738eb7": "Absolute Reality v1.6"
},
"default_model": "de7d3faf-762f-48e0-b3b7-9d0ac3a3fcf3",
"widths": [512, 768, 1024, 1472],
"heights": [512, 768, 832, 1024],
"style_presets": [
"ANIME", "BOKEH", "CINEMATIC", "CINEMATIC_CLOSEUP", "CREATIVE",
"DYNAMIC", "ENVIRONMENT", "FASHION", "FILM", "FOOD", "GENERAL",
"HDR", "ILLUSTRATION", "LEONARDO", "LONG_EXPOSURE", "MACRO",
"MINIMALISTIC", "MONOCHROME", "MOODY", "NONE", "NEUTRAL",
"PHOTOGRAPHY", "PORTRAIT", "RAYTRACED", "RENDER_3D", "RETRO",
"SKETCH_BW", "SKETCH_COLOR", "STOCK_PHOTO", "VIBRANT", "UNPROCESSED"
],
"supports_img2img": True,
"supports_character_reference": True,
"supports_style_reference": True
},
"bria": {
"name": "Bria AI",
"models": ["base", "fast"],
"default_model": "base",
"aspect_ratios": ["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9"],
"mediums": ["photography", "art"],
"supports_prompt_enhancement": True,
"base_config": {"steps_num": [20, 50], "guidance_scale": [1, 10]},
"fast_config": {"steps_num": [4, 10]}
},
"ideogram": {
"name": "Ideogram",
"models": ["V_2", "V_2_TURBO"],
"supports_text_rendering": True
},
"flux": {
"name": "Flux Pro",
"models": ["flux-pro-1.1", "flux-dev", "flux-schnell"],
"supports_img2img": True
}
}
STABILITY_STYLE_PRESETS = [
"enhance", "anime", "photographic", "digital-art", "comic-book",
"fantasy-art", "line-art", "analog-film", "neon-punk", "isometric",
"low-poly", "origami", "modeling-compound", "cinematic", "3d-model", "pixel-art"
]
async def generate(job_id: str):
"""Generate image based on provider"""
db = SessionLocal()
try:
job = db.query(Job).filter(Job.id == job_id).first()
if not job:
return
input_data = job.input_data
provider = input_data.get("provider", "openai")
prompt = input_data.get("prompt", "")
# Update progress
job.progress = 10
job.api_provider = provider
db.commit()
image_data = None
filename = None
if provider == "openai" or provider == "dalle3":
image_data, filename = await _generate_openai(input_data)
job.api_model = input_data.get("model", "gpt-image-1")
elif provider == "imagen":
image_data, filename = await _generate_imagen(input_data)
job.api_model = input_data.get("model", "imagen-4.0-generate-001")
elif provider == "nano-banana" or provider == "gemini":
image_data, filename = await _generate_nano_banana(input_data)
job.api_model = input_data.get("model", "gemini-2.5-flash-image")
elif provider == "stable-diffusion":
image_data, filename = await _generate_stability(input_data)
job.api_model = input_data.get("model", "sd3.5-large")
elif provider == "leonardo":
image_data, filename = await _generate_leonardo(input_data)
job.api_model = "leonardo"
elif provider == "ideogram":
image_data, filename = await _generate_ideogram(input_data)
job.api_model = "ideogram-v2"
elif provider == "flux":
image_data, filename = await _generate_flux(input_data)
job.api_model = "flux-pro"
elif provider == "bria":
image_data, filename = await _generate_bria(input_data)
job.api_model = input_data.get("model", "base")
else:
raise ValueError(f"Unknown provider: {provider}")
job.progress = 80
db.commit()
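# A provider that returns nothing must fail the job instead of leaving it unfinished
if not image_data:
raise ValueError(f"Provider '{provider}' returned no image data")
# Save image
if image_data: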
storage_path = os.path.join(settings.storage_path, "images")
os.makedirs(storage_path, exist_ok=True)
file_path = os.path.join(storage_path, filename)
with open(file_path, "wb") as f:
f.write(image_data)
# Create asset
asset = Asset(
user_id=job.user_id,
project_id=job.project_id,
original_filename=filename,
stored_filename=filename,
file_path=file_path,
file_type="image",
mime_type="image/png",
file_size_bytes=len(image_data),
source_module="image_generator",
source_job_id=job.id,
asset_metadata={
"prompt": prompt,
"provider": provider,
"model": job.api_model
}
)
db.add(asset)
db.commit()
db.refresh(asset)
job.output_asset_ids = [asset.id]
job.output_data = {"asset_id": str(asset.id), "file_path": file_path}
job.progress = 100
job.status = "completed"
job.completed_at = datetime.utcnow()
db.commit()
except Exception as e:
job.status = "failed"
job.error_message = str(e)
db.commit()
finally:
db.close()
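The provider routing in generate() could also be written as a lookup table rather than an if/elif chain. The following is only a sketch under assumptions: `PROVIDER_TABLE`, `dispatch`, and the `_fake_*` coroutines are hypothetical names standing in for the real `_generate_*` functions, and the sketch reads the model from `input_data` for every provider, which the original does only for some of them.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional, Tuple

ImageResult = Tuple[Optional[bytes], Optional[str]]
Generator = Callable[[dict], Awaitable[ImageResult]]


async def _fake_openai(input_data: dict) -> ImageResult:
    # Hypothetical stand-in for _generate_openai; returns placeholder bytes.
    return b"openai-bytes", "openai.png"


async def _fake_flux(input_data: dict) -> ImageResult:
    # Hypothetical stand-in for _generate_flux.
    return b"flux-bytes", "flux.png"


# provider name -> (generator coroutine, default model label)
PROVIDER_TABLE: Dict[str, Tuple[Generator, str]] = {
    "openai": (_fake_openai, "gpt-image-1"),
    "dalle3": (_fake_openai, "gpt-image-1"),
    "flux": (_fake_flux, "flux-pro"),
}


async def dispatch(input_data: dict) -> Tuple[bytes, str, str]:
    """Look up the provider, run it, and return (image bytes, filename, model)."""
    provider = input_data.get("provider", "openai")
    try:
        generator, default_model = PROVIDER_TABLE[provider]
    except KeyError:
        raise ValueError(f"Unknown provider: {provider}") from None
    image_data, filename = await generator(input_data)
    if not image_data or not filename:
        # Treat an empty result as an error so the caller can mark the job failed.
        raise ValueError(f"Provider {provider} returned no image data")
    return image_data, filename, input_data.get("model", default_model)
```

With a table like this, adding a provider becomes a one-line change and aliases such as "dalle3" are visible in one place.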
async def _generate_openai(input_data: dict) -> Tuple[Optional[bytes], Optional[str]]:
"""Generate image using OpenAI GPT-Image-1 or DALL-E 3
GPT-Image-1 Parameters (default):
- prompt: Text description (max 32000 chars)
- quality: 'low', 'medium', 'high' (default: high)
- size: '1024x1024', '1024x1536', '1536x1024'
- background: 'transparent', 'opaque', 'auto'
- output_format: 'png', 'jpeg', 'webp' (default: png)
- output_compression: 0-100 for jpeg/webp
- moderation: 'auto' or 'low' (less restrictive)
- n: 1-10 images
DALL-E 3 Parameters:
- prompt: Text description (max 4000 chars)
- quality: 'standard' or 'hd' (default: hd)
- style: 'vivid' or 'natural' (default: vivid)
- size: '1024x1024', '1024x1792', '1792x1024'
"""
prompt = input_data.get("prompt", "")
model = input_data.get("model", "gpt-image-1")
width = input_data.get("width", 1024)
height = input_data.get("height", 1024)
# Determine size based on width/height
if width > height:
size = "1536x1024" if model == "gpt-image-1" else "1792x1024"
elif height > width:
size = "1024x1536" if model == "gpt-image-1" else "1024x1792"
else:
size = "1024x1024"
async with httpx.AsyncClient(timeout=180) as client:
if model == "gpt-image-1":
# GPT-Image-1 (latest model)
quality = input_data.get("quality", "high")
background = input_data.get("background", "auto")
output_format = input_data.get("output_format", "png")
output_compression = input_data.get("output_compression", 100)
moderation = input_data.get("moderation", "auto")
n = min(input_data.get("n", 1), 10)
payload = {
"model": "gpt-image-1",
"prompt": prompt,
"size": size,
"quality": quality,
"n": n
}
# Add optional parameters
if background != "auto":
payload["background"] = background
if output_format != "png":
payload["output_format"] = output_format
if output_format in ["jpeg", "webp"] and output_compression != 100:
payload["output_compression"] = output_compression
if moderation != "auto":
payload["moderation"] = moderation
response = await client.post(
"https://api.openai.com/v1/images/generations",
headers={
"Authorization": f"Bearer {settings.openai_api_key}",
"Content-Type": "application/json"
},
json=payload
)
response.raise_for_status()
data = response.json()
if data.get("data") and len(data["data"]) > 0:
# GPT-Image-1 always returns base64
b64_image = data["data"][0].get("b64_json")
if b64_image:
ext = output_format if output_format in ["png", "jpeg", "webp"] else "png"
filename = f"gptimage1_{quality}_{uuid4()}.{ext}"
return base64.b64decode(b64_image), filename
else:
# DALL-E 3 (or DALL-E 2)
quality = input_data.get("quality", "hd")
style = input_data.get("style", "vivid")
payload = {
"model": model,
"prompt": prompt,
"size": size,
"n": 1,
"response_format": "b64_json"
}
# DALL-E 3 specific options
if model == "dall-e-3":
payload["quality"] = quality
payload["style"] = style
response = await client.post(
"https://api.openai.com/v1/images/generations",
headers={
"Authorization": f"Bearer {settings.openai_api_key}",
"Content-Type": "application/json"
},
json=payload
)
response.raise_for_status()
data = response.json()
if data.get("data") and len(data["data"]) > 0:
b64_image = data["data"][0].get("b64_json")
if b64_image:
filename = f"{model.replace('-', '')}_{style if model == 'dall-e-3' else 'gen'}_{uuid4()}.png"
return base64.b64decode(b64_image), filename
return None, None
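The width/height to size mapping at the top of _generate_openai can be isolated as a pure function, which makes it easy to check on its own. This is a minimal sketch; `pick_openai_size` is a hypothetical helper name, and the size strings are the ones listed in the docstring above.

```python
def pick_openai_size(model: str, width: int, height: int) -> str:
    """Map a requested width/height to one of the fixed sizes the API accepts."""
    if width == height:
        return "1024x1024"
    # gpt-image-1 uses 1536 for the long side; the DALL-E models use 1792.
    long_side = "1536" if model == "gpt-image-1" else "1792"
    if width > height:
        return f"{long_side}x1024"  # landscape
    return f"1024x{long_side}"  # portrait
```

For example, `pick_openai_size("gpt-image-1", 1920, 1080)` gives the landscape size for that model, and any square request maps to "1024x1024" regardless of model.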
async def _generate_stability(input_data: dict, input_image_data: Optional[bytes] = None) -> Tuple[Optional[bytes], Optional[str]]:
"""Generate image using Stability AI
Parameters:
- prompt: Text description (required)
- negative_prompt: What to avoid in generation
- model: 'sd3.5-large', 'sd3.5-medium', 'sd3-large', 'sd3-medium'
- aspect_ratio: '1:1', '16:9', '9:16', '4:3', '3:4', '21:9', '9:21'
- seed: Optional seed for reproducibility (0-4294967294)
- mode: 'text-to-image' or 'image-to-image'
"""
if not settings.stability_api_key:
raise ValueError("Stability API key not configured")
prompt = input_data.get("prompt", "")
if not prompt:
raise ValueError("Prompt is required")
negative_prompt = input_data.get("negative_prompt", "")
model = input_data.get("model", "sd3.5-large")
aspect_ratio = input_data.get("aspect_ratio", "1:1")
seed = input_data.get("seed")
output_format = input_data.get("output_format", "png")
async with httpx.AsyncClient(timeout=180) as client:
# Build form data - Stability uses multipart/form-data
form_data = {
"prompt": prompt,
"mode": "text-to-image",
"model": model,
"aspect_ratio": aspect_ratio,
"output_format": output_format,
}
if negative_prompt:
form_data["negative_prompt"] = negative_prompt
if seed is not None:
form_data["seed"] = seed
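# httpx only encodes the body as multipart/form-data when files is set,
# so send a placeholder part for text-to-image requests
files = {"none": ""}
# Image-to-image mode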
if input_image_data:
form_data["mode"] = "image-to-image"
form_data["strength"] = input_data.get("strength", 0.7)
files = {"image": ("input.png", input_image_data, "image/png")}
try:
response = await client.post(
"https://api.stability.ai/v2beta/stable-image/generate/sd3",
headers={
"Authorization": f"Bearer {settings.stability_api_key}",
"Accept": "image/*"
},
data=form_data,
files=files
)
if response.status_code != 200:
error_text = response.text
logger.error(f"Stability AI error {response.status_code}: {error_text}")
raise Exception(f"Stability AI error: {error_text}")
model_short = model.replace("-", "").replace(".", "")
filename = f"stability_{model_short}_{uuid4()}.{output_format}"
return response.content, filename
except httpx.HTTPStatusError as e:
logger.error(f"Stability AI HTTP error: {e.response.status_code} - {e.response.text}")
raise
except Exception as e:
logger.error(f"Stability AI generation error: {e}")
raise
async def _generate_leonardo(input_data: dict) -> tuple:
"""
Generate image using Leonardo AI
Parameters:
- prompt: Text description
- model: Leonardo model ID (default: Phoenix)
- width: Image width (512, 768, 1024, 1472)
- height: Image height (512, 768, 832, 1024)
- preset_style: Style preset (ANIME, CINEMATIC, PHOTOGRAPHY, etc.)
- num_images: Number of images to generate
- guidance_scale: How closely to follow prompt (7-15)
- num_inference_steps: Quality/speed tradeoff (30-60)
- negative_prompt: What to avoid
- init_image_id: For image-to-image
- init_strength: How much to change input image (0.1-0.9)
"""
# Default model is Leonardo Phoenix
model_id = input_data.get("model", "6b645e3a-d64f-4341-a6d8-7a3690fbf042")
# Build request payload
payload = {
"prompt": input_data.get("prompt"),
"modelId": model_id,
"width": input_data.get("width", 1024),
"height": input_data.get("height", 1024),
"num_images": input_data.get("num_images", 1),
}
# Add optional parameters
if input_data.get("preset_style"):
payload["presetStyle"] = input_data.get("preset_style")
if input_data.get("guidance_scale"):
payload["guidance_scale"] = input_data.get("guidance_scale")
if input_data.get("num_inference_steps"):
payload["num_inference_steps"] = input_data.get("num_inference_steps")
if input_data.get("negative_prompt"):
payload["negative_prompt"] = input_data.get("negative_prompt")
# Image-to-image support
if input_data.get("init_image_id"):
payload["init_image_id"] = input_data.get("init_image_id")
payload["init_strength"] = input_data.get("init_strength", 0.5)
async with httpx.AsyncClient(timeout=180) as client:
# Create generation
response = await client.post(
"https://cloud.leonardo.ai/api/rest/v1/generations",
headers={
"Authorization": f"Bearer {settings.leonardo_api_key}",
"Content-Type": "application/json"
},
json=payload
)
response.raise_for_status()
data = response.json()
# Poll for result
generation_id = data.get("sdGenerationJob", {}).get("generationId")
if generation_id:
import asyncio
for _ in range(90): # Wait up to 3 minutes
await asyncio.sleep(2)
status_response = await client.get(
f"https://cloud.leonardo.ai/api/rest/v1/generations/{generation_id}",
headers={"Authorization": f"Bearer {settings.leonardo_api_key}"}
)
status_data = status_response.json()
generation = status_data.get("generations_by_pk", {})
status = generation.get("status")
if status == "COMPLETE":
images = generation.get("generated_images", [])
if images:
image_url = images[0].get("url")
if image_url:
img_response = await client.get(image_url)
model_name = IMAGE_PROVIDERS["leonardo"]["models"].get(model_id, "leonardo")
filename = f"leonardo_{model_name.replace(' ', '_').lower()}_{uuid4()}.png"
return img_response.content, filename
elif status == "FAILED":
raise Exception("Leonardo generation failed")
return None, None
async def _generate_bria(input_data: dict) -> tuple:
"""
Generate image using Bria AI
Parameters:
- prompt: Text description
- model: 'base' (Bria 2.3 Base) or 'fast' (Bria 2.3 Fast)
- aspect_ratio: Image aspect ratio
- medium: 'photography' or 'art'
- prompt_enhancement: Enable AI prompt enhancement
- steps_num: Number of inference steps
- guidance_scale: How closely to follow prompt
- negative_prompt: What to avoid
"""
model = input_data.get("model", "base")
base_url = "https://engine.prod.bria-api.com/v1/text-to-image"
# Build request payload
payload = {
"prompt": input_data.get("prompt"),
"num_results": 1
}
# Add aspect ratio
if input_data.get("aspect_ratio"):
payload["aspect_ratio"] = input_data.get("aspect_ratio")
# Add medium
if input_data.get("medium"):
payload["medium"] = input_data.get("medium")
# Add prompt enhancement
if input_data.get("prompt_enhancement"):
payload["prompt_enhancement"] = True
# Add negative prompt
if input_data.get("negative_prompt"):
payload["negative_prompt"] = input_data.get("negative_prompt")
# Model-specific parameters
if model == "base":
url = f"{base_url}/base"
if input_data.get("steps_num"):
payload["steps_num"] = input_data.get("steps_num")
if input_data.get("guidance_scale"):
payload["text_guidance_scale"] = input_data.get("guidance_scale")
else:
url = f"{base_url}/fast"
if input_data.get("steps_num"):
payload["steps_num"] = min(input_data.get("steps_num"), 10)
async with httpx.AsyncClient(timeout=120) as client:
response = await client.post(
url,
headers={
"api_token": settings.bria_api_key,
"Content-Type": "application/json"
},
json=payload
)
response.raise_for_status()
data = response.json()
# Get the result
result = data.get("result", [])
if result and len(result) > 0:
image_url = result[0].get("urls", {}).get("url")
if image_url:
img_response = await client.get(image_url)
filename = f"bria_{model}_{uuid4()}.png"
return img_response.content, filename
return None, None
async def _generate_ideogram(input_data: dict) -> tuple:
"""Generate image using Ideogram"""
async with httpx.AsyncClient(timeout=120) as client:
response = await client.post(
"https://api.ideogram.ai/generate",
headers={
"Api-Key": settings.ideogram_api_key,
"Content-Type": "application/json"
},
json={
"image_request": {
"prompt": input_data.get("prompt"),
"model": "V_2",
"aspect_ratio": "ASPECT_1_1"
}
}
)
response.raise_for_status()
data = response.json()
if data.get("data") and len(data["data"]) > 0:
image_url = data["data"][0].get("url")
if image_url:
img_response = await client.get(image_url)
filename = f"ideogram_{uuid4()}.png"
return img_response.content, filename
return None, None
async def _generate_flux(input_data: dict) -> tuple:
"""Generate image using Flux (Black Forest Labs)
Note: Requires FLUX_API_KEY from https://api.bfl.ml/
May require paid account for flux-pro-1.1 model
"""
if not settings.flux_api_key:
raise ValueError("FLUX_API_KEY not configured")
async with httpx.AsyncClient(timeout=120) as client:
try:
response = await client.post(
"https://api.bfl.ml/v1/flux-pro-1.1",
headers={
"x-key": settings.flux_api_key,
"Content-Type": "application/json"
},
json={
"prompt": input_data.get("prompt"),
"width": input_data.get("width", 1024),
"height": input_data.get("height", 1024)
}
)
if response.status_code == 403:
logger.error("Flux API 403: Invalid API key or insufficient permissions")
raise ValueError("Flux API key is invalid or your account doesn't have access to flux-pro-1.1")
response.raise_for_status()
data = response.json()
# Poll for result
request_id = data.get("id")
if request_id:
import asyncio
for _ in range(60):
await asyncio.sleep(2)
status_response = await client.get(
f"https://api.bfl.ml/v1/get_result?id={request_id}",
headers={"x-key": settings.flux_api_key}
)
status_data = status_response.json()
if status_data.get("status") == "Ready":
image_url = status_data.get("result", {}).get("sample")
if image_url:
img_response = await client.get(image_url)
filename = f"flux_{uuid4()}.png"
return img_response.content, filename
except Exception as e:
logger.error(f"Flux generation error: {e}")
raise
return None, None
async def _generate_gemini(input_data: dict) -> tuple:
"""Generate image using Google Gemini"""
import google.generativeai as genai
genai.configure(api_key=settings.google_api_key)
model = genai.GenerativeModel("gemini-2.0-flash-exp")
response = await model.generate_content_async(
input_data.get("prompt"),
generation_config=genai.types.GenerationConfig(
response_mime_type="image/png"
)
)
if response.candidates and response.candidates[0].content.parts:
for part in response.candidates[0].content.parts:
if hasattr(part, 'inline_data') and part.inline_data:
filename = f"gemini_{uuid4()}.png"
return part.inline_data.data, filename
return None, None
async def _generate_imagen(input_data: dict) -> tuple:
"""
Generate image using Google Imagen 3 via REST API
Note: Imagen 3 is accessed through the generativelanguage API with API key.
Parameters:
- prompt: Text description of the image
- aspect_ratio: "1:1", "3:4", "4:3", "9:16", "16:9"
- number_of_images: 1-4
- negative_prompt: What to avoid in the image
"""
if not settings.google_api_key:
raise ValueError("GOOGLE_API_KEY not configured")
prompt = input_data.get("prompt", "")
negative_prompt = input_data.get("negative_prompt", "")
aspect_ratio = input_data.get("aspect_ratio", "1:1")
number_of_images = min(input_data.get("number_of_images", 1), 4)
# Use the Generative Language API for Imagen
url = f"https://generativelanguage.googleapis.com/v1beta/models/imagen-3.0-generate-001:predict?key={settings.google_api_key}"
payload = {
"instances": [{"prompt": prompt}],
"parameters": {
"sampleCount": number_of_images,
"aspectRatio": aspect_ratio,
}
}
if negative_prompt:
payload["instances"][0]["negativePrompt"] = negative_prompt
try:
async with httpx.AsyncClient(timeout=120.0) as client:
response = await client.post(
url,
headers={"Content-Type": "application/json"},
json=payload
)
if response.status_code == 200:
data = response.json()
predictions = data.get("predictions", [])
if predictions and predictions[0].get("bytesBase64Encoded"):
image_data = base64.b64decode(predictions[0]["bytesBase64Encoded"])
filename = f"imagen3_{uuid4()}.png"
return image_data, filename
else:
logger.warning(f"Imagen API error: {response.status_code} - {response.text}")
# Fall back to Nano Banana (Gemini native)
logger.info("Falling back to Nano Banana (Gemini native image generation)")
return await _generate_nano_banana(input_data)
except Exception as e:
logger.error(f"Imagen generation error: {e}")
# Fallback to Gemini native image generation
return await _generate_nano_banana(input_data)
return None, None
async def _generate_nano_banana(input_data: dict) -> tuple:
"""
Generate image using Nano Banana (Gemini native image generation)
Models:
- gemini-2.5-flash-image: Fast image generation with Gemini
- gemini-3-pro-image-preview: Higher quality image generation
Features:
- Native text rendering (can include text in images)
- Up to 4K resolution
- Wide range of aspect ratios
- Conversational image editing
Parameters:
- prompt: Text description of the image
- model: Gemini model to use
- aspect_ratio: Various ratios from 1:1 to 21:9
- image_size: "1K", "2K", "4K"
- number_of_images: Number of images to generate
- reference_image: Optional base64 image for editing
"""
import google.generativeai as genai
genai.configure(api_key=settings.google_api_key)
model_name = input_data.get("model", "gemini-2.5-flash-image")
# Map model names to actual Gemini model IDs
model_mapping = {
"gemini-2.5-flash-image": "gemini-2.0-flash-exp-image-generation",
"gemini-3-pro-image-preview": "gemini-2.0-flash-exp-image-generation", # Use available model
}
actual_model = model_mapping.get(model_name, "gemini-2.0-flash-exp-image-generation")
model = genai.GenerativeModel(actual_model)
# Handle aspect ratio if provided
aspect_ratio = input_data.get("aspect_ratio", "1:1")
# Build the prompt - can include aspect ratio hints
prompt = input_data.get("prompt", "")
if aspect_ratio != "1:1":
prompt = f"{prompt} [aspect ratio: {aspect_ratio}]"
# If reference image provided, include it in the request
contents = [prompt]
if input_data.get("reference_image"):
import base64
# Add reference image for editing
ref_data = input_data.get("reference_image")
if isinstance(ref_data, str) and ref_data.startswith("data:"):
# Extract base64 data from data URL
ref_data = ref_data.split(",")[1]
contents = [
{
"parts": [
{"text": prompt},
{
"inline_data": {
"mime_type": "image/png",
"data": ref_data
}
}
]
}
]
try:
# Generate content - Gemini automatically returns image data
response = await model.generate_content_async(contents)
if response.candidates and response.candidates[0].content.parts:
for part in response.candidates[0].content.parts:
if hasattr(part, 'inline_data') and part.inline_data:
filename = f"nano_banana_{uuid4()}.png"
return part.inline_data.data, filename
except Exception as e:
logger.error(f"Nano Banana generation error: {e}")
raise
return None, None

@ -0,0 +1,283 @@
"""Image Upscaler Service - Topaz Labs API
Available Models:
- proteus: General enhancement with fine-tuning parameters (default)
- artemis: Detail enhancement and noise reduction
- gaia: Specialized for HD/4K upscaling
- iris: Noise and compression artifact reduction
- nyx: Low light and high ISO recovery
- rhea: Detail recovery for older/degraded images
- theia: High-fidelity upscaling
Output Options:
- Scale: 2x, 4x, 6x, 8x (up to 16K)
- Output formats: png, jpg, tiff
- Face enhancement: auto-detect and enhance faces
- Noise reduction: 0-100
- Sharpening: 0-100
- Grain recovery: preserve film grain
"""
import httpx
import os
from uuid import uuid4
from datetime import datetime
import asyncio
from typing import Optional, Dict, Any
from app.database import SessionLocal
from app.models.job import Job
from app.models.asset import Asset
from app.config import settings
# Topaz enhancement models with their specialties
TOPAZ_MODELS = {
"proteus": {
"name": "Proteus",
"description": "General enhancement with fine control over noise, blur, and compression",
"parameters": ["noise_reduction", "sharpening", "compression_recovery", "detail_enhancement"],
"best_for": "General purpose, low to medium quality footage"
},
"artemis": {
"name": "Artemis",
"description": "Detail enhancement with noise reduction",
"parameters": ["noise_reduction", "detail_recovery"],
"best_for": "Details in low-noise footage"
},
"gaia": {
"name": "Gaia",
"description": "Specialized for upscaling HD to 4K/8K",
"parameters": ["detail_level", "anti_aliasing"],
"best_for": "High-resolution upscaling from HD source"
},
"iris": {
"name": "Iris",
"description": "Noise and compression artifact reduction",
"parameters": ["noise_reduction", "compression_recovery", "debanding"],
"best_for": "Heavily compressed or noisy images"
},
"nyx": {
"name": "Nyx",
"description": "Low light and high ISO recovery",
"parameters": ["noise_reduction", "shadow_recovery", "highlight_recovery"],
"best_for": "Dark or high-ISO images"
},
"rhea": {
"name": "Rhea",
"description": "Detail recovery for older/degraded images",
"parameters": ["detail_recovery", "texture_enhancement"],
"best_for": "Scanned photos, old digital images"
},
"theia": {
"name": "Theia",
"description": "High-fidelity detail enhancement",
"parameters": ["detail_level", "texture_preservation"],
"best_for": "Maximum detail retention"
},
"auto": {
"name": "Auto",
"description": "Automatically select best model for input",
"parameters": [],
"best_for": "When unsure which model to use"
}
}
async def upscale(job_id: str):
"""Upscale image using Topaz Labs API
Input parameters:
- scale: Upscale factor (2, 4, 6, 8)
- model: Enhancement model (see TOPAZ_MODELS)
- output_format: 'png', 'jpg', 'tiff' (default: png)
- face_enhancement: Boolean to enable face detection and enhancement
- noise_reduction: 0-100, amount of noise removal
- sharpening: 0-100, output sharpening level
- compression_recovery: 0-100, recover compression artifacts
- detail_enhancement: 0-100, enhance fine details
- preserve_grain: Boolean to preserve film grain
- output_quality: 1-100 for jpg output (default: 95)
"""
db = SessionLocal()
try:
job = db.query(Job).filter(Job.id == job_id).first()
if not job:
return
input_data = job.input_data
input_asset_ids = job.input_asset_ids
if not input_asset_ids:
raise ValueError("No input asset provided")
# Get input asset
input_asset = db.query(Asset).filter(Asset.id == input_asset_ids[0]).first()
if not input_asset:
raise ValueError("Input asset not found")
# Extract parameters
scale = input_data.get("scale", 2)
model = input_data.get("model", "auto")
output_format = input_data.get("output_format", "png")
face_enhancement = input_data.get("face_enhancement", False)
noise_reduction = input_data.get("noise_reduction")
sharpening = input_data.get("sharpening")
compression_recovery = input_data.get("compression_recovery")
detail_enhancement = input_data.get("detail_enhancement")
preserve_grain = input_data.get("preserve_grain", False)
output_quality = input_data.get("output_quality", 95)
job.progress = 10
job.api_provider = "topaz"
job.api_model = model
db.commit()
# Read input image
with open(input_asset.file_path, "rb") as f:
image_data = f.read()
# Calculate output dimensions
original_width = input_asset.width or 1920
original_height = input_asset.height or 1080
output_width = original_width * scale
output_height = original_height * scale
job.progress = 20
db.commit()
# Build enhancement parameters
enhance_params: Dict[str, Any] = {
"output_height": str(output_height),
"output_width": str(output_width),
"output_format": output_format,
"model": model,
"face_enhancement": "true" if face_enhancement else "false"
}
# Add model-specific parameters if provided
if noise_reduction is not None:
enhance_params["noise_reduction"] = str(min(100, max(0, noise_reduction)))
if sharpening is not None:
enhance_params["sharpening"] = str(min(100, max(0, sharpening)))
if compression_recovery is not None:
enhance_params["compression_recovery"] = str(min(100, max(0, compression_recovery)))
if detail_enhancement is not None:
enhance_params["detail_enhancement"] = str(min(100, max(0, detail_enhancement)))
if preserve_grain:
enhance_params["preserve_grain"] = "true"
if output_format == "jpg":
enhance_params["quality"] = str(output_quality)
# Call Topaz API
async with httpx.AsyncClient(timeout=600) as client:
# Start async enhancement
response = await client.post(
"https://api.topazlabs.com/image/v1/enhance/async",
headers={
"X-API-Key": settings.topaz_api_key,
"Accept": "application/json"
},
files={"image": (input_asset.original_filename, image_data, input_asset.mime_type)},
data=enhance_params
)
response.raise_for_status()
result = response.json()
request_id = result.get("id") or result.get("requestId")
job.progress = 40
job.api_request_id = request_id
db.commit()
# Poll for completion
output_url = None
for i in range(180): # Wait up to 6 minutes for large upscales
await asyncio.sleep(2)
status_response = await client.get(
f"https://api.topazlabs.com/image/v1/enhance/{request_id}/status",
headers={"X-API-Key": settings.topaz_api_key}
)
status_data = status_response.json()
status = status_data.get("status", "")
if status == "completed":
output_url = status_data.get("outputUrl") or status_data.get("output_url")
break
elif status == "failed":
raise ValueError(f"Topaz enhancement failed: {status_data.get('error')}")
job.progress = min(40 + int(i * 0.28), 85)
db.commit()
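if not output_url:
raise TimeoutError("Topaz enhancement did not complete within the polling window")
if output_url: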
# Download result
img_response = await client.get(output_url)
upscaled_data = img_response.content
job.progress = 90
db.commit()
# Determine output extension
ext_map = {"png": ".png", "jpg": ".jpg", "jpeg": ".jpg", "tiff": ".tiff"}
ext = ext_map.get(output_format, ".png")
mime_map = {"png": "image/png", "jpg": "image/jpeg", "jpeg": "image/jpeg", "tiff": "image/tiff"}
mime = mime_map.get(output_format, "image/png")
# Save output
filename = f"upscaled_{scale}x_{model}_{uuid4()}{ext}"
storage_path = os.path.join(settings.storage_path, "images")
os.makedirs(storage_path, exist_ok=True)
file_path = os.path.join(storage_path, filename)
with open(file_path, "wb") as f:
f.write(upscaled_data)
# Create output asset
output_asset = Asset(
user_id=job.user_id,
project_id=job.project_id,
original_filename=filename,
stored_filename=filename,
file_path=file_path,
file_type="image",
mime_type=mime,
file_size_bytes=len(upscaled_data),
width=output_width,
height=output_height,
source_module="image_upscaler",
source_job_id=job.id,
parent_asset_id=input_asset.id,
asset_metadata={
"scale": scale,
"model": model,
"face_enhancement": face_enhancement,
"noise_reduction": noise_reduction,
"sharpening": sharpening,
"original_dimensions": f"{original_width}x{original_height}",
"output_dimensions": f"{output_width}x{output_height}"
}
)
db.add(output_asset)
db.commit()
db.refresh(output_asset)
job.output_asset_ids = [output_asset.id]
job.output_data = {"asset_id": str(output_asset.id), "file_path": file_path}
job.progress = 100
job.status = "completed"
job.completed_at = datetime.utcnow()
db.commit()
except Exception as e:
job.status = "failed"
job.error_message = str(e)
db.commit()
finally:
db.close()
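The four 0-100 tuning parameters in upscale() are each clamped and stringified with the same expression. A small helper could express that once. This is a sketch only; `clamp_percent` and `build_tuning_params` are hypothetical names, and the key list mirrors the parameters handled above.

```python
from typing import Any, Dict

TUNING_KEYS = (
    "noise_reduction",
    "sharpening",
    "compression_recovery",
    "detail_enhancement",
)


def clamp_percent(value: float) -> str:
    """Clamp a value into the 0-100 range and return it as a form-field string."""
    return str(min(100, max(0, value)))


def build_tuning_params(input_data: Dict[str, Any]) -> Dict[str, str]:
    """Collect only the tuning parameters the caller actually supplied."""
    params: Dict[str, str] = {}
    for key in TUNING_KEYS:
        value = input_data.get(key)
        if value is not None:
            params[key] = clamp_percent(value)
    return params
```

Out-of-range inputs are pulled back into range rather than rejected, which is the same behaviour as the inline `min(100, max(0, ...))` calls.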
def get_available_models() -> Dict[str, Any]:
"""Get all available Topaz upscaling models and their capabilities"""
return TOPAZ_MODELS

@ -0,0 +1,73 @@
"""Job Processor - Routes jobs to appropriate services"""
from datetime import datetime
from app.database import SessionLocal
from app.models.job import Job
from app.services import (
image_generator,
image_upscaler,
background_remover,
video_generator,
video_upscaler,
subtitle_processor,
voice_to_text,
text_to_speech,
alt_text_generator
)
async def process_job(job_id: str):
    """Process a job based on its module and action"""
    db = SessionLocal()
    try:
        job = db.query(Job).filter(Job.id == job_id).first()
        if not job:
            return
        # Update status
        job.status = "processing"
        job.started_at = datetime.utcnow()
        db.commit()
        try:
            # Route to appropriate service
            module = job.module
            action = job.action
            if module == "image_generator":
                await image_generator.generate(job_id)
            elif module == "image_upscaler":
                await image_upscaler.upscale(job_id)
            elif module == "background_remover":
                await background_remover.remove_background(job_id)
            elif module == "video_generator":
                await video_generator.generate(job_id)
            elif module == "video_upscaler":
                await video_upscaler.upscale(job_id)
            elif module == "subtitle_processor":
                await subtitle_processor.process(job_id)
            elif module == "voice_to_text":
                await voice_to_text.transcribe(job_id)
            elif module == "text_to_speech":
                if action == "synthesize":
                    await text_to_speech.synthesize(job_id)
                elif action == "convert":
                    await text_to_speech.speech_to_speech(job_id)
                else:
                    raise ValueError(f"Unknown text_to_speech action: {action}")
            elif module == "alt_text_generator":
                await alt_text_generator.generate(job_id)
            else:
                raise ValueError(f"Unknown module: {module}")
            # Services update the job through their own session and catch their
            # own errors, so reload the row and keep any failure they recorded
            # instead of overwriting it with "completed".
            db.refresh(job)
            if job.status != "failed":
                job.status = "completed"
                job.progress = 100
                job.completed_at = datetime.utcnow()
                db.commit()
        except Exception as e:
            job.status = "failed"
            job.error_message = str(e)
            job.completed_at = datetime.utcnow()
            db.commit()
    finally:
        db.close()

@ -0,0 +1,626 @@
"""Markdown & Mermaid Tools Service
Text processing utilities for Markdown and Mermaid diagram generation.
Features:
- Markdown to HTML conversion
- Markdown to PDF export
- Mermaid diagram generation (flowcharts, sequence diagrams, etc.)
- AI-powered content generation
- Template support
Mermaid Diagram Types:
- flowchart: Process flows and decision trees
- sequence: Interaction sequences between actors
- class: UML class diagrams
- state: State machine diagrams
- er: Entity relationship diagrams
- journey: User journey mapping
- gantt: Project timelines
- pie: Pie charts
- mindmap: Mind maps and concept trees
- timeline: Historical timelines
- quadrant: Quadrant charts
- gitgraph: Git branch visualization
"""
import httpx
import os
from uuid import uuid4
from datetime import datetime
from typing import Optional, Dict, Any, List
from app.database import SessionLocal
from app.models.job import Job
from app.models.asset import Asset
from app.config import settings
# Mermaid diagram templates
MERMAID_TEMPLATES = {
"flowchart": {
"name": "Flowchart",
"description": "Process flows and decision trees",
"template": """flowchart TD
A[Start] --> B{Decision}
B -->|Yes| C[Process 1]
B -->|No| D[Process 2]
C --> E[End]
D --> E""",
"directions": ["TD", "TB", "BT", "LR", "RL"]
},
"sequence": {
"name": "Sequence Diagram",
"description": "Interaction sequences between actors",
"template": """sequenceDiagram
participant A as Actor
participant B as System
A->>B: Request
B-->>A: Response
A->>B: Action
B-->>A: Result"""
},
"class": {
"name": "Class Diagram",
"description": "UML class diagrams",
"template": """classDiagram
class Animal {
+String name
+int age
+makeSound()
}
class Dog {
+String breed
+bark()
}
Animal <|-- Dog"""
},
"state": {
"name": "State Diagram",
"description": "State machine diagrams",
"template": """stateDiagram-v2
[*] --> Idle
Idle --> Processing : start
Processing --> Completed : success
Processing --> Failed : error
Completed --> [*]
Failed --> Idle : retry"""
},
"er": {
"name": "ER Diagram",
"description": "Entity relationship diagrams",
"template": """erDiagram
CUSTOMER ||--o{ ORDER : places
ORDER ||--|{ LINE-ITEM : contains
PRODUCT ||--o{ LINE-ITEM : includes"""
},
"journey": {
"name": "User Journey",
"description": "User journey mapping",
"template": """journey
title User Journey
section Sign Up
Visit site: 5: User
Create account: 3: User
Verify email: 4: User
section First Use
Login: 5: User
Explore features: 4: User
Complete task: 5: User"""
},
"gantt": {
"name": "Gantt Chart",
"description": "Project timelines",
"template": """gantt
title Project Timeline
dateFormat YYYY-MM-DD
section Phase 1
Research: 2024-01-01, 30d
Design: 2024-02-01, 20d
section Phase 2
Development: 2024-02-21, 60d
Testing: 2024-04-22, 30d"""
},
"pie": {
"name": "Pie Chart",
"description": "Pie charts for data visualization",
"template": """pie title Distribution
"Category A" : 40
"Category B" : 30
"Category C" : 20
"Category D" : 10"""
},
"mindmap": {
"name": "Mind Map",
"description": "Mind maps and concept trees",
"template": """mindmap
root((Central Idea))
Topic 1
Subtopic 1.1
Subtopic 1.2
Topic 2
Subtopic 2.1
Subtopic 2.2
Topic 3"""
},
"timeline": {
"name": "Timeline",
"description": "Historical timelines",
"template": """timeline
title History of Events
2020 : Event 1
: Description
2021 : Event 2
: Description
2022 : Event 3"""
},
"gitgraph": {
"name": "Git Graph",
"description": "Git branch visualization",
"template": """gitGraph
commit
branch develop
checkout develop
commit
commit
checkout main
merge develop
commit"""
}
}
async def render_mermaid(
code: str,
output_format: str = "svg",
theme: str = "default",
background: str = "transparent"
) -> Dict[str, Any]:
"""Render Mermaid diagram to image
Args:
code: Mermaid diagram code
output_format: 'svg' for SVG; any other value (e.g. 'png') is rendered via the /img endpoint and reported as image/png
theme: 'default', 'dark', 'forest', 'neutral'
background: 'transparent', 'white', or hex color
Returns:
Dictionary with rendered image data or URL
"""
try:
# Use mermaid.ink for rendering (free API)
import base64
# Encode the mermaid code
encoded = base64.urlsafe_b64encode(code.encode()).decode()
# Build URL
base_url = "https://mermaid.ink"
if output_format == "svg":
url = f"{base_url}/svg/{encoded}"
else:
url = f"{base_url}/img/{encoded}"
# Add optional theme and background colour query parameters
params = []
if theme != "default":
params.append(f"theme={theme}")
if background != "transparent":
params.append(f"bgColor={background.replace('#', '')}")
if params:
url += "?" + "&".join(params)
async with httpx.AsyncClient(timeout=30) as client:
response = await client.get(url)
response.raise_for_status()
return {
"success": True,
"data": base64.b64encode(response.content).decode(),
"mime_type": "image/svg+xml" if output_format == "svg" else "image/png",
"url": url
}
except Exception as e:
return {
"success": False,
"error": str(e),
"code": code
}
async def generate_mermaid_with_ai(
description: str,
diagram_type: str = "flowchart",
style: str = "detailed"
) -> Dict[str, Any]:
"""Generate Mermaid diagram code using AI
Args:
description: Natural language description of the diagram
diagram_type: Type of diagram (flowchart, sequence, class, etc.)
style: 'simple', 'detailed', 'complex'
Returns:
Dictionary with generated Mermaid code
"""
template = MERMAID_TEMPLATES.get(diagram_type, MERMAID_TEMPLATES["flowchart"])
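# Prefer Gemini when a Google key is configured; otherwise use OpenAI.
# Note: OpenAI is not tried as a retry if the Gemini call itself fails.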
if settings.google_api_key:
return await _generate_mermaid_gemini(description, diagram_type, template, style)
elif settings.openai_api_key:
return await _generate_mermaid_openai(description, diagram_type, template, style)
else:
# Return template as fallback
return {
"success": True,
"code": template["template"],
"diagram_type": diagram_type,
"note": "API keys not configured - returning template"
}
async def _generate_mermaid_gemini(
description: str,
diagram_type: str,
template: dict,
style: str
) -> Dict[str, Any]:
"""Generate Mermaid using Gemini"""
try:
import google.generativeai as genai
genai.configure(api_key=settings.google_api_key)
model = genai.GenerativeModel("gemini-2.0-flash-exp")
prompt = f"""Generate a Mermaid {template['name']} diagram based on this description:
"{description}"
Requirements:
- Use valid Mermaid syntax for {diagram_type}
- Style: {style} (simple=few nodes, detailed=moderate, complex=comprehensive)
- Return ONLY the Mermaid code, no explanations
- Start with the diagram type declaration
Example format:
{template['template']}
Generate the diagram code:"""
# Use the async variant so the event loop is not blocked during the request
response = await model.generate_content_async(prompt)
code = response.text.strip()
# Clean up response
if "```mermaid" in code:
code = code.split("```mermaid")[1].split("```")[0].strip()
elif "```" in code:
code = code.split("```")[1].split("```")[0].strip()
return {
"success": True,
"code": code,
"diagram_type": diagram_type,
"description": description
}
except Exception as e:
return {
"success": False,
"error": str(e),
"code": template["template"]
}
async def _generate_mermaid_openai(
description: str,
diagram_type: str,
template: dict,
style: str
) -> Dict[str, Any]:
"""Generate Mermaid using OpenAI"""
try:
async with httpx.AsyncClient(timeout=60) as client:
response = await client.post(
"https://api.openai.com/v1/chat/completions",
headers={
"Authorization": f"Bearer {settings.openai_api_key}",
"Content-Type": "application/json"
},
json={
"model": "gpt-4o-mini",
"messages": [
{
"role": "system",
"content": f"You are a Mermaid diagram expert. Generate valid Mermaid {diagram_type} diagrams. Return ONLY the code, no explanations."
},
{
"role": "user",
"content": f"Create a {style} {template['name']} diagram for: {description}"
}
],
"temperature": 0.7,
"max_tokens": 1000
}
)
response.raise_for_status()
data = response.json()
code = data["choices"][0]["message"]["content"].strip()
# Clean up
if "```mermaid" in code:
code = code.split("```mermaid")[1].split("```")[0].strip()
elif "```" in code:
code = code.split("```")[1].split("```")[0].strip()
return {
"success": True,
"code": code,
"diagram_type": diagram_type,
"description": description
}
except Exception as e:
return {
"success": False,
"error": str(e),
"code": template["template"]
}
async def convert_markdown(
content: str,
output_format: str = "html",
theme: str = "github"
) -> Dict[str, Any]:
"""Convert Markdown to various formats
Args:
content: Markdown content
output_format: 'html' or 'plain' (any other value returns an error)
theme: CSS theme name for HTML output (currently unused; one built-in stylesheet is applied)
Returns:
Dictionary with converted content
"""
try:
import markdown  # extensions are loaded by name below, so no submodule imports are needed
if output_format == "html":
# Convert to HTML with extensions
md = markdown.Markdown(extensions=[
'tables',
'fenced_code',
'toc',
'nl2br',
'sane_lists'
])
html = md.convert(content)
# Add basic styling
styled_html = f"""<!DOCTYPE html>
<html>
<head>
<style>
body {{ font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; line-height: 1.6; max-width: 800px; margin: 0 auto; padding: 20px; }}
code {{ background: #f4f4f4; padding: 2px 6px; border-radius: 3px; }}
pre {{ background: #f4f4f4; padding: 16px; border-radius: 6px; overflow-x: auto; }}
table {{ border-collapse: collapse; width: 100%; }}
th, td {{ border: 1px solid #ddd; padding: 8px; text-align: left; }}
th {{ background: #f4f4f4; }}
blockquote {{ border-left: 4px solid #ddd; margin: 0; padding-left: 16px; color: #666; }}
</style>
</head>
<body>
{html}
</body>
</html>"""
return {
"success": True,
"content": styled_html,
"format": "html",
"toc": md.toc if hasattr(md, 'toc') else None
}
elif output_format == "plain":
# Strip markdown to plain text
import re
# Remove images
text = re.sub(r'!\[.*?\]\(.*?\)', '', content)
# Remove links but keep text
text = re.sub(r'\[([^\]]+)\]\([^\)]+\)', r'\1', text)
# Remove formatting characters (note: this also strips literal hyphens and underscores)
text = re.sub(r'[*_~`#>-]', '', text)
# Clean up whitespace
text = re.sub(r'\n{3,}', '\n\n', text)
return {
"success": True,
"content": text.strip(),
"format": "plain"
}
else:
return {
"success": False,
"error": f"Unsupported format: {output_format}"
}
except ImportError:
# Fallback without markdown library
return {
"success": True,
"content": content,
"format": output_format,
"note": "markdown library not installed"
}
except Exception as e:
return {
"success": False,
"error": str(e)
}
async def generate_markdown_with_ai(
topic: str,
content_type: str = "article",
length: str = "medium",
include_toc: bool = True
) -> Dict[str, Any]:
"""Generate Markdown content using AI
Args:
topic: Topic or subject to write about
content_type: 'article', 'documentation', 'readme', 'tutorial', 'report'
length: 'short', 'medium', 'long'
include_toc: Include table of contents
Returns:
Dictionary with generated markdown content
"""
length_guide = {
"short": "2-3 paragraphs, ~200 words",
"medium": "5-7 paragraphs, ~500 words",
"long": "10+ paragraphs, ~1000 words"
}
type_guide = {
"article": "engaging article with introduction, body, and conclusion",
"documentation": "technical documentation with clear sections and code examples",
"readme": "GitHub README with badges, installation, usage, and contributing sections",
"tutorial": "step-by-step tutorial with numbered instructions and examples",
"report": "professional report with executive summary, findings, and recommendations"
}
if settings.google_api_key:
return await _generate_markdown_gemini(topic, content_type, type_guide, length_guide.get(length, length_guide["medium"]), include_toc)
elif settings.openai_api_key:
return await _generate_markdown_openai(topic, content_type, type_guide, length_guide.get(length, length_guide["medium"]), include_toc)
else:
return {
"success": False,
"error": "No API keys configured",
"content": f"# {topic}\n\nContent generation requires API keys."
}
async def _generate_markdown_gemini(
topic: str,
content_type: str,
type_guide: dict,
length_guide: str,
include_toc: bool
) -> Dict[str, Any]:
"""Generate markdown using Gemini"""
try:
import google.generativeai as genai
genai.configure(api_key=settings.google_api_key)
model = genai.GenerativeModel("gemini-2.0-flash-exp")
prompt = f"""Write a {type_guide.get(content_type, 'article')} about:
"{topic}"
Requirements:
- Format: Proper Markdown with headers, lists, code blocks where appropriate
- Length: {length_guide}
- {"Include a table of contents at the start" if include_toc else "No table of contents needed"}
- Use appropriate markdown features (bold, italic, links, code, blockquotes)
- Make it informative and well-structured
Generate the markdown content:"""
# Use the async variant so the event loop is not blocked during the request
response = await model.generate_content_async(prompt)
content = response.text.strip()
return {
"success": True,
"content": content,
"content_type": content_type,
"topic": topic
}
except Exception as e:
return {
"success": False,
"error": str(e)
}
async def _generate_markdown_openai(
topic: str,
content_type: str,
type_guide: dict,
length_guide: str,
include_toc: bool
) -> Dict[str, Any]:
"""Generate markdown using OpenAI"""
try:
async with httpx.AsyncClient(timeout=60) as client:
response = await client.post(
"https://api.openai.com/v1/chat/completions",
headers={
"Authorization": f"Bearer {settings.openai_api_key}",
"Content-Type": "application/json"
},
json={
"model": "gpt-4o-mini",
"messages": [
{
"role": "system",
"content": f"You are a technical writer. Generate well-formatted Markdown content. {type_guide.get(content_type, '')}"
},
{
"role": "user",
"content": f"Write about '{topic}'. Length: {length_guide}. {'Include TOC.' if include_toc else ''}"
}
],
"temperature": 0.7,
"max_tokens": 2000
}
)
response.raise_for_status()
data = response.json()
content = data["choices"][0]["message"]["content"].strip()
return {
"success": True,
"content": content,
"content_type": content_type,
"topic": topic
}
except Exception as e:
return {
"success": False,
"error": str(e)
}
def get_mermaid_templates() -> List[Dict[str, str]]:
"""Get available Mermaid diagram templates"""
return [
{
"id": key,
"name": config["name"],
"description": config["description"],
"template": config["template"]
}
for key, config in MERMAID_TEMPLATES.items()
]
def get_mermaid_template(diagram_type: str) -> Optional[Dict[str, Any]]:
"""Get a specific Mermaid template"""
template = MERMAID_TEMPLATES.get(diagram_type)
if template:
return {
"id": diagram_type,
**template
}
return None
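
Note on the fence cleanup in `_generate_mermaid_gemini` and `_generate_mermaid_openai` above: both repeat the same split-on-backticks logic, and when the model uses a fence tagged with something other than `mermaid` (for example `text`), the tag line ends up inside the returned diagram code. The following is a minimal sketch of a shared helper, not part of this commit. The name `strip_code_fence` and its `preferred_lang` parameter are hypothetical, and it assumes the model output contains at most a few well-formed triple-backtick fences.

```python
import re

# Matches one fenced block: opening fence, optional language tag, body, closing fence.
_FENCE_RE = re.compile(r"```[ \t]*([\w+-]*)[ \t]*\r?\n(.*?)```", re.DOTALL)


def strip_code_fence(text: str, preferred_lang: str = "mermaid") -> str:
    """Return the body of a fenced block, or the stripped text if there is none.

    A block tagged with `preferred_lang` wins; otherwise the first block is used.
    The language tag itself is never included in the result.
    """
    blocks = _FENCE_RE.findall(text)  # list of (lang, body) tuples
    for lang, body in blocks:
        if lang.lower() == preferred_lang:
            return body.strip()
    if blocks:
        return blocks[0][1].strip()
    return text.strip()
```

Both helpers could then call `strip_code_fence(response_text)` in place of the duplicated `if "```mermaid" in code` branches. Text with an unclosed fence is returned unchanged apart from whitespace stripping, so a caller may still want to validate the result before rendering.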

@ -0,0 +1,514 @@
"""Prompt Studio Service - AI-Powered Prompt Enhancement
Uses Google Gemini or OpenAI (gpt-4o-mini) to transform basic prompts into
professional, detailed prompts optimized for AI image/video generation.
Features:
- Multiple style presets (cinematic, photographic, artistic, etc.)
- Provider support for various image generators (DALL-E, Stable Diffusion, Midjourney, Flux)
- Negative prompt generation
- Technical parameter suggestions
- Multi-language support
Styles Available:
- cinematic: Movie-like scenes with dramatic lighting
- photographic: Professional photography with camera settings
- artistic: Painterly descriptions with artistic techniques
- product: Commercial product photography
- fantasy: Magical, otherworldly scenes
- minimal: Clean, simple compositions
- vintage: Retro, nostalgic aesthetics
- futuristic: Sci-fi, high-tech visuals
- anime: Japanese animation style
- portrait: Professional portrait photography
- landscape: Nature and scenic photography
- abstract: Non-representational art
- fashion: High-end fashion photography
- architecture: Building and interior design
- food: Culinary and food photography
"""
import httpx
from typing import Optional, Dict, Any, List
from app.config import settings
# Style configurations with detailed instructions
STYLE_CONFIGS = {
"cinematic": {
"name": "Cinematic",
"instruction": """Transform this into a cinematic, movie-like scene description with:
- Dramatic lighting (golden hour, chiaroscuro, rim lighting, volumetric rays)
- Film-quality composition (rule of thirds, leading lines, depth of field)
- Atmospheric elements (fog, dust particles, lens flares)
- Color grading suggestions (teal and orange, desaturated, high contrast)
- Camera movement or angle (dolly shot, crane shot, dutch angle)
- Aspect ratio: 21:9 or 2.39:1 for widescreen cinematic feel""",
"negative_base": "amateur, low budget, poorly lit, flat lighting, snapshot quality",
"technical": {"aspect_ratio": "21:9", "style": "cinematic"}
},
"photographic": {
"name": "Professional Photography",
"instruction": """Transform this into a professional photography prompt with:
- Specific camera and lens (e.g., Canon EOS R5, Sony A7IV, 85mm f/1.4)
- Exact lighting setup (softbox, ring light, natural window light, golden hour)
- Technical settings (ISO, aperture, shutter speed)
- Composition technique (rule of thirds, symmetry, leading lines)
- Post-processing style (high contrast, film emulation, clean edit)""",
"negative_base": "blurry, out of focus, overexposed, underexposed, amateur",
"technical": {"quality": "high", "style": "photorealistic"}
},
"artistic": {
"name": "Fine Art",
"instruction": """Transform this into an artistic, painterly description with:
- Art movement reference (Impressionism, Surrealism, Art Nouveau, Baroque)
- Specific artist style influence (Monet, Van Gogh, Klimt, Dali)
- Medium specification (oil on canvas, watercolor, digital painting)
- Brushwork and texture details (impasto, glazing, wet-on-wet)
- Color palette (complementary, analogous, monochromatic)
- Emotional mood and atmosphere""",
"negative_base": "photorealistic, photograph, digital render, 3D, CGI",
"technical": {"style": "artistic"}
},
"product": {
"name": "Product Photography",
"instruction": """Transform this into professional product photography with:
- Clean, commercial backdrop (white seamless, gradient, lifestyle setting)
- Studio lighting setup (three-point lighting, beauty dish, softbox)
- Hero shot composition (angle, distance, focal point)
- Reflection and shadow control
- Brand-appropriate styling
- E-commerce or advertising context""",
"negative_base": "cluttered background, amateur lighting, dirty, damaged",
"technical": {"background": "transparent", "quality": "high"}
},
"fantasy": {
"name": "Fantasy Art",
"instruction": """Transform this into a fantastical, imaginative scene with:
- Magical elements (glowing particles, ethereal light, mystical symbols)
- Otherworldly setting details (floating islands, crystal formations, ancient ruins)
- Fantasy creature or character design elements
- Epic scale and grandeur
- Rich color palette (jewel tones, iridescent, bioluminescent)
- Atmospheric effects (mist, aurora, magical energy)""",
"negative_base": "mundane, realistic, boring, plain, everyday",
"technical": {"style": "fantasy-art"}
},
"minimal": {
"name": "Minimalist",
"instruction": """Transform this into a minimalist, clean description with:
- Negative space utilization (vast empty areas, breathing room)
- Limited color palette (monochrome, two-tone, muted)
- Simple geometric forms
- Clean lines and shapes
- Subtle textures
- Zen-like calm and balance""",
"negative_base": "cluttered, busy, complex, detailed, ornate, decorated",
"technical": {"style": "minimal"}
},
"vintage": {
"name": "Vintage/Retro",
"instruction": """Transform this into a vintage, retro-styled description with:
- Era-specific details (1920s Art Deco, 1950s Americana, 1970s psychedelic, 1980s neon)
- Film stock characteristics (Kodachrome, Polaroid, black and white)
- Grain and texture (film grain, light leaks, vignette)
- Period-appropriate color palette (sepia, faded, cross-processed)
- Nostalgic elements and props
- Authentic vintage aesthetic""",
"negative_base": "modern, digital, contemporary, clean, sharp",
"technical": {"style": "analog-film"}
},
"futuristic": {
"name": "Sci-Fi/Futuristic",
"instruction": """Transform this into a futuristic, sci-fi description with:
- Advanced technology elements (holograms, neon lights, cybernetic)
- Futuristic architecture (sleek, geometric, towering)
- Sci-fi lighting (neon, bioluminescent, holographic)
- Cyberpunk or utopian aesthetic
- High-tech materials (chrome, glass, LED)
- Atmospheric sci-fi elements (rain, smog, data streams)""",
"negative_base": "primitive, ancient, rustic, natural, organic",
"technical": {"style": "neon-punk"}
},
"anime": {
"name": "Anime/Manga",
"instruction": """Transform this into anime/manga style with:
- Character design elements (large expressive eyes, dynamic poses)
- Japanese animation aesthetic (cel shading, speed lines)
- Studio style reference (Studio Ghibli, Makoto Shinkai, MAPPA)
- Dramatic lighting and composition
- Vibrant color palette
- Emotional expression and atmosphere""",
"negative_base": "realistic, photograph, western cartoon, 3D render",
"technical": {"style": "anime"}
},
"portrait": {
"name": "Portrait Photography",
"instruction": """Transform this into professional portrait photography with:
- Flattering lighting setup (Rembrandt, butterfly, split lighting)
- Lens choice for portraits (85mm, 105mm, shallow depth of field)
- Background treatment (bokeh, studio backdrop, environmental)
- Skin tone and texture (natural, retouched, editorial)
- Expression and emotion capture
- Composition (headshot, half-body, full-body)""",
"negative_base": "unflattering angle, harsh shadows, distorted features",
"technical": {"style": "photographic"}
},
"landscape": {
"name": "Landscape Photography",
"instruction": """Transform this into epic landscape photography with:
- Golden hour or blue hour lighting
- Weather and atmospheric conditions (dramatic clouds, fog, storm)
- Geographic specificity (mountains, ocean, forest, desert)
- Foreground interest and depth
- Wide-angle perspective
- Long exposure effects (smooth water, star trails)""",
"negative_base": "flat, boring, midday harsh light, no depth",
"technical": {"aspect_ratio": "16:9", "style": "photorealistic"}
},
"abstract": {
"name": "Abstract Art",
"instruction": """Transform this into abstract art with:
- Non-representational forms and shapes
- Color theory application (complementary, triadic, split-complementary)
- Texture and pattern exploration
- Movement and flow
- Emotional expression through color and form
- Artistic technique (drip, splatter, geometric)""",
"negative_base": "representational, realistic, figurative, recognizable objects",
"technical": {"style": "digital-art"}
},
"fashion": {
"name": "Fashion Photography",
"instruction": """Transform this into high-end fashion photography with:
- Editorial or commercial context
- Designer styling and wardrobe
- High-fashion lighting (dramatic, clean, artistic)
- Model pose and expression
- Location or studio setting
- Magazine-worthy composition""",
"negative_base": "casual, everyday, amateur, unflattering",
"technical": {"style": "photographic", "quality": "high"}
},
"architecture": {
"name": "Architectural Photography",
"instruction": """Transform this into architectural photography with:
- Building style and era (modern, classical, brutalist, Art Deco)
- Perspective and angles (worm's eye, bird's eye, straight-on)
- Interior or exterior focus
- Lighting conditions (golden hour, twilight, dramatic shadows)
- Detail and texture emphasis
- Scale and grandeur""",
"negative_base": "distorted, amateur angle, poor lighting, obstructed view",
"technical": {"style": "photographic"}
},
"food": {
"name": "Food Photography",
"instruction": """Transform this into appetizing food photography with:
- Styling and plating details
- Lighting setup (backlit, side-lit, soft diffused)
- Props and context (table setting, ingredients, utensils)
- Texture and freshness emphasis
- Color harmony and contrast
- Angle (overhead, 45-degree, eye-level)""",
"negative_base": "unappetizing, messy, cold, stale, poor presentation",
"technical": {"style": "photographic", "quality": "high"}
}
}
# Provider-specific optimizations
PROVIDER_OPTIMIZATIONS = {
"openai": {
"max_length": 4000,
"style_suffix": "highly detailed, professional quality",
"avoid": "text, watermarks, logos"
},
"gpt-image-1": {
"max_length": 32000,
"style_suffix": "highly detailed, professional quality, masterpiece",
"avoid": "text, watermarks, logos, blurry"
},
"stable-diffusion": {
"max_length": 500,
"style_suffix": "(masterpiece, best quality, highly detailed)",
"avoid": "(worst quality, low quality, blurry, distorted)"
},
"midjourney": {
"max_length": 600,
"style_suffix": "--v 6 --q 2 --s 750",
"avoid": "--no text, watermarks, blurry"
},
"flux": {
"max_length": 2000,
"style_suffix": "ultra high quality, professional, detailed",
"avoid": "low quality, amateur, blurry"
},
"leonardo": {
"max_length": 1000,
"style_suffix": "highly detailed, professional, stunning",
"avoid": "low quality, blurry, distorted"
}
}
async def enhance(
prompt: str,
style: str = "cinematic",
provider: str = "openai",
include_negative: bool = True,
include_technical: bool = True,
language: str = "en"
) -> dict:
"""Enhance a prompt using AI
Args:
prompt: The original prompt to enhance
style: Style preset to apply (see STYLE_CONFIGS)
provider: Target image generation provider for optimization
include_negative: Whether to generate negative prompts
include_technical: Whether to include technical parameters
language: Output language code
Returns:
Dictionary with enhanced prompt, negative prompt, and metadata
"""
# Get style configuration
style_config = STYLE_CONFIGS.get(style, STYLE_CONFIGS["cinematic"])
provider_config = PROVIDER_OPTIMIZATIONS.get(provider, PROVIDER_OPTIMIZATIONS["openai"])
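# Try Google Gemini first, then OpenAI, then the rule-based fallback.
# The helpers return a dict containing only "note" when the API call fails,
# so check for an actual enhanced prompt rather than dict truthiness.
enhanced_result = None
if settings.google_api_key:
enhanced_result = await _enhance_with_gemini(prompt, style_config, provider_config, language)
if not (enhanced_result and enhanced_result.get("enhanced_prompt")) and settings.openai_api_key:
enhanced_result = await _enhance_with_openai(prompt, style_config, provider_config, language)
if not (enhanced_result and enhanced_result.get("enhanced_prompt")):
# Fallback to rule-based enhancement
enhanced_result = _enhance_fallback(prompt, style_config, provider_config)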
# Build response
response = {
"original_prompt": prompt,
"enhanced_prompt": enhanced_result.get("enhanced_prompt", prompt),
"style": style,
"style_name": style_config["name"],
"provider": provider
}
if include_negative:
response["negative_prompt"] = enhanced_result.get(
"negative_prompt",
style_config.get("negative_base", "blurry, low quality, distorted")
)
if include_technical:
response["technical_params"] = {
**style_config.get("technical", {}),
"max_prompt_length": provider_config["max_length"]
}
if enhanced_result.get("suggestions"):
response["suggestions"] = enhanced_result["suggestions"]
if enhanced_result.get("note"):
response["note"] = enhanced_result["note"]
return response
async def _enhance_with_gemini(
prompt: str,
style_config: dict,
provider_config: dict,
language: str
) -> Optional[Dict[str, Any]]:
"""Enhance prompt using Google Gemini"""
try:
import google.generativeai as genai
genai.configure(api_key=settings.google_api_key)
model = genai.GenerativeModel("gemini-2.0-flash-exp")
system_prompt = f"""You are an expert AI image prompt engineer. Your task is to transform basic prompts into detailed, professional prompts optimized for AI image generation.
STYLE: {style_config['name']}
{style_config['instruction']}
OPTIMIZATION TARGET: {provider_config.get('max_length', 1000)} characters maximum
Guidelines:
1. Add specific visual details (lighting, colors, textures, materials)
2. Include composition and framing suggestions
3. Add atmosphere, mood, and emotional tone
4. Be specific about quality indicators
5. Keep under {provider_config.get('max_length', 1000)} characters
6. Make it suitable for AI image generators
7. {"Output in " + language if language != "en" else "Output in English"}
ORIGINAL PROMPT: {prompt}
Respond in this exact format:
ENHANCED: [your enhanced prompt here]
NEGATIVE: [negative prompt - things to avoid]
SUGGESTIONS: [1-2 additional tips for better results]"""
# Use the async variant so the event loop is not blocked during the request
response = await model.generate_content_async(system_prompt)
text = response.text.strip()
# Parse response
enhanced_prompt = prompt
negative_prompt = style_config.get("negative_base", "")
suggestions = []
if "ENHANCED:" in text:
parts = text.split("ENHANCED:")[1]
if "NEGATIVE:" in parts:
enhanced_prompt = parts.split("NEGATIVE:")[0].strip()
parts = parts.split("NEGATIVE:")[1]
if "SUGGESTIONS:" in parts:
negative_prompt = parts.split("SUGGESTIONS:")[0].strip()
suggestions = parts.split("SUGGESTIONS:")[1].strip().split("\n")
else:
negative_prompt = parts.strip()
else:
enhanced_prompt = parts.strip()
else:
# If format not followed, use full response as enhanced prompt
enhanced_prompt = text
# Apply provider optimization suffix
if provider_config.get("style_suffix"):
enhanced_prompt = f"{enhanced_prompt}, {provider_config['style_suffix']}"
# Truncate if needed
max_len = provider_config.get("max_length", 1000)
if len(enhanced_prompt) > max_len:
enhanced_prompt = enhanced_prompt[:max_len-3] + "..."
return {
"enhanced_prompt": enhanced_prompt,
"negative_prompt": negative_prompt,
"suggestions": [s.strip() for s in suggestions if s.strip()]
}
except Exception as e:
return {"note": f"Gemini enhancement failed: {str(e)}"}
async def _enhance_with_openai(
prompt: str,
style_config: dict,
provider_config: dict,
language: str
) -> Optional[Dict[str, Any]]:
"""Enhance prompt using OpenAI (gpt-4o-mini). The language argument is currently not used here."""
try:
async with httpx.AsyncClient(timeout=60) as client:
response = await client.post(
"https://api.openai.com/v1/chat/completions",
headers={
"Authorization": f"Bearer {settings.openai_api_key}",
"Content-Type": "application/json"
},
json={
"model": "gpt-4o-mini",
"messages": [
{
"role": "system",
"content": f"""You are an expert AI image prompt engineer. Transform basic prompts into detailed, professional prompts.
STYLE: {style_config['name']}
{style_config['instruction']}
Keep under {provider_config.get('max_length', 1000)} characters. Be specific about visual details, lighting, composition, and mood."""
},
{
"role": "user",
"content": f"Enhance this prompt for {style_config['name']} style:\n\n{prompt}\n\nRespond with only the enhanced prompt, nothing else."
}
],
"temperature": 0.7,
"max_tokens": 500
}
)
response.raise_for_status()
data = response.json()
enhanced_prompt = data["choices"][0]["message"]["content"].strip()
# Apply provider optimization
if provider_config.get("style_suffix"):
enhanced_prompt = f"{enhanced_prompt}, {provider_config['style_suffix']}"
return {
"enhanced_prompt": enhanced_prompt,
"negative_prompt": style_config.get("negative_base", "blurry, low quality")
}
except Exception as e:
return {"note": f"OpenAI enhancement failed: {str(e)}"}
def _enhance_fallback(
prompt: str,
style_config: dict,
provider_config: dict
) -> Dict[str, Any]:
"""Rule-based fallback enhancement when no API is available"""
# Basic enhancement patterns
enhancements = {
"cinematic": "cinematic lighting, dramatic composition, film grain, shallow depth of field, atmospheric, 8K resolution",
"photographic": "professional photography, sharp focus, natural lighting, high resolution, detailed",
"artistic": "artistic style, painterly, rich colors, textured brushstrokes, masterpiece",
"product": "studio lighting, clean white background, professional product photography, sharp details",
"fantasy": "magical atmosphere, ethereal lighting, fantasy art style, highly detailed, epic scale",
"minimal": "minimalist composition, clean lines, negative space, simple elegant",
"vintage": "vintage aesthetic, film grain, warm tones, retro style, nostalgic",
"futuristic": "futuristic, sci-fi, neon lights, cyberpunk aesthetic, high tech",
"anime": "anime style, vibrant colors, expressive, Japanese animation aesthetic",
"portrait": "portrait photography, professional lighting, shallow depth of field, sharp focus",
"landscape": "epic landscape, golden hour lighting, dramatic sky, high resolution",
"abstract": "abstract art, bold colors, dynamic composition, non-representational",
"fashion": "high fashion photography, editorial style, professional lighting, elegant",
"architecture": "architectural photography, dramatic angles, professional composition",
"food": "food photography, appetizing presentation, professional lighting, fresh"
}
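# Look up the style id from STYLE_CONFIGS; deriving it from the display name
# (e.g. "Professional Photography") never matches the keys in `enhancements`.
style_key = next((key for key, config in STYLE_CONFIGS.items() if config is style_config), "cinematic")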
base_enhancement = enhancements.get(style_key, enhancements["cinematic"])
enhanced_prompt = f"{prompt}, {base_enhancement}"
if provider_config.get("style_suffix"):
enhanced_prompt = f"{enhanced_prompt}, {provider_config['style_suffix']}"
return {
"enhanced_prompt": enhanced_prompt,
"negative_prompt": style_config.get("negative_base", "blurry, low quality, distorted, poorly drawn"),
"note": "Enhanced using rule-based system (AI enhancement unavailable)"
}
def get_available_styles() -> List[Dict[str, str]]:
"""Get list of available style presets"""
return [
{"id": key, "name": config["name"]}
for key, config in STYLE_CONFIGS.items()
]
def get_style_info(style: str) -> Optional[Dict[str, Any]]:
"""Get detailed information about a style"""
config = STYLE_CONFIGS.get(style)
if not config:
return None
return {
"id": style,
"name": config["name"],
"description": config["instruction"].split("\n")[0],
"technical": config.get("technical", {}),
"negative_base": config.get("negative_base", "")
}
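
Note on the response parsing in `_enhance_with_gemini` above: the nested `split()` calls assume the ENHANCED, NEGATIVE and SUGGESTIONS labels appear exactly once and in that order. The following is a minimal sketch of a more tolerant parser, not part of this commit. The name `parse_enhancement_response` is hypothetical, and it assumes each label starts at the beginning of a line, as the prompt requests.

```python
import re
from typing import Any, Dict

# A section starts with one of the three labels at the beginning of a line.
_SECTION_RE = re.compile(r"^(ENHANCED|NEGATIVE|SUGGESTIONS):[ \t]*", re.MULTILINE)


def parse_enhancement_response(text: str) -> Dict[str, Any]:
    """Split a labelled model response into its three sections.

    If no label is found, the whole text is treated as the enhanced prompt,
    mirroring the behaviour of the original parsing code.
    """
    result: Dict[str, Any] = {
        "enhanced_prompt": "",
        "negative_prompt": "",
        "suggestions": [],
    }
    matches = list(_SECTION_RE.finditer(text))
    if not matches:
        result["enhanced_prompt"] = text.strip()
        return result

    for index, match in enumerate(matches):
        # A section body runs until the next label or the end of the text.
        end = matches[index + 1].start() if index + 1 < len(matches) else len(text)
        body = text[match.end():end].strip()
        label = match.group(1)
        if label == "ENHANCED":
            result["enhanced_prompt"] = body
        elif label == "NEGATIVE":
            result["negative_prompt"] = body
        else:
            result["suggestions"] = [line.strip() for line in body.splitlines() if line.strip()]
    return result
```

A caller would still need to apply the style's `negative_base` default and the provider suffix and length limit afterwards, since the sketch only separates the sections.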

@ -0,0 +1,229 @@
"""Sound Effects Generation Service using ElevenLabs API"""
import httpx
import structlog
from typing import Optional, Dict, Any
from pathlib import Path
import uuid
from app.config import settings
logger = structlog.get_logger()
# ElevenLabs Sound Effects API endpoint
ELEVENLABS_SFX_URL = "https://api.elevenlabs.io/v1/sound-generation"
# Available output formats
OUTPUT_FORMATS = {
"mp3_44100_128": "MP3 (44.1kHz, 128kbps)",
"mp3_44100_192": "MP3 (44.1kHz, 192kbps)",
"pcm_48000": "Raw PCM (48kHz, no WAV header)",
"opus_48000_64": "Opus (48kHz, 64kbps)",
}
class SoundEffectsGenerator:
"""Generate sound effects using ElevenLabs API"""
def __init__(self):
self.api_key = settings.elevenlabs_api_key
if not self.api_key:
logger.warning("ElevenLabs API key not configured")
async def generate(
self,
text: str,
duration_seconds: Optional[float] = None,
prompt_influence: float = 0.3,
loop: bool = False,
output_format: str = "mp3_44100_128",
output_path: Optional[str] = None,
) -> Dict[str, Any]:
"""
Generate a sound effect from text description.
Args:
text: Description of the sound effect to generate
duration_seconds: Desired duration (max 22 seconds, or None for auto)
prompt_influence: How closely to follow the prompt (0.0-1.0)
loop: Whether to generate a looping sound effect
output_format: Audio format (mp3_44100_128, pcm_48000, etc.)
output_path: Optional path to save the audio file
Returns:
Dict with file_path, duration, format info
"""
if not self.api_key:
raise ValueError("ElevenLabs API key not configured")
logger.info(
"Generating sound effect",
text=text[:50] + "..." if len(text) > 50 else text,
duration=duration_seconds,
loop=loop,
)
headers = {
"xi-api-key": self.api_key,
"Content-Type": "application/json",
}
payload: Dict[str, Any] = {
"text": text,
"prompt_influence": prompt_influence,
}
if duration_seconds is not None:
payload["duration_seconds"] = min(duration_seconds, 22) # Max 22 seconds
if loop:
payload["loop"] = True
params = {"output_format": output_format}
async with httpx.AsyncClient(timeout=120.0) as client:
response = await client.post(
ELEVENLABS_SFX_URL,
headers=headers,
json=payload,
params=params,
)
if response.status_code == 422:
error_detail = response.json()
raise ValueError(f"Validation error: {error_detail}")
response.raise_for_status()
# Determine file extension from format
if output_format.startswith("mp3"):
extension = ".mp3"
elif output_format.startswith("pcm"):
extension = ".wav"
elif output_format.startswith("opus"):
extension = ".opus"
else:
extension = ".mp3"
# Generate output path if not provided
if not output_path:
output_path = str(
Path(settings.storage_path)
/ "audio"
/ f"sfx_{uuid.uuid4().hex[:8]}{extension}"
)
# Ensure directory exists
Path(output_path).parent.mkdir(parents=True, exist_ok=True)
# Write the audio file
with open(output_path, "wb") as f:
f.write(response.content)
file_size = len(response.content)
logger.info(
"Sound effect generated",
output_path=output_path,
file_size=file_size,
format=output_format,
)
return {
"file_path": output_path,
"file_size": file_size,
"format": output_format,
"duration_seconds": duration_seconds,
"loop": loop,
}
async def get_available_formats(self) -> Dict[str, str]:
"""Return available output formats"""
return OUTPUT_FORMATS
# Singleton instance
_generator: Optional[SoundEffectsGenerator] = None
def get_sound_effects_generator() -> SoundEffectsGenerator:
"""Get the singleton sound effects generator instance"""
global _generator
if _generator is None:
_generator = SoundEffectsGenerator()
return _generator
async def generate_sound_effect_job(job_id: str) -> None:
"""Process a sound effect generation job"""
    # Imported here rather than at module level to avoid import cycles at startup
    from app.database import SessionLocal
    from app.models.job import Job
    from app.models.asset import Asset
db = SessionLocal()
try:
job = db.query(Job).filter(Job.id == job_id).first()
if not job:
logger.error(f"Job {job_id} not found")
return
job.status = "processing"
job.progress = 10
db.commit()
input_data = job.input_data
generator = get_sound_effects_generator()
# Generate the sound effect
result = await generator.generate(
text=input_data["text"],
duration_seconds=input_data.get("duration_seconds"),
prompt_influence=input_data.get("prompt_influence", 0.3),
loop=input_data.get("loop", False),
output_format=input_data.get("output_format", "mp3_44100_128"),
)
job.progress = 80
db.commit()
# Create asset for the output
file_path = result["file_path"]
filename = Path(file_path).name
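        # Derive the MIME type from the saved extension; Opus output is not WAV.
        mime_types = {".mp3": "audio/mpeg", ".wav": "audio/wav", ".opus": "audio/opus"}
        asset = Asset(
            user_id=job.user_id,
            original_filename=filename,
            stored_filename=filename,
            file_path=file_path,
            file_type="audio",
            mime_type=mime_types.get(Path(filename).suffix, "application/octet-stream"),
            file_size_bytes=result["file_size"],
            source_module="sound_effects",
            source_job_id=job.id,
        )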
db.add(asset)
db.commit()
db.refresh(asset)
job.output_asset_ids = [asset.id]
job.output_data = {
"duration_seconds": result.get("duration_seconds"),
"format": result["format"],
"loop": result["loop"],
}
job.status = "completed"
job.progress = 100
db.commit()
logger.info(f"Sound effect job {job_id} completed successfully")
    except Exception as e:
        logger.error(f"Sound effect job {job_id} failed: {str(e)}")
        # Discard any half-finished transaction before recording the failure
        db.rollback()
        job = db.query(Job).filter(Job.id == job_id).first()
        if job:
            job.status = "failed"
            job.error_message = str(e)
            db.commit()
finally:
db.close()
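
Note on `pcm_*` output: the generator above writes the response bytes straight to a `.wav` path. If the API returns headerless PCM for those formats (assumed here to be 16-bit little-endian mono, which this file does not confirm), players will not recognise the file until a WAV header is added. A minimal sketch using only the standard library; `pcm_to_wav_bytes` and `sample_rate_from_format` are hypothetical helpers, not part of this commit:

```python
import io
import wave


def sample_rate_from_format(output_format: str) -> int:
    """Extract the sample rate from a format key such as 'pcm_48000'."""
    return int(output_format.split("_")[1])


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM samples in a WAV container.

    Assumes 16-bit mono input by default; adjust channels and
    sample_width if the source audio differs.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```

In `generate`, this could be applied to `response.content` before writing whenever `output_format` starts with `pcm`, passing `sample_rate_from_format(output_format)` as the rate.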


@ -0,0 +1,652 @@
"""
Subtitle Processor Service - Whisper + DeepL + FFmpeg
Full-featured subtitle processing with:
- Whisper transcription (multiple model sizes)
- DeepL translation (30+ languages)
- FFmpeg burning with full styling control
Styling Options:
- font: Font family (Arial, Helvetica, Times New Roman, etc.)
- font_size: Font size in points (default: 24)
- text_color: Primary text color (white, yellow, black, red, blue, green, orange, purple)
- outline_color: Outline/border color
- outline_width: Outline thickness (0-5, default: 2)
- background_color: Optional background box color
- background_opacity: Background box opacity (0-1)
- position: vertical position (bottom, top, center)
- alignment: horizontal alignment (left, center, right)
- margin_v: Vertical margin from edge (default: 30)
- margin_h: Horizontal margin (default: 20)
- shadow: Shadow depth (0-4)
- bold: Bold text (true/false)
- italic: Italic text (true/false)
Whisper Models:
- tiny: Fastest, lowest accuracy (~1GB VRAM)
- base: Fast, good accuracy (~1GB VRAM) - default
- small: Balanced (~2GB VRAM)
- medium: High accuracy (~5GB VRAM)
- large: Best accuracy (~10GB VRAM)
- large-v2: Second-generation large model (~10GB VRAM)
- large-v3: Newest large model with best accuracy (~10GB VRAM)
"""
import os
import subprocess
from uuid import uuid4
from datetime import datetime, timedelta
from typing import Optional
from app.database import SessionLocal
from app.models.job import Job
from app.models.asset import Asset
from app.config import settings
import structlog
logger = structlog.get_logger()
# Supported languages for DeepL translation
SUPPORTED_LANGUAGES = {
'BG': 'Bulgarian',
'CS': 'Czech',
'DA': 'Danish',
'DE': 'German',
'EL': 'Greek',
'EN-GB': 'English (British)',
'EN-US': 'English (American)',
'ES': 'Spanish',
'ET': 'Estonian',
'FI': 'Finnish',
'FR': 'French',
'HU': 'Hungarian',
'ID': 'Indonesian',
'IT': 'Italian',
'JA': 'Japanese',
'KO': 'Korean',
'LT': 'Lithuanian',
'LV': 'Latvian',
'NB': 'Norwegian (Bokmål)',
'NL': 'Dutch',
'PL': 'Polish',
'PT-BR': 'Portuguese (Brazilian)',
'PT-PT': 'Portuguese (European)',
'RO': 'Romanian',
'RU': 'Russian',
'SK': 'Slovak',
'SL': 'Slovenian',
'SV': 'Swedish',
'TR': 'Turkish',
'UK': 'Ukrainian',
'ZH': 'Chinese (simplified)',
'ZH-HANS': 'Chinese (simplified)'
}
# Color mapping for ASS format (BGR order)
COLOR_MAP = {
'white': 'FFFFFF',
'yellow': '00FFFF',
'black': '000000',
'red': '0000FF',
'blue': 'FF0000',
'green': '00FF00',
'orange': '0080FF',
'purple': '800080',
'cyan': 'FFFF00',
'magenta': 'FF00FF',
'gray': '808080',
'silver': 'C0C0C0',
'gold': '00D7FF',
'lime': '00FF00',
'navy': '800000',
'teal': '808000',
'maroon': '000080',
'olive': '008080'
}
# Whisper model options
WHISPER_MODELS = {
'tiny': {'name': 'Tiny', 'vram': '~1GB', 'speed': 'fastest'},
'base': {'name': 'Base', 'vram': '~1GB', 'speed': 'fast'},
'small': {'name': 'Small', 'vram': '~2GB', 'speed': 'moderate'},
'medium': {'name': 'Medium', 'vram': '~5GB', 'speed': 'slow'},
'large': {'name': 'Large', 'vram': '~10GB', 'speed': 'slowest'},
'large-v2': {'name': 'Large V2', 'vram': '~10GB', 'speed': 'slowest'},
'large-v3': {'name': 'Large V3', 'vram': '~10GB', 'speed': 'slowest'}
}
# Font presets
FONT_PRESETS = {
'default': {'font': 'Arial', 'size': 24, 'outline': 2},
'cinematic': {'font': 'Helvetica', 'size': 28, 'outline': 3},
'documentary': {'font': 'Georgia', 'size': 22, 'outline': 1},
'news': {'font': 'Arial', 'size': 26, 'outline': 2},
'social_media': {'font': 'Arial Black', 'size': 32, 'outline': 4},
'minimal': {'font': 'Helvetica', 'size': 20, 'outline': 1},
'bold': {'font': 'Impact', 'size': 30, 'outline': 3}
}
def get_available_fonts():
"""Get list of available fonts on the system"""
try:
output = subprocess.check_output(['fc-list', ':', 'family'], stderr=subprocess.DEVNULL).decode('utf-8')
fonts = set()
for line in output.splitlines():
for font in line.split(','):
font = font.strip()
if font:
fonts.add(font)
return sorted(list(fonts))
except (subprocess.SubprocessError, FileNotFoundError):
return [
'Arial', 'Helvetica', 'Times New Roman', 'Courier New', 'Verdana',
'Georgia', 'Palatino', 'Garamond', 'Comic Sans MS', 'Trebuchet MS',
'Arial Black', 'Impact', 'Tahoma', 'Roboto', 'Open Sans'
]
def get_subtitle_config():
"""Return available configuration options for subtitles"""
return {
"whisper_models": WHISPER_MODELS,
"supported_languages": SUPPORTED_LANGUAGES,
"colors": list(COLOR_MAP.keys()),
"fonts": get_available_fonts(),
"font_presets": FONT_PRESETS,
"positions": ["bottom", "top", "center"],
"alignments": ["left", "center", "right"]
}
async def process(job_id: str):
"""
Process video for subtitles - transcribe, translate, optionally burn
Input parameters:
- source_language: Source language code or "auto" for detection
- target_language: Target language code for translation (optional)
- burn_subtitles: Whether to burn subtitles into video
- whisper_model: Whisper model size (tiny/base/small/medium/large/large-v2/large-v3)
- font: Font family name
- font_size: Font size in points
- text_color: Primary text color
- outline_color: Text outline color
- outline_width: Outline thickness (0-5)
- background_color: Background box color (optional)
- background_opacity: Background opacity 0-1 (default 0)
- position: Vertical position (bottom/top/center)
- alignment: Horizontal alignment (left/center/right)
- margin_v: Vertical margin from edge
- margin_h: Horizontal margin
- shadow: Shadow depth (0-4)
- bold: Use bold text
- italic: Use italic text
- font_preset: Use a predefined style preset
- word_timestamps: Include word-level timestamps in output
- output_format: SRT, VTT, or ASS format
"""
db = SessionLocal()
try:
job = db.query(Job).filter(Job.id == job_id).first()
if not job:
return
input_data = job.input_data
input_asset_ids = job.input_asset_ids
if not input_asset_ids:
raise ValueError("No input asset provided")
input_asset = db.query(Asset).filter(Asset.id == input_asset_ids[0]).first()
if not input_asset:
raise ValueError("Input asset not found")
job.progress = 5
job.api_provider = "whisper"
db.commit()
# Get all parameters with defaults
source_language = input_data.get("source_language", "auto")
target_language = input_data.get("target_language")
burn_subtitles = input_data.get("burn_subtitles", False)
whisper_model = input_data.get("whisper_model", "base")
word_timestamps = input_data.get("word_timestamps", False)
output_format = input_data.get("output_format", "srt").lower()
# Styling parameters
font_preset = input_data.get("font_preset")
if font_preset and font_preset in FONT_PRESETS:
preset = FONT_PRESETS[font_preset]
font = input_data.get("font", preset['font'])
font_size = input_data.get("font_size", preset['size'])
outline_width = input_data.get("outline_width", preset['outline'])
else:
font = input_data.get("font", "Arial")
font_size = input_data.get("font_size", 24)
outline_width = input_data.get("outline_width", 2)
text_color = input_data.get("text_color", "white")
outline_color = input_data.get("outline_color", "black")
background_color = input_data.get("background_color")
background_opacity = input_data.get("background_opacity", 0)
position = input_data.get("position", "bottom")
alignment = input_data.get("alignment", "center")
margin_v = input_data.get("margin_v", 30)
margin_h = input_data.get("margin_h", 20)
shadow = input_data.get("shadow", 0)
bold = input_data.get("bold", False)
italic = input_data.get("italic", False)
# Extract audio from video
audio_path = os.path.join(settings.storage_path, "temp", f"{uuid4()}.wav")
os.makedirs(os.path.dirname(audio_path), exist_ok=True)
subprocess.run([
"ffmpeg", "-i", input_asset.file_path,
"-vn", "-acodec", "pcm_s16le", "-ar", "16000", "-ac", "1",
"-y", audio_path
], check=True, capture_output=True)
job.progress = 20
db.commit()
# Transcribe with Whisper
import whisper
logger.info(f"Loading Whisper model: {whisper_model}")
model = whisper.load_model(whisper_model)
transcribe_options = {
"language": None if source_language == "auto" else source_language,
"verbose": False,
"word_timestamps": word_timestamps
}
result = model.transcribe(audio_path, **transcribe_options)
job.progress = 50
job.api_model = f"whisper-{whisper_model}"
db.commit()
# Generate subtitle content
segments = result.get("segments", [])
detected_language = result.get("language", source_language)
if output_format == "vtt":
subtitle_content = _generate_vtt(segments, word_timestamps)
subtitle_ext = "vtt"
elif output_format == "ass":
subtitle_content = _generate_ass(segments, font, font_size, text_color, outline_color,
outline_width, position, alignment, margin_v, margin_h,
shadow, bold, italic, background_color, background_opacity)
subtitle_ext = "ass"
else:
subtitle_content = _generate_srt(segments)
subtitle_ext = "srt"
        # Translate if needed
        translated_content = None
        if target_language:
            job.api_provider = "whisper+deepl"
            import deepl
            translator = deepl.Translator(settings.deepl_api_key)
            # Send each segment as its own list item so the results stay aligned
            # one-to-one with the segments. Joining on "\n" and splitting the
            # result misaligns timings as soon as a line is merged or dropped.
            segment_texts = [seg.get("text", "").strip() for seg in segments]
            translated_segments = []
            if segment_texts:
                results = translator.translate_text(segment_texts, target_lang=target_language)
                for seg, translated in zip(segments, results):
                    new_seg = seg.copy()
                    new_seg["text"] = translated.text
                    # Word-level timings describe the source-language words, so
                    # drop them to keep the translated text in VTT output.
                    new_seg.pop("words", None)
                    translated_segments.append(new_seg)
if output_format == "vtt":
translated_content = _generate_vtt(translated_segments, word_timestamps)
elif output_format == "ass":
translated_content = _generate_ass(translated_segments, font, font_size, text_color,
outline_color, outline_width, position, alignment,
margin_v, margin_h, shadow, bold, italic,
background_color, background_opacity)
else:
translated_content = _generate_srt(translated_segments)
job.progress = 70
db.commit()
output_assets = []
# Save original subtitle file
subtitle_filename = f"subtitles_{uuid4()}.{subtitle_ext}"
subtitle_path = os.path.join(settings.storage_path, "documents", subtitle_filename)
os.makedirs(os.path.dirname(subtitle_path), exist_ok=True)
with open(subtitle_path, "w", encoding="utf-8") as f:
f.write(subtitle_content)
subtitle_asset = Asset(
user_id=job.user_id,
project_id=job.project_id,
original_filename=subtitle_filename,
stored_filename=subtitle_filename,
file_path=subtitle_path,
file_type="document",
mime_type="text/plain",
file_size_bytes=len(subtitle_content.encode()),
source_module="subtitle_processor",
source_job_id=job.id,
parent_asset_id=input_asset.id,
metadata={
"language": detected_language,
"type": "original",
"format": output_format,
"whisper_model": whisper_model
}
)
db.add(subtitle_asset)
db.commit()
db.refresh(subtitle_asset)
output_assets.append(subtitle_asset.id)
# Save translated subtitle if exists
trans_path = None
if translated_content:
trans_filename = f"subtitles_translated_{uuid4()}.{subtitle_ext}"
trans_path = os.path.join(settings.storage_path, "documents", trans_filename)
with open(trans_path, "w", encoding="utf-8") as f:
f.write(translated_content)
trans_asset = Asset(
user_id=job.user_id,
project_id=job.project_id,
original_filename=trans_filename,
stored_filename=trans_filename,
file_path=trans_path,
file_type="document",
mime_type="text/plain",
file_size_bytes=len(translated_content.encode()),
source_module="subtitle_processor",
source_job_id=job.id,
parent_asset_id=input_asset.id,
metadata={
"language": target_language,
"type": "translated",
"format": output_format
}
)
db.add(trans_asset)
db.commit()
db.refresh(trans_asset)
output_assets.append(trans_asset.id)
job.progress = 80
db.commit()
# Burn subtitles if requested
if burn_subtitles:
burn_path = trans_path if translated_content else subtitle_path
output_filename = f"subtitled_{uuid4()}.mp4"
output_path = os.path.join(settings.storage_path, "videos", output_filename)
os.makedirs(os.path.dirname(output_path), exist_ok=True)
# Build the FFmpeg subtitle filter
subtitle_filter = _build_subtitle_filter(
burn_path, font, font_size, text_color, outline_color,
outline_width, position, alignment, margin_v, margin_h,
shadow, bold, italic, background_color, background_opacity
)
subprocess.run([
"ffmpeg", "-i", input_asset.file_path,
"-vf", subtitle_filter,
"-c:a", "copy",
"-y", output_path
], check=True, capture_output=True)
video_size = os.path.getsize(output_path)
video_asset = Asset(
user_id=job.user_id,
project_id=job.project_id,
original_filename=output_filename,
stored_filename=output_filename,
file_path=output_path,
file_type="video",
mime_type="video/mp4",
file_size_bytes=video_size,
width=input_asset.width,
height=input_asset.height,
duration_seconds=input_asset.duration_seconds,
source_module="subtitle_processor",
source_job_id=job.id,
parent_asset_id=input_asset.id,
metadata={
"burned_subtitles": True,
"subtitle_language": target_language or detected_language,
"styling": {
"font": font,
"font_size": font_size,
"text_color": text_color,
"position": position
}
}
)
db.add(video_asset)
db.commit()
db.refresh(video_asset)
output_assets.append(video_asset.id)
# Cleanup temp audio
if os.path.exists(audio_path):
os.remove(audio_path)
job.output_asset_ids = output_assets
job.output_data = {
"transcript": result.get("text", ""),
"language": detected_language,
"segments_count": len(segments),
"word_timestamps": word_timestamps,
"output_format": output_format,
"translated": bool(translated_content),
"burned": burn_subtitles,
"asset_ids": [str(a) for a in output_assets]
}
job.progress = 100
job.status = "completed"
job.completed_at = datetime.utcnow()
db.commit()
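    except Exception as e:
        logger.error(f"Subtitle processing error: {e}")
        # Roll back any half-finished transaction, then re-query the job in case
        # the exception was raised before `job` was assigned.
        db.rollback()
        job = db.query(Job).filter(Job.id == job_id).first()
        if job:
            job.status = "failed"
            job.error_message = str(e)
            db.commit()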
finally:
db.close()
def _generate_srt(segments: list) -> str:
"""Generate SRT format from segments"""
srt_lines = []
for i, segment in enumerate(segments, 1):
start = _format_srt_timestamp(segment['start'])
end = _format_srt_timestamp(segment['end'])
text = segment['text'].strip()
srt_lines.append(f"{i}\n{start} --> {end}\n{text}\n")
return "\n".join(srt_lines)
def _generate_vtt(segments: list, word_timestamps: bool = False) -> str:
"""Generate WebVTT format from segments"""
vtt_lines = ["WEBVTT", ""]
for i, segment in enumerate(segments, 1):
start = _format_vtt_timestamp(segment['start'])
end = _format_vtt_timestamp(segment['end'])
text = segment['text'].strip()
# Add word-level timestamps if available
if word_timestamps and 'words' in segment:
words_with_timing = []
for word in segment['words']:
word_start = _format_vtt_timestamp(word['start'])
words_with_timing.append(f"<{word_start}>{word['word']}")
text = "".join(words_with_timing)
vtt_lines.append(f"{i}")
vtt_lines.append(f"{start} --> {end}")
vtt_lines.append(text)
vtt_lines.append("")
return "\n".join(vtt_lines)
def _generate_ass(segments: list, font: str, font_size: int, text_color: str,
outline_color: str, outline_width: float, position: str,
alignment: str, margin_v: int, margin_h: int, shadow: int,
bold: bool, italic: bool, background_color: Optional[str],
background_opacity: float) -> str:
"""Generate ASS (Advanced SubStation Alpha) format with full styling"""
# Convert colors to ASS format (&HBBGGRR)
primary_hex = COLOR_MAP.get(text_color.lower(), 'FFFFFF')
outline_hex = COLOR_MAP.get(outline_color.lower(), '000000')
# Calculate alignment value (SSA uses different numbering)
# 1=left-bottom, 2=center-bottom, 3=right-bottom
# 4=left-middle, 5=center-middle, 6=right-middle
# 7=left-top, 8=center-top, 9=right-top
align_map = {
('left', 'bottom'): 1, ('center', 'bottom'): 2, ('right', 'bottom'): 3,
('left', 'center'): 4, ('center', 'center'): 5, ('right', 'center'): 6,
('left', 'top'): 7, ('center', 'top'): 8, ('right', 'top'): 9
}
ass_alignment = align_map.get((alignment, position), 2)
# Background color with opacity
back_alpha = hex(int((1 - background_opacity) * 255))[2:].upper().zfill(2)
if background_color:
back_hex = COLOR_MAP.get(background_color.lower(), '000000')
back_color = f"&H{back_alpha}{back_hex}"
else:
back_color = f"&H{back_alpha}000000"
# Font weight and style
bold_val = -1 if bold else 0
italic_val = -1 if italic else 0
ass_content = f"""[Script Info]
Title: Generated Subtitles
ScriptType: v4.00+
PlayResX: 1920
PlayResY: 1080
ScaledBorderAndShadow: yes
[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,{font},{font_size},&H00{primary_hex},&H00{primary_hex},&H00{outline_hex},{back_color},{bold_val},{italic_val},0,0,100,100,0,0,1,{outline_width},{shadow},{ass_alignment},{margin_h},{margin_h},{margin_v},1
[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
"""
for segment in segments:
start = _format_ass_timestamp(segment['start'])
end = _format_ass_timestamp(segment['end'])
text = segment['text'].strip().replace('\n', '\\N')
ass_content += f"Dialogue: 0,{start},{end},Default,,0,0,0,,{text}\n"
return ass_content
def _format_srt_timestamp(seconds: float) -> str:
    """Convert seconds to SRT timestamp format (HH:MM:SS,mmm)"""
    # Work in integer milliseconds; timedelta.seconds wraps around at 24 hours.
    total_ms = int(round(seconds * 1000))
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"
def _format_vtt_timestamp(seconds: float) -> str:
    """Convert seconds to WebVTT timestamp format (HH:MM:SS.mmm)"""
    # Same arithmetic as SRT; only the decimal separator differs.
    total_ms = int(round(seconds * 1000))
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"
def _format_ass_timestamp(seconds: float) -> str:
    """Convert seconds to ASS timestamp format (H:MM:SS.cc)"""
    # Round to whole centiseconds first so float error cannot drop a unit
    # (for example 1.29 s becoming 1.28).
    total_cs = int(round(seconds * 100))
    hours, remainder = divmod(total_cs, 360_000)
    minutes, remainder = divmod(remainder, 6_000)
    secs, centisecs = divmod(remainder, 100)
    return f"{hours}:{minutes:02d}:{secs:02d}.{centisecs:02d}"
def _build_subtitle_filter(subtitle_path: str, font: str, font_size: int,
text_color: str, outline_color: str, outline_width: float,
position: str, alignment: str, margin_v: int, margin_h: int,
shadow: int, bold: bool, italic: bool,
background_color: Optional[str], background_opacity: float) -> str:
"""Build FFmpeg subtitle filter with styling"""
    # ASS files carry their own styling, so only the path needs escaping
    if subtitle_path.endswith('.ass'):
        escaped_ass_path = subtitle_path.replace("'", "'\\''").replace(":", "\\:")
        return f"ass='{escaped_ass_path}'"
# Get hex colors
primary_hex = COLOR_MAP.get(text_color.lower(), 'FFFFFF')
outline_hex = COLOR_MAP.get(outline_color.lower(), '000000')
# Calculate alignment for subtitles filter
# SSA/ASS alignment: 1-3 bottom, 4-6 middle, 7-9 top
align_map = {
('left', 'bottom'): 1, ('center', 'bottom'): 2, ('right', 'bottom'): 3,
('left', 'center'): 4, ('center', 'center'): 5, ('right', 'center'): 6,
('left', 'top'): 7, ('center', 'top'): 8, ('right', 'top'): 9
}
ass_alignment = align_map.get((alignment, position), 2)
# Build force_style string
style_parts = [
f"Fontname={font}",
f"Fontsize={font_size}",
f"PrimaryColour=&H00{primary_hex}",
f"OutlineColour=&H00{outline_hex}",
f"BorderStyle=1",
f"Outline={outline_width:.1f}",
f"Shadow={shadow}",
f"Alignment={ass_alignment}",
f"MarginL={margin_h}",
f"MarginR={margin_h}",
f"MarginV={margin_v}"
]
if bold:
style_parts.append("Bold=1")
if italic:
style_parts.append("Italic=1")
# Add background if specified
if background_color and background_opacity > 0:
back_alpha = hex(int((1 - background_opacity) * 255))[2:].upper().zfill(2)
back_hex = COLOR_MAP.get(background_color.lower(), '000000')
style_parts.append(f"BackColour=&H{back_alpha}{back_hex}")
style_parts.append("BorderStyle=4") # Opaque box style
force_style = ",".join(style_parts)
# Escape the subtitle path for FFmpeg
escaped_path = subtitle_path.replace("'", "'\\''").replace(":", "\\:")
return f"subtitles='{escaped_path}':force_style='{force_style}'"
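
Note on ASS colours: `COLOR_MAP` stores each colour as hex in blue-green-red order, and the style strings prepend an alpha byte where `00` is opaque and `FF` is fully transparent. To accept arbitrary `#RRGGBB` values instead of named colours, a conversion along these lines could be used. This is a sketch under that assumption; `rgb_to_ass_colour` is a hypothetical helper, not part of this commit:

```python
def rgb_to_ass_colour(rgb_hex: str, opacity: float = 1.0) -> str:
    """Convert '#RRGGBB' plus an opacity in [0, 1] to ASS '&HAABBGGRR'."""
    rgb_hex = rgb_hex.lstrip("#")
    if len(rgb_hex) != 6:
        raise ValueError("expected a six-digit hex colour such as '#FFCC00'")
    int(rgb_hex, 16)  # raises ValueError if the string is not valid hex

    red, green, blue = rgb_hex[0:2], rgb_hex[2:4], rgb_hex[4:6]

    # ASS alpha is inverted relative to opacity: 00 is opaque, FF transparent.
    opacity = max(0.0, min(1.0, opacity))
    alpha = round((1.0 - opacity) * 255)

    return f"&H{alpha:02X}{blue}{green}{red}".upper()
```

For yellow (`#FFFF00`) at full opacity this yields the same `00FFFF` colour bytes that `COLOR_MAP` lists under `'yellow'`, prefixed with `&H00`.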


@ -0,0 +1,406 @@
"""Text to Speech Service - ElevenLabs
Supported Models (December 2025):
- eleven_multilingual_v2: Highest quality, 32 languages (default)
- eleven_flash_v2_5: Ultra-low 75ms latency for real-time/chatbots
- eleven_turbo_v2_5: Emotion & drama - great for dialogue, characters, storytelling
- eleven_monolingual_v1: English only (legacy)
- eleven_v3: Latest model with high emotional range (alpha, multilingual only)
Model Selection Guide:
- Quality & Languages -> eleven_multilingual_v2
- Speed/Real-time (chatbots, live agents) -> eleven_flash_v2_5
- Emotion & Drama (dialogue, characters) -> eleven_turbo_v2_5
Voice Settings:
- stability: 0.0-1.0 (higher = more consistent, lower = more expressive)
- similarity_boost: 0.0-1.0 (higher = closer to original voice)
- style: 0.0-1.0 (style exaggeration, v2+ models only)
- use_speaker_boost: boolean (enhance voice clarity)
- speed: 0.7-1.2 (speech speed, default 1.0)
Advanced Features:
- seed: Integer for reproducible output (same seed + params = same result)
- previous_text: Context for better prosody continuation
- next_text: Lookahead context for natural flow
- apply_text_normalization: 'auto', 'on', 'off' (number/date spelling)
- language_code: Override auto-detection (e.g., 'en', 'es', 'fr')
Output Formats:
- MP3: mp3_44100_128, mp3_44100_192, mp3_22050_32
- PCM: pcm_16000, pcm_22050, pcm_24000, pcm_44100, pcm_48000
- Opus: opus_48000, opus_64000
- Other: ulaw_8000, alaw_8000
Voice Cloning:
- Instant Voice Cloning (IVC): Quick replication from short samples
- Professional Voice Cloning (PVC): 30+ min audio for highest fidelity
"""
import httpx
import os
from uuid import uuid4
from datetime import datetime
from typing import Optional, Dict, Any
from app.database import SessionLocal
from app.models.job import Job
from app.models.asset import Asset
from app.config import settings
# Available models with their descriptions
ELEVENLABS_MODELS = {
"eleven_multilingual_v2": {
"name": "Multilingual v2",
"description": "Highest quality, supports 32 languages",
"latency": "medium",
"use_case": "quality",
"supports_style": True,
"languages": 32
},
"eleven_flash_v2_5": {
"name": "Flash v2.5",
"description": "Ultra-low 75ms latency for real-time apps",
"latency": "ultra-low",
"use_case": "realtime",
"supports_style": True,
"languages": 32
},
"eleven_turbo_v2_5": {
"name": "Turbo v2.5",
"description": "Emotion & drama - dialogue, characters, storytelling",
"latency": "low",
"use_case": "emotion",
"supports_style": True,
"languages": 32
},
"eleven_v3": {
"name": "Eleven v3 (Alpha)",
"description": "Latest model with high emotional range",
"latency": "medium",
"use_case": "emotion",
"supports_style": True,
"languages": 32
},
"eleven_monolingual_v1": {
"name": "English v1",
"description": "English only, legacy model",
"latency": "medium",
"use_case": "legacy",
"supports_style": False,
"languages": 1
}
}
OUTPUT_FORMATS = {
# MP3 formats
"mp3_44100_128": {"ext": "mp3", "mime": "audio/mpeg"},
"mp3_44100_192": {"ext": "mp3", "mime": "audio/mpeg"},
"mp3_22050_32": {"ext": "mp3", "mime": "audio/mpeg"},
# PCM formats (raw audio)
"pcm_16000": {"ext": "wav", "mime": "audio/wav"},
"pcm_22050": {"ext": "wav", "mime": "audio/wav"},
"pcm_24000": {"ext": "wav", "mime": "audio/wav"},
"pcm_44100": {"ext": "wav", "mime": "audio/wav"},
"pcm_48000": {"ext": "wav", "mime": "audio/wav"},
# Opus formats
"opus_48000": {"ext": "opus", "mime": "audio/opus"},
"opus_64000": {"ext": "opus", "mime": "audio/opus"},
# Telephony formats
"ulaw_8000": {"ext": "wav", "mime": "audio/wav"},
"alaw_8000": {"ext": "wav", "mime": "audio/wav"}
}
async def synthesize(job_id: str):
"""Synthesize speech from text using ElevenLabs
Input parameters:
- text: The text to convert to speech
- voice_id: ElevenLabs voice ID
- model_id: Model to use (see ELEVENLABS_MODELS)
- stability: Voice stability 0.0-1.0 (default 0.5)
- similarity_boost: Voice similarity 0.0-1.0 (default 0.75)
- style: Style exaggeration 0.0-1.0 (v2+ models, default 0.0)
- use_speaker_boost: Enhance voice clarity (default true)
- speed: Speech speed 0.7-1.2 (default 1.0)
- output_format: Audio format (default mp3_44100_128)
- seed: Optional seed for reproducible output
- language_code: Override auto-detection (e.g., 'en', 'es', 'fr', 'de')
- previous_text: Context from before for better prosody
- next_text: Lookahead context for natural flow
- apply_text_normalization: 'auto', 'on', 'off' (how to spell numbers/dates)
"""
db = SessionLocal()
try:
job = db.query(Job).filter(Job.id == job_id).first()
if not job:
return
input_data = job.input_data
# Extract all parameters with defaults
text = input_data.get("text", "")
voice_id = input_data.get("voice_id", "21m00Tcm4TlvDq8ikWAM")
model_id = input_data.get("model_id", "eleven_multilingual_v2")
stability = float(input_data.get("stability", 0.5))
similarity_boost = float(input_data.get("similarity_boost", 0.75))
style = float(input_data.get("style", 0.0))
use_speaker_boost = input_data.get("use_speaker_boost", True)
speed = float(input_data.get("speed", 1.0))
output_format = input_data.get("output_format", "mp3_44100_128")
seed = input_data.get("seed")
# New advanced parameters
language_code = input_data.get("language_code")
previous_text = input_data.get("previous_text")
next_text = input_data.get("next_text")
apply_text_normalization = input_data.get("apply_text_normalization", "auto")
# Validate speed range
speed = max(0.7, min(1.2, speed))
job.progress = 10
job.api_provider = "elevenlabs"
job.api_model = model_id
db.commit()
# Get model config to check supported features
model_config = ELEVENLABS_MODELS.get(model_id, ELEVENLABS_MODELS["eleven_multilingual_v2"])
# Build voice settings
voice_settings: Dict[str, Any] = {
"stability": stability,
"similarity_boost": similarity_boost,
"use_speaker_boost": use_speaker_boost
}
# Style only supported in v2+ models
if model_config.get("supports_style", False):
voice_settings["style"] = style
# Build request payload
payload: Dict[str, Any] = {
"text": text,
"model_id": model_id,
"voice_settings": voice_settings
}
        # Add optional parameters
        # Speed belongs in voice_settings in the ElevenLabs API, not at the top level
        if speed != 1.0:
            voice_settings["speed"] = speed
if seed is not None:
payload["seed"] = seed
if language_code:
payload["language_code"] = language_code
if previous_text:
payload["previous_text"] = previous_text
if next_text:
payload["next_text"] = next_text
if apply_text_normalization and apply_text_normalization != "auto":
payload["apply_text_normalization"] = apply_text_normalization
        # Look up the file extension and MIME type for the requested format
        format_info = OUTPUT_FORMATS.get(output_format, OUTPUT_FORMATS["mp3_44100_128"])
        async with httpx.AsyncClient(timeout=120) as client:
            response = await client.post(
                f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}",
                headers={
                    "xi-api-key": settings.elevenlabs_api_key,
                    "Content-Type": "application/json",
                    # The output_format query parameter selects the actual encoding
                    "Accept": "audio/mpeg"
                },
                params={"output_format": output_format},
                json=payload
            )
            response.raise_for_status()
            audio_data = response.content
        job.progress = 80
        db.commit()
        # Save audio file with the extension that matches the requested format
        filename = f"tts_{uuid4()}.{format_info['ext']}"
        storage_path = os.path.join(settings.storage_path, "audio")
        os.makedirs(storage_path, exist_ok=True)
        file_path = os.path.join(storage_path, filename)
        with open(file_path, "wb") as f:
            f.write(audio_data)
        # Create asset
        asset = Asset(
            user_id=job.user_id,
            project_id=job.project_id,
            original_filename=filename,
            stored_filename=filename,
            file_path=file_path,
            file_type="audio",
            mime_type=format_info["mime"],
file_size_bytes=len(audio_data),
source_module="text_to_speech",
source_job_id=job.id,
metadata={
"text_length": len(text),
"voice_id": voice_id,
"model_id": model_id
}
)
db.add(asset)
db.commit()
db.refresh(asset)
job.output_asset_ids = [asset.id]
job.output_data = {"asset_id": str(asset.id), "file_path": file_path}
job.progress = 100
job.status = "completed"
job.completed_at = datetime.utcnow()
db.commit()
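    except Exception as e:
        # Roll back any half-finished transaction, then re-query the job in case
        # the exception was raised before `job` was assigned.
        db.rollback()
        job = db.query(Job).filter(Job.id == job_id).first()
        if job:
            job.status = "failed"
            job.error_message = str(e)
            db.commit()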
finally:
db.close()
async def speech_to_speech(job_id: str):
"""Convert voice to another voice using ElevenLabs"""
db = SessionLocal()
try:
job = db.query(Job).filter(Job.id == job_id).first()
if not job:
return
input_data = job.input_data
input_asset_ids = job.input_asset_ids
if not input_asset_ids:
raise ValueError("No input asset provided")
input_asset = db.query(Asset).filter(Asset.id == input_asset_ids[0]).first()
if not input_asset:
raise ValueError("Input asset not found")
job.progress = 10
job.api_provider = "elevenlabs"
job.api_model = "eleven_english_sts_v2"
db.commit()
voice_id = input_data.get("voice_id")
if not voice_id:
raise ValueError("No voice_id provided")
# Read input audio
with open(input_asset.file_path, "rb") as f:
audio_data = f.read()
job.progress = 20
db.commit()
async with httpx.AsyncClient(timeout=120) as client:
response = await client.post(
f"https://api.elevenlabs.io/v1/speech-to-speech/{voice_id}",
headers={
"xi-api-key": settings.elevenlabs_api_key,
"Accept": "audio/mpeg"
},
files={"audio": (input_asset.original_filename, audio_data, input_asset.mime_type)},
data={
"model_id": "eleven_english_sts_v2",
"voice_settings": '{"stability": 0.5, "similarity_boost": 0.5}'
}
)
response.raise_for_status()
converted_audio = response.content
job.progress = 80
db.commit()
# Save converted audio
filename = f"sts_{uuid4()}.mp3"
storage_path = os.path.join(settings.storage_path, "audio")
os.makedirs(storage_path, exist_ok=True)
file_path = os.path.join(storage_path, filename)
with open(file_path, "wb") as f:
f.write(converted_audio)
# Create asset
asset = Asset(
user_id=job.user_id,
project_id=job.project_id,
original_filename=filename,
stored_filename=filename,
file_path=file_path,
file_type="audio",
mime_type="audio/mpeg",
file_size_bytes=len(converted_audio),
source_module="speech_to_speech",
source_job_id=job.id,
parent_asset_id=input_asset.id,
metadata={"voice_id": voice_id}
)
db.add(asset)
db.commit()
db.refresh(asset)
job.output_asset_ids = [asset.id]
job.output_data = {"asset_id": str(asset.id), "file_path": file_path}
job.progress = 100
job.status = "completed"
job.completed_at = datetime.utcnow()
db.commit()
except Exception as e:
job.status = "failed"
job.error_message = str(e)
db.commit()
finally:
db.close()
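
`speech_to_speech` sends `voice_settings` as a hand-written JSON string inside the multipart form. A small sketch of building that string with `json.dumps`, so the values can come from `input_data` without manual quoting. The function name `build_voice_settings` is hypothetical, and clamping to the 0.0 to 1.0 range is an assumption about what the API accepts:

```python
import json


def build_voice_settings(stability: float = 0.5, similarity_boost: float = 0.5) -> str:
    """Return the voice_settings form field as a JSON string."""

    def clamp(value: float) -> float:
        # Assumption: both settings are expected to lie in [0.0, 1.0].
        return max(0.0, min(1.0, float(value)))

    return json.dumps(
        {
            "stability": clamp(stability),
            "similarity_boost": clamp(similarity_boost),
        }
    )
```

The result could be passed as `data={"model_id": ..., "voice_settings": build_voice_settings()}` in place of the literal string.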
async def get_voices() -> list:
"""Get available ElevenLabs voices"""
if not settings.elevenlabs_api_key:
# Return default voices when API key is not configured
return [
{"voice_id": "21m00Tcm4TlvDq8ikWAM", "name": "Rachel (Default)", "category": "premade", "labels": {"accent": "american", "gender": "female"}},
{"voice_id": "AZnzlk1XvdvUeBnXmlld", "name": "Domi", "category": "premade", "labels": {"accent": "american", "gender": "female"}},
{"voice_id": "EXAVITQu4vr4xnSDxMaL", "name": "Bella", "category": "premade", "labels": {"accent": "american", "gender": "female"}},
{"voice_id": "ErXwobaYiN019PkySvjV", "name": "Antoni", "category": "premade", "labels": {"accent": "american", "gender": "male"}},
{"voice_id": "MF3mGyEYCl7XYWbV9V6O", "name": "Elli", "category": "premade", "labels": {"accent": "american", "gender": "female"}},
{"voice_id": "TxGEqnHWrfWFTfGW9XjX", "name": "Josh", "category": "premade", "labels": {"accent": "american", "gender": "male"}},
{"voice_id": "VR6AewLTigWG4xSOukaG", "name": "Arnold", "category": "premade", "labels": {"accent": "american", "gender": "male"}},
{"voice_id": "pNInz6obpgDQGcFmaJgB", "name": "Adam", "category": "premade", "labels": {"accent": "american", "gender": "male"}},
{"voice_id": "yoZ06aMxZJJ28mfd3POQ", "name": "Sam", "category": "premade", "labels": {"accent": "american", "gender": "male"}},
]
try:
async with httpx.AsyncClient(timeout=30) as client:
response = await client.get(
"https://api.elevenlabs.io/v1/voices",
headers={"xi-api-key": settings.elevenlabs_api_key}
)
response.raise_for_status()
data = response.json()
voices = []
for voice in data.get("voices", []):
voices.append({
"voice_id": voice.get("voice_id"),
"name": voice.get("name"),
"preview_url": voice.get("preview_url"),
"category": voice.get("category"),
"labels": voice.get("labels", {})
})
return voices
except Exception:
# Return default voices on error
return [
{"voice_id": "21m00Tcm4TlvDq8ikWAM", "name": "Rachel (Default)", "category": "premade"},
{"voice_id": "ErXwobaYiN019PkySvjV", "name": "Antoni", "category": "premade"},
{"voice_id": "TxGEqnHWrfWFTfGW9XjX", "name": "Josh", "category": "premade"},
]

@ -0,0 +1,613 @@
"""Video Generator Service - Runway and Google Veo
Runway Models:
- gen3_alpha: High quality, supports Motion Brush, Camera Control
- gen3_alpha_turbo: 7x faster, half cost, good for most use cases
- gen4: Latest model with highest fidelity
Runway Features:
- text_to_video: Generate from text prompt
- image_to_video: Generate from starting image
- camera_control: Pan, tilt, zoom, roll with intensity (-10 to 10)
- motion_brush: Define motion areas with direction
- first_frame/last_frame: Control start and end frames
Google Veo Models (December 2025):
- veo-3.1-generate-preview: Latest with native audio, 720p/1080p, reference images
- veo-3.1-fast-generate-preview: Speed-optimized variant with audio
- veo-3.0-generate-001: Stable Veo 3 with audio
- veo-3.0-fast-generate-001: Fast Veo 3 variant
- veo-2.0-generate-001: Legacy, supports 2 outputs per request
Veo 3/3.1 Features:
- Native audio generation with soundtrack, dialogue, ambient sounds
- first_frame: Starting image for video (image-to-video)
- last_frame: Ending image for video (creates frame interpolation)
- reference_images: Up to 3 images for character/style/asset consistency
- video_extension: Extend existing videos up to 20 times
- negative_prompt: Describe unwanted elements
- aspect_ratio: 16:9, 9:16
- resolution: 720p, 1080p (Veo 3.1 only)
- duration: 4, 6, or 8 seconds
- person_generation: Control adult face generation
Audio Prompt Techniques (Veo 3+):
- Dialogue: Use quotation marks ("She whispered, 'Hello'")
- Sound Effects: Explicit descriptions (tires screeching loudly)
- Ambient Noise: Environmental details (eerie hum in background)
"""
import httpx
import os
import base64
from uuid import uuid4
from datetime import datetime
import asyncio
from typing import Optional, Dict, Any, List, Tuple
from app.database import SessionLocal
from app.models.job import Job
from app.models.asset import Asset
from app.config import settings
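
The module docstring lists three audio prompt techniques for Veo 3+: quoted dialogue, explicit sound effects, and ambient descriptions. A rough sketch of composing a prompt from those pieces follows. The function `build_audio_prompt` and the exact phrasing it emits are illustrative assumptions, not something the commit or the Veo documentation prescribes:

```python
from typing import Iterable, Optional


def build_audio_prompt(
    scene: str,
    dialogue: Optional[str] = None,
    sound_effects: Iterable[str] = (),
    ambient: Optional[str] = None,
) -> str:
    """Join a scene description with optional audio cues into one prompt."""
    parts = [scene.strip()]
    if dialogue:
        # Dialogue goes in quotation marks, as the docstring suggests.
        parts.append(f'A voice says, "{dialogue.strip()}"')
    for effect in sound_effects:
        parts.append(f"Sound effect: {effect.strip()}")
    if ambient:
        parts.append(f"Ambient sound: {ambient.strip()}")
    # Drop trailing periods so joining does not produce "..".
    return ". ".join(part.rstrip(".") for part in parts) + "."
```

For example, a scene of "A car chase at night" with the effect "tires screeching loudly" yields a single prompt string with one sentence per cue.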
# Runway model configurations
RUNWAY_MODELS = {
"gen3_alpha": {
"name": "Gen-3 Alpha",
"description": "High quality with full feature support",
"supports_camera_control": True,
"supports_motion_brush": True,
"max_duration": 10,
"resolutions": ["1280x768", "768x1280"]
},
"gen3_alpha_turbo": {
"name": "Gen-3 Alpha Turbo",
"description": "7x faster, half the cost",
"supports_camera_control": True,
"supports_motion_brush": False,
"max_duration": 10,
"resolutions": ["1280x768", "768x1280"]
},
"gen4": {
"name": "Gen-4",
"description": "Latest model with highest fidelity",
"supports_camera_control": True,
"supports_motion_brush": True,
"max_duration": 10,
"resolutions": ["1280x768", "768x1280", "1920x1080"]
}
}
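
`RUNWAY_MODELS` records per-model capabilities, but `_generate_runway` does not consult it before building the request. A hedged sketch of a pre-flight check against a table of that shape is shown here. `check_runway_request` and `SAMPLE_RUNWAY_MODELS` are hypothetical names, and the sample table is a trimmed copy for illustration only:

```python
from typing import Any, Dict, List

# Trimmed, illustrative copy of the capability table defined above.
SAMPLE_RUNWAY_MODELS: Dict[str, Dict[str, Any]] = {
    "gen3_alpha_turbo": {
        "supports_motion_brush": False,
        "max_duration": 10,
        "resolutions": ["1280x768", "768x1280"],
    },
    "gen4": {
        "supports_motion_brush": True,
        "max_duration": 10,
        "resolutions": ["1280x768", "768x1280", "1920x1080"],
    },
}


def check_runway_request(
    models: Dict[str, Dict[str, Any]],
    model: str,
    resolution: str,
    duration: int,
    uses_motion_brush: bool = False,
) -> List[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    config = models.get(model)
    if config is None:
        return [f"unknown model: {model}"]
    problems: List[str] = []
    if resolution not in config["resolutions"]:
        problems.append(f"{model} does not list resolution {resolution}")
    if duration > config["max_duration"]:
        problems.append(f"{model} supports at most {config['max_duration']} seconds")
    if uses_motion_brush and not config["supports_motion_brush"]:
        problems.append(f"{model} does not support motion brush")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue in one error message.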
# Veo model configurations (December 2025)
VEO_MODELS = {
"veo-3.1-generate-preview": {
"name": "Veo 3.1",
"description": "Latest with native audio, 720p/1080p, reference images",
"supports_audio": True,
"supports_first_last_frame": True,
"supports_reference_images": True,
"supports_extension": True,
"resolutions": ["720p", "1080p"],
"durations": [4, 6, 8],
"max_references": 3
},
"veo-3.1-fast-generate-preview": {
"name": "Veo 3.1 Fast",
"description": "Speed-optimized with audio ($0.40/sec)",
"supports_audio": True,
"supports_first_last_frame": True,
"supports_reference_images": True,
"supports_extension": True,
"resolutions": ["720p", "1080p"],
"durations": [4, 6, 8],
"max_references": 3
},
"veo-3.0-generate-001": {
"name": "Veo 3",
"description": "Stable Veo 3 with native audio",
"supports_audio": True,
"supports_first_last_frame": True,
"supports_reference_images": False,
"supports_extension": False,
"resolutions": ["720p", "1080p"],
"durations": [4, 6, 8],
"max_references": 0
},
"veo-3.0-fast-generate-001": {
"name": "Veo 3 Fast",
"description": "Fast Veo 3 variant with audio",
"supports_audio": True,
"supports_first_last_frame": True,
"supports_reference_images": False,
"supports_extension": False,
"resolutions": ["720p"],
"durations": [4, 6, 8],
"max_references": 0
},
"veo-2.0-generate-001": {
"name": "Veo 2",
"description": "Legacy model, supports 2 outputs per request",
"supports_audio": False,
"supports_first_last_frame": True,
"supports_reference_images": False,
"supports_extension": False,
"resolutions": ["720p"],
"durations": [5, 6, 8],
"max_references": 0
}
}
async def generate(job_id: str):
"""Generate video using Runway or Veo
Input parameters:
- provider: 'runway' or 'veo'
- prompt: Text description
- model: Specific model to use
- duration: Video length in seconds
- aspect_ratio: '16:9', '9:16', '1:1'
Runway-specific:
- camera_control: {pan, tilt, zoom, roll} with values -10 to 10
- motion_brush: [{area_mask, direction, intensity}]
- frame_position: 'first' or 'last' for input image
Veo-specific:
- first_frame_asset_id: Asset ID for starting frame
- last_frame_asset_id: Asset ID for ending frame
- reference_asset_ids: List of asset IDs for reference (max 3, Veo 3.1 only)
"""
db = SessionLocal()
try:
job = db.query(Job).filter(Job.id == job_id).first()
if not job:
return
input_data = job.input_data
provider = input_data.get("provider", "runway")
prompt = input_data.get("prompt", "")
job.progress = 10
job.api_provider = provider
db.commit()
video_data = None
filename = None
if provider == "runway":
video_data, filename = await _generate_runway(job, input_data, db)
elif provider == "veo":
video_data, filename = await _generate_veo(job, input_data, db)
else:
raise ValueError(f"Unknown video provider: {provider}")
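if not video_data:
raise ValueError(f"{provider} finished without returning video data (timed out or no output URL)")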
# Save video
storage_path = os.path.join(settings.storage_path, "videos")
os.makedirs(storage_path, exist_ok=True)
file_path = os.path.join(storage_path, filename)
with open(file_path, "wb") as f:
f.write(video_data)
# Create asset
asset = Asset(
user_id=job.user_id,
project_id=job.project_id,
original_filename=filename,
stored_filename=filename,
file_path=file_path,
file_type="video",
mime_type="video/mp4",
file_size_bytes=len(video_data),
duration_seconds=input_data.get("duration", 5),
source_module="video_generator",
source_job_id=job.id,
asset_metadata={
"prompt": prompt,
"provider": provider,
"model": job.api_model
}
)
db.add(asset)
db.commit()
db.refresh(asset)
job.output_asset_ids = [asset.id]
job.output_data = {"asset_id": str(asset.id), "file_path": file_path}
job.progress = 100
job.status = "completed"
job.completed_at = datetime.utcnow()
db.commit()
except Exception as e:
job.status = "failed"
job.error_message = str(e)
db.commit()
finally:
db.close()
async def _generate_runway(job, input_data: dict, db) -> Tuple[Optional[bytes], Optional[str]]:
"""Generate video using Runway
Supports:
- Text to video
- Image to video with first/middle/last frame positioning
- Camera control (pan, tilt, zoom, roll)
- Motion brush for targeted animation
- Multiple resolutions
"""
prompt = input_data.get("prompt", "")
model = input_data.get("model", "gen3_alpha_turbo")
duration = min(input_data.get("duration", 5), 10)
resolution = input_data.get("resolution", "1280x768")
frame_position = input_data.get("frame_position", "first") # first, middle, last
# Camera control settings
camera_control = input_data.get("camera_control", {})
pan = camera_control.get("pan", 0) # -10 to 10, horizontal
tilt = camera_control.get("tilt", 0) # -10 to 10, vertical
zoom = camera_control.get("zoom", 0) # -10 to 10
roll = camera_control.get("roll", 0) # -10 to 10, rotation
static = camera_control.get("static", False) # Reduce camera motion
job.api_model = model
db.commit()
# Get input image if provided
image_data = None
if job.input_asset_ids:
input_asset = db.query(Asset).filter(Asset.id == job.input_asset_ids[0]).first()
if input_asset and os.path.exists(input_asset.file_path):
with open(input_asset.file_path, "rb") as f:
image_data = base64.b64encode(f.read()).decode()
async with httpx.AsyncClient(timeout=600) as client:
# Build payload based on whether we have an image
if image_data:
# Image to video
payload = {
"model": model,
"promptImage": f"data:image/png;base64,{image_data}",
"promptText": prompt,
"duration": duration,
"ratio": resolution.replace("x", ":")
}
# Frame position (Gen-3 Alpha Turbo supports first, middle, last)
if model == "gen3_alpha_turbo":
payload["imagePosition"] = frame_position
endpoint = "https://api.runwayml.com/v1/image_to_video"
else:
# Text to video
payload = {
"model": model,
"promptText": prompt,
"duration": duration,
"ratio": resolution.replace("x", ":")
}
endpoint = "https://api.runwayml.com/v1/text_to_video"
# Add camera control if any values are set
if any([pan, tilt, zoom, roll]) and not static:
payload["cameraControl"] = {
"pan": pan,
"tilt": tilt,
"zoom": zoom,
"roll": roll
}
elif static:
payload["cameraControl"] = {"static": True}
# Create generation task
response = await client.post(
endpoint,
headers={
"Authorization": f"Bearer {settings.runway_api_key}",
"Content-Type": "application/json",
"X-Runway-Version": "2024-11-06"
},
json=payload
)
response.raise_for_status()
result = response.json()
task_id = result.get("id")
job.progress = 30
job.api_request_id = task_id
db.commit()
# Poll for completion
for i in range(180): # Wait up to 6 minutes
await asyncio.sleep(2)
status_response = await client.get(
f"https://api.runwayml.com/v1/tasks/{task_id}",
headers={
"Authorization": f"Bearer {settings.runway_api_key}",
"X-Runway-Version": "2024-11-06"
}
)
status_data = status_response.json()
status = status_data.get("status", "")
if status == "SUCCEEDED":
output_url = status_data.get("output", [None])[0]
if output_url:
video_response = await client.get(output_url)
filename = f"runway_{model}_{uuid4()}.mp4"
return video_response.content, filename
break
elif status == "FAILED":
raise ValueError(f"Runway generation failed: {status_data.get('error')}")
job.progress = min(30 + (i * 0.35), 90)
db.commit()
return None, None
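
The Runway loop above sleeps, fetches the task status, and gives up after a fixed number of attempts (180 attempts at 2 seconds each, about 6 minutes). The same pattern recurs in the Veo and Topaz code. A minimal sketch of that pattern as a reusable coroutine follows. `poll_until` and its parameters are hypothetical names, and unlike the original loop it raises `TimeoutError` when the attempts run out instead of falling through:

```python
import asyncio
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def poll_until(
    fetch: Callable[[], Awaitable[T]],
    is_done: Callable[[T], bool],
    *,
    interval: float = 2.0,
    max_attempts: int = 180,
    on_attempt: Optional[Callable[[int], None]] = None,
) -> T:
    """Call fetch() every `interval` seconds until is_done(result) is true."""
    for attempt in range(max_attempts):
        await asyncio.sleep(interval)
        result = await fetch()
        if is_done(result):
            return result
        if on_attempt is not None:
            # Hook for progress updates, e.g. writing job.progress.
            on_attempt(attempt)
    raise TimeoutError(f"gave up after {max_attempts} attempts")
```

A caller would pass a `fetch` that performs the status request and an `is_done` that checks for a terminal status, keeping the timeout handling in one place.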
async def _generate_veo(job, input_data: dict, db) -> Tuple[Optional[bytes], Optional[str]]:
"""Generate video using Google Veo 3/3.1
Supports:
- Text to video with native audio generation
- First frame image (video starts from this image)
- Last frame image (video ends at this image, creates frame interpolation)
- Reference images (up to 3, for character/style/asset consistency - Veo 3.1 only)
- Video extension (continue from previous video - Veo 3.1 only)
- Negative prompts
- Multiple resolutions (720p, 1080p)
- Duration options (4, 6, 8 seconds)
Audio Prompting:
- Use quotation marks for dialogue: "She said, 'Hello'"
- Describe sound effects: "tires screeching loudly"
- Add ambient sounds: "quiet forest with birds chirping"
"""
prompt = input_data.get("prompt", "")
model = input_data.get("model", "veo-3.1-generate-preview")
duration = input_data.get("duration", 8)
aspect_ratio = input_data.get("aspect_ratio", "16:9")
resolution = input_data.get("resolution", "720p")
negative_prompt = input_data.get("negative_prompt", "")
person_generation = input_data.get("person_generation") # "allow_adult" or None
# Frame control
first_frame_asset_id = input_data.get("first_frame_asset_id")
last_frame_asset_id = input_data.get("last_frame_asset_id")
reference_asset_ids = input_data.get("reference_asset_ids", [])[:3] # Max 3 for Veo 3.1
# Video extension (Veo 3.1 only)
extend_video_asset_id = input_data.get("extend_video_asset_id")
# Validate duration
model_config = VEO_MODELS.get(model, VEO_MODELS["veo-3.1-generate-preview"])
valid_durations = model_config.get("durations", [4, 6, 8])
if duration not in valid_durations:
duration = max(valid_durations)
# Validate resolution
valid_resolutions = model_config.get("resolutions", ["720p"])
if resolution not in valid_resolutions:
resolution = valid_resolutions[0]
job.api_model = model
db.commit()
try:
from google import genai
from google.genai import types
# Initialize client
client = genai.Client(api_key=settings.google_api_key)
job.progress = 20
db.commit()
# Build generation config
config_kwargs = {
"aspect_ratio": aspect_ratio,
}
# Add negative prompt if provided
if negative_prompt:
config_kwargs["negative_prompt"] = negative_prompt
# Add person generation setting if specified
if person_generation:
config_kwargs["person_generation"] = person_generation
# Resolution is only sent for Veo 3.0 and 3.1 models
if "3.1" in model or "3.0" in model:
config_kwargs["resolution"] = resolution
config_kwargs["duration_seconds"] = duration  # pass seconds as an int, not a string
# Prepare first frame image
first_frame_image = None
if first_frame_asset_id:
first_asset = db.query(Asset).filter(Asset.id == first_frame_asset_id).first()
if first_asset and os.path.exists(first_asset.file_path):
with open(first_asset.file_path, "rb") as f:
first_frame_image = types.Image.from_bytes(
data=f.read(),
mime_type=first_asset.mime_type or "image/png"
)
# Prepare last frame for interpolation
if last_frame_asset_id:
last_asset = db.query(Asset).filter(Asset.id == last_frame_asset_id).first()
if last_asset and os.path.exists(last_asset.file_path):
with open(last_asset.file_path, "rb") as f:
config_kwargs["last_frame"] = types.Image.from_bytes(
data=f.read(),
mime_type=last_asset.mime_type or "image/png"
)
# Reference images for character/style consistency (Veo 3.1 only)
if reference_asset_ids and model_config.get("supports_reference_images"):
reference_images = []
for ref_id in reference_asset_ids:
ref_asset = db.query(Asset).filter(Asset.id == ref_id).first()
if ref_asset and os.path.exists(ref_asset.file_path):
with open(ref_asset.file_path, "rb") as f:
# Create VideoGenerationReferenceImage
ref_image = types.VideoGenerationReferenceImage(
image=types.Image.from_bytes(
data=f.read(),
mime_type=ref_asset.mime_type or "image/png"
),
reference_type="asset" # or "style" for style reference
)
reference_images.append(ref_image)
if reference_images:
config_kwargs["reference_images"] = reference_images
# Video extension (Veo 3.1 only)
extend_video = None
if extend_video_asset_id and model_config.get("supports_extension"):
extend_asset = db.query(Asset).filter(Asset.id == extend_video_asset_id).first()
if extend_asset and os.path.exists(extend_asset.file_path):
with open(extend_asset.file_path, "rb") as f:
extend_video = types.Video.from_bytes(
data=f.read(),
mime_type=extend_asset.mime_type or "video/mp4"
)
config = types.GenerateVideosConfig(**config_kwargs)
job.progress = 40
db.commit()
# Generate video using the async long-running operation
if extend_video:
# Video extension mode
operation = await asyncio.to_thread(
client.models.generate_videos,
model=model,
video=extend_video,
prompt=prompt,
config=config
)
elif first_frame_image:
# Image-to-video mode
operation = await asyncio.to_thread(
client.models.generate_videos,
model=model,
image=first_frame_image,
prompt=prompt,
config=config
)
else:
# Text-to-video mode
operation = await asyncio.to_thread(
client.models.generate_videos,
model=model,
prompt=prompt,
config=config
)
# Poll for completion (can take 11 seconds to 6 minutes)
job.progress = 50
db.commit()
max_attempts = 72 # 6 minutes with 5 second intervals
for attempt in range(max_attempts):
await asyncio.sleep(5)
# Check operation status
operation = await asyncio.to_thread(
client.operations.get,
operation
)
if operation.done:
break
# Update progress
progress = min(50 + (attempt * 0.5), 90)
job.progress = int(progress)
db.commit()
job.progress = 90
db.commit()
# Extract video from response
if operation.done and operation.response:
generated_videos = operation.response.generated_videos
if generated_videos and len(generated_videos) > 0:
video = generated_videos[0]
# Download the video file
video_data = await asyncio.to_thread(
client.files.download,
file=video.video
)
filename = f"veo_{model.replace('.', '_').replace('-', '_')}_{uuid4()}.mp4"
return video_data, filename
# Check for errors
if operation.error:
raise ValueError(f"Veo generation failed: {operation.error}")
except ImportError:
raise ValueError("Google GenAI library not installed. Run: pip install google-genai")
except Exception as e:
raise ValueError(f"Veo generation error: {str(e)}")
return None, None
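
`_generate_veo` replaces an unsupported duration with the largest allowed value and an unsupported resolution with the first allowed one. The sketch below isolates that rule so its effect is easy to see, and adds a nearest-match variant as one possible alternative. Both function names are hypothetical, and `nearest_duration` is a suggested variation rather than the commit's behavior:

```python
from typing import Any, Dict, Sequence, Tuple


def clamp_veo_options(
    model_config: Dict[str, Any], duration: int, resolution: str
) -> Tuple[int, str]:
    """Mirror the validation in _generate_veo for one model config."""
    valid_durations = model_config.get("durations", [4, 6, 8])
    if duration not in valid_durations:
        duration = max(valid_durations)  # falls back to the longest clip
    valid_resolutions = model_config.get("resolutions", ["720p"])
    if resolution not in valid_resolutions:
        resolution = valid_resolutions[0]
    return duration, resolution


def nearest_duration(valid_durations: Sequence[int], requested: int) -> int:
    """Alternative: pick the closest allowed duration, preferring the shorter on ties."""
    return min(valid_durations, key=lambda d: (abs(d - requested), d))
```

With the Veo 2 entry (durations 5, 6, 8), a request for 4 seconds becomes 8 seconds under the current rule, whereas the nearest-match variant would choose 5.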
async def extend_video(job_id: str):
"""Extend an existing video using Veo scene extension"""
db = SessionLocal()
try:
job = db.query(Job).filter(Job.id == job_id).first()
if not job:
return
input_data = job.input_data
source_asset_id = input_data.get("source_asset_id")
prompt = input_data.get("prompt", "")
extension_seconds = min(input_data.get("extension_seconds", 4), 8)
if not source_asset_id:
raise ValueError("No source video provided for extension")
source_asset = db.query(Asset).filter(Asset.id == source_asset_id).first()
if not source_asset:
raise ValueError("Source video not found")
job.progress = 10
job.api_provider = "veo"
job.api_model = "veo-3.1-generate-preview"
db.commit()
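# TODO: scene extension is not implemented yet. A real implementation would
# call Veo's scene extension API, which builds on the final seconds of the previous clip.
# As written, the job is marked completed without producing an output asset.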
job.progress = 100
job.status = "completed"
job.completed_at = datetime.utcnow()
db.commit()
except Exception as e:
job.status = "failed"
job.error_message = str(e)
db.commit()
finally:
db.close()
def get_available_models() -> Dict[str, Any]:
"""Get all available video generation models and their capabilities"""
return {
"runway": RUNWAY_MODELS,
"veo": VEO_MODELS
}

@ -0,0 +1,221 @@
"""Video Upscaler Service - Topaz Labs API"""
import httpx
import os
from uuid import uuid4
from datetime import datetime
import asyncio
from app.database import SessionLocal
from app.models.job import Job
from app.models.asset import Asset
from app.config import settings
async def upscale(job_id: str):
"""Upscale video using Topaz Labs API"""
db = SessionLocal()
try:
job = db.query(Job).filter(Job.id == job_id).first()
if not job:
return
input_data = job.input_data
input_asset_ids = job.input_asset_ids
if not input_asset_ids:
raise ValueError("No input asset provided")
input_asset = db.query(Asset).filter(Asset.id == input_asset_ids[0]).first()
if not input_asset:
raise ValueError("Input asset not found")
job.progress = 5
job.api_provider = "topaz"
job.api_model = input_data.get("model", "auto")
db.commit()
scale = input_data.get("scale", 2)
model = input_data.get("model", "auto")
frame_interpolation = input_data.get("frame_interpolation", 1)
# Get video info (simplified - would need ffprobe in production)
video_info = {
"container": "mp4",
"size": input_asset.file_size_bytes,
"duration": float(input_asset.duration_seconds or 10),
"frameCount": int((input_asset.duration_seconds or 10) * 30),
"frameRate": 30,
"resolution": {
"width": input_asset.width or 1920,
"height": input_asset.height or 1080
}
}
output_width = video_info["resolution"]["width"] * scale
output_height = video_info["resolution"]["height"] * scale
job.progress = 10
db.commit()
async with httpx.AsyncClient(timeout=1800) as client:
# Create video enhancement request
response = await client.post(
"https://api.topazlabs.com/video/v1/enhance",
headers={
"X-API-Key": settings.topaz_api_key,
"Content-Type": "application/json"
},
json={
"source": video_info,
"filters": [
{
"model": model if model != "auto" else "prob-4",
"videoType": "Progressive",
"auto": "Auto" if model == "auto" else None
}
],
"output": {
"resolution": {
"width": output_width,
"height": output_height
},
"frameRate": video_info["frameRate"] * frame_interpolation,
"audioCodec": "AAC",
"audioTransfer": "Copy",
"container": "mp4"
}
}
)
response.raise_for_status()
result = response.json()
request_id = result.get("requestId")
job.progress = 15
job.api_request_id = request_id
db.commit()
# Accept the request and get upload URLs
accept_response = await client.patch(
f"https://api.topazlabs.com/video/v1/enhance/{request_id}/accept",
headers={"X-API-Key": settings.topaz_api_key}
)
accept_data = accept_response.json()
upload_urls = accept_data.get("urls", [])
job.progress = 20
db.commit()
# Upload video file in parts
with open(input_asset.file_path, "rb") as f:
video_data = f.read()
part_size = len(video_data) // len(upload_urls) if upload_urls else len(video_data)
upload_results = []
for i, url in enumerate(upload_urls):
start = i * part_size
end = start + part_size if i < len(upload_urls) - 1 else len(video_data)
part_data = video_data[start:end]
upload_response = await client.put(
url,
content=part_data,
headers={"Content-Type": "application/octet-stream"}
)
etag = upload_response.headers.get("ETag", "").strip('"')
upload_results.append({
"partNum": i + 1,
"eTag": etag
})
job.progress = 20 + (i + 1) * (30 / len(upload_urls))
db.commit()
# Complete the upload
await client.patch(
f"https://api.topazlabs.com/video/v1/enhance/{request_id}/complete-upload/",
headers={
"X-API-Key": settings.topaz_api_key,
"Content-Type": "application/json"
},
json={"uploadResults": upload_results}
)
job.progress = 50
db.commit()
# Poll for completion
for _ in range(360): # Wait up to 12 minutes
await asyncio.sleep(2)
status_response = await client.get(
f"https://api.topazlabs.com/video/v1/enhance/{request_id}/status",
headers={"X-API-Key": settings.topaz_api_key}
)
status_data = status_response.json()
status = status_data.get("status", "")
if status == "completed":
output_url = status_data.get("outputUrl")
if output_url:
video_response = await client.get(output_url)
upscaled_data = video_response.content
# Save output
filename = f"upscaled_{uuid4()}.mp4"
storage_path = os.path.join(settings.storage_path, "videos")
os.makedirs(storage_path, exist_ok=True)
file_path = os.path.join(storage_path, filename)
with open(file_path, "wb") as f:
f.write(upscaled_data)
# Create output asset
output_asset = Asset(
user_id=job.user_id,
project_id=job.project_id,
original_filename=filename,
stored_filename=filename,
file_path=file_path,
file_type="video",
mime_type="video/mp4",
file_size_bytes=len(upscaled_data),
width=output_width,
height=output_height,
duration_seconds=input_asset.duration_seconds,
source_module="video_upscaler",
source_job_id=job.id,
parent_asset_id=input_asset.id,
metadata={
"scale": scale,
"model": model,
"frame_interpolation": frame_interpolation
}
)
db.add(output_asset)
db.commit()
db.refresh(output_asset)
job.output_asset_ids = [output_asset.id]
job.output_data = {"asset_id": str(output_asset.id), "file_path": file_path}
break
elif status == "failed":
raise ValueError(f"Video enhancement failed: {status_data.get('error')}")
job.progress = min(50 + (_ * 0.14), 95)
db.commit()
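if not job.output_asset_ids:
raise ValueError("Video enhancement finished without an output asset (timed out or no output URL)")
job.progress = 100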
job.status = "completed"
job.completed_at = datetime.utcnow()
db.commit()
except Exception as e:
job.status = "failed"
job.error_message = str(e)
db.commit()
finally:
db.close()
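
The upload step above divides the file evenly across the presigned URLs and lets the last part absorb the remainder. A small sketch of that slicing on its own is shown below. `split_into_parts` is a hypothetical helper, and raising on a non-positive part count is a design choice of this sketch, since the original loop uploads nothing when `upload_urls` is empty:

```python
from typing import List


def split_into_parts(data: bytes, part_count: int) -> List[bytes]:
    """Split data into part_count slices; the last slice takes the remainder."""
    if part_count <= 0:
        raise ValueError("part_count must be positive")
    part_size = len(data) // part_count
    parts: List[bytes] = []
    for i in range(part_count):
        start = i * part_size
        # Every part but the last has exactly part_size bytes.
        end = start + part_size if i < part_count - 1 else len(data)
        parts.append(data[start:end])
    return parts
```

For a 10-byte payload and three URLs this gives parts of 3, 3, and 4 bytes, which concatenate back to the original data.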

@ -0,0 +1,203 @@
"""Voice to Text Service - Whisper + DeepL"""
import os
from uuid import uuid4
from datetime import datetime, timedelta
from app.database import SessionLocal
from app.models.job import Job
from app.models.asset import Asset
from app.config import settings
async def transcribe(job_id: str):
"""Transcribe audio to text using Whisper with optional translation"""
db = SessionLocal()
try:
job = db.query(Job).filter(Job.id == job_id).first()
if not job:
return
input_data = job.input_data
input_asset_ids = job.input_asset_ids
if not input_asset_ids:
raise ValueError("No input asset provided")
input_asset = db.query(Asset).filter(Asset.id == input_asset_ids[0]).first()
if not input_asset:
raise ValueError("Input asset not found")
job.progress = 10
job.api_provider = "whisper"
db.commit()
output_format = input_data.get("output_format", "txt")
translate = input_data.get("translate", False)
target_language = input_data.get("target_language", "EN-US")
# Transcribe with Whisper
import whisper
model = whisper.load_model("base")
result = model.transcribe(input_asset.file_path, verbose=False)
job.progress = 60
db.commit()
segments = result.get("segments", [])
text = result.get("text", "")
# Generate output based on format
if output_format == "txt":
content = text
extension = "txt"
mime_type = "text/plain"
elif output_format == "vtt":
content = _generate_vtt(segments)
extension = "vtt"
mime_type = "text/vtt"
elif output_format == "srt":
content = _generate_srt(segments)
extension = "srt"
mime_type = "text/plain"
else:
content = text
extension = "txt"
mime_type = "text/plain"
output_assets = []
# Save original transcription
filename = f"transcription_{uuid4()}.{extension}"
file_path = os.path.join(settings.storage_path, "documents", filename)
os.makedirs(os.path.dirname(file_path), exist_ok=True)
with open(file_path, "w", encoding="utf-8") as f:
f.write(content)
asset = Asset(
user_id=job.user_id,
project_id=job.project_id,
original_filename=filename,
stored_filename=filename,
file_path=file_path,
file_type="document",
mime_type=mime_type,
file_size_bytes=len(content.encode()),
source_module="voice_to_text",
source_job_id=job.id,
parent_asset_id=input_asset.id,
metadata={
"language": result.get("language"),
"format": output_format,
"type": "original"
}
)
db.add(asset)
db.commit()
db.refresh(asset)
output_assets.append(asset.id)
job.progress = 75
db.commit()
# Translate if requested
translated_content = None
if translate:
job.api_provider = "whisper+deepl"
import deepl
translator = deepl.Translator(settings.deepl_api_key)
translated_content = translator.translate_text(
content,
target_lang=target_language
).text
trans_filename = f"transcription_translated_{uuid4()}.{extension}"
trans_path = os.path.join(settings.storage_path, "documents", trans_filename)
with open(trans_path, "w", encoding="utf-8") as f:
f.write(translated_content)
trans_asset = Asset(
user_id=job.user_id,
project_id=job.project_id,
original_filename=trans_filename,
stored_filename=trans_filename,
file_path=trans_path,
file_type="document",
mime_type=mime_type,
file_size_bytes=len(translated_content.encode()),
source_module="voice_to_text",
source_job_id=job.id,
parent_asset_id=input_asset.id,
metadata={
"language": target_language,
"format": output_format,
"type": "translated"
}
)
db.add(trans_asset)
db.commit()
db.refresh(trans_asset)
output_assets.append(trans_asset.id)
job.output_asset_ids = output_assets
job.output_data = {
"text": text,
"translated_text": translated_content,
"language": result.get("language"),
"asset_ids": [str(a) for a in output_assets]
}
job.progress = 100
job.status = "completed"
job.completed_at = datetime.utcnow()
db.commit()
except Exception as e:
job.status = "failed"
job.error_message = str(e)
db.commit()
finally:
db.close()
def _generate_srt(segments: list) -> str:
"""Generate SRT format from Whisper segments"""
srt_lines = []
for i, segment in enumerate(segments, 1):
start = _format_timestamp_srt(segment['start'])
end = _format_timestamp_srt(segment['end'])
text = segment['text'].strip()
srt_lines.append(f"{i}\n{start} --> {end}\n{text}\n")
return "\n".join(srt_lines)
def _generate_vtt(segments: list) -> str:
"""Generate VTT format from Whisper segments"""
vtt_lines = ["WEBVTT\n"]
for segment in segments:
start = _format_timestamp_vtt(segment['start'])
end = _format_timestamp_vtt(segment['end'])
text = segment['text'].strip()
vtt_lines.append(f"{start} --> {end}\n{text}\n")
return "\n".join(vtt_lines)
def _format_timestamp_srt(seconds: float) -> str:
"""Convert seconds to SRT timestamp format (HH:MM:SS,mmm)"""
td = timedelta(seconds=seconds)
hours = td.days * 24 + td.seconds // 3600  # td.seconds alone wraps at 24 hours
minutes = (td.seconds % 3600) // 60
secs = td.seconds % 60
millis = td.microseconds // 1000
return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"
def _format_timestamp_vtt(seconds: float) -> str:
"""Convert seconds to VTT timestamp format (HH:MM:SS.mmm)"""
td = timedelta(seconds=seconds)
hours = td.days * 24 + td.seconds // 3600  # td.seconds alone wraps at 24 hours
minutes = (td.seconds % 3600) // 60
secs = td.seconds % 60
millis = td.microseconds // 1000
return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"
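
The SRT and VTT formatters above differ only in the separator before the milliseconds (`,` for SRT, `.` for VTT). A compact sketch of a single shared formatter based on integer arithmetic follows. `format_timestamp` is a hypothetical name, and it rounds to the nearest millisecond where the originals truncate:

```python
def format_timestamp(seconds: float, separator: str) -> str:
    """Format seconds as HH:MM:SS<separator>mmm (',' for SRT, '.' for VTT)."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    # Hours are not capped at 24, so very long recordings keep counting up.
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{millis:03d}"
```

For example, `format_timestamp(3661.5, ",")` gives `01:01:01,500`, and the same call with `"."` gives the VTT form.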

@ -0,0 +1 @@
"""Celery Workers Package"""

@ -0,0 +1,27 @@
"""Celery Application Configuration"""
from celery import Celery
from app.config import settings
celery_app = Celery(
"forge_ai",
broker=settings.redis_url,
backend=settings.redis_url,
include=[
"app.workers.tasks"
]
)
# Celery configuration
celery_app.conf.update(
task_serializer="json",
accept_content=["json"],
result_serializer="json",
timezone="UTC",
enable_utc=True,
task_track_started=True,
task_time_limit=3600, # 1 hour max per task
task_soft_time_limit=3300, # Soft limit 55 minutes
worker_prefetch_multiplier=1,
task_acks_late=True,
task_reject_on_worker_lost=True,
)

@ -0,0 +1,116 @@
"""Celery Tasks for background processing"""
import asyncio

from app.workers.celery_app import celery_app
from app.services import (
    image_generator,
    image_upscaler,
    background_remover,
    video_generator,
    video_upscaler,
    subtitle_processor,
    voice_to_text,
    text_to_speech,
    alt_text_generator
)


def run_async(coro):
    """Run a coroutine to completion from a synchronous Celery task"""
    # asyncio.run creates a fresh event loop, runs the coroutine, and closes the loop
    return asyncio.run(coro)


# Each task retries at most twice. self.retry() raises celery.exceptions.Retry,
# so the explicit raise only makes the control flow visible.

@celery_app.task(bind=True, name="process_image_generation")
def process_image_generation(self, job_id: str):
    """Process image generation job"""
    try:
        run_async(image_generator.generate(job_id))
    except Exception as e:
        raise self.retry(exc=e, countdown=60, max_retries=2)


@celery_app.task(bind=True, name="process_image_upscaling")
def process_image_upscaling(self, job_id: str):
    """Process image upscaling job"""
    try:
        run_async(image_upscaler.upscale(job_id))
    except Exception as e:
        raise self.retry(exc=e, countdown=60, max_retries=2)


@celery_app.task(bind=True, name="process_background_removal")
def process_background_removal(self, job_id: str):
    """Process background removal job"""
    try:
        run_async(background_remover.remove_background(job_id))
    except Exception as e:
        raise self.retry(exc=e, countdown=60, max_retries=2)


@celery_app.task(bind=True, name="process_video_generation")
def process_video_generation(self, job_id: str):
    """Process video generation job"""
    try:
        run_async(video_generator.generate(job_id))
    except Exception as e:
        raise self.retry(exc=e, countdown=120, max_retries=2)


@celery_app.task(bind=True, name="process_video_upscaling")
def process_video_upscaling(self, job_id: str):
    """Process video upscaling job"""
    try:
        run_async(video_upscaler.upscale(job_id))
    except Exception as e:
        raise self.retry(exc=e, countdown=120, max_retries=2)


@celery_app.task(bind=True, name="process_subtitles")
def process_subtitles(self, job_id: str):
    """Process subtitle generation job"""
    try:
        run_async(subtitle_processor.process(job_id))
    except Exception as e:
        raise self.retry(exc=e, countdown=60, max_retries=2)


@celery_app.task(bind=True, name="process_voice_to_text")
def process_voice_to_text(self, job_id: str):
    """Process voice to text transcription job"""
    try:
        run_async(voice_to_text.transcribe(job_id))
    except Exception as e:
        raise self.retry(exc=e, countdown=60, max_retries=2)


@celery_app.task(bind=True, name="process_text_to_speech")
def process_text_to_speech(self, job_id: str):
    """Process text to speech synthesis job"""
    try:
        run_async(text_to_speech.synthesize(job_id))
    except Exception as e:
        raise self.retry(exc=e, countdown=60, max_retries=2)


@celery_app.task(bind=True, name="process_speech_to_speech")
def process_speech_to_speech(self, job_id: str):
    """Process speech to speech conversion job"""
    try:
        run_async(text_to_speech.speech_to_speech(job_id))
    except Exception as e:
        raise self.retry(exc=e, countdown=60, max_retries=2)


@celery_app.task(bind=True, name="process_alt_text")
def process_alt_text(self, job_id: str):
    """Process alt text generation job"""
    try:
        run_async(alt_text_generator.generate(job_id))
    except Exception as e:
        raise self.retry(exc=e, countdown=60, max_retries=2)
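
The ten tasks above share one shape: run a service coroutine for a `job_id`, and on failure retry after a per-task countdown. The following is a minimal, stdlib-only sketch of that shape as a lookup table. It is not part of the commit: `fake_generate`, `fake_upscale`, `JOB_HANDLERS`, and `run_job` are hypothetical names, and the stand-in coroutines only echo their input. In the real module each entry would still need to be registered as a Celery task.

```python
import asyncio
from typing import Awaitable, Callable


# Hypothetical stand-ins for service coroutines such as image_generator.generate.
async def fake_generate(job_id: str) -> str:
    return f"generated:{job_id}"


async def fake_upscale(job_id: str) -> str:
    return f"upscaled:{job_id}"


# Task name -> (coroutine function, retry countdown in seconds).
JOB_HANDLERS: dict[str, tuple[Callable[[str], Awaitable[str]], int]] = {
    "process_image_generation": (fake_generate, 60),
    "process_video_upscaling": (fake_upscale, 120),
}


def run_job(task_name: str, job_id: str) -> str:
    """Look up the handler for task_name and run it to completion."""
    handler, _countdown = JOB_HANDLERS[task_name]
    # asyncio.run gives each job its own short-lived event loop.
    return asyncio.run(handler(job_id))


result = run_job("process_image_generation", "job-123")
```

Keeping the countdown next to the handler makes the one difference between the image and video tasks (60 versus 120 seconds) visible in a single place.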

67
backend/requirements.txt Normal file

@ -0,0 +1,67 @@
# FORGE AI Backend Requirements
# Web Framework
fastapi==0.109.0
uvicorn[standard]==0.27.0
python-multipart==0.0.6
# Database
sqlalchemy==2.0.25
asyncpg==0.29.0
psycopg2-binary==2.9.9
alembic==1.13.1
# Redis/Queue
redis==5.0.1
celery==5.3.6
kombu==5.3.4
# API Clients
httpx==0.26.0
aiohttp==3.9.1
requests==2.31.0
# AI/ML
openai==1.10.0
anthropic==0.14.0
google-generativeai==0.3.2
google-cloud-aiplatform==1.38.0
stability-sdk==0.8.4
# Video/Audio Processing
ffmpeg-python==0.2.0
openai-whisper==20231117
pydub==0.25.1
elevenlabs==1.0.0
# Image Processing
pillow==10.2.0
opencv-python-headless==4.9.0.80
# Translation
deepl==1.16.1
# Google Cloud
google-cloud-storage==2.14.0
google-auth==2.27.0
# Utilities
python-dotenv==1.0.0
pydantic==2.5.3
pydantic-settings==2.1.0
email-validator==2.1.0
aiofiles==23.2.1
python-magic==0.4.27
markdown==3.5.2
# Security
python-jose[cryptography]==3.3.0
passlib[bcrypt]==1.7.4
bcrypt==4.0.1 # Pin to version compatible with passlib 1.7.4
msal==1.26.0
# Monitoring
structlog==24.1.0
# NumPy (compatible with whisper)
numpy<2.0.0

132
docker-compose.yml Normal file

@ -0,0 +1,132 @@
services:
  # PostgreSQL Database (port 5452 instead of 5432)
  postgres:
    image: postgres:16-alpine
    container_name: forge-postgres
    restart: unless-stopped
    environment:
      POSTGRES_USER: forge_user
      POSTGRES_PASSWORD: forge_secure_password_2024
      POSTGRES_DB: forge_ai
    ports:
      - "5452:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./docker/init.sql:/docker-entrypoint-initdb.d/init.sql
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U forge_user -d forge_ai"]
      interval: 10s
      timeout: 5s
      retries: 5

  # Redis (port 6399 instead of 6379)
  redis:
    image: redis:7-alpine
    container_name: forge-redis
    restart: unless-stopped
    ports:
      - "6399:6379"
    volumes:
      - redis_data:/data
    command: redis-server --appendonly yes
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

  # Next.js Frontend (port 3020 instead of 3000)
  frontend:
    build:
      context: ./frontend
      dockerfile: Dockerfile
    container_name: forge-frontend
    restart: unless-stopped
    ports:
      - "3020:3000"
    environment:
      - NODE_ENV=production
      - NEXT_PUBLIC_API_URL=http://localhost:8020/api/v1
      - DATABASE_URL=postgresql://forge_user:forge_secure_password_2024@postgres:5432/forge_ai
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    volumes:
      - ./storage:/app/storage

  # FastAPI Backend (port 8020 instead of 8000)
  backend:
    build:
      context: ./backend
      dockerfile: Dockerfile
    container_name: forge-backend
    restart: unless-stopped
    ports:
      - "8020:8000"
    environment:
      - DATABASE_URL=postgresql://forge_user:forge_secure_password_2024@postgres:5432/forge_ai
      - REDIS_URL=redis://redis:6379
      - STORAGE_PATH=/app/storage
      - PYTHONUNBUFFERED=1
    env_file:
      - .env
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    volumes:
      - ./storage:/app/storage
      - ./backend:/app
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  # Celery Worker for background jobs
  worker:
    build:
      context: ./backend
      dockerfile: Dockerfile
    container_name: forge-worker
    restart: unless-stopped
    command: celery -A app.workers.celery_app worker --loglevel=info --concurrency=4
    environment:
      - DATABASE_URL=postgresql://forge_user:forge_secure_password_2024@postgres:5432/forge_ai
      - REDIS_URL=redis://redis:6379
      - STORAGE_PATH=/app/storage
      - PYTHONUNBUFFERED=1
    env_file:
      - .env
    depends_on:
      - backend
      - redis
    volumes:
      - ./storage:/app/storage
      - ./backend:/app

  # Nginx Reverse Proxy (port 8100 instead of 80)
  nginx:
    build:
      context: ./nginx
      dockerfile: Dockerfile
    container_name: forge-nginx
    restart: unless-stopped
    ports:
      - "8100:80"
    volumes:
      - ./storage:/var/www/storage:ro
    depends_on:
      - frontend
      - backend

volumes:
  postgres_data:
  redis_data:

networks:
  default:
    name: forge-network

238
docker/init.sql Normal file

@ -0,0 +1,238 @@
-- FORGE AI Database Schema
-- PostgreSQL 16
-- Enable UUID and cryptographic extensions
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS "pgcrypto";
-- Users & Authentication
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
azure_oid VARCHAR(255) UNIQUE,
email VARCHAR(255) UNIQUE NOT NULL,
hashed_password VARCHAR(255),
display_name VARCHAR(255),
avatar_url TEXT,
role VARCHAR(50) DEFAULT 'user',
department VARCHAR(255),
is_active BOOLEAN DEFAULT true,
last_login_at TIMESTAMPTZ,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Create test user with password "password123" (bcrypt hash)
INSERT INTO users (id, email, hashed_password, display_name, role, is_active)
VALUES (
'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11',
'test@forge.ai',
'$2b$12$bg3.YrCZnAoL7L/qKzh3lusjFr5J8FZYZswb8j.wVNu4bqPYRtoIG',
'Test User',
'admin',
true
);
-- API Keys (centralized)
CREATE TABLE api_keys (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
provider VARCHAR(100) NOT NULL,
key_name VARCHAR(255) NOT NULL,
encrypted_key TEXT NOT NULL,
is_active BOOLEAN DEFAULT true,
rate_limit_per_minute INTEGER,
monthly_budget DECIMAL(10,2),
current_month_usage DECIMAL(10,2) DEFAULT 0,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Projects
CREATE TABLE projects (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES users(id) ON DELETE SET NULL,
name VARCHAR(255) NOT NULL,
description TEXT,
is_archived BOOLEAN DEFAULT false,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Create default project for test user
INSERT INTO projects (id, user_id, name, description)
VALUES (
'b0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11',
'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11',
'Default Project',
'Default project for testing'
);
-- Assets (images, videos, documents, audio)
CREATE TABLE assets (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES users(id) ON DELETE SET NULL,
project_id UUID REFERENCES projects(id) ON DELETE SET NULL,
original_filename VARCHAR(500),
stored_filename VARCHAR(500) NOT NULL,
file_path TEXT NOT NULL,
file_type VARCHAR(50) NOT NULL,
mime_type VARCHAR(100),
file_size_bytes BIGINT,
width INTEGER,
height INTEGER,
duration_seconds DECIMAL(10,2),
metadata JSONB DEFAULT '{}',
source_module VARCHAR(100),
source_job_id UUID,
parent_asset_id UUID REFERENCES assets(id),
is_temporary BOOLEAN DEFAULT false,
expires_at TIMESTAMPTZ,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_assets_user ON assets(user_id);
CREATE INDEX idx_assets_project ON assets(project_id);
CREATE INDEX idx_assets_type ON assets(file_type);
CREATE INDEX idx_assets_module ON assets(source_module);
-- Jobs (queue management)
CREATE TABLE jobs (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES users(id) ON DELETE SET NULL,
project_id UUID REFERENCES projects(id) ON DELETE SET NULL,
module VARCHAR(100) NOT NULL,
action VARCHAR(100) NOT NULL,
priority INTEGER DEFAULT 5,
input_data JSONB NOT NULL,
output_data JSONB,
input_asset_ids UUID[],
output_asset_ids UUID[],
status VARCHAR(50) DEFAULT 'pending',
progress INTEGER DEFAULT 0,
error_message TEXT,
retry_count INTEGER DEFAULT 0,
max_retries INTEGER DEFAULT 3,
queued_at TIMESTAMPTZ,
started_at TIMESTAMPTZ,
completed_at TIMESTAMPTZ,
estimated_duration_seconds INTEGER,
api_provider VARCHAR(100),
api_model VARCHAR(100),
api_request_id VARCHAR(255),
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_jobs_user ON jobs(user_id);
CREATE INDEX idx_jobs_status ON jobs(status);
CREATE INDEX idx_jobs_module ON jobs(module);
CREATE INDEX idx_jobs_created ON jobs(created_at DESC);
-- Usage Tracking
CREATE TABLE usage_logs (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES users(id) ON DELETE SET NULL,
job_id UUID REFERENCES jobs(id) ON DELETE SET NULL,
module VARCHAR(100) NOT NULL,
action VARCHAR(100) NOT NULL,
api_provider VARCHAR(100),
api_model VARCHAR(100),
tokens_input INTEGER,
tokens_output INTEGER,
api_credits_used DECIMAL(10,4),
estimated_cost_usd DECIMAL(10,4),
processing_time_ms INTEGER,
request_metadata JSONB,
response_metadata JSONB,
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_usage_user ON usage_logs(user_id);
CREATE INDEX idx_usage_module ON usage_logs(module);
CREATE INDEX idx_usage_provider ON usage_logs(api_provider);
CREATE INDEX idx_usage_created ON usage_logs(created_at DESC);
-- Audit Log
CREATE TABLE audit_logs (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES users(id) ON DELETE SET NULL,
action VARCHAR(100) NOT NULL,
entity_type VARCHAR(100),
entity_id UUID,
old_values JSONB,
new_values JSONB,
ip_address INET,
user_agent TEXT,
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_audit_user ON audit_logs(user_id);
CREATE INDEX idx_audit_action ON audit_logs(action);
CREATE INDEX idx_audit_created ON audit_logs(created_at DESC);
-- Work History
CREATE TABLE work_history (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES users(id) ON DELETE SET NULL,
session_id UUID,
asset_id UUID REFERENCES assets(id) ON DELETE CASCADE,
from_module VARCHAR(100),
to_module VARCHAR(100),
action_type VARCHAR(100),
notes TEXT,
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Saved Prompts
CREATE TABLE saved_prompts (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES users(id) ON DELETE CASCADE,
module VARCHAR(100) NOT NULL,
name VARCHAR(255) NOT NULL,
prompt_text TEXT NOT NULL,
parameters JSONB,
is_shared BOOLEAN DEFAULT false,
use_count INTEGER DEFAULT 0,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- User Module Settings
CREATE TABLE user_module_settings (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES users(id) ON DELETE CASCADE,
module VARCHAR(100) NOT NULL,
settings JSONB DEFAULT '{}',
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW(),
UNIQUE(user_id, module)
);
-- Views for Reporting
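-- Jobs and usage logs are aggregated per user before joining.
-- Joining both tables directly to users would pair every job with every
-- usage log and inflate total_cost.
CREATE VIEW v_user_usage_summary AS
SELECT
    u.id as user_id,
    u.email,
    u.display_name,
    COALESCE(j.total_jobs, 0) as total_jobs,
    COALESCE(j.completed_jobs, 0) as completed_jobs,
    COALESCE(ul.total_cost, 0) as total_cost
FROM users u
LEFT JOIN (
    SELECT
        user_id,
        COUNT(*) as total_jobs,
        COUNT(*) FILTER (WHERE status = 'completed') as completed_jobs
    FROM jobs
    GROUP BY user_id
) j ON u.id = j.user_id
LEFT JOIN (
    SELECT user_id, SUM(estimated_cost_usd) as total_cost
    FROM usage_logs
    GROUP BY user_id
) ul ON u.id = ul.user_id;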
CREATE VIEW v_daily_usage AS
SELECT
DATE(created_at) as date,
module,
api_provider,
COUNT(*) as request_count,
COALESCE(SUM(estimated_cost_usd), 0) as total_cost,
COALESCE(AVG(processing_time_ms), 0) as avg_processing_time
FROM usage_logs
GROUP BY DATE(created_at), module, api_provider;
-- Grant permissions
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO forge_user;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO forge_user;
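
A reporting view that joins two one-to-many tables (here `jobs` and `usage_logs`) to the same parent row can multiply rows before it aggregates. The sketch below shows the effect on a cut-down copy of the schema and compares it with aggregating each child table first. It is illustrative only: it uses Python's built-in `sqlite3` with integer keys rather than PostgreSQL with UUIDs, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE jobs (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT);
CREATE TABLE usage_logs (id INTEGER PRIMARY KEY, user_id INTEGER, estimated_cost_usd REAL);
INSERT INTO users VALUES (1, 'test@forge.ai');
INSERT INTO jobs VALUES (1, 1, 'completed'), (2, 1, 'failed');
INSERT INTO usage_logs VALUES (1, 1, 1.5), (2, 1, 2.5), (3, 1, 1.0);
""")

# Joining both child tables directly yields 2 jobs x 3 logs = 6 rows for the user,
# so every usage log is summed once per job.
naive_total = conn.execute("""
SELECT COALESCE(SUM(ul.estimated_cost_usd), 0)
FROM users u
LEFT JOIN jobs j ON u.id = j.user_id
LEFT JOIN usage_logs ul ON u.id = ul.user_id
GROUP BY u.id
""").fetchone()[0]

# Aggregating each child table first leaves at most one row per user on each side.
safe_total = conn.execute("""
SELECT COALESCE(c.total_cost, 0)
FROM users u
LEFT JOIN (SELECT user_id, COUNT(*) AS total_jobs
           FROM jobs GROUP BY user_id) j ON u.id = j.user_id
LEFT JOIN (SELECT user_id, SUM(estimated_cost_usd) AS total_cost
           FROM usage_logs GROUP BY user_id) c ON u.id = c.user_id
""").fetchone()[0]

conn.close()
```

With two jobs and three usage logs totalling 5.0, the direct join reports double the real cost, while the pre-aggregated query reports the true sum.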


@ -0,0 +1,26 @@
-- Migration: Add hashed_password column to users table
-- Run this if you have an existing database without the password column
-- Add hashed_password column if it doesn't exist
DO $$
BEGIN
IF NOT EXISTS (SELECT 1 FROM information_schema.columns
WHERE table_name = 'users' AND column_name = 'hashed_password') THEN
ALTER TABLE users ADD COLUMN hashed_password VARCHAR(255);
END IF;
END $$;
-- Update test user with password "password123" (bcrypt hash)
UPDATE users
SET hashed_password = '$2b$12$LQv3c1yqBWVHxkd0LHAkCOYz6TtxMQJqhN8/X.9QYQxQj9oQx9zWe'
WHERE email = 'test@forge.ai' AND hashed_password IS NULL;
-- If no test user exists, create one
INSERT INTO users (id, email, hashed_password, display_name, role, is_active)
SELECT 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11',
'test@forge.ai',
'$2b$12$LQv3c1yqBWVHxkd0LHAkCOYz6TtxMQJqhN8/X.9QYQxQj9oQx9zWe',
'Test User',
'admin',
true
WHERE NOT EXISTS (SELECT 1 FROM users WHERE email = 'test@forge.ai');

17
frontend/Dockerfile Normal file

@ -0,0 +1,17 @@
FROM node:20-alpine
WORKDIR /app
# Install dependencies
COPY package*.json ./
RUN npm install
# Copy source
COPY . .
# Build for production
RUN npm run build
EXPOSE 3000
CMD ["npm", "start"]

200
frontend/app/admin/page.tsx Normal file

@ -0,0 +1,200 @@
'use client';
import { useState, useEffect } from 'react';
import { toast } from 'react-hot-toast';
import {
Shield,
Users,
Activity,
TrendingUp,
DollarSign,
Clock,
AlertTriangle,
} from 'lucide-react';
import AdminGuard from '@/components/AdminGuard';
import api from '@/lib/api';
export default function AdminDashboard() {
const [stats, setStats] = useState({
totalUsers: 0,
activeUsers: 0,
totalJobs: 0,
jobsToday: 0,
failedJobs: 0,
avgProcessingTime: 0,
apiCosts: 0,
});
const [recentActivity, setRecentActivity] = useState<any[]>([]);
const [loading, setLoading] = useState(true);
useEffect(() => {
const fetchAdminStats = async () => {
try {
// These would be admin-only endpoints
const [statsRes, activityRes] = await Promise.all([
api.get('/admin/stats'),
api.get('/admin/activity?limit=10'),
]);
setStats(statsRes.data);
setRecentActivity(activityRes.data.items || []);
} catch (err) {
// Use mock data for demo
setStats({
totalUsers: 24,
activeUsers: 8,
totalJobs: 1247,
jobsToday: 47,
failedJobs: 3,
avgProcessingTime: 4.2,
apiCosts: 142.50,
});
setRecentActivity([
{ id: 1, user: 'john@example.com', action: 'Generated image', module: 'image_generation', time: '2 min ago' },
{ id: 2, user: 'jane@example.com', action: 'Transcribed audio', module: 'voice_to_text', time: '5 min ago' },
{ id: 3, user: 'admin@example.com', action: 'Updated user role', module: 'admin', time: '12 min ago' },
]);
} finally {
setLoading(false);
}
};
fetchAdminStats();
}, []);
return (
<AdminGuard>
<div className="space-y-8">
{/* Header */}
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-red-900/30 rounded-lg flex items-center justify-center">
<Shield className="w-6 h-6 text-red-400" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">Admin Dashboard</h1>
<p className="text-gray-500">System overview and management</p>
</div>
</div>
{/* Stats Grid */}
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-4 gap-6">
<div className="bg-forge-dark rounded-xl p-6 border border-gray-800">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-blue-900/30 rounded-lg flex items-center justify-center">
<Users className="w-6 h-6 text-blue-400" />
</div>
<div>
<p className="text-gray-500 text-sm">Total Users</p>
<p className="text-2xl font-bold text-white">{stats.totalUsers}</p>
<p className="text-xs text-green-400">{stats.activeUsers} active</p>
</div>
</div>
</div>
<div className="bg-forge-dark rounded-xl p-6 border border-gray-800">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-forge-yellow/10 rounded-lg flex items-center justify-center">
<Activity className="w-6 h-6 text-forge-yellow" />
</div>
<div>
<p className="text-gray-500 text-sm">Jobs Today</p>
<p className="text-2xl font-bold text-white">{stats.jobsToday}</p>
<p className="text-xs text-gray-500">{stats.totalJobs} total</p>
</div>
</div>
</div>
<div className="bg-forge-dark rounded-xl p-6 border border-gray-800">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-red-900/30 rounded-lg flex items-center justify-center">
<AlertTriangle className="w-6 h-6 text-red-400" />
</div>
<div>
<p className="text-gray-500 text-sm">Failed Jobs</p>
<p className="text-2xl font-bold text-white">{stats.failedJobs}</p>
<p className="text-xs text-gray-500">Today</p>
</div>
</div>
</div>
<div className="bg-forge-dark rounded-xl p-6 border border-gray-800">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-green-900/30 rounded-lg flex items-center justify-center">
<DollarSign className="w-6 h-6 text-green-400" />
</div>
<div>
<p className="text-gray-500 text-sm">API Costs (Est.)</p>
<p className="text-2xl font-bold text-white">${stats.apiCosts.toFixed(2)}</p>
<p className="text-xs text-gray-500">This month</p>
</div>
</div>
</div>
</div>
{/* Quick Links */}
<div className="grid grid-cols-1 md:grid-cols-3 gap-6">
<a
href="/admin/users"
className="bg-forge-dark rounded-xl p-6 border border-gray-800 hover:border-forge-yellow transition-colors"
>
<Users className="w-8 h-8 text-forge-yellow mb-4" />
<h3 className="text-lg font-semibold text-white mb-2">User Management</h3>
<p className="text-gray-500 text-sm">
Manage users, roles, and permissions
</p>
</a>
<a
href="/admin/reports"
className="bg-forge-dark rounded-xl p-6 border border-gray-800 hover:border-forge-yellow transition-colors"
>
<TrendingUp className="w-8 h-8 text-forge-yellow mb-4" />
<h3 className="text-lg font-semibold text-white mb-2">Usage Reports</h3>
<p className="text-gray-500 text-sm">
View detailed usage analytics and reports
</p>
</a>
<a
href="/admin/logs"
className="bg-forge-dark rounded-xl p-6 border border-gray-800 hover:border-forge-yellow transition-colors"
>
<Clock className="w-8 h-8 text-forge-yellow mb-4" />
<h3 className="text-lg font-semibold text-white mb-2">Audit Logs</h3>
<p className="text-gray-500 text-sm">
Review system activity and audit trail
</p>
</a>
</div>
{/* Recent Activity */}
<div className="bg-forge-dark rounded-xl border border-gray-800">
<div className="p-6 border-b border-gray-800">
<h2 className="text-lg font-semibold text-white">Recent Activity</h2>
</div>
<div className="divide-y divide-gray-800">
{loading ? (
<div className="p-6 text-center text-gray-500">Loading...</div>
) : recentActivity.length === 0 ? (
<div className="p-6 text-center text-gray-500">No recent activity</div>
) : (
recentActivity.map((activity) => (
<div key={activity.id} className="p-4 flex items-center justify-between">
<div>
<p className="text-white">{activity.action}</p>
<p className="text-sm text-gray-500">{activity.user}</p>
</div>
<div className="text-right">
<span className="text-xs bg-forge-gray px-2 py-1 rounded text-gray-400">
{activity.module}
</span>
<p className="text-xs text-gray-500 mt-1">{activity.time}</p>
</div>
</div>
))
)}
</div>
</div>
</div>
</AdminGuard>
);
}


@ -0,0 +1,326 @@
'use client';
import { useState, useEffect } from 'react';
import { toast } from 'react-hot-toast';
import {
TrendingUp,
Download,
Calendar,
BarChart3,
PieChart,
Activity,
} from 'lucide-react';
import AdminGuard from '@/components/AdminGuard';
import api from '@/lib/api';
interface UsageData {
date: string;
jobs: number;
cost: number;
}
interface ModuleUsage {
module: string;
count: number;
percentage: number;
}
interface UserUsage {
user_id: string;
user_email: string;
job_count: number;
total_cost: number;
}
export default function ReportsPage() {
const [dateRange, setDateRange] = useState('7d');
const [loading, setLoading] = useState(true);
const [usageOverTime, setUsageOverTime] = useState<UsageData[]>([]);
const [moduleBreakdown, setModuleBreakdown] = useState<ModuleUsage[]>([]);
const [topUsers, setTopUsers] = useState<UserUsage[]>([]);
const [totals, setTotals] = useState({
totalJobs: 0,
totalCost: 0,
avgJobsPerDay: 0,
});
useEffect(() => {
fetchReportData();
}, [dateRange]);
const fetchReportData = async () => {
setLoading(true);
try {
const response = await api.get('/admin/reports', {
params: { range: dateRange },
});
// Set real data from API
setUsageOverTime(response.data.usage_over_time || []);
setModuleBreakdown(response.data.module_breakdown || []);
setTopUsers(response.data.top_users || []);
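// Fall back to zeroed totals so the .toFixed() calls in the summary cards never hit undefined
setTotals(response.data.totals || { totalJobs: 0, totalCost: 0, avgJobsPerDay: 0 });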
} catch (err) {
// Use mock data for demo
setUsageOverTime([
{ date: '2024-12-03', jobs: 45, cost: 12.50 },
{ date: '2024-12-04', jobs: 62, cost: 18.30 },
{ date: '2024-12-05', jobs: 38, cost: 9.80 },
{ date: '2024-12-06', jobs: 71, cost: 22.40 },
{ date: '2024-12-07', jobs: 55, cost: 15.60 },
{ date: '2024-12-08', jobs: 48, cost: 13.20 },
{ date: '2024-12-09', jobs: 47, cost: 14.70 },
]);
setModuleBreakdown([
{ module: 'Image Generation', count: 156, percentage: 35 },
{ module: 'Video Generation', count: 89, percentage: 20 },
{ module: 'Text to Speech', count: 78, percentage: 18 },
{ module: 'Voice to Text', count: 67, percentage: 15 },
{ module: 'Image Upscaling', count: 45, percentage: 10 },
{ module: 'Other', count: 11, percentage: 2 },
]);
setTopUsers([
{ user_id: '1', user_email: 'john@example.com', job_count: 89, total_cost: 28.50 },
{ user_id: '2', user_email: 'jane@example.com', job_count: 67, total_cost: 21.30 },
{ user_id: '3', user_email: 'bob@example.com', job_count: 45, total_cost: 15.80 },
{ user_id: '4', user_email: 'alice@example.com', job_count: 34, total_cost: 12.40 },
{ user_id: '5', user_email: 'admin@forgeai.dev', job_count: 28, total_cost: 9.20 },
]);
setTotals({
totalJobs: 366,
totalCost: 106.50,
avgJobsPerDay: 52.3,
});
} finally {
setLoading(false);
}
};
const handleExport = async (format: 'csv' | 'json') => {
try {
const response = await api.get('/admin/reports/export', {
params: { range: dateRange, format },
responseType: 'blob',
});
const url = window.URL.createObjectURL(response.data);
const a = document.createElement('a');
a.href = url;
a.download = `forge-ai-report-${dateRange}.${format}`;
a.click();
window.URL.revokeObjectURL(url);
toast.success('Report exported!');
} catch (err) {
toast.error('Failed to export report');
}
};
const maxJobs = Math.max(...usageOverTime.map((d) => d.jobs), 1);
return (
<AdminGuard>
<div className="space-y-6">
{/* Header */}
<div className="flex items-center justify-between">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-forge-yellow/10 rounded-lg flex items-center justify-center">
<TrendingUp className="w-6 h-6 text-forge-yellow" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">Usage Reports</h1>
<p className="text-gray-500">Analytics and usage statistics</p>
</div>
</div>
<div className="flex items-center gap-3">
<select
value={dateRange}
onChange={(e) => setDateRange(e.target.value)}
className="select-field"
>
<option value="7d">Last 7 days</option>
<option value="30d">Last 30 days</option>
<option value="90d">Last 90 days</option>
<option value="365d">Last year</option>
</select>
<button
onClick={() => handleExport('csv')}
className="btn-secondary flex items-center gap-2"
>
<Download className="w-4 h-4" />
Export CSV
</button>
</div>
</div>
{/* Summary Cards */}
<div className="grid grid-cols-1 md:grid-cols-3 gap-6">
<div className="bg-forge-dark rounded-xl p-6 border border-gray-800">
<div className="flex items-center gap-3 mb-2">
<Activity className="w-5 h-5 text-forge-yellow" />
<span className="text-gray-500">Total Jobs</span>
</div>
<p className="text-3xl font-bold text-white">{totals.totalJobs}</p>
<p className="text-sm text-gray-500 mt-1">
Avg {totals.avgJobsPerDay.toFixed(1)}/day
</p>
</div>
<div className="bg-forge-dark rounded-xl p-6 border border-gray-800">
<div className="flex items-center gap-3 mb-2">
<BarChart3 className="w-5 h-5 text-green-400" />
<span className="text-gray-500">Estimated Cost</span>
</div>
<p className="text-3xl font-bold text-white">
${totals.totalCost.toFixed(2)}
</p>
<p className="text-sm text-gray-500 mt-1">API usage costs</p>
</div>
<div className="bg-forge-dark rounded-xl p-6 border border-gray-800">
<div className="flex items-center gap-3 mb-2">
<Calendar className="w-5 h-5 text-blue-400" />
<span className="text-gray-500">Period</span>
</div>
<p className="text-3xl font-bold text-white">
{dateRange === '7d'
? '7 Days'
: dateRange === '30d'
? '30 Days'
: dateRange === '90d'
? '90 Days'
: '1 Year'}
</p>
<p className="text-sm text-gray-500 mt-1">Date range</p>
</div>
</div>
{/* Charts Row */}
<div className="grid grid-cols-1 lg:grid-cols-2 gap-6">
{/* Usage Over Time */}
<div className="bg-forge-dark rounded-xl border border-gray-800 p-6">
<h3 className="text-lg font-semibold text-white mb-4">
Jobs Over Time
</h3>
{loading ? (
<div className="h-64 flex items-center justify-center text-gray-500">
Loading...
</div>
) : (
<div className="h-64 flex items-end gap-2">
{usageOverTime.map((data, i) => (
<div
key={i}
className="flex-1 flex flex-col items-center gap-2"
>
<div
className="w-full bg-forge-yellow rounded-t transition-all"
style={{
height: `${(data.jobs / maxJobs) * 200}px`,
minHeight: '4px',
}}
/>
<span className="text-xs text-gray-500">
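{new Date(data.date).toLocaleDateString('en-US', {
month: 'short',
day: 'numeric',
// Date-only strings parse as UTC; format in UTC so the label does not shift a day
timeZone: 'UTC',
})}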
</span>
</div>
))}
</div>
)}
</div>
{/* Module Breakdown */}
<div className="bg-forge-dark rounded-xl border border-gray-800 p-6">
<h3 className="text-lg font-semibold text-white mb-4">
Usage by Module
</h3>
{loading ? (
<div className="h-64 flex items-center justify-center text-gray-500">
Loading...
</div>
) : (
<div className="space-y-4">
{moduleBreakdown.map((module) => (
<div key={module.module}>
<div className="flex items-center justify-between mb-1">
<span className="text-gray-300">{module.module}</span>
<span className="text-gray-500 text-sm">
{module.count} ({module.percentage}%)
</span>
</div>
<div className="progress-bar">
<div
className="progress-bar-fill"
style={{ width: `${module.percentage}%` }}
/>
</div>
</div>
))}
</div>
)}
</div>
</div>
{/* Top Users */}
<div className="bg-forge-dark rounded-xl border border-gray-800">
<div className="p-6 border-b border-gray-800">
<h3 className="text-lg font-semibold text-white">Top Users</h3>
</div>
{loading ? (
<div className="p-8 text-center text-gray-500">Loading...</div>
) : (
<table className="w-full">
<thead>
<tr className="border-b border-gray-800">
<th className="text-left px-6 py-4 text-sm font-medium text-gray-500">
Rank
</th>
<th className="text-left px-6 py-4 text-sm font-medium text-gray-500">
User
</th>
<th className="text-right px-6 py-4 text-sm font-medium text-gray-500">
Jobs
</th>
<th className="text-right px-6 py-4 text-sm font-medium text-gray-500">
Est. Cost
</th>
</tr>
</thead>
<tbody>
{topUsers.map((user, index) => (
<tr
key={user.user_id}
className="border-b border-gray-800 last:border-0"
>
<td className="px-6 py-4">
<span
className={`w-6 h-6 rounded-full flex items-center justify-center text-sm font-medium ${
index === 0
? 'bg-forge-yellow text-black'
: index === 1
? 'bg-gray-400 text-black'
: index === 2
? 'bg-orange-600 text-white'
: 'bg-forge-gray text-gray-400'
}`}
>
{index + 1}
</span>
</td>
<td className="px-6 py-4 text-white">{user.user_email}</td>
<td className="px-6 py-4 text-right text-gray-300">
{user.job_count}
</td>
<td className="px-6 py-4 text-right text-gray-300">
${user.total_cost.toFixed(2)}
</td>
</tr>
))}
</tbody>
</table>
)}
</div>
</div>
</AdminGuard>
);
}


@ -0,0 +1,306 @@
'use client';
import { useState, useEffect } from 'react';
import { toast } from 'react-hot-toast';
import { Users, Search, Edit2, Shield, ShieldOff, Trash2 } from 'lucide-react';
import AdminGuard from '@/components/AdminGuard';
import api from '@/lib/api';
interface User {
id: string;
email: string;
name: string;
role: string;
is_active: boolean;
created_at: string;
last_login?: string;
}
export default function UserManagementPage() {
const [users, setUsers] = useState<User[]>([]);
const [loading, setLoading] = useState(true);
const [searchQuery, setSearchQuery] = useState('');
const [roleFilter, setRoleFilter] = useState('');
const [editingUser, setEditingUser] = useState<User | null>(null);
const [newRole, setNewRole] = useState('');
useEffect(() => {
fetchUsers();
}, [roleFilter]);
const fetchUsers = async () => {
setLoading(true);
try {
const params: any = {};
if (roleFilter) params.role = roleFilter;
const response = await api.get('/admin/users', { params });
setUsers(response.data.items || []);
} catch (err) {
// Mock data for demo
setUsers([
{
id: '1',
email: 'admin@forgeai.dev',
name: 'Admin User',
role: 'admin',
is_active: true,
created_at: '2024-01-15T10:00:00Z',
last_login: '2024-12-09T14:30:00Z',
},
{
id: '2',
email: 'test@forgeai.dev',
name: 'Test User',
role: 'user',
is_active: true,
created_at: '2024-02-01T10:00:00Z',
last_login: '2024-12-09T12:00:00Z',
},
{
id: '3',
email: 'john@example.com',
name: 'John Doe',
role: 'user',
is_active: true,
created_at: '2024-03-01T10:00:00Z',
},
]);
} finally {
setLoading(false);
}
};
const handleUpdateRole = async () => {
if (!editingUser || !newRole) return;
try {
await api.patch(`/admin/users/${editingUser.id}`, { role: newRole });
toast.success('User role updated');
setEditingUser(null);
fetchUsers();
} catch (err) {
toast.error('Failed to update role');
}
};
const handleToggleActive = async (user: User) => {
try {
await api.patch(`/admin/users/${user.id}`, { is_active: !user.is_active });
toast.success(user.is_active ? 'User deactivated' : 'User activated');
fetchUsers();
} catch (err) {
toast.error('Failed to update user status');
}
};
const filteredUsers = users.filter(
(user) =>
user.email.toLowerCase().includes(searchQuery.toLowerCase()) ||
user.name.toLowerCase().includes(searchQuery.toLowerCase())
);
const getRoleBadgeColor = (role: string) => {
switch (role) {
case 'super_admin':
return 'bg-red-900/50 text-red-400';
case 'admin':
return 'bg-orange-900/50 text-orange-400';
default:
return 'bg-blue-900/50 text-blue-400';
}
};
return (
<AdminGuard>
<div className="space-y-6">
{/* Header */}
<div className="flex items-center justify-between">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-blue-900/30 rounded-lg flex items-center justify-center">
<Users className="w-6 h-6 text-blue-400" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">User Management</h1>
<p className="text-gray-500">Manage users and their roles</p>
</div>
</div>
</div>
{/* Filters */}
<div className="flex gap-4">
<div className="flex-1">
<div className="relative">
<Search className="absolute left-3 top-1/2 -translate-y-1/2 w-5 h-5 text-gray-500" />
<input
type="text"
value={searchQuery}
onChange={(e) => setSearchQuery(e.target.value)}
placeholder="Search users..."
className="input-field pl-10"
/>
</div>
</div>
<select
value={roleFilter}
onChange={(e) => setRoleFilter(e.target.value)}
className="select-field w-40"
>
<option value="">All Roles</option>
<option value="user">User</option>
<option value="admin">Admin</option>
<option value="super_admin">Super Admin</option>
</select>
</div>
{/* Users Table */}
<div className="bg-forge-dark rounded-xl border border-gray-800 overflow-hidden">
{loading ? (
<div className="p-8 text-center text-gray-500">Loading...</div>
) : filteredUsers.length === 0 ? (
<div className="p-8 text-center text-gray-500">No users found</div>
) : (
<table className="w-full">
<thead>
<tr className="border-b border-gray-800">
<th className="text-left px-6 py-4 text-sm font-medium text-gray-500">
User
</th>
<th className="text-left px-6 py-4 text-sm font-medium text-gray-500">
Role
</th>
<th className="text-left px-6 py-4 text-sm font-medium text-gray-500">
Status
</th>
<th className="text-left px-6 py-4 text-sm font-medium text-gray-500">
Last Login
</th>
<th className="text-right px-6 py-4 text-sm font-medium text-gray-500">
Actions
</th>
</tr>
</thead>
<tbody>
{filteredUsers.map((user) => (
<tr
key={user.id}
className="border-b border-gray-800 last:border-0 hover:bg-forge-gray/50"
>
<td className="px-6 py-4">
<div>
<p className="text-white font-medium">{user.name}</p>
<p className="text-sm text-gray-500">{user.email}</p>
</div>
</td>
<td className="px-6 py-4">
<span className={`badge ${getRoleBadgeColor(user.role)}`}>
{user.role.replace('_', ' ')}
</span>
</td>
<td className="px-6 py-4">
<span
className={`badge ${
user.is_active
? 'bg-green-900/50 text-green-400'
: 'bg-gray-700 text-gray-400'
}`}
>
{user.is_active ? 'Active' : 'Inactive'}
</span>
</td>
<td className="px-6 py-4 text-gray-400 text-sm">
{user.last_login
? new Date(user.last_login).toLocaleDateString()
: 'Never'}
</td>
<td className="px-6 py-4">
<div className="flex items-center justify-end gap-2">
<button
onClick={() => {
setEditingUser(user);
setNewRole(user.role);
}}
className="p-2 text-gray-400 hover:text-forge-yellow transition-colors"
title="Edit role"
>
<Edit2 className="w-4 h-4" />
</button>
<button
onClick={() => handleToggleActive(user)}
className={`p-2 transition-colors ${
user.is_active
? 'text-gray-400 hover:text-red-400'
: 'text-gray-400 hover:text-green-400'
}`}
title={user.is_active ? 'Deactivate' : 'Activate'}
>
{user.is_active ? (
<ShieldOff className="w-4 h-4" />
) : (
<Shield className="w-4 h-4" />
)}
</button>
</div>
</td>
</tr>
))}
</tbody>
</table>
)}
</div>
{/* Edit Role Modal */}
{editingUser && (
<div className="fixed inset-0 bg-black/60 flex items-center justify-center z-50 p-4">
<div className="bg-forge-dark rounded-xl border border-gray-800 w-full max-w-md">
<div className="p-6 border-b border-gray-800 flex items-center justify-between">
<h3 className="text-lg font-semibold text-white">Change User Role</h3>
<button
onClick={() => setEditingUser(null)}
className="text-gray-400 hover:text-white"
>
&times;
</button>
</div>
<div className="p-6 space-y-4">
<div>
<p className="text-gray-400 text-sm mb-1">User</p>
<p className="text-white">{editingUser.name}</p>
<p className="text-sm text-gray-500">{editingUser.email}</p>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
New Role
</label>
<select
value={newRole}
onChange={(e) => setNewRole(e.target.value)}
className="select-field"
>
<option value="user">User</option>
<option value="admin">Admin</option>
<option value="super_admin">Super Admin</option>
</select>
</div>
<div className="flex gap-3">
<button
onClick={() => setEditingUser(null)}
className="btn-secondary flex-1"
>
Cancel
</button>
<button
onClick={handleUpdateRole}
className="btn-primary flex-1"
>
Update Role
</button>
</div>
</div>
</div>
</div>
)}
</div>
</AdminGuard>
);
}
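
The client-side search in the component above can be read as a pure function, which makes it easy to test outside React. The sketch below is an illustration, not part of the commit: `filterUsers`, `UserRow`, and `sampleUsers` are hypothetical names, the role check is an optional extra (the page itself sends `role` to the server as a query parameter), and unlike the page it trims the query before matching.

```typescript
// Hypothetical helper; `UserRow` mirrors only the fields the filter reads.
interface UserRow {
  email: string;
  name: string;
  role: string;
}

// Returns users whose name or email contains `query` (case-insensitive)
// and, when `role` is non-empty, whose role matches it exactly.
function filterUsers<T extends UserRow>(users: T[], query: string, role: string = ''): T[] {
  const needle = query.trim().toLowerCase();
  return users.filter((user) => {
    const matchesRole = role === '' || user.role === role;
    const matchesQuery =
      needle === '' ||
      user.email.toLowerCase().includes(needle) ||
      user.name.toLowerCase().includes(needle);
    return matchesRole && matchesQuery;
  });
}

// Sample rows taken from the mock data in the component.
const sampleUsers: UserRow[] = [
  { email: 'admin@forgeai.dev', name: 'Admin User', role: 'admin' },
  { email: 'test@forgeai.dev', name: 'Test User', role: 'user' },
  { email: 'john@example.com', name: 'John Doe', role: 'user' },
];

console.log(filterUsers(sampleUsers, 'forgeai', 'user').map((u) => u.email));
```

Keeping the filter pure would also let the component wrap it in `useMemo` if the user list grows, though for a handful of rows that is unlikely to matter.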

@ -0,0 +1,500 @@
'use client';
import { useState, useEffect, useRef } from 'react';
import { toast } from 'react-hot-toast';
import {
Mic,
Search,
Play,
Pause,
Trash2,
Edit2,
Plus,
Volume2,
User,
Building2,
RefreshCw,
BookmarkPlus
} from 'lucide-react';
import AdminGuard from '@/components/AdminGuard';
import api from '@/lib/api';
interface Voice {
voice_id: string;
name: string;
category: string;
description?: string;
labels?: {
accent?: string;
gender?: string;
age?: string;
description?: string;
use_case?: string;
};
preview_url?: string;
settings?: {
stability: number;
similarity_boost: number;
style?: number;
use_speaker_boost?: boolean;
};
samples?: { sample_id: string; file_name: string; mime_type: string }[];
}
export default function VoicesAdminPage() {
const [voices, setVoices] = useState<Voice[]>([]);
const [loading, setLoading] = useState(true);
const [searchQuery, setSearchQuery] = useState('');
const [categoryFilter, setCategoryFilter] = useState('');
const [playingVoiceId, setPlayingVoiceId] = useState<string | null>(null);
const [editingVoice, setEditingVoice] = useState<Voice | null>(null);
const [newName, setNewName] = useState('');
const [newDescription, setNewDescription] = useState('');
const [savedVoices, setSavedVoices] = useState<Set<string>>(new Set());
const audioRef = useRef<HTMLAudioElement | null>(null);
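useEffect(() => {
// Load saved voices from localStorage; ignore a missing or corrupt entry
const saved = localStorage.getItem('savedVoices');
if (!saved) return;
try {
const parsed = JSON.parse(saved);
if (Array.isArray(parsed)) {
setSavedVoices(new Set<string>(parsed));
}
} catch {
// Malformed JSON: keep the empty default
}
}, []);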
useEffect(() => {
fetchVoices();
}, []);
const fetchVoices = async () => {
setLoading(true);
try {
const response = await api.get('/admin/voices');
setVoices(response.data.voices || []);
} catch (err: any) {
console.error('Failed to fetch voices:', err);
toast.error(err.response?.data?.detail || 'Failed to fetch voices');
// Mock data for demo
setVoices([
{
voice_id: '21m00Tcm4TlvDq8ikWAM',
name: 'Rachel',
category: 'premade',
description: 'Calm, professional female voice',
labels: { accent: 'american', gender: 'female', age: 'young' },
preview_url: 'https://api.elevenlabs.io/v1/voices/21m00Tcm4TlvDq8ikWAM/preview',
},
{
voice_id: 'ErXwobaYiN019PkySvjV',
name: 'Antoni',
category: 'premade',
description: 'Well-rounded male voice',
labels: { accent: 'american', gender: 'male', age: 'middle_aged' },
},
]);
} finally {
setLoading(false);
}
};
const handlePlayPreview = (voice: Voice) => {
if (playingVoiceId === voice.voice_id) {
// Stop playing
if (audioRef.current) {
audioRef.current.pause();
audioRef.current = null;
}
setPlayingVoiceId(null);
} else {
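// Stop any current playback and clear its state
if (audioRef.current) {
audioRef.current.pause();
audioRef.current = null;
setPlayingVoiceId(null);
}
// Start new playback
if (voice.preview_url) {
const audio = new Audio(voice.preview_url);
const handleFailure = () => {
// onerror and a rejected play() can both fire; report only once,
// and stay silent if this audio was already stopped or replaced
if (audioRef.current !== audio) return;
audioRef.current = null;
setPlayingVoiceId(null);
toast.error('Failed to play preview');
};
audio.onended = () => {
setPlayingVoiceId(null);
audioRef.current = null;
};
audio.onerror = handleFailure;
audioRef.current = audio;
setPlayingVoiceId(voice.voice_id);
// play() returns a promise that rejects if playback cannot start
audio.play().catch(handleFailure);
} else {
toast.error('No preview available for this voice');
}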
}
};
const handleDeleteVoice = async (voice: Voice) => {
if (voice.category === 'premade') {
toast.error('Cannot delete premade voices');
return;
}
if (!confirm(`Are you sure you want to delete the voice "${voice.name}"?`)) {
return;
}
try {
await api.delete(`/admin/voices/${voice.voice_id}`);
toast.success('Voice deleted successfully');
fetchVoices();
} catch (err: any) {
toast.error(err.response?.data?.detail || 'Failed to delete voice');
}
};
const handleUpdateVoice = async () => {
if (!editingVoice) return;
try {
await api.patch(`/admin/voices/${editingVoice.voice_id}/settings`, {
name: newName || undefined,
description: newDescription || undefined,
});
toast.success('Voice updated successfully');
setEditingVoice(null);
fetchVoices();
} catch (err: any) {
toast.error(err.response?.data?.detail || 'Failed to update voice');
}
};
const handleSaveVoice = (voiceId: string) => {
const newSaved = new Set(savedVoices);
if (newSaved.has(voiceId)) {
newSaved.delete(voiceId);
toast.success('Removed from library');
} else {
newSaved.add(voiceId);
toast.success('Added to library');
}
setSavedVoices(newSaved);
localStorage.setItem('savedVoices', JSON.stringify(Array.from(newSaved)));
};
const filteredVoices = voices.filter((voice) => {
const matchesSearch =
voice.name.toLowerCase().includes(searchQuery.toLowerCase()) ||
voice.description?.toLowerCase().includes(searchQuery.toLowerCase()) ||
voice.labels?.accent?.toLowerCase().includes(searchQuery.toLowerCase());
const matchesCategory = !categoryFilter || voice.category === categoryFilter;
return matchesSearch && matchesCategory;
});
const getCategoryBadgeColor = (category: string) => {
switch (category) {
case 'cloned':
return 'bg-purple-900/50 text-purple-400';
case 'generated':
return 'bg-blue-900/50 text-blue-400';
case 'professional':
return 'bg-green-900/50 text-green-400';
default:
return 'bg-gray-700 text-gray-400';
}
};
const getGenderIcon = (gender?: string) => {
if (gender === 'female') return <User className="w-4 h-4 text-pink-400" />;
if (gender === 'male') return <User className="w-4 h-4 text-blue-400" />;
return <User className="w-4 h-4 text-gray-400" />;
};
const categories = Array.from(new Set(voices.map((v) => v.category)));
return (
<AdminGuard>
<div className="space-y-6">
{/* Header */}
<div className="flex items-center justify-between">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-purple-900/30 rounded-lg flex items-center justify-center">
<Mic className="w-6 h-6 text-purple-400" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">Voice Management</h1>
<p className="text-gray-500">Manage ElevenLabs voices and custom clones</p>
</div>
</div>
<div className="flex items-center gap-3">
<button
onClick={fetchVoices}
className="btn-secondary flex items-center gap-2"
>
<RefreshCw className="w-4 h-4" />
Refresh
</button>
<button
onClick={() => toast('Voice cloning coming soon!')}
className="btn-primary flex items-center gap-2"
>
<Plus className="w-4 h-4" />
Clone Voice
</button>
</div>
</div>
{/* Filters */}
<div className="flex gap-4">
<div className="flex-1">
<div className="relative">
<Search className="absolute left-3 top-1/2 -translate-y-1/2 w-5 h-5 text-gray-500" />
<input
type="text"
value={searchQuery}
onChange={(e) => setSearchQuery(e.target.value)}
placeholder="Search voices by name, accent..."
className="input-field pl-10"
/>
</div>
</div>
<select
value={categoryFilter}
onChange={(e) => setCategoryFilter(e.target.value)}
className="select-field w-40"
>
<option value="">All Categories</option>
{categories.map((cat) => (
<option key={cat} value={cat}>
{cat.charAt(0).toUpperCase() + cat.slice(1)}
</option>
))}
</select>
</div>
{/* Stats */}
<div className="grid grid-cols-4 gap-4">
<div className="bg-forge-dark rounded-xl border border-gray-800 p-4">
<div className="flex items-center gap-3">
<div className="w-10 h-10 bg-purple-900/30 rounded-lg flex items-center justify-center">
<Volume2 className="w-5 h-5 text-purple-400" />
</div>
<div>
<p className="text-2xl font-bold text-white">{voices.length}</p>
<p className="text-sm text-gray-500">Total Voices</p>
</div>
</div>
</div>
<div className="bg-forge-dark rounded-xl border border-gray-800 p-4">
<div className="flex items-center gap-3">
<div className="w-10 h-10 bg-blue-900/30 rounded-lg flex items-center justify-center">
<Building2 className="w-5 h-5 text-blue-400" />
</div>
<div>
<p className="text-2xl font-bold text-white">
{voices.filter((v) => v.category === 'premade').length}
</p>
<p className="text-sm text-gray-500">Premade</p>
</div>
</div>
</div>
<div className="bg-forge-dark rounded-xl border border-gray-800 p-4">
<div className="flex items-center gap-3">
<div className="w-10 h-10 bg-green-900/30 rounded-lg flex items-center justify-center">
<User className="w-5 h-5 text-green-400" />
</div>
<div>
<p className="text-2xl font-bold text-white">
{voices.filter((v) => v.category === 'cloned').length}
</p>
<p className="text-sm text-gray-500">Cloned</p>
</div>
</div>
</div>
<div className="bg-forge-dark rounded-xl border border-gray-800 p-4">
<div className="flex items-center gap-3">
<div className="w-10 h-10 bg-orange-900/30 rounded-lg flex items-center justify-center">
<Mic className="w-5 h-5 text-orange-400" />
</div>
<div>
<p className="text-2xl font-bold text-white">
{voices.filter((v) => v.category === 'professional').length}
</p>
<p className="text-sm text-gray-500">Professional</p>
</div>
</div>
</div>
</div>
{/* Voices Grid */}
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-4">
{loading ? (
<div className="col-span-full p-8 text-center text-gray-500">
Loading voices...
</div>
) : filteredVoices.length === 0 ? (
<div className="col-span-full p-8 text-center text-gray-500">
No voices found
</div>
) : (
filteredVoices.map((voice) => (
<div
key={voice.voice_id}
className="bg-forge-dark rounded-xl border border-gray-800 p-4 hover:border-gray-700 transition-colors"
>
<div className="flex items-start justify-between mb-3">
<div className="flex items-center gap-3">
<div className="w-10 h-10 bg-forge-gray rounded-lg flex items-center justify-center">
{getGenderIcon(voice.labels?.gender)}
</div>
<div>
<h3 className="text-white font-medium">{voice.name}</h3>
<span className={`badge text-xs ${getCategoryBadgeColor(voice.category)}`}>
{voice.category}
</span>
</div>
</div>
<button
onClick={() => handlePlayPreview(voice)}
className={`p-2 rounded-lg transition-colors ${
playingVoiceId === voice.voice_id
? 'bg-forge-yellow text-black'
: 'bg-forge-gray text-gray-400 hover:text-white'
}`}
>
{playingVoiceId === voice.voice_id ? (
<Pause className="w-4 h-4" />
) : (
<Play className="w-4 h-4" />
)}
</button>
</div>
{voice.description && (
<p className="text-sm text-gray-500 mb-3 line-clamp-2">
{voice.description}
</p>
)}
{voice.labels && (
<div className="flex flex-wrap gap-1 mb-3">
{voice.labels.accent && (
<span className="px-2 py-0.5 text-xs bg-forge-gray rounded text-gray-400">
{voice.labels.accent}
</span>
)}
{voice.labels.age && (
<span className="px-2 py-0.5 text-xs bg-forge-gray rounded text-gray-400">
{voice.labels.age.replace('_', ' ')}
</span>
)}
{voice.labels.use_case && (
<span className="px-2 py-0.5 text-xs bg-forge-gray rounded text-gray-400">
{voice.labels.use_case}
</span>
)}
</div>
)}
<div className="flex items-center justify-between pt-3 border-t border-gray-800">
<code className="text-xs text-gray-500 font-mono">
{voice.voice_id.substring(0, 12)}...
</code>
<div className="flex items-center gap-1">
<button
onClick={() => handleSaveVoice(voice.voice_id)}
className={`p-1.5 transition-colors ${
savedVoices.has(voice.voice_id)
? 'text-forge-yellow'
: 'text-gray-400 hover:text-forge-yellow'
}`}
title={savedVoices.has(voice.voice_id) ? 'Remove from library' : 'Add to library'}
>
<BookmarkPlus className="w-4 h-4" />
</button>
{voice.category !== 'premade' && (
<>
<button
onClick={() => {
setEditingVoice(voice);
setNewName(voice.name);
setNewDescription(voice.description || '');
}}
className="p-1.5 text-gray-400 hover:text-forge-yellow transition-colors"
title="Edit voice"
>
<Edit2 className="w-4 h-4" />
</button>
<button
onClick={() => handleDeleteVoice(voice)}
className="p-1.5 text-gray-400 hover:text-red-400 transition-colors"
title="Delete voice"
>
<Trash2 className="w-4 h-4" />
</button>
</>
)}
</div>
</div>
</div>
))
)}
</div>
{/* Edit Voice Modal */}
{editingVoice && (
<div className="fixed inset-0 bg-black/60 flex items-center justify-center z-50 p-4">
<div className="bg-forge-dark rounded-xl border border-gray-800 w-full max-w-md">
<div className="p-6 border-b border-gray-800 flex items-center justify-between">
<h3 className="text-lg font-semibold text-white">Edit Voice</h3>
<button
onClick={() => setEditingVoice(null)}
className="text-gray-400 hover:text-white text-2xl"
>
&times;
</button>
</div>
<div className="p-6 space-y-4">
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Voice Name
</label>
<input
type="text"
value={newName}
onChange={(e) => setNewName(e.target.value)}
className="input-field"
placeholder="Enter voice name"
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Description
</label>
<textarea
value={newDescription}
onChange={(e) => setNewDescription(e.target.value)}
className="input-field min-h-[80px]"
placeholder="Enter voice description"
/>
</div>
<div className="flex gap-3">
<button
onClick={() => setEditingVoice(null)}
className="btn-secondary flex-1"
>
Cancel
</button>
<button
onClick={handleUpdateVoice}
className="btn-primary flex-1"
>
Save Changes
</button>
</div>
</div>
</div>
</div>
)}
</div>
</AdminGuard>
);
}
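
The saved-voice library on this page is a `Set` of voice IDs persisted under the `savedVoices` key in `localStorage`. As a minimal sketch, assuming the same storage format (a JSON array of strings), the parse, toggle, and serialize steps could be separated into pure helpers. The function names below are hypothetical and do not appear in the commit.

```typescript
// Parses the raw localStorage value; returns an empty set for a missing,
// malformed, or non-array entry instead of throwing.
function parseSavedVoiceIds(raw: string | null): Set<string> {
  if (!raw) return new Set<string>();
  try {
    const parsed: unknown = JSON.parse(raw);
    if (!Array.isArray(parsed)) return new Set<string>();
    // Keep only string entries so a corrupted array cannot poison the set.
    return new Set<string>(parsed.filter((id): id is string => typeof id === 'string'));
  } catch {
    return new Set<string>();
  }
}

// Returns a new set with `voiceId` removed if present, added otherwise.
// The input set is left untouched so it is safe to use as React state.
function toggleSavedVoice(saved: Set<string>, voiceId: string): Set<string> {
  const next = new Set(saved);
  if (!next.delete(voiceId)) {
    next.add(voiceId);
  }
  return next;
}

// Produces the string to store back under the 'savedVoices' key.
function serializeSavedVoiceIds(saved: Set<string>): string {
  return JSON.stringify(Array.from(saved));
}

const library = toggleSavedVoice(parseSavedVoiceIds('["21m00Tcm4TlvDq8ikWAM"]'), 'ErXwobaYiN019PkySvjV');
console.log(serializeSavedVoiceIds(library));
```

Because the Text to Speech page reads the same key, sharing helpers like these between the two pages would keep the storage format in one place.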

@ -0,0 +1,352 @@
'use client';
import { useState } from 'react';
import { toast } from 'react-hot-toast';
import { Volume2, Download, Sparkles, Play, Pause, RotateCw } from 'lucide-react';
import JobProgress from '@/components/JobProgress';
import { modulesApi, assetsApi } from '@/lib/api';
import { useStore } from '@/lib/store';
const outputFormats = [
{ id: 'mp3_44100_128', name: 'MP3 (128kbps)' },
{ id: 'mp3_44100_192', name: 'MP3 (192kbps)' },
{ id: 'pcm_48000', name: 'WAV (48kHz)' },
];
const presetPrompts = [
{ label: 'Explosion', prompt: 'Cinematic explosion with debris and fire, big impact' },
{ label: 'Footsteps', prompt: 'Footsteps walking on gravel, steady pace' },
{ label: 'Thunder', prompt: 'Deep rolling thunder with distant rumbles' },
{ label: 'Swoosh', prompt: 'Fast whoosh sound, air movement, transition effect' },
{ label: 'Rain', prompt: 'Gentle rain falling on a window, ambient background' },
{ label: 'Door', prompt: 'Heavy wooden door creaking open slowly' },
{ label: 'Heartbeat', prompt: 'Slow heartbeat, tense dramatic moment' },
{ label: 'Typing', prompt: 'Keyboard typing, mechanical keys, fast typing' },
];
export default function SoundEffectsPage() {
const { addJob, updateJob } = useStore();
const [prompt, setPrompt] = useState('');
const [duration, setDuration] = useState<number | null>(null);
const [promptInfluence, setPromptInfluence] = useState(0.3);
const [loop, setLoop] = useState(false);
const [outputFormat, setOutputFormat] = useState('mp3_44100_128');
const [jobId, setJobId] = useState<string | null>(null);
const [generatedAudio, setGeneratedAudio] = useState<any>(null);
const [loading, setLoading] = useState(false);
const [playing, setPlaying] = useState(false);
const [audioElement, setAudioElement] = useState<HTMLAudioElement | null>(null);
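const handleGenerate = async () => {
if (!prompt.trim()) {
toast.error('Please describe the sound effect');
return;
}
setLoading(true);
// Drop the previous result and its player so stale audio cannot be replayed
audioElement?.pause();
setAudioElement(null);
setPlaying(false);
setGeneratedAudio(null);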
try {
const response = await modulesApi.generateSoundEffect({
text: prompt,
duration_seconds: duration,
prompt_influence: promptInfluence,
loop,
output_format: outputFormat,
});
const job = response.data;
setJobId(job.id);
addJob({
id: job.id,
module: 'sound_effects',
status: job.status,
progress: job.progress,
created_at: job.created_at,
});
toast.success('Sound effect generation started!');
} catch (err: any) {
toast.error(err.response?.data?.detail || 'Failed to start generation');
setLoading(false);
}
};
const handleJobComplete = async (job: any) => {
setLoading(false);
updateJob(job.id, { status: 'completed', progress: 100 });
if (!job.output_asset_ids?.[0]) return;
try {
const asset = await assetsApi.get(job.output_asset_ids[0]);
setGeneratedAudio(asset.data);
toast.success('Sound effect generated successfully!');
} catch (err) {
// The job finished, but fetching the asset record failed
toast.error('Sound effect was generated but could not be loaded');
}
};
const handleJobError = (error: string) => {
setLoading(false);
toast.error(error);
};
const handleDownload = async () => {
if (!generatedAudio) return;
try {
const response = await assetsApi.download(generatedAudio.id);
const url = window.URL.createObjectURL(response.data);
const a = document.createElement('a');
a.href = url;
a.download = generatedAudio.original_filename;
a.click();
window.URL.revokeObjectURL(url);
} catch (err) {
toast.error('Failed to download audio');
}
};
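const handlePlayPause = () => {
if (!generatedAudio) return;
// play() returns a promise; reset the button state if playback cannot start
const onPlayFailed = (err: any) => {
// AbortError only means playback was paused before it started
if (err?.name === 'AbortError') return;
setPlaying(false);
toast.error('Failed to play audio');
};
if (audioElement) {
if (playing) {
audioElement.pause();
} else {
audioElement.play().catch(onPlayFailed);
}
setPlaying(!playing);
} else {
const audio = new Audio(`/api/v1/assets/${generatedAudio.id}/download`);
audio.onended = () => setPlaying(false);
audio.play().catch(onPlayFailed);
setAudioElement(audio);
setPlaying(true);
}
};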
const applyPreset = (preset: typeof presetPrompts[0]) => {
setPrompt(preset.prompt);
};
return (
<div className="max-w-6xl mx-auto space-y-8">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-forge-yellow/10 rounded-lg flex items-center justify-center">
<Volume2 className="w-6 h-6 text-forge-yellow" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">Sound Effects Generator</h1>
<p className="text-gray-500">Create custom sound effects with AI</p>
</div>
</div>
<div className="grid grid-cols-1 lg:grid-cols-2 gap-8">
{/* Controls */}
<div className="space-y-6">
{/* Preset Buttons */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Quick Presets
</label>
<div className="flex flex-wrap gap-2">
{presetPrompts.map((preset) => (
<button
key={preset.label}
onClick={() => applyPreset(preset)}
className="px-3 py-1.5 text-sm bg-forge-gray border border-gray-700 rounded-lg text-gray-300 hover:text-white hover:border-forge-yellow transition-colors"
>
{preset.label}
</button>
))}
</div>
</div>
{/* Prompt */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Sound Description
</label>
<textarea
value={prompt}
onChange={(e) => setPrompt(e.target.value)}
placeholder="Describe the sound effect you want to create... e.g., 'Cinematic whoosh with reverb tail'"
className="input-field min-h-[120px] resize-none"
maxLength={500}
/>
<p className="mt-1 text-xs text-gray-500">{prompt.length}/500 characters</p>
</div>
{/* Duration */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Duration (Optional)
</label>
<div className="flex items-center gap-4">
<input
type="number"
min={1}
max={22}
value={duration || ''}
onChange={(e) => setDuration(e.target.value ? parseInt(e.target.value, 10) : null)}
placeholder="Auto"
className="input-field w-24"
/>
<span className="text-gray-500 text-sm">seconds (max 22)</span>
</div>
<p className="mt-1 text-xs text-gray-500">
Leave empty for automatic duration based on the sound
</p>
</div>
{/* Prompt Influence */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Prompt Influence: {promptInfluence.toFixed(1)}
</label>
<input
type="range"
min={0}
max={1}
step={0.1}
value={promptInfluence}
onChange={(e) => setPromptInfluence(parseFloat(e.target.value))}
className="w-full accent-forge-yellow"
/>
<p className="text-xs text-gray-500 mt-1">
Higher = closer match to description
</p>
</div>
{/* Options */}
<div className="grid grid-cols-2 gap-4">
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Output Format
</label>
<select
value={outputFormat}
onChange={(e) => setOutputFormat(e.target.value)}
className="select-field"
>
{outputFormats.map((format) => (
<option key={format.id} value={format.id}>
{format.name}
</option>
))}
</select>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Loop
</label>
<label className="flex items-center gap-2 cursor-pointer mt-2">
<input
type="checkbox"
checked={loop}
onChange={(e) => setLoop(e.target.checked)}
className="w-4 h-4 accent-forge-yellow rounded"
/>
<span className="text-gray-300">Create seamless loop</span>
</label>
</div>
</div>
{/* Generate Button */}
<button
onClick={handleGenerate}
disabled={loading || !prompt.trim()}
className="btn-primary w-full flex items-center justify-center gap-2 disabled:opacity-50 disabled:cursor-not-allowed"
>
<Sparkles className="w-5 h-5" />
{loading ? 'Generating...' : 'Generate Sound Effect'}
</button>
{/* Job Progress */}
{jobId && loading && (
<JobProgress
jobId={jobId}
onComplete={handleJobComplete}
onError={handleJobError}
/>
)}
</div>
{/* Results */}
<div>
<h2 className="text-lg font-semibold text-white mb-4">Generated Sound</h2>
{generatedAudio ? (
<div className="bg-forge-dark rounded-xl overflow-hidden border border-gray-800">
<div className="p-6">
<div className="flex items-center justify-center gap-4 mb-6">
<button
onClick={handlePlayPause}
className="w-16 h-16 bg-forge-yellow rounded-full flex items-center justify-center text-black hover:bg-yellow-400 transition-colors"
>
{playing ? (
<Pause className="w-8 h-8" />
) : (
<Play className="w-8 h-8 ml-1" />
)}
</button>
</div>
<audio
src={`/api/v1/assets/${generatedAudio.id}/download`}
controls
className="w-full"
onPlay={() => setPlaying(true)}
onPause={() => setPlaying(false)}
onEnded={() => setPlaying(false)}
/>
</div>
<div className="p-4 border-t border-gray-800">
<div className="flex items-center justify-between">
<div>
<p className="text-white font-medium">{generatedAudio.original_filename}</p>
<p className="text-sm text-gray-500">
{(generatedAudio.file_size_bytes / 1024).toFixed(1)} KB
{loop && ' (looping)'}
</p>
</div>
<div className="flex items-center gap-2">
<button
onClick={() => {
audioElement?.pause();
setGeneratedAudio(null);
setAudioElement(null);
setPlaying(false);
}}
className="p-2 text-gray-400 hover:text-white transition-colors"
title="Generate new"
>
<RotateCw className="w-5 h-5" />
</button>
<button
onClick={handleDownload}
className="btn-primary flex items-center gap-2"
>
<Download className="w-4 h-4" />
Download
</button>
</div>
</div>
</div>
</div>
) : (
<div className="bg-forge-dark rounded-xl border border-gray-800 p-8 text-center">
<Volume2 className="w-12 h-12 text-gray-600 mx-auto mb-3" />
<p className="text-gray-500">Generated sound effects will appear here</p>
</div>
)}
{/* Tips */}
<div className="mt-6 p-4 bg-forge-dark rounded-lg border border-gray-800">
<h3 className="text-white font-medium mb-2">Tips for better results:</h3>
<ul className="text-sm text-gray-400 space-y-1">
<li>- Be specific about the type of sound (e.g., &quot;metallic clang&quot; vs just &quot;impact&quot;)</li>
<li>- Include acoustic qualities (reverb, echo, muffled, crisp)</li>
<li>- Mention context (cinematic, game, UI, ambient)</li>
<li>- Use loop option for ambient sounds and backgrounds</li>
</ul>
</div>
</div>
</div>
</div>
);
}
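
The form above enforces its limits only through input attributes (`min={1}`, `max={22}`, and a 0 to 1 slider), which a typed-in value can bypass. As a hedged sketch, the payload sent to `modulesApi.generateSoundEffect` could be built by a pure function that clamps those values first. The field names follow the request in `handleGenerate`; `buildSoundEffectRequest`, `SoundEffectForm`, and the constants are hypothetical, the limits are copied from the form and not from any API documentation, and trimming the prompt is an assumption (the page sends it untrimmed).

```typescript
interface SoundEffectForm {
  prompt: string;
  duration: number | null;
  promptInfluence: number;
  loop: boolean;
  outputFormat: string;
}

interface SoundEffectRequest {
  text: string;
  duration_seconds: number | null;
  prompt_influence: number;
  loop: boolean;
  output_format: string;
}

// Limits taken from the form's input attributes (assumed, not verified upstream).
const MIN_DURATION_SECONDS = 1;
const MAX_DURATION_SECONDS = 22;

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Validates the prompt and clamps numeric fields into the ranges the UI advertises.
function buildSoundEffectRequest(form: SoundEffectForm): SoundEffectRequest {
  const text = form.prompt.trim();
  if (text === '') {
    throw new Error('Please describe the sound effect');
  }
  // null (or an unparsable number) means "let the service pick the duration".
  const duration =
    form.duration === null || Number.isNaN(form.duration)
      ? null
      : clamp(Math.round(form.duration), MIN_DURATION_SECONDS, MAX_DURATION_SECONDS);
  return {
    text,
    duration_seconds: duration,
    prompt_influence: clamp(form.promptInfluence, 0, 1),
    loop: form.loop,
    output_format: form.outputFormat,
  };
}

const exampleRequest = buildSoundEffectRequest({
  prompt: 'Deep rolling thunder with distant rumbles',
  duration: 40,
  promptInfluence: 0.3,
  loop: false,
  outputFormat: 'mp3_44100_128',
});
console.log(exampleRequest.duration_seconds);
```

Server-side validation is still needed; this only keeps the client from sending values its own UI claims are impossible.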

@ -0,0 +1,429 @@
'use client';
import { useState, useEffect } from 'react';
import { toast } from 'react-hot-toast';
import { Volume2, Download, Sparkles, Play, Pause } from 'lucide-react';
import JobProgress from '@/components/JobProgress';
import { modulesApi, assetsApi } from '@/lib/api';
import { useStore } from '@/lib/store';
interface Voice {
voice_id: string;
name: string;
preview_url?: string;
category?: string;
labels?: Record<string, string>;
}
export default function TextToSpeechPage() {
const { addJob, updateJob } = useStore();
const [text, setText] = useState('');
const [voices, setVoices] = useState<Voice[]>([]);
const [selectedVoice, setSelectedVoice] = useState<string>('');
const [model, setModel] = useState('eleven_multilingual_v2');
const [stability, setStability] = useState(0.5);
const [similarityBoost, setSimilarityBoost] = useState(0.75);
const [style, setStyle] = useState(0);
const [speed, setSpeed] = useState(1.0);
const [useSpeakerBoost, setUseSpeakerBoost] = useState(true);
const [outputFormat, setOutputFormat] = useState('mp3_44100_128');
const [jobId, setJobId] = useState<string | null>(null);
const [generatedAudio, setGeneratedAudio] = useState<any>(null);
const [loading, setLoading] = useState(false);
const [loadingVoices, setLoadingVoices] = useState(true);
const [previewPlaying, setPreviewPlaying] = useState<string | null>(null);
const [previewAudio, setPreviewAudio] = useState<HTMLAudioElement | null>(null);
const [showAllVoices, setShowAllVoices] = useState(false);
useEffect(() => {
const fetchVoices = async () => {
try {
const response = await modulesApi.getVoices();
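let allVoices: Voice[] = response.data;
// Filter to show only saved voices if not showing all
if (!showAllVoices) {
let savedVoiceIds: string[] = [];
try {
const parsed = JSON.parse(localStorage.getItem('savedVoices') || '[]');
if (Array.isArray(parsed)) savedVoiceIds = parsed;
} catch {
// Corrupt entry: treat the library as empty
}
if (savedVoiceIds.length > 0) {
allVoices = allVoices.filter((v) => savedVoiceIds.includes(v.voice_id));
}
}
setVoices(allVoices);
// Keep the current selection only if it is still in the visible list
if (allVoices.length > 0 && !allVoices.some((v) => v.voice_id === selectedVoice)) {
setSelectedVoice(allVoices[0].voice_id);
}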
} catch (err) {
toast.error('Failed to load voices');
} finally {
setLoadingVoices(false);
}
};
fetchVoices();
}, [showAllVoices]);
const handleGenerate = async () => {
if (!text.trim()) {
toast.error('Please enter some text');
return;
}
if (!selectedVoice) {
toast.error('Please select a voice');
return;
}
setLoading(true);
setGeneratedAudio(null);
try {
const response = await modulesApi.textToSpeech({
text,
voice_id: selectedVoice,
model_id: model,
stability,
similarity_boost: similarityBoost,
style,
speed,
use_speaker_boost: useSpeakerBoost,
output_format: outputFormat,
});
const job = response.data;
setJobId(job.id);
addJob({
id: job.id,
module: 'text_to_speech',
status: job.status,
progress: job.progress,
created_at: job.created_at,
});
toast.success('Speech synthesis started!');
} catch (err: any) {
toast.error(err.response?.data?.detail || 'Failed to start synthesis');
setLoading(false);
}
};
const handleJobComplete = async (job: any) => {
setLoading(false);
updateJob(job.id, { status: 'completed', progress: 100 });
if (!job.output_asset_ids?.[0]) return;
try {
const asset = await assetsApi.get(job.output_asset_ids[0]);
setGeneratedAudio(asset.data);
toast.success('Audio generated successfully!');
} catch (err) {
// The job finished, but fetching the asset record failed
toast.error('Audio was generated but could not be loaded');
}
};
const handleJobError = (error: string) => {
setLoading(false);
toast.error(error);
};
const handleDownload = async () => {
if (!generatedAudio) return;
try {
const response = await assetsApi.download(generatedAudio.id);
const url = window.URL.createObjectURL(response.data);
const a = document.createElement('a');
a.href = url;
a.download = generatedAudio.original_filename;
a.click();
window.URL.revokeObjectURL(url);
} catch (err) {
toast.error('Failed to download audio');
}
};
const playPreview = (previewUrl: string, voiceId: string) => {
// Stop any currently playing audio
if (previewAudio) {
previewAudio.pause();
previewAudio.currentTime = 0;
}
if (previewPlaying === voiceId) {
setPreviewPlaying(null);
setPreviewAudio(null);
return;
}
const audio = new Audio(previewUrl);
audio.onended = () => {
setPreviewPlaying(null);
setPreviewAudio(null);
};
setPreviewAudio(audio);
setPreviewPlaying(voiceId);
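audio.play().catch((err) => {
// Ignore the AbortError raised when playback is stopped before it starts
if (err?.name === 'AbortError') return;
setPreviewPlaying(null);
setPreviewAudio(null);
toast.error('Failed to play preview');
});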
};
const selectedVoiceData = voices.find((v) => v.voice_id === selectedVoice);
return (
<div className="max-w-6xl mx-auto space-y-8">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-forge-yellow/10 rounded-lg flex items-center justify-center">
<Volume2 className="w-6 h-6 text-forge-yellow" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">Text to Speech</h1>
<p className="text-gray-500">Convert text to natural speech with ElevenLabs</p>
</div>
</div>
<div className="grid grid-cols-1 lg:grid-cols-2 gap-8">
{/* Controls */}
<div className="space-y-6">
{/* Text Input */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Text to Speak
</label>
<textarea
value={text}
onChange={(e) => setText(e.target.value)}
placeholder="Enter the text you want to convert to speech..."
className="input-field min-h-[200px] resize-none"
maxLength={5000}
/>
<p className="mt-1 text-xs text-gray-500">{text.length}/5000 characters</p>
</div>
{/* Voice Selection */}
<div>
<div className="flex items-center justify-between mb-2">
<label className="block text-sm font-medium text-gray-300">
Voice
</label>
<button
onClick={() => setShowAllVoices(!showAllVoices)}
className="text-xs text-forge-yellow hover:text-yellow-400"
>
{showAllVoices ? 'Show Library Only' : 'Show All Voices'}
</button>
</div>
{loadingVoices ? (
<div className="text-gray-500">Loading voices...</div>
) : voices.length === 0 ? (
<div className="bg-forge-dark border border-gray-700 rounded-lg p-4 text-center">
<p className="text-gray-500 text-sm mb-2">
No voices in your library
</p>
<button
onClick={() => setShowAllVoices(true)}
className="text-xs text-forge-yellow hover:text-yellow-400"
>
Show all voices
</button>
</div>
) : (
<div className="space-y-2">
<select
value={selectedVoice}
onChange={(e) => setSelectedVoice(e.target.value)}
className="select-field"
>
{voices.map((voice) => (
<option key={voice.voice_id} value={voice.voice_id}>
{voice.name} {voice.category && `(${voice.category})`}
</option>
))}
</select>
{selectedVoiceData?.preview_url && (
<button
onClick={() => playPreview(selectedVoiceData.preview_url!, selectedVoiceData.voice_id)}
className="flex items-center gap-2 text-sm text-forge-yellow hover:text-yellow-400"
>
{previewPlaying === selectedVoiceData.voice_id ? (
<Pause className="w-4 h-4" />
) : (
<Play className="w-4 h-4" />
)}
Preview voice
</button>
)}
</div>
)}
</div>
{/* Model */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Model
</label>
<select
value={model}
onChange={(e) => setModel(e.target.value)}
className="select-field"
>
<option value="eleven_multilingual_v2">Multilingual V2 (Best quality)</option>
<option value="eleven_flash_v2_5">Flash V2.5 (Ultra low latency)</option>
<option value="eleven_turbo_v2_5">Turbo V2.5 (Fast, low latency)</option>
<option value="eleven_v3">V3 (Latest, most natural)</option>
<option value="eleven_turbo_v2">Turbo V2</option>
<option value="eleven_multilingual_sts_v2">Multilingual STS V2</option>
<option value="eleven_monolingual_v1">English V1 (Legacy)</option>
</select>
</div>
{/* Output Format */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Output Format
</label>
<select
value={outputFormat}
onChange={(e) => setOutputFormat(e.target.value)}
className="select-field"
>
<option value="mp3_44100_128">MP3 44.1kHz 128kbps (Recommended)</option>
<option value="mp3_44100_192">MP3 44.1kHz 192kbps (High Quality)</option>
<option value="mp3_44100_64">MP3 44.1kHz 64kbps (Small Size)</option>
<option value="pcm_16000">PCM 16kHz (Uncompressed)</option>
<option value="pcm_22050">PCM 22.05kHz (Uncompressed)</option>
<option value="pcm_24000">PCM 24kHz (Uncompressed)</option>
<option value="pcm_44100">PCM 44.1kHz (Uncompressed)</option>
<option value="ulaw_8000">uLaw 8kHz (Telephony)</option>
</select>
</div>
{/* Voice Settings */}
<div className="space-y-4">
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Stability: {stability.toFixed(2)}
</label>
<input
type="range"
min={0}
max={1}
step={0.05}
value={stability}
onChange={(e) => setStability(parseFloat(e.target.value))}
className="w-full accent-forge-yellow"
/>
<p className="text-xs text-gray-500 mt-1">Higher = more consistent, Lower = more expressive</p>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Clarity/Similarity: {similarityBoost.toFixed(2)}
</label>
<input
type="range"
min={0}
max={1}
step={0.05}
value={similarityBoost}
onChange={(e) => setSimilarityBoost(parseFloat(e.target.value))}
className="w-full accent-forge-yellow"
/>
<p className="text-xs text-gray-500 mt-1">Higher = closer to original voice</p>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Style/Exaggeration: {style.toFixed(2)}
</label>
<input
type="range"
min={0}
max={1}
step={0.05}
value={style}
onChange={(e) => setStyle(parseFloat(e.target.value))}
className="w-full accent-forge-yellow"
/>
<p className="text-xs text-gray-500 mt-1">Higher = more dramatic delivery</p>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Speed: {speed.toFixed(2)}x
</label>
<input
type="range"
min={0.25}
max={4.0}
step={0.05}
value={speed}
onChange={(e) => setSpeed(parseFloat(e.target.value))}
className="w-full accent-forge-yellow"
/>
<p className="text-xs text-gray-500 mt-1">Playback speed multiplier</p>
</div>
<div className="flex items-center gap-3">
<input
type="checkbox"
id="speakerBoost"
checked={useSpeakerBoost}
onChange={(e) => setUseSpeakerBoost(e.target.checked)}
className="w-4 h-4 rounded border-gray-600 bg-forge-dark text-forge-yellow focus:ring-forge-yellow"
/>
<label htmlFor="speakerBoost" className="text-gray-300 text-sm">
Speaker Boost (Enhances audio quality)
</label>
</div>
</div>
{/* Generate Button */}
<button
onClick={handleGenerate}
disabled={loading || !text.trim() || !selectedVoice}
className="btn-primary w-full flex items-center justify-center gap-2 disabled:opacity-50 disabled:cursor-not-allowed"
>
<Sparkles className="w-5 h-5" />
{loading ? 'Generating...' : 'Generate Speech'}
</button>
{/* Job Progress */}
{jobId && loading && (
<JobProgress
jobId={jobId}
onComplete={handleJobComplete}
onError={handleJobError}
/>
)}
</div>
{/* Results */}
<div>
<h2 className="text-lg font-semibold text-white mb-4">Generated Audio</h2>
{generatedAudio ? (
<div className="bg-forge-dark rounded-xl overflow-hidden border border-gray-800">
<div className="p-6">
<audio
src={`/api/v1/assets/${generatedAudio.id}/download`}
controls
className="w-full"
/>
</div>
<div className="p-4 border-t border-gray-800">
<div className="flex items-center justify-between">
<div>
<p className="text-white font-medium">{generatedAudio.original_filename}</p>
<p className="text-sm text-gray-500">
{(generatedAudio.file_size_bytes / 1024).toFixed(1)} KB
</p>
</div>
<button
onClick={handleDownload}
className="btn-primary flex items-center gap-2"
>
<Download className="w-4 h-4" />
Download
</button>
</div>
</div>
</div>
) : (
<div className="bg-forge-dark rounded-xl border border-gray-800 p-8 text-center">
<Volume2 className="w-12 h-12 text-gray-600 mx-auto mb-3" />
<p className="text-gray-500">Generated audio will appear here</p>
</div>
)}
</div>
</div>
</div>
);
}


@ -0,0 +1,335 @@
'use client';
import { useState } from 'react';
import { toast } from 'react-hot-toast';
import { Type, Download, Sparkles, Copy, Check } from 'lucide-react';
import FileUpload from '@/components/FileUpload';
import JobProgress from '@/components/JobProgress';
import { modulesApi, assetsApi } from '@/lib/api';
import { useStore } from '@/lib/store';
const outputFormats = [
{ value: 'txt', label: 'Plain Text (.txt)' },
{ value: 'srt', label: 'SRT Subtitles (.srt)' },
{ value: 'vtt', label: 'WebVTT (.vtt)' },
];
const targetLanguages = [
{ value: '', label: 'No translation' },
{ value: 'EN-US', label: 'English (US)' },
{ value: 'EN-GB', label: 'English (UK)' },
{ value: 'ES', label: 'Spanish' },
{ value: 'FR', label: 'French' },
{ value: 'DE', label: 'German' },
{ value: 'IT', label: 'Italian' },
{ value: 'PT-BR', label: 'Portuguese (Brazil)' },
{ value: 'JA', label: 'Japanese' },
{ value: 'KO', label: 'Korean' },
{ value: 'ZH', label: 'Chinese' },
];
export default function VoiceToTextPage() {
const { addJob, updateJob } = useStore();
const [file, setFile] = useState<File | null>(null);
const [assetId, setAssetId] = useState<string | null>(null);
const [outputFormat, setOutputFormat] = useState('txt');
const [translate, setTranslate] = useState(false);
const [targetLanguage, setTargetLanguage] = useState('EN-US');
const [jobId, setJobId] = useState<string | null>(null);
const [results, setResults] = useState<any>(null);
const [loading, setLoading] = useState(false);
const [uploading, setUploading] = useState(false);
const [copied, setCopied] = useState(false);
const handleFileUpload = async (uploadedFile: File) => {
setFile(uploadedFile);
setUploading(true);
try {
const response = await assetsApi.upload(uploadedFile);
setAssetId(response.data.id);
toast.success('Audio uploaded!');
} catch (err) {
toast.error('Failed to upload audio');
setFile(null);
} finally {
setUploading(false);
}
};
const handleTranscribe = async () => {
if (!assetId) {
toast.error('Please upload an audio file first');
return;
}
setLoading(true);
setResults(null);
try {
const response = await modulesApi.voiceToText({
asset_id: assetId,
output_format: outputFormat,
translate,
target_language: translate ? targetLanguage : undefined,
});
const job = response.data;
setJobId(job.id);
addJob({
id: job.id,
module: 'voice_to_text',
status: job.status,
progress: job.progress,
created_at: job.created_at,
});
toast.success('Transcription started!');
} catch (err: any) {
toast.error(err.response?.data?.detail || 'Failed to start transcription');
setLoading(false);
}
};
const handleJobComplete = async (job: any) => {
setLoading(false);
updateJob(job.id, { status: 'completed', progress: 100 });
if (job.output_data) {
try {
const assets = job.output_asset_ids
? await Promise.all(
job.output_asset_ids.map(async (id: string) => {
const asset = await assetsApi.get(id);
return asset.data;
})
)
: [];
setResults({
text: job.output_data.text,
translatedText: job.output_data.translated_text,
language: job.output_data.language,
assets,
});
toast.success('Transcription completed!');
} catch (err) {
// The job finished, but fetching one of the output asset records failed
toast.error('Transcription finished, but its output files could not be loaded');
}
}
};
const handleJobError = (error: string) => {
setLoading(false);
toast.error(error);
};
const handleDownload = async (asset: any) => {
try {
const response = await assetsApi.download(asset.id);
const url = window.URL.createObjectURL(response.data);
const a = document.createElement('a');
a.href = url;
a.download = asset.original_filename;
a.click();
window.URL.revokeObjectURL(url);
} catch (err) {
toast.error('Failed to download file');
}
};
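const copyToClipboard = async (text: string) => {
try {
// writeText returns a promise and can reject (e.g. permission denied, insecure context)
await navigator.clipboard.writeText(text);
setCopied(true);
toast.success('Copied to clipboard!');
setTimeout(() => setCopied(false), 2000);
} catch (err) {
toast.error('Failed to copy to clipboard');
}
};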
return (
<div className="max-w-6xl mx-auto space-y-8">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-forge-yellow/10 rounded-lg flex items-center justify-center">
<Type className="w-6 h-6 text-forge-yellow" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">Voice to Text</h1>
<p className="text-gray-500">Transcribe audio with Whisper AI</p>
</div>
</div>
<div className="grid grid-cols-1 lg:grid-cols-2 gap-8">
{/* Controls */}
<div className="space-y-6">
{/* File Upload */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Upload Audio
</label>
<FileUpload
onUpload={handleFileUpload}
accept={{ 'audio/*': ['.mp3', '.wav', '.m4a', '.flac', '.ogg'] }}
currentFile={file}
onClear={() => {
setFile(null);
setAssetId(null);
}}
label="Upload an audio file to transcribe"
/>
{uploading && (
<p className="mt-2 text-sm text-forge-yellow">Uploading...</p>
)}
</div>
{/* Output Format */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Output Format
</label>
<select
value={outputFormat}
onChange={(e) => setOutputFormat(e.target.value)}
className="select-field"
>
{outputFormats.map((format) => (
<option key={format.value} value={format.value}>
{format.label}
</option>
))}
</select>
</div>
{/* Translation */}
<div className="space-y-3">
<div className="flex items-center gap-3">
<input
type="checkbox"
id="translate"
checked={translate}
onChange={(e) => setTranslate(e.target.checked)}
className="w-4 h-4 rounded border-gray-600 bg-forge-dark text-forge-yellow focus:ring-forge-yellow"
/>
<label htmlFor="translate" className="text-gray-300">
Translate to another language
</label>
</div>
{translate && (
<select
value={targetLanguage}
onChange={(e) => setTargetLanguage(e.target.value)}
className="select-field"
>
{targetLanguages.filter((l) => l.value).map((lang) => (
<option key={lang.value} value={lang.value}>
{lang.label}
</option>
))}
</select>
)}
</div>
{/* Transcribe Button */}
<button
onClick={handleTranscribe}
disabled={loading || !assetId || uploading}
className="btn-primary w-full flex items-center justify-center gap-2 disabled:opacity-50 disabled:cursor-not-allowed"
>
<Sparkles className="w-5 h-5" />
{loading ? 'Transcribing...' : 'Transcribe Audio'}
</button>
{/* Job Progress */}
{jobId && loading && (
<JobProgress
jobId={jobId}
onComplete={handleJobComplete}
onError={handleJobError}
/>
)}
</div>
{/* Results */}
<div>
<h2 className="text-lg font-semibold text-white mb-4">Transcription</h2>
{results ? (
<div className="space-y-4">
{/* Download Files */}
{results.assets.length > 0 && (
<div className="bg-forge-dark rounded-xl border border-gray-800 p-4">
<h3 className="text-white font-medium mb-3">Download Files</h3>
<div className="space-y-2">
{results.assets.map((asset: any) => (
<div
key={asset.id}
className="flex items-center justify-between p-3 bg-forge-gray rounded-lg"
>
<div>
<p className="text-white text-sm">{asset.original_filename}</p>
<p className="text-xs text-gray-500">
{asset.metadata?.type === 'translated' ? 'Translated' : 'Original'}
</p>
</div>
<button
onClick={() => handleDownload(asset)}
className="p-2 text-forge-yellow hover:bg-forge-yellow/10 rounded transition-colors"
>
<Download className="w-4 h-4" />
</button>
</div>
))}
</div>
</div>
)}
{/* Original Transcript */}
<div className="bg-forge-dark rounded-xl border border-gray-800 p-4">
<div className="flex items-center justify-between mb-3">
<h3 className="text-white font-medium">
Original Text
{results.language && (
<span className="text-gray-500 text-sm ml-2">
(Detected: {results.language})
</span>
)}
</h3>
<button
onClick={() => copyToClipboard(results.text)}
className="p-2 text-gray-400 hover:text-forge-yellow transition-colors"
>
{copied ? <Check className="w-4 h-4" /> : <Copy className="w-4 h-4" />}
</button>
</div>
<div className="max-h-64 overflow-y-auto">
<p className="text-gray-300 text-sm whitespace-pre-wrap">
{results.text}
</p>
</div>
</div>
{/* Translated Text */}
{results.translatedText && (
<div className="bg-forge-dark rounded-xl border border-gray-800 p-4">
<div className="flex items-center justify-between mb-3">
<h3 className="text-white font-medium">Translated Text</h3>
<button
onClick={() => copyToClipboard(results.translatedText)}
className="p-2 text-gray-400 hover:text-forge-yellow transition-colors"
>
<Copy className="w-4 h-4" />
</button>
</div>
<div className="max-h-64 overflow-y-auto">
<p className="text-gray-300 text-sm whitespace-pre-wrap">
{results.translatedText}
</p>
</div>
</div>
)}
</div>
) : (
<div className="bg-forge-dark rounded-xl border border-gray-800 p-8 text-center">
<Type className="w-12 h-12 text-gray-600 mx-auto mb-3" />
<p className="text-gray-500">Transcription will appear here</p>
</div>
)}
</div>
</div>
</div>
);
}

507
frontend/app/files/page.tsx Normal file

@ -0,0 +1,507 @@
'use client';
import { useState, useEffect } from 'react';
import { toast } from 'react-hot-toast';
import {
FolderOpen,
Upload,
Download,
Trash2,
Search,
Image as ImageIcon,
Video,
Mic,
FileText,
Grid,
List,
Loader2,
Eye,
} from 'lucide-react';
import FileUpload from '@/components/FileUpload';
import api, { assetsApi } from '@/lib/api';
import { clsx } from 'clsx';
interface Asset {
id: string;
filename: string;
file_type: string;
mime_type: string;
width?: number;
height?: number;
thumbnail_url: string | null;
file_url: string;
created_at: string;
source_module?: string;
}
const FILE_TYPE_ICONS = {
image: ImageIcon,
video: Video,
audio: Mic,
document: FileText,
};
const FILE_TYPE_COLORS = {
image: 'text-blue-400',
video: 'text-purple-400',
audio: 'text-green-400',
document: 'text-orange-400',
};
export default function MyFilesPage() {
const [assets, setAssets] = useState<Asset[]>([]);
const [loading, setLoading] = useState(true);
const [search, setSearch] = useState('');
const [selectedType, setSelectedType] = useState<string | null>(null);
const [viewMode, setViewMode] = useState<'grid' | 'list'>('grid');
const [page, setPage] = useState(1);
const [totalPages, setTotalPages] = useState(1);
const [showUpload, setShowUpload] = useState(false);
const [selectedAsset, setSelectedAsset] = useState<Asset | null>(null);
const [uploading, setUploading] = useState(false);
useEffect(() => {
loadAssets();
}, [search, selectedType, page]);
const loadAssets = async () => {
setLoading(true);
try {
const params: any = { page, limit: 24 };
if (selectedType) params.file_types = selectedType;
if (search) params.search = search;
const response = await api.get('/assets/library', { params });
setAssets(response.data.items);
setTotalPages(response.data.pages);
} catch (error) {
toast.error('Failed to load files');
} finally {
setLoading(false);
}
};
const handleUpload = async (file: File) => {
setUploading(true);
try {
await assetsApi.upload(file);
toast.success('File uploaded!');
loadAssets();
setShowUpload(false);
} catch (error) {
toast.error('Failed to upload file');
} finally {
setUploading(false);
}
};
const handleDownload = async (asset: Asset, e?: React.MouseEvent) => {
e?.stopPropagation();
try {
const response = await assetsApi.download(asset.id);
// Create blob and download
const blob = new Blob([response.data], { type: response.headers['content-type'] || asset.mime_type });
const url = window.URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = asset.filename || 'download';
document.body.appendChild(a);
a.click();
document.body.removeChild(a);
window.URL.revokeObjectURL(url);
toast.success('Download started');
} catch (error) {
console.error('Download error:', error);
toast.error('Failed to download file');
}
};
const handleDelete = async (asset: Asset, e?: React.MouseEvent) => {
e?.stopPropagation();
if (!confirm(`Delete "${asset.filename}"?`)) return;
try {
await assetsApi.delete(asset.id);
toast.success('File deleted');
loadAssets();
if (selectedAsset?.id === asset.id) setSelectedAsset(null);
} catch (error) {
toast.error('Failed to delete file');
}
};
const formatDate = (dateStr: string) => {
return new Date(dateStr).toLocaleDateString('en-US', {
month: 'short',
day: 'numeric',
year: 'numeric',
hour: '2-digit',
minute: '2-digit',
});
};
const formatSize = (bytes?: number) => {
if (!bytes) return 'Unknown';
if (bytes < 1024) return `${bytes} B`;
if (bytes < 1024 * 1024) return `${(bytes / 1024).toFixed(1)} KB`;
return `${(bytes / (1024 * 1024)).toFixed(1)} MB`;
};
return (
<div className="max-w-7xl mx-auto space-y-6">
{/* Header */}
<div className="flex items-center justify-between">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-forge-yellow/10 rounded-lg flex items-center justify-center">
<FolderOpen className="w-6 h-6 text-forge-yellow" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">My Files</h1>
<p className="text-gray-500">Manage your uploaded and generated assets</p>
</div>
</div>
<button
onClick={() => setShowUpload(!showUpload)}
className="btn-primary flex items-center gap-2"
>
<Upload className="w-4 h-4" />
Upload
</button>
</div>
{/* Upload Area */}
{showUpload && (
<div className="bg-forge-dark rounded-xl border border-gray-800 p-6">
<FileUpload
onUpload={handleUpload}
accept={{
'image/*': ['.png', '.jpg', '.jpeg', '.webp', '.gif'],
'video/*': ['.mp4', '.webm', '.mov'],
'audio/*': ['.mp3', '.wav', '.ogg'],
}}
label="Drop files here or click to upload"
/>
{uploading && (
<div className="mt-4 flex items-center gap-2 text-forge-yellow">
<Loader2 className="w-4 h-4 animate-spin" />
Uploading...
</div>
)}
</div>
)}
{/* Filters */}
<div className="flex items-center gap-4 flex-wrap">
{/* Search */}
<div className="relative flex-1 min-w-[200px]">
<Search className="absolute left-3 top-1/2 -translate-y-1/2 w-4 h-4 text-gray-500" />
<input
type="text"
placeholder="Search files..."
value={search}
onChange={(e) => {
setSearch(e.target.value);
setPage(1);
}}
className="w-full pl-10 pr-4 py-2 bg-forge-dark border border-gray-700 rounded-lg text-white placeholder-gray-500 focus:border-forge-yellow focus:outline-none"
/>
</div>
{/* Type Filters */}
<div className="flex items-center gap-2">
<button
onClick={() => {
setSelectedType(null);
setPage(1);
}}
className={clsx(
'px-3 py-2 rounded-lg text-sm font-medium transition-colors',
!selectedType
? 'bg-forge-yellow text-black'
: 'bg-forge-dark border border-gray-700 text-gray-400 hover:text-white'
)}
>
All
</button>
{['image', 'video', 'audio'].map((type) => {
const Icon = FILE_TYPE_ICONS[type as keyof typeof FILE_TYPE_ICONS];
return (
<button
key={type}
onClick={() => {
setSelectedType(type);
setPage(1);
}}
className={clsx(
'px-3 py-2 rounded-lg text-sm font-medium transition-colors flex items-center gap-2',
selectedType === type
? 'bg-forge-yellow text-black'
: 'bg-forge-dark border border-gray-700 text-gray-400 hover:text-white'
)}
>
<Icon className="w-4 h-4" />
{type.charAt(0).toUpperCase() + type.slice(1)}
</button>
);
})}
</div>
{/* View Toggle */}
<div className="flex items-center bg-forge-dark border border-gray-700 rounded-lg overflow-hidden">
<button
onClick={() => setViewMode('grid')}
className={clsx(
'p-2 transition-colors',
viewMode === 'grid' ? 'bg-forge-yellow text-black' : 'text-gray-400 hover:text-white'
)}
>
<Grid className="w-4 h-4" />
</button>
<button
onClick={() => setViewMode('list')}
className={clsx(
'p-2 transition-colors',
viewMode === 'list' ? 'bg-forge-yellow text-black' : 'text-gray-400 hover:text-white'
)}
>
<List className="w-4 h-4" />
</button>
</div>
</div>
{/* Content */}
{loading ? (
<div className="flex items-center justify-center h-64">
<Loader2 className="w-8 h-8 text-forge-yellow animate-spin" />
</div>
) : assets.length === 0 ? (
<div className="flex flex-col items-center justify-center h-64 text-gray-500">
<FolderOpen className="w-16 h-16 mb-4" />
<p className="text-lg">No files found</p>
<p className="text-sm">Upload files or generate content to see them here</p>
</div>
) : viewMode === 'grid' ? (
<div className="grid grid-cols-2 md:grid-cols-4 lg:grid-cols-6 gap-4">
{assets.map((asset) => {
const Icon = FILE_TYPE_ICONS[asset.file_type as keyof typeof FILE_TYPE_ICONS] || FileText;
const colorClass = FILE_TYPE_COLORS[asset.file_type as keyof typeof FILE_TYPE_COLORS] || 'text-gray-400';
return (
<div
key={asset.id}
className="bg-forge-dark rounded-lg border border-gray-800 overflow-hidden hover:border-gray-700 transition-colors group"
>
{/* Thumbnail */}
<div
className="aspect-square relative cursor-pointer"
onClick={() => setSelectedAsset(asset)}
>
{asset.thumbnail_url || asset.file_type === 'image' ? (
<img
src={asset.thumbnail_url || `/api/v1/assets/${asset.id}/download`}
alt={asset.filename}
className="w-full h-full object-cover"
onError={(e) => {
(e.target as HTMLImageElement).style.display = 'none';
}}
/>
) : (
<div className="w-full h-full flex items-center justify-center bg-forge-gray">
<Icon className={clsx('w-12 h-12', colorClass)} />
</div>
)}
{/* Hover Actions */}
<div className="absolute inset-0 bg-black/60 opacity-0 group-hover:opacity-100 transition-opacity flex items-center justify-center gap-2">
<button
onClick={(e) => {
e.stopPropagation();
setSelectedAsset(asset);
}}
className="p-2 bg-white/20 rounded-full hover:bg-white/30"
>
<Eye className="w-4 h-4 text-white" />
</button>
<button
onClick={(e) => handleDownload(asset, e)}
className="p-2 bg-white/20 rounded-full hover:bg-white/30"
>
<Download className="w-4 h-4 text-white" />
</button>
<button
onClick={(e) => handleDelete(asset, e)}
className="p-2 bg-red-500/50 rounded-full hover:bg-red-500/70"
>
<Trash2 className="w-4 h-4 text-white" />
</button>
</div>
</div>
{/* Info */}
<div className="p-2">
<p className="text-sm text-white truncate">{asset.filename}</p>
<p className="text-xs text-gray-500">{formatDate(asset.created_at)}</p>
</div>
</div>
);
})}
</div>
) : (
<div className="bg-forge-dark rounded-xl border border-gray-800 overflow-hidden">
<table className="w-full">
<thead className="bg-forge-gray">
<tr>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-400 uppercase">Name</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-400 uppercase">Type</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-400 uppercase">Source</th>
<th className="px-4 py-3 text-left text-xs font-medium text-gray-400 uppercase">Date</th>
<th className="px-4 py-3 text-right text-xs font-medium text-gray-400 uppercase">Actions</th>
</tr>
</thead>
<tbody className="divide-y divide-gray-800">
{assets.map((asset) => {
const Icon = FILE_TYPE_ICONS[asset.file_type as keyof typeof FILE_TYPE_ICONS] || FileText;
const colorClass = FILE_TYPE_COLORS[asset.file_type as keyof typeof FILE_TYPE_COLORS] || 'text-gray-400';
return (
<tr key={asset.id} className="hover:bg-forge-gray/50">
<td className="px-4 py-3">
<div className="flex items-center gap-3">
<Icon className={clsx('w-5 h-5', colorClass)} />
<span className="text-white">{asset.filename}</span>
</div>
</td>
<td className="px-4 py-3 text-gray-400 capitalize">{asset.file_type}</td>
<td className="px-4 py-3 text-gray-400">
{asset.source_module?.replace(/_/g, ' ') || 'Upload'}
</td>
<td className="px-4 py-3 text-gray-400">{formatDate(asset.created_at)}</td>
<td className="px-4 py-3">
<div className="flex items-center justify-end gap-2">
<button
onClick={() => setSelectedAsset(asset)}
className="p-1 text-gray-400 hover:text-white"
>
<Eye className="w-4 h-4" />
</button>
<button
onClick={(e) => handleDownload(asset, e)}
className="p-1 text-gray-400 hover:text-forge-yellow"
>
<Download className="w-4 h-4" />
</button>
<button
onClick={(e) => handleDelete(asset, e)}
className="p-1 text-gray-400 hover:text-red-400"
>
<Trash2 className="w-4 h-4" />
</button>
</div>
</td>
</tr>
);
})}
</tbody>
</table>
</div>
)}
{/* Pagination */}
{totalPages > 1 && (
<div className="flex items-center justify-center gap-4">
<button
onClick={() => setPage((p) => Math.max(1, p - 1))}
disabled={page === 1}
className="px-4 py-2 bg-forge-dark border border-gray-700 rounded-lg text-gray-400 hover:text-white disabled:opacity-50"
>
Previous
</button>
<span className="text-gray-400">
Page {page} of {totalPages}
</span>
<button
onClick={() => setPage((p) => Math.min(totalPages, p + 1))}
disabled={page === totalPages}
className="px-4 py-2 bg-forge-dark border border-gray-700 rounded-lg text-gray-400 hover:text-white disabled:opacity-50"
>
Next
</button>
</div>
)}
{/* Preview Modal */}
{selectedAsset && (
<div
className="fixed inset-0 z-50 flex items-center justify-center bg-black/80 backdrop-blur-sm"
onClick={() => setSelectedAsset(null)}
>
<div
className="bg-forge-dark rounded-xl border border-gray-800 max-w-4xl max-h-[90vh] overflow-auto"
onClick={(e) => e.stopPropagation()}
>
{/* Preview Content */}
<div className="p-4">
{selectedAsset.file_type === 'image' && (
<img
src={`/api/v1/assets/${selectedAsset.id}/download`}
alt={selectedAsset.filename}
className="max-w-full max-h-[60vh] mx-auto rounded-lg"
/>
)}
{selectedAsset.file_type === 'video' && (
<video
src={`/api/v1/assets/${selectedAsset.id}/download`}
controls
autoPlay
className="max-w-full max-h-[60vh] mx-auto rounded-lg"
/>
)}
{selectedAsset.file_type === 'audio' && (
<div className="p-8 flex flex-col items-center">
<Mic className="w-16 h-16 text-green-400 mb-4" />
<audio
src={`/api/v1/assets/${selectedAsset.id}/download`}
controls
autoPlay
className="w-full"
/>
</div>
)}
</div>
{/* Info & Actions */}
<div className="p-4 border-t border-gray-800">
<div className="flex items-center justify-between">
<div>
<h3 className="text-white font-medium">{selectedAsset.filename}</h3>
<p className="text-sm text-gray-500">
{selectedAsset.file_type} {selectedAsset.width && `${selectedAsset.width}x${selectedAsset.height}`} {formatDate(selectedAsset.created_at)}
</p>
</div>
<div className="flex items-center gap-2">
<button
onClick={() => handleDownload(selectedAsset)}
className="btn-primary flex items-center gap-2"
>
<Download className="w-4 h-4" />
Download
</button>
<button
onClick={() => setSelectedAsset(null)}
className="px-4 py-2 bg-forge-gray text-gray-300 rounded-lg hover:text-white"
>
Close
</button>
</div>
</div>
</div>
</div>
</div>
)}
</div>
);
}

124
frontend/app/globals.css Normal file

@ -0,0 +1,124 @@
@import url('https://fonts.googleapis.com/css2?family=Montserrat:wght@300;400;500;600;700;800&display=swap');
@tailwind base;
@tailwind components;
@tailwind utilities;
:root {
--forge-yellow: #FFC407;
--forge-black: #000000;
--forge-dark: #111111;
--forge-gray: #1a1a1a;
--forge-gray-light: #2a2a2a;
}
* {
box-sizing: border-box;
padding: 0;
margin: 0;
}
html,
body {
max-width: 100vw;
overflow-x: hidden;
font-family: 'Montserrat', sans-serif;
background-color: var(--forge-black);
color: #ffffff;
}
a {
color: inherit;
text-decoration: none;
}
/* Custom scrollbar */
::-webkit-scrollbar {
width: 8px;
height: 8px;
}
::-webkit-scrollbar-track {
background: var(--forge-dark);
}
::-webkit-scrollbar-thumb {
background: var(--forge-gray-light);
border-radius: 4px;
}
::-webkit-scrollbar-thumb:hover {
background: var(--forge-yellow);
}
/* Custom button styles */
.btn-primary {
@apply bg-forge-yellow text-black font-semibold px-6 py-3 rounded-lg hover:bg-yellow-400 transition-all duration-200;
}
.btn-secondary {
@apply bg-forge-gray-light text-white font-medium px-6 py-3 rounded-lg hover:bg-forge-gray transition-all duration-200 border border-gray-700;
}
/* Module card styling */
.module-card {
@apply bg-forge-gray rounded-xl p-6 border border-gray-800 hover:border-forge-yellow transition-all duration-300 cursor-pointer;
}
.module-card:hover {
box-shadow: 0 0 20px rgba(255, 196, 7, 0.15);
}
/* Input styling */
.input-field {
@apply w-full bg-forge-dark border border-gray-700 rounded-lg px-4 py-3 text-white placeholder-gray-500 focus:outline-none focus:border-forge-yellow transition-colors;
}
/* Dropdown styling */
.select-field {
@apply w-full bg-forge-dark border border-gray-700 rounded-lg px-4 py-3 text-white focus:outline-none focus:border-forge-yellow transition-colors cursor-pointer;
}
/* File upload zone */
.upload-zone {
@apply border-2 border-dashed border-gray-600 rounded-xl p-8 text-center hover:border-forge-yellow transition-colors cursor-pointer;
}
.upload-zone.active {
@apply border-forge-yellow bg-forge-yellow/5;
}
/* Progress bar */
.progress-bar {
@apply h-2 bg-forge-dark rounded-full overflow-hidden;
}
.progress-bar-fill {
@apply h-full bg-forge-yellow transition-all duration-300;
}
/* Loading spinner */
.spinner {
@apply animate-spin rounded-full border-2 border-gray-600 border-t-forge-yellow;
}
/* Status badges */
.badge {
@apply px-3 py-1 rounded-full text-sm font-medium;
}
.badge-pending {
@apply bg-gray-700 text-gray-300;
}
.badge-processing {
@apply bg-blue-900/50 text-blue-400;
}
.badge-completed {
@apply bg-green-900/50 text-green-400;
}
.badge-failed {
@apply bg-red-900/50 text-red-400;
}


@ -0,0 +1,350 @@
'use client';
import { useState, useEffect } from 'react';
import { toast } from 'react-hot-toast';
import { History, Download, Eye, Trash2, Filter, Search } from 'lucide-react';
import { jobsApi, assetsApi } from '@/lib/api';
import { format } from 'date-fns';
const moduleFilters = [
{ value: '', label: 'All Modules' },
{ value: 'image_generation', label: 'Image Generation' },
{ value: 'image_upscaling', label: 'Image Upscaling' },
{ value: 'background_removal', label: 'Background Removal' },
{ value: 'video_generation', label: 'Video Generation' },
{ value: 'video_upscaling', label: 'Video Upscaling' },
{ value: 'subtitle_processor', label: 'Subtitles' },
{ value: 'text_to_speech', label: 'Text to Speech' },
{ value: 'voice_to_text', label: 'Voice to Text' },
{ value: 'alt_text_generator', label: 'Alt Text' },
{ value: 'prompt_studio', label: 'Prompt Studio' },
];
const statusFilters = [
{ value: '', label: 'All Status' },
{ value: 'completed', label: 'Completed' },
{ value: 'processing', label: 'Processing' },
{ value: 'pending', label: 'Pending' },
{ value: 'failed', label: 'Failed' },
];
export default function HistoryPage() {
const [jobs, setJobs] = useState<any[]>([]);
const [loading, setLoading] = useState(true);
const [moduleFilter, setModuleFilter] = useState('');
const [statusFilter, setStatusFilter] = useState('');
const [searchQuery, setSearchQuery] = useState('');
const [page, setPage] = useState(1);
const [totalPages, setTotalPages] = useState(1);
const [selectedJob, setSelectedJob] = useState<any>(null);
// Fetch one page of jobs, applying the active module and status filters
const fetchJobs = async () => {
setLoading(true);
try {
const params: any = {
page,
limit: 20,
};
if (moduleFilter) params.module = moduleFilter;
if (statusFilter) params.status = statusFilter;
const response = await jobsApi.list(params);
setJobs(response.data.items || []);
setTotalPages(Math.ceil((response.data.total || 0) / 20));
} catch (err) {
toast.error('Failed to load history');
} finally {
setLoading(false);
}
};
useEffect(() => {
fetchJobs();
}, [page, moduleFilter, statusFilter]);
const handleDownload = async (assetId: string, filename: string) => {
try {
const response = await assetsApi.download(assetId);
const url = window.URL.createObjectURL(response.data);
const a = document.createElement('a');
a.href = url;
a.download = filename;
a.click();
window.URL.revokeObjectURL(url);
} catch (err) {
toast.error('Failed to download');
}
};
const getStatusColor = (status: string) => {
switch (status) {
case 'completed':
return 'bg-green-900/50 text-green-400';
case 'processing':
return 'bg-blue-900/50 text-blue-400';
case 'pending':
return 'bg-gray-700 text-gray-300';
case 'failed':
return 'bg-red-900/50 text-red-400';
default:
return 'bg-gray-700 text-gray-300';
}
};
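// Turn a snake_case module key into a title-cased label, e.g. 'image_generation' -> 'Image Generation'
const formatModuleName = (module: string) => {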
return module?.replace(/_/g, ' ').replace(/\b\w/g, (l) => l.toUpperCase()) || '-';
};
return (
<div className="space-y-6">
{/* Header */}
<div className="flex items-center justify-between">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-forge-yellow/10 rounded-lg flex items-center justify-center">
<History className="w-6 h-6 text-forge-yellow" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">Work History</h1>
<p className="text-gray-500">View and manage your past jobs</p>
</div>
</div>
</div>
{/* Filters */}
<div className="flex flex-wrap gap-4">
<div className="flex-1 min-w-[200px]">
<div className="relative">
<Search className="absolute left-3 top-1/2 -translate-y-1/2 w-5 h-5 text-gray-500" />
<input
type="text"
value={searchQuery}
onChange={(e) => setSearchQuery(e.target.value)}
placeholder="Search jobs..."
className="input-field pl-10"
/>
</div>
</div>
<select
value={moduleFilter}
onChange={(e) => {
setModuleFilter(e.target.value);
setPage(1);
}}
className="select-field w-48"
>
{moduleFilters.map((filter) => (
<option key={filter.value} value={filter.value}>
{filter.label}
</option>
))}
</select>
<select
value={statusFilter}
onChange={(e) => {
setStatusFilter(e.target.value);
setPage(1);
}}
className="select-field w-40"
>
{statusFilters.map((filter) => (
<option key={filter.value} value={filter.value}>
{filter.label}
</option>
))}
</select>
</div>
{/* Jobs Table */}
<div className="bg-forge-dark rounded-xl border border-gray-800 overflow-hidden">
{loading ? (
<div className="p-8 text-center text-gray-500">Loading...</div>
) : jobs.length === 0 ? (
<div className="p-8 text-center text-gray-500">No jobs found</div>
) : (
<table className="w-full">
<thead>
<tr className="border-b border-gray-800">
<th className="text-left px-6 py-4 text-sm font-medium text-gray-500">
Module
</th>
<th className="text-left px-6 py-4 text-sm font-medium text-gray-500">
Status
</th>
<th className="text-left px-6 py-4 text-sm font-medium text-gray-500">
Provider
</th>
<th className="text-left px-6 py-4 text-sm font-medium text-gray-500">
Created
</th>
<th className="text-left px-6 py-4 text-sm font-medium text-gray-500">
Duration
</th>
<th className="text-right px-6 py-4 text-sm font-medium text-gray-500">
Actions
</th>
</tr>
</thead>
<tbody>
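{jobs
.filter((job) => {
// Apply the search box client-side (current page only) against module, provider, and job ID
const q = searchQuery.trim().toLowerCase();
return !q || [job.module, job.api_provider, job.id].some((v) => String(v ?? '').toLowerCase().includes(q));
})
.map((job) => (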
<tr
key={job.id}
className="border-b border-gray-800 last:border-0 hover:bg-forge-gray/50"
>
<td className="px-6 py-4">
<span className="text-white font-medium">
{formatModuleName(job.module)}
</span>
</td>
<td className="px-6 py-4">
<span className={`badge ${getStatusColor(job.status)}`}>
{job.status}
</span>
</td>
<td className="px-6 py-4 text-gray-400">
{job.api_provider || '-'}
{job.api_model && (
<span className="text-gray-600 text-xs block">
{job.api_model}
</span>
)}
</td>
<td className="px-6 py-4 text-gray-400 text-sm">
{format(new Date(job.created_at), 'MMM d, yyyy HH:mm')}
</td>
<td className="px-6 py-4 text-gray-400 text-sm">
{job.completed_at
? `${Math.round(
(new Date(job.completed_at).getTime() -
new Date(job.created_at).getTime()) /
1000
)}s`
: '-'}
</td>
<td className="px-6 py-4">
<div className="flex items-center justify-end gap-2">
<button
onClick={() => setSelectedJob(job)}
className="p-2 text-gray-400 hover:text-forge-yellow transition-colors"
title="View details"
>
<Eye className="w-4 h-4" />
</button>
{job.output_asset_ids?.[0] && (
<button
onClick={() =>
handleDownload(
job.output_asset_ids[0],
`output_${job.id}`
)
}
className="p-2 text-gray-400 hover:text-forge-yellow transition-colors"
title="Download"
>
<Download className="w-4 h-4" />
</button>
)}
</div>
</td>
</tr>
))}
</tbody>
</table>
)}
</div>
{/* Pagination */}
{totalPages > 1 && (
<div className="flex items-center justify-center gap-2">
<button
onClick={() => setPage((p) => Math.max(1, p - 1))}
disabled={page === 1}
className="btn-secondary px-4 py-2 disabled:opacity-50"
>
Previous
</button>
<span className="text-gray-400 px-4">
Page {page} of {totalPages}
</span>
<button
onClick={() => setPage((p) => Math.min(totalPages, p + 1))}
disabled={page === totalPages}
className="btn-secondary px-4 py-2 disabled:opacity-50"
>
Next
</button>
</div>
)}
{/* Job Detail Modal */}
{selectedJob && (
<div className="fixed inset-0 bg-black/60 flex items-center justify-center z-50 p-4">
<div className="bg-forge-dark rounded-xl border border-gray-800 max-w-2xl w-full max-h-[80vh] overflow-y-auto">
<div className="p-6 border-b border-gray-800 flex items-center justify-between">
<h3 className="text-lg font-semibold text-white">Job Details</h3>
<button
onClick={() => setSelectedJob(null)}
className="text-gray-400 hover:text-white"
>
&times;
</button>
</div>
<div className="p-6 space-y-4">
<div className="grid grid-cols-2 gap-4">
<div>
<p className="text-sm text-gray-500">Job ID</p>
<p className="text-white font-mono text-sm">{selectedJob.id}</p>
</div>
<div>
<p className="text-sm text-gray-500">Module</p>
<p className="text-white">{formatModuleName(selectedJob.module)}</p>
</div>
<div>
<p className="text-sm text-gray-500">Status</p>
<span className={`badge ${getStatusColor(selectedJob.status)}`}>
{selectedJob.status}
</span>
</div>
<div>
<p className="text-sm text-gray-500">Progress</p>
<p className="text-white">{selectedJob.progress}%</p>
</div>
<div>
<p className="text-sm text-gray-500">Provider</p>
<p className="text-white">{selectedJob.api_provider || '-'}</p>
</div>
<div>
<p className="text-sm text-gray-500">Model</p>
<p className="text-white">{selectedJob.api_model || '-'}</p>
</div>
</div>
{selectedJob.error_message && (
<div className="bg-red-900/20 border border-red-800 rounded-lg p-4">
<p className="text-sm text-red-400">{selectedJob.error_message}</p>
</div>
)}
{selectedJob.input_data && (
<div>
<p className="text-sm text-gray-500 mb-2">Input Data</p>
<pre className="bg-forge-gray rounded-lg p-4 text-sm text-gray-300 overflow-x-auto">
{JSON.stringify(selectedJob.input_data, null, 2)}
</pre>
</div>
)}
{selectedJob.output_data && (
<div>
<p className="text-sm text-gray-500 mb-2">Output Data</p>
<pre className="bg-forge-gray rounded-lg p-4 text-sm text-gray-300 overflow-x-auto">
{JSON.stringify(selectedJob.output_data, null, 2)}
</pre>
</div>
)}
</div>
</div>
</div>
)}
</div>
);
}
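
The history page above computes the job duration and the page count inline in JSX and in `fetchJobs`. As a minimal sketch, that arithmetic could be pulled into small pure helpers. The names `jobDurationSeconds`, `formatDuration`, and `pageCount` are hypothetical and not part of this commit. The only assumption about the API is the one the page already makes: `created_at` and `completed_at` are date strings that `new Date()` can parse.

```typescript
// Hypothetical helpers, not part of the original commit.
interface JobTimestamps {
  created_at: string;
  completed_at?: string | null;
}

// Elapsed whole seconds between creation and completion,
// or null when the job is unfinished or a timestamp cannot be parsed.
function jobDurationSeconds(job: JobTimestamps): number | null {
  if (!job.completed_at) return null;
  const start = new Date(job.created_at).getTime();
  const end = new Date(job.completed_at).getTime();
  if (Number.isNaN(start) || Number.isNaN(end)) return null;
  return Math.max(0, Math.round((end - start) / 1000));
}

// Human-readable duration; falls back to '-' like the table cell does.
function formatDuration(job: JobTimestamps): string {
  const seconds = jobDurationSeconds(job);
  if (seconds === null) return '-';
  if (seconds < 60) return `${seconds}s`;
  return `${Math.floor(seconds / 60)}m ${seconds % 60}s`;
}

// Number of pages for a result set; never less than 1 so "Page 1 of N" stays sensible.
function pageCount(total: number, limit: number): number {
  return Math.max(1, Math.ceil(total / limit));
}
```

In the table cell this would read `{formatDuration(job)}`, and `fetchJobs` could call `setTotalPages(pageCount(response.data.total || 0, 20))`.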


@ -0,0 +1,437 @@
'use client';
import { useState } from 'react';
import { toast } from 'react-hot-toast';
import { ImagePlus, Download, Sparkles, Pencil, X } from 'lucide-react';
import JobProgress from '@/components/JobProgress';
import { modulesApi, assetsApi } from '@/lib/api';
import { useStore } from '@/lib/store';
interface ModelOption {
id: string;
name: string;
}
interface Provider {
id: string;
name: string;
models: ModelOption[];
}
const providers: Provider[] = [
{ id: 'openai', name: 'OpenAI GPT-Image-1', models: [
{ id: 'gpt-image-1', name: 'GPT Image 1' },
{ id: 'dall-e-3', name: 'DALL-E 3' },
{ id: 'dall-e-2', name: 'DALL-E 2' }
]},
{ id: 'stable-diffusion', name: 'Stability AI SD3.5', models: [
{ id: 'sd3.5-large', name: 'SD 3.5 Large' },
{ id: 'sd3.5-medium', name: 'SD 3.5 Medium' },
{ id: 'sd3-large', name: 'SD 3 Large' },
{ id: 'sd3-medium', name: 'SD 3 Medium' },
{ id: 'sdxl-1.0', name: 'SDXL 1.0' }
]},
{ id: 'imagen', name: 'Google Imagen 4', models: [
{ id: 'imagen-4.0-generate-001', name: 'Imagen 4.0' },
{ id: 'imagen-4.0-ultra-generate-001', name: 'Imagen 4.0 Ultra' },
{ id: 'imagen-4.0-fast-generate-001', name: 'Imagen 4.0 Fast' }
]},
{ id: 'nano-banana', name: 'Nano Banana (Gemini)', models: [
{ id: 'gemini-2.5-flash-image', name: 'Gemini 2.5 Flash Image' },
{ id: 'gemini-3-pro-image-preview', name: 'Gemini 3 Pro Image' }
]},
{ id: 'leonardo', name: 'Leonardo AI', models: [
{ id: '6b645e3a-d64f-4341-a6d8-7a3690fbf042', name: 'Leonardo Phoenix' },
{ id: 'e71a1c2f-4f80-4800-934f-2c68979d8cc8', name: 'Leonardo Anime XL' },
{ id: 'b24e16ff-06e3-43eb-8d33-4416c2d75876', name: 'Leonardo Lightning XL' },
{ id: 'aa77f04e-3eec-4034-9c07-d0f619684628', name: 'Leonardo Kino XL' },
{ id: '5c232a9e-9061-4777-980a-ddc8e65647c6', name: 'Leonardo Vision XL' },
{ id: '1e60896f-3c26-4296-8ecc-53e2afecc132', name: 'Leonardo Diffusion XL' },
{ id: '2067ae52-33fd-4a82-bb92-c2c55e7d2786', name: 'AlbedoBase XL' },
{ id: 'f1929ea3-b169-4c18-a16c-5d58b4292c69', name: 'RPG v5' },
{ id: 'd69c8273-6b17-4a30-a13e-d6637ae1c644', name: '3D Animation Style' },
{ id: 'ac614f96-1082-45bf-be9d-757f2d31c174', name: 'DreamShaper v7' },
{ id: 'e316348f-7773-490e-adcd-46757c738eb7', name: 'Absolute Reality v1.6' }
]},
{ id: 'bria', name: 'Bria AI', models: [
{ id: 'base', name: 'Base' },
{ id: 'fast', name: 'Fast' }
]},
{ id: 'ideogram', name: 'Ideogram', models: [
{ id: 'V_2', name: 'V2' },
{ id: 'V_2_TURBO', name: 'V2 Turbo' }
]},
{ id: 'flux', name: 'Flux Pro', models: [
{ id: 'flux-pro-1.1', name: 'Flux Pro 1.1' },
{ id: 'flux-dev', name: 'Flux Dev' },
{ id: 'flux-schnell', name: 'Flux Schnell' }
]},
{ id: 'gemini', name: 'Google Gemini', models: [
{ id: 'gemini-2.0-flash-exp', name: 'Gemini 2.0 Flash' }
]},
];
const sizes = ['1024x1024', '1024x1792', '1792x1024', '512x512'];
const styles = ['vivid', 'natural', 'cinematic', 'anime', 'photographic', '3d-render'];
export default function ImageGeneratePage() {
const { addJob, updateJob } = useStore();
const [prompt, setPrompt] = useState('');
const [negativePrompt, setNegativePrompt] = useState('');
const [provider, setProvider] = useState('openai');
const [model, setModel] = useState('gpt-image-1');
const [size, setSize] = useState('1024x1024');
const [style, setStyle] = useState('vivid');
const [numImages, setNumImages] = useState(1);
const [jobId, setJobId] = useState<string | null>(null);
const [generatedImages, setGeneratedImages] = useState<any[]>([]);
const [loading, setLoading] = useState(false);
// Iterative editing state (for Nano Banana)
const [editingImage, setEditingImage] = useState<any | null>(null);
const [editInstructions, setEditInstructions] = useState('');
const selectedProvider = providers.find((p) => p.id === provider);
const supportsEditing = provider === 'nano-banana' || provider === 'gemini';
const handleGenerate = async () => {
const effectivePrompt = editingImage ? editInstructions : prompt;
if (!effectivePrompt.trim()) {
toast.error(editingImage ? 'Please enter edit instructions' : 'Please enter a prompt');
return;
}
setLoading(true);
if (!editingImage) {
setGeneratedImages([]);
}
try {
const response = await modulesApi.generateImage({
prompt: effectivePrompt,
negative_prompt: negativePrompt || undefined,
provider: editingImage ? 'nano-banana' : provider,
model: editingImage ? 'gemini-2.5-flash-image' : model,
size,
style,
num_images: numImages,
// Include reference image for iterative editing
reference_asset_id: editingImage?.id || undefined,
});
const job = response.data;
setJobId(job.id);
addJob({
id: job.id,
module: 'image_generation',
status: job.status,
progress: job.progress,
created_at: job.created_at,
});
toast.success(editingImage ? 'Image editing started!' : 'Image generation started!');
} catch (err: any) {
toast.error(err.response?.data?.detail || 'Failed to start generation');
setLoading(false);
}
};
const handleJobComplete = async (job: any) => {
setLoading(false);
updateJob(job.id, { status: 'completed', progress: 100 });
if (job.output_asset_ids?.length > 0) {
try {
const images = await Promise.all(
job.output_asset_ids.map(async (id: string) => {
const asset = await assetsApi.get(id);
return asset.data;
})
);
// When editing, append to existing images; otherwise replace
if (editingImage) {
// Functional update avoids appending to a stale snapshot of the list
setGeneratedImages((prev) => [...prev, ...images]);
setEditingImage(null);
setEditInstructions('');
toast.success('Image edited successfully!');
} else {
setGeneratedImages(images);
toast.success('Images generated successfully!');
}
} catch (err) {
// A failed asset lookup should surface as a toast, not an unhandled rejection
toast.error('Failed to load generated images');
}
}
};
const handleStartEdit = (image: any) => {
setEditingImage(image);
setEditInstructions('');
// Auto-switch to Nano Banana for editing
setProvider('nano-banana');
setModel('gemini-2.5-flash-image');
};
const handleCancelEdit = () => {
setEditingImage(null);
setEditInstructions('');
};
const handleJobError = (error: string) => {
setLoading(false);
toast.error(error);
};
const handleDownload = async (assetId: string, filename: string) => {
try {
const response = await assetsApi.download(assetId);
const url = window.URL.createObjectURL(response.data);
const a = document.createElement('a');
a.href = url;
a.download = filename;
a.click();
window.URL.revokeObjectURL(url);
} catch (err) {
toast.error('Failed to download image');
}
};
return (
<div className="max-w-6xl mx-auto space-y-8">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-forge-yellow/10 rounded-lg flex items-center justify-center">
<ImagePlus className="w-6 h-6 text-forge-yellow" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">Image Generator</h1>
<p className="text-gray-500">Create stunning images with AI</p>
</div>
</div>
<div className="grid grid-cols-1 lg:grid-cols-2 gap-8">
{/* Controls */}
<div className="space-y-6">
{/* Editing Mode Panel */}
{editingImage && (
<div className="bg-purple-900/20 border border-purple-500/50 rounded-xl p-4 space-y-4">
<div className="flex items-center justify-between">
<h3 className="text-purple-400 font-medium flex items-center gap-2">
<Pencil className="w-4 h-4" />
Editing Image
</h3>
<button
onClick={handleCancelEdit}
className="text-gray-400 hover:text-white transition-colors"
>
<X className="w-5 h-5" />
</button>
</div>
<div className="flex gap-4">
<div className="w-24 h-24 rounded-lg overflow-hidden border border-purple-500/50">
<img
src={`/api/v1/assets/${editingImage.id}/download`}
alt="Reference"
className="w-full h-full object-cover"
/>
</div>
<div className="flex-1">
<label className="block text-sm font-medium text-gray-300 mb-2">
Edit Instructions
</label>
<textarea
value={editInstructions}
onChange={(e) => setEditInstructions(e.target.value)}
placeholder="Describe how you want to modify this image... e.g., 'Change the sky to sunset colors' or 'Add a cat in the foreground'"
className="input-field min-h-[80px] resize-none"
/>
</div>
</div>
<p className="text-xs text-gray-500">
Using Nano Banana (Gemini) for iterative image editing
</p>
</div>
)}
{/* Prompt */}
{!editingImage && (
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Prompt
</label>
<textarea
value={prompt}
onChange={(e) => setPrompt(e.target.value)}
placeholder="Describe the image you want to create..."
className="input-field min-h-[120px] resize-none"
/>
</div>
)}
{/* Negative Prompt */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Negative Prompt (Optional)
</label>
<textarea
value={negativePrompt}
onChange={(e) => setNegativePrompt(e.target.value)}
placeholder="What to avoid in the image..."
className="input-field min-h-[80px] resize-none"
/>
</div>
{/* Provider & Model */}
<div className="grid grid-cols-2 gap-4">
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Provider
</label>
<select
value={provider}
onChange={(e) => {
setProvider(e.target.value);
const p = providers.find((pr) => pr.id === e.target.value);
if (p && p.models.length > 0) setModel(p.models[0].id);
}}
className="select-field"
>
{providers.map((p) => (
<option key={p.id} value={p.id}>
{p.name}
</option>
))}
</select>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Model
</label>
<select
value={model}
onChange={(e) => setModel(e.target.value)}
className="select-field"
>
{selectedProvider?.models.map((m) => (
<option key={m.id} value={m.id}>
{m.name}
</option>
))}
</select>
</div>
</div>
{/* Size & Style */}
<div className="grid grid-cols-2 gap-4">
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Size
</label>
<select
value={size}
onChange={(e) => setSize(e.target.value)}
className="select-field"
>
{sizes.map((s) => (
<option key={s} value={s}>
{s}
</option>
))}
</select>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Style
</label>
<select
value={style}
onChange={(e) => setStyle(e.target.value)}
className="select-field"
>
{styles.map((s) => (
<option key={s} value={s}>
{s.charAt(0).toUpperCase() + s.slice(1)}
</option>
))}
</select>
</div>
</div>
{/* Number of Images */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Number of Images
</label>
<input
type="number"
min={1}
max={4}
value={numImages}
onChange={(e) => setNumImages(Math.min(4, Math.max(1, parseInt(e.target.value, 10) || 1)))}
className="input-field w-24"
/>
</div>
{/* Generate Button */}
<button
onClick={handleGenerate}
disabled={loading || (editingImage ? !editInstructions.trim() : !prompt.trim())}
className={`w-full flex items-center justify-center gap-2 disabled:opacity-50 disabled:cursor-not-allowed ${
editingImage ? 'bg-purple-600 hover:bg-purple-700 text-white py-3 px-6 rounded-lg font-medium transition-colors' : 'btn-primary'
}`}
>
{editingImage ? <Pencil className="w-5 h-5" /> : <Sparkles className="w-5 h-5" />}
{loading ? (editingImage ? 'Editing...' : 'Generating...') : (editingImage ? 'Apply Edits' : 'Generate Images')}
</button>
{/* Job Progress */}
{jobId && loading && (
<JobProgress
jobId={jobId}
onComplete={handleJobComplete}
onError={handleJobError}
/>
)}
</div>
{/* Results */}
<div>
<h2 className="text-lg font-semibold text-white mb-4">Generated Images</h2>
{generatedImages.length > 0 ? (
<div className="grid grid-cols-2 gap-4">
{generatedImages.map((image) => (
<div
key={image.id}
className="bg-forge-dark rounded-xl overflow-hidden border border-gray-800 group"
>
<div className="aspect-square relative">
<img
src={`/api/v1/assets/${image.id}/download`}
alt="Generated"
className="w-full h-full object-cover"
/>
<div className="absolute inset-0 bg-black/60 opacity-0 group-hover:opacity-100 transition-opacity flex items-center justify-center gap-2">
<button
onClick={() => handleStartEdit(image)}
className="bg-purple-600 hover:bg-purple-700 text-white py-2 px-4 rounded-lg font-medium transition-colors flex items-center gap-2"
title="Edit with Nano Banana"
>
<Pencil className="w-4 h-4" />
Edit
</button>
<button
onClick={() => handleDownload(image.id, image.original_filename)}
className="btn-primary flex items-center gap-2"
>
<Download className="w-4 h-4" />
Download
</button>
</div>
</div>
</div>
))}
</div>
) : (
<div className="bg-forge-dark rounded-xl border border-gray-800 aspect-square flex items-center justify-center">
<p className="text-gray-500">Generated images will appear here</p>
</div>
)}
</div>
</div>
</div>
);
}
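
The generator page above picks a provider's first model when the provider changes and parses the image count from a number input. A minimal sketch of both rules as pure functions follows. `defaultModelFor` and `clampImageCount` are hypothetical names introduced here for illustration, and the 1 to 4 range is taken from the input's `min` and `max` attributes.

```typescript
// Hypothetical helpers, not part of the original commit.
interface ModelOption {
  id: string;
  name: string;
}

interface Provider {
  id: string;
  name: string;
  models: ModelOption[];
}

// First model ID of the given provider, or undefined when the provider
// is unknown or has no models (mirrors the provider <select> onChange logic).
function defaultModelFor(providers: Provider[], providerId: string): string | undefined {
  return providers.find((p) => p.id === providerId)?.models[0]?.id;
}

// Parse the raw input value and keep it inside [min, max];
// empty or non-numeric input falls back to min.
function clampImageCount(raw: string, min = 1, max = 4): number {
  const parsed = Number.parseInt(raw, 10);
  if (Number.isNaN(parsed)) return min;
  return Math.min(max, Math.max(min, parsed));
}

// Small sample taken from the providers list in the page above.
const sampleProviders: Provider[] = [
  { id: 'bria', name: 'Bria AI', models: [{ id: 'base', name: 'Base' }, { id: 'fast', name: 'Fast' }] },
  { id: 'ideogram', name: 'Ideogram', models: [{ id: 'V_2', name: 'V2' }] },
];
```

Keeping these outside the component makes the rules easy to unit test, and typing a value such as `9` into the field can then never send an out-of-range `num_images` to the API.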


@ -0,0 +1,251 @@
'use client';
import { useState } from 'react';
import { toast } from 'react-hot-toast';
import { Eraser, Download, Sparkles } from 'lucide-react';
import FileUpload from '@/components/FileUpload';
import JobProgress from '@/components/JobProgress';
import { modulesApi, assetsApi } from '@/lib/api';
import { useStore } from '@/lib/store';
const outputFormats = [
{ value: 'png', label: 'PNG (Transparent)' },
{ value: 'webp', label: 'WebP' },
];
export default function RemoveBackgroundPage() {
const { addJob, updateJob } = useStore();
const [file, setFile] = useState<File | null>(null);
const [assetId, setAssetId] = useState<string | null>(null);
const [outputFormat, setOutputFormat] = useState('png');
const [refineMask, setRefineMask] = useState(true);
const [jobId, setJobId] = useState<string | null>(null);
const [resultImage, setResultImage] = useState<any>(null);
const [loading, setLoading] = useState(false);
const [uploading, setUploading] = useState(false);
const handleFileUpload = async (uploadedFile: File) => {
setFile(uploadedFile);
setUploading(true);
try {
const response = await assetsApi.upload(uploadedFile);
setAssetId(response.data.id);
toast.success('Image uploaded!');
} catch (err) {
toast.error('Failed to upload image');
setFile(null);
} finally {
setUploading(false);
}
};
const handleRemoveBackground = async () => {
if (!assetId) {
toast.error('Please upload an image first');
return;
}
setLoading(true);
setResultImage(null);
try {
const response = await modulesApi.removeBackground({
asset_id: assetId,
output_format: outputFormat,
refine_mask: refineMask,
});
const job = response.data;
setJobId(job.id);
addJob({
id: job.id,
module: 'background_removal',
status: job.status,
progress: job.progress,
created_at: job.created_at,
});
toast.success('Background removal started!');
} catch (err: any) {
toast.error(err.response?.data?.detail || 'Failed to start processing');
setLoading(false);
}
};
const handleJobComplete = async (job: any) => {
setLoading(false);
updateJob(job.id, { status: 'completed', progress: 100 });
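if (job.output_asset_ids?.[0]) {
try {
const asset = await assetsApi.get(job.output_asset_ids[0]);
setResultImage(asset.data);
toast.success('Background removed successfully!');
} catch (err) {
// A failed asset lookup should surface as a toast, not an unhandled rejection
toast.error('Failed to load result image');
}
}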
};
const handleJobError = (error: string) => {
setLoading(false);
toast.error(error);
};
const handleDownload = async () => {
if (!resultImage) return;
try {
const response = await assetsApi.download(resultImage.id);
const url = window.URL.createObjectURL(response.data);
const a = document.createElement('a');
a.href = url;
a.download = resultImage.original_filename;
a.click();
window.URL.revokeObjectURL(url);
} catch (err) {
toast.error('Failed to download image');
}
};
return (
<div className="max-w-6xl mx-auto space-y-8">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-forge-yellow/10 rounded-lg flex items-center justify-center">
<Eraser className="w-6 h-6 text-forge-yellow" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">Background Remover</h1>
<p className="text-gray-500">Remove backgrounds instantly with AI precision</p>
</div>
</div>
<div className="grid grid-cols-1 lg:grid-cols-2 gap-8">
{/* Controls */}
<div className="space-y-6">
{/* File Upload */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Upload Image
</label>
<FileUpload
onUpload={handleFileUpload}
accept={{ 'image/*': ['.png', '.jpg', '.jpeg', '.webp'] }}
currentFile={file}
onClear={() => {
setFile(null);
setAssetId(null);
}}
label="Upload an image"
/>
{uploading && (
<p className="mt-2 text-sm text-forge-yellow">Uploading...</p>
)}
</div>
{/* Output Format */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Output Format
</label>
<select
value={outputFormat}
onChange={(e) => setOutputFormat(e.target.value)}
className="select-field"
>
{outputFormats.map((format) => (
<option key={format.value} value={format.value}>
{format.label}
</option>
))}
</select>
</div>
{/* Refine Mask */}
<div className="flex items-center gap-3">
<input
type="checkbox"
id="refineMask"
checked={refineMask}
onChange={(e) => setRefineMask(e.target.checked)}
className="w-4 h-4 rounded border-gray-600 bg-forge-dark text-forge-yellow focus:ring-forge-yellow"
/>
<label htmlFor="refineMask" className="text-gray-300">
Refine edges (better quality, slower)
</label>
</div>
{/* Remove Background Button */}
<button
onClick={handleRemoveBackground}
disabled={loading || !assetId || uploading}
className="btn-primary w-full flex items-center justify-center gap-2 disabled:opacity-50 disabled:cursor-not-allowed"
>
<Sparkles className="w-5 h-5" />
{loading ? 'Processing...' : 'Remove Background'}
</button>
{/* Job Progress */}
{jobId && loading && (
<JobProgress
jobId={jobId}
onComplete={handleJobComplete}
onError={handleJobError}
/>
)}
</div>
{/* Results */}
<div>
<h2 className="text-lg font-semibold text-white mb-4">Result</h2>
{resultImage ? (
<div className="bg-forge-dark rounded-xl overflow-hidden border border-gray-800">
<div
className="relative"
style={{
backgroundImage:
'linear-gradient(45deg, #2a2a2a 25%, transparent 25%), linear-gradient(-45deg, #2a2a2a 25%, transparent 25%), linear-gradient(45deg, transparent 75%, #2a2a2a 75%), linear-gradient(-45deg, transparent 75%, #2a2a2a 75%)',
backgroundSize: '20px 20px',
backgroundPosition: '0 0, 0 10px, 10px -10px, -10px 0px',
}}
>
<img
src={`/api/v1/assets/${resultImage.id}/download`}
alt="Result"
className="w-full"
/>
</div>
<div className="p-4 border-t border-gray-800">
<div className="flex items-center justify-between">
<div>
<p className="text-white font-medium">{resultImage.original_filename}</p>
<p className="text-sm text-gray-500">
{(resultImage.file_size_bytes / 1024).toFixed(1)} KB
</p>
</div>
<button
onClick={handleDownload}
className="btn-primary flex items-center gap-2"
>
<Download className="w-4 h-4" />
Download
</button>
</div>
</div>
</div>
) : (
<div
className="bg-forge-dark rounded-xl border border-gray-800 aspect-video flex items-center justify-center"
style={{
backgroundImage:
'linear-gradient(45deg, #1a1a1a 25%, transparent 25%), linear-gradient(-45deg, #1a1a1a 25%, transparent 25%), linear-gradient(45deg, transparent 75%, #1a1a1a 75%), linear-gradient(-45deg, transparent 75%, #1a1a1a 75%)',
backgroundSize: '20px 20px',
backgroundPosition: '0 0, 0 10px, 10px -10px, -10px 0px',
}}
>
<p className="text-gray-500 bg-forge-dark/80 px-4 py-2 rounded">
Result will appear here
</p>
</div>
)}
</div>
</div>
</div>
);
}
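
The result card above always shows the file size in KB via `(resultImage.file_size_bytes / 1024).toFixed(1)`, which gets hard to read for large images. One possible alternative is sketched below. `formatFileSize` is a hypothetical helper, and it assumes `file_size_bytes` is a plain byte count, as the existing division by 1024 suggests.

```typescript
// Hypothetical helper, not part of the original commit.
// Formats a byte count as B, KB, or MB using 1024-based units.
function formatFileSize(bytes: number): string {
  // Guard against missing or malformed metadata
  if (!Number.isFinite(bytes) || bytes < 0) return '-';
  if (bytes < 1024) return `${bytes} B`;
  const kb = bytes / 1024;
  if (kb < 1024) return `${kb.toFixed(1)} KB`;
  return `${(kb / 1024).toFixed(1)} MB`;
}
```

In the card this would replace the inline expression with `{formatFileSize(resultImage.file_size_bytes)}`.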


@ -0,0 +1,294 @@
'use client';
import { useState, useEffect } from 'react';
import { toast } from 'react-hot-toast';
import { Maximize, Download, Sparkles } from 'lucide-react';
import FileUpload from '@/components/FileUpload';
import JobProgress from '@/components/JobProgress';
import { modulesApi, assetsApi } from '@/lib/api';
import { useStore } from '@/lib/store';
const scaleOptions = [
{ value: 2, label: '2x' },
{ value: 4, label: '4x' },
{ value: 6, label: '6x' },
];
const modelOptions = [
{ value: 'standard', label: 'Standard V2' },
{ value: 'high-fidelity', label: 'High Fidelity' },
{ value: 'low-resolution', label: 'Low Resolution Fix' },
{ value: 'cgi', label: 'CGI' },
];
export default function ImageUpscalePage() {
const { addJob, updateJob } = useStore();
const [mounted, setMounted] = useState(false);
const [file, setFile] = useState<File | null>(null);
const [assetId, setAssetId] = useState<string | null>(null);
const [scale, setScale] = useState(2);
const [model, setModel] = useState('standard');
const [denoiseStrength, setDenoiseStrength] = useState(0.5);
const [sharpen, setSharpen] = useState(0.5);
const [jobId, setJobId] = useState<string | null>(null);
const [upscaledImage, setUpscaledImage] = useState<any>(null);
const [loading, setLoading] = useState(false);
const [uploading, setUploading] = useState(false);
useEffect(() => {
setMounted(true);
}, []);
if (!mounted) {
return null;
}
const handleFileUpload = async (uploadedFile: File) => {
setFile(uploadedFile);
setUploading(true);
try {
const response = await assetsApi.upload(uploadedFile);
setAssetId(response.data.id);
toast.success('Image uploaded!');
} catch (err) {
toast.error('Failed to upload image');
setFile(null);
} finally {
setUploading(false);
}
};
const handleUpscale = async () => {
if (!assetId) {
toast.error('Please upload an image first');
return;
}
setLoading(true);
setUpscaledImage(null);
try {
const response = await modulesApi.upscaleImage({
asset_id: assetId,
scale,
model,
denoise_strength: denoiseStrength,
sharpen,
});
const job = response.data;
setJobId(job.id);
addJob({
id: job.id,
module: 'image_upscaling',
status: job.status,
progress: job.progress,
created_at: job.created_at,
});
toast.success('Upscaling started!');
} catch (err: any) {
toast.error(err.response?.data?.detail || 'Failed to start upscaling');
setLoading(false);
}
};
const handleJobComplete = async (job: any) => {
setLoading(false);
updateJob(job.id, { status: 'completed', progress: 100 });
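if (job.output_asset_ids?.[0]) {
try {
const asset = await assetsApi.get(job.output_asset_ids[0]);
setUpscaledImage(asset.data);
toast.success('Image upscaled successfully!');
} catch (err) {
// A failed asset lookup should surface as a toast, not an unhandled rejection
toast.error('Failed to load upscaled image');
}
}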
};
const handleJobError = (error: string) => {
setLoading(false);
toast.error(error);
};
const handleDownload = async () => {
if (!upscaledImage) return;
try {
const response = await assetsApi.download(upscaledImage.id);
const url = window.URL.createObjectURL(response.data);
const a = document.createElement('a');
a.href = url;
a.download = upscaledImage.original_filename;
a.click();
window.URL.revokeObjectURL(url);
} catch (err) {
toast.error('Failed to download image');
}
};
return (
<div className="max-w-6xl mx-auto space-y-8">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-forge-yellow/10 rounded-lg flex items-center justify-center">
<Maximize className="w-6 h-6 text-forge-yellow" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">Image Upscaler</h1>
<p className="text-gray-500">Enhance image resolution with Topaz Labs AI</p>
</div>
</div>
<div className="grid grid-cols-1 lg:grid-cols-2 gap-8">
{/* Controls */}
<div className="space-y-6">
{/* File Upload */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Upload Image
</label>
<FileUpload
onUpload={handleFileUpload}
accept={{ 'image/*': ['.png', '.jpg', '.jpeg', '.webp'] }}
currentFile={file}
onClear={() => {
setFile(null);
setAssetId(null);
}}
label="Upload an image to upscale"
/>
{uploading && (
<p className="mt-2 text-sm text-forge-yellow">Uploading...</p>
)}
</div>
{/* Scale */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Scale Factor
</label>
<div className="flex gap-2">
{scaleOptions.map((option) => (
<button
key={option.value}
onClick={() => setScale(option.value)}
className={`px-6 py-3 rounded-lg font-medium transition-colors ${
scale === option.value
? 'bg-forge-yellow text-black'
: 'bg-forge-dark border border-gray-700 text-gray-300 hover:border-gray-600'
}`}
>
{option.label}
</button>
))}
</div>
</div>
{/* Model */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Upscaling Model
</label>
<select
value={model}
onChange={(e) => setModel(e.target.value)}
className="select-field"
>
{modelOptions.map((option) => (
<option key={option.value} value={option.value}>
{option.label}
</option>
))}
</select>
</div>
{/* Denoise */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Denoise Strength: {denoiseStrength.toFixed(1)}
</label>
<input
type="range"
min={0}
max={1}
step={0.1}
value={denoiseStrength}
onChange={(e) => setDenoiseStrength(parseFloat(e.target.value))}
className="w-full accent-forge-yellow"
/>
</div>
{/* Sharpen */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Sharpen: {sharpen.toFixed(1)}
</label>
<input
type="range"
min={0}
max={1}
step={0.1}
value={sharpen}
onChange={(e) => setSharpen(parseFloat(e.target.value))}
className="w-full accent-forge-yellow"
/>
</div>
{/* Upscale Button */}
<button
onClick={handleUpscale}
disabled={loading || !assetId || uploading}
className="btn-primary w-full flex items-center justify-center gap-2 disabled:opacity-50 disabled:cursor-not-allowed"
>
<Sparkles className="w-5 h-5" />
{loading ? 'Upscaling...' : 'Upscale Image'}
</button>
{/* Job Progress */}
{jobId && loading && (
<JobProgress
jobId={jobId}
onComplete={handleJobComplete}
onError={handleJobError}
/>
)}
</div>
{/* Results */}
<div>
<h2 className="text-lg font-semibold text-white mb-4">Result</h2>
{upscaledImage ? (
<div className="bg-forge-dark rounded-xl overflow-hidden border border-gray-800">
<div className="relative">
<img
src={`/api/v1/assets/${upscaledImage.id}/download`}
alt="Upscaled"
className="w-full"
/>
</div>
<div className="p-4 border-t border-gray-800">
<div className="flex items-center justify-between">
<div>
<p className="text-white font-medium">{upscaledImage.original_filename}</p>
<p className="text-sm text-gray-500">
{upscaledImage.width} x {upscaledImage.height}
</p>
</div>
<button
onClick={handleDownload}
className="btn-primary flex items-center gap-2"
>
<Download className="w-4 h-4" />
Download
</button>
</div>
</div>
</div>
) : (
<div className="bg-forge-dark rounded-xl border border-gray-800 aspect-video flex items-center justify-center">
<p className="text-gray-500">Upscaled image will appear here</p>
</div>
)}
</div>
</div>
</div>
);
}

42
frontend/app/layout.tsx Normal file

@ -0,0 +1,42 @@
import type { Metadata } from 'next';
import { Toaster } from 'react-hot-toast';
import './globals.css';
import AuthProvider from '@/components/AuthProvider';
import AppShell from '@/components/AppShell';
export const metadata: Metadata = {
title: 'FORGE AI - Creative Tools Platform',
description: 'Unified AI-powered creative tools for image, video, and audio generation',
};
export default function RootLayout({
children,
}: {
children: React.ReactNode;
}) {
return (
<html lang="en">
<body className="font-montserrat bg-forge-black min-h-screen">
<Toaster
position="top-right"
toastOptions={{
style: {
background: '#1a1a1a',
color: '#fff',
border: '1px solid #2a2a2a',
},
success: {
iconTheme: {
primary: '#FFC407',
secondary: '#000',
},
},
}}
/>
<AuthProvider>
<AppShell>{children}</AppShell>
</AuthProvider>
</body>
</html>
);
}

137
frontend/app/login/page.tsx Normal file

@ -0,0 +1,137 @@
'use client';
import { useState } from 'react';
import { useRouter } from 'next/navigation';
import Link from 'next/link';
import { toast } from 'react-hot-toast';
import { LogIn, Eye, EyeOff, Loader2 } from 'lucide-react';
import { authApi } from '@/lib/api';
import { useStore } from '@/lib/store';
export default function LoginPage() {
const router = useRouter();
const { setUser, setToken } = useStore();
const [email, setEmail] = useState('');
const [password, setPassword] = useState('');
const [showPassword, setShowPassword] = useState(false);
const [loading, setLoading] = useState(false);
const handleLogin = async (e: React.FormEvent) => {
e.preventDefault();
if (!email.trim() || !password) {
toast.error('Please enter email and password');
return;
}
setLoading(true);
try {
const response = await authApi.login({ email, password });
const { user } = response.data;
// Store user data and set token marker (actual auth via cookie)
setUser({
id: user.id,
email: user.email,
name: user.display_name || user.email,
role: user.role,
avatar_url: user.avatar_url,
});
setToken('cookie-auth'); // Marker to indicate authenticated
toast.success('Welcome back!');
router.push('/');
} catch (err: any) {
toast.error(err.response?.data?.detail || 'Invalid email or password');
} finally {
setLoading(false);
}
};
return (
<div className="min-h-screen bg-forge-gray flex items-center justify-center p-4">
<div className="w-full max-w-md">
{/* Logo */}
<div className="text-center mb-8">
<div className="inline-flex items-center justify-center w-16 h-16 bg-forge-yellow rounded-xl mb-4">
<span className="text-2xl font-bold text-black">F</span>
</div>
<h1 className="text-2xl font-bold text-white">Welcome to FORGE AI</h1>
<p className="text-gray-500 mt-2">Sign in to your account</p>
</div>
{/* Login Form */}
<form onSubmit={handleLogin} className="bg-forge-dark rounded-xl border border-gray-800 p-8 space-y-6">
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Email Address
</label>
<input
type="email"
value={email}
onChange={(e) => setEmail(e.target.value)}
placeholder="you@example.com"
className="input-field"
autoComplete="email"
autoFocus
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Password
</label>
<div className="relative">
<input
type={showPassword ? 'text' : 'password'}
value={password}
onChange={(e) => setPassword(e.target.value)}
placeholder="Enter your password"
className="input-field pr-10"
autoComplete="current-password"
/>
<button
type="button"
onClick={() => setShowPassword(!showPassword)}
className="absolute right-3 top-1/2 -translate-y-1/2 text-gray-500 hover:text-gray-300"
>
{showPassword ? <EyeOff className="w-4 h-4" /> : <Eye className="w-4 h-4" />}
</button>
</div>
</div>
<button
type="submit"
disabled={loading}
className="btn-primary w-full flex items-center justify-center gap-2"
>
{loading ? (
<Loader2 className="w-5 h-5 animate-spin" />
) : (
<LogIn className="w-5 h-5" />
)}
{loading ? 'Signing in...' : 'Sign In'}
</button>
<div className="text-center text-sm text-gray-500">
Don&apos;t have an account?{' '}
<Link href="/signup" className="text-forge-yellow hover:text-yellow-400">
Sign up
</Link>
</div>
</form>
{/* Dev Test User Info (rendered only in development builds) */}
{process.env.NODE_ENV === 'development' && (
<div className="mt-6 p-4 bg-forge-dark/50 rounded-lg border border-gray-800 text-center">
<p className="text-xs text-gray-500">
Development Mode - Test credentials:
</p>
<p className="text-sm text-gray-400 mt-1">
test@forge.ai / password123
</p>
</div>
)}
</div>
</div>
);
}

248
frontend/app/page.tsx Normal file

@ -0,0 +1,248 @@
'use client';
import { useEffect, useState } from 'react';
import {
ImagePlus,
Maximize,
Eraser,
Film,
Captions,
Volume2,
Type,
Wand2,
FileText,
TrendingUp,
Clock,
CheckCircle,
} from 'lucide-react';
import ModuleCard from '@/components/ModuleCard';
import { useStore } from '@/lib/store';
import { jobsApi } from '@/lib/api';
const modules = [
{
title: 'Image Generator',
description: 'Create stunning images with AI using multiple providers',
icon: ImagePlus,
href: '/image/generate',
},
{
title: 'Image Upscaler',
description: 'Enhance image resolution with Topaz Labs AI',
icon: Maximize,
href: '/image/upscale',
},
{
title: 'Background Remover',
description: 'Remove backgrounds instantly with precision',
icon: Eraser,
href: '/image/remove-bg',
},
{
title: 'Video Generator',
description: 'Generate videos with Runway and Google Veo',
icon: Film,
href: '/video/generate',
},
{
title: 'Video Upscaler',
description: 'Upscale videos to higher resolutions',
icon: Maximize,
href: '/video/upscale',
},
{
title: 'Subtitle Generator',
description: 'Auto-generate and translate subtitles',
icon: Captions,
href: '/video/subtitles',
},
{
title: 'Text to Speech',
description: 'Convert text to natural speech with ElevenLabs',
icon: Volume2,
href: '/audio/text-to-speech',
},
{
title: 'Voice to Text',
description: 'Transcribe audio with Whisper AI',
icon: Type,
href: '/audio/voice-to-text',
},
{
title: 'Prompt Studio',
description: 'Enhance your prompts with AI assistance',
icon: Wand2,
href: '/text/prompt-studio',
},
{
title: 'Alt Text Generator',
description: 'Generate accessible alt text for images',
icon: FileText,
href: '/text/alt-text',
},
];
export default function Dashboard() {
const { activeJobs } = useStore();
const [stats, setStats] = useState({
totalJobs: 0,
completedToday: 0,
processingTime: 0,
});
const [recentJobs, setRecentJobs] = useState<any[]>([]);
useEffect(() => {
const fetchData = async () => {
try {
const jobsResponse = await jobsApi.list({ limit: 5 });
setRecentJobs(jobsResponse.data.items || []);
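// Derive stats from the fetched jobs. Note: this counts completed jobs
// among the (at most 5) most recent ones; it does not filter by date.
const completed = jobsResponse.data.items?.filter(
(j: any) => j.status === 'completed'
).length || 0;
setStats({
totalJobs: jobsResponse.data.total || 0,
completedToday: completed,
processingTime: 2.4, // Hard-coded placeholder, not computed from job data
});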
} catch (err) {
console.error('Failed to fetch dashboard data:', err);
}
};
fetchData();
}, []);
return (
<div className="space-y-8">
{/* Stats Grid */}
<div className="grid grid-cols-1 md:grid-cols-3 gap-6">
<div className="bg-forge-dark rounded-xl p-6 border border-gray-800">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-forge-yellow/10 rounded-lg flex items-center justify-center">
<TrendingUp className="w-6 h-6 text-forge-yellow" />
</div>
<div>
<p className="text-gray-500 text-sm">Total Jobs</p>
<p className="text-2xl font-bold text-white">{stats.totalJobs}</p>
</div>
</div>
</div>
<div className="bg-forge-dark rounded-xl p-6 border border-gray-800">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-green-900/30 rounded-lg flex items-center justify-center">
<CheckCircle className="w-6 h-6 text-green-400" />
</div>
<div>
<p className="text-gray-500 text-sm">Completed Today</p>
<p className="text-2xl font-bold text-white">{stats.completedToday}</p>
</div>
</div>
</div>
<div className="bg-forge-dark rounded-xl p-6 border border-gray-800">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-blue-900/30 rounded-lg flex items-center justify-center">
<Clock className="w-6 h-6 text-blue-400" />
</div>
<div>
<p className="text-gray-500 text-sm">Avg. Processing Time</p>
<p className="text-2xl font-bold text-white">{stats.processingTime}s</p>
</div>
</div>
</div>
</div>
{/* Active Jobs */}
{activeJobs.filter((j) => j.status === 'processing').length > 0 && (
<div>
<h2 className="text-lg font-semibold text-white mb-4">Active Jobs</h2>
<div className="space-y-3">
{activeJobs
.filter((j) => j.status === 'processing')
.map((job) => (
<div
key={job.id}
className="bg-forge-dark rounded-xl p-4 border border-gray-800"
>
<div className="flex items-center justify-between mb-2">
<span className="text-white font-medium capitalize">
{job.module.replace(/_/g, ' ')}
</span>
<span className="text-gray-500 text-sm">{job.progress}%</span>
</div>
<div className="progress-bar">
<div
className="progress-bar-fill"
style={{ width: `${job.progress}%` }}
/>
</div>
</div>
))}
</div>
</div>
)}
{/* Modules Grid */}
<div>
<h2 className="text-lg font-semibold text-white mb-4">AI Tools</h2>
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 xl:grid-cols-4 gap-6">
{modules.map((module) => (
<ModuleCard key={module.href} {...module} />
))}
</div>
</div>
{/* Recent Jobs */}
{recentJobs.length > 0 && (
<div>
<h2 className="text-lg font-semibold text-white mb-4">Recent Activity</h2>
<div className="bg-forge-dark rounded-xl border border-gray-800 overflow-hidden">
<table className="w-full">
<thead>
<tr className="border-b border-gray-800">
<th className="text-left px-6 py-4 text-sm font-medium text-gray-500">
Module
</th>
<th className="text-left px-6 py-4 text-sm font-medium text-gray-500">
Status
</th>
<th className="text-left px-6 py-4 text-sm font-medium text-gray-500">
Provider
</th>
<th className="text-left px-6 py-4 text-sm font-medium text-gray-500">
Created
</th>
</tr>
</thead>
<tbody>
{recentJobs.map((job: any) => (
<tr key={job.id} className="border-b border-gray-800 last:border-0">
<td className="px-6 py-4 text-white capitalize">
{job.module?.replace(/_/g, ' ')}
</td>
<td className="px-6 py-4">
<span
className={`badge badge-${job.status}`}
>
{job.status}
</span>
</td>
<td className="px-6 py-4 text-gray-400">
{job.api_provider || '-'}
</td>
<td className="px-6 py-4 text-gray-500 text-sm">
{new Date(job.created_at).toLocaleString()}
</td>
</tr>
))}
</tbody>
</table>
</div>
</div>
)}
</div>
);
}


@ -0,0 +1,414 @@
'use client';
import { useState, useEffect } from 'react';
import { useRouter } from 'next/navigation';
import { toast } from 'react-hot-toast';
import { Settings, User, Bell, Palette, Key, Save, Shield, LogOut, Loader2 } from 'lucide-react';
import { useStore } from '@/lib/store';
import { usersApi, authApi } from '@/lib/api';
export default function SettingsPage() {
const router = useRouter();
const { user, setUser, logout } = useStore();
const [activeTab, setActiveTab] = useState('profile');
const [loading, setLoading] = useState(false);
// Profile state
const [name, setName] = useState('');
const [email, setEmail] = useState('');
// Security state
const [currentPassword, setCurrentPassword] = useState('');
const [newPassword, setNewPassword] = useState('');
const [confirmPassword, setConfirmPassword] = useState('');
const [savingPassword, setSavingPassword] = useState(false);
const [loggingOut, setLoggingOut] = useState(false);
// Notification preferences
const [emailNotifications, setEmailNotifications] = useState(true);
const [jobCompletionAlerts, setJobCompletionAlerts] = useState(true);
// Default settings
const [defaultImageProvider, setDefaultImageProvider] = useState('openai');
const [defaultVideoProvider, setDefaultVideoProvider] = useState('runway');
const [defaultVoice, setDefaultVoice] = useState('');
useEffect(() => {
if (user) {
setName(user.name || '');
setEmail(user.email || '');
}
}, [user]);
const handleSaveProfile = async () => {
setLoading(true);
try {
const response = await usersApi.updateProfile({ name });
setUser(response.data);
toast.success('Profile updated!');
} catch (err) {
toast.error('Failed to update profile');
} finally {
setLoading(false);
}
};
const handleChangePassword = async () => {
if (!currentPassword || !newPassword || !confirmPassword) {
toast.error('Please fill in all password fields');
return;
}
if (newPassword.length < 8) {
toast.error('New password must be at least 8 characters');
return;
}
if (newPassword !== confirmPassword) {
toast.error('New passwords do not match');
return;
}
setSavingPassword(true);
try {
await authApi.changePassword({
current_password: currentPassword,
new_password: newPassword,
});
setCurrentPassword('');
setNewPassword('');
setConfirmPassword('');
toast.success('Password changed successfully');
} catch (err: any) {
toast.error(err.response?.data?.detail || 'Failed to change password');
} finally {
setSavingPassword(false);
}
};
const handleLogout = async () => {
setLoggingOut(true);
try {
await authApi.logout();
} catch (err) {
// Ignore logout errors
}
logout();
toast.success('Logged out successfully');
router.push('/login');
};
const tabs = [
{ id: 'profile', label: 'Profile', icon: User },
{ id: 'security', label: 'Security', icon: Shield },
{ id: 'notifications', label: 'Notifications', icon: Bell },
{ id: 'preferences', label: 'Preferences', icon: Palette },
{ id: 'api-keys', label: 'API Keys', icon: Key },
];
return (
<div className="max-w-4xl mx-auto space-y-8">
{/* Header */}
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-forge-yellow/10 rounded-lg flex items-center justify-center">
<Settings className="w-6 h-6 text-forge-yellow" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">Settings</h1>
<p className="text-gray-500">Manage your account and preferences</p>
</div>
</div>
{/* Tabs */}
<div className="flex gap-2 border-b border-gray-800 pb-2">
{tabs.map((tab) => (
<button
key={tab.id}
onClick={() => setActiveTab(tab.id)}
className={`flex items-center gap-2 px-4 py-2 rounded-lg transition-colors ${
activeTab === tab.id
? 'bg-forge-yellow/10 text-forge-yellow'
: 'text-gray-400 hover:text-white hover:bg-forge-gray'
}`}
>
<tab.icon className="w-4 h-4" />
{tab.label}
</button>
))}
</div>
{/* Content */}
<div className="bg-forge-dark rounded-xl border border-gray-800 p-6">
{activeTab === 'profile' && (
<div className="space-y-6">
<h2 className="text-lg font-semibold text-white">Profile Settings</h2>
<div className="space-y-4">
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Display Name
</label>
<input
type="text"
value={name}
onChange={(e) => setName(e.target.value)}
className="input-field"
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Email Address
</label>
<input
type="email"
value={email}
disabled
className="input-field opacity-50 cursor-not-allowed"
/>
<p className="text-xs text-gray-500 mt-1">
Email cannot be changed. Contact support for assistance.
</p>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Role
</label>
<input
type="text"
value={user?.role || 'user'}
disabled
className="input-field opacity-50 cursor-not-allowed capitalize"
/>
</div>
</div>
<button
onClick={handleSaveProfile}
disabled={loading}
className="btn-primary flex items-center gap-2"
>
<Save className="w-4 h-4" />
{loading ? 'Saving...' : 'Save Changes'}
</button>
</div>
)}
{activeTab === 'security' && (
<div className="space-y-6">
<h2 className="text-lg font-semibold text-white">Security Settings</h2>
{/* Change Password */}
<div className="space-y-4">
<h3 className="text-white font-medium">Change Password</h3>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Current Password
</label>
<input
type="password"
value={currentPassword}
onChange={(e) => setCurrentPassword(e.target.value)}
placeholder="Enter current password"
className="input-field"
autoComplete="current-password"
/>
</div>
<div className="grid grid-cols-1 md:grid-cols-2 gap-4">
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
New Password
</label>
<input
type="password"
value={newPassword}
onChange={(e) => setNewPassword(e.target.value)}
placeholder="Enter new password"
className="input-field"
autoComplete="new-password"
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Confirm New Password
</label>
<input
type="password"
value={confirmPassword}
onChange={(e) => setConfirmPassword(e.target.value)}
placeholder="Confirm new password"
className="input-field"
autoComplete="new-password"
/>
</div>
</div>
<button
onClick={handleChangePassword}
disabled={savingPassword}
className="btn-primary flex items-center gap-2"
>
{savingPassword ? (
<Loader2 className="w-4 h-4 animate-spin" />
) : (
<Shield className="w-4 h-4" />
)}
{savingPassword ? 'Changing...' : 'Change Password'}
</button>
</div>
{/* Sign Out */}
<div className="pt-6 border-t border-gray-800">
<div className="flex items-center justify-between">
<div>
<h3 className="text-white font-medium">Sign Out</h3>
<p className="text-sm text-gray-500">
Sign out of your account on this device
</p>
</div>
<button
onClick={handleLogout}
disabled={loggingOut}
className="px-4 py-2 bg-red-600 text-white rounded-lg hover:bg-red-700 transition-colors flex items-center gap-2 disabled:opacity-50"
>
{loggingOut ? (
<Loader2 className="w-4 h-4 animate-spin" />
) : (
<LogOut className="w-4 h-4" />
)}
Sign Out
</button>
</div>
</div>
</div>
)}
{activeTab === 'notifications' && (
<div className="space-y-6">
<h2 className="text-lg font-semibold text-white">Notification Settings</h2>
<div className="space-y-4">
<div className="flex items-center justify-between p-4 bg-forge-gray rounded-lg">
<div>
<p className="text-white font-medium">Email Notifications</p>
<p className="text-sm text-gray-500">
Receive email updates about your account
</p>
</div>
<label className="relative inline-flex items-center cursor-pointer">
<input
type="checkbox"
checked={emailNotifications}
onChange={(e) => setEmailNotifications(e.target.checked)}
className="sr-only peer"
/>
<div className="w-11 h-6 bg-gray-600 peer-focus:outline-none rounded-full peer peer-checked:after:translate-x-full peer-checked:after:border-white after:content-[''] after:absolute after:top-[2px] after:left-[2px] after:bg-white after:rounded-full after:h-5 after:w-5 after:transition-all peer-checked:bg-forge-yellow"></div>
</label>
</div>
<div className="flex items-center justify-between p-4 bg-forge-gray rounded-lg">
<div>
<p className="text-white font-medium">Job Completion Alerts</p>
<p className="text-sm text-gray-500">
Get notified when your jobs complete
</p>
</div>
<label className="relative inline-flex items-center cursor-pointer">
<input
type="checkbox"
checked={jobCompletionAlerts}
onChange={(e) => setJobCompletionAlerts(e.target.checked)}
className="sr-only peer"
/>
<div className="w-11 h-6 bg-gray-600 peer-focus:outline-none rounded-full peer peer-checked:after:translate-x-full peer-checked:after:border-white after:content-[''] after:absolute after:top-[2px] after:left-[2px] after:bg-white after:rounded-full after:h-5 after:w-5 after:transition-all peer-checked:bg-forge-yellow"></div>
</label>
</div>
</div>
</div>
)}
{activeTab === 'preferences' && (
<div className="space-y-6">
<h2 className="text-lg font-semibold text-white">Default Preferences</h2>
<div className="space-y-4">
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Default Image Provider
</label>
<select
value={defaultImageProvider}
onChange={(e) => setDefaultImageProvider(e.target.value)}
className="select-field"
>
<option value="openai">OpenAI DALL-E</option>
<option value="stability">Stability AI</option>
<option value="leonardo">Leonardo AI</option>
<option value="ideogram">Ideogram</option>
<option value="flux">Flux</option>
</select>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Default Video Provider
</label>
<select
value={defaultVideoProvider}
onChange={(e) => setDefaultVideoProvider(e.target.value)}
className="select-field"
>
<option value="runway">Runway Gen-3</option>
<option value="veo">Google Veo</option>
</select>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Default Voice (Text-to-Speech)
</label>
<select
value={defaultVoice}
onChange={(e) => setDefaultVoice(e.target.value)}
className="select-field"
>
<option value="">Select a default voice...</option>
<option value="21m00Tcm4TlvDq8ikWAM">Rachel</option>
<option value="AZnzlk1XvdvUeBnXmlld">Domi</option>
<option value="EXAVITQu4vr4xnSDxMaL">Bella</option>
</select>
</div>
</div>
<button className="btn-primary flex items-center gap-2">
<Save className="w-4 h-4" />
Save Preferences
</button>
</div>
)}
{activeTab === 'api-keys' && (
<div className="space-y-6">
<h2 className="text-lg font-semibold text-white">API Keys</h2>
<p className="text-gray-400">
Manage your personal API keys for third-party services.
</p>
<div className="bg-forge-gray rounded-lg p-4">
<p className="text-gray-400 text-sm">
API key management is handled at the organization level. Contact your
administrator to update API keys.
</p>
</div>
</div>
)}
</div>
</div>
);
}


@ -0,0 +1,191 @@
'use client';
import { useState } from 'react';
import { useRouter } from 'next/navigation';
import Link from 'next/link';
import { toast } from 'react-hot-toast';
import { UserPlus, Eye, EyeOff, Loader2, Check, X } from 'lucide-react';
import { authApi } from '@/lib/api';
import { useStore } from '@/lib/store';
export default function SignUpPage() {
const router = useRouter();
const { setUser, setToken } = useStore();
const [email, setEmail] = useState('');
const [displayName, setDisplayName] = useState('');
const [password, setPassword] = useState('');
const [confirmPassword, setConfirmPassword] = useState('');
const [showPassword, setShowPassword] = useState(false);
const [loading, setLoading] = useState(false);
const passwordRequirements = [
{ label: 'At least 8 characters', met: password.length >= 8 },
{ label: 'Contains a number', met: /\d/.test(password) },
{ label: 'Contains a letter', met: /[a-zA-Z]/.test(password) },
{ label: 'Passwords match', met: password === confirmPassword && password.length > 0 },
];
// Require every displayed password requirement, not just length and match
const isValid =
email.trim().length > 0 &&
displayName.trim().length > 0 &&
passwordRequirements.every((req) => req.met);
const handleSignUp = async (e: React.FormEvent) => {
e.preventDefault();
if (!isValid) {
toast.error('Please fill in all fields correctly');
return;
}
setLoading(true);
try {
const response = await authApi.signup({
email,
password,
display_name: displayName,
});
const { user } = response.data;
// Store user data and set token marker (actual auth via cookie)
setUser({
id: user.id,
email: user.email,
name: user.display_name || user.email,
role: user.role,
avatar_url: user.avatar_url,
});
setToken('cookie-auth'); // Marker to indicate authenticated
toast.success('Account created successfully!');
router.push('/');
} catch (err: any) {
toast.error(err.response?.data?.detail || 'Failed to create account');
} finally {
setLoading(false);
}
};
return (
<div className="min-h-screen bg-forge-gray flex items-center justify-center p-4">
<div className="w-full max-w-md">
{/* Logo */}
<div className="text-center mb-8">
<div className="inline-flex items-center justify-center w-16 h-16 bg-forge-yellow rounded-xl mb-4">
<span className="text-2xl font-bold text-black">F</span>
</div>
<h1 className="text-2xl font-bold text-white">Create Account</h1>
<p className="text-gray-500 mt-2">Join FORGE AI and start creating</p>
</div>
{/* Sign Up Form */}
<form onSubmit={handleSignUp} className="bg-forge-dark rounded-xl border border-gray-800 p-8 space-y-6">
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Full Name
</label>
<input
type="text"
value={displayName}
onChange={(e) => setDisplayName(e.target.value)}
placeholder="Your name"
className="input-field"
autoFocus
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Email Address
</label>
<input
type="email"
value={email}
onChange={(e) => setEmail(e.target.value)}
placeholder="you@example.com"
className="input-field"
autoComplete="email"
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Password
</label>
<div className="relative">
<input
type={showPassword ? 'text' : 'password'}
value={password}
onChange={(e) => setPassword(e.target.value)}
placeholder="Create a password"
className="input-field pr-10"
autoComplete="new-password"
/>
<button
type="button"
onClick={() => setShowPassword(!showPassword)}
className="absolute right-3 top-1/2 -translate-y-1/2 text-gray-500 hover:text-gray-300"
>
{showPassword ? <EyeOff className="w-4 h-4" /> : <Eye className="w-4 h-4" />}
</button>
</div>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Confirm Password
</label>
<input
type="password"
value={confirmPassword}
onChange={(e) => setConfirmPassword(e.target.value)}
placeholder="Confirm your password"
className="input-field"
autoComplete="new-password"
/>
</div>
{/* Password Requirements */}
{password.length > 0 && (
<div className="space-y-2">
{passwordRequirements.map((req) => (
<div key={req.label} className="flex items-center gap-2 text-sm">
{req.met ? (
<Check className="w-4 h-4 text-green-500" />
) : (
<X className="w-4 h-4 text-gray-500" />
)}
<span className={req.met ? 'text-green-500' : 'text-gray-500'}>
{req.label}
</span>
</div>
))}
</div>
)}
<button
type="submit"
disabled={loading || !isValid}
className="btn-primary w-full flex items-center justify-center gap-2 disabled:opacity-50 disabled:cursor-not-allowed"
>
{loading ? (
<Loader2 className="w-5 h-5 animate-spin" />
) : (
<UserPlus className="w-5 h-5" />
)}
{loading ? 'Creating account...' : 'Create Account'}
</button>
<div className="text-center text-sm text-gray-500">
Already have an account?{' '}
<Link href="/login" className="text-forge-yellow hover:text-yellow-400">
Sign in
</Link>
</div>
</form>
</div>
</div>
);
}


@ -0,0 +1,254 @@
'use client';
import { useState } from 'react';
import { toast } from 'react-hot-toast';
import { FileText, Copy, Check, Sparkles } from 'lucide-react';
import FileUpload from '@/components/FileUpload';
import JobProgress from '@/components/JobProgress';
import { modulesApi, assetsApi } from '@/lib/api';
import { useStore } from '@/lib/store';
export default function AltTextPage() {
const { addJob, updateJob } = useStore();
const [file, setFile] = useState<File | null>(null);
const [assetId, setAssetId] = useState<string | null>(null);
const [jobId, setJobId] = useState<string | null>(null);
const [results, setResults] = useState<any>(null);
const [loading, setLoading] = useState(false);
const [uploading, setUploading] = useState(false);
const [copied, setCopied] = useState<string | null>(null);
const handleFileUpload = async (uploadedFile: File) => {
setFile(uploadedFile);
setUploading(true);
try {
const response = await assetsApi.upload(uploadedFile);
setAssetId(response.data.id);
toast.success('Image uploaded!');
} catch (err) {
toast.error('Failed to upload image');
setFile(null);
} finally {
setUploading(false);
}
};
const handleGenerate = async () => {
if (!assetId) {
toast.error('Please upload an image first');
return;
}
setLoading(true);
setResults(null);
try {
const response = await modulesApi.generateAltText({
asset_id: assetId,
});
const job = response.data;
setJobId(job.id);
addJob({
id: job.id,
module: 'alt_text_generator',
status: job.status,
progress: job.progress,
created_at: job.created_at,
});
toast.success('Alt text generation started!');
} catch (err: any) {
toast.error(err.response?.data?.detail || 'Failed to start generation');
setLoading(false);
}
};
const handleJobComplete = async (job: any) => {
setLoading(false);
updateJob(job.id, { status: 'completed', progress: 100 });
if (job.output_data) {
setResults({
shortAlt: job.output_data.short_alt_text,
longAlt: job.output_data.long_alt_text,
raw: job.output_data.raw_response,
});
toast.success('Alt text generated!');
}
};
const handleJobError = (error: string) => {
setLoading(false);
toast.error(error);
};
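const copyToClipboard = async (text: string, field: string) => {
try {
// writeText returns a promise that rejects if clipboard access is denied
await navigator.clipboard.writeText(text);
setCopied(field);
toast.success('Copied to clipboard!');
setTimeout(() => setCopied(null), 2000);
} catch (err) {
toast.error('Failed to copy to clipboard');
}
};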
return (
<div className="max-w-6xl mx-auto space-y-8">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-forge-yellow/10 rounded-lg flex items-center justify-center">
<FileText className="w-6 h-6 text-forge-yellow" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">Alt Text Generator</h1>
<p className="text-gray-500">Generate accessible alt text for images using GPT-4 Vision</p>
</div>
</div>
<div className="grid grid-cols-1 lg:grid-cols-2 gap-8">
{/* Upload Section */}
<div className="space-y-6">
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Upload Image
</label>
<FileUpload
onUpload={handleFileUpload}
accept={{ 'image/*': ['.png', '.jpg', '.jpeg', '.webp', '.gif'] }}
currentFile={file}
onClear={() => {
setFile(null);
setAssetId(null);
setResults(null);
}}
label="Upload an image to analyze"
/>
{uploading && (
<p className="mt-2 text-sm text-forge-yellow">Uploading...</p>
)}
</div>
{/* Preview */}
{assetId && (
<div className="bg-forge-dark rounded-xl border border-gray-800 overflow-hidden">
<img
src={`/api/v1/assets/${assetId}/download`}
alt="Preview"
className="w-full max-h-96 object-contain"
/>
</div>
)}
{/* Generate Button */}
<button
onClick={handleGenerate}
disabled={loading || !assetId || uploading}
className="btn-primary w-full flex items-center justify-center gap-2 disabled:opacity-50 disabled:cursor-not-allowed"
>
<Sparkles className="w-5 h-5" />
{loading ? 'Generating...' : 'Generate Alt Text'}
</button>
{/* Job Progress */}
{jobId && loading && (
<JobProgress
jobId={jobId}
onComplete={handleJobComplete}
onError={handleJobError}
/>
)}
</div>
{/* Results Section */}
<div>
<h2 className="text-lg font-semibold text-white mb-4">Generated Alt Text</h2>
{results ? (
<div className="space-y-4">
{/* Short Alt */}
<div className="bg-forge-dark rounded-xl border border-gray-800 p-4">
<div className="flex items-center justify-between mb-2">
<h3 className="text-white font-medium">
Short Version
<span className="text-gray-500 text-sm ml-2">(150 chars max)</span>
</h3>
<button
onClick={() => copyToClipboard(results.shortAlt, 'short')}
className="p-2 text-gray-400 hover:text-forge-yellow transition-colors"
>
{copied === 'short' ? (
<Check className="w-4 h-4" />
) : (
<Copy className="w-4 h-4" />
)}
</button>
</div>
<p className="text-gray-300">{results.shortAlt}</p>
<p className="text-xs text-gray-500 mt-2">
{results.shortAlt?.length || 0} characters
</p>
</div>
{/* Long Alt */}
<div className="bg-forge-dark rounded-xl border border-gray-800 p-4">
<div className="flex items-center justify-between mb-2">
<h3 className="text-white font-medium">
Long Version
<span className="text-gray-500 text-sm ml-2">(400 chars max)</span>
</h3>
<button
onClick={() => copyToClipboard(results.longAlt, 'long')}
className="p-2 text-gray-400 hover:text-forge-yellow transition-colors"
>
{copied === 'long' ? (
<Check className="w-4 h-4" />
) : (
<Copy className="w-4 h-4" />
)}
</button>
</div>
<p className="text-gray-300">{results.longAlt}</p>
<p className="text-xs text-gray-500 mt-2">
{results.longAlt?.length || 0} characters
</p>
</div>
{/* HTML Snippets */}
<div className="bg-forge-gray rounded-xl p-4">
<h3 className="text-white font-medium mb-3">Ready-to-use HTML</h3>
<div className="space-y-3">
<div>
<p className="text-xs text-gray-500 mb-1">Short version:</p>
<code className="block bg-forge-dark p-2 rounded text-sm text-gray-300 overflow-x-auto">
{`<img src="image.jpg" alt="${(results.shortAlt ?? '').replace(/&/g, '&amp;').replace(/"/g, '&quot;')}" />`}
</code>
</div>
<div>
<p className="text-xs text-gray-500 mb-1">Long version (with aria):</p>
<code className="block bg-forge-dark p-2 rounded text-sm text-gray-300 overflow-x-auto">
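{`<img src="image.jpg" alt="${(results.shortAlt ?? '').replace(/&/g, '&amp;').replace(/"/g, '&quot;')}" aria-describedby="desc" />\n<p id="desc" class="sr-only">${(results.longAlt ?? '').replace(/&/g, '&amp;').replace(/</g, '&lt;')}</p>`}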
</code>
</div>
</div>
</div>
</div>
) : (
<div className="bg-forge-dark rounded-xl border border-gray-800 p-8 text-center">
<FileText className="w-12 h-12 text-gray-600 mx-auto mb-3" />
<p className="text-gray-500">Alt text will appear here</p>
</div>
)}
</div>
</div>
{/* Accessibility Tips */}
<div className="bg-forge-gray rounded-xl p-6">
<h3 className="text-white font-medium mb-3">Alt Text Best Practices</h3>
<ul className="space-y-2 text-gray-400 text-sm">
<li> Use the short version for simple images in context</li>
<li> Use the long version for complex images or when detail matters</li>
<li> Alt text should describe the purpose, not just the appearance</li>
<li> Avoid redundant phrases like "image of" or "picture of"</li>
<li> Include text visible in the image when relevant</li>
</ul>
</div>
</div>
);
}


@ -0,0 +1,194 @@
'use client';
import { useState } from 'react';
import { toast } from 'react-hot-toast';
import { Wand2, Copy, Check, Sparkles, RefreshCw } from 'lucide-react';
import { modulesApi } from '@/lib/api';
const styles = [
{ value: 'cinematic', label: 'Cinematic', description: 'Movie-like scenes with dramatic lighting' },
{ value: 'photographic', label: 'Photographic', description: 'Professional photography style' },
{ value: 'artistic', label: 'Artistic', description: 'Painterly with rich colors' },
{ value: 'product', label: 'Product', description: 'Commercial product photography' },
{ value: 'fantasy', label: 'Fantasy', description: 'Magical and otherworldly' },
{ value: 'minimal', label: 'Minimal', description: 'Clean and simple' },
{ value: 'vintage', label: 'Vintage', description: 'Nostalgic retro aesthetics' },
{ value: 'futuristic', label: 'Futuristic', description: 'Sci-fi and modern tech' },
];
export default function PromptStudioPage() {
const [prompt, setPrompt] = useState('');
const [style, setStyle] = useState('cinematic');
const [enhancedPrompt, setEnhancedPrompt] = useState('');
const [negativePrompt, setNegativePrompt] = useState('');
const [loading, setLoading] = useState(false);
const [copied, setCopied] = useState<string | null>(null);
const handleEnhance = async () => {
if (!prompt.trim()) {
toast.error('Please enter a prompt');
return;
}
setLoading(true);
try {
const response = await modulesApi.enhancePrompt({
prompt,
style,
});
setEnhancedPrompt(response.data.enhanced_prompt);
setNegativePrompt(response.data.negative_prompt);
toast.success('Prompt enhanced!');
} catch (err: any) {
toast.error(err.response?.data?.detail || 'Failed to enhance prompt');
} finally {
setLoading(false);
}
};
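const copyToClipboard = async (text: string, field: string) => {
try {
// writeText is async and can reject (e.g. permission denied)
await navigator.clipboard.writeText(text);
setCopied(field);
toast.success('Copied to clipboard!');
setTimeout(() => setCopied(null), 2000);
} catch {
toast.error('Failed to copy to clipboard');
}
};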
const useEnhancedPrompt = () => {
setPrompt(enhancedPrompt);
toast.success('Enhanced prompt moved to input');
};
return (
<div className="max-w-4xl mx-auto space-y-8">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-forge-yellow/10 rounded-lg flex items-center justify-center">
<Wand2 className="w-6 h-6 text-forge-yellow" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">Prompt Studio</h1>
<p className="text-gray-500">Enhance your prompts with AI assistance</p>
</div>
</div>
{/* Input Section */}
<div className="bg-forge-dark rounded-xl border border-gray-800 p-6 space-y-6">
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Your Prompt
</label>
<textarea
value={prompt}
onChange={(e) => setPrompt(e.target.value)}
placeholder="Enter a basic prompt to enhance..."
className="input-field min-h-[120px] resize-none"
/>
</div>
{/* Style Selection */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-3">
Enhancement Style
</label>
<div className="grid grid-cols-2 md:grid-cols-4 gap-3">
{styles.map((s) => (
<button
key={s.value}
onClick={() => setStyle(s.value)}
className={`p-3 rounded-lg text-left transition-all ${
style === s.value
? 'bg-forge-yellow/10 border-forge-yellow text-white border'
: 'bg-forge-gray border border-gray-700 text-gray-400 hover:border-gray-600'
}`}
>
<p className="font-medium text-sm">{s.label}</p>
<p className="text-xs mt-1 opacity-70">{s.description}</p>
</button>
))}
</div>
</div>
<button
onClick={handleEnhance}
disabled={loading || !prompt.trim()}
className="btn-primary w-full flex items-center justify-center gap-2 disabled:opacity-50 disabled:cursor-not-allowed"
>
{loading ? (
<>
<RefreshCw className="w-5 h-5 animate-spin" />
Enhancing...
</>
) : (
<>
<Sparkles className="w-5 h-5" />
Enhance Prompt
</>
)}
</button>
</div>
{/* Results Section */}
{(enhancedPrompt || negativePrompt) && (
<div className="space-y-6">
{/* Enhanced Prompt */}
<div className="bg-forge-dark rounded-xl border border-gray-800 p-6">
<div className="flex items-center justify-between mb-3">
<h3 className="text-white font-medium">Enhanced Prompt</h3>
<div className="flex items-center gap-2">
<button
onClick={useEnhancedPrompt}
className="text-sm text-forge-yellow hover:text-yellow-400 flex items-center gap-1"
>
<RefreshCw className="w-3 h-3" />
Use as input
</button>
<button
onClick={() => copyToClipboard(enhancedPrompt, 'enhanced')}
className="p-2 text-gray-400 hover:text-forge-yellow transition-colors"
>
{copied === 'enhanced' ? (
<Check className="w-4 h-4" />
) : (
<Copy className="w-4 h-4" />
)}
</button>
</div>
</div>
<p className="text-gray-300 whitespace-pre-wrap">{enhancedPrompt}</p>
</div>
{/* Negative Prompt */}
<div className="bg-forge-dark rounded-xl border border-gray-800 p-6">
<div className="flex items-center justify-between mb-3">
<h3 className="text-white font-medium">Negative Prompt</h3>
<button
onClick={() => copyToClipboard(negativePrompt, 'negative')}
className="p-2 text-gray-400 hover:text-forge-yellow transition-colors"
>
{copied === 'negative' ? (
<Check className="w-4 h-4" />
) : (
<Copy className="w-4 h-4" />
)}
</button>
</div>
<p className="text-gray-300 whitespace-pre-wrap">{negativePrompt}</p>
</div>
</div>
)}
{/* Tips */}
<div className="bg-forge-gray rounded-xl p-6">
<h3 className="text-white font-medium mb-3">Tips for Better Prompts</h3>
<ul className="space-y-2 text-gray-400 text-sm">
<li> Be specific about subjects, actions, and settings</li>
<li> Include mood, lighting, and atmosphere details</li>
<li> Mention art styles or artist references if desired</li>
<li> Use the negative prompt to exclude unwanted elements</li>
<li> Iterate by using enhanced prompts as new input</li>
</ul>
</div>
</div>
);
}


@ -0,0 +1,624 @@
'use client';
import { useState } from 'react';
import { toast } from 'react-hot-toast';
import { Film, Download, Sparkles, FolderOpen, ChevronDown, ChevronUp, X } from 'lucide-react';
import FileUpload from '@/components/FileUpload';
import JobProgress from '@/components/JobProgress';
import AssetLibrary from '@/components/AssetLibrary';
import { modulesApi, assetsApi } from '@/lib/api';
import { useStore } from '@/lib/store';
const providers = [
{
id: 'runway',
name: 'Runway',
models: [
{ id: 'gen3_alpha_turbo', name: 'Gen-3 Alpha Turbo', description: '7x faster, half cost' },
{ id: 'gen3_alpha', name: 'Gen-3 Alpha', description: 'High quality, full features' },
{ id: 'gen4', name: 'Gen-4', description: 'Latest, highest fidelity' },
],
features: ['camera_control', 'frame_position'],
},
{
id: 'veo',
name: 'Google Veo',
models: [
{ id: 'veo-3.1-generate-preview', name: 'Veo 3.1', description: 'High quality with audio' },
{ id: 'veo-3.1-fast', name: 'Veo 3.1 Fast', description: 'Faster generation' },
],
features: ['first_last_frame', 'reference_images'],
},
];
const durations = [
{ value: 5, label: '5s' },
{ value: 8, label: '8s' },
{ value: 10, label: '10s' },
];
const resolutions = [
{ value: '1280x768', label: '1280x768 (Landscape)' },
{ value: '768x1280', label: '768x1280 (Portrait)' },
{ value: '1920x1080', label: '1920x1080 (Full HD)' },
];
export default function VideoGeneratePage() {
const { addJob, updateJob } = useStore();
// Basic state
const [mode, setMode] = useState<'text' | 'image'>('text');
const [prompt, setPrompt] = useState('');
const [provider, setProvider] = useState('runway');
const [model, setModel] = useState('gen3_alpha_turbo');
const [duration, setDuration] = useState(5);
const [resolution, setResolution] = useState('1280x768');
// Input image state
const [file, setFile] = useState<File | null>(null);
const [assetId, setAssetId] = useState<string | null>(null);
const [uploading, setUploading] = useState(false);
// Runway camera control
const [showCameraControl, setShowCameraControl] = useState(false);
const [cameraControl, setCameraControl] = useState({
pan: 0,
tilt: 0,
zoom: 0,
roll: 0,
static: false,
});
const [framePosition, setFramePosition] = useState('first');
// Veo frame control
const [firstFrameAssetId, setFirstFrameAssetId] = useState<string | null>(null);
const [lastFrameAssetId, setLastFrameAssetId] = useState<string | null>(null);
const [referenceAssetIds, setReferenceAssetIds] = useState<string[]>([]);
const [firstFramePreview, setFirstFramePreview] = useState<string | null>(null);
const [lastFramePreview, setLastFramePreview] = useState<string | null>(null);
// Asset library modal
const [showAssetLibrary, setShowAssetLibrary] = useState(false);
const [assetSelectTarget, setAssetSelectTarget] = useState<'input' | 'first' | 'last' | 'reference'>('input');
// Results
const [jobId, setJobId] = useState<string | null>(null);
const [generatedVideo, setGeneratedVideo] = useState<any>(null);
const [loading, setLoading] = useState(false);
const selectedProvider = providers.find((p) => p.id === provider);
const selectedModel = selectedProvider?.models.find((m) => m.id === model);
const handleFileUpload = async (uploadedFile: File) => {
setFile(uploadedFile);
setUploading(true);
try {
const response = await assetsApi.upload(uploadedFile);
setAssetId(response.data.id);
toast.success('Image uploaded!');
} catch (err) {
toast.error('Failed to upload image');
setFile(null);
} finally {
setUploading(false);
}
};
const handleAssetSelect = (asset: any) => {
const thumbnailUrl = asset.thumbnail_url
? `${process.env.NEXT_PUBLIC_API_URL}${asset.thumbnail_url}`
: null;
switch (assetSelectTarget) {
case 'input':
setAssetId(asset.id);
setFile(null);
break;
case 'first':
setFirstFrameAssetId(asset.id);
setFirstFramePreview(thumbnailUrl);
break;
case 'last':
setLastFrameAssetId(asset.id);
setLastFramePreview(thumbnailUrl);
break;
case 'reference':
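// Cap at 4 references and skip duplicates (ids are also used as React keys)
if (referenceAssetIds.length < 4 && !referenceAssetIds.includes(asset.id)) {
setReferenceAssetIds([...referenceAssetIds, asset.id]);
}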
break;
}
setShowAssetLibrary(false);
};
const openAssetLibrary = (target: 'input' | 'first' | 'last' | 'reference') => {
setAssetSelectTarget(target);
setShowAssetLibrary(true);
};
const handleGenerate = async () => {
if (mode === 'text' && !prompt.trim()) {
toast.error('Please enter a prompt');
return;
}
if (mode === 'image' && !assetId) {
toast.error('Please upload or select an image');
return;
}
setLoading(true);
setGeneratedVideo(null);
try {
const payload: any = {
prompt: prompt || undefined,
provider,
model,
duration,
resolution,
// Only the portrait resolution maps to 9:16; every other option is landscape
aspect_ratio: resolution === '768x1280' ? '9:16' : '16:9',
};
// Add input image for image-to-video mode
if (mode === 'image' && assetId) {
payload.image_asset_id = assetId;
}
// Runway-specific options
if (provider === 'runway') {
payload.frame_position = framePosition;
if (!cameraControl.static && (cameraControl.pan || cameraControl.tilt || cameraControl.zoom || cameraControl.roll)) {
payload.camera_control = {
pan: cameraControl.pan,
tilt: cameraControl.tilt,
zoom: cameraControl.zoom,
roll: cameraControl.roll,
};
} else if (cameraControl.static) {
payload.camera_control = { static: true };
}
}
// Veo-specific options
if (provider === 'veo') {
if (firstFrameAssetId) {
payload.first_frame_asset_id = firstFrameAssetId;
}
if (lastFrameAssetId) {
payload.last_frame_asset_id = lastFrameAssetId;
}
if (referenceAssetIds.length > 0) {
payload.reference_asset_ids = referenceAssetIds;
}
}
const response = await modulesApi.generateVideo(payload);
const job = response.data;
setJobId(job.id);
addJob({
id: job.id,
module: 'video_generation',
status: job.status,
progress: job.progress,
created_at: job.created_at,
});
toast.success('Video generation started!');
} catch (err: any) {
toast.error(err.response?.data?.detail || 'Failed to start generation');
setLoading(false);
}
};
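const handleJobComplete = async (job: any) => {
setLoading(false);
updateJob(job.id, { status: 'completed', progress: 100 });
const outputId = job.output_asset_ids?.[0];
if (!outputId) {
toast.error('Job completed but returned no video');
return;
}
try {
// Fetching the asset can fail independently of the job itself
const asset = await assetsApi.get(outputId);
setGeneratedVideo(asset.data);
toast.success('Video generated successfully!');
} catch (err) {
toast.error('Failed to load generated video');
}
};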
const handleJobError = (error: string) => {
setLoading(false);
toast.error(error);
};
const handleDownload = async () => {
if (!generatedVideo) return;
try {
const response = await assetsApi.download(generatedVideo.id);
const url = window.URL.createObjectURL(response.data);
const a = document.createElement('a');
a.href = url;
a.download = generatedVideo.original_filename;
a.click();
window.URL.revokeObjectURL(url);
} catch (err) {
toast.error('Failed to download video');
}
};
return (
<div className="max-w-6xl mx-auto space-y-8">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-forge-yellow/10 rounded-lg flex items-center justify-center">
<Film className="w-6 h-6 text-forge-yellow" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">Video Generator</h1>
<p className="text-gray-500">Create videos with Runway Gen-3/4 and Google Veo 3.1</p>
</div>
</div>
<div className="grid grid-cols-1 lg:grid-cols-2 gap-8">
{/* Controls */}
<div className="space-y-6">
{/* Mode Toggle */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">Generation Mode</label>
<div className="flex gap-2">
<button
onClick={() => setMode('text')}
className={`flex-1 px-4 py-3 rounded-lg font-medium transition-colors ${
mode === 'text'
? 'bg-forge-yellow text-black'
: 'bg-forge-dark border border-gray-700 text-gray-300 hover:border-gray-600'
}`}
>
Text to Video
</button>
<button
onClick={() => setMode('image')}
className={`flex-1 px-4 py-3 rounded-lg font-medium transition-colors ${
mode === 'image'
? 'bg-forge-yellow text-black'
: 'bg-forge-dark border border-gray-700 text-gray-300 hover:border-gray-600'
}`}
>
Image to Video
</button>
</div>
</div>
{/* Prompt */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
{mode === 'text' ? 'Prompt' : 'Motion Prompt (Optional)'}
</label>
<textarea
value={prompt}
onChange={(e) => setPrompt(e.target.value)}
placeholder={mode === 'text' ? 'Describe the video you want to create...' : 'Describe how the image should animate...'}
className="input-field min-h-[100px] resize-none"
/>
</div>
{/* Image Upload (for image mode) */}
{mode === 'image' && (
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">Source Image</label>
<div className="flex gap-2 mb-2">
<button
onClick={() => openAssetLibrary('input')}
className="flex items-center gap-2 px-4 py-2 bg-forge-gray border border-gray-700 rounded-lg text-gray-300 hover:border-forge-yellow transition-colors"
>
<FolderOpen className="w-4 h-4" />
My Files
</button>
</div>
<FileUpload
onUpload={handleFileUpload}
accept={{ 'image/*': ['.png', '.jpg', '.jpeg', '.webp'] }}
currentFile={file}
onClear={() => {
setFile(null);
setAssetId(null);
}}
label="Or upload a new image"
/>
{uploading && <p className="mt-2 text-sm text-forge-yellow">Uploading...</p>}
{assetId && !file && (
<p className="mt-2 text-sm text-green-400">Using selected asset from library</p>
)}
</div>
)}
{/* Provider & Model */}
<div className="grid grid-cols-2 gap-4">
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">Provider</label>
<select
value={provider}
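onChange={(e) => {
setProvider(e.target.value);
const p = providers.find((pr) => pr.id === e.target.value);
if (p) setModel(p.models[0].id);
// Veo only offers durations up to 8s; clamp so state matches the visible options
if (e.target.value === 'veo' && duration > 8) setDuration(8);
}}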
className="select-field"
>
{providers.map((p) => (
<option key={p.id} value={p.id}>
{p.name}
</option>
))}
</select>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">Model</label>
<select value={model} onChange={(e) => setModel(e.target.value)} className="select-field">
{selectedProvider?.models.map((m) => (
<option key={m.id} value={m.id}>
{m.name}
</option>
))}
</select>
</div>
</div>
{selectedModel && (
<p className="text-xs text-gray-500 -mt-4">{selectedModel.description}</p>
)}
{/* Duration & Resolution */}
<div className="grid grid-cols-2 gap-4">
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">Duration</label>
<select
value={duration}
onChange={(e) => setDuration(parseInt(e.target.value, 10))}
className="select-field"
>
{durations
.filter((d) => provider === 'veo' ? d.value <= 8 : true)
.map((d) => (
<option key={d.value} value={d.value}>
{d.label}
</option>
))}
</select>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">Resolution</label>
<select
value={resolution}
onChange={(e) => setResolution(e.target.value)}
className="select-field"
>
{resolutions.map((r) => (
<option key={r.value} value={r.value}>
{r.label}
</option>
))}
</select>
</div>
</div>
{/* Runway Camera Control */}
{provider === 'runway' && (
<div className="bg-forge-gray rounded-lg p-4">
<button
onClick={() => setShowCameraControl(!showCameraControl)}
className="flex items-center justify-between w-full text-left"
>
<span className="text-sm font-medium text-gray-300">Camera Control</span>
{showCameraControl ? (
<ChevronUp className="w-4 h-4 text-gray-400" />
) : (
<ChevronDown className="w-4 h-4 text-gray-400" />
)}
</button>
{showCameraControl && (
<div className="mt-4 space-y-4">
<label className="flex items-center gap-2">
<input
type="checkbox"
checked={cameraControl.static}
onChange={(e) => setCameraControl({ ...cameraControl, static: e.target.checked })}
className="rounded border-gray-700 bg-forge-dark text-forge-yellow focus:ring-forge-yellow"
/>
<span className="text-sm text-gray-300">Static (Reduce camera motion)</span>
</label>
{!cameraControl.static && (
<div className="grid grid-cols-2 gap-4">
{['pan', 'tilt', 'zoom', 'roll'].map((control) => (
<div key={control}>
<label className="block text-xs text-gray-500 mb-1 capitalize">
{control}: {cameraControl[control as keyof typeof cameraControl]}
</label>
<input
type="range"
min="-10"
max="10"
value={cameraControl[control as keyof typeof cameraControl] as number}
onChange={(e) =>
setCameraControl({ ...cameraControl, [control]: parseInt(e.target.value, 10) })
}
className="w-full accent-forge-yellow"
/>
</div>
))}
</div>
)}
{mode === 'image' && (
<div>
<label className="block text-sm text-gray-300 mb-2">Frame Position</label>
<select
value={framePosition}
onChange={(e) => setFramePosition(e.target.value)}
className="select-field"
>
<option value="first">First Frame</option>
<option value="middle">Middle Frame</option>
<option value="last">Last Frame</option>
</select>
</div>
)}
</div>
)}
</div>
)}
{/* Veo Frame Control */}
{provider === 'veo' && (
<div className="bg-forge-gray rounded-lg p-4 space-y-4">
<h3 className="text-sm font-medium text-gray-300">Frame Control (Veo 3.1)</h3>
<div className="grid grid-cols-2 gap-4">
<div>
<label className="block text-xs text-gray-500 mb-2">First Frame</label>
<div className="relative">
{/* The clear button is a sibling, not a child: nesting <button> inside <button> is invalid HTML */}
<button
type="button"
onClick={() => openAssetLibrary('first')}
className="w-full aspect-video bg-forge-dark border border-gray-700 rounded-lg flex items-center justify-center hover:border-forge-yellow transition-colors overflow-hidden"
>
{firstFramePreview ? (
<img src={firstFramePreview} alt="First frame" className="w-full h-full object-cover" />
) : (
<span className="text-xs text-gray-500">Select from My Files</span>
)}
</button>
{firstFramePreview && (
<button
type="button"
aria-label="Clear first frame"
onClick={() => {
setFirstFrameAssetId(null);
setFirstFramePreview(null);
}}
className="absolute top-1 right-1 p-1 bg-black/50 rounded"
>
<X className="w-3 h-3 text-white" />
</button>
)}
</div>
</div>
<div>
<label className="block text-xs text-gray-500 mb-2">Last Frame</label>
<div className="relative">
{/* The clear button is a sibling, not a child: nesting <button> inside <button> is invalid HTML */}
<button
type="button"
onClick={() => openAssetLibrary('last')}
className="w-full aspect-video bg-forge-dark border border-gray-700 rounded-lg flex items-center justify-center hover:border-forge-yellow transition-colors overflow-hidden"
>
{lastFramePreview ? (
<img src={lastFramePreview} alt="Last frame" className="w-full h-full object-cover" />
) : (
<span className="text-xs text-gray-500">Select from My Files</span>
)}
</button>
{lastFramePreview && (
<button
type="button"
aria-label="Clear last frame"
onClick={() => {
setLastFrameAssetId(null);
setLastFramePreview(null);
}}
className="absolute top-1 right-1 p-1 bg-black/50 rounded"
>
<X className="w-3 h-3 text-white" />
</button>
)}
</div>
</div>
</div>
<div>
<label className="block text-xs text-gray-500 mb-2">
Reference Images ({referenceAssetIds.length}/4) - Character/Style consistency
</label>
<button
onClick={() => openAssetLibrary('reference')}
disabled={referenceAssetIds.length >= 4}
className="px-3 py-2 bg-forge-dark border border-gray-700 rounded-lg text-sm text-gray-300 hover:border-forge-yellow transition-colors disabled:opacity-50"
>
<FolderOpen className="w-4 h-4 inline mr-2" />
Add Reference Image
</button>
{referenceAssetIds.length > 0 && (
<div className="mt-2 flex gap-2">
{referenceAssetIds.map((id, i) => (
<button
key={id}
onClick={() => setReferenceAssetIds(referenceAssetIds.filter((_, idx) => idx !== i))}
className="px-2 py-1 bg-forge-dark border border-gray-700 rounded text-xs text-gray-400 hover:border-red-500 hover:text-red-400"
>
Ref {i + 1} &times;
</button>
))}
</div>
)}
</div>
</div>
)}
{/* Generate Button */}
<button
onClick={handleGenerate}
disabled={loading || (mode === 'text' ? !prompt.trim() : !assetId) || uploading}
className="btn-primary w-full flex items-center justify-center gap-2 disabled:opacity-50 disabled:cursor-not-allowed"
>
<Sparkles className="w-5 h-5" />
{loading ? 'Generating...' : 'Generate Video'}
</button>
{/* Job Progress */}
{jobId && loading && (
<JobProgress jobId={jobId} onComplete={handleJobComplete} onError={handleJobError} />
)}
</div>
{/* Results */}
<div>
<h2 className="text-lg font-semibold text-white mb-4">Generated Video</h2>
{generatedVideo ? (
<div className="bg-forge-dark rounded-xl overflow-hidden border border-gray-800">
<video
src={`${process.env.NEXT_PUBLIC_API_URL}/api/v1/assets/${generatedVideo.id}/download`}
controls
className="w-full"
/>
<div className="p-4 border-t border-gray-800">
<div className="flex items-center justify-between">
<div>
<p className="text-white font-medium">{generatedVideo.original_filename}</p>
<p className="text-sm text-gray-500">
{generatedVideo.duration_seconds}s {generatedVideo.width}x{generatedVideo.height}
</p>
</div>
<button onClick={handleDownload} className="btn-primary flex items-center gap-2">
<Download className="w-4 h-4" />
Download
</button>
</div>
</div>
</div>
) : (
<div className="bg-forge-dark rounded-xl border border-gray-800 aspect-video flex items-center justify-center">
<p className="text-gray-500">Generated video will appear here</p>
</div>
)}
</div>
</div>
{/* Asset Library Modal */}
<AssetLibrary
isOpen={showAssetLibrary}
onClose={() => setShowAssetLibrary(false)}
onSelect={handleAssetSelect}
fileTypes={['image']}
title={
assetSelectTarget === 'input'
? 'Select Source Image'
: assetSelectTarget === 'first'
? 'Select First Frame'
: assetSelectTarget === 'last'
? 'Select Last Frame'
: 'Select Reference Image'
}
/>
</div>
);
}


@ -0,0 +1,306 @@
'use client';
import { useState } from 'react';
import { toast } from 'react-hot-toast';
import { Captions, Download, Sparkles } from 'lucide-react';
import FileUpload from '@/components/FileUpload';
import JobProgress from '@/components/JobProgress';
import { modulesApi, assetsApi } from '@/lib/api';
import { useStore } from '@/lib/store';
const languages = [
{ value: '', label: 'Auto-detect' },
{ value: 'en', label: 'English' },
{ value: 'es', label: 'Spanish' },
{ value: 'fr', label: 'French' },
{ value: 'de', label: 'German' },
{ value: 'it', label: 'Italian' },
{ value: 'pt', label: 'Portuguese' },
{ value: 'ja', label: 'Japanese' },
{ value: 'ko', label: 'Korean' },
{ value: 'zh', label: 'Chinese' },
];
const targetLanguages = [
{ value: '', label: 'No translation' },
{ value: 'EN-US', label: 'English (US)' },
{ value: 'EN-GB', label: 'English (UK)' },
{ value: 'ES', label: 'Spanish' },
{ value: 'FR', label: 'French' },
{ value: 'DE', label: 'German' },
{ value: 'IT', label: 'Italian' },
{ value: 'PT-BR', label: 'Portuguese (Brazil)' },
{ value: 'JA', label: 'Japanese' },
{ value: 'KO', label: 'Korean' },
{ value: 'ZH', label: 'Chinese' },
];
export default function SubtitlesPage() {
const { addJob, updateJob } = useStore();
const [file, setFile] = useState<File | null>(null);
const [assetId, setAssetId] = useState<string | null>(null);
const [sourceLanguage, setSourceLanguage] = useState('');
const [targetLanguage, setTargetLanguage] = useState('');
const [burnSubtitles, setBurnSubtitles] = useState(false);
const [jobId, setJobId] = useState<string | null>(null);
const [results, setResults] = useState<any>(null);
const [loading, setLoading] = useState(false);
const [uploading, setUploading] = useState(false);
const handleFileUpload = async (uploadedFile: File) => {
setFile(uploadedFile);
setUploading(true);
try {
const response = await assetsApi.upload(uploadedFile);
setAssetId(response.data.id);
toast.success('Video uploaded!');
} catch (err) {
toast.error('Failed to upload video');
setFile(null);
} finally {
setUploading(false);
}
};
const handleProcess = async () => {
if (!assetId) {
toast.error('Please upload a video first');
return;
}
setLoading(true);
setResults(null);
try {
const response = await modulesApi.processSubtitles({
asset_id: assetId,
source_language: sourceLanguage || undefined,
target_language: targetLanguage || undefined,
burn_subtitles: burnSubtitles,
});
const job = response.data;
setJobId(job.id);
addJob({
id: job.id,
module: 'subtitle_processor',
status: job.status,
progress: job.progress,
created_at: job.created_at,
});
toast.success('Subtitle processing started!');
} catch (err: any) {
toast.error(err.response?.data?.detail || 'Failed to start processing');
setLoading(false);
}
};
const handleJobComplete = async (job: any) => {
setLoading(false);
updateJob(job.id, { status: 'completed', progress: 100 });
if (!job.output_asset_ids?.length) {
toast.error('Job completed but returned no files');
return;
}
try {
// Load every output asset; a single failed fetch rejects the whole batch
const assets = await Promise.all(
job.output_asset_ids.map(async (id: string) => {
const asset = await assetsApi.get(id);
return asset.data;
})
);
setResults({
assets,
transcript: job.output_data?.transcript,
language: job.output_data?.language,
});
toast.success('Subtitles generated successfully!');
} catch (err) {
toast.error('Failed to load generated subtitles');
}
};
const handleJobError = (error: string) => {
setLoading(false);
toast.error(error);
};
const handleDownload = async (asset: any) => {
try {
const response = await assetsApi.download(asset.id);
const url = window.URL.createObjectURL(response.data);
const a = document.createElement('a');
a.href = url;
a.download = asset.original_filename;
a.click();
window.URL.revokeObjectURL(url);
} catch (err) {
toast.error('Failed to download file');
}
};
return (
<div className="max-w-6xl mx-auto space-y-8">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-forge-yellow/10 rounded-lg flex items-center justify-center">
<Captions className="w-6 h-6 text-forge-yellow" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">Subtitle Generator</h1>
<p className="text-gray-500">Auto-generate and translate subtitles</p>
</div>
</div>
<div className="grid grid-cols-1 lg:grid-cols-2 gap-8">
{/* Controls */}
<div className="space-y-6">
{/* File Upload */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Upload Video
</label>
<FileUpload
onUpload={handleFileUpload}
accept={{ 'video/*': ['.mp4', '.mov', '.avi', '.webm'] }}
currentFile={file}
onClear={() => {
setFile(null);
setAssetId(null);
}}
label="Upload a video for transcription"
/>
{uploading && (
<p className="mt-2 text-sm text-forge-yellow">Uploading...</p>
)}
</div>
{/* Languages */}
<div className="grid grid-cols-2 gap-4">
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Source Language
</label>
<select
value={sourceLanguage}
onChange={(e) => setSourceLanguage(e.target.value)}
className="select-field"
>
{languages.map((lang) => (
<option key={lang.value} value={lang.value}>
{lang.label}
</option>
))}
</select>
</div>
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Translate To
</label>
<select
value={targetLanguage}
onChange={(e) => setTargetLanguage(e.target.value)}
className="select-field"
>
{targetLanguages.map((lang) => (
<option key={lang.value} value={lang.value}>
{lang.label}
</option>
))}
</select>
</div>
</div>
{/* Burn Subtitles */}
<div className="flex items-center gap-3">
<input
type="checkbox"
id="burnSubtitles"
checked={burnSubtitles}
onChange={(e) => setBurnSubtitles(e.target.checked)}
className="w-4 h-4 rounded border-gray-600 bg-forge-dark text-forge-yellow focus:ring-forge-yellow"
/>
<label htmlFor="burnSubtitles" className="text-gray-300">
Burn subtitles into video (hardcoded)
</label>
</div>
{/* Process Button */}
<button
onClick={handleProcess}
disabled={loading || !assetId || uploading}
className="btn-primary w-full flex items-center justify-center gap-2 disabled:opacity-50 disabled:cursor-not-allowed"
>
<Sparkles className="w-5 h-5" />
{loading ? 'Processing...' : 'Generate Subtitles'}
</button>
{/* Job Progress */}
{jobId && loading && (
<JobProgress
jobId={jobId}
onComplete={handleJobComplete}
onError={handleJobError}
/>
)}
</div>
{/* Results */}
<div>
<h2 className="text-lg font-semibold text-white mb-4">Results</h2>
{results ? (
<div className="space-y-4">
{/* Generated Files */}
<div className="bg-forge-dark rounded-xl border border-gray-800 p-4">
<h3 className="text-white font-medium mb-3">Generated Files</h3>
<div className="space-y-2">
{results.assets.map((asset: any) => (
<div
key={asset.id}
className="flex items-center justify-between p-3 bg-forge-gray rounded-lg"
>
<div>
<p className="text-white text-sm">{asset.original_filename}</p>
<p className="text-xs text-gray-500">
{asset.metadata?.type === 'translated' ? 'Translated' : 'Original'}{' · '}
{asset.file_type}
</p>
</div>
<button
onClick={() => handleDownload(asset)}
className="p-2 text-forge-yellow hover:bg-forge-yellow/10 rounded transition-colors"
>
<Download className="w-4 h-4" />
</button>
</div>
))}
</div>
</div>
{/* Transcript Preview */}
{results.transcript && (
<div className="bg-forge-dark rounded-xl border border-gray-800 p-4">
<h3 className="text-white font-medium mb-3">
Transcript Preview
{results.language && (
<span className="text-gray-500 text-sm ml-2">
(Detected: {results.language})
</span>
)}
</h3>
<div className="max-h-64 overflow-y-auto">
<p className="text-gray-300 text-sm whitespace-pre-wrap">
{results.transcript.substring(0, 1000)}
{results.transcript.length > 1000 && '...'}
</p>
</div>
</div>
)}
</div>
) : (
<div className="bg-forge-dark rounded-xl border border-gray-800 p-8 text-center">
<Captions className="w-12 h-12 text-gray-600 mx-auto mb-3" />
<p className="text-gray-500">Subtitles will appear here</p>
</div>
)}
</div>
</div>
</div>
);
}
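
The transcript preview above cuts the text with `substring(0, 1000)`, which can split a word or a surrogate pair in the middle. A minimal sketch of a gentler truncation follows; `previewText` and its 80% word-boundary threshold are hypothetical and not part of the original page.

```typescript
// Hypothetical helper (assumption): truncates a transcript for preview display.
function previewText(text: string, maxChars = 1000): string {
  // Iterate by code point so emoji and other surrogate pairs stay intact.
  const chars = Array.from(text);
  if (chars.length <= maxChars) return text;

  const head = chars.slice(0, maxChars).join('');
  // Prefer cutting at the last space, unless that would discard
  // more than roughly 20% of the preview.
  const lastSpace = head.lastIndexOf(' ');
  const cut = lastSpace > maxChars * 0.8 ? head.slice(0, lastSpace) : head;
  return `${cut.trimEnd()}...`;
}
```

In the component this could replace the two JSX expressions with a single `{previewText(results.transcript)}`.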


@ -0,0 +1,277 @@
'use client';
import { useState, useEffect } from 'react';
import { toast } from 'react-hot-toast';
import { Maximize, Download, Sparkles } from 'lucide-react';
import FileUpload from '@/components/FileUpload';
import JobProgress from '@/components/JobProgress';
import { modulesApi, assetsApi } from '@/lib/api';
import { useStore } from '@/lib/store';
const scaleOptions = [
{ value: 2, label: '2x' },
{ value: 4, label: '4x' },
];
const modelOptions = [
{ value: 'standard', label: 'Standard' },
{ value: 'high-quality', label: 'High Quality' },
{ value: 'fast', label: 'Fast' },
];
export default function VideoUpscalePage() {
const { addJob, updateJob } = useStore();
const [mounted, setMounted] = useState(false);
const [file, setFile] = useState<File | null>(null);
const [assetId, setAssetId] = useState<string | null>(null);
const [scale, setScale] = useState(2);
const [model, setModel] = useState('standard');
const [denoiseStrength, setDenoiseStrength] = useState(0.3);
const [jobId, setJobId] = useState<string | null>(null);
const [upscaledVideo, setUpscaledVideo] = useState<any>(null);
const [loading, setLoading] = useState(false);
const [uploading, setUploading] = useState(false);
useEffect(() => {
setMounted(true);
}, []);
if (!mounted) {
return null;
}
const handleFileUpload = async (uploadedFile: File) => {
setFile(uploadedFile);
setUploading(true);
try {
const response = await assetsApi.upload(uploadedFile);
setAssetId(response.data.id);
toast.success('Video uploaded!');
} catch (err) {
toast.error('Failed to upload video');
setFile(null);
} finally {
setUploading(false);
}
};
const handleUpscale = async () => {
if (!assetId) {
toast.error('Please upload a video first');
return;
}
setLoading(true);
setUpscaledVideo(null);
try {
const response = await modulesApi.upscaleVideo({
asset_id: assetId,
scale,
model,
denoise_strength: denoiseStrength,
});
const job = response.data;
setJobId(job.id);
addJob({
id: job.id,
module: 'video_upscaler', // matches the MODULE_LABELS / MODULE_ICONS keys used by JobTracker
status: job.status,
progress: job.progress,
created_at: job.created_at,
});
toast.success('Video upscaling started!');
} catch (err: any) {
toast.error(err.response?.data?.detail || 'Failed to start upscaling');
setLoading(false);
}
};
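const handleJobComplete = async (job: any) => {
setLoading(false);
updateJob(job.id, { status: 'completed', progress: 100 });
const outputId = job.output_asset_ids?.[0];
if (!outputId) {
// The job finished without reporting an output asset; surface that instead of failing silently
toast.error('Upscaling finished but no output video was returned');
return;
}
try {
const asset = await assetsApi.get(outputId);
setUpscaledVideo(asset.data);
toast.success('Video upscaled successfully!');
} catch (err) {
toast.error('Failed to load the upscaled video');
}
};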
const handleJobError = (error: string) => {
setLoading(false);
toast.error(error);
};
const handleDownload = async () => {
if (!upscaledVideo) return;
try {
const response = await assetsApi.download(upscaledVideo.id);
const url = window.URL.createObjectURL(response.data);
const a = document.createElement('a');
a.href = url;
a.download = upscaledVideo.original_filename;
a.click();
window.URL.revokeObjectURL(url);
} catch (err) {
toast.error('Failed to download video');
}
};
return (
<div className="max-w-6xl mx-auto space-y-8">
<div className="flex items-center gap-4">
<div className="w-12 h-12 bg-forge-yellow/10 rounded-lg flex items-center justify-center">
<Maximize className="w-6 h-6 text-forge-yellow" />
</div>
<div>
<h1 className="text-2xl font-bold text-white">Video Upscaler</h1>
<p className="text-gray-500">Enhance video resolution with Topaz Labs AI</p>
</div>
</div>
<div className="grid grid-cols-1 lg:grid-cols-2 gap-8">
{/* Controls */}
<div className="space-y-6">
{/* File Upload */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Upload Video
</label>
<FileUpload
onUpload={handleFileUpload}
accept={{ 'video/*': ['.mp4', '.mov', '.avi', '.webm'] }}
currentFile={file}
onClear={() => {
setFile(null);
setAssetId(null);
}}
label="Upload a video to upscale"
/>
{uploading && (
<p className="mt-2 text-sm text-forge-yellow">Uploading...</p>
)}
</div>
{/* Scale */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Scale Factor
</label>
<div className="flex gap-2">
{scaleOptions.map((option) => (
<button
key={option.value}
onClick={() => setScale(option.value)}
className={`px-6 py-3 rounded-lg font-medium transition-colors ${
scale === option.value
? 'bg-forge-yellow text-black'
: 'bg-forge-dark border border-gray-700 text-gray-300 hover:border-gray-600'
}`}
>
{option.label}
</button>
))}
</div>
</div>
{/* Model */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Upscaling Model
</label>
<select
value={model}
onChange={(e) => setModel(e.target.value)}
className="select-field"
>
{modelOptions.map((option) => (
<option key={option.value} value={option.value}>
{option.label}
</option>
))}
</select>
</div>
{/* Denoise */}
<div>
<label className="block text-sm font-medium text-gray-300 mb-2">
Denoise Strength: {denoiseStrength.toFixed(1)}
</label>
<input
type="range"
min={0}
max={1}
step={0.1}
value={denoiseStrength}
onChange={(e) => setDenoiseStrength(parseFloat(e.target.value))}
className="w-full accent-forge-yellow"
/>
</div>
{/* Upscale Button */}
<button
onClick={handleUpscale}
disabled={loading || !assetId || uploading}
className="btn-primary w-full flex items-center justify-center gap-2 disabled:opacity-50 disabled:cursor-not-allowed"
>
<Sparkles className="w-5 h-5" />
{loading ? 'Upscaling...' : 'Upscale Video'}
</button>
{/* Job Progress */}
{jobId && loading && (
<JobProgress
jobId={jobId}
onComplete={handleJobComplete}
onError={handleJobError}
/>
)}
{/* Note about processing time */}
<p className="text-sm text-gray-500">
Note: Video upscaling can take several minutes depending on the video length and resolution.
</p>
</div>
{/* Results */}
<div>
<h2 className="text-lg font-semibold text-white mb-4">Result</h2>
{upscaledVideo ? (
<div className="bg-forge-dark rounded-xl overflow-hidden border border-gray-800">
<video
src={`/api/v1/assets/${upscaledVideo.id}/download`}
controls
className="w-full"
/>
<div className="p-4 border-t border-gray-800">
<div className="flex items-center justify-between">
<div>
<p className="text-white font-medium">{upscaledVideo.original_filename}</p>
<p className="text-sm text-gray-500">
{upscaledVideo.width}x{upscaledVideo.height}{' · '}{(upscaledVideo.file_size_bytes / 1024 / 1024).toFixed(1)} MB
</p>
</div>
<button
onClick={handleDownload}
className="btn-primary flex items-center gap-2"
>
<Download className="w-4 h-4" />
Download
</button>
</div>
</div>
</div>
) : (
<div className="bg-forge-dark rounded-xl border border-gray-800 aspect-video flex items-center justify-center">
<p className="text-gray-500">Upscaled video will appear here</p>
</div>
)}
</div>
</div>
</div>
);
}
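
The page sends `scale`, `model`, and `denoise_strength` to the API exactly as they sit in component state. The sketch below shows one way to validate that payload first. The field names mirror the request in `handleUpscale`; the helper names and the validation rules (scale limited to 2 or 4, denoise clamped to the slider range) are assumptions, not confirmed backend behavior.

```typescript
// Assumption: the backend accepts only the two scale factors the UI offers.
type UpscaleScale = 2 | 4;

interface UpscaleRequest {
  asset_id: string;
  scale: UpscaleScale;
  model: string;
  denoise_strength: number;
}

function isUpscaleScale(value: number): value is UpscaleScale {
  return value === 2 || value === 4;
}

// Hypothetical helper: builds the request body and rejects obviously bad input.
function buildUpscaleRequest(
  assetId: string,
  scale: number,
  model: string,
  denoise: number
): UpscaleRequest {
  if (!assetId) throw new Error('asset_id is required');
  if (!isUpscaleScale(scale)) throw new Error(`unsupported scale: ${scale}`);
  // Clamp to the slider range, then round to one decimal to drop
  // floating-point noise such as 0.30000000000000004.
  const clamped = Math.min(1, Math.max(0, denoise));
  return {
    asset_id: assetId,
    scale,
    model,
    denoise_strength: Math.round(clamped * 10) / 10,
  };
}
```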


@ -0,0 +1,59 @@
'use client';
import { useEffect, useState } from 'react';
import { useRouter } from 'next/navigation';
import { useStore } from '@/lib/store';
import { isAdmin } from '@/lib/auth';
import { ShieldX, Loader2 } from 'lucide-react';
interface AdminGuardProps {
children: React.ReactNode;
fallback?: React.ReactNode;
}
export default function AdminGuard({ children, fallback }: AdminGuardProps) {
const { user } = useStore();
const router = useRouter();
const [checking, setChecking] = useState(true);
useEffect(() => {
// Small delay to allow store to hydrate
const timer = setTimeout(() => {
setChecking(false);
}, 100);
return () => clearTimeout(timer);
}, []);
if (checking) {
return (
<div className="flex items-center justify-center h-64">
<Loader2 className="w-8 h-8 text-forge-yellow animate-spin" />
</div>
);
}
if (!isAdmin(user as any)) {
if (fallback) {
return <>{fallback}</>;
}
return (
<div className="flex flex-col items-center justify-center h-64 text-center">
<ShieldX className="w-16 h-16 text-red-400 mb-4" />
<h2 className="text-xl font-bold text-white mb-2">Access Denied</h2>
<p className="text-gray-400 mb-6">
You don&apos;t have permission to access this area.
</p>
<button
onClick={() => router.push('/')}
className="btn-secondary"
>
Go to Dashboard
</button>
</div>
);
}
return <>{children}</>;
}


@ -0,0 +1,31 @@
'use client';
import { usePathname } from 'next/navigation';
import Sidebar from '@/components/Sidebar';
import Header from '@/components/Header';
// Pages that should not have the app shell (sidebar/header)
const FULL_SCREEN_PAGES = ['/login', '/signup'];
export default function AppShell({ children }: { children: React.ReactNode }) {
const pathname = usePathname();
// Check if current page should be full screen (no sidebar/header)
const isFullScreen = FULL_SCREEN_PAGES.includes(pathname);
if (isFullScreen) {
return <>{children}</>;
}
return (
<div className="flex h-screen">
<Sidebar />
<div className="flex-1 flex flex-col overflow-hidden">
<Header />
<main className="flex-1 overflow-y-auto p-6">
{children}
</main>
</div>
</div>
);
}
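
`FULL_SCREEN_PAGES.includes(pathname)` only matches exact strings, so `/login/` or a nested route such as `/login/reset` would still render the sidebar and header. A minimal sketch of a more tolerant check follows; `isFullScreenPath` is a hypothetical helper, and treating nested routes as full screen is an assumption about the intended behavior.

```typescript
const FULL_SCREEN_PREFIXES = ['/login', '/signup'];

// Hypothetical helper (assumption): true when the path is one of the
// full-screen pages or a route nested beneath one of them.
function isFullScreenPath(pathname: string | null): boolean {
  if (!pathname) return false;
  // Drop trailing slashes so '/login/' is treated like '/login'.
  const normalized = pathname.length > 1 ? pathname.replace(/\/+$/, '') : pathname;
  return FULL_SCREEN_PREFIXES.some(
    (prefix) => normalized === prefix || normalized.startsWith(`${prefix}/`)
  );
}
```

Matching on `prefix + '/'` rather than a bare `startsWith(prefix)` keeps unrelated routes such as `/loginx` inside the app shell.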


@ -0,0 +1,287 @@
'use client';
import { useState, useEffect } from 'react';
import api from '@/lib/api';
import { clsx } from 'clsx';
import {
X,
Search,
Image as ImageIcon,
Video,
Mic,
FileText,
Check,
Loader2,
FolderOpen,
} from 'lucide-react';
interface Asset {
id: string;
filename: string;
file_type: string;
mime_type: string;
width?: number;
height?: number;
thumbnail_url: string | null;
file_url: string;
created_at: string;
source_module?: string;
}
interface AssetLibraryProps {
isOpen: boolean;
onClose: () => void;
onSelect: (asset: Asset) => void;
fileTypes?: string[]; // ['image', 'video', 'audio']
title?: string;
multiple?: boolean;
}
const FILE_TYPE_ICONS = {
image: ImageIcon,
video: Video,
audio: Mic,
document: FileText,
};
export default function AssetLibrary({
isOpen,
onClose,
onSelect,
fileTypes = ['image', 'video', 'audio'],
title = 'Select from My Files',
multiple = false,
}: AssetLibraryProps) {
const [assets, setAssets] = useState<Asset[]>([]);
const [loading, setLoading] = useState(false);
const [search, setSearch] = useState('');
const [selectedType, setSelectedType] = useState<string | null>(null);
const [page, setPage] = useState(1);
const [totalPages, setTotalPages] = useState(1);
const [selectedAssets, setSelectedAssets] = useState<Set<string>>(new Set());
useEffect(() => {
if (isOpen) {
loadAssets();
}
}, [isOpen, search, selectedType, page]);
const loadAssets = async () => {
setLoading(true);
try {
const types = selectedType ? selectedType : fileTypes.join(',');
const response = await api.get('/assets/library', {
params: {
file_types: types,
search: search || undefined,
page,
limit: 20,
},
});
setAssets(response.data.items);
setTotalPages(response.data.pages);
} catch (error) {
console.error('Failed to load assets:', error);
} finally {
setLoading(false);
}
};
const handleSelect = (asset: Asset) => {
if (multiple) {
const newSelected = new Set(selectedAssets);
if (newSelected.has(asset.id)) {
newSelected.delete(asset.id);
} else {
newSelected.add(asset.id);
}
setSelectedAssets(newSelected);
} else {
onSelect(asset);
onClose();
}
};
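const handleConfirmMultiple = () => {
// selectedAssets stores ids only, so just the assets on the currently
// loaded page can be resolved back to full objects here
const selected = assets.filter(a => selectedAssets.has(a.id));
selected.forEach(asset => onSelect(asset));
// Reset so the next time the library opens it starts with nothing selected
setSelectedAssets(new Set());
onClose();
};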
if (!isOpen) return null;
return (
<div className="fixed inset-0 z-50 flex items-center justify-center bg-black/70 backdrop-blur-sm">
<div className="bg-forge-dark border border-gray-800 rounded-xl w-full max-w-4xl max-h-[80vh] flex flex-col">
{/* Header */}
<div className="p-4 border-b border-gray-800 flex items-center justify-between">
<div className="flex items-center gap-3">
<FolderOpen className="w-5 h-5 text-forge-yellow" />
<h2 className="text-lg font-semibold text-white">{title}</h2>
</div>
<button
onClick={onClose}
className="p-2 text-gray-400 hover:text-white transition-colors"
>
<X className="w-5 h-5" />
</button>
</div>
{/* Filters */}
<div className="p-4 border-b border-gray-800 flex items-center gap-4">
{/* Search */}
<div className="relative flex-1">
<Search className="absolute left-3 top-1/2 -translate-y-1/2 w-4 h-4 text-gray-500" />
<input
type="text"
placeholder="Search files..."
value={search}
onChange={(e) => {
setSearch(e.target.value);
setPage(1);
}}
className="w-full pl-10 pr-4 py-2 bg-forge-gray border border-gray-700 rounded-lg text-white placeholder-gray-500 focus:border-forge-yellow focus:outline-none"
/>
</div>
{/* Type filters */}
<div className="flex items-center gap-2">
<button
onClick={() => {
setSelectedType(null);
setPage(1);
}}
className={clsx(
'px-3 py-2 rounded-lg text-sm font-medium transition-colors',
!selectedType
? 'bg-forge-yellow text-black'
: 'bg-forge-gray text-gray-400 hover:text-white'
)}
>
All
</button>
{fileTypes.map((type) => {
const Icon = FILE_TYPE_ICONS[type as keyof typeof FILE_TYPE_ICONS] || FileText;
return (
<button
key={type}
onClick={() => {
setSelectedType(type);
setPage(1);
}}
className={clsx(
'px-3 py-2 rounded-lg text-sm font-medium transition-colors flex items-center gap-2',
selectedType === type
? 'bg-forge-yellow text-black'
: 'bg-forge-gray text-gray-400 hover:text-white'
)}
>
<Icon className="w-4 h-4" />
{type.charAt(0).toUpperCase() + type.slice(1)}
</button>
);
})}
</div>
</div>
{/* Asset Grid */}
<div className="flex-1 overflow-y-auto p-4">
{loading ? (
<div className="flex items-center justify-center h-48">
<Loader2 className="w-8 h-8 text-forge-yellow animate-spin" />
</div>
) : assets.length === 0 ? (
<div className="flex flex-col items-center justify-center h-48 text-gray-500">
<FolderOpen className="w-12 h-12 mb-3" />
<p>No files found</p>
<p className="text-sm">Upload some files or generate content to see them here</p>
</div>
) : (
<div className="grid grid-cols-4 md:grid-cols-5 lg:grid-cols-6 gap-3">
{assets.map((asset) => {
const isSelected = selectedAssets.has(asset.id);
const Icon = FILE_TYPE_ICONS[asset.file_type as keyof typeof FILE_TYPE_ICONS] || FileText;
return (
<button
key={asset.id}
onClick={() => handleSelect(asset)}
className={clsx(
'relative aspect-square rounded-lg overflow-hidden border-2 transition-all hover:scale-105',
isSelected
? 'border-forge-yellow'
: 'border-transparent hover:border-gray-600'
)}
>
{/* Thumbnail or Icon */}
{asset.thumbnail_url ? (
<img
src={`${process.env.NEXT_PUBLIC_API_URL ?? ''}${asset.thumbnail_url}`}
alt={asset.filename}
className="w-full h-full object-cover"
/>
) : (
<div className="w-full h-full bg-forge-gray flex items-center justify-center">
<Icon className="w-8 h-8 text-gray-500" />
</div>
)}
{/* Selection indicator */}
{multiple && isSelected && (
<div className="absolute top-2 right-2 w-6 h-6 bg-forge-yellow rounded-full flex items-center justify-center">
<Check className="w-4 h-4 text-black" />
</div>
)}
{/* Filename caption */}
<div className="absolute bottom-0 left-0 right-0 bg-black/70 px-2 py-1">
<p className="text-xs text-gray-300 truncate">{asset.filename}</p>
</div>
</button>
);
})}
</div>
)}
</div>
{/* Pagination */}
{totalPages > 1 && (
<div className="p-4 border-t border-gray-800 flex items-center justify-center gap-2">
<button
onClick={() => setPage(p => Math.max(1, p - 1))}
disabled={page === 1}
className="px-3 py-1 bg-forge-gray rounded text-sm text-gray-400 hover:text-white disabled:opacity-50"
>
Previous
</button>
<span className="text-sm text-gray-400">
Page {page} of {totalPages}
</span>
<button
onClick={() => setPage(p => Math.min(totalPages, p + 1))}
disabled={page === totalPages}
className="px-3 py-1 bg-forge-gray rounded text-sm text-gray-400 hover:text-white disabled:opacity-50"
>
Next
</button>
</div>
)}
{/* Footer for multiple selection */}
{multiple && selectedAssets.size > 0 && (
<div className="p-4 border-t border-gray-800 flex items-center justify-between">
<span className="text-sm text-gray-400">
{selectedAssets.size} file(s) selected
</span>
<button
onClick={handleConfirmMultiple}
className="px-4 py-2 bg-forge-yellow text-black font-medium rounded-lg hover:bg-forge-yellow/90"
>
Use Selected
</button>
</div>
)}
</div>
</div>
);
}
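
Because the component tracks selections as a `Set` of ids, an asset picked on page 1 can no longer be resolved once page 2 is loaded. One possible alternative, sketched below, keeps the full asset objects in a `Map` keyed by id. `toggleSelection` is a hypothetical helper, not code from the original component.

```typescript
// Hypothetical helper (assumption): returns a new Map with the asset toggled,
// leaving the input untouched so it can be used directly with React state.
function toggleSelection<T extends { id: string }>(
  selected: ReadonlyMap<string, T>,
  asset: T
): Map<string, T> {
  const next = new Map(selected);
  if (next.has(asset.id)) {
    next.delete(asset.id);
  } else {
    next.set(asset.id, asset);
  }
  return next;
}
```

With this shape, confirming a multi-selection could iterate `selected.values()` instead of filtering the current page's `assets` array.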


@ -0,0 +1,77 @@
'use client';
import { useEffect, useState } from 'react';
import { useRouter, usePathname } from 'next/navigation';
import { useStore } from '@/lib/store';
import { authApi } from '@/lib/api';
import { Loader2 } from 'lucide-react';
// Pages that don't require authentication
const PUBLIC_PAGES = ['/login', '/signup'];
export default function AuthProvider({ children }: { children: React.ReactNode }) {
const router = useRouter();
const pathname = usePathname();
const { user, token, setUser, setToken, logout } = useStore();
const [loading, setLoading] = useState(true);
useEffect(() => {
const initAuth = async () => {
// If on a public page, no need to check auth
if (PUBLIC_PAGES.includes(pathname)) {
setLoading(false);
return;
}
// Try to verify auth with the backend (uses cookie automatically)
try {
const response = await authApi.me();
if (response.data) {
const userData = {
id: response.data.id,
email: response.data.email,
name: response.data.display_name || response.data.email,
role: response.data.role,
avatar_url: response.data.avatar_url,
};
setUser(userData);
// Also set a dummy token for compatibility (actual auth is via cookie)
setToken('cookie-auth');
}
} catch (error) {
// Not authenticated, clear state and redirect
console.log('Not authenticated, redirecting to login');
logout();
router.push('/login');
}
setLoading(false);
};
initAuth();
}, [pathname]);
// Show loading spinner while checking auth
if (loading && !PUBLIC_PAGES.includes(pathname)) {
return (
<div className="min-h-screen bg-forge-gray flex items-center justify-center">
<div className="text-center">
<Loader2 className="w-8 h-8 text-forge-yellow animate-spin mx-auto mb-4" />
<p className="text-gray-500">Loading...</p>
</div>
</div>
);
}
// On public pages, just render children
if (PUBLIC_PAGES.includes(pathname)) {
return <>{children}</>;
}
// On protected pages, only render if we have a user
if (!user && !loading) {
return null;
}
return <>{children}</>;
}
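
The mapping from the `/me` response to the store's user object is written inline inside `initAuth`. A small pure function, sketched below, would make the `display_name` fallback testable on its own. The `MeResponse` and `StoreUser` shapes are inferred from the fields used above; their exact types are assumptions.

```typescript
// Assumed shape of the authApi.me() payload, inferred from the fields read above.
interface MeResponse {
  id: string;
  email: string;
  display_name?: string | null;
  role: string;
  avatar_url?: string | null;
}

// Assumed shape of the user object kept in the store.
interface StoreUser {
  id: string;
  email: string;
  name: string;
  role: string;
  avatar_url: string | null;
}

// Hypothetical helper: falls back to the email when no usable display name exists.
function toStoreUser(me: MeResponse): StoreUser {
  const displayName = me.display_name?.trim();
  return {
    id: me.id,
    email: me.email,
    name: displayName || me.email,
    role: me.role,
    avatar_url: me.avatar_url ?? null,
  };
}
```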


@ -0,0 +1,123 @@
'use client';
import { useCallback, useState } from 'react';
import { useDropzone } from 'react-dropzone';
import { Upload, X, FileImage, FileVideo, FileAudio, File } from 'lucide-react';
import { clsx } from 'clsx';
interface FileUploadProps {
onUpload: (file: File) => void;
accept?: Record<string, string[]>;
maxSize?: number;
label?: string;
currentFile?: File | null;
onClear?: () => void;
}
const fileIcons: Record<string, any> = {
image: FileImage,
video: FileVideo,
audio: FileAudio,
};
export default function FileUpload({
onUpload,
accept,
maxSize = 100 * 1024 * 1024, // 100MB default
label = 'Upload a file',
currentFile,
onClear,
}: FileUploadProps) {
const [error, setError] = useState<string | null>(null);
const onDrop = useCallback(
(acceptedFiles: File[], rejectedFiles: any[]) => {
setError(null);
if (rejectedFiles.length > 0) {
const rejection = rejectedFiles[0];
if (rejection.errors[0]?.code === 'file-too-large') {
setError(`File too large. Max size is ${Math.round(maxSize / 1024 / 1024)}MB`);
} else if (rejection.errors[0]?.code === 'file-invalid-type') {
setError('Invalid file type');
} else {
setError('File rejected');
}
return;
}
if (acceptedFiles.length > 0) {
onUpload(acceptedFiles[0]);
}
},
[onUpload, maxSize]
);
const { getRootProps, getInputProps, isDragActive } = useDropzone({
onDrop,
accept,
maxSize,
multiple: false,
});
const getFileIcon = (file: File) => {
const type = file.type.split('/')[0];
const Icon = fileIcons[type] || File;
return <Icon className="w-8 h-8 text-forge-yellow" />;
};
const formatFileSize = (bytes: number) => {
if (bytes < 1024) return `${bytes} B`;
if (bytes < 1024 * 1024) return `${(bytes / 1024).toFixed(1)} KB`;
return `${(bytes / 1024 / 1024).toFixed(1)} MB`;
};
if (currentFile) {
return (
<div className="bg-forge-dark rounded-xl p-4 border border-gray-700">
<div className="flex items-center gap-4">
{getFileIcon(currentFile)}
<div className="flex-1 min-w-0">
<p className="text-white font-medium truncate">{currentFile.name}</p>
<p className="text-sm text-gray-500">{formatFileSize(currentFile.size)}</p>
</div>
{onClear && (
<button
onClick={onClear}
className="p-2 text-gray-400 hover:text-red-400 transition-colors"
>
<X className="w-5 h-5" />
</button>
)}
</div>
</div>
);
}
return (
<div>
<div
{...getRootProps()}
className={clsx(
'upload-zone',
isDragActive && 'active'
)}
>
<input {...getInputProps()} />
<Upload className="w-12 h-12 text-gray-500 mx-auto mb-4" />
<p className="text-white font-medium mb-2">{label}</p>
<p className="text-gray-500 text-sm">
Drag and drop or <span className="text-forge-yellow">browse</span>
</p>
{accept && (
<p className="text-gray-600 text-xs mt-2">
Accepted: {Object.keys(accept).join(', ')}
</p>
)}
</div>
{error && (
<p className="mt-2 text-sm text-red-400">{error}</p>
)}
</div>
);
}
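
The rejection handling in `onDrop` inspects only the first error of the first rejected file. The sketch below pulls that mapping into a pure function that looks at every error code on a rejection. The codes `file-too-large`, `file-invalid-type`, and `too-many-files` are react-dropzone's documented error codes; `rejectionMessage` and the wording of the extra message are hypothetical.

```typescript
// Minimal structural type for the error entries react-dropzone attaches to a rejection.
interface RejectionError {
  code: string;
  message: string;
}

// Hypothetical helper (assumption): picks the most useful message for a rejected file.
function rejectionMessage(errors: RejectionError[], maxSizeBytes: number): string {
  const codes = new Set(errors.map((e) => e.code));
  if (codes.has('file-too-large')) {
    return `File too large. Max size is ${Math.round(maxSizeBytes / 1024 / 1024)}MB`;
  }
  if (codes.has('file-invalid-type')) return 'Invalid file type';
  if (codes.has('too-many-files')) return 'Only one file can be uploaded at a time';
  // Fall back to the library's own message when the code is not recognized.
  return errors[0]?.message || 'File rejected';
}
```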


@ -0,0 +1,99 @@
'use client';
import { useState } from 'react';
import { useRouter } from 'next/navigation';
import Link from 'next/link';
import { useStore } from '@/lib/store';
import { authApi } from '@/lib/api';
import { Bell, User, LogOut, ChevronDown, Settings } from 'lucide-react';
import JobTracker from './JobTracker';
export default function Header() {
const router = useRouter();
const { user, logout } = useStore();
const [showUserMenu, setShowUserMenu] = useState(false);
const handleLogout = async () => {
try {
await authApi.logout();
} catch (err) {
// Ignore errors
}
logout();
setShowUserMenu(false);
router.push('/login');
};
return (
<header className="h-16 bg-forge-dark border-b border-gray-800 flex items-center justify-between px-6">
{/* Breadcrumb / Title area */}
<div>
<h1 className="text-lg font-semibold text-white">Welcome to FORGE AI</h1>
<p className="text-sm text-gray-500">Creative tools powered by AI</p>
</div>
{/* Right side */}
<div className="flex items-center gap-4">
{/* Job Tracker */}
<JobTracker />
{/* Notifications */}
<button aria-label="Notifications" className="relative p-2 text-gray-400 hover:text-white transition-colors">
<Bell className="w-5 h-5" />
</button>
{/* User Menu */}
<div className="relative">
<button
onClick={() => setShowUserMenu(!showUserMenu)}
className="flex items-center gap-2 p-2 rounded-lg hover:bg-forge-gray transition-colors"
>
<div className="w-8 h-8 bg-forge-gray-light rounded-full flex items-center justify-center">
{user?.avatar_url ? (
<img
src={user.avatar_url}
alt={user.name}
className="w-8 h-8 rounded-full"
/>
) : (
<User className="w-4 h-4 text-gray-400" />
)}
</div>
<span className="text-sm text-gray-300">{user?.name || 'User'}</span>
<ChevronDown className="w-4 h-4 text-gray-500" />
</button>
{showUserMenu && (
<>
<div
className="fixed inset-0 z-10"
onClick={() => setShowUserMenu(false)}
/>
<div className="absolute right-0 top-full mt-2 w-48 bg-forge-gray border border-gray-700 rounded-lg shadow-xl z-20">
<div className="p-3 border-b border-gray-700">
<p className="text-sm font-medium text-white">{user?.name || 'User'}</p>
<p className="text-xs text-gray-500">{user?.email || ''}</p>
</div>
<Link
href="/settings"
onClick={() => setShowUserMenu(false)}
className="w-full flex items-center gap-2 px-3 py-2 text-sm text-gray-300 hover:bg-forge-dark transition-colors"
>
<Settings className="w-4 h-4" />
Settings
</Link>
<button
onClick={handleLogout}
className="w-full flex items-center gap-2 px-3 py-2 text-sm text-red-400 hover:bg-forge-dark transition-colors"
>
<LogOut className="w-4 h-4" />
Sign out
</button>
</div>
</>
)}
</div>
</div>
</header>
);
}


@ -0,0 +1,95 @@
'use client';
import { useEffect, useState } from 'react';
import { clsx } from 'clsx';
import { CheckCircle, XCircle, Loader2 } from 'lucide-react';
import { jobsApi } from '@/lib/api';
interface JobProgressProps {
jobId: string;
onComplete?: (job: any) => void;
onError?: (error: string) => void;
}
export default function JobProgress({ jobId, onComplete, onError }: JobProgressProps) {
const [job, setJob] = useState<any>(null);
const [polling, setPolling] = useState(true);
useEffect(() => {
if (!jobId || !polling) return;
// Set once the job reaches a terminal state or the effect is cleaned up, so a
// late or overlapping response cannot update state or fire a callback again
let cancelled = false;
const pollJob = async () => {
try {
const response = await jobsApi.get(jobId);
if (cancelled) return;
const jobData = response.data;
setJob(jobData);
if (jobData.status === 'completed') {
cancelled = true;
setPolling(false);
onComplete?.(jobData);
} else if (jobData.status === 'failed') {
cancelled = true;
setPolling(false);
onError?.(jobData.error_message || 'Job failed');
}
} catch (err) {
console.error('Failed to poll job:', err);
}
};
pollJob();
const interval = setInterval(pollJob, 2000);
return () => {
cancelled = true;
clearInterval(interval);
};
}, [jobId, polling, onComplete, onError]);
if (!job) {
return (
<div className="bg-forge-dark rounded-xl p-6 border border-gray-700">
<div className="flex items-center gap-3">
<Loader2 className="w-5 h-5 text-forge-yellow animate-spin" />
<span className="text-gray-300">Starting job...</span>
</div>
</div>
);
}
return (
<div className="bg-forge-dark rounded-xl p-6 border border-gray-700">
<div className="flex items-center justify-between mb-4">
<div className="flex items-center gap-3">
{job.status === 'completed' && (
<CheckCircle className="w-5 h-5 text-green-400" />
)}
{job.status === 'failed' && (
<XCircle className="w-5 h-5 text-red-400" />
)}
{(job.status === 'pending' || job.status === 'queued' || job.status === 'processing') && (
<Loader2 className="w-5 h-5 text-forge-yellow animate-spin" />
)}
<span className="text-white font-medium capitalize">{job.status}</span>
</div>
<span className="text-gray-500 text-sm">{job.progress}%</span>
</div>
<div className="progress-bar">
<div
className={clsx(
'progress-bar-fill',
job.status === 'failed' && 'bg-red-500'
)}
style={{ width: `${job.progress}%` }}
/>
</div>
{job.api_provider && (
<p className="mt-3 text-sm text-gray-500">
Using: {job.api_provider} {job.api_model && `(${job.api_model})`}
</p>
)}
{job.error_message && (
<p className="mt-3 text-sm text-red-400">{job.error_message}</p>
)}
</div>
);
}
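
The component polls with `setInterval`, which can start a new request before the previous one has returned. A minimal sketch of a sequential alternative follows, outside React and with the fetcher injected. `pollUntilTerminal`, `isTerminal`, and the `JobSnapshot` type are hypothetical names, and the status list is an assumption combining the values that appear in `JobProgress` and `JobTracker`.

```typescript
// Assumed status values, collected from the checks in JobProgress and JobTracker.
type JobStatus = 'pending' | 'queued' | 'processing' | 'completed' | 'failed';

interface JobSnapshot {
  id: string;
  status: JobStatus;
  progress: number;
  error_message?: string | null;
}

function isTerminal(status: JobStatus): boolean {
  return status === 'completed' || status === 'failed';
}

// Hypothetical helper: polls one request at a time until the job finishes
// or the caller aborts. Resolves to null when aborted.
async function pollUntilTerminal(
  fetchJob: () => Promise<JobSnapshot>,
  options: { intervalMs?: number; signal?: AbortSignal; onUpdate?: (job: JobSnapshot) => void } = {}
): Promise<JobSnapshot | null> {
  const { intervalMs = 2000, signal, onUpdate } = options;
  while (!signal?.aborted) {
    try {
      const job = await fetchJob();
      if (signal?.aborted) return null;
      onUpdate?.(job);
      if (isTerminal(job.status)) return job;
    } catch {
      // Treat a failed request as transient and try again after the delay.
    }
    // Wait only after the previous request has settled, so requests never overlap.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  return null;
}
```

Inside a `useEffect`, the caller could create an `AbortController`, pass its `signal`, and call `abort()` in the cleanup function.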


@ -0,0 +1,264 @@
'use client';
import { useState, useEffect, useCallback } from 'react';
import { useStore } from '@/lib/store';
import api from '@/lib/api';
import { toast } from 'react-hot-toast';
import {
Loader2,
CheckCircle2,
XCircle,
Clock,
ChevronDown,
ChevronUp,
Image as ImageIcon,
Video,
Mic,
FileText,
X,
} from 'lucide-react';
import { clsx } from 'clsx';
const MODULE_ICONS: Record<string, any> = {
image_generator: ImageIcon,
image_upscaler: ImageIcon,
background_remover: ImageIcon,
video_generator: Video,
video_upscaler: Video,
subtitle_processor: Video,
text_to_speech: Mic,
voice_to_text: Mic,
speech_to_speech: Mic,
alt_text_generator: FileText,
prompt_studio: FileText,
};
const MODULE_LABELS: Record<string, string> = {
image_generator: 'Image Generation',
image_upscaler: 'Image Upscale',
background_remover: 'BG Removal',
video_generator: 'Video Generation',
video_upscaler: 'Video Upscale',
subtitle_processor: 'Subtitles',
text_to_speech: 'Text to Speech',
voice_to_text: 'Voice to Text',
speech_to_speech: 'Voice Clone',
alt_text_generator: 'Alt Text',
prompt_studio: 'Prompt Studio',
};
interface JobTrackerProps {
className?: string;
}
export default function JobTracker({ className }: JobTrackerProps) {
const { activeJobs, updateJob, removeJob } = useStore();
const [expanded, setExpanded] = useState(false);
const [polling, setPolling] = useState(false);
// Poll for job updates
const pollJobs = useCallback(async () => {
const pendingJobs = activeJobs.filter(
(job) => job.status === 'queued' || job.status === 'processing'
);
if (pendingJobs.length === 0) {
setPolling(false);
return;
}
setPolling(true);
for (const job of pendingJobs) {
try {
const response = await api.get(`/jobs/${job.id}`);
const updatedJob = response.data;
if (updatedJob.status !== job.status || updatedJob.progress !== job.progress) {
updateJob(job.id, {
status: updatedJob.status,
progress: updatedJob.progress,
completed_at: updatedJob.completed_at,
output_asset_ids: updatedJob.output_asset_ids,
error_message: updatedJob.error_message,
});
// Show toast on completion
if (updatedJob.status === 'completed' && job.status !== 'completed') {
toast.success(`${MODULE_LABELS[job.module] || job.module} completed!`, {
duration: 5000,
});
} else if (updatedJob.status === 'failed' && job.status !== 'failed') {
toast.error(`${MODULE_LABELS[job.module] || job.module} failed`, {
duration: 5000,
});
}
}
} catch (error) {
console.error(`Failed to poll job ${job.id}:`, error);
}
}
}, [activeJobs, updateJob]);
// Set up polling interval
useEffect(() => {
const hasPendingJobs = activeJobs.some(
(job) => job.status === 'queued' || job.status === 'processing'
);
if (!hasPendingJobs) return;
// Poll immediately
pollJobs();
// Then poll every 2 seconds
const interval = setInterval(pollJobs, 2000);
return () => clearInterval(interval);
}, [activeJobs.length, pollJobs]);
const pendingCount = activeJobs.filter(
(job) => job.status === 'queued' || job.status === 'processing'
).length;
const getStatusIcon = (status: string) => {
switch (status) {
case 'completed':
return <CheckCircle2 className="w-4 h-4 text-green-400" />;
case 'failed':
return <XCircle className="w-4 h-4 text-red-400" />;
case 'processing':
return <Loader2 className="w-4 h-4 text-forge-yellow animate-spin" />;
default:
return <Clock className="w-4 h-4 text-gray-400" />;
}
};
const formatTime = (dateStr: string) => {
const date = new Date(dateStr);
const now = new Date();
const diffMs = now.getTime() - date.getTime();
const diffSec = Math.floor(diffMs / 1000);
const diffMin = Math.floor(diffSec / 60);
if (diffMin < 1) return 'Just now';
if (diffMin < 60) return `${diffMin}m ago`;
return `${Math.floor(diffMin / 60)}h ago`;
};
if (activeJobs.length === 0) return null;
return (
<div className={clsx('relative', className)}>
{/* Trigger Button */}
<button
onClick={() => setExpanded(!expanded)}
className={clsx(
'flex items-center gap-2 px-3 py-2 rounded-lg transition-colors',
pendingCount > 0
? 'bg-forge-yellow/10 text-forge-yellow'
: 'bg-forge-gray text-gray-400 hover:text-white'
)}
>
{pendingCount > 0 ? (
<Loader2 className="w-4 h-4 animate-spin" />
) : (
<CheckCircle2 className="w-4 h-4" />
)}
<span className="text-sm font-medium">
{pendingCount > 0 ? `${pendingCount} Active` : `${activeJobs.length} Jobs`}
</span>
{expanded ? (
<ChevronUp className="w-4 h-4" />
) : (
<ChevronDown className="w-4 h-4" />
)}
</button>
{/* Dropdown Panel */}
{expanded && (
<div className="absolute right-0 top-full mt-2 w-96 bg-forge-dark border border-gray-800 rounded-xl shadow-2xl z-50 overflow-hidden">
<div className="p-3 border-b border-gray-800 flex items-center justify-between">
<h3 className="text-sm font-semibold text-white">Active Jobs</h3>
<span className="text-xs text-gray-500">
{polling && 'Updating...'}
</span>
</div>
<div className="max-h-96 overflow-y-auto">
{activeJobs.slice(0, 10).map((job) => {
const Icon = MODULE_ICONS[job.module] || FileText;
return (
<div
key={job.id}
className="p-3 border-b border-gray-800 last:border-0 hover:bg-forge-gray/50"
>
<div className="flex items-start gap-3">
<div className="w-8 h-8 bg-forge-gray rounded-lg flex items-center justify-center flex-shrink-0">
<Icon className="w-4 h-4 text-forge-yellow" />
</div>
<div className="flex-1 min-w-0">
<div className="flex items-center justify-between gap-2">
<span className="text-sm font-medium text-white truncate">
{MODULE_LABELS[job.module] || job.module}
</span>
<div className="flex items-center gap-2">
{getStatusIcon(job.status)}
<button
type="button"
onClick={() => removeJob(job.id)}
aria-label="Remove job"
className="p-1 text-gray-500 hover:text-gray-300"
>
<X className="w-3 h-3" />
</button>
</div>
</div>
{/* Progress bar for processing jobs */}
{(job.status === 'processing' || job.status === 'queued') && (
<div className="mt-2">
<div className="flex items-center justify-between text-xs text-gray-500 mb-1">
<span>{job.status === 'queued' ? 'Queued' : 'Processing'}</span>
<span>{job.progress}%</span>
</div>
<div className="h-1.5 bg-forge-gray rounded-full overflow-hidden">
<div
className="h-full bg-forge-yellow transition-all duration-300"
style={{ width: `${job.progress}%` }}
/>
</div>
</div>
)}
{/* Completed/Failed status */}
{job.status === 'completed' && (
<p className="text-xs text-green-400 mt-1">
Completed {job.completed_at && formatTime(job.completed_at)}
</p>
)}
{job.status === 'failed' && (
<p className="text-xs text-red-400 mt-1 truncate">
{job.error_message || 'Failed'}
</p>
)}
<p className="text-xs text-gray-600 mt-1">
Started {formatTime(job.created_at)}
</p>
</div>
</div>
</div>
);
})}
</div>
{activeJobs.length > 10 && (
<div className="p-2 border-t border-gray-800 text-center">
<a href="/history" className="text-xs text-forge-yellow hover:underline">
View all {activeJobs.length} jobs
</a>
</div>
)}
</div>
)}
</div>
);
}
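
The progress bar above writes `job.progress` straight into an inline `width`, so a missing or out-of-range value from the API would produce a broken bar. A minimal sketch of a guard, assuming `progress` is meant to be a 0 to 100 percentage; `clampProgress` is a hypothetical helper, not something defined in this commit:

```typescript
// Hypothetical helper (not in the original component): normalizes a raw
// progress value into an integer percentage between 0 and 100.
function clampProgress(progress: number | null | undefined): number {
  // Treat missing or non-numeric values as "no progress yet"
  if (typeof progress !== 'number' || Number.isNaN(progress)) return 0;
  return Math.min(100, Math.max(0, Math.round(progress)));
}

// Possible use inside the component:
//   <span>{clampProgress(job.progress)}%</span>
//   style={{ width: `${clampProgress(job.progress)}%` }}
```

Keeping the guard in one function means the label and the bar width cannot drift apart.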

@ -0,0 +1,30 @@
'use client';

import Link from 'next/link';
import type { LucideIcon } from 'lucide-react';

interface ModuleCardProps {
  title: string;
  description: string;
  icon: LucideIcon;
  href: string;
  color?: string;
}

export default function ModuleCard({
  title,
  description,
  icon: Icon,
  href,
  color = 'forge-yellow',
}: ModuleCardProps) {
  // NOTE: Tailwind only emits classes it finds as complete strings in the scanned
  // source, so the interpolated `bg-${color}/10`, `group-hover:bg-${color}/20`, and
  // `text-${color}` classes below are generated only when those full names also
  // appear elsewhere in the code or are safelisted in tailwind.config.js.
  return (
    <Link href={href} className="module-card group">
      <div className={`w-12 h-12 bg-${color}/10 rounded-lg flex items-center justify-center mb-4 group-hover:bg-${color}/20 transition-colors`}>
        <Icon className={`w-6 h-6 text-${color}`} />
      </div>
      <h3 className="text-lg font-semibold text-white mb-2">{title}</h3>
      <p className="text-gray-400 text-sm">{description}</p>
    </Link>
  );
}
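
`ModuleCard` builds its color classes by string interpolation, which Tailwind's source scanner cannot see. One common workaround is a lookup table of complete class names. The sketch below is an illustration under assumptions: `COLOR_CLASSES`, `colorClasses`, and the `red` and `green` entries are hypothetical and would need to match whatever colors the callers actually pass.

```typescript
// Hypothetical lookup table: every class name is written out in full so
// Tailwind's scanner can find it as a literal string.
interface ColorClasses {
  bg: string;
  hoverBg: string;
  text: string;
}

const DEFAULT_COLOR = 'forge-yellow';

const COLOR_CLASSES = new Map<string, ColorClasses>([
  ['forge-yellow', { bg: 'bg-forge-yellow/10', hoverBg: 'group-hover:bg-forge-yellow/20', text: 'text-forge-yellow' }],
  // Assumed extra colors, for illustration only
  ['red', { bg: 'bg-red-400/10', hoverBg: 'group-hover:bg-red-400/20', text: 'text-red-400' }],
  ['green', { bg: 'bg-green-400/10', hoverBg: 'group-hover:bg-green-400/20', text: 'text-green-400' }],
]);

// Falls back to the default color when the requested one is not in the table
function colorClasses(color: string): ColorClasses {
  return COLOR_CLASSES.get(color) ?? (COLOR_CLASSES.get(DEFAULT_COLOR) as ColorClasses);
}

// Possible use inside the component:
//   const c = colorClasses(color);
//   <div className={`w-12 h-12 ${c.bg} ... ${c.hoverBg} transition-colors`}>
//   <Icon className={`w-6 h-6 ${c.text}`} />
```

The trade-off is that unsupported colors silently fall back to the default instead of producing an unstyled card.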

@ -0,0 +1,281 @@
'use client';
import Link from 'next/link';
import { usePathname } from 'next/navigation';
import { clsx } from 'clsx';
import { useStore } from '@/lib/store';
import { isAdmin } from '@/lib/auth';
import {
Home,
Image,
Video,
Mic,
FileText,
History,
Settings,
ChevronLeft,
ChevronRight,
Sparkles,
Maximize,
Eraser,
Captions,
Volume2,
Type,
Wand2,
ImagePlus,
Film,
Shield,
Users,
TrendingUp,
Clock,
FolderOpen,
AudioLines,
} from 'lucide-react';
const modules = [
{
category: 'Image',
icon: Image,
items: [
{ name: 'Generate', href: '/image/generate', icon: ImagePlus },
{ name: 'Upscale', href: '/image/upscale', icon: Maximize },
{ name: 'Remove Background', href: '/image/remove-bg', icon: Eraser },
],
},
{
category: 'Video',
icon: Video,
items: [
{ name: 'Generate', href: '/video/generate', icon: Film },
{ name: 'Upscale', href: '/video/upscale', icon: Maximize },
{ name: 'Subtitles', href: '/video/subtitles', icon: Captions },
],
},
{
category: 'Audio',
icon: Mic,
items: [
{ name: 'Text to Speech', href: '/audio/text-to-speech', icon: Volume2 },
{ name: 'Voice to Text', href: '/audio/voice-to-text', icon: Type },
{ name: 'Sound Effects', href: '/audio/sound-effects', icon: AudioLines },
],
},
{
category: 'Text',
icon: FileText,
items: [
{ name: 'Prompt Studio', href: '/text/prompt-studio', icon: Wand2 },
{ name: 'Alt Text Generator', href: '/text/alt-text', icon: FileText },
],
},
];
export default function Sidebar() {
const pathname = usePathname();
const { user, sidebarCollapsed, toggleSidebar } = useStore();
const userIsAdmin = isAdmin(user as any);
return (
<aside
className={clsx(
'bg-forge-dark border-r border-gray-800 flex flex-col transition-all duration-300',
sidebarCollapsed ? 'w-20' : 'w-64'
)}
>
{/* Logo */}
<div className="p-4 border-b border-gray-800">
<Link href="/" className="flex items-center gap-3">
<div className="w-10 h-10 bg-forge-yellow rounded-lg flex items-center justify-center">
<Sparkles className="w-6 h-6 text-black" />
</div>
{!sidebarCollapsed && (
<span className="text-xl font-bold text-white">FORGE AI</span>
)}
</Link>
</div>
{/* Navigation */}
<nav className="flex-1 overflow-y-auto py-4">
{/* Dashboard */}
<Link
href="/"
className={clsx(
'flex items-center gap-3 px-4 py-3 mx-2 rounded-lg transition-colors',
pathname === '/'
? 'bg-forge-yellow/10 text-forge-yellow'
: 'text-gray-400 hover:text-white hover:bg-forge-gray'
)}
>
<Home className="w-5 h-5 flex-shrink-0" />
{!sidebarCollapsed && <span>Dashboard</span>}
</Link>
{/* Modules */}
{modules.map((module) => (
<div key={module.category} className="mt-4">
{!sidebarCollapsed && (
<div className="px-4 py-2 text-xs font-semibold text-gray-500 uppercase tracking-wider">
{module.category}
</div>
)}
{sidebarCollapsed && (
<div className="px-4 py-2 flex justify-center">
<module.icon className="w-4 h-4 text-gray-500" />
</div>
)}
{module.items.map((item) => (
<Link
key={item.href}
href={item.href}
className={clsx(
'flex items-center gap-3 px-4 py-2.5 mx-2 rounded-lg transition-colors',
pathname === item.href
? 'bg-forge-yellow/10 text-forge-yellow'
: 'text-gray-400 hover:text-white hover:bg-forge-gray'
)}
>
<item.icon className="w-5 h-5 flex-shrink-0" />
{!sidebarCollapsed && <span>{item.name}</span>}
</Link>
))}
</div>
))}
{/* My Files */}
<div className="mt-6 pt-6 border-t border-gray-800">
<Link
href="/files"
className={clsx(
'flex items-center gap-3 px-4 py-3 mx-2 rounded-lg transition-colors',
pathname === '/files'
? 'bg-forge-yellow/10 text-forge-yellow'
: 'text-gray-400 hover:text-white hover:bg-forge-gray'
)}
>
<FolderOpen className="w-5 h-5 flex-shrink-0" />
{!sidebarCollapsed && <span>My Files</span>}
</Link>
</div>
{/* History & Settings */}
<div className="mt-4">
<Link
href="/history"
className={clsx(
'flex items-center gap-3 px-4 py-3 mx-2 rounded-lg transition-colors',
pathname === '/history'
? 'bg-forge-yellow/10 text-forge-yellow'
: 'text-gray-400 hover:text-white hover:bg-forge-gray'
)}
>
<History className="w-5 h-5 flex-shrink-0" />
{!sidebarCollapsed && <span>History</span>}
</Link>
<Link
href="/settings"
className={clsx(
'flex items-center gap-3 px-4 py-3 mx-2 rounded-lg transition-colors',
pathname === '/settings'
? 'bg-forge-yellow/10 text-forge-yellow'
: 'text-gray-400 hover:text-white hover:bg-forge-gray'
)}
>
<Settings className="w-5 h-5 flex-shrink-0" />
{!sidebarCollapsed && <span>Settings</span>}
</Link>
</div>
{/* Admin Section - Only visible to admins */}
{userIsAdmin && (
<div className="mt-4 pt-4 border-t border-red-900/50">
{!sidebarCollapsed && (
<div className="px-4 py-2 text-xs font-semibold text-red-400 uppercase tracking-wider">
Admin
</div>
)}
{sidebarCollapsed && (
<div className="px-4 py-2 flex justify-center">
<Shield className="w-4 h-4 text-red-400" />
</div>
)}
<Link
href="/admin"
className={clsx(
'flex items-center gap-3 px-4 py-2.5 mx-2 rounded-lg transition-colors',
pathname === '/admin'
? 'bg-red-900/20 text-red-400'
: 'text-gray-400 hover:text-red-400 hover:bg-red-900/10'
)}
>
<Shield className="w-5 h-5 flex-shrink-0" />
{!sidebarCollapsed && <span>Dashboard</span>}
</Link>
<Link
href="/admin/users"
className={clsx(
'flex items-center gap-3 px-4 py-2.5 mx-2 rounded-lg transition-colors',
pathname === '/admin/users'
? 'bg-red-900/20 text-red-400'
: 'text-gray-400 hover:text-red-400 hover:bg-red-900/10'
)}
>
<Users className="w-5 h-5 flex-shrink-0" />
{!sidebarCollapsed && <span>Users</span>}
</Link>
<Link
href="/admin/reports"
className={clsx(
'flex items-center gap-3 px-4 py-2.5 mx-2 rounded-lg transition-colors',
pathname === '/admin/reports'
? 'bg-red-900/20 text-red-400'
: 'text-gray-400 hover:text-red-400 hover:bg-red-900/10'
)}
>
<TrendingUp className="w-5 h-5 flex-shrink-0" />
{!sidebarCollapsed && <span>Reports</span>}
</Link>
<Link
href="/admin/logs"
className={clsx(
'flex items-center gap-3 px-4 py-2.5 mx-2 rounded-lg transition-colors',
pathname === '/admin/logs'
? 'bg-red-900/20 text-red-400'
: 'text-gray-400 hover:text-red-400 hover:bg-red-900/10'
)}
>
<Clock className="w-5 h-5 flex-shrink-0" />
{!sidebarCollapsed && <span>Audit Logs</span>}
</Link>
<Link
href="/admin/voices"
className={clsx(
'flex items-center gap-3 px-4 py-2.5 mx-2 rounded-lg transition-colors',
pathname === '/admin/voices'
? 'bg-red-900/20 text-red-400'
: 'text-gray-400 hover:text-red-400 hover:bg-red-900/10'
)}
>
<Mic className="w-5 h-5 flex-shrink-0" />
{!sidebarCollapsed && <span>Voices</span>}
</Link>
</div>
)}
</nav>
{/* Collapse Toggle */}
<button
type="button"
onClick={toggleSidebar}
aria-label={sidebarCollapsed ? 'Expand sidebar' : 'Collapse sidebar'}
className="p-4 border-t border-gray-800 text-gray-400 hover:text-white transition-colors"
>
{sidebarCollapsed ? (
<ChevronRight className="w-5 h-5 mx-auto" />
) : (
<div className="flex items-center gap-2">
<ChevronLeft className="w-5 h-5" />
<span>Collapse</span>
</div>
)}
</button>
</aside>
);
}
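
The sidebar highlights a link only when `pathname === href`, so a nested route such as a detail page under `/image/generate` would leave nothing highlighted. A minimal sketch of prefix-aware matching follows; `isActivePath` and its `exact` flag are hypothetical and not part of this commit.

```typescript
// Hypothetical helper: decides whether a nav link should render as active.
// `exact` keeps section roots like '/admin' from lighting up on '/admin/users'.
function isActivePath(pathname: string, href: string, exact = false): boolean {
  // The dashboard link ('/') is a prefix of every route, so always match it exactly
  if (exact || href === '/') return pathname === href;
  // Match the route itself or anything nested below it, but not lookalike siblings
  return pathname === href || pathname.startsWith(`${href}/`);
}

// Possible use inside the component:
//   isActivePath(pathname, item.href)      // module links
//   isActivePath(pathname, '/admin', true) // admin dashboard link
```

Appending `/` before the prefix check is what stops `/image/generate-old` from matching `/image/generate`.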

frontend/next.config.js Normal file

@ -0,0 +1,17 @@
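/** @type {import('next').NextConfig} */
const nextConfig = {
  output: 'standalone',
  images: {
    // `images.domains` is deprecated in recent Next.js releases;
    // `remotePatterns` is the supported replacement.
    remotePatterns: [{ hostname: 'localhost' }],
  },
  async rewrites() {
    return [
      {
        // Forward API calls from the Next.js server to the `backend` service
        source: '/api/:path*',
        destination: 'http://backend:8000/api/:path*',
      },
    ];
  },
};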
module.exports = nextConfig;
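
The rewrite destination above is hard-coded to `http://backend:8000`, which only resolves when the frontend runs alongside a host named `backend`. A sketch of building the same rewrite from a configurable base URL is shown below. `buildApiRewrites` and the `BACKEND_URL` variable name are assumptions for illustration; the commit does not define either.

```typescript
// Shape of one entry returned from `rewrites()` in next.config.js
interface Rewrite {
  source: string;
  destination: string;
}

// Hypothetical helper: builds the /api rewrite for a given backend base URL,
// defaulting to the host name used in the original config.
function buildApiRewrites(backendUrl: string = 'http://backend:8000'): Rewrite[] {
  // Strip trailing slashes so the destination never contains '//api'
  const base = backendUrl.replace(/\/+$/, '');
  return [{ source: '/api/:path*', destination: `${base}/api/:path*` }];
}

// Possible use in next.config.js (BACKEND_URL is an assumed variable name):
//   async rewrites() {
//     return buildApiRewrites(process.env.BACKEND_URL);
//   }
```

Passing `undefined` (an unset variable) falls through to the default parameter, so the original behavior is preserved when nothing is configured.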

frontend/package.json Normal file

@ -0,0 +1,33 @@
{
  "name": "forge-ai-frontend",
  "version": "1.0.0",
  "private": true,
  "scripts": {
    "dev": "next dev",
    "build": "next build",
    "start": "next start",
    "lint": "next lint"
  },
  "dependencies": {
    "next": "15.3.4",
    "react": "^18.3.1",
    "react-dom": "^18.3.1",
    "axios": "^1.7.7",
    "zustand": "^5.0.1",
    "lucide-react": "^0.460.0",
    "react-dropzone": "^14.2.9",
    "react-hot-toast": "^2.4.1",
    "date-fns": "^4.1.0",
    "clsx": "^2.1.1",
    "tailwind-merge": "^2.5.4"
  },
  "devDependencies": {
    "@types/node": "^22.9.0",
    "@types/react": "^18.3.12",
    "@types/react-dom": "^18.3.1",
    "autoprefixer": "^10.4.20",
    "postcss": "^8.4.47",
    "tailwindcss": "^3.4.14",
    "typescript": "^5.6.3"
  }
}

@ -0,0 +1,6 @@
module.exports = {
  plugins: {
    tailwindcss: {},
    autoprefixer: {},
  },
};

@ -0,0 +1,25 @@
/** @type {import('tailwindcss').Config} */
module.exports = {
  content: [
    './pages/**/*.{js,ts,jsx,tsx,mdx}',
    './components/**/*.{js,ts,jsx,tsx,mdx}',
    './app/**/*.{js,ts,jsx,tsx,mdx}',
  ],
  theme: {
    extend: {
      colors: {
        forge: {
          yellow: '#FFC407',
          black: '#000000',
          dark: '#111111',
          gray: '#1a1a1a',
          'gray-light': '#2a2a2a',
        },
      },
      fontFamily: {
        montserrat: ['Montserrat', 'sans-serif'],
      },
    },
  },
  plugins: [],
};

frontend/tsconfig.json Normal file

@ -0,0 +1,26 @@
{
  "compilerOptions": {
    "lib": ["dom", "dom.iterable", "esnext"],
    "allowJs": true,
    "skipLibCheck": true,
    "strict": true,
    "noEmit": true,
    "esModuleInterop": true,
    "module": "esnext",
    "moduleResolution": "bundler",
    "resolveJsonModule": true,
    "isolatedModules": true,
    "jsx": "preserve",
    "incremental": true,
    "plugins": [
      {
        "name": "next"
      }
    ],
    "paths": {
      "@/*": ["./*"]
    }
  },
  "include": ["next-env.d.ts", "**/*.ts", "**/*.tsx", ".next/types/**/*.ts"],
  "exclude": ["node_modules"]
}

nginx/Dockerfile Normal file

@ -0,0 +1,11 @@
FROM nginx:alpine
# Remove default config
RUN rm /etc/nginx/conf.d/default.conf
# Copy our config
COPY nginx.conf /etc/nginx/conf.d/
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]

nginx/nginx.conf Normal file

@ -0,0 +1,62 @@
upstream frontend {
    server frontend:3000;
}

upstream backend {
    server backend:8000;
}

server {
    listen 80;
    server_name localhost ai-sandbox.oliver.solutions;
    client_max_body_size 500M;

    # Frontend
    location / {
        proxy_pass http://frontend;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_cache_bypass $http_upgrade;
    }

    # API
    location /api {
        proxy_pass http://backend;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_connect_timeout 60s;
        proxy_send_timeout 300s;
        proxy_read_timeout 300s;
    }

    # Storage/Assets
    location /storage {
        proxy_pass http://backend/storage;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # Health check
    location /health {
        proxy_pass http://backend/health;
        proxy_http_version 1.1;
    }

    # WebSocket support for Next.js HMR (development)
    location /_next/webpack-hmr {
        proxy_pass http://frontend/_next/webpack-hmr;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}