apac-ops-bot/DEPLOYMENT.md
2026-02-11 12:54:37 +00:00

Production Deployment Guide

This guide covers deploying APAC Ops Bot to production.

Prerequisites

  • Server with Docker and Docker Compose installed
  • Domain name (optional, for HTTPS)
  • Azure AD application configured for production URLs
  • OpenAI API key with Responses API access

Deployment Steps

1. Server Setup

# Update system
sudo apt update && sudo apt upgrade -y

# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

# Install Docker Compose
sudo apt install docker-compose -y

# Add current user to docker group
sudo usermod -aG docker $USER
newgrp docker

2. Clone Repository

git clone <repository-url>
cd apac-ops-bot

3. Configure Environment Variables

Backend Production (.env):

cd backend
cp .env.example .env
nano .env

Update with production values:

# App
APP_NAME=The APAC OpsBot
APP_ENV=production
DEBUG=False
SECRET_KEY=<generate-strong-secret-key>
CORS_ORIGINS=https://your-domain.com

# Database (use strong password)
DATABASE_URL=postgresql+asyncpg://apac_ops_bot:STRONG_PASSWORD@postgres:5432/apac_ops_bot

# Azure AD (production app registration)
AZURE_TENANT_ID=your-production-tenant-id
AZURE_CLIENT_ID=your-production-client-id
AZURE_CLIENT_SECRET=your-production-client-secret
AZURE_REDIRECT_URI=https://your-domain.com/api/v1/auth/msal/callback

# OpenAI
OPENAI_API_KEY=your-openai-api-key
OPENAI_VECTOR_STORE_ID=vs_QkOKiQCqzCHS4iFT5lP9qUxc
OPENAI_MODEL=gpt-5-nano-2025-08-07
OPENAI_API_BASE=https://api.openai.com/v1

# Redis
REDIS_URL=redis://redis:6379/0

# Rate Limiting (adjust as needed)
RATE_LIMIT_PER_MINUTE=30
RATE_LIMIT_PER_DAY=1000

# Token Costs
PROMPT_TOKEN_COST=0.00005
CACHED_PROMPT_TOKEN_COST=0.000005
COMPLETION_TOKEN_COST=0.0004
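
The SECRET_KEY placeholder above needs a random value. A minimal sketch for producing one, assuming openssl or /dev/urandom is available on the server; the function name generate_secret is illustrative and not part of the project:

```shell
# Hypothetical helper: print N random bytes (default 32) as lowercase hex.
generate_secret() {
    local bytes="${1:-32}"
    if command -v openssl >/dev/null 2>&1; then
        openssl rand -hex "$bytes"
    else
        # Fallback when openssl is missing: read the kernel's random source.
        head -c "$bytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
        echo
    fi
}

# Print a line that can be pasted into backend/.env
echo "SECRET_KEY=$(generate_secret 32)"
```

Hex output is used here because it contains no characters that need quoting in an env file.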

Frontend Production (.env):

cd ../frontend
cp .env.example .env
nano .env

Update with production values:

REACT_APP_API_URL=https://your-domain.com/api/v1
REACT_APP_WS_URL=wss://your-domain.com/ws
REACT_APP_AZURE_CLIENT_ID=your-production-client-id
REACT_APP_AZURE_TENANT_ID=your-production-tenant-id
REACT_APP_AZURE_REDIRECT_URI=https://your-domain.com
REACT_APP_NAME=The APAC OpsBot
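
Both env files start from templates, so a forgotten placeholder is easy to miss. The following sketch lists lines that still contain the placeholder strings used in this guide; find_placeholders is a hypothetical helper name, and the pattern only covers the placeholders shown above:

```shell
# Hypothetical helper: print (with line numbers) every line of an env file
# that still contains a template placeholder from this guide.
# Exit status is 0 when placeholders remain and 1 when none are found.
find_placeholders() {
    grep -nE 'your-|STRONG_PASSWORD|<generate' "$1"
}

# Example usage, run from the repository root:
#   find_placeholders backend/.env  && echo "backend/.env still has placeholders"
#   find_placeholders frontend/.env && echo "frontend/.env still has placeholders"
```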

4. Update Docker Compose for Production

Create docker-compose.prod.yml:

version: '3.8'

services:
  postgres:
    image: postgres:15-alpine
    container_name: apac_ops_bot_postgres
    environment:
      POSTGRES_USER: apac_ops_bot
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_DB: apac_ops_bot
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./backups:/backups
    networks:
      - apac_network
    restart: always
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U apac_ops_bot"]
      interval: 10s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    container_name: apac_ops_bot_redis
    command: redis-server --requirepass ${REDIS_PASSWORD}
    volumes:
      - redis_data:/data
    networks:
      - apac_network
    restart: always
    healthcheck:
      test: ["CMD", "redis-cli", "-a", "${REDIS_PASSWORD}", "--no-auth-warning", "--raw", "incr", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

  backend:
    build:
      context: ./backend
      dockerfile: Dockerfile
      target: production
    container_name: apac_ops_bot_backend
    env_file:
      - ./backend/.env
    environment:
      - DATABASE_URL=postgresql+asyncpg://apac_ops_bot:${POSTGRES_PASSWORD}@postgres:5432/apac_ops_bot
      - REDIS_URL=redis://:${REDIS_PASSWORD}@redis:6379/0
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    networks:
      - apac_network
    restart: always

  frontend:
    build:
      context: ./frontend
      dockerfile: Dockerfile
      target: production
    container_name: apac_ops_bot_frontend
    depends_on:
      - backend
    networks:
      - apac_network
    restart: always

  nginx:
    image: nginx:alpine
    container_name: apac_ops_bot_nginx
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./nginx/ssl:/etc/nginx/ssl:ro
      - ./certbot/conf:/etc/letsencrypt:ro
      - ./certbot/www:/var/www/certbot:ro
    depends_on:
      - backend
      - frontend
    networks:
      - apac_network
    restart: always

volumes:
  postgres_data:
  redis_data:

networks:
  apac_network:
    driver: bridge
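
The compose file interpolates ${POSTGRES_PASSWORD} and ${REDIS_PASSWORD} from the shell environment, and an unset variable is replaced with an empty string. As a rough pre-flight check, this sketch lists the variables a compose file references that are not exported in the current shell. The helper name unset_compose_vars is an assumption, and it only recognizes the plain ${NAME} form:

```shell
# Hypothetical helper: print each ${NAME} referenced in a compose file
# that is not exported in the current environment.
unset_compose_vars() {
    local file="$1" name
    grep -oE '\$\{[A-Za-z_][A-Za-z0-9_]*\}' "$file" | tr -d '${}' | sort -u |
    while read -r name; do
        # printenv exits non-zero when the variable is not in the environment
        printenv "$name" >/dev/null || echo "$name"
    done
}

# Example usage:
#   unset_compose_vars docker-compose.prod.yml
```

An empty result means every referenced variable is exported. It does not check that the values are correct.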

5. Configure Nginx

Create nginx/nginx.conf:

events {
    worker_connections 1024;
}

http {
    upstream backend {
        server backend:8048;
    }

    upstream frontend {
        server frontend:80;
    }

    # Redirect HTTP to HTTPS
    server {
        listen 80;
        server_name your-domain.com;

        location /.well-known/acme-challenge/ {
            root /var/www/certbot;
        }

        location / {
            return 301 https://$server_name$request_uri;
        }
    }

    # HTTPS server
    server {
        listen 443 ssl http2;
        server_name your-domain.com;

        ssl_certificate /etc/letsencrypt/live/your-domain.com/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/your-domain.com/privkey.pem;

        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_ciphers HIGH:!aNULL:!MD5;
        ssl_prefer_server_ciphers on;

        # Backend API
        location /api/ {
            proxy_pass http://backend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }

        # WebSocket
        location /ws {
            proxy_pass http://backend;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
        }

        # Frontend
        location / {
            proxy_pass http://frontend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
    }
}

6. SSL Certificate (Let's Encrypt)

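# Install certbot
sudo apt install certbot python3-certbot-nginx -y

# Get certificate (run from the repository root).
# --config-dir writes the certificate under ./certbot/conf, the directory
# that docker-compose.prod.yml mounts into nginx as /etc/letsencrypt.
sudo certbot certonly --webroot -w ./certbot/www --config-dir ./certbot/conf -d your-domain.com

# Auto-renewal (add to crontab)
sudo crontab -e
# Add: 0 12 * * * /usr/bin/certbot renew --quiet --config-dir /path/to/apac-ops-bot/certbot/conf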
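
To confirm that renewal is actually happening, it can help to check how many days remain on the installed certificate. A small sketch, assuming GNU date (as on Ubuntu or Debian) and a certificate path that you adjust to wherever certbot wrote it; days_until is a hypothetical helper name:

```shell
# Hypothetical helper: whole days from now (or from an epoch given as $2)
# until the date in $1. Relies on GNU date's -d option.
days_until() {
    local target now
    target=$(date -u -d "$1" +%s) || return 1
    now="${2:-$(date -u +%s)}"
    echo $(( (target - now) / 86400 ))
}

# Example usage (the path is an assumption; adjust it to your setup):
#   end=$(openssl x509 -enddate -noout \
#         -in /etc/letsencrypt/live/your-domain.com/fullchain.pem | cut -d= -f2)
#   days_until "$end"
```

Let's Encrypt certificates are short-lived, so a value that keeps shrinking toward zero suggests the cron job is not renewing.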

7. Start Production Services

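# Set environment variables for docker-compose.
# Hex output avoids characters such as "/" that are not safe inside the connection URLs.
export POSTGRES_PASSWORD=$(openssl rand -hex 32)
export REDIS_PASSWORD=$(openssl rand -hex 32)

# Save passwords securely. New shell sessions must export them again
# before running docker-compose (for example: set -a; . ./.env.secrets; set +a)
echo "POSTGRES_PASSWORD=$POSTGRES_PASSWORD" >> .env.secrets
echo "REDIS_PASSWORD=$REDIS_PASSWORD" >> .env.secrets
chmod 600 .env.secrets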

# Start services
docker-compose -f docker-compose.prod.yml up -d --build

# Check logs
docker-compose -f docker-compose.prod.yml logs -f

8. Database Initialization

# Run migrations
docker-compose -f docker-compose.prod.yml exec backend alembic upgrade head

# Create initial admin user (optional)
docker-compose -f docker-compose.prod.yml exec backend python -c "
from app.services.user_service import create_admin_user
create_admin_user('admin@example.com', 'Admin User')
"

Monitoring

Health Checks

# Check all services
docker-compose -f docker-compose.prod.yml ps

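# Check backend health
# (the nginx config in step 5 proxies only /api/ and /ws to the backend; add a location for /health, or this request is served by the frontend)
curl https://your-domain.com/health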

# Check database
docker-compose -f docker-compose.prod.yml exec postgres psql -U apac_ops_bot -c "SELECT 1"
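
Right after startup the services may need a few seconds before these checks pass. A minimal retry wrapper, sketched here with a hypothetical function name (retry) and arbitrary attempt counts, avoids treating that warm-up period as a failure:

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts. Returns 0 on the first success, 1 if every attempt fails.
retry() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while ! "$@"; do
        if [ "$i" -ge "$attempts" ]; then return 1; fi
        i=$((i + 1))
        sleep "$delay"
    done
}

# Example usage:
#   retry 10 3 curl -fsS https://your-domain.com/health
#   retry 10 3 docker-compose -f docker-compose.prod.yml exec -T postgres pg_isready -U apac_ops_bot
```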

Logs

# View all logs
docker-compose -f docker-compose.prod.yml logs -f

# View specific service logs
docker-compose -f docker-compose.prod.yml logs -f backend
docker-compose -f docker-compose.prod.yml logs -f frontend
docker-compose -f docker-compose.prod.yml logs -f postgres

Backup & Restore

Database Backup

# Create backup
docker-compose -f docker-compose.prod.yml exec -T postgres pg_dump -U apac_ops_bot apac_ops_bot > backup_$(date +%Y%m%d_%H%M%S).sql

# Automated daily backups (add to crontab)
0 2 * * * docker-compose -f /path/to/apac-ops-bot/docker-compose.prod.yml exec -T postgres pg_dump -U apac_ops_bot apac_ops_bot > /path/to/backups/backup_$(date +\%Y\%m\%d).sql
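
Daily dumps accumulate, so some form of pruning is usually needed alongside the cron job. A sketch that deletes dumps older than a given number of days, assuming the backup_*.sql naming used above; the helper name prune_backups and the 14-day default are illustrative choices:

```shell
# Hypothetical helper: delete backup_*.sql files in directory $1 whose
# modification time is more than $2 days old (default 14). Prints each
# file it removes. Other files in the directory are left alone.
prune_backups() {
    local dir="$1" days="${2:-14}"
    find "$dir" -maxdepth 1 -type f -name 'backup_*.sql' -mtime +"$days" -print -delete
}

# Example usage:
#   prune_backups /path/to/backups 14
```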

Database Restore

# Restore from backup
docker-compose -f docker-compose.prod.yml exec -T postgres psql -U apac_ops_bot apac_ops_bot < backup_20250127.sql
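
A failed pg_dump can leave an empty or truncated file behind, so a quick sanity check before restoring is worthwhile. This sketch relies on the header comment that pg_dump writes at the top of plain-format dumps; looks_like_pg_dump is a hypothetical helper name, and the check does not prove the dump is complete:

```shell
# Hypothetical helper: succeed only if the file is non-empty and its first
# lines carry the header that pg_dump writes in plain (SQL) format.
looks_like_pg_dump() {
    [ -s "$1" ] && head -n 5 "$1" | grep -q 'PostgreSQL database dump'
}

# Example usage:
#   looks_like_pg_dump backup_20250127.sql || echo "refusing to restore: not a pg_dump file"
```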

Updates

# Pull latest changes
git pull origin main

# Rebuild and restart services
docker-compose -f docker-compose.prod.yml up -d --build

# Run new migrations
docker-compose -f docker-compose.prod.yml exec backend alembic upgrade head
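
The compose file reads POSTGRES_PASSWORD and REDIS_PASSWORD from the shell, so an update run from a fresh session needs them exported again before the rebuild. One way to do that, sketched under the assumption that the values were saved to .env.secrets as NAME=value lines; load_secrets is a hypothetical helper name:

```shell
# Hypothetical helper: export every NAME=value line from a secrets file
# (default ./.env.secrets) into the current shell.
load_secrets() {
    local file="${1:-./.env.secrets}"
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    set -a          # export every variable assigned while sourcing
    . "$file"
    set +a
}

# Example usage, before the rebuild step:
#   load_secrets ./.env.secrets && docker-compose -f docker-compose.prod.yml up -d --build
```

The function must be called directly rather than inside $(...), since exports made in a subshell are lost.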

Troubleshooting

Backend not starting

# Check backend logs
docker-compose -f docker-compose.prod.yml logs backend

# Check environment variables
docker-compose -f docker-compose.prod.yml exec backend env | grep -E '(DATABASE|OPENAI|AZURE)'

# Check that the database module imports cleanly (this does not open a connection)
docker-compose -f docker-compose.prod.yml exec backend python -c "from app.database import engine; print('DB OK')"

Frontend not loading

# Check frontend logs
docker-compose -f docker-compose.prod.yml logs frontend

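# Check if backend is accessible
# (the hostname "backend" resolves only inside the Docker network, so run the request from the nginx container)
docker-compose -f docker-compose.prod.yml exec nginx wget -qO- http://backend:8048/health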

SSL certificate issues

# Renew certificate manually
sudo certbot renew

# Check certificate
sudo certbot certificates

Security Checklist

  • Strong SECRET_KEY generated
  • Strong database password set
  • Strong Redis password set
  • Azure AD production app registered
  • CORS origins properly configured
  • SSL certificate installed and auto-renewal configured
  • Firewall configured (only 80, 443, SSH)
  • Regular database backups scheduled
  • Log rotation configured
  • Rate limiting enabled
  • Monitoring and alerts set up

Performance Optimization

Database

# Optimize PostgreSQL for production
# Edit postgresql.conf in container or mount custom config
docker-compose -f docker-compose.prod.yml exec postgres psql -U apac_ops_bot -c "ALTER SYSTEM SET shared_buffers = '256MB';"
docker-compose -f docker-compose.prod.yml exec postgres psql -U apac_ops_bot -c "ALTER SYSTEM SET effective_cache_size = '1GB';"
docker-compose -f docker-compose.prod.yml restart postgres
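
The values above are fixed examples. A common starting point is to size shared_buffers at roughly a quarter of the memory available to PostgreSQL and effective_cache_size at roughly three quarters. The sketch below applies that rule of thumb to a memory figure in MB. The ratios are a heuristic rather than a project requirement, and pg_memory_settings is a hypothetical helper name:

```shell
# Hypothetical helper: suggest memory settings from a total in MB,
# using 25% for shared_buffers and 75% for effective_cache_size.
pg_memory_settings() {
    local total_mb="$1"
    echo "shared_buffers=$(( total_mb / 4 ))MB"
    echo "effective_cache_size=$(( total_mb * 3 / 4 ))MB"
}

# Example usage with the host's total memory:
#   pg_memory_settings "$(free -m | awk '/^Mem:/ {print $2}')"
```

Because postgres shares this host with the backend, Redis, and nginx, it is reasonable to pass a figure lower than the host's full memory.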

Backend

  • Backend already runs with 4 workers (production Dockerfile)
  • Adjust worker count based on CPU cores: --workers $(nproc)
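
Using every core for backend workers can starve the other containers on the same host, so a cap is often applied. A sketch of that calculation; the cap of 8 and the name worker_count are assumptions, and how the value reaches the server's --workers flag depends on the production Dockerfile, which is not shown here:

```shell
# Hypothetical helper: print the number of CPU cores, capped at $1 (default 8).
worker_count() {
    local max="${1:-8}" cores
    cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
    if [ "$cores" -gt "$max" ]; then cores="$max"; fi
    echo "$cores"
}

# Example usage:
#   echo "--workers $(worker_count 4)"
```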

Redis

  • Redis RDB snapshot persistence is enabled by default and written to the redis_data volume (AOF stays off unless configured)
  • Consider using Redis Cluster for high availability

Scaling

For high-traffic scenarios:

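  1. Horizontal Scaling: Run multiple backend instances behind a load balancer (remove container_name from the backend service first, since a fixed container name limits the service to one replica)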
  2. Database: Use managed PostgreSQL (AWS RDS, Azure Database)
  3. Redis: Use managed Redis (AWS ElastiCache, Azure Cache)
  4. CDN: Use CloudFlare or AWS CloudFront for frontend
  5. Monitoring: Set up Prometheus + Grafana for metrics

Support

For production issues, contact the development team.