ferrero-opentext/Python-Version/DEPLOYMENT_GUIDE.md
DJP d4debbe3da Add complete deployment guide and database init.sql
Production-ready deployment package with database schema and server setup guide.

NEW FILES:
1. DEPLOYMENT_GUIDE.md - Complete server deployment instructions
2. database/init.sql - PostgreSQL database initialization script

DEPLOYMENT_GUIDE.md includes:
✓ Server deployment steps (rsync/scp)
✓ PostgreSQL setup (Docker or native)
✓ Python environment setup
✓ Configuration guide
✓ Cron job examples
✓ OAuth2 vs mTLS setup instructions
✓ Testing checklist
✓ Troubleshooting guide
✓ Production checklist
✓ Security hardening steps

database/init.sql includes:
✓ Complete schema with all 35 columns in master_assets
✓ All 4 tables (master_assets, derivative_assets, asset_events, workflow_state)
✓ Campaign relationship fields (global_master_campaign_id, global_master_folder_id, local_campaign_id)
✓ JSONB full_metadata column
✓ 12 indexes for performance
✓ 4 triggers (auto-update timestamps, event logging)
✓ 2 helper functions
✓ Proper permissions for ferrero_user

KEY FEATURES:
- Self-contained deployment
- Works with Docker or native PostgreSQL
- Includes all schema additions from this session
- OAuth2 and mTLS configuration documented
- Cron job templates provided
- Security checklist included

DEPLOYMENT:
1. Copy Python-Version/ folder to server
2. Run database init.sql
3. Install Python dependencies
4. Update .env for server
5. Test connections
6. Set up cron jobs

Everything needed for production deployment is included.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-05 09:27:25 -05:00

14 KiB

Ferrero Asset Tracking - Complete Deployment Guide

Last Updated: November 5, 2025 Version: 1.0 - Production Ready


📦 What You're Deploying

Complete automation system with:

  • 4 workflow automation scripts (A1→A2, A2→A3, A5→A6, B1→B2)
  • PostgreSQL database with tracking system
  • Dual authentication (OAuth2 + mTLS)
  • Email notifications
  • Daily summary reports
  • Recursive folder search
  • NOT APPROVED filtering (A5→A6)

🚀 Server Deployment Steps

Step 1: Copy Files to Server

From your local machine:

# Copy entire Python-Version folder (excluding temp files)
rsync -av --exclude='venv' --exclude='logs' --exclude='temp' \
  --exclude='__pycache__' --exclude='*.pyc' --exclude='.DS_Store' \
  --exclude='asset_metadata_debug.json' --exclude='check_asset_metadata.py' \
  --exclude='00:*' \
  Python-Version/ user@server:/opt/ferrero-automation/

Or use SCP:

scp -r Python-Version/ user@server:/opt/ferrero-automation/

Step 2: Setup PostgreSQL Database

Option A: Docker (Recommended)

# On server
cd /opt/ferrero-automation

# Create Docker compose file
cat > docker-compose.yml << 'DOCKER'
version: '3.8'

services:
  postgres:
    image: postgres:15
    container_name: ferrero-tracking-db
    restart: always
    environment:
      POSTGRES_DB: ferrero_tracking
      POSTGRES_USER: ferrero_user
      POSTGRES_PASSWORD: ferrero_pass_2025
    ports:
      - "5433:5432"
    volumes:
      - ./database/init.sql:/docker-entrypoint-initdb.d/init.sql
      - ferrero-db-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ferrero_user -d ferrero_tracking"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  ferrero-db-data:
    driver: local
DOCKER

# Start database
docker-compose up -d

# Wait for it to be healthy
docker-compose ps

Option B: Native PostgreSQL

# Install PostgreSQL 15
sudo apt update
sudo apt install postgresql-15

# Create database and user
sudo -u postgres psql << 'SQL'
CREATE DATABASE ferrero_tracking;
CREATE USER ferrero_user WITH PASSWORD 'ferrero_pass_2025';
GRANT ALL PRIVILEGES ON DATABASE ferrero_tracking TO ferrero_user;
\c ferrero_tracking
GRANT ALL ON SCHEMA public TO ferrero_user;
SQL

# Run init script
psql -h localhost -p 5432 -U ferrero_user -d ferrero_tracking -f database/init.sql

Step 3: Initialize Database Schema

The init.sql file includes:

  • master_assets table (35 columns)
  • derivative_assets table
  • asset_events table (audit log)
  • workflow_state table
  • All indexes
  • All triggers
  • Helper functions

Key columns added this session:

  • full_metadata JSONB - Complete metadata storage
  • global_master_campaign_id VARCHAR(50) - Global campaign reference
  • global_master_folder_id VARCHAR(255) - Global folder ID
  • local_campaign_id VARCHAR(50) - Local campaign ID

Step 4: Setup Python Environment

cd /opt/ferrero-automation

# Create virtual environment
python3 -m venv venv

# Activate
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

Requirements include:

  • requests
  • python-dotenv
  • PyYAML
  • psycopg2-binary
  • boxsdk
  • cryptography (for mTLS)
  • Jinja2
  • And more...

Step 5: Configure Environment

Update .env file for server:

# Edit .env
nano .env

# Update these values:
DB_HOST=localhost          # Or actual DB server IP
DB_PORT=5433               # 5432 if native PostgreSQL
DB_USER=ferrero_user
DB_PASSWORD=ferrero_pass_2025  # CHANGE IN PRODUCTION!

# Verify other settings:
# - Box credentials
# - SMTP credentials
# - DAM URLs (OAuth2 and mTLS)

CRITICAL: Change passwords in production!


Step 6: Test Connections

cd /opt/ferrero-automation
source venv/bin/activate

# Test with OAuth2 (default)
python scripts/test_connection.py

# Should show:
# ✓ DAM connection OK
# ✓ Box connection OK
# ✓ Database connection OK

If testing from whitelisted IP:

# Test with mTLS
python scripts/test_mtls_cert.py
python scripts/test_connection.py --auth-pfx

Step 7: Setup Cron Jobs

Edit crontab:

crontab -e

Add these lines:

# Ferrero Automation - Run every 5 minutes

# A1→A2: Master assets download (Local Adaptation)
*/5 * * * * cd /opt/ferrero-automation && venv/bin/python scripts/a1_to_a2_download.py >> logs/cron_a1_a2.log 2>&1

# A5→A6: Rework assets download (NOT APPROVED only)
*/5 * * * * cd /opt/ferrero-automation && venv/bin/python scripts/a5_to_a6_download.py >> logs/cron_a5_a6.log 2>&1

# B1→B2: Global master assets download
*/5 * * * * cd /opt/ferrero-automation && venv/bin/python scripts/b1_to_b2_download.py >> logs/cron_b1_b2.log 2>&1

# A2→A3: Upload from Box (polling)
*/5 * * * * cd /opt/ferrero-automation && venv/bin/python scripts/a2_to_a3_upload_polling.py >> logs/cron_a2_a3.log 2>&1

# Daily Report: Send summary email at 7 PM
0 19 * * * cd /opt/ferrero-automation && venv/bin/python scripts/daily_report.py >> logs/daily_report.log 2>&1

For mTLS (from whitelisted IP):

# Just add --auth-pfx flag to any script:
*/5 * * * * cd /opt/ferrero-automation && venv/bin/python scripts/a1_to_a2_download.py --auth-pfx >> logs/cron_a1_a2.log 2>&1

Step 8: Verify Workflows

Manual test runs:

cd /opt/ferrero-automation
source venv/bin/activate

# Test each workflow
python scripts/a1_to_a2_download.py
python scripts/a5_to_a6_download.py
python scripts/b1_to_b2_download.py
python scripts/a2_to_a3_upload_polling.py

# Generate daily report
python scripts/daily_report.py

Check logs:

tail -f logs/a1_to_a2.log
tail -f logs/a5_to_a6.log

🗄️ Database Schema Reference

master_assets Table (35 columns)

Core Fields:

  • tracking_id VARCHAR(6) UNIQUE - 6-character tracking ID
  • opentext_id VARCHAR(255) - DAM asset ID
  • original_filename VARCHAR(500)
  • file_extension VARCHAR(20)
  • file_size_bytes BIGINT
  • mime_type VARCHAR(100)

Campaign Relationships:

  • global_master_campaign_id VARCHAR(50) - Global campaign reference
  • global_master_folder_id VARCHAR(255) - Global folder ID
  • local_campaign_id VARCHAR(50) - Local campaign ID

Metadata:

  • full_metadata JSONB - Complete 200KB+ metadata (no truncation!)
  • upload_directory VARCHAR(1000)
  • description TEXT

Extracted Fields:

  • brand_code, brand_name
  • country_code, country_name
  • language_code, language_name
  • subject_title
  • asset_type, asset_type_name
  • duration_seconds, aspect_ratio, width_px, height_px
  • tags TEXT[], categories TEXT[]

Status & Tracking:

  • status VARCHAR(50) DEFAULT 'active'
  • is_deleted BOOLEAN DEFAULT false
  • created_at, updated_at, deleted_at TIMESTAMP
  • ingested_by, ingested_at

🔐 Security Checklist

Before Production:

  • Change database password from ferrero_pass_2025
  • Update SMTP password if needed
  • Verify Box JWT credentials are production keys
  • Verify DAM credentials (OAuth2 or mTLS)
  • Set certificate file permissions: chmod 600 config/certificates/*.pfx
  • Verify .env is not readable by other users: chmod 600 .env
  • Configure firewall to allow only whitelisted IPs
  • Enable database backups
  • Set up log rotation

📧 Email Configuration

Current SMTP (Mailgun):

Server: smtp.mailgun.org:587
From: TWIST-UK-SERVER@oliver.agency
To: daveporter@oliver.agency

Email Types:

  • Success notifications (green theme)
  • Warning notifications (orange theme)
  • Error notifications (red theme)
  • Daily summary reports (blue dashboard)

🔄 Authentication Methods

OAuth2 (Default - Production Ready)

When to use: Current production environment

Configuration:

DAM_BASE_URL=https://ppr.dam.ferrero.com/otmmapi
DAM_AUTH_URL=https://ppr.dam.ferrero.com/otdsws/oauth2/token
DAM_CLIENT_ID=otds-OLV
DAM_CLIENT_SECRET=hs28LZ9ZzQ5I9rlW3P7Wwyw85oOatlC1

Usage:

python scripts/a1_to_a2_download.py  # No flag needed

mTLS Certificate (Optional - Whitelisted IP Required)

When to use: Enhanced security, API gateway environments

Configuration:

DAM_MTLS_BASE_URL=https://dev-auth.app-api.ferrero.com/00003/mm
DAM_MTLS_CERT_PATH=config/certificates/dam-mtls-dev.pfx
DAM_MTLS_CERT_PASSWORD=fnJ8xrnh!54NE&2HR62=2P3YEy+hy9RajZ7v5&=y

Usage:

python scripts/a1_to_a2_download.py --auth-pfx  # Requires --auth-pfx flag

Requirements:

  • Must run from IP whitelisted in API gateway
  • Different base URL than OAuth2
  • Certificate file must be present

📊 Monitoring

Log Files

Location: logs/

Files:

  • a1_to_a2.log - Master assets download (A1→A2)
  • a5_to_a6.log - Rework assets (A5→A6)
  • b1_to_b2.log - Global masters (B1→B2)
  • a2_to_a3.log - Upload from Box
  • cron_*.log - Cron execution logs
  • daily_report.log - Report generation log

Log Rotation:

  • Max size: 10MB per file
  • Backups: 28 files (~1 month)
  • Automatic cleanup

Database Queries

Check recent activity:

-- Recent master assets
SELECT tracking_id, original_filename, local_campaign_id, created_at
FROM master_assets
ORDER BY created_at DESC
LIMIT 10;

-- Count by status
SELECT status, COUNT(*)
FROM master_assets
GROUP BY status;

-- Recent events
SELECT * FROM asset_events
ORDER BY event_timestamp DESC
LIMIT 20;

🧪 Testing Checklist

Before going live:

  • Database connection test passes
  • OAuth2 authentication works
  • mTLS authentication works (from whitelisted IP)
  • A1→A2 workflow completes successfully
  • A5→A6 workflow filters NOT APPROVED correctly
  • B1→B2 workflow completes successfully
  • A2→A3 upload works
  • Email notifications received
  • Daily report generates correctly
  • Recursive folder search finds subfolders
  • Folder structure preserved in Box
  • Tracking IDs generated uniquely
  • Database stores all data correctly

📁 Complete File Structure

/opt/ferrero-automation/
├── .env                          # Environment variables & credentials
├── config/
│   ├── config.yaml              # Main configuration
│   ├── field_mappings.yaml      # Field mappings
│   └── certificates/            # mTLS certificates (gitignored)
│       └── dam-mtls-dev.pfx    # Certificate file
├── scripts/
│   ├── a1_to_a2_download.py    # A1→A2 workflow
│   ├── a5_to_a6_download.py    # A5→A6 workflow (rework)
│   ├── b1_to_b2_download.py    # B1→B2 workflow (global)
│   ├── a2_to_a3_upload_polling.py  # A2→A3 workflow
│   ├── daily_report.py          # Daily summary email
│   ├── test_connection.py       # Connection tester
│   ├── test_mtls_cert.py        # Certificate tester
│   └── shared/                  # Shared modules
│       ├── dam_client.py        # DAM API client (dual auth)
│       ├── box_client.py        # Box API client
│       ├── database.py          # PostgreSQL client
│       ├── notifier.py          # Email & webhooks
│       └── config_loader.py     # Config loader
├── database/
│   └── init.sql                 # Database schema (create this)
├── venv/                        # Virtual environment (recreate)
├── logs/                        # Log files (auto-created)
├── temp/                        # Temp downloads (auto-created)
├── requirements.txt             # Python dependencies
└── Documentation/
    ├── DATABASE_SCHEMA.md       # Schema reference
    ├── WORKFLOW_DIAGRAMS.md     # Workflow documentation
    └── DEPLOYMENT.md            # Old deployment doc

🗄️ Database Schema (init.sql)

Create file: /opt/ferrero-automation/database/init.sql

See DATABASE_SCHEMA.md for complete schema!

Quick reference - master_assets table has:

  • 35 columns total
  • JSONB for full metadata
  • Campaign relationship fields
  • Unique tracking_id constraint
  • 6 indexes for performance
  • 2 triggers (updated_at, event logging)

🔧 Troubleshooting

Database Connection Issues

# Check database is running
docker ps | grep ferrero

# Test connection
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5433 -U ferrero_user -d ferrero_tracking -c "SELECT 1;"

# Check tables exist
PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5433 -U ferrero_user -d ferrero_tracking -c "\dt"

OAuth2 Issues

# Test OAuth2 auth
python scripts/test_connection.py

# Check logs
tail -f logs/a1_to_a2.log

mTLS Issues

# Verify certificate loads
python scripts/test_mtls_cert.py

# Check you're on whitelisted IP
curl -I https://dev-auth.app-api.ferrero.com/00003/mm/v6

# Test mTLS connection
python scripts/test_connection.py --auth-pfx

Log Files Not Created

# Create log directories
mkdir -p logs/backup
chmod 755 logs

🎯 Production Checklist

Before Production:

  1. Database:

    • Change password from default
    • Configure backups
    • Set up monitoring
  2. Credentials:

    • Update all passwords in .env
    • Get production mTLS certificate (if using)
    • Verify Box credentials are production keys
  3. Testing:

    • Run all test scripts
    • Process test campaign end-to-end
    • Verify emails received
    • Check database records created
  4. Monitoring:

    • Set up log monitoring
    • Configure daily report recipients
    • Set up alerts for errors
  5. Security:

    • File permissions (600 for .env, certificates)
    • Firewall rules configured
    • Only whitelisted IPs for mTLS

📞 Support

If issues occur:

  1. Check logs in logs/ directory
  2. Run python scripts/test_connection.py
  3. Check database with psql commands
  4. Review email notifications for errors

Key contacts:

  • DAM API support: For mTLS questions
  • Database team: For DB issues
  • Email: Check SMTP logs

Summary

Copy Python-Version folder → Install dependencies → Configure DB → Test → Deploy cron jobs

That's it! Everything is self-contained and ready to deploy. 🚀