# PostgreSQL Database Backup & Restore Guide ## Overview Automated backup system for the Ferrero Asset Tracking database with dual backup strategy: - **Daily backups (pg_dump):** Logical SQL dumps, 7-day retention - **Weekly backups (pg_basebackup):** Physical binary backups, keeps latest only ## Quick Start ### Run Manual Backup ```bash # Daily backup ./database/backup.sh --daily # Weekly backup ./database/backup.sh --weekly ``` ### Check Backup Health ```bash ./database/check_backups.sh ``` ### Restore from Backup ```bash # List available backups ./database/restore.sh --list # Restore from specific backup ./database/restore.sh backups/dumps/ferrero_tracking_2025-11-11_02-00.sql.gz ``` ## Backup Strategy ### Dual Backup Approach | Type | Frequency | Method | Retention | Size | Purpose | |------|-----------|--------|-----------|------|---------| | **Daily** | Every day 2:00 AM | pg_dump | 7 days | ~2-5 MB | Quick restore, portable | | **Weekly** | Sundays 3:00 AM | pg_basebackup | Latest only | ~30 MB | Full cluster backup | ### Why Both Methods? **pg_dump (Daily):** - ✅ Portable across PostgreSQL versions - ✅ Human-readable SQL - ✅ Can restore individual tables - ✅ Smaller file size - ⚠️ Slower restore for large databases **pg_basebackup (Weekly):** - ✅ Faster restore - ✅ Exact binary replica - ✅ Point-in-time recovery capable - ⚠️ Version-specific - ⚠️ Larger file size ## Directory Structure ``` /opt/ferrero-automation/Python-Version/ ├── backups/ │ ├── dumps/ # Daily SQL dumps (7 files) │ │ ├── ferrero_tracking_2025-11-11_02-00.sql.gz │ │ ├── ferrero_tracking_2025-11-10_02-00.sql.gz │ │ └── ... (up to 7 days) │ ├── basebackups/ # Weekly binary backups (1 file) │ │ └── ferrero_tracking_2025-11-10_03-00.tar.gz │ ├── safety/ # Pre-restore safety backups │ │ └── pre_restore_ferrero_tracking_2025-11-11_14-30.sql.gz │ └── README.md ├── database/ │ ├── backup.sh # Backup script │ ├── restore.sh # Restore script │ ├── check_backups.sh # Health check script │ └── init.sql └── logs/ ├── backup.log # Backup execution logs └── restore.log # Restore execution logs ``` ## Automated Backups (Cron) ### Production Cron Setup Add to server crontab (`crontab -e`): ```cron # Daily database backup at 2:00 AM 0 2 * * * cd /opt/ferrero-automation/Python-Version && ./database/backup.sh --daily >> logs/backup.log 2>&1 # Weekly full backup on Sundays at 3:00 AM 0 3 * * 0 cd /opt/ferrero-automation/Python-Version && ./database/backup.sh --weekly >> logs/backup.log 2>&1 # Check backup health daily at 8:00 AM (alerts if stale) 0 8 * * * cd /opt/ferrero-automation/Python-Version && ./database/check_backups.sh --quiet >> logs/backup_check.log 2>&1 ``` ### Schedule Rationale - **2:00 AM** - Off-peak hours, minimal workflow activity - **Sundays 3:00 AM** - Weekly full backup after daily completes - **8:00 AM health check** - Alerts early in business day if backup failed ### Systemd Service Setup (Alternative to Cron) For a more robust setup, you can use the provided systemd services. **1. Install Service Files:** ```bash sudo cp ferrero-backup-daily.service /etc/systemd/system/ sudo cp ferrero-backup-daily.timer /etc/systemd/system/ sudo cp ferrero-backup-weekly.service /etc/systemd/system/ sudo cp ferrero-backup-weekly.timer /etc/systemd/system/ ``` **2. Enable Timers:** ```bash sudo systemctl daemon-reload sudo systemctl enable --now ferrero-backup-daily.timer sudo systemctl enable --now ferrero-backup-weekly.timer ``` **3. Verify Timers:** ```bash systemctl list-timers | grep ferrero ``` ## Backup Scripts ### backup.sh **Daily Mode:** ```bash ./database/backup.sh --daily ``` **What it does:** 1. Checks Docker container is running 2. Runs `pg_dump` via `docker exec` 3. Compresses output with gzip 4. Saves to `backups/dumps/ferrero_tracking_YYYY-MM-DD_HH-MM.sql.gz` 5. Deletes backups older than 7 days 6. Logs to `logs/backup.log` 7. Sends error email if backup fails **Weekly Mode:** ```bash ./database/backup.sh --weekly ``` **What it does:** 1. Checks Docker container is running 2. Runs `pg_basebackup` via `docker exec` 3. Creates compressed tar archive 4. Saves to `backups/basebackups/` 5. Keeps only the latest weekly backup 6. Logs to `logs/backup.log` ### restore.sh **List available backups:** ```bash ./database/restore.sh --list ``` **Restore from backup:** ```bash ./database/restore.sh backups/dumps/ferrero_tracking_2025-11-11_02-00.sql.gz ``` **Safety features:** 1. Creates safety backup before restore 2. Requires explicit "yes" confirmation 3. Validates backup file exists 4. Drops and recreates database 5. Restores from compressed SQL dump 6. Verifies restoration (counts rows) ### check_backups.sh **Check backup health:** ```bash ./database/check_backups.sh ``` **Output includes:** - Daily backup count and status - Latest backup age (warns if > 25 hours) - Weekly backup status - Total disk space used - List of all backups **Quiet mode (for cron):** ```bash ./database/check_backups.sh --quiet ``` ## Restore Procedures ### Scenario 1: Restore from Daily Backup (Most Common) **When to use:** Data corruption, accidental deletion, need to rollback ```bash # Step 1: List available backups ./database/restore.sh --list # Output shows: # Daily Backups (pg_dump): # backups/dumps/ferrero_tracking_2025-11-11_02-00.sql.gz (2.1M, Nov 11 02:00) # backups/dumps/ferrero_tracking_2025-11-10_02-00.sql.gz (2.0M, Nov 10 02:00) # ... # Step 2: Choose backup and restore ./database/restore.sh backups/dumps/ferrero_tracking_2025-11-11_02-00.sql.gz # Step 3: Confirm when prompted # ⚠️ WARNING: This will REPLACE the current database! # Are you sure you want to continue? (yes/no): yes # Step 4: Wait for restore to complete # ✓ Safety backup created # ✓ Database recreated # ✓ Data restored # ✓ Verification: master_assets table has 150 rows ``` ### Scenario 2: Disaster Recovery **When to use:** Complete data loss, server migration 1. **Ensure Docker is running:** ```bash docker ps | grep ferrero-tracking-db ``` 2. **If container not running:** ```bash cd /opt/ferrero-automation/Python-Version docker compose up -d ``` 3. **Wait for PostgreSQL to be ready:** ```bash docker logs ferrero-tracking-db # Wait for "database system is ready to accept connections" ``` 4. **Restore from backup:** ```bash ./database/restore.sh backups/dumps/ferrero_tracking_2025-11-11_02-00.sql.gz ``` 5. **Verify data:** ```bash PGPASSWORD=ferrero_pass_2025 psql -h localhost -p 5437 -U ferrero_user -d ferrero_tracking -c " SELECT (SELECT COUNT(*) FROM master_assets) as master_assets, (SELECT COUNT(*) FROM derivative_assets) as derivative_assets, (SELECT COUNT(*) FROM creativex_scores) as creativex_scores; " ``` 6. **Resume workflows:** ```bash # Test connection python scripts/test_connection.py # Workflows will auto-resume on next cron run ``` ## Monitoring ### Check Backup Logs ```bash # View recent backup activity tail -50 logs/backup.log # Search for errors grep -i error logs/backup.log # View today's backups grep "$(date +%Y-%m-%d)" logs/backup.log ``` ### Verify Backups Were Created ```bash # Check latest daily backup ls -lh backups/dumps/ | tail -5 # Check latest weekly backup ls -lh backups/basebackups/ # Run health check ./database/check_backups.sh ``` ### Systemd Logs If using the systemd services, you can also check the system journal: ```bash # Check daily backup service logs sudo journalctl -u ferrero-backup-daily.service # Check weekly backup service logs sudo journalctl -u ferrero-backup-weekly.service ``` ### Email Notifications Backup failures automatically send emails to `ERROR_EMAIL` configured in `.env`. **Email includes:** - Error message - Timestamp - Server hostname ## Disk Space Management ### Current Usage ```bash # Check backup directory size du -sh backups/ # Check Docker volume size docker system df -v | grep ferrero-db-data ``` ### Expected Growth **Estimated for 7 days retention:** - Daily dumps: 7 files × ~2 MB = ~14 MB - Weekly backup: 1 file × ~30 MB = ~30 MB - Safety backups: ~5 MB (created only during restores) - **Total: ~50 MB maximum** ### Cleanup Automatic cleanup is built into scripts: - Daily backups older than 7 days are auto-deleted - Only latest weekly backup is kept - Safety backups can be manually deleted after successful restore **Manual cleanup if needed:** ```bash # Remove old safety backups (optional) rm backups/safety/*.sql.gz # Remove specific backup rm backups/dumps/ferrero_tracking_2025-11-01_02-00.sql.gz ``` ## Manual Backup Commands ### Create Ad-Hoc Backup Before Risky Operation ```bash # Quick backup before making changes TIMESTAMP=$(date +%Y-%m-%d_%H-%M) docker exec ferrero-tracking-db pg_dump -U ferrero_user ferrero_tracking | \ gzip > "backups/manual/before_change_${TIMESTAMP}.sql.gz" ``` ### Backup Specific Table Only ```bash # Backup just creativex_scores table docker exec ferrero-tracking-db pg_dump -U ferrero_user -t creativex_scores ferrero_tracking | \ gzip > "backups/manual/creativex_scores_${TIMESTAMP}.sql.gz" ``` ### Copy Backup to Remote Location ```bash # SCP to another server scp backups/dumps/ferrero_tracking_2025-11-11_02-00.sql.gz \ user@backup-server:/backups/ferrero/ # Sync entire backup directory rsync -avz backups/ user@backup-server:/backups/ferrero/ ``` ## Testing Backups **CRITICAL:** Always test your backups! A backup you can't restore is worthless. ### Test Procedure (Monthly) 1. **Create test database:** ```bash docker exec ferrero-tracking-db createdb -U ferrero_user ferrero_tracking_test ``` 2. **Restore backup to test database:** ```bash gunzip -c backups/dumps/ferrero_tracking_2025-11-11_02-00.sql.gz | \ docker exec -i ferrero-tracking-db psql -U ferrero_user -d ferrero_tracking_test ``` 3. **Verify data:** ```bash docker exec ferrero-tracking-db psql -U ferrero_user -d ferrero_tracking_test -c " SELECT COUNT(*) FROM master_assets; SELECT COUNT(*) FROM creativex_scores; " ``` 4. **Cleanup test database:** ```bash docker exec ferrero-tracking-db dropdb -U ferrero_user ferrero_tracking_test ``` ## Troubleshooting ### Backup Failed: "Container not running" ```bash # Check Docker status docker ps -a | grep ferrero-tracking-db # Start container docker compose up -d # Verify database is ready docker logs ferrero-tracking-db | tail -20 ``` ### Backup Failed: "Permission denied" ```bash # Fix script permissions chmod +x database/backup.sh database/restore.sh database/check_backups.sh # Fix backup directory permissions chmod 755 backups chmod 600 backups/dumps/*.sql.gz ``` ### Restore Failed: "Database is being accessed" ```bash # Stop all workflows accessing database # (or just run restore during off-hours) # Kill active connections docker exec ferrero-tracking-db psql -U ferrero_user -d postgres -c " SELECT pg_terminate_backend(pg_stat_activity.pid) FROM pg_stat_activity WHERE pg_stat_activity.datname = 'ferrero_tracking' AND pid <> pg_backend_pid(); " # Retry restore ./database/restore.sh ``` ### Backup Directory Full ```bash # Check space df -h /opt/ferrero-automation # Remove old safety backups rm backups/safety/*.sql.gz # Reduce retention (edit backup.sh): DAILY_RETENTION_DAYS=3 # Instead of 7 ``` ### Latest Backup is Stale (> 25 hours) ```bash # Check backup log for errors tail -100 logs/backup.log | grep -i error # Check cron is running crontab -l | grep backup # Check cron logs grep CRON /var/log/syslog | grep backup # Run manual backup ./database/backup.sh --daily ``` ## Backup Verification ### Check Backup File Integrity ```bash # Test gzip file is not corrupted gunzip -t backups/dumps/ferrero_tracking_2025-11-11_02-00.sql.gz echo $? # Should output 0 if OK # View first 20 lines of backup (without extracting) gunzip -c backups/dumps/ferrero_tracking_2025-11-11_02-00.sql.gz | head -20 ``` ### Verify Backup Contents ```bash # Count how many INSERT statements gunzip -c backups/dumps/ferrero_tracking_2025-11-11_02-00.sql.gz | grep -c "INSERT" # Check for specific table gunzip -c backups/dumps/ferrero_tracking_2025-11-11_02-00.sql.gz | grep "CREATE TABLE creativex_scores" ``` ## Production Deployment ### Initial Setup on Server ```bash # SSH to server ssh user@production-server # Navigate to project cd /opt/ferrero-automation/Python-Version # Make scripts executable chmod +x database/backup.sh chmod +x database/restore.sh chmod +x database/check_backups.sh # Create backup directories mkdir -p backups/dumps mkdir -p backups/basebackups mkdir -p backups/safety mkdir -p backups/manual # Set permissions chmod 700 backups chmod 755 database/*.sh # Test backup manually ./database/backup.sh --daily ``` ### Add to Cron ```bash # Edit crontab crontab -e # Add these lines: 0 2 * * * cd /opt/ferrero-automation/Python-Version && ./database/backup.sh --daily >> logs/backup.log 2>&1 0 3 * * 0 cd /opt/ferrero-automation/Python-Version && ./database/backup.sh --weekly >> logs/backup.log 2>&1 0 8 * * * cd /opt/ferrero-automation/Python-Version && ./database/check_backups.sh --quiet >> logs/backup_check.log 2>&1 # Save and exit ``` ### Verify Cron Setup ```bash # List cron jobs crontab -l | grep backup # Check cron service is running systemctl status cron # or 'crond' on some systems # Wait for next scheduled backup and check tail -f logs/backup.log ``` ## Advanced Operations ### Backup Before Major Changes ```bash # Before schema changes ./database/backup.sh --daily # Verify backup created ls -lh backups/dumps/ | tail -1 # Make your changes # ... # If something goes wrong, restore ./database/restore.sh backups/dumps/ferrero_tracking_2025-11-11_14-30.sql.gz ``` ### Backup to Remote Server (Optional) Add to cron for offsite backup: ```bash # Rsync backups to remote server daily at 4:00 AM 0 4 * * * rsync -avz /opt/ferrero-automation/Python-Version/backups/ backup-server:/ferrero-backups/ >> /opt/ferrero-automation/Python-Version/logs/rsync.log 2>&1 ``` ### Point-in-Time Recovery For future enhancement, enable WAL archiving in Docker: ```yaml # docker-compose.yml services: postgres: command: postgres -c wal_level=replica -c archive_mode=on -c archive_command='cp %p /backups/wal/%f' volumes: - ./backups/wal:/backups/wal ``` ## Maintenance ### Weekly Tasks - ✅ Verify latest backup exists - ✅ Check backup file sizes are reasonable - ✅ Review backup logs for errors ```bash # Quick weekly check ./database/check_backups.sh tail -100 logs/backup.log | grep -E "ERROR|✓" ``` ### Monthly Tasks - ✅ Test restore process with test database - ✅ Verify email notifications work - ✅ Check disk space usage ```bash # Monthly test restore docker exec ferrero-tracking-db createdb -U ferrero_user test_restore gunzip -c backups/dumps/ferrero_tracking_2025-11-11_02-00.sql.gz | \ docker exec -i ferrero-tracking-db psql -U ferrero_user -d test_restore docker exec ferrero-tracking-db dropdb -U ferrero_user test_restore ``` ## Security ### Backup File Permissions ```bash # Set restrictive permissions on backup directories chmod 700 backups chmod 700 backups/dumps chmod 700 backups/basebackups chmod 600 backups/dumps/*.sql.gz chmod 600 backups/basebackups/*.tar.gz ``` ### What's in Backups? Backup files contain: - All database tables (master_assets, creativex_scores, etc.) - All data including metadata (JSONB columns) - Database schema (tables, indexes, functions, triggers) - **NO sensitive files** (.env, certificates, Box-config.json) ### Encryption (Future Enhancement) For offsite backups, consider encryption: ```bash # Encrypt backup gpg --encrypt --recipient your-email@domain.com backup.sql.gz # Decrypt when needed gpg --decrypt backup.sql.gz.gpg > backup.sql.gz ``` ## Backup Failure Scenarios ### What Happens if Backup Fails? 1. **Error logged** to `logs/backup.log` 2. **Email sent** to `ERROR_EMAIL` 3. **Cron continues** - will retry next day 4. **Previous backups** still available ### Email Template for Failures Recipients receive email with: - Error message - Timestamp of failure - Server hostname - Instructions to check logs ## Recovery Time Objectives | Scenario | Data Loss | Recovery Time | |----------|-----------|---------------| | Restore from daily (2 AM backup) | Max 24 hours | ~2-5 minutes | | Restore from weekly (Sunday backup) | Max 7 days | ~1 minute | | Manual backup before change | No loss | ~2-5 minutes | ## FAQs ### How much data will I lose if I restore? Any data created **after** the backup timestamp will be lost. Example: - Backup at 2:00 AM on Nov 11 - Database corrupted at 3:00 PM on Nov 11 - Restore from 2:00 AM backup - **Lost:** 13 hours of data (assets processed between 2 AM - 3 PM) ### Can I restore just one table? Yes, with pg_dump backups: ```bash # Extract just creativex_scores table gunzip -c backups/dumps/ferrero_tracking_2025-11-11_02-00.sql.gz | \ grep -A 10000 "CREATE TABLE creativex_scores" | \ docker exec -i ferrero-tracking-db psql -U ferrero_user -d ferrero_tracking ``` ### How do I backup before deploying new code? ```bash # Quick manual backup ./database/backup.sh --daily # Or ad-hoc backup with custom name docker exec ferrero-tracking-db pg_dump -U ferrero_user ferrero_tracking | \ gzip > "backups/manual/before_deployment_$(date +%Y-%m-%d).sql.gz" ``` ### What if I need backups older than 7 days? Modify retention in `backup.sh`: ```bash # Edit backup.sh DAILY_RETENTION_DAYS=30 # Keep 30 days instead of 7 ``` Or manually copy important backups to archive: ```bash mkdir -p backups/archive cp backups/dumps/ferrero_tracking_2025-11-11_02-00.sql.gz backups/archive/ ``` ## Support Commands ### Database Size ```bash # Check database size docker exec ferrero-tracking-db psql -U ferrero_user -d ferrero_tracking -c " SELECT pg_size_pretty(pg_database_size('ferrero_tracking')); " ``` ### Table Sizes ```bash # Check individual table sizes docker exec ferrero-tracking-db psql -U ferrero_user -d ferrero_tracking -c " SELECT tablename, pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size FROM pg_tables WHERE schemaname = 'public' ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC; " ``` ### Backup History ```bash # List all backups with sizes find backups/ -name "*.sql.gz" -o -name "*.tar.gz" | \ xargs ls -lh | \ awk '{print $9 " - " $5 " - " $6 " " $7 " " $8}' ``` ## Summary ✅ **Automated** - Daily and weekly backups via cron ✅ **Reliable** - Dual strategy (logical + physical) ✅ **Space-efficient** - 7-day retention, ~50 MB total ✅ **Safe** - Safety backups before restore ✅ **Monitored** - Health checks and email alerts ✅ **Docker-compatible** - Works with containerized PostgreSQL ✅ **Production-ready** - Standard PostgreSQL tools **Recommendation:** Set up cron jobs immediately and test restore monthly.