Commit graph

4 commits

Author SHA1 Message Date
DJP
e0128d98b8 Add automated PostgreSQL backup and restore system
Implements dual backup strategy with daily SQL dumps and weekly binary
backups, complete with restore capabilities and health monitoring.

Backup System Components:

1. database/backup.sh:
   - Daily mode: pg_dump SQL dumps (7-day retention)
   - Weekly mode: pg_basebackup binary backup (latest only)
   - Automatic cleanup of old backups
   - Compression (gzip) for space efficiency
   - Email notifications on failures
   - Docker-compatible execution
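
The retention rule for daily dumps can be illustrated with a minimal sketch. It is not the contents of backup.sh: it assumes count-based retention (keep the 7 newest files) and a hypothetical `*.sql.gz` naming pattern, neither of which the commit message confirms.

```python
from pathlib import Path
from typing import List


def prune_old_dumps(dump_dir: Path, keep: int = 7, pattern: str = "*.sql.gz") -> List[Path]:
    """Delete all but the `keep` newest dumps and return the deleted paths.

    The file pattern and the count-based policy are assumptions made for
    illustration; the real script may prune by age instead.
    """
    # Newest first, ordered by modification time.
    dumps = sorted(dump_dir.glob(pattern), key=lambda p: p.stat().st_mtime, reverse=True)
    stale = dumps[keep:]
    for path in stale:
        path.unlink()
    return stale
```

Calling `prune_old_dumps(Path("backups/dumps"))` after each daily run would leave at most seven dumps in place.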

2. database/restore.sh:
   - Restore from SQL dump backups
   - Safety backup before restore
   - Confirmation prompts
   - Validation and verification
   - List available backups

3. database/check_backups.sh:
   - Health check monitoring
   - Verifies latest backup age (warns if > 25 hours)
   - Displays backup counts and sizes
   - Quiet mode for cron automation
   - Lists all available backups
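
The age check described above might look roughly like the following. This is a sketch, not check_backups.sh itself: the 25-hour threshold comes from the commit message, while the function names and the `*.sql.gz` pattern are hypothetical.

```python
import time
from pathlib import Path
from typing import Optional

MAX_AGE_HOURS = 25  # threshold stated in the commit message


def latest_backup_age_hours(dump_dir: Path, pattern: str = "*.sql.gz",
                            now: Optional[float] = None) -> Optional[float]:
    """Return the age in hours of the newest dump, or None if there are none."""
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in dump_dir.glob(pattern)]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600


def is_healthy(age_hours: Optional[float]) -> bool:
    """Healthy means a backup exists and is not older than the threshold."""
    return age_hours is not None and age_hours <= MAX_AGE_HOURS
```

A 25-hour limit on a 24-hour schedule leaves an hour of slack, so a backup that finishes slightly late does not trigger a warning.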

Documentation:

- DATABASE_BACKUP_GUIDE.md: Complete backup/restore guide
  - Automated cron setup
  - Manual backup procedures
  - Restore scenarios
  - Troubleshooting
  - Disk space management

- backups/README.md: Quick reference
  - Directory structure
  - Common commands
  - Retention policy
  - Security notes

Configuration:

- Updated .gitignore to exclude backup files
- Backup locations: backups/dumps/, backups/basebackups/
- Logs: logs/backup.log, logs/restore.log
- Retention: 7 daily dumps + 1 weekly basebackup

Cron Schedule (Production):
- Daily: 2:00 AM (pg_dump)
- Weekly: Sundays 3:00 AM (pg_basebackup)
- Health Check: 8:00 AM daily

Features:
- Automated daily and weekly backups
- Dual strategy (logical + physical)
- Space-efficient (7-day retention, ~50 MB total)
- Safety backups before restore
- Email alerts on failures
- Health monitoring
- Docker-compatible
- Tested locally

Testing Performed:
- Daily backup created successfully (77K compressed)
- Backup file integrity verified (gzip test passed)
- Health check shows "Backup system healthy"
- restore.sh --list command works
- All scripts executable and functional
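
The integrity check listed above was presumably done with the gzip command-line tool. A rough Python equivalent, offered only as a sketch with a hypothetical function name, decompresses the whole file and reports whether that succeeds:

```python
import gzip
import zlib
from pathlib import Path


def gzip_is_intact(path: Path, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file decompresses fully without error."""
    try:
        with gzip.open(path, "rb") as fh:
            # Read to the end so the trailing CRC and length are verified.
            while fh.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # gzip.BadGzipFile is a subclass of OSError; truncation raises EOFError.
        return False
```

This only confirms that the archive is readable. It does not confirm that the SQL inside restores cleanly.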

Disk Usage Estimate:
- Daily dumps: 7 × ~2 MB = ~14 MB
- Weekly backup: 1 × ~30 MB = ~30 MB
- Total: ~50 MB maximum
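
Worked through explicitly, the estimate sums to about 44 MB, so the ~50 MB figure reads as a rounded ceiling rather than an exact total. The constants below are the approximate sizes quoted above:

```python
DAILY_DUMPS = 7
DAILY_DUMP_MB = 2        # approximate size of one compressed dump
WEEKLY_BACKUPS = 1
WEEKLY_BACKUP_MB = 30    # approximate size of one base backup

daily_total_mb = DAILY_DUMPS * DAILY_DUMP_MB          # 7 * 2
weekly_total_mb = WEEKLY_BACKUPS * WEEKLY_BACKUP_MB   # 1 * 30
total_mb = daily_total_mb + weekly_total_mb
```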

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 17:30:10 -05:00
DJP
b6b9d7337a Add CreativeX score extraction and storage system
Implements new workflow to extract CreativeX quality scores from PDFs
using LlamaExtract AI and store results in PostgreSQL database.

Components added:
- creativex_scoring_storing.py: Main script to process PDFs from Box
- creativex_scores table: Database table with JSONB for full JSON storage
- Database methods: store_creativex_score() and get_creativex_score_by_filename()
- Email templates: creativex_complete, creativex_partial, creativex_no_files
- Configuration: creativex section in config.yaml
- CREATIVEX_DEPLOYMENT.md: Complete deployment and usage guide

Features:
- Monitors Box folder 350605024645 for PDFs
- Extracts scores using LlamaExtract agent "Creativex-Extract"
- Stores 4 key fields (filename, ID, URL, score) + full JSON
- Deletes processed PDFs from Box after successful extraction
- Sends email notifications for success/partial/no-files scenarios
- Manual execution (python scripts/creativex_scoring_storing.py)

Database schema:
- Table: creativex_scores with 10 columns
- Indexes on filename, box_file_id, status for fast lookups
- JSONB column stores complete extraction for future flexibility

Future integration ready:
db.get_creativex_score_by_filename() is available for DAM upload workflows
that need to attach CreativeX metadata during asset processing.
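
The store-and-lookup pattern can be sketched as follows. This is an illustration only: it uses an in-memory SQLite table as a stand-in for PostgreSQL, a reduced hypothetical schema rather than the actual 10-column table, and guessed signatures for the two methods named in this commit. The extraction keys (`id`, `url`, `score`) are likewise assumptions.

```python
import json
import sqlite3
from typing import Any, Dict, Optional


def create_table(conn: sqlite3.Connection) -> None:
    # Reduced, hypothetical schema; a TEXT column stands in for JSONB.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS creativex_scores (
               filename      TEXT NOT NULL,
               creativex_id  TEXT,
               creativex_url TEXT,
               score         REAL,
               full_json     TEXT NOT NULL
           )"""
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_cx_filename ON creativex_scores (filename)")


def store_creativex_score(conn: sqlite3.Connection, filename: str,
                          extraction: Dict[str, Any]) -> None:
    """Store the key fields plus the complete extraction as JSON."""
    conn.execute(
        "INSERT INTO creativex_scores (filename, creativex_id, creativex_url, score, full_json) "
        "VALUES (?, ?, ?, ?, ?)",
        (filename, extraction.get("id"), extraction.get("url"),
         extraction.get("score"), json.dumps(extraction)),
    )
    conn.commit()


def get_creativex_score_by_filename(conn: sqlite3.Connection,
                                    filename: str) -> Optional[Dict[str, Any]]:
    """Return the stored record for a filename, or None if it is unknown."""
    row = conn.execute(
        "SELECT creativex_id, creativex_url, score, full_json "
        "FROM creativex_scores WHERE filename = ?",
        (filename,),
    ).fetchone()
    if row is None:
        return None
    return {"id": row[0], "url": row[1], "score": row[2], "full": json.loads(row[3])}
```

Keeping the full extraction alongside the key fields means fields that were not promoted to columns can still be read later without re-processing the PDF.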

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-11 16:15:45 -05:00
DJP
5f6d24c550 Fix timestamp bug in campaign status recording
- Fixed database.py line 479: replaced the literal string 'CURRENT_TIMESTAMP' with an actual datetime value
- Added datetime import for proper UTC timestamp generation
- This fixes the PostgreSQL error: invalid input syntax for type timestamp
- Added migration file for campaign_status table
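
The likely mechanism behind this bug can be sketched as follows. It assumes the value was passed as a bound query parameter: a driver sends a parameter as data, so the text 'CURRENT_TIMESTAMP' reaches PostgreSQL as a string to be parsed as a timestamp rather than as the SQL keyword. The column names and helper names are hypothetical and are not taken from database.py.

```python
from datetime import datetime, timezone
from typing import Tuple

# Hypothetical statement; %s is the placeholder style used by psycopg2.
INSERT_SQL = (
    "INSERT INTO campaign_status (campaign_id, status, recorded_at) "
    "VALUES (%s, %s, %s)"
)


def status_params_buggy(campaign_id: str, status: str) -> Tuple[str, str, str]:
    # The third value is plain text, which PostgreSQL cannot parse as a timestamp.
    return (campaign_id, status, "CURRENT_TIMESTAMP")


def status_params_fixed(campaign_id: str, status: str) -> Tuple[str, str, datetime]:
    # A timezone-aware UTC datetime is adapted to a proper timestamp value.
    return (campaign_id, status, datetime.now(timezone.utc))
```

An alternative fix would be to write CURRENT_TIMESTAMP directly into the SQL text instead of binding it. Generating the value in Python, as this commit does, keeps the timestamp explicit and in UTC.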

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-05 16:34:46 -05:00
DJP
d4debbe3da Add complete deployment guide and database init.sql
Production-ready deployment package with database schema and server setup guide.

NEW FILES:
1. DEPLOYMENT_GUIDE.md - Complete server deployment instructions
2. database/init.sql - PostgreSQL database initialization script

DEPLOYMENT_GUIDE.md includes:
✓ Server deployment steps (rsync/scp)
✓ PostgreSQL setup (Docker or native)
✓ Python environment setup
✓ Configuration guide
✓ Cron job examples
✓ OAuth2 vs mTLS setup instructions
✓ Testing checklist
✓ Troubleshooting guide
✓ Production checklist
✓ Security hardening steps

database/init.sql includes:
✓ Complete schema with all 35 columns in master_assets
✓ All 4 tables (master_assets, derivative_assets, asset_events, workflow_state)
✓ Campaign relationship fields (global_master_campaign_id, global_master_folder_id, local_campaign_id)
✓ JSONB full_metadata column
✓ 12 indexes for performance
✓ 4 triggers (auto-update timestamps, event logging)
✓ 2 helper functions
✓ Proper permissions for ferrero_user

KEY FEATURES:
- Self-contained deployment
- Works with Docker or native PostgreSQL
- Includes all schema additions from this session
- OAuth2 and mTLS configuration documented
- Cron job templates provided
- Security checklist included

DEPLOYMENT:
1. Copy Python-Version/ folder to server
2. Run database init.sql
3. Install Python dependencies
4. Update .env for server
5. Test connections
6. Set up cron jobs

Everything needed for production deployment is included.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-05 09:27:25 -05:00