# CreativeX Score Version Tracking & Decimal Removal ## Changes Made ### 1. Decimal Removal (.0 suffix) **Problem:** CreativeX ID and Quality Score were storing as `6864255.0` and `80.0` **Solution:** Strip `.0` decimals before storing **Code Changes:** - `scripts/creativex_scoring_storing.py` - `parse_csv_fields()` method - Converts to int then back to string: `str(int(float(value)))` - Applied to both `creativex_id` and `quality_score` **Result:** - Before: ID=`6864255.0`, Score=`80.0` - After: ID=`6864255`, Score=`80` ### 2. Version Number Tracking **Feature:** Track how many times each file has been scored **Implementation:** Count total records for filename (including superseded) **Code Changes:** #### Database Method (`scripts/shared/database.py`): - Added version counter query before insert - Counts ALL versions (active + superseded) for filename - Returns `version_number` in result dict - Logs include version: "Score: 80 -> 85, Version: 3" #### Script (`scripts/creativex_scoring_storing.py`): - Captures `version_number` from database result - Passes to email template in `processed_files` list - Logs show version: "Success: Score 85 extracted (Version 3)" #### Email Templates (`scripts/shared/notifier.py`): - **Complete Template:** - Shows version badge in header: "Version 3 (Updated)" - Only displays if `version_number > 1` - Note below: "This is version 3 of this file" - **Partial Template:** - Shows inline: "Score: 85 (Version 3)" ## Example Email Output ### First Upload (New File): ``` Filename: video.mp4 Quality Score: 80 CreativeX ID: 6864255 ``` ### Third Upload (Re-scored): ``` Filename: video.mp4 [Version 3 (Updated)] Quality Score: 85 CreativeX ID: 6864255 📝 Note: This is version 3 of this file (previous versions preserved in database) ``` ## Database Behavior ### Version Counter Logic: 1. Count ALL records with this filename 2. New version = count + 1 3. Mark old `active` → `superseded` 4. Insert new record as `active` with incremented version ### Example Database State: ``` ID | Filename | Score | Status | Version (implicit) 1 | video.mp4 | 80 | superseded | 1 (first) 2 | video.mp4 | 85 | superseded | 2 (second) 3 | video.mp4 | 90 | active | 3 (current) ``` ## Query Examples ### Get Latest Version Only: ```sql SELECT * FROM creativex_scores WHERE filename = 'video.mp4' AND status = 'active'; ``` ### Get Version Count for File: ```sql SELECT COUNT(*) as version_count FROM creativex_scores WHERE filename = 'video.mp4'; ``` ### Get All Versions with Numbers: ```sql SELECT filename, quality_score, status, ROW_NUMBER() OVER (PARTITION BY filename ORDER BY created_at) as version_number, extracted_at FROM creativex_scores WHERE filename = 'video.mp4' ORDER BY extracted_at; ``` ## Testing Checklist - [ ] Upload PDF to Box folder 350605024645 - [ ] Run script: `python scripts/creativex_scoring_storing.py` - [ ] Check logs show version number - [ ] Check database: ID and Score have no `.0` - [ ] Check email shows version badge (if > 1) - [ ] Re-upload same PDF with different score - [ ] Verify version counter increments - [ ] Verify old record marked `superseded` ## Benefits 1. **Clean Data:** No unnecessary `.0` decimals in IDs and scores 2. **Version Tracking:** Know if file has been re-scored 3. **History Preserved:** All previous scores available for audit 4. **Email Clarity:** Users see when a file is being updated vs new 5. **A2→A3 Ready:** Latest version automatically selected via `status='active'` ## Future Use in A2→A3 The version tracking is informational only. The `get_creativex_score_by_filename()` method automatically returns the latest `active` version, so A2→A3 workflow doesn't need to worry about versions. ```python # This always returns the latest version score_data = db.get_creativex_score_by_filename(filename) # score_data['quality_score'] will be "90" (not "90.0") # score_data['creativex_id'] will be "6864255" (not "6864255.0") ```