# CreativeX Score Version Tracking & Decimal Removal

## Changes Made

### 1. Decimal Removal (.0 suffix)
**Problem:** CreativeX ID and Quality Score were being stored as `6864255.0` and `80.0`
**Solution:** Strip the `.0` suffix before storing

**Code Changes:**
- `scripts/creativex_scoring_storing.py` - `parse_csv_fields()` method
  - Converts to int, then back to string: `str(int(float(value)))`
  - Applied to both `creativex_id` and `quality_score`

**Result:**
- Before: ID=`6864255.0`, Score=`80.0`
- After: ID=`6864255`, Score=`80`

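
One caveat with `str(int(float(value)))` is that it truncates any fractional part (`80.5` would become `80`) and raises `ValueError` on empty or non-numeric input. The following is a minimal sketch of a more defensive variant. The helper name `strip_trailing_zero_decimal` is hypothetical and is not taken from the actual script, which may handle these cases differently.

```python
def strip_trailing_zero_decimal(value):
    """Return value as a string, dropping a redundant ".0" suffix.

    Hypothetical helper for illustration only; the real
    parse_csv_fields() may behave differently.
    """
    text = str(value).strip()
    try:
        number = float(text)
    except ValueError:
        # Not numeric (e.g. empty string or "N/A"): leave it unchanged.
        return text
    if number.is_integer():
        # Whole numbers lose the decimal part: "80.0" becomes "80".
        return str(int(number))
    # Keep genuine fractions such as "80.5" rather than truncating them.
    return text


print(strip_trailing_zero_decimal("6864255.0"))  # 6864255
print(strip_trailing_zero_decimal(80.0))         # 80
print(strip_trailing_zero_decimal("80.5"))       # 80.5
```

Whether fractional scores can actually occur in the CreativeX export is not stated here; if they cannot, the simpler one-liner above is sufficient.
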
### 2. Version Number Tracking
**Feature:** Track how many times each file has been scored
**Implementation:** Count total records for the filename (including superseded)

**Code Changes:**

#### Database Method (`scripts/shared/database.py`):
- Added version counter query before insert
- Counts ALL versions (active + superseded) for filename
- Returns `version_number` in result dict
- Logs include version: "Score: 80 -> 85, Version: 3"

#### Script (`scripts/creativex_scoring_storing.py`):
- Captures `version_number` from database result
- Passes it to the email template in the `processed_files` list
- Logs show version: "Success: Score 85 extracted (Version 3)"

#### Email Templates (`scripts/shared/notifier.py`):
- **Complete Template:**
  - Shows version badge in header: "Version 3 (Updated)"
  - Only displays if `version_number > 1`
  - Note below: "This is version 3 of this file"

- **Partial Template:**
  - Shows inline: "Score: 85 (Version 3)"

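
The complete-template behavior described above can be sketched as plain-text rendering. This is an illustration only: `render_complete_summary` is a hypothetical name, and the real templates in `notifier.py` presumably produce richer output than this.

```python
def render_complete_summary(filename, quality_score, creativex_id, version_number=1):
    """Build a plain-text summary with a version badge for re-scored files.

    Hypothetical function for illustration; not the notifier.py code.
    """
    header = f"Filename: {filename}"
    if version_number > 1:
        # The badge is shown only for re-scored files.
        header += f" [Version {version_number} (Updated)]"
    lines = [
        header,
        f"Quality Score: {quality_score}",
        f"CreativeX ID: {creativex_id}",
    ]
    if version_number > 1:
        # The note follows the same rule as the badge.
        lines.append("")
        lines.append(
            f"Note: This is version {version_number} of this file "
            "(previous versions preserved in database)"
        )
    return "\n".join(lines)


print(render_complete_summary("video.mp4", "80", "6864255"))
print()
print(render_complete_summary("video.mp4", "85", "6864255", version_number=3))
```
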
## Example Email Output

### First Upload (New File):
```
Filename: video.mp4
Quality Score: 80
CreativeX ID: 6864255
```

### Third Upload (Re-scored):
```
Filename: video.mp4 [Version 3 (Updated)]
Quality Score: 85
CreativeX ID: 6864255

📝 Note: This is version 3 of this file (previous versions preserved in database)
```

## Database Behavior

### Version Counter Logic:
1. Count ALL records with this filename
2. New version = count + 1
3. Mark old `active` → `superseded`
4. Insert new record as `active` with incremented version

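
The four steps above can be sketched with Python's built-in `sqlite3` module as a stand-in database. The actual backend, the function name `store_score`, and the reduced column set are all assumptions for illustration. Only the table name `creativex_scores` and the `status` values come from this document.

```python
import sqlite3


def store_score(conn, filename, quality_score):
    """Insert a new active score and return its version number.

    Hypothetical sketch; the real method in database.py may differ.
    """
    with conn:  # one transaction: commit on success, roll back on error
        # Step 1: count all records for this filename (active + superseded).
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM creativex_scores WHERE filename = ?",
            (filename,),
        ).fetchone()
        # Step 2: the new version is the existing count plus one.
        version_number = count + 1
        # Step 3: mark the current active record as superseded.
        conn.execute(
            "UPDATE creativex_scores SET status = 'superseded' "
            "WHERE filename = ? AND status = 'active'",
            (filename,),
        )
        # Step 4: insert the new record as active.
        conn.execute(
            "INSERT INTO creativex_scores (filename, quality_score, status) "
            "VALUES (?, ?, 'active')",
            (filename, quality_score),
        )
    return version_number


# In-memory demo table with an assumed, minimal schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE creativex_scores ("
    "id INTEGER PRIMARY KEY, filename TEXT, quality_score TEXT, status TEXT)"
)
for score in ("80", "85", "90"):
    version = store_score(conn, "video.mp4", score)
print(version)  # 3
```

Because the version is derived from a row count, it stays correct only while rows are never deleted and while two runs do not process the same filename at the same time. If either can happen, storing the version in its own column under a unique constraint may be the safer design.
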
### Example Database State:
```
ID | Filename  | Score | Status     | Version (implicit)
1  | video.mp4 | 80    | superseded | 1 (first)
2  | video.mp4 | 85    | superseded | 2 (second)
3  | video.mp4 | 90    | active     | 3 (current)
```

## Query Examples

### Get Latest Version Only:
```sql
SELECT * FROM creativex_scores
WHERE filename = 'video.mp4' AND status = 'active';
```

### Get Version Count for File:
```sql
SELECT COUNT(*) as version_count
FROM creativex_scores
WHERE filename = 'video.mp4';
```

### Get All Versions with Numbers:
```sql
SELECT
    filename,
    quality_score,
    status,
    ROW_NUMBER() OVER (PARTITION BY filename ORDER BY created_at) as version_number,
    extracted_at
FROM creativex_scores
WHERE filename = 'video.mp4'
ORDER BY extracted_at;
```

## Testing Checklist

- [ ] Upload PDF to Box folder 350605024645
- [ ] Run script: `python scripts/creativex_scoring_storing.py`
- [ ] Check logs show version number
- [ ] Check database: ID and Score have no `.0`
- [ ] Check email shows version badge (if > 1)
- [ ] Re-upload same PDF with different score
- [ ] Verify version counter increments
- [ ] Verify old record marked `superseded`

## Benefits

1. **Clean Data:** No unnecessary `.0` decimals in IDs and scores
2. **Version Tracking:** Know whether a file has been re-scored
3. **History Preserved:** All previous scores remain available for audit
4. **Email Clarity:** Users see whether a file is new or an update
5. **A2→A3 Ready:** Latest version automatically selected via `status='active'`

## Future Use in A2→A3

The version tracking is informational only. The `get_creativex_score_by_filename()` method automatically returns the latest `active` version, so the A2→A3 workflow does not need to handle versions itself.

```python
# This always returns the latest version
score_data = db.get_creativex_score_by_filename(filename)
# score_data['quality_score'] will be "90" (not "90.0")
# score_data['creativex_id'] will be "6864255" (not "6864255.0")
```