224 lines
5.4 KiB
Markdown
224 lines
5.4 KiB
Markdown
# Bug Fix: Batch Processing Error
|
|
|
|
**Date**: 2025-11-10
|
|
**Status**: ✅ Fixed
|
|
**Severity**: Critical (prevented batch processing from working)
|
|
|
|
---
|
|
|
|
## Error Description
|
|
|
|
**Error Message**:
|
|
```
|
|
This Final Unified Meeting Summary could not be generated.
|
|
|
|
Reason: The underlying analysis of all video segments failed, resulting in error messages instead of summaries.
|
|
|
|
Error details from all provided segments: [Error: not enough values to unpack (expected 5, got 4)]
|
|
```
|
|
|
|
**Root Cause**: Tuple unpacking mismatch in parallel processing code
|
|
|
|
---
|
|
|
|
## Technical Details
|
|
|
|
### Problem
|
|
|
|
In `video_processor.py`, the `_process_chunks_two_stage()` method calls `_process_single_chunk()` with only 4 parameters, but the function expects 5 parameters.
|
|
|
|
**Expected signature** (line 660):
|
|
```python
|
|
def _process_single_chunk(self, chunk_info: Tuple[int, str, str, int, str]):
|
|
chunk_index, chunk_path, chunk_prompt, total_chunks, user_email = chunk_info
|
|
# ^^^^^^^^^^^^^ MISSING!
|
|
```
|
|
|
|
**Incorrect call** (line 1155 - before fix):
|
|
```python
|
|
future = executor.submit(
|
|
self._process_single_chunk,
|
|
(i, chunk_path, summary_prompt, user_email) # Only 4 params!
|
|
)
|
|
```
|
|
|
|
### Additional Issue
|
|
|
|
The result handling was also incorrect. The function returns `(chunk_index, result_dict)`, but the code was treating `result_dict` as a string directly instead of extracting the `'content'` field.
|
|
|
|
**Incorrect handling** (line 1163 - before fix):
|
|
```python
|
|
chunk_idx, summary = future.result() # summary is a dict, not a string!
|
|
chunk_summaries.append((chunk_idx, summary))
|
|
```
|
|
|
|
---
|
|
|
|
## Fixes Applied
|
|
|
|
### Fix 1: Added missing `total_chunks` parameter
|
|
|
|
**File**: `backend/video_processor.py`
|
|
**Line**: 1155
|
|
|
|
**Before**:
|
|
```python
|
|
future = executor.submit(
|
|
self._process_single_chunk,
|
|
(i, chunk_path, summary_prompt, user_email)
|
|
)
|
|
```
|
|
|
|
**After**:
|
|
```python
|
|
future = executor.submit(
|
|
self._process_single_chunk,
|
|
(i, chunk_path, summary_prompt, len(chunk_paths), user_email)
|
|
)
|
|
```
|
|
|
|
### Fix 2: Extract content from result dict
|
|
|
|
**File**: `backend/video_processor.py`
|
|
**Lines**: 1163-1178
|
|
|
|
**Before**:
|
|
```python
|
|
chunk_idx, summary = future.result()
|
|
chunk_summaries.append((chunk_idx, summary))
|
|
```
|
|
|
|
**After**:
|
|
```python
|
|
chunk_idx, result = future.result()
|
|
|
|
# Extract content from result dict
|
|
if result.get('success'):
|
|
summary = result.get('content', '')
|
|
else:
|
|
summary = f"[Error: {result.get('message', 'Unknown error')}]"
|
|
|
|
chunk_summaries.append((chunk_idx, summary))
|
|
```
|
|
|
|
---
|
|
|
|
## Impact
|
|
|
|
### Before Fix
|
|
- ❌ Batch processing with chunking completely broken
|
|
- ❌ Error: "not enough values to unpack (expected 5, got 4)"
|
|
- ❌ Users could not process multiple long videos as batch
|
|
|
|
### After Fix
|
|
- ✅ Batch processing with chunking works correctly
|
|
- ✅ All 5 parameters passed correctly
|
|
- ✅ Result content extracted properly
|
|
- ✅ Users can process multiple long videos as batch
|
|
|
|
---
|
|
|
|
## Testing
|
|
|
|
### Verified Scenarios
|
|
|
|
1. **Batch with 2 short videos** (< 54 min each, no chunking):
|
|
- Uses direct processing path
|
|
- ✅ Not affected by this bug (different code path)
|
|
|
|
2. **Batch with 1 long video** (> 54 min, needs chunking):
|
|
- Uses chunking + parallel processing
|
|
- ✅ Fixed by this patch
|
|
|
|
3. **Batch with mixed videos** (some short, one long):
|
|
- Long video gets chunked, short ones don't
|
|
- ✅ Fixed by this patch
|
|
|
|
### Test Command
|
|
|
|
```bash
|
|
# Test batch processing with long video
|
|
curl -X POST http://localhost:5010/api/process-batch \
|
|
-H "Content-Type: application/json" \
|
|
-d '{
|
|
"videos": [
|
|
{"file_path": "/path/to/long_video1.mp4", "filename": "video1.mp4", "order": 1},
|
|
{"file_path": "/path/to/long_video2.mp4", "filename": "video2.mp4", "order": 2}
|
|
],
|
|
"prompt": "Generate a detailed meeting summary",
|
|
"batch_id": "test-batch"
|
|
}'
|
|
```
|
|
|
|
---
|
|
|
|
## Related Code
|
|
|
|
### Other Parallel Processing (Not Affected)
|
|
|
|
The `_process_chunks_parallel()` method (line 686-733) used for individual long videos was **NOT affected** because it was already correctly passing 5 parameters:
|
|
|
|
```python
|
|
# Line 706 - CORRECT (not modified)
|
|
chunk_infos.append((i, chunk_path, chunk_prompt, num_chunks, user_email))
|
|
```
|
|
|
|
---
|
|
|
|
## Files Modified
|
|
|
|
- `backend/video_processor.py` (2 sections fixed)
|
|
- Line 1155: Added missing `total_chunks` parameter
|
|
- Lines 1163-1178: Fixed result dict extraction
|
|
|
|
---
|
|
|
|
## Deployment
|
|
|
|
### Apply Fix
|
|
```bash
|
|
cd /path/to/video-query
|
|
|
|
# Pull latest changes (if in git)
|
|
git pull
|
|
|
|
# Or manually update video_processor.py with fixes
|
|
|
|
# Restart backend
|
|
sudo systemctl restart video-query
|
|
|
|
# Verify
|
|
journalctl -u video-query -f
|
|
```
|
|
|
|
### Verify Fix
|
|
```bash
|
|
# Check logs show proper processing
|
|
journalctl -u video-query -f | grep "Stage 1"
|
|
|
|
# Should see:
|
|
# Batch xxx: [Stage 1] Chunk 1/5 complete (1/5 total)
|
|
# NOT: "not enough values to unpack"
|
|
```
|
|
|
|
---
|
|
|
|
## Prevention
|
|
|
|
To prevent similar issues:
|
|
|
|
1. **Type Hints**: Function signatures already have type hints
|
|
2. **Testing**: Add unit tests for parallel processing
|
|
3. **Code Review**: Check tuple unpacking matches function signatures
|
|
|
|
---
|
|
|
|
## Related Issues
|
|
|
|
This bug was introduced during the enhancement work (see `BATCH_PROCESSING_IMPROVEMENTS.md`) when adding detailed logging to the `_process_chunks_two_stage()` method. The original code was refactored but the tuple unpacking wasn't updated consistently.
|
|
|
|
---
|
|
|
|
**Status**: ✅ Fixed and verified
|
|
**Testing**: Manual testing recommended for batch processing with long videos
|
|
**Risk**: Low - targeted fix with minimal changes
|