Problem: Header detection picked data rows (with Yes/No/numbers) as headers because they had more filled cells than the actual header row (which had merged cells with gaps). Result: data values became column labels, deep extraction failed. Fix: - Header values must be text-like (not numbers, Yes/No, 0/1, ü, x, -) - Only consecutive header rows count - stop scanning at first data row - Multi-row headers combined (row 1 + row 2 both contribute) - Tested against Wella Job Routes 2: correctly identifies row 2 as header with "Buckets | Categories | Top 10 deliverables | Tier A | Tier B | Tier C" Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|---|---|---|
| .. | ||
| alembic | ||
| app | ||
| alembic.ini | ||
| Dockerfile | ||
| requirements.txt | ||
| start.sh | ||