v3.1 Enterprise Edition: Excel/Import mapping, UI fixes, documentation update
Features:
- Smart column mapping for Excel and Import files (CSV/Excel/JSON)
- Modal dialogs for configuring sheet and column mappings
- Auto-detection of common column names (filename, title, description, keywords)
- Preview of first 3 rows before confirming mapping
- Case-insensitive filename matching without extension

UI Improvements:
- Fixed output folder selection (now uses text input instead of folder browser)
- Removed non-functional Reset button from metadata editor
- Clear button for output folder path

Documentation:
- Updated README.md with v3.1 Enterprise Edition information
- Developer: Vadym Samoilenko
- License: Corporate License - Oliver Marketing
- Added AI usage tracking and logging documentation
- Complete installation guide with all dependencies
- API endpoint documentation
- Security and privacy section
- Troubleshooting guide

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
parent e9784d7da8
commit 804c8acbbb
5 changed files with 1849 additions and 221 deletions
17
.gitignore
vendored
@@ -51,6 +51,7 @@ Thumbs.db
# Python virtual environments
venv/
venv_new/
venv_local/
env/
ENV/
.venv/
@@ -76,3 +77,19 @@ Files/
.vscode/
.claude/

# Database files
*.db
*.sqlite
*.sqlite3

# Server files
server.pid
server.log
nohup.out

# Test files
test_*.csv
test_*.xlsx
test_*.json
TEST_REPORT.md
505
README.md
|
|
@@ -1,97 +1,486 @@
|
|||
# Oliver Metadata Tool
|
||||
# Oliver Metadata Tool v3.1 Enterprise Edition
|
||||
|
||||
Universal metadata creation and management tool for all file types. Create, import, and manage metadata from multiple sources with an intuitive web interface.
|
||||
Universal metadata creation and management tool for all file types. Create, import, and manage metadata from multiple sources with an intuitive web interface, user authentication, and AI-powered metadata generation.
|
||||
|
||||
**Developer:** Vadym Samoilenko
|
||||
**License:** Corporate License - Oliver Marketing
|
||||
**Version:** 3.1 (Enterprise Edition)
|
||||
|
||||
---
|
||||
|
||||
## Features
|
||||
|
||||
- **Excel-based metadata lookup**: Reads metadata from "Celum ID to Adobe Asset Path Mapping Spreadsheet"
|
||||
- **Multi-format support**: PDF, images (JPG, PNG, etc.), Office documents (Word, Excel, PowerPoint), video files
|
||||
- **Unicode support**: Full support for Chinese, Japanese, Korean characters (CGA region)
|
||||
- **OCR capabilities**: Multi-language text extraction with Tesseract
|
||||
- **Web interface**: Flask-based UI for easy batch processing
|
||||
- **Dual-sheet Excel lookup**: Primary lookup from DSB sheet, fallback to Medsurg sheet
|
||||
### Multiple Metadata Sources
|
||||
- **📊 Excel Lookup**: Configure custom Excel files with column mapping
|
||||
- **🤖 AI Generation**: OpenAI-powered intelligent metadata generation
|
||||
- **✏️ Manual Entry**: Direct editing with real-time validation
|
||||
- **📂 File Import**: Import from CSV, Excel, or JSON with custom mapping
|
||||
- **📋 Templates**: Reusable metadata templates with variables
|
||||
|
||||
### Enterprise Features
|
||||
- **🔐 Authentication**: Local user authentication + Microsoft SSO support
|
||||
- **👥 User Management**: SQLite database for users and sessions
|
||||
- **📊 Audit Logging**: Track all user actions and metadata changes
|
||||
- **🔍 AI Usage Tracking**: Monitor OpenAI token usage and costs
|
||||
|
||||
### File Support
|
||||
- **300+ File Formats** via ExifTool integration
|
||||
- **PDF Files**: Full metadata support (title, subject, keywords, author, copyright)
|
||||
- **Images**: JPEG, PNG, GIF, HEIC, TIFF, RAW formats
|
||||
- **Office Documents**: Word, Excel, PowerPoint
|
||||
- **Video Files**: MP4, MOV, AVI, MKV
|
||||
- **Unicode Support**: Full support for Chinese, Japanese, Korean characters
|
||||
|
||||
### Advanced Capabilities
|
||||
- **Smart Field Mapping**: Auto-detect columns with fuzzy matching
|
||||
- **Batch Processing**: Process multiple files with selective updates
|
||||
- **Custom Metadata Fields**: Add unlimited custom fields
|
||||
- **CSV Export**: Export metadata and processing results
|
||||
- **Template Variables**: {filename}, {date}, {user}, custom variables
|
||||
|
||||
---
|
||||
|
||||
## Requirements
|
||||
|
||||
- Python 3.8+
|
||||
- Tesseract OCR (for image text extraction)
|
||||
- Poppler (for PDF processing)
|
||||
- **ExifTool 12.15+** (recommended - enables 300+ file formats and improved performance)
|
||||
### System Dependencies
|
||||
- **Python 3.8+**
|
||||
- **ExifTool 12.15+** (required for 300+ format support)
|
||||
- **Tesseract OCR** (optional - for image text extraction)
|
||||
- **Poppler** (optional - for PDF content extraction)
|
||||
|
||||
### Python Dependencies
|
||||
All listed in `requirements.txt`:
|
||||
- Flask 2.3.0+ (Web framework)
|
||||
- pandas, openpyxl (Excel/CSV processing)
|
||||
- PyExifTool 0.5.6+ (Metadata operations)
|
||||
- openai 1.0.0+ (AI generation)
|
||||
- tiktoken 0.5.0+ (Token counting)
|
||||
- tenacity 8.2.0+ (Retry logic)
|
||||
- msal (Microsoft SSO - optional)
|
||||
|
||||
---
|
||||
|
||||
## Installation
|
||||
|
||||
1. Install system dependencies:
|
||||
```bash
|
||||
# macOS
|
||||
brew install tesseract tesseract-lang poppler exiftool
|
||||
### 1. Install System Dependencies
|
||||
|
||||
# Linux (Ubuntu/Debian)
|
||||
sudo apt-get install tesseract-ocr tesseract-ocr-chi-sim tesseract-ocr-chi-tra tesseract-ocr-jpn tesseract-ocr-kor poppler-utils libimage-exiftool-perl
|
||||
**macOS:**
|
||||
```bash
|
||||
brew install exiftool tesseract tesseract-lang poppler
|
||||
```
|
||||
|
||||
**Note:** ExifTool is optional but highly recommended. It provides:
|
||||
- Support for 300+ file formats
|
||||
- 10-60x faster batch operations
|
||||
- Better PDF metadata writing
|
||||
- See [docs/EXIFTOOL_SETUP.md](docs/EXIFTOOL_SETUP.md) for detailed setup instructions
|
||||
|
||||
2. Create virtual environment and install Python packages:
|
||||
**Linux (Ubuntu/Debian):**
|
||||
```bash
|
||||
sudo apt-get install libimage-exiftool-perl tesseract-ocr tesseract-ocr-chi-sim tesseract-ocr-chi-tra tesseract-ocr-jpn tesseract-ocr-kor poppler-utils
|
||||
```
|
||||
|
||||
**Windows:**
|
||||
```bash
|
||||
# Install ExifTool from: https://exiftool.org/
|
||||
choco install exiftool tesseract
|
||||
```
|
||||
|
||||
**Verify ExifTool Installation:**
|
||||
```bash
|
||||
exiftool -ver
|
||||
# Should show version 12.15 or higher
|
||||
```
|
||||
|
||||
See [docs/EXIFTOOL_SETUP.md](docs/EXIFTOOL_SETUP.md) for detailed setup instructions.
|
||||
|
||||
### 2. Create Virtual Environment
|
||||
|
||||
```bash
|
||||
python3 -m venv venv_local
|
||||
source venv_local/bin/activate # On Windows: venv_local\Scripts\activate
|
||||
```
|
||||
|
||||
### 3. Install Python Dependencies
|
||||
|
||||
```bash
|
||||
python3 -m venv venv
|
||||
source venv/bin/activate # On Windows: venv\Scripts\activate
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
3. Set up environment variables (create `.env` file):
|
||||
### 4. Configure Environment Variables
|
||||
|
||||
Create a `.env` file in the project root:
|
||||
|
||||
```env
|
||||
# Required: OpenAI API Key (for AI metadata generation)
|
||||
OPENAI_API_KEY=your-openai-api-key-here
|
||||
|
||||
# Optional: Microsoft SSO (for enterprise authentication)
|
||||
# AZURE_CLIENT_ID=your-azure-client-id
|
||||
# AZURE_CLIENT_SECRET=your-azure-client-secret
|
||||
# AZURE_TENANT_ID=your-azure-tenant-id
|
||||
# REDIRECT_URI=http://localhost:5001/auth/callback
|
||||
|
||||
# Optional: Flask secret key (auto-generated if not set)
|
||||
# SECRET_KEY=your-secret-key-here
|
||||
|
||||
# Optional: AI settings (defaults shown)
|
||||
# AI_MODEL=gpt-4o-mini
|
||||
# MAX_TOKENS=500
|
||||
# TEMPERATURE=0.5
|
||||
# API_TIMEOUT=30
|
||||
# API_MAX_RETRIES=3
|
||||
```
|
||||
UPLOAD_FOLDER=uploads
|
||||
OUTPUT_FOLDER=output
|
||||
TESSERACT_PATH=/opt/homebrew/bin/tesseract
|
||||
OCR_LANGUAGES=eng+chi_sim+chi_tra+jpn+kor
|
||||
|
||||
### 5. Initialize Database
|
||||
|
||||
The database will be created automatically on first run. To manually initialize:
|
||||
|
||||
```bash
|
||||
python -c "from src.database import Database; db = Database(); print('Database initialized')"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Usage
|
||||
|
||||
### Web Interface
|
||||
### Starting the Web Application
|
||||
|
||||
```bash
|
||||
python web_app.py
|
||||
```
|
||||
|
||||
Open browser at `http://localhost:5001`
|
||||
The application will:
|
||||
1. ✅ Check for ExifTool availability
|
||||
2. ✅ Initialize SQLite database (users, sessions, audit_log)
|
||||
3. ✅ Start Flask server on http://localhost:5001
|
||||
4. 🌐 Open browser automatically
|
||||
|
||||
### GUI Application
|
||||
### Login
|
||||
|
||||
```bash
|
||||
python run_gui.py
|
||||
**Test Account:**
|
||||
- Username: `tester`
|
||||
- Password: `oliveradmin`
|
||||
|
||||
**Microsoft SSO** (if configured):
|
||||
- Click "Sign in with Microsoft" button
|
||||
- Authenticate via Azure AD
|
||||
- Users auto-created on first login
|
||||
|
||||
### Using Metadata Sources
|
||||
|
||||
#### 1. Excel Lookup
|
||||
1. Click "Upload Excel File"
|
||||
2. Configure mapping modal:
|
||||
- Select sheet name
|
||||
- Map columns: Filename (required), Title, Description, Keywords
|
||||
- Preview first 3 rows
|
||||
3. Confirm mapping
|
||||
4. Upload files to process
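The column auto-detection behind the mapping modal can be sketched as follows. This is a minimal illustration using the standard library's `difflib`, not the tool's actual implementation: the alias table `FIELD_ALIASES`, the `normalize` helper, and the `cutoff` value are all assumptions made for the example.

```python
import difflib

# Hypothetical alias table; the tool's real list is not shown in this commit.
FIELD_ALIASES = {
    "filename": ["filename", "file name", "file", "asset name"],
    "title": ["title", "name", "heading"],
    "description": ["description", "desc", "alt text", "summary"],
    "keywords": ["keywords", "tags", "keyword"],
}


def normalize(name):
    """Lowercase and collapse separators so 'File_Name' compares as 'file name'."""
    return " ".join(str(name).lower().replace("_", " ").replace("-", " ").split())


def auto_detect_columns(columns, cutoff=0.8):
    """Return {field: original column header} for each field with a close match."""
    normalized = {normalize(col): col for col in columns}
    mapping = {}
    for field, aliases in FIELD_ALIASES.items():
        for alias in aliases:
            hits = difflib.get_close_matches(alias, list(normalized), n=1, cutoff=cutoff)
            if hits:
                mapping[field] = normalized[hits[0]]
                break  # first alias that matches wins
    return mapping


# A misspelled header ("Descripton") is still picked up by the fuzzy match.
detected = auto_detect_columns(["File_Name", "Title", "Descripton", "Tags"])
print(detected)
```

Fields that stay undetected would be left for the user to select manually in the modal; Filename is the only required one.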
|
||||
|
||||
#### 2. AI Generation
|
||||
1. Select "AI Generation" from metadata source dropdown
|
||||
2. Upload files
|
||||
3. AI generates metadata (10-30 seconds per file)
|
||||
4. Review and edit generated metadata
|
||||
5. Save changes
|
||||
|
||||
#### 3. Manual Entry
|
||||
1. Select "Manual Entry"
|
||||
2. Upload files
|
||||
3. Fill in metadata fields manually
|
||||
4. Save changes
|
||||
|
||||
#### 4. Import from File
|
||||
1. Click "Import from File"
|
||||
2. Upload CSV/Excel/JSON file
|
||||
3. Configure column mapping (same as Excel)
|
||||
4. Upload files to match metadata
|
||||
|
||||
#### 5. Templates
|
||||
1. Create template with variables
|
||||
2. Select template from dropdown
|
||||
3. Apply to selected files
|
||||
4. Review and save
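Template variable substitution might look like the sketch below. It assumes templates are plain dictionaries of field name to text and that unknown placeholders should be left untouched; `render_template` and `KeepMissing` are hypothetical names, not functions from `template_manager.py`.

```python
from datetime import date


class KeepMissing(dict):
    """Leave unknown placeholders in place instead of raising KeyError."""

    def __missing__(self, key):
        return "{" + key + "}"


def render_template(template, filename, user, today=None, **custom):
    """Fill {filename}, {date}, {user} and any custom variables in every field."""
    values = KeepMissing(
        filename=filename,
        date=(today or date.today()).isoformat(),
        user=user,
        **custom,
    )
    # str.format_map looks keys up in the mapping directly, so __missing__ applies.
    return {field: text.format_map(values) for field, text in template.items()}


template = {
    "title": "{filename} - {campaign}",
    "comments": "Updated by {user} on {date}",
}
result = render_template(
    template, "brochure", "tester", today=date(2026, 1, 15), campaign="Spring"
)
print(result)
```

Leaving unknown placeholders visible makes a typo in a variable name easy to spot during the review step.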
|
||||
|
||||
### Batch Operations
|
||||
|
||||
1. Upload multiple files
|
||||
2. Use checkboxes to select files
|
||||
3. "Select All" / "Deselect All" buttons
|
||||
4. Edit metadata individually
|
||||
5. Click "Update Selected Files" to save all at once
|
||||
6. Export results to CSV
|
||||
|
||||
---
|
||||
|
||||
## Configuration
|
||||
|
||||
### Database Schema
|
||||
|
||||
**Users Table:**
|
||||
- id, username, password_hash, email, full_name
|
||||
- auth_method (local/sso)
|
||||
- created_at, last_login, is_active
|
||||
|
||||
**Sessions Table:**
|
||||
- session_id, user_id, created_at, expires_at
|
||||
- ip_address, user_agent
|
||||
|
||||
**Audit Log Table:**
|
||||
- id, user_id, action, details, timestamp
|
||||
|
||||
### AI Usage Tracking
|
||||
|
||||
Every AI metadata generation is logged with:
|
||||
- User ID
|
||||
- Timestamp
|
||||
- Tokens used (prompt + completion)
|
||||
- Cost estimate (based on gpt-4o-mini pricing)
|
||||
|
||||
View logs in database:
|
||||
```sql
|
||||
SELECT * FROM audit_log WHERE action = 'ai_generation' ORDER BY timestamp DESC;
|
||||
```
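A rough sketch of how one AI generation event could be written to `audit_log` and read back is shown below. The column names follow the Audit Log table listed above; storing the token counts as JSON in `details`, the `log_ai_generation` helper, and the per-million-token prices are assumptions for illustration only. Substitute the current published rates before relying on the cost figure.

```python
import json
import sqlite3

# Placeholder prices in USD per million tokens; NOT actual OpenAI pricing.
PROMPT_PRICE_PER_M = 1.0
COMPLETION_PRICE_PER_M = 2.0


def estimate_cost(prompt_tokens, completion_tokens):
    """Estimate the cost of one request from its token counts."""
    total = prompt_tokens * PROMPT_PRICE_PER_M + completion_tokens * COMPLETION_PRICE_PER_M
    return total / 1_000_000


def log_ai_generation(conn, user_id, prompt_tokens, completion_tokens):
    """Insert one 'ai_generation' row; details is stored as JSON (an assumption)."""
    details = json.dumps({
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
        "cost_usd": round(estimate_cost(prompt_tokens, completion_tokens), 6),
    })
    conn.execute(
        "INSERT INTO audit_log (user_id, action, details) VALUES (?, 'ai_generation', ?)",
        (user_id, details),
    )
    conn.commit()


# In-memory database so the example leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE audit_log (
           id INTEGER PRIMARY KEY AUTOINCREMENT,
           user_id INTEGER,
           action TEXT NOT NULL,
           details TEXT,
           timestamp TEXT DEFAULT CURRENT_TIMESTAMP
       )"""
)
log_ai_generation(conn, user_id=1, prompt_tokens=1200, completion_tokens=300)
rows = conn.execute(
    "SELECT user_id, details FROM audit_log "
    "WHERE action = 'ai_generation' ORDER BY timestamp DESC"
).fetchall()
print(rows)
```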
|
||||
|
||||
## Excel Data Structure
|
||||
### User Management
|
||||
|
||||
The tool reads metadata from Excel file with two sheets:
|
||||
**Create New User:**
|
||||
```python
|
||||
from src.database import Database
|
||||
db = Database()
|
||||
db.create_user(
|
||||
username='newuser',
|
||||
password='password123',
|
||||
email='user@example.com',
|
||||
full_name='New User',
|
||||
auth_method='local'
|
||||
)
|
||||
```
|
||||
|
||||
### Sheet 1: DSB Celum ID to Path mapping (Primary)
|
||||
- Column B: Celum ID
|
||||
- Column E: Title
|
||||
- Column F: External Description/Alt Text
|
||||
**List All Users:**
|
||||
```python
|
||||
users = db.get_all_users()
|
||||
for user in users:
|
||||
print(f"{user['username']} - Last login: {user['last_login']}")
|
||||
```
|
||||
|
||||
### Sheet 2: Medsurg Metadata Cheat (Fallback)
|
||||
- Column: Solventum DAM Asset Path (contains filename)
|
||||
- Metadata columns for Title and Description
|
||||
|
||||
Lookup is performed by filename (without extension), case-insensitive.
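As an illustration of that matching rule, the sketch below reduces both the spreadsheet value and the uploaded filename to a lowercase stem before comparing. The helper names and the `filename` column key are hypothetical; only the last extension is stripped, so `archive.tar.gz` would be keyed as `archive.tar`.

```python
from pathlib import PurePath


def lookup_key(name):
    """Reduce a path or filename to its lowercase stem, e.g. 'Docs/Report.PDF' -> 'report'."""
    return PurePath(name).stem.lower()


def build_index(rows, filename_column="filename"):
    """Map lookup keys to metadata rows; the first occurrence of a key wins."""
    index = {}
    for row in rows:
        raw = str(row.get(filename_column, "")).strip()
        if raw:
            index.setdefault(lookup_key(raw), row)
    return index


def find_metadata(index, uploaded_name):
    """Return the metadata row for an uploaded file, or None if nothing matches."""
    return index.get(lookup_key(uploaded_name))


rows = [
    {"filename": "Brochure_2024.PDF", "title": "Brochure"},
    {"filename": "/content/dam/Logo.png", "title": "Logo"},
]
index = build_index(rows)
print(find_metadata(index, "brochure_2024.pdf"))
```

Because the extension is ignored, `LOGO.JPG` also resolves to the `Logo.png` row, which is convenient for renditions but means two different files sharing a stem cannot be told apart.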
|
||||
---
|
||||
|
||||
## Architecture
|
||||
|
||||
- `web_app.py` - Flask web application
|
||||
- `run_gui.py` - GUI launcher
|
||||
- `src/` - Core modules
|
||||
- `extractors/` - Content extraction for different file types
|
||||
- `updaters/` - Metadata update for different file types
|
||||
- `excel_metadata_lookup.py` - Excel-based metadata lookup
|
||||
- `main.py` - Core processing logic
|
||||
- `config.py` - Configuration management
|
||||
### File Structure
|
||||
|
||||
## License
|
||||
```
|
||||
oliver-metadata-tool/
|
||||
├── web_app.py # Flask web application (main entry point)
|
||||
├── requirements.txt # Python dependencies
|
||||
├── .env # Environment configuration
|
||||
├── oliver_metadata.db # SQLite database (auto-created)
|
||||
├── src/
|
||||
│ ├── config.py # Configuration management
|
||||
│ ├── database.py # Database operations
|
||||
│ ├── auth.py # Authentication logic
|
||||
│ ├── metadata_analyzer.py # AI metadata generation
|
||||
│ ├── metadata_importer.py # Import from files
|
||||
│ ├── template_manager.py # Template system
|
||||
│ ├── field_mapper.py # Column mapping
|
||||
│ ├── excel_metadata_lookup.py # Excel lookup
|
||||
│ ├── extractors/
|
||||
│ │ ├── pdf_extractor.py
|
||||
│ │ ├── image_extractor.py
|
||||
│ │ ├── office_extractor.py
|
||||
│ │ ├── video_extractor.py
|
||||
│ │ └── exiftool_extractor.py
|
||||
│ └── updaters/
|
||||
│ ├── pdf_updater.py
|
||||
│ ├── image_updater.py
|
||||
│ ├── office_updater.py
|
||||
│ ├── video_updater.py
|
||||
│ └── exiftool_updater.py
|
||||
├── templates/
|
||||
│ ├── index.html # Main UI
|
||||
│ └── login.html # Login page
|
||||
└── docs/
|
||||
└── EXIFTOOL_SETUP.md # ExifTool setup guide
|
||||
```
|
||||
|
||||
Proprietary - Solventum
|
||||
### Technology Stack
|
||||
|
||||
- **Backend:** Flask (Python)
|
||||
- **Database:** SQLite
|
||||
- **Frontend:** HTML5, CSS3, JavaScript (Vanilla)
|
||||
- **Design:** Montserrat font, Dark & Gold theme
|
||||
- **Authentication:** Flask-Session, werkzeug.security, MSAL
|
||||
- **AI:** OpenAI API (gpt-4o-mini)
|
||||
- **Metadata:** PyExifTool, pypdf, python-docx, openpyxl
|
||||
|
||||
---
|
||||
|
||||
## API Endpoints
|
||||
|
||||
### Authentication
|
||||
- `GET /login` - Login page
|
||||
- `POST /login` - Authenticate user
|
||||
- `GET /logout` - Destroy session
|
||||
- `GET /login/microsoft` - Microsoft SSO redirect
|
||||
- `GET /auth/callback` - SSO callback
|
||||
|
||||
### File Operations
|
||||
- `POST /upload` - Upload files and generate metadata
|
||||
- `POST /update-manual` - Update file metadata manually
|
||||
- `GET /download/<filename>` - Download processed file
|
||||
|
||||
### Metadata Sources
|
||||
- `POST /upload-excel` - Upload Excel file for mapping
|
||||
- `POST /preview-excel-sheet` - Preview Excel sheet structure
|
||||
- `POST /configure-excel-mapping` - Configure Excel column mapping
|
||||
- `POST /import-metadata` - Upload import file for mapping
|
||||
- `POST /configure-import-mapping` - Configure import column mapping
|
||||
|
||||
### Templates
|
||||
- `GET /templates/list` - List all templates
|
||||
- `POST /templates/save` - Save new template
|
||||
- `POST /templates/load` - Load template by name
|
||||
- `DELETE /templates/delete` - Delete template
|
||||
- `POST /templates/apply` - Apply template to files
|
||||
- `POST /templates/preview` - Preview template output
|
||||
|
||||
---
|
||||
|
||||
## Security & Privacy
|
||||
|
||||
### Authentication
|
||||
- Passwords hashed with werkzeug.security (pbkdf2:sha256)
|
||||
- Session tokens: 32-byte cryptographically secure random strings
|
||||
- Sessions expire after 24 hours
|
||||
- Microsoft SSO via OAuth2 + Azure AD
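The password and session rules above can be sketched with the standard library alone. This is an illustration of the same primitives (PBKDF2-SHA256, a 32-byte random token, a 24-hour lifetime), not the application's code: the README states that the app uses `werkzeug.security`, whose stored hash format differs from the simple `salt$digest` string used here, and `ITERATIONS` is an assumed work factor.

```python
import hashlib
import hmac
import secrets
from datetime import datetime, timedelta, timezone

ITERATIONS = 200_000  # assumed work factor for this sketch


def hash_password(password, salt=None):
    """Return 'salt$digest' (both hex) using PBKDF2-HMAC-SHA256."""
    salt = salt or secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return f"{salt.hex()}${digest.hex()}"


def verify_password(password, stored):
    """Recompute the digest with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), ITERATIONS
    )
    return hmac.compare_digest(candidate.hex(), digest_hex)


def new_session(user_id, lifetime=timedelta(hours=24)):
    """Create a session record with a token built from 32 random bytes."""
    now = datetime.now(timezone.utc)
    return {
        "session_id": secrets.token_urlsafe(32),
        "user_id": user_id,
        "created_at": now,
        "expires_at": now + lifetime,
    }


def is_expired(session, now=None):
    """A session is expired once the current time reaches expires_at."""
    return (now or datetime.now(timezone.utc)) >= session["expires_at"]


stored = hash_password("s3cret")
session = new_session(user_id=1)
print(verify_password("s3cret", stored), is_expired(session))
```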
|
||||
|
||||
### Data Protection
|
||||
- All credentials stored in `.env` (excluded from git)
|
||||
- Database file excluded from git
|
||||
- API keys never logged or exposed to frontend
|
||||
- Audit trail for all user actions
|
||||
|
||||
### Production Recommendations
|
||||
1. **HTTPS:** Use SSL/TLS certificates in production
|
||||
2. **Database:** Migrate to PostgreSQL for better concurrency
|
||||
3. **Rate Limiting:** Add rate limits to prevent abuse
|
||||
4. **CSRF Protection:** Enable Flask-WTF for form security
|
||||
5. **Error Tracking:** Integrate Sentry or similar service
|
||||
6. **Backups:** Regular database backups
|
||||
7. **Monitoring:** Track AI token usage for cost management
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Common Issues
|
||||
|
||||
**ExifTool not found:**
|
||||
```bash
|
||||
# Verify installation
|
||||
exiftool -ver
|
||||
|
||||
# macOS: Reinstall with Homebrew
|
||||
brew reinstall exiftool
|
||||
|
||||
# Linux: Reinstall with apt
|
||||
sudo apt-get install --reinstall libimage-exiftool-perl
|
||||
```
|
||||
|
||||
**Database locked error:**
|
||||
```bash
|
||||
# Stop all instances
|
||||
lsof -ti:5001 | xargs kill -9
|
||||
|
||||
# Restart application
|
||||
python web_app.py
|
||||
```
|
||||
|
||||
**OpenAI API errors:**
|
||||
- Check API key in `.env` file
|
||||
- Verify API key is valid at https://platform.openai.com/api-keys
|
||||
- Check token usage limits on OpenAI dashboard
|
||||
|
||||
**Import failed - column not found:**
|
||||
- Use the mapping modal to manually select columns
|
||||
- Check that your file has headers in the first row
|
||||
- Verify file encoding is UTF-8
|
||||
|
||||
---
|
||||
|
||||
## Development
|
||||
|
||||
### Running Tests
|
||||
|
||||
```bash
|
||||
# Unit tests (if implemented)
|
||||
pytest tests/
|
||||
|
||||
# Manual integration test
|
||||
python -c "from src.database import Database; from src.config import Config; print('✅ All imports successful')"
|
||||
```
|
||||
|
||||
### Git Workflow
|
||||
|
||||
```bash
|
||||
# Check status
|
||||
git status
|
||||
|
||||
# Add changes
|
||||
git add .
|
||||
|
||||
# Commit with message
|
||||
git commit -m "Your commit message"
|
||||
|
||||
# Push to remote
|
||||
git push origin main
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## License & Credits
|
||||
|
||||
**License:** Corporate License - Oliver Marketing
|
||||
All rights reserved. Unauthorized copying, distribution, or modification is prohibited.
|
||||
|
||||
**Developer:** Vadym Samoilenko
|
||||
**Company:** Oliver Marketing
|
||||
**Version:** 3.1 Enterprise Edition
|
||||
**Release Date:** January 2026
|
||||
|
||||
**Third-Party Software:**
|
||||
- ExifTool by Phil Harvey (Perl Artistic License)
|
||||
- Flask by Pallets (BSD License)
|
||||
- OpenAI API (Commercial License)
|
||||
- PyExifTool (LGPL License)
|
||||
|
||||
---
|
||||
|
||||
## Support
|
||||
|
||||
For issues, questions, or feature requests:
|
||||
- **Internal Support:** Contact IT department
|
||||
- **Developer:** Vadym Samoilenko
|
||||
- **Documentation:** See `docs/` folder
|
||||
|
||||
---
|
||||
|
||||
## Changelog
|
||||
|
||||
### v3.1 (January 2026) - Enterprise Edition
|
||||
- ✅ User authentication (local + Microsoft SSO)
|
||||
- ✅ SQLite database with audit logging
|
||||
- ✅ Smart column mapping for Excel/CSV import
|
||||
- ✅ Custom metadata fields support
|
||||
- ✅ AI usage tracking and cost monitoring
|
||||
- ✅ Dark & Gold UI redesign
|
||||
- ✅ Template variables and preview
|
||||
- ✅ Batch selection and CSV export
|
||||
|
||||
### v3.0 (January 2026)
|
||||
- ✅ ExifTool integration (300+ formats)
|
||||
- ✅ Multiple metadata sources (Excel, AI, Manual, Import)
|
||||
- ✅ Field mapping with fuzzy matching
|
||||
- ✅ Metadata templates system
|
||||
- ✅ Rebranded to Oliver Metadata Tool
|
||||
|
||||
### v2.x (Prior)
|
||||
- Basic Excel lookup functionality
|
||||
- Multi-format file support
|
||||
- Web and GUI interfaces
|
||||
|
|
|
|||
1087
templates/index.html
File diff suppressed because it is too large
|
|
@@ -4,11 +4,41 @@
|
|||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>Login - Oliver Metadata Tool</title>
|
||||
<link href="https://fonts.googleapis.com/css2?family=Montserrat:wght@300;400;500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
:root {
|
||||
--primary-gold: #FFC407;
|
||||
--primary-gold-dark: #e6b007;
|
||||
--primary-gold-light: #ffcf33;
|
||||
--dark-primary: #2c2c2c;
|
||||
--dark-secondary: #1a1a1a;
|
||||
--white: #ffffff;
|
||||
--text-primary: #1f2937;
|
||||
--text-muted: #6b7280;
|
||||
--overlay-light: rgba(255, 255, 255, 0.95);
|
||||
--border-light: rgba(255, 255, 255, 0.2);
|
||||
--shadow-lg: 0 20px 40px rgba(0, 0, 0, 0.1);
|
||||
--radius-md: 12px;
|
||||
--radius-xl: 20px;
|
||||
--font-family: 'Montserrat', -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif;
|
||||
--transition-fast: 0.15s ease;
|
||||
}
|
||||
|
||||
* { margin: 0; padding: 0; box-sizing: border-box; }
|
||||
|
||||
@keyframes shimmer {
|
||||
0% { transform: translateX(-100%); }
|
||||
100% { transform: translateX(100%); }
|
||||
}
|
||||
|
||||
@keyframes pulse {
|
||||
0%, 100% { transform: scale(1); }
|
||||
50% { transform: scale(1.05); }
|
||||
}
|
||||
|
||||
body {
|
||||
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif;
|
||||
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
|
||||
font-family: var(--font-family);
|
||||
background: linear-gradient(135deg, var(--dark-primary) 0%, var(--dark-secondary) 100%);
|
||||
min-height: 100vh;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
|
|
@@ -17,9 +47,11 @@
|
|||
}
|
||||
|
||||
.login-container {
|
||||
background: white;
|
||||
border-radius: 20px;
|
||||
box-shadow: 0 20px 60px rgba(0,0,0,0.3);
|
||||
background: var(--overlay-light);
|
||||
backdrop-filter: blur(20px);
|
||||
border-radius: var(--radius-xl);
|
||||
box-shadow: var(--shadow-lg);
|
||||
border: 1px solid var(--border-light);
|
||||
width: 100%;
|
||||
max-width: 450px;
|
||||
padding: 40px;
|
||||
|
|
@@ -28,17 +60,21 @@
|
|||
.logo {
|
||||
text-align: center;
|
||||
margin-bottom: 30px;
|
||||
position: relative;
|
||||
}
|
||||
|
||||
.logo h1 {
|
||||
color: #667eea;
|
||||
font-size: 28px;
|
||||
color: var(--primary-gold-dark);
|
||||
font-size: 32px;
|
||||
margin-bottom: 10px;
|
||||
font-weight: 700;
|
||||
text-shadow: 0 2px 4px rgba(255, 196, 7, 0.2);
|
||||
}
|
||||
|
||||
.logo p {
|
||||
color: #6c757d;
|
||||
color: var(--text-muted);
|
||||
font-size: 14px;
|
||||
font-weight: 500;
|
||||
}
|
||||
|
||||
.divider {
|
||||
|
|
@@ -53,15 +89,16 @@
|
|||
left: 0;
|
||||
right: 0;
|
||||
top: 50%;
|
||||
height: 1px;
|
||||
background: #dee2e6;
|
||||
height: 2px;
|
||||
background: linear-gradient(90deg, transparent, var(--primary-gold-light), transparent);
|
||||
}
|
||||
|
||||
.divider span {
|
||||
background: white;
|
||||
background: var(--overlay-light);
|
||||
padding: 0 15px;
|
||||
color: #6c757d;
|
||||
color: var(--text-muted);
|
||||
font-size: 13px;
|
||||
font-weight: 600;
|
||||
position: relative;
|
||||
z-index: 1;
|
||||
}
|
||||
|
|
@@ -73,7 +110,7 @@
|
|||
.form-group label {
|
||||
display: block;
|
||||
font-weight: 600;
|
||||
color: #495057;
|
||||
color: var(--text-primary);
|
||||
margin-bottom: 8px;
|
||||
font-size: 14px;
|
||||
}
|
||||
|
|
@@ -82,25 +119,28 @@
|
|||
width: 100%;
|
||||
padding: 12px;
|
||||
border: 2px solid #dee2e6;
|
||||
border-radius: 8px;
|
||||
border-radius: var(--radius-md);
|
||||
font-size: 14px;
|
||||
transition: border-color 0.3s;
|
||||
font-family: var(--font-family);
|
||||
transition: all var(--transition-fast);
|
||||
}
|
||||
|
||||
.form-group input:focus {
|
||||
outline: none;
|
||||
border-color: #667eea;
|
||||
border-color: var(--primary-gold);
|
||||
box-shadow: 0 0 0 3px rgba(255, 196, 7, 0.1);
|
||||
}
|
||||
|
||||
.btn {
|
||||
width: 100%;
|
||||
padding: 14px;
|
||||
border: none;
|
||||
border-radius: 8px;
|
||||
border-radius: var(--radius-md);
|
||||
font-size: 16px;
|
||||
font-weight: 600;
|
||||
font-family: var(--font-family);
|
||||
cursor: pointer;
|
||||
transition: transform 0.2s;
|
||||
transition: all var(--transition-fast);
|
||||
}
|
||||
|
||||
.btn:hover {
|
||||
|
|
@@ -108,60 +148,79 @@
|
|||
}
|
||||
|
||||
.btn-primary {
|
||||
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
|
||||
color: white;
|
||||
background: linear-gradient(135deg, var(--primary-gold), var(--primary-gold-dark));
|
||||
color: var(--dark-secondary);
|
||||
margin-bottom: 15px;
|
||||
box-shadow: 0 4px 12px rgba(255, 196, 7, 0.3);
|
||||
}
|
||||
|
||||
.btn-primary:hover {
|
||||
box-shadow: 0 6px 16px rgba(255, 196, 7, 0.4);
|
||||
}
|
||||
|
||||
.btn-sso {
|
||||
background: white;
|
||||
color: #495057;
|
||||
border: 2px solid #dee2e6;
|
||||
background: var(--white);
|
||||
color: var(--text-primary);
|
||||
border: 2px solid var(--primary-gold);
|
||||
}
|
||||
|
||||
.btn-sso:hover {
|
||||
border-color: #667eea;
|
||||
color: #667eea;
|
||||
border-color: var(--primary-gold-dark);
|
||||
background: #fffbf0;
|
||||
color: var(--primary-gold-dark);
|
||||
}
|
||||
|
||||
.alert {
|
||||
padding: 12px;
|
||||
border-radius: 8px;
|
||||
border-radius: var(--radius-md);
|
||||
margin-bottom: 20px;
|
||||
font-size: 14px;
|
||||
font-weight: 500;
|
||||
}
|
||||
|
||||
.alert-error {
|
||||
background: #f8d7da;
|
||||
color: #721c24;
|
||||
border: 1px solid #f5c6cb;
|
||||
background: #fee;
|
||||
color: #c33;
|
||||
border: 2px solid #fcc;
|
||||
}
|
||||
|
||||
.alert-info {
|
||||
background: #d1ecf1;
|
||||
color: #0c5460;
|
||||
border: 1px solid #bee5eb;
|
||||
background: #fffbf0;
|
||||
color: var(--primary-gold-dark);
|
||||
border: 2px solid var(--primary-gold-light);
|
||||
}
|
||||
|
||||
.test-user-info {
|
||||
background: #f8f9ff;
|
||||
border: 2px dashed #667eea;
|
||||
border-radius: 8px;
|
||||
background: #fffbf0;
|
||||
border: 2px dashed var(--primary-gold);
|
||||
border-radius: var(--radius-md);
|
||||
padding: 15px;
|
||||
margin-bottom: 20px;
|
||||
font-size: 13px;
|
||||
color: #495057;
|
||||
color: var(--text-primary);
|
||||
animation: pulse 3s infinite;
|
||||
}
|
||||
|
||||
.test-user-info strong {
|
||||
color: #667eea;
|
||||
color: var(--primary-gold-dark);
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.test-user-info code {
|
||||
background: rgba(255, 196, 7, 0.15);
|
||||
padding: 2px 6px;
|
||||
border-radius: 4px;
|
||||
font-family: 'Courier New', monospace;
|
||||
color: var(--primary-gold-dark);
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.footer-text {
|
||||
text-align: center;
|
||||
margin-top: 20px;
|
||||
font-size: 12px;
|
||||
color: #6c757d;
|
||||
color: var(--text-muted);
|
||||
font-weight: 500;
|
||||
}
|
||||
|
||||
.microsoft-icon {
|
||||
|
|
|
|||
326
web_app.py
|
|
@@ -6,7 +6,7 @@ Flask-based web app for local or server deployment.
|
|||
Supports multiple metadata sources: Excel, AI, manual entry, and file import.
|
||||
"""
|
||||
|
||||
from flask import Flask, render_template, request, jsonify, send_file
|
||||
from flask import Flask, render_template, request, jsonify, send_file, session, redirect, url_for
|
||||
from werkzeug.utils import secure_filename # noqa: F401 - kept as fallback
|
||||
from pathlib import Path
|
||||
import os
|
||||
|
|
@@ -259,7 +259,19 @@ def upload_file():
|
|||
}
|
||||
|
||||
# Get metadata lookup (only if using Excel source)
|
||||
lookup = get_metadata_lookup() if metadata_source == 'excel' else None
|
||||
excel_session_id = request.form.get('excel_session_id')
|
||||
lookup = None
|
||||
|
||||
if metadata_source == 'excel':
|
||||
if excel_session_id and excel_session_id in imported_metadata:
|
||||
# Use uploaded Excel file
|
||||
lookup = imported_metadata[excel_session_id]
|
||||
else:
|
||||
# Try default Excel file if available
|
||||
try:
|
||||
lookup = get_metadata_lookup()
|
||||
except Exception:
|
||||
return jsonify({'error': 'Please upload an Excel file first using the Upload Excel File button'}), 400
|
||||
|
||||
# Get imported metadata (only if using import source)
|
||||
import_map = None
|
||||
|
|
@@ -504,9 +516,22 @@ def update_manual_metadata():
|
|||
custom_metadata = {
|
||||
'title': data.get('title', '').strip()[:200],
|
||||
'subject': data.get('subject', '').strip()[:300],
|
||||
'keywords': data.get('keywords', '').strip()[:500]
|
||||
'keywords': data.get('keywords', '').strip()[:500],
|
||||
'author': data.get('author', '').strip()[:100],
|
||||
'copyright': data.get('copyright', '').strip()[:150],
|
||||
'comments': data.get('comments', '').strip()[:500]
|
||||
}
|
||||
|
||||
# Add custom fields if provided
|
||||
custom_fields = data.get('custom_fields', {})
|
||||
if custom_fields and isinstance(custom_fields, dict):
|
||||
for field_name, field_value in custom_fields.items():
|
||||
# Sanitize custom field names and values
|
||||
safe_name = str(field_name).strip()[:50]
|
||||
safe_value = str(field_value).strip()[:200]
|
||||
if safe_name and safe_value:
|
||||
custom_metadata[safe_name] = safe_value
|
||||
|
||||
# Validate session
|
||||
if not session_id or session_id not in sessions:
|
||||
return jsonify({'error': 'Invalid or expired session'}), 400
|
||||
|
|
@@ -566,10 +591,178 @@ def download_file(filename):
|
|||
return send_file(filepath, as_attachment=True)
|
||||
return jsonify({'error': 'File not found'}), 404
|
||||
|
||||
@app.route('/upload-excel', methods=['POST'])
|
||||
@login_required
|
||||
def upload_excel():
|
||||
"""Upload Excel file for Excel Lookup metadata source."""
|
||||
if 'excel_file' not in request.files:
|
||||
return jsonify({'error': 'No file provided'}), 400
|
||||
|
||||
file = request.files['excel_file']
|
||||
if file.filename == '':
|
||||
return jsonify({'error': 'No file selected'}), 400
|
||||
|
||||
try:
|
||||
import pandas as pd
|
||||
|
||||
# Save temp file
|
||||
excel_filename = safe_filename(file.filename)
|
||||
temp_path = Path(app.config['UPLOAD_FOLDER']) / excel_filename
|
||||
file.save(str(temp_path))
|
||||
|
||||
# Preview Excel structure instead of loading directly
|
||||
excel_file = pd.ExcelFile(str(temp_path))
|
||||
sheet_names = excel_file.sheet_names
|
||||
|
||||
# Get columns and sample data from each sheet (first 5 sheets only)
|
||||
preview_data = {}
|
||||
for sheet_name in sheet_names[:5]: # Limit to first 5 sheets
|
||||
df = pd.read_excel(excel_file, sheet_name=sheet_name, nrows=5)
|
||||
preview_data[sheet_name] = {
|
||||
'columns': df.columns.tolist(),
|
||||
'sample_data': df.head(3).fillna('').to_dict('records')
|
||||
}
|
||||
|
||||
# Store file path temporarily for later configuration
|
||||
excel_session_id = f"excel_{secrets.token_urlsafe(8)}"
|
||||
if 'excel_files' not in imported_metadata:
|
||||
imported_metadata['excel_files'] = {}
|
||||
imported_metadata['excel_files'][excel_session_id] = {
|
||||
'path': str(temp_path),
|
||||
'filename': excel_filename,
|
||||
'sheet_names': sheet_names
|
||||
}
|
||||
|
||||
return jsonify({
|
||||
'success': True,
|
||||
'excel_session_id': excel_session_id,
|
||||
'filename': excel_filename,
|
||||
'sheets': sheet_names,
|
||||
'preview': preview_data,
|
||||
'message': 'Excel file uploaded. Please configure column mapping.'
|
||||
})
|
||||
|
||||
except Exception as e:
|
||||
import logging
|
||||
logging.getLogger(__name__).error(f"Excel upload failed: {e}")
|
||||
return jsonify({'error': f'Excel upload failed: {str(e)}'}), 500
|
||||
|
||||
@app.route('/preview-excel-sheet', methods=['POST'])
|
||||
@login_required
|
||||
def preview_excel_sheet():
|
||||
"""Preview a specific sheet from uploaded Excel file."""
|
||||
try:
|
||||
import pandas as pd
|
||||
|
||||
data = request.get_json(silent=True) or {}  # tolerate a missing or non-JSON body
|
||||
excel_session_id = data.get('excel_session_id')
|
||||
sheet_name = data.get('sheet_name')
|
||||
|
||||
if not excel_session_id or excel_session_id not in imported_metadata.get('excel_files', {}):
|
||||
return jsonify({'error': 'Invalid session ID'}), 400
|
||||
|
||||
excel_info = imported_metadata['excel_files'][excel_session_id]
|
||||
excel_path = excel_info['path']
|
||||
|
||||
# Read the specific sheet
|
||||
df = pd.read_excel(excel_path, sheet_name=sheet_name, nrows=10)
|
||||
|
||||
return jsonify({
|
||||
'success': True,
|
||||
'columns': df.columns.tolist(),
|
||||
'sample_data': df.head(5).fillna('').to_dict('records')
|
||||
})
|
||||
|
||||
except Exception as e:
|
||||
import logging
|
||||
logging.getLogger(__name__).error(f"Sheet preview failed: {e}")
|
||||
return jsonify({'error': f'Sheet preview failed: {str(e)}'}), 500
|
||||
|
||||
@app.route('/configure-excel-mapping', methods=['POST'])
|
||||
@login_required
|
||||
def configure_excel_mapping():
|
||||
"""Configure Excel column mapping and load metadata."""
|
||||
try:
|
||||
import pandas as pd
|
||||
|
||||
data = request.get_json(silent=True) or {}  # tolerate a missing or non-JSON body
|
||||
excel_session_id = data.get('excel_session_id')
|
||||
sheet_name = data.get('sheet_name')
|
||||
column_mapping = data.get('column_mapping', {}) # {filename: 'col', title: 'col', ...}
|
||||
|
||||
if not excel_session_id or excel_session_id not in imported_metadata.get('excel_files', {}):
|
||||
return jsonify({'error': 'Invalid session ID'}), 400
|
||||
|
||||
excel_info = imported_metadata['excel_files'][excel_session_id]
|
||||
excel_path = excel_info['path']
|
||||
|
||||
# Read the configured sheet
|
||||
df = pd.read_excel(excel_path, sheet_name=sheet_name)
|
||||
|
||||
# Build metadata map using configured columns
|
||||
metadata_map = {}
|
||||
filename_col = column_mapping.get('filename')
|
||||
title_col = column_mapping.get('title')
|
||||
description_col = column_mapping.get('description')
|
||||
keywords_col = column_mapping.get('keywords')
|
||||
|
||||
if not filename_col:
|
||||
return jsonify({'error': 'Filename column is required'}), 400
|
||||
|
||||
for _, row in df.iterrows():
|
||||
filename = row.get(filename_col)
|
||||
if pd.notna(filename) and str(filename).strip():
|
||||
# Get filename without extension for indexing (case-insensitive)
|
||||
filename_stem = Path(str(filename).strip()).stem.lower()
|
||||
|
||||
metadata = {
|
||||
'title': str(row.get(title_col, '')).strip() if title_col and pd.notna(row.get(title_col)) else '',
|
||||
'description': str(row.get(description_col, '')).strip() if description_col and pd.notna(row.get(description_col)) else '',
|
||||
'keywords': str(row.get(keywords_col, '')).strip() if keywords_col and pd.notna(row.get(keywords_col)) else '',
|
||||
'original_filename': str(filename).strip()
|
||||
}
|
||||
|
||||
metadata_map[filename_stem] = metadata
|
||||
|
||||
# Create a simple lookup object
|
||||
class ConfiguredExcelLookup:
|
||||
def __init__(self, metadata_map):
|
||||
self.metadata_map = metadata_map
|
||||
self.filename_to_metadata = metadata_map
|
||||
|
||||
def lookup_by_filename(self, filename: str):
|
||||
filename_stem = Path(filename).stem.lower()
|
||||
return self.metadata_map.get(filename_stem)
|
||||
|
||||
lookup = ConfiguredExcelLookup(metadata_map)
|
||||
|
||||
# Store configured lookup
|
||||
imported_metadata[excel_session_id] = lookup
|
||||
|
||||
# Get stats
|
||||
stats = {
|
||||
'total_records': len(metadata_map),
|
||||
'with_title': sum(1 for v in metadata_map.values() if v.get('title')),
|
||||
'with_description': sum(1 for v in metadata_map.values() if v.get('description')),
|
||||
'with_keywords': sum(1 for v in metadata_map.values() if v.get('keywords'))
|
||||
}
|
||||
|
||||
return jsonify({
|
||||
'success': True,
|
||||
'excel_session_id': excel_session_id,
|
||||
'stats': stats,
|
||||
'message': f'Configured mapping for {stats["total_records"]} records from sheet "{sheet_name}"'
|
||||
})
|
||||
|
||||
except Exception as e:
|
||||
import logging
|
||||
logging.getLogger(__name__).error(f"Excel configuration failed: {e}")
|
||||
return jsonify({'error': f'Excel configuration failed: {str(e)}'}), 500
|
||||
|
||||
@app.route('/import-metadata', methods=['POST'])
|
||||
@login_required
|
||||
def import_metadata():
|
||||
"""Import metadata from external file (CSV, Excel, JSON)."""
|
||||
"""Upload import file and preview structure for mapping."""
|
||||
if 'import_file' not in request.files:
|
||||
return jsonify({'error': 'No file provided'}), 400
|
||||
|
||||
|
|
@@ -578,45 +771,142 @@ def import_metadata():
|
|||
return jsonify({'error': 'No file selected'}), 400
|
||||
|
||||
try:
|
||||
import pandas as pd
|
||||
|
||||
# Save temp file
|
||||
import_filename = safe_filename(file.filename)
|
||||
temp_path = Path(app.config['UPLOAD_FOLDER']) / import_filename
|
||||
file.save(str(temp_path))
|
||||
|
||||
# Import based on file type
|
||||
importer = MetadataImporter()
|
||||
file_ext = temp_path.suffix.lower()
|
||||
|
||||
# Read file and get structure
|
||||
if file_ext == '.csv':
|
||||
metadata_map = importer.import_from_csv(str(temp_path))
|
||||
df = pd.read_csv(str(temp_path), nrows=5, encoding='utf-8')
|
||||
elif file_ext in ['.xlsx', '.xls']:
|
||||
metadata_map = importer.import_from_excel(str(temp_path))
|
||||
df = pd.read_excel(str(temp_path), nrows=5)
|
||||
elif file_ext == '.json':
|
||||
metadata_map = importer.import_from_json(str(temp_path))
|
||||
import json
|
||||
with open(str(temp_path), 'r', encoding='utf-8') as f:
|
||||
data = json.load(f)
|
||||
# Convert to DataFrame
|
||||
if isinstance(data, list):
|
||||
df = pd.DataFrame(data[:5])
|
||||
elif isinstance(data, dict):
|
||||
df = pd.DataFrame([data])
|
||||
else:
|
||||
return jsonify({'error': 'Invalid JSON format'}), 400
|
||||
else:
|
||||
return jsonify({'error': f'Unsupported file format: {file_ext}. Supported: .csv, .xlsx, .xls, .json'}), 400
|
||||
return jsonify({'error': f'Unsupported file format: {file_ext}'}), 400
|
||||
|
||||
# Validate import
|
||||
stats = importer.validate_import(metadata_map)
|
||||
columns = df.columns.tolist()
|
||||
sample_data = df.fillna('').to_dict('records')
|
||||
|
||||
# Store in global dict with unique session ID
|
||||
import_session_id = f"import_{len(imported_metadata) + 1}"
|
||||
# Store file path for later configuration
|
||||
import_session_id = f"import_{secrets.token_urlsafe(8)}"
|
||||
if 'import_files' not in imported_metadata:
|
||||
imported_metadata['import_files'] = {}
|
||||
imported_metadata['import_files'][import_session_id] = {
|
||||
'path': str(temp_path),
|
||||
'filename': import_filename,
|
||||
'file_type': file_ext
|
||||
}
|
||||
|
||||
return jsonify({
|
||||
'success': True,
|
||||
'import_session_id': import_session_id,
|
||||
'filename': import_filename,
|
||||
'columns': columns,
|
||||
'sample_data': sample_data,
|
||||
'message': 'Import file uploaded. Please configure column mapping.'
|
||||
})
|
||||
|
||||
except Exception as e:
|
||||
import logging
|
||||
logging.getLogger(__name__).error(f"Import upload failed: {e}")
|
||||
return jsonify({'error': f'Import upload failed: {str(e)}'}), 500
|
||||
|
||||
@app.route('/configure-import-mapping', methods=['POST'])
|
||||
@login_required
|
||||
def configure_import_mapping():
|
||||
"""Configure import column mapping and load metadata."""
|
||||
try:
|
||||
import pandas as pd
|
||||
import json
|
||||
|
||||
data = request.get_json(silent=True) or {}  # tolerate a missing or non-JSON body
|
||||
import_session_id = data.get('import_session_id')
|
||||
column_mapping = data.get('column_mapping', {})
|
||||
|
||||
if not import_session_id or import_session_id not in imported_metadata.get('import_files', {}):
|
||||
return jsonify({'error': 'Invalid session ID'}), 400
|
||||
|
||||
import_info = imported_metadata['import_files'][import_session_id]
|
||||
import_path = import_info['path']
|
||||
file_ext = import_info['file_type']
|
||||
|
||||
# Read the full file
|
||||
if file_ext == '.csv':
|
||||
df = pd.read_csv(import_path, encoding='utf-8')
|
||||
elif file_ext in ['.xlsx', '.xls']:
|
||||
df = pd.read_excel(import_path)
|
||||
elif file_ext == '.json':
|
||||
with open(import_path, 'r', encoding='utf-8') as f:
|
||||
json_data = json.load(f)
|
||||
if isinstance(json_data, list):
|
||||
df = pd.DataFrame(json_data)
|
||||
else:
|
||||
df = pd.DataFrame([json_data])
|
||||
|
||||
# Build metadata map using configured columns
|
||||
metadata_map = {}
|
||||
filename_col = column_mapping.get('filename')
|
||||
title_col = column_mapping.get('title')
|
||||
subject_col = column_mapping.get('subject')
|
||||
keywords_col = column_mapping.get('keywords')
|
||||
|
||||
if not filename_col:
|
||||
return jsonify({'error': 'Filename column is required'}), 400
|
||||
|
||||
for _, row in df.iterrows():
|
||||
filename = row.get(filename_col)
|
||||
if pd.notna(filename) and str(filename).strip():
|
||||
filename_stem = Path(str(filename).strip()).stem.lower()
|
||||
|
||||
metadata = {
|
||||
'title': str(row.get(title_col, '')).strip() if title_col and pd.notna(row.get(title_col)) else '',
|
||||
'subject': str(row.get(subject_col, '')).strip() if subject_col and pd.notna(row.get(subject_col)) else '',
|
||||
'keywords': str(row.get(keywords_col, '')).strip() if keywords_col and pd.notna(row.get(keywords_col)) else '',
|
||||
'original_filename': str(filename).strip()
|
||||
}
|
||||
|
||||
metadata_map[filename_stem] = metadata
|
||||
|
||||
# Store configured metadata map
|
||||
imported_metadata[import_session_id] = metadata_map
|
||||
|
||||
# Clean up temp file
|
||||
temp_path.unlink()
|
||||
Path(import_path).unlink(missing_ok=True)
|
||||
|
||||
# Get stats
|
||||
stats = {
|
||||
'total_records': len(metadata_map),
|
||||
'with_title': sum(1 for v in metadata_map.values() if v.get('title')),
|
||||
'with_subject': sum(1 for v in metadata_map.values() if v.get('subject')),
|
||||
'with_keywords': sum(1 for v in metadata_map.values() if v.get('keywords'))
|
||||
}
|
||||
|
||||
return jsonify({
|
||||
'success': True,
|
||||
'import_session_id': import_session_id,
|
||||
'stats': stats,
|
||||
'message': f'Imported {stats["total_records"]} metadata records from {import_filename}'
|
||||
'message': f'Configured mapping for {stats["total_records"]} records'
|
||||
})
|
||||
|
||||
except Exception as e:
|
||||
import logging
|
||||
logging.getLogger(__name__).error(f"Import failed: {e}")
|
||||
return jsonify({'error': f'Import failed: {str(e)}'}), 500
|
||||
logging.getLogger(__name__).error(f"Import configuration failed: {e}")
|
||||
return jsonify({'error': f'Import configuration failed: {str(e)}'}), 500
|
||||
|
||||
@app.route('/preview-import', methods=['POST'])
|
||||
@login_required
|
||||
|
|
|
|||
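Both `configure_excel_mapping` and `configure_import_mapping` in this diff index rows by the lowercased filename stem (`Path(...).stem.lower()`), which is what makes matching case-insensitive and extension-independent. The following is a minimal, standard-library-only sketch of that idea, not the code from the commit: it uses plain dicts in place of a pandas DataFrame, and the helper names `normalize_key`, `build_metadata_map` and `lookup_by_filename`, as well as the sample rows and column names, are hypothetical.

```python
from pathlib import Path
from typing import Dict, Iterable, Mapping, Optional


def normalize_key(filename: str) -> str:
    """Lowercased stem, mirroring Path(...).stem.lower() from the diff."""
    return Path(filename.strip()).stem.lower()


def build_metadata_map(
    rows: Iterable[Mapping[str, object]],
    column_mapping: Mapping[str, Optional[str]],
) -> Dict[str, Dict[str, str]]:
    """Index rows by normalized filename using a user-chosen column mapping.

    `column_mapping` maps logical fields to source columns, for example
    {'filename': 'File', 'title': 'Title'} (hypothetical column names).
    """
    filename_col = column_mapping.get("filename")
    if not filename_col:
        raise ValueError("Filename column is required")

    fields = [name for name in column_mapping if name != "filename"]
    metadata_map: Dict[str, Dict[str, str]] = {}
    for row in rows:
        raw = row.get(filename_col)
        if raw is None or not str(raw).strip():
            continue  # skip rows without a usable filename
        name = str(raw).strip()
        entry: Dict[str, str] = {}
        for field in fields:
            col = column_mapping.get(field)
            value = row.get(col) if col else None
            entry[field] = "" if value is None else str(value).strip()
        entry["original_filename"] = name
        # Rows whose stems differ only by case or extension share a key,
        # so a later row overwrites an earlier one.
        metadata_map[normalize_key(name)] = entry
    return metadata_map


def lookup_by_filename(
    metadata_map: Mapping[str, Dict[str, str]], filename: str
) -> Optional[Dict[str, str]]:
    """Return the entry for `filename`, ignoring case and extension."""
    return metadata_map.get(normalize_key(filename))


# Hypothetical sample data standing in for spreadsheet rows.
rows = [
    {"File": "Report_2024.PDF", "Title": "Annual report", "Tags": "finance"},
    {"File": "  photo.jpg ", "Title": None, "Tags": "travel"},
    {"File": "", "Title": "skipped", "Tags": ""},
]
mapping = {"filename": "File", "title": "Title", "keywords": "Tags"}
metadata_map = build_metadata_map(rows, mapping)
```

With this sketch, `lookup_by_filename(metadata_map, "REPORT_2024.docx")` finds the first row even though both case and extension differ. The trade-off is worth keeping in mind when reviewing the diff: two spreadsheet rows such as `report.pdf` and `Report.docx` collapse onto the same key, and the later row silently replaces the earlier one.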