volt-newsroom-scraper-report/web
DJP 8aebf36a59 Update web/.htaccess with cleaner rules
- Added session timeout settings
- Explicitly blocks test/debug/simple-index.php files
- Clearer comments on what's protected
- Matches simpler approach from root .htaccess
- Maintains security without breaking functionality
2026-01-07 14:40:18 -05:00
.env.example Update branding and fix URL display 2026-01-06 14:00:27 -05:00
.htaccess Update web/.htaccess with cleaner rules 2026-01-07 14:40:18 -05:00
.htaccess-old Update branding and fix URL display 2026-01-06 14:00:27 -05:00
APACHE_DEPLOY.md Fix column mapping and add MAMP/Apache deployment guides 2026-01-06 13:38:01 -05:00
auth.php Add web interface with SSO authentication 2026-01-06 13:25:17 -05:00
AuthMiddleware.php Add web interface with SSO authentication 2026-01-06 13:25:17 -05:00
config.php Add web interface with SSO authentication 2026-01-06 13:25:17 -05:00
download.php Add web interface with SSO authentication 2026-01-06 13:25:17 -05:00
env_loader.php Add web interface with SSO authentication 2026-01-06 13:25:17 -05:00
generate-simple.php Fix JSON parsing error in web interface 2026-01-07 14:00:25 -05:00
generate.php Fix apache_setenv error for non-mod_php environments 2026-01-07 13:24:38 -05:00
index-streaming.php Make index-simple.php the default index.php 2026-01-07 13:41:46 -05:00
index.php Make index-simple.php the default index.php 2026-01-07 13:41:46 -05:00
JWTValidator.php Add web interface with SSO authentication 2026-01-06 13:25:17 -05:00
MAMP_QUICK_START.txt Add MAMP quick start guide 2026-01-06 13:38:26 -05:00
MAMP_SETUP.md Fix column mapping and add MAMP/Apache deployment guides 2026-01-06 13:38:01 -05:00
QUICK_START.md Add quick start guide for web interface 2026-01-06 13:25:37 -05:00
README.md Add web interface with SSO authentication 2026-01-06 13:25:17 -05:00
simple-index.php Add MAMP-friendly simplified interface 2026-01-06 14:07:39 -05:00
test-index.php Add MAMP-friendly simplified interface 2026-01-06 14:07:39 -05:00
test.php Add MAMP-friendly simplified interface 2026-01-06 14:07:39 -05:00

Newsroom Reporter - Web Interface

A PHP web interface for generating newsroom reports with SSO authentication.

Features

  • SSO Authentication: Uses the same Microsoft Azure AD authentication as NANO-RESEARCH
  • Date Selection: Choose any date to generate reports
  • Real-time Output: Streams Python script output to the browser
  • Secure Download: Authenticated PDF downloads
  • Beautiful UI: Modern, responsive design with Montserrat font

Setup Instructions

1. Copy SSO Configuration from NANO-RESEARCH

The web interface uses the same SSO settings as your NANO-RESEARCH app.

Create a .env file in the web/ directory:

cp .env.example .env

Edit .env and add your SSO credentials:

SSO_ENABLED=true
SSO_TENANT_ID=your-tenant-id-from-nano-research
SSO_CLIENT_ID=your-client-id-from-nano-research

You can find these values in /Users/daveporter/Desktop/CODING-2024/NANO-RESEARCH/.env
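
How env_loader.php reads this file is not shown here. As a rough illustration of the idea, the Python sketch below parses KEY=VALUE lines and reports which SSO settings are still missing when SSO is enabled. The function names and the parsing rules (comment handling, quote stripping) are assumptions made for this example, not a description of the actual loader.

```python
# Illustrative sketch only: the names and parsing rules here are assumptions,
# not the behavior of env_loader.php.

REQUIRED_WHEN_SSO = ("SSO_TENANT_ID", "SSO_CLIENT_ID")


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def missing_sso_settings(env: dict) -> list:
    """Return required SSO keys that are absent or empty while SSO is on."""
    if env.get("SSO_ENABLED", "").lower() != "true":
        return []
    return [key for key in REQUIRED_WHEN_SSO if not env.get(key)]


sample = """
SSO_ENABLED=true
SSO_TENANT_ID=your-tenant-id-from-nano-research
SSO_CLIENT_ID=
"""
print(missing_sso_settings(parse_env(sample)))
```

A check like this can be run before starting the server to catch an incomplete .env early.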

2. Configure Web Server

Option A: Local Development with PHP Built-in Server

cd web/
php -S localhost:8000

Then visit: http://localhost:8000

Option B: Apache/Nginx

  1. Point your web server document root to the web/ directory
  2. Ensure PHP is enabled
  3. Ensure the reports/ directory is writable by the web server
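
To confirm step 3, a small check along these lines may help. It is a sketch, and the helper name is hypothetical. The result reflects the user who runs it, so it only answers the question if it is run as the web server's user.

```python
import os
import tempfile


def is_writable_dir(path: str) -> bool:
    """Return True if path is a directory the current user can create files in."""
    # Creating a file needs both write and search (execute) permission.
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


# Demonstration on a throwaway directory; point it at reports/ in practice.
with tempfile.TemporaryDirectory() as scratch:
    print(is_writable_dir(scratch))
    print(is_writable_dir(os.path.join(scratch, "missing")))
```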

3. Azure AD App Registration

If not already done for NANO-RESEARCH, register the application in Azure AD:

  1. Go to Azure Portal → Azure Active Directory → App registrations
  2. Use the same app registration as NANO-RESEARCH, or create a new one
  3. Add redirect URI: https://your-domain/index.php (Azure AD accepts plain http only for localhost, such as http://localhost:8000/index.php)
  4. Copy Tenant ID and Client ID to .env
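
Redirect URI mistakes are a common reason the sign-in step fails. Microsoft's identity platform generally requires https for redirect URIs, with an exception for localhost. The sketch below checks a candidate URI against that rule before you register it. The function name and messages are made up for this example.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hosts for which plain http is assumed to be acceptable.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1"}


def redirect_uri_problem(uri: str) -> Optional[str]:
    """Return a short description of the problem, or None if the URI looks fine."""
    parts = urlsplit(uri)
    if not parts.hostname:
        return "no host in URI"
    if parts.scheme == "https":
        return None
    if parts.scheme == "http" and parts.hostname in LOOPBACK_HOSTS:
        return None
    return "use https (plain http is accepted only for localhost)"


for candidate in (
    "http://localhost:8000/index.php",
    "http://your-domain/index.php",
    "https://your-domain/index.php",
):
    print(candidate, "->", redirect_uri_problem(candidate))
```

The URI the app sends must also match the registered one exactly, including the path, which this check does not cover.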

File Structure

web/
├── index.php              # Main UI (date selection, output streaming)
├── generate.php           # Backend script (runs Python, streams output)
├── download.php           # Secure PDF download handler
├── auth.php               # Authentication API endpoint
├── config.php             # Configuration
├── AuthMiddleware.php     # SSO authentication logic
├── JWTValidator.php       # JWT token validation
├── env_loader.php         # Environment variable loader
├── .env.example           # Environment template
└── .env                   # Your SSO credentials (gitignored)

Usage

  1. Access the app: Open it in a browser (http://localhost:8000 or your domain)
  2. Sign in: Click "Sign In with Microsoft"
  3. Enter a date: Type a date such as "Tuesday, January 7"
  4. Generate: Click the "Generate Report" button
  5. Watch output: Follow the Python script's output as it streams in real time
  6. Download: Click "Download PDF Report" when the run completes
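
The date format in step 3 carries a weekday but no year. How the report script interprets it is not documented here. The sketch below shows one way such a string could be parsed and sanity-checked. The explicit year argument and the weekday check are assumptions for illustration.

```python
from datetime import date, datetime


def parse_report_date(text: str, year: int) -> date:
    """Parse a string like 'Tuesday, January 7' for the given year.

    Raises ValueError if the text is malformed or the weekday does not
    match the calendar date in that year.
    """
    # Append the year so strptime gets a complete date.
    parsed = datetime.strptime(f"{text.strip()} {year}", "%A, %B %d %Y").date()
    typed_weekday = text.split(",", 1)[0].strip().lower()
    actual_weekday = parsed.strftime("%A").lower()
    if typed_weekday != actual_weekday:
        raise ValueError(
            f"{parsed:%B} {parsed.day}, {year} is a {parsed:%A}, "
            f"not a {typed_weekday.title()}"
        )
    return parsed


print(parse_report_date("Tuesday, January 7", 2025))
```

Checking the weekday catches a frequent typo, namely a correct day name paired with the wrong day of the month.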

Security

  • SSO Authentication: Only authenticated Microsoft users can access the app
  • Path Traversal Prevention: Filename validation blocks directory traversal attacks
  • Secure Cookies: HttpOnly, SameSite cookies for session management
  • JWT Validation: Tokens validated against Azure AD
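
The filename validation in download.php is not reproduced here. As a general illustration of the technique, the Python sketch below combines an allowlist pattern with a check that the resolved path stays directly inside the reports directory. The pattern, the function name, and the directory argument are assumptions for this example.

```python
import re
from pathlib import Path

# Allowlist: a plain name ending in .pdf, with no slashes and no leading dot.
SAFE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*\.pdf")


def resolve_report(reports_dir: str, filename: str) -> Path:
    """Return the full path of a report, or raise ValueError if unsafe."""
    if not SAFE_NAME.fullmatch(filename):
        raise ValueError("invalid report filename")
    base = Path(reports_dir).resolve()
    candidate = (base / filename).resolve()
    # After resolving symlinks, the file must sit directly in reports_dir.
    if candidate.parent != base:
        raise ValueError("path escapes the reports directory")
    return candidate


for name in ("report-2026-01-07.pdf", "../.env", "sub/report.pdf"):
    try:
        print(name, "->", resolve_report("reports", name).name)
    except ValueError as error:
        print(name, "-> rejected:", error)
```

Using both checks is deliberate: the pattern rejects obviously hostile input early, and the resolved-path comparison covers cases the pattern cannot see, such as a symlink inside the directory.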

Troubleshooting

SSO Not Working

  • Check that the .env file has the correct SSO_TENANT_ID and SSO_CLIENT_ID
  • Verify that the redirect URI is registered in the Azure AD app

Python Script Errors

  • Check that the Python virtual environment is activated: source ../venv/bin/activate
  • Verify that dependencies are installed: pip install -r ../requirements.txt
  • Check file permissions on the reports/ directory

Output Not Streaming

  • Disable output buffering in php.ini:
    output_buffering = Off
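
Buffering can also happen on the Python side: when its standard output is a pipe rather than a terminal, Python buffers output in blocks unless it is started with -u or with PYTHONUNBUFFERED set. Whether generate.php already does this is not shown here. The sketch below illustrates the general pattern of launching an unbuffered child process and forwarding each line as soon as it arrives. The function name and the inline child command are placeholders.

```python
import subprocess
import sys


def stream_lines(command):
    """Yield the child's output line by line as each line arrives."""
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge errors into the same stream
        text=True,
        bufsize=1,  # line-buffered on the reading side
    )
    with process.stdout:
        for line in process.stdout:
            yield line.rstrip("\n")
    return_code = process.wait()
    if return_code != 0:
        raise RuntimeError(f"command exited with status {return_code}")


# Placeholder child: -u turns off Python's own output buffering.
child = [sys.executable, "-u", "-c", "print('fetching'); print('done')"]
for line in stream_lines(child):
    print(line, flush=True)
```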
    

Development Mode

To disable SSO for local testing:

SSO_ENABLED=false

You'll be logged in as "Local Developer" without authentication.

Production Deployment

  1. Set SSO_ENABLED=true in .env
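  2. Use HTTPS (cookies flagged Secure are only sent over HTTPS)
  3. Set restrictive file permissions: .env should not be world-readable, and reports/ must remain writable by the web server
  4. Enable error logging and disable display_errors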
  5. Configure web server security headers
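
Step 2 matters because a browser sends a cookie marked Secure only over HTTPS. The sketch below uses Python's standard library to show which attributes a session cookie would typically carry. The cookie name, lifetime, and SameSite value are hypothetical, and the PHP code that sets the app's real cookie is not shown here.

```python
from http.cookies import SimpleCookie


def build_session_cookie(session_id: str, max_age: int = 3600) -> str:
    """Return a Set-Cookie header value with common hardening attributes."""
    cookie = SimpleCookie()
    cookie["session"] = session_id  # hypothetical cookie name
    morsel = cookie["session"]
    morsel["httponly"] = True    # not readable from JavaScript
    morsel["secure"] = True      # sent over HTTPS only
    morsel["samesite"] = "Lax"   # limits cross-site sending
    morsel["path"] = "/"
    morsel["max-age"] = max_age
    return morsel.OutputString()


print(build_session_cookie("abc123"))
```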

Enjoy your automated newsroom reports! 📰