# Voice to Text with Whisper & DeepL Translation

A secure web application that converts audio files to text using OpenAI's Whisper model and translates them using the DeepL API. It features Microsoft Azure AD authentication and supports multiple output formats: plain text, VTT (WebVTT), and SRT (SubRip).

## Features

- 🔐 **Microsoft Azure AD SSO** authentication with OAuth2 PKCE flow
- 🎤 **Audio transcription** using OpenAI Whisper (multiple models available)
- 🌐 **Translation** using DeepL API (30+ languages)
- 📝 **Multiple output formats**: Text, VTT, SRT
- 🚀 **Python Flask API** backend (port 5010)
- 💻 **PHP frontend** (MAMP/Apache compatible with PHP-FPM)
- 📦 **350MB file size limit** for audio uploads
- 📄 **Generates both original and translated files**
- 🎨 **Modern black/gold UI** with dark theme
- 📊 **Real-time progress bar** during processing
- 👀 **In-page preview** of transcriptions
- ⬇️ **One-click download** for all formats
- 🔒 **Session-based file access control** - users can only access their own files
- 🧪 **Dev mode** for local testing without Microsoft authentication

## Requirements

### Required Software

- **Python 3.8+** (Recommended: 3.10 or 3.11)
- **PHP 7.4+** with PHP-FPM
- **MAMP or Apache** web server
- **Composer** for PHP dependency management
- **FFmpeg** for audio processing

### For Production (Optional for Local Dev)

- **Microsoft Azure AD application** with registered redirect URI
- **HTTPS/SSL** certificate (required for production secure cookies)

## Quick Start (Local Development)

For local testing without Azure AD setup:

```bash
# 1. Copy environment file
cp .env.example .env

# 2. Enable dev mode in .env
# Edit .env and set: DEV_MODE=true

# 3. Install PHP dependencies
composer install

# 4. Install Python dependencies
./setup.sh

# 5. Start Python API
./start_api.sh

# 6. Access via MAMP at http://localhost:8888/voice2text/
```

**Dev Mode**: When `DEV_MODE=true`, authentication is bypassed and you'll see a mock user "Dev User (Local)". This allows testing without Azure AD configuration.

---

## Full Installation

### 1. Configure Authentication

This application uses Microsoft Azure AD for Single Sign-On (SSO) authentication with PKCE flow.

**Step 1: Copy and configure environment file**

```bash
cp .env.example .env
```

**Step 2: Edit `.env` file with your configuration:**

```env
# Set to true for local testing (bypasses Microsoft auth)
DEV_MODE=false

# Azure AD Configuration (required for production)
AZURE_CLIENT_ID=your_client_id_here
AZURE_AUTHORITY=https://login.microsoftonline.com/your_tenant_id_here
AZURE_REDIRECT_URI=https://yourdomain.com/voice2text/

# API Keys
DEEPL_API_KEY=your_deepl_api_key_here

# API Configuration
PYTHON_API_URL=http://localhost:5010

# Session timeout in seconds (default: 8 hours)
SESSION_TIMEOUT=28800
```

**For Local Development:**

- Set `DEV_MODE=true` to bypass authentication
- Azure credentials not required when in dev mode

**For Production:**

- Set `DEV_MODE=false`
- Configure valid Azure AD credentials
- Register your redirect URI in Azure AD Portal

**Step 3: Install PHP dependencies**

```bash
composer install
```

This installs:

- `league/oauth2-client` - OAuth2 PKCE authentication
- `vlucas/phpdotenv` - Environment variable management

### 2. Install FFmpeg

**macOS:**

```bash
brew install ffmpeg
```

**Linux (Ubuntu/Debian):**

```bash
sudo apt update
sudo apt install ffmpeg
```

**Windows:**
Download from https://ffmpeg.org/download.html

### 3. Setup Python Environment

Run the setup script:

```bash
chmod +x setup.sh
./setup.sh
```

This will:

- Create a Python virtual environment
- Install all dependencies (Flask, Whisper, etc.)
- Create the outputs directory

### 4. Start the API Server

```bash
chmod +x start_api.sh
./start_api.sh
```

Or manually:

```bash
source venv/bin/activate
python api.py
```

The API will run on http://localhost:5010

### 5. Configure Web Server

**MAMP Setup:**

1. Point MAMP document root to this directory
2. Ensure PHP is enabled (PHP 7.4+ recommended)
3. **IMPORTANT**: MAMP uses PHP-FPM, so PHP configuration is in `.user.ini` (not `.htaccess`)
4. Restart MAMP servers after changing `.user.ini`
5. Access at: `http://localhost:8888/voice2text/`

**Apache Setup:**

- See the "Manual Production Deployment (Apache)" section below for full Apache configuration

## Usage

### Development Mode (DEV_MODE=true)

1. Start the Python API server: `./start_api.sh`
2. Open the web application in your browser
3. You'll be automatically logged in as "Dev User (Local)"
4. An orange "DEV MODE ACTIVE" banner appears at the top
5. Select output format (Text/VTT/SRT)
6. (Optional) Enable translation and select target language
7. Upload an audio file (max 350MB)
8. Wait for processing
9. Download original and/or translated transcription

### Production Mode (DEV_MODE=false)

1. Start the Python API server: `./start_api.sh`
2. Open the web application in your browser
3. You'll see a **login page** with "Sign in with Microsoft" button
4. Click and authenticate with your Microsoft account
5. After authentication, you'll be redirected to the main application
6. See your name and email in the user header
7. Select output format (Text/VTT/SRT)
8. (Optional) Enable translation and select target language
9. Upload an audio file (max 350MB)
10. Wait for processing - see real-time progress bar
11. View transcription preview in-page (truncated at 10,000 chars)
12. Download original and/or translated transcription files
13. Your files are associated with your session and only accessible to you
14. Click "Logout" when finished

### Translation

The app uses the DeepL API for high-quality translations.
When translation is enabled:

- The audio is first transcribed in its original language
- The transcription is then translated to your selected target language
- Both original and translated files are generated
- Both files are tracked in your session for access control
- Supports 30+ languages including English, Spanish, French, German, Portuguese, Japanese, Chinese, and more

### File Upload Configuration

**MAMP (PHP-FPM):**

PHP settings are configured in `.user.ini` (automatically created during setup):

```ini
upload_max_filesize = 350M
post_max_size = 350M
max_execution_time = 1200
max_input_time = 1200
memory_limit = 512M
```

**Note:** Restart MAMP after changing `.user.ini`. Because PHP-FPM caches the file, changes can still take up to 5 minutes to take effect.

**Apache (mod_php):**
Uncomment the settings in `.htaccess` or configure them in `php.ini`.

## API Endpoints

### POST /transcribe

Transcribe audio file to text/VTT/SRT

**Parameters:**

- `audio` (file): Audio file to transcribe
- `format` (string): Output format (txt/vtt/srt)

**Response:**

```json
{
  "success": true,
  "text": "transcribed text...",
  "filename": "output.txt",
  "format": "txt"
}
```

### GET /health

Health check endpoint

### GET /download/

Download transcribed file

## Whisper Models

The default model is `base`, which provides a good balance of speed and accuracy.
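
The model name could also be made configurable rather than fixed in code. The sketch below is illustrative only: the `WHISPER_MODEL` environment variable and the `resolve_model_name` helper are hypothetical and are not part of this project. It accepts only the model names documented in this section and falls back to `base` for anything else; the returned name would then be passed to `whisper.load_model(...)` in `api.py`.

```python
import os

# Model names documented in this README.
KNOWN_MODELS = ("tiny", "base", "small", "medium", "large")
DEFAULT_MODEL = "base"


def resolve_model_name(env=None):
    """Return the Whisper model name to load.

    Reads the hypothetical WHISPER_MODEL variable (an assumption, not an
    existing setting of this project) and falls back to the default when
    the value is unset, empty, or not a known model name.
    """
    env = os.environ if env is None else env
    name = env.get("WHISPER_MODEL", "").strip().lower()
    return name if name in KNOWN_MODELS else DEFAULT_MODEL


if __name__ == "__main__":
    # In api.py this value would be handed to whisper.load_model(...).
    print(resolve_model_name())
```
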
Available models:

- `tiny` - Fastest, least accurate
- `base` - Good balance (default)
- `small` - Better accuracy, slower
- `medium` - High accuracy, much slower
- `large` - Best accuracy, very slow

To change the model, edit the `whisper.load_model(...)` call in `api.py`:

```python
model = whisper.load_model("base")  # Change to desired model
```

## Authentication & Security

### Microsoft Azure AD SSO (Production)

- **OAuth2 with PKCE** (Proof Key for Code Exchange) flow - RFC 7636
- **No client secrets** needed - secure public client authentication
- **Code challenge**: SHA256 hash of random 64-character verifier
- **State validation**: CSRF protection on callback
- **Microsoft Graph API**: Retrieves user profile information

### Development Mode

- **DEV_MODE=true**: Bypasses authentication entirely
- **Mock user**: Auto-creates session with "Dev User (Local)"
- **Visual indicator**: Orange banner shows when dev mode is active
- **No Microsoft credentials required**: Perfect for local testing

### Session Management

- **Secure session cookies**: httponly, secure (HTTPS only), samesite=Lax
- **Session timeout**: Configurable (default: 8 hours)
- **Session regeneration**: After login to prevent fixation attacks
- **Auto-timeout**: Sessions expire and require re-authentication

### File Access Control

- **Session-based tracking**: Files stored in `$_SESSION['user_files']` array
- **Upload tracking**: Files automatically added to user's session on transcription
- **Download validation**: Only files in user's session can be downloaded
- **Ownership logging**: Unauthorized attempts logged with user ID
- **No cross-user access**: Users cannot access other users' files

### Security Architecture

```
User Request
    ↓
Authentication Check (isAuthenticated())
    ↓
Process Request (process.php)
    ↓
Add Files to Session ($_SESSION['user_files'])
    ↓
Download Request (download.php)
    ↓
Verify File in User's Session
    ↓
Serve File or 403 Forbidden
```

### Important Security Notes

- ✅ `.env` file excluded from
  git (contains sensitive credentials)
- ✅ HTTPS required in production for secure cookie transmission
- ⚠️ **Files persist in `outputs/` after session expires** - can't be downloaded but still exist on disk
- 💡 **Recommendation**: Set up cron job to clean old files from `outputs/` directory
- 🔒 **Session-only access**: Files become inaccessible when session expires or user logs out
- 🚨 **Dev mode security**: Only use `DEV_MODE=true` for local development, never in production

## File Structure

```
voice2text/
├── api.py                 # Python Flask API with Whisper & DeepL
├── login.php              # Landing page with Microsoft SSO button
├── auth.php               # OAuth2 PKCE authentication handler
├── logout.php             # Session destruction handler
├── index.php              # Main application interface (auth required)
├── process.php            # PHP request handler (auth + file tracking)
├── download.php           # File download handler (auth + ownership check)
├── check_api.php          # API status checker (auth required)
├── test_download.php      # Download functionality tester (auth required)
├── config.php             # Configuration loader
├── auth_config.php        # Authentication & environment config
├── style.css              # Black/gold theme styles
├── V2T.svg                # Application logo
│
├── .env                   # Environment variables (NOT in git) ⚠️
├── .env.example           # Environment variables template
├── .user.ini              # PHP-FPM configuration (MAMP)
├── .htaccess              # Apache rewrite rules
├── .gitignore             # Git ignore rules
│
├── composer.json          # PHP dependencies manifest
├── composer.lock          # PHP dependency lock file (NOT in git)
├── requirements.txt       # Python dependencies
├── setup.sh               # Python environment setup script
├── start_api.sh           # Python API start script
│
├── README.md              # This file - comprehensive documentation
├── CLAUDE.md              # Claude Code guidance for AI assistance
│
├── outputs/               # Transcribed files directory (files NOT in git)
├── vendor/                # Composer PHP dependencies (NOT in git)
└── venv/                  # Python virtual environment (NOT in git)

Key Files Explained:
- .env: Contains secrets (Azure credentials, API keys) - NEVER commit
- .user.ini: PHP settings for PHP-FPM (MAMP) - upload limits, timeouts
- auth_config.php: Core authentication logic and session management
- process.php: Handles uploads, calls Python API, tracks files in session
- download.php: Validates ownership before serving files
```

## Production Deployment (Automated)

### Quick Deployment with deploy.sh

This is the recommended method for production deployment. The `deploy.sh` script automates the entire deployment process.

**Prerequisites (ALL REQUIRED):**

- Ubuntu/Debian or CentOS/RHEL server
- Apache 2.4+ or Nginx
- PHP 7.4+ with PHP-FPM
- Python 3.8+ with pip and venv
- **Composer** (PHP dependency manager) - deploy script will fail without it
- **FFmpeg** (audio processing) - transcription will fail without it
- Root/sudo access

**The deploy script checks for these and will exit with an error if any are missing.**

### Initial Setup

**Step 1: Install system dependencies**

**Ubuntu/Debian:**

```bash
sudo apt update
sudo apt install apache2 libapache2-mod-php php-curl php-xml php-mbstring \
    python3 python3-venv python3-pip composer ffmpeg git
sudo a2enmod rewrite
sudo systemctl restart apache2
```

**CentOS/RHEL:**

```bash
sudo yum install httpd php php-curl php-xml php-mbstring \
    python3 python3-pip composer ffmpeg git
sudo systemctl enable httpd
sudo systemctl start httpd
```

**Step 2: Clone repository to backend location**

```bash
sudo mkdir -p /opt/voice2text
sudo git clone https://github.com/yourusername/voice2text.git /opt/voice2text
cd /opt/voice2text
```

**Step 3: Configure production environment**

```bash
# Copy and edit .env file
sudo cp .env.example .env
sudo nano .env

# IMPORTANT: Set these values in .env:
# - DEV_MODE=false (disable dev mode for production)
# - AZURE_CLIENT_ID=your_production_client_id
# - AZURE_AUTHORITY=your_production_authority
# - AZURE_REDIRECT_URI=https://yourdomain.com/voice2txt/
# - DEEPL_API_KEY=your_production_api_key
```

**Step 4: Run deployment script**

```bash
sudo ./deploy.sh
```

The script will ask you to confirm you've pulled the latest code, then:

- Create Python virtual environment
- Install all Python dependencies
- Install Composer PHP dependencies
- Create outputs/ directory
- Install systemd service
- Copy frontend files to /var/www/html/voice2txt
- Set proper permissions
- Start the API service
- Verify everything is working

**Note:** The deploy script does NOT pull code from git. Pull the code you want to deploy BEFORE running the script. This gives you control over what version is deployed.

### Deployment Script Options

```bash
# Full deployment (default)
sudo ./deploy.sh

# Deploy only backend (Python API)
sudo ./deploy.sh --backend-only

# Deploy only frontend (PHP files)
sudo ./deploy.sh --frontend-only

# See what would be deployed without executing
sudo ./deploy.sh --dry-run

# Skip git status confirmation prompt (use with caution)
sudo ./deploy.sh --skip-git-check

# Verbose output for debugging
sudo ./deploy.sh --verbose

# Show help
sudo ./deploy.sh --help
```

**Typical Deployment Workflow:**

```bash
# 1. Pull latest code
cd /opt/voice2text
sudo git pull origin main

# 2. Review what will be deployed
git log -3 --oneline    # See last 3 commits
git diff HEAD~1         # See changes in last commit

# 3. Deploy
sudo ./deploy.sh

# The script will show the expected workflow and ask for confirmation before proceeding
```

### Updating Production (Subsequent Deployments)

After initial setup, updating is a two-step process:

```bash
cd /opt/voice2text

# Step 1: Pull the code you want to deploy
sudo git pull origin main

# Or deploy a specific branch/tag:
# sudo git checkout v1.2.3
# sudo git pull origin feature-branch

# Step 2: Deploy the code
sudo ./deploy.sh
```

The script will:

- Show the backend directory and ask for confirmation
- Update dependencies
- Restart the API service
- Update frontend files
- Verify API is responding

**Why separate git pull from deploy?**

- Gives you control over what version is deployed
- Allows you to review changes before deploying
- Lets you deploy specific branches, tags, or commits
- Prevents accidental deployment of unreviewed code

### Service Management

```bash
# Check service status
sudo systemctl status voice2text-api

# Restart service
sudo systemctl restart voice2text-api

# Stop service
sudo systemctl stop voice2text-api

# Start service
sudo systemctl start voice2text-api

# View live logs
sudo journalctl -u voice2text-api -f

# View last 50 log entries
sudo journalctl -u voice2text-api -n 50
```

### Deployment Architecture

```
Production Server:
├── /opt/voice2text/              # Backend (git repo)
│   ├── api.py                    # Python Flask API
│   ├── venv/                     # Python virtual environment
│   ├── outputs/                  # Transcribed files
│   ├── .env                      # Production config (not in git)
│   └── voice2text-api.service    # Systemd service file
│
├── /var/www/html/voice2txt/      # Frontend (web root)
│   ├── *.php                     # PHP files
│   ├── vendor/                   # Composer dependencies
│   ├── style.css, V2T.svg        # Assets
│   ├── .env                      # Frontend config (copy)
│   └── outputs/ → symlink        # Links to /opt/voice2text/outputs
│
└── /etc/systemd/system/
    └── voice2text-api.service    # Systemd service
```

### Troubleshooting Deployment

**"Failed to open vendor/autoload.php" error:**

```
PHP Fatal error: Failed opening required '/var/www/html/voice2txt/vendor/autoload.php'
```

This means Composer dependencies are missing. Fix:

```bash
# 1. Check if Composer is installed
composer --version
# If not installed:
curl -sS https://getcomposer.org/installer | sudo php -- --install-dir=/usr/local/bin --filename=composer

# 2. Install dependencies in backend
cd /opt/voice2text
sudo composer install --no-dev --optimize-autoloader

# 3. Verify vendor/ was created
ls -la vendor/autoload.php

# 4. Copy to frontend
sudo cp -r vendor /var/www/html/voice2txt/

# 5. Fix permissions
sudo chown -R www-data:www-data /var/www/html/voice2txt/vendor

# 6. Verify frontend has it
ls -la /var/www/html/voice2txt/vendor/autoload.php

# 7. Restart Apache
sudo systemctl restart apache2
```

**Deploy script fails with "Composer not found":**

The deploy script REQUIRES Composer to be installed. Install it:

```bash
curl -sS https://getcomposer.org/installer | php
sudo mv composer.phar /usr/local/bin/composer
sudo chmod +x /usr/local/bin/composer
composer --version
```

**Service won't start:**

```bash
# Check service logs
sudo journalctl -u voice2text-api -n 50

# Check Python dependencies
cd /opt/voice2text
source venv/bin/activate
pip list

# Test Python API manually
source venv/bin/activate
python api.py
```

**Frontend shows 500 error:**

```bash
# Check Apache error log
sudo tail -f /var/log/apache2/error.log

# Verify .env exists
ls -la /var/www/html/voice2txt/.env

# Check permissions
ls -la /var/www/html/voice2txt/
```

**Can't access files after upload:**

```bash
# Check outputs directory permissions
ls -ld /opt/voice2text/outputs
# Should be: drwxrwxrwx (777) www-data www-data

# Fix permissions
sudo chmod 777 /opt/voice2text/outputs
sudo chown www-data:www-data /opt/voice2text/outputs
```

---

## Manual Production Deployment (Apache)

If you prefer manual deployment instead of using the automated script:

### Prerequisites

- Apache 2.4+
- PHP 7.4+ with mod_php or PHP-FPM
- Python 3.8+
- FFmpeg
- Root/sudo access for system configuration

### Step 1: Install Required Apache Modules

(Same as automated deployment above)

### Step 2: Deploy Application Files

```bash
# Clone to backend location
sudo mkdir -p /opt/voice2text
sudo git clone https://github.com/yourusername/voice2text.git /opt/voice2text
cd /opt/voice2text

# Set up Python environment
sudo chmod +x setup.sh
sudo ./setup.sh

# Install Composer dependencies
sudo composer install --no-dev --optimize-autoloader

# Set proper ownership and permissions
sudo chown -R www-data:www-data /opt/voice2text
sudo chmod -R 755 /opt/voice2text
sudo chmod 777 /opt/voice2text/outputs

# Copy frontend files
sudo mkdir -p /var/www/html/voice2txt
sudo cp *.php /var/www/html/voice2txt/
sudo cp style.css V2T.svg /var/www/html/voice2txt/
sudo cp .htaccess .user.ini /var/www/html/voice2txt/
sudo cp -r vendor /var/www/html/voice2txt/
sudo ln -s /opt/voice2text/outputs /var/www/html/voice2txt/outputs
sudo chown -R www-data:www-data /var/www/html/voice2txt
```

### Step 3: Configure Apache Virtual Host

Create `/etc/apache2/sites-available/voice2text.conf`:

```apache
<VirtualHost *:80>
    ServerName voice2text.yourdomain.com
    ServerAdmin admin@yourdomain.com
    DocumentRoot /var/www/html/voice2txt

    <Directory /var/www/html/voice2txt>
        Options -Indexes +FollowSymLinks
        AllowOverride All
        Require all granted
    </Directory>

    # PHP settings for large uploads
    php_value upload_max_filesize 350M
    php_value post_max_size 350M
    php_value max_execution_time 1200
    php_value max_input_time 1200
    php_value memory_limit 512M

    # Protect sensitive files
    <Files ".env">
        Require all denied
    </Files>

    # Logging
    ErrorLog ${APACHE_LOG_DIR}/voice2text-error.log
    CustomLog ${APACHE_LOG_DIR}/voice2text-access.log combined
</VirtualHost>
```

Enable the site:

```bash
sudo a2ensite voice2text.conf
sudo systemctl reload apache2
```

### Step 4: Configure PHP for Large Uploads

Edit `/etc/php/7.4/apache2/php.ini` (adjust version as needed):

```ini
upload_max_filesize = 350M
post_max_size = 350M
max_execution_time = 1200
max_input_time = 1200
memory_limit = 512M
```

Restart Apache:

```bash
sudo systemctl restart apache2
```

### Step 5: Setup Python API as Systemd Service

Create `/etc/systemd/system/voice2text-api.service`:

```ini
[Unit]
Description=Voice to Text Whisper API
After=network.target

[Service]
Type=simple
User=www-data
Group=www-data
WorkingDirectory=/opt/voice2text
Environment="PATH=/opt/voice2text/venv/bin:/usr/local/bin:/usr/bin:/bin"
ExecStart=/opt/voice2text/venv/bin/python /opt/voice2text/api.py
Restart=always
RestartSec=10

# Security settings
NoNewPrivileges=true
PrivateTmp=true

# Logging
StandardOutput=append:/var/log/voice2text-api.log
StandardError=append:/var/log/voice2text-api-error.log

[Install]
WantedBy=multi-user.target
```

Enable and start the service:

```bash
sudo systemctl daemon-reload
sudo systemctl enable voice2text-api
sudo systemctl start voice2text-api

# Check status
sudo systemctl status voice2text-api

# View logs
sudo journalctl -u voice2text-api -f
```

### Step 6: Configure Firewall

**UFW (Ubuntu):**

```bash
sudo ufw allow 'Apache Full'
sudo ufw allow 5010/tcp  # Python API
sudo ufw enable
```

**Firewalld (CentOS):**

```bash
sudo firewall-cmd --permanent --add-service=http
sudo firewall-cmd --permanent --add-service=https
sudo firewall-cmd --permanent --add-port=5010/tcp
sudo firewall-cmd --reload
```

### Step 7: SSL Configuration (Optional but Recommended)

Using Let's Encrypt with Certbot:

```bash
# Install Certbot
sudo apt install certbot python3-certbot-apache

# Get SSL certificate
sudo certbot --apache -d voice2text.yourdomain.com

# Auto-renewal is configured automatically
# Test renewal with:
sudo certbot renew --dry-run
```

### Step 8: Verify Deployment

1. Check Apache status: `sudo systemctl status apache2`
2. Check API status: `sudo systemctl status voice2text-api`
3. Visit: `http://voice2text.yourdomain.com/check_api.php`
4. Test file upload with a small audio file

## Monitoring and Maintenance

### Check API Status

```bash
# View API logs
sudo journalctl -u voice2text-api -n 100

# Check if API is responding
curl http://localhost:5010/health
```

### Check Apache Logs

```bash
# Error log
sudo tail -f /var/log/apache2/voice2text-error.log

# Access log
sudo tail -f /var/log/apache2/voice2text-access.log
```

### Restart Services

```bash
# Restart Apache
sudo systemctl restart apache2

# Restart Python API
sudo systemctl restart voice2text-api

# Restart both
sudo systemctl restart apache2 voice2text-api
```

### Clean Old Files

The `outputs/` directory can grow large. Set up a cron job to clean old files:

```bash
# Edit crontab
sudo crontab -e

# Add this line to delete files older than 24 hours daily at 2 AM
0 2 * * * find /opt/voice2text/outputs -type f -mmin +1440 -delete
```

## Troubleshooting

### MAMP/PHP-FPM Issues

**"Invalid command 'php_value'" error:**

- This means MAMP is using PHP-FPM instead of mod_php
- Solution: Use `.user.ini` instead of `.htaccess` for PHP settings
- The `.user.ini` file should already be created
- Restart MAMP servers after any changes to `.user.ini`

**"Session ini settings cannot be changed" warnings:**

- Cause: `session_start()` being called before session configuration
- Solution: Always load `config.php` BEFORE starting sessions
- Fixed in current version - if you see this, ensure you have latest code

**Changes to .user.ini not taking effect:**

- PHP-FPM caches `.user.ini` for up to 5 minutes
- Solution: Restart MAMP servers completely
- Wait a few minutes or check `phpinfo()` to verify settings

### Authentication Issues

**Stuck on login page in dev mode:**

- Check `.env` file: `DEV_MODE=true`
- Clear browser cookies and cache
- Restart MAMP servers
- Verify `auth_config.php` has latest dev mode logic

**Microsoft authentication fails:**

- Ensure `DEV_MODE=false` in `.env`
- Verify Azure AD credentials are correct
- Check redirect URI matches Azure AD
  Portal configuration
- Ensure redirect URI ends with trailing slash if configured that way
- Check browser console for detailed OAuth errors

**"Authentication required" on every page:**

- Session may not be persisting
- Check browser allows cookies
- Verify `.user.ini` session settings are loaded
- Try clearing browser cookies

### API Issues

**API not connecting:**

1. Check if API is running: `sudo systemctl status voice2text-api`
2. Test health endpoint: `curl http://localhost:5010/health`
3. Check API logs: `sudo journalctl -u voice2text-api -n 50`
4. Verify firewall allows port 5010
5. Visit `check_api.php` in browser for detailed status

**API won't start:**

1. Check Python version: `python3 --version` (must be 3.8+)
2. Verify virtual environment: `ls -la venv/`
3. Check dependencies: `source venv/bin/activate && pip list`
4. Review error logs: `sudo journalctl -u voice2text-api -xe`
5. Ensure FFmpeg is installed: `which ffmpeg`

### Upload Issues

**File upload fails:**

1. Check file size limits in `php.ini`
2. Verify `.htaccess` is being read (requires `AllowOverride All`)
3. Check disk space: `df -h`
4. Verify `outputs/` directory permissions: `ls -ld outputs/`
5. Check Apache error log: `tail -f /var/log/apache2/error.log`

**"413 Request Entity Too Large":**

- If using Nginx as reverse proxy, add to nginx config:

```nginx
client_max_body_size 350M;
```

### Transcription Issues

**Transcription fails:**

1. Verify FFmpeg is installed: `ffmpeg -version`
2. Check audio file format (supported: mp3, wav, m4a, etc.)
3. Review API logs for specific errors
4. Test with a small file first
5. Ensure enough disk space in `/tmp`

**Slow transcription:**

1. Use a smaller Whisper model (`tiny` or `base`)
2. Consider using GPU acceleration (requires CUDA setup)
3. Upgrade server hardware (more CPU/RAM)
4. Reduce audio file length/quality

### Translation Issues

**Translation fails:**

1. Verify that `DEEPL_API_KEY` in `.env` is valid
2. Check DeepL API usage: https://www.deepl.com/pro-account
3. Review API response for specific error messages
4. Ensure internet connectivity for DeepL API

### Permission Issues

**403 Forbidden errors:**

```bash
sudo chown -R www-data:www-data /var/www/html/voice2txt
sudo chmod -R 755 /var/www/html/voice2txt
sudo chmod 777 /opt/voice2text/outputs
```

**Can't write to outputs directory:**

```bash
sudo mkdir -p /opt/voice2text/outputs
sudo chown www-data:www-data /opt/voice2text/outputs
sudo chmod 777 /opt/voice2text/outputs
```

### Performance Issues

**Out of memory:**

1. Use a smaller Whisper model (`tiny` or `base`)
2. Increase PHP memory limit in `php.ini`
3. Increase system swap space
4. Add more RAM to server

**Timeout errors:**

1. Increase PHP `max_execution_time` in `php.ini`
2. Increase Apache timeout in virtual host config
3. Process smaller audio files
4. Use faster Whisper model

### Debugging Tips

**Enable debug mode:**

Add to `config.php`:

```php
error_reporting(E_ALL);
ini_set('display_errors', 1);
```

**Check system resources:**

```bash
# CPU and memory usage
htop

# Disk space
df -h

# Check running processes
ps aux | grep -E 'python|apache'
```

**Test components individually:**

1. Test PHP: Create `test.php` with `<?php phpinfo(); ?>`
2. Test Python API: `curl http://localhost:5010/health`
3. Test file upload: Use small test file first
4. Check browser console for JavaScript errors (F12)

## Quick Reference

### Development Commands

```bash
# Start Python API
./start_api.sh
# or manually:
source venv/bin/activate && python api.py

# Check Python API status
curl http://localhost:5010/health

# Install/update PHP dependencies
composer install

# Install/update Python dependencies
./setup.sh

# View Python API output
# (if running in background, check terminal where you started it)
```

### Configuration Files

- **`.env`** - Environment variables (authentication, API keys, dev mode)
- **`.user.ini`** - PHP settings for MAMP/PHP-FPM (upload limits, timeouts)
- **`api.py`** - Whisper model selection in the `whisper.load_model(...)` call (`tiny`, `base`, `small`, `medium`, `large`)

### Dev Mode Toggle

```bash
# Edit .env file:
DEV_MODE=true   # Local testing - bypasses authentication
DEV_MODE=false  # Production - requires Microsoft SSO
```

### Diagnostic Pages

- **`check_api.php`** - Verify Python API connection and view status
- **`test_download.php`** - View your accessible files and test downloads
- Both are served under `http://localhost:8888/voice2text/`, for example `http://localhost:8888/voice2text/check_api.php`

### Log Locations

- **Python API**: Terminal output where `./start_api.sh` was run
- **PHP errors**: Check MAMP logs directory
- **Download attempts**: PHP error log (unauthorized attempts are logged)

## License

MIT