olivas/README.md
DJP 3467dbcf03 Initial commit — OliVAS visual attention analysis platform
Full-stack application for predicting where humans look in images using
DeepGaze saliency models. Includes heatmap overlays, gaze sequence prediction,
hotspot detection, AOI analysis, rule-based insights, optional Claude AI
design analysis, and professional PDF report generation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 20:20:58 -05:00


# OliVAS — Open-Source Visual Attention Software
**OliVAS** (OLIVER Visual Attention Suite) is an open-source web application that predicts where humans will look in an image during the first 3-5 seconds of viewing. Built for creative teams, designers, and marketers at OLIVER, it provides saliency heatmaps, gaze sequence predictions, hotspot analysis, and actionable design insights — all without needing physical eye-tracking hardware.
## Features
- **Saliency Heatmap** — Interactive heatmap overlay showing predicted attention intensity with adjustable opacity and colormap (Jet, Viridis, Inferno, etc.)
- **Gaze Sequence Prediction** — Numbered fixation points showing the predicted order viewers will scan the image
- **Hotspot Detection** — Top 5 attention regions ranked by intensity with bounding boxes
- **Attention Score** — Overall 0-100 concentration score measuring how focused or diffuse the predicted attention is
- **Areas of Interest (AOI)** — Draw rectangles over design elements to measure attention %, area %, and attention density
- **Rule-Based Insights** — Automatic analysis of attention concentration, focal dominance, gaze entry point, spatial balance, edge risk, and drop-off
- **AI Design Analysis** — Optional Claude Sonnet 4.6-powered insights that reference specific visual elements in your design with actionable recommendations
- **PDF Reports** — Professional downloadable reports with Montserrat typography, all visualizations, metrics, and insights (both rule-based and AI)
- **Multi-Model Support** — Choose between DeepGaze I, DeepGaze IIE (recommended), and DeepGaze III
- **Project Organization** — Group analyses into projects for easy management
- **Comparison View** — Side-by-side comparison of two analyses with metrics
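The three AOI metrics above can be illustrated with a small sketch. This is one plausible definition, not necessarily the one OliVAS implements: attention % is the share of total saliency falling inside the rectangle, area % is the share of the image it covers, and density is their ratio. The `Rect` type and `aoi_metrics` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned AOI rectangle in pixel coordinates (hypothetical type)."""
    x: int
    y: int
    width: int
    height: int


def aoi_metrics(saliency: list[list[float]], rect: Rect) -> dict[str, float]:
    """Return attention %, area %, and attention density for one AOI.

    `saliency` is a row-major grid of non-negative values (rows = y, cols = x).
    The definitions used here are assumptions, not the project's confirmed formulas.
    """
    rows, cols = len(saliency), len(saliency[0])
    total = sum(sum(row) for row in saliency)
    if total <= 0:
        raise ValueError("saliency map must contain positive mass")

    # Clip the rectangle to the image so partially off-canvas AOIs still work.
    x0, y0 = max(rect.x, 0), max(rect.y, 0)
    x1, y1 = min(rect.x + rect.width, cols), min(rect.y + rect.height, rows)

    inside = sum(sum(saliency[y][x0:x1]) for y in range(y0, y1))
    clipped_area = max(x1 - x0, 0) * max(y1 - y0, 0)

    attention_pct = 100.0 * inside / total
    area_pct = 100.0 * clipped_area / (rows * cols)
    # Density above 1 means the AOI draws more attention than its size alone predicts.
    density = attention_pct / area_pct if area_pct > 0 else 0.0
    return {"attention_pct": attention_pct, "area_pct": area_pct, "density": density}
```

Under these definitions, an AOI covering 25% of the image that captures 62.5% of the predicted attention has a density of 2.5.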
## Tech Stack
| Layer | Technology |
|-------|-----------|
| **Frontend** | React 18, TypeScript, Vite, Tailwind CSS, Zustand, React Router |
| **Backend** | FastAPI, Python 3.12, SQLAlchemy (async), Pydantic v2 |
| **Database** | PostgreSQL 16 |
| **ML Models** | DeepGaze I / IIE / III via [deepgaze_pytorch](https://github.com/matthias-k/DeepGaze) |
| **AI Insights** | Anthropic Claude Sonnet 4.6 (optional) |
| **PDF Generation** | ReportLab with Montserrat font |
| **Deployment** | Docker Compose |
## Prerequisites
- Python 3.12+
- Node.js 18+
- Docker & Docker Compose (for PostgreSQL)
- Git
## Quick Start
### 1. Clone the repository
```bash
git clone git@bitbucket.org:zlalani/olivas.git
cd olivas
```
### 2. Start PostgreSQL
```bash
docker compose up -d postgres
```
This starts PostgreSQL on port **5453** with database `olivas`.
### 3. Set up the backend
```bash
cd backend
python3.12 -m venv .venv
.venv/bin/pip install --upgrade pip
.venv/bin/pip install -e ".[dev]"
.venv/bin/pip install "deepgaze-pytorch @ git+https://github.com/matthias-k/DeepGaze.git"
```
### 4. Configure environment (optional)
Create `backend/.env` for optional settings:
```env
# Only needed for the optional AI Design Analysis feature
ANTHROPIC_API_KEY=sk-ant-your-key-here
# Defaults (change if needed)
DATABASE_URL=postgresql+asyncpg://olivas:olivas@localhost:5453/olivas
DEVICE=auto
CORS_ORIGINS=http://localhost:1577
```
### 5. Start the backend
```bash
# Run from the repository root (skip the cd if you are still in backend/ from step 3)
cd backend
.venv/bin/uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
```
The backend loads all DeepGaze models on startup. The first run may take 30-60 seconds while model weights are downloaded.
### 6. Set up and start the frontend
```bash
# Run from the repository root, in a second terminal (the backend keeps running in the first)
cd frontend
npm install
npm run dev
```
### 7. Open the app
Navigate to **http://localhost:1577** in your browser.
## Using the Makefile
For convenience, the project includes a Makefile:
```bash
make setup # Install all backend + frontend dependencies
make dev-backend # Start backend with hot reload
make dev-frontend # Start frontend dev server
make db-up # Start PostgreSQL container
make test # Run backend tests
make lint # Run ruff linter
make lint-fix # Auto-fix linting issues
make clean # Remove caches and virtual environments
```
## Docker Compose (Full Stack)
To run everything in Docker:
```bash
docker compose up --build
```
This starts PostgreSQL, the backend API, and the frontend. Access the app at **http://localhost:1577**.
## Project Structure
```
olivas/
├── backend/
│ ├── app/
│ │ ├── api/endpoints/ # FastAPI route handlers
│ │ ├── db/ # Database session & connection
│ │ ├── models/ # SQLAlchemy ORM models
│ │ ├── schemas/ # Pydantic request/response schemas
│ │ ├── services/
│ │ │ ├── saliency/ # DeepGaze model manager & inference
│ │ │ ├── ai_insights.py # Claude AI integration
│ │ │ ├── insights.py # Rule-based insights engine
│ │ │ ├── report_generator.py # PDF report generation
│ │ │ ├── heatmap.py # Heatmap overlay generation
│ │ │ ├── gaze_sequence.py # Gaze sequence extraction
│ │ │ ├── image_processing.py # Image resize & upscale
│ │ │ └── storage.py # File storage abstraction
│ │ ├── config.py # App settings (env vars)
│ │ └── main.py # FastAPI app entry point
│ └── pyproject.toml
├── frontend/
│ ├── src/
│ │ ├── api/ # Axios API client & endpoints
│ │ ├── components/
│ │ │ ├── analysis/ # Heatmap, gaze, hotspots, insights
│ │ │ ├── aoi/ # Area of Interest canvas & results
│ │ │ ├── common/ # Button, Card, LoadingSpinner
│ │ │ └── layout/ # Header, Sidebar, AppLayout
│ │ ├── hooks/ # React Query hooks
│ │ ├── pages/ # Dashboard, NewAnalysis, AnalysisView, Help, About
│ │ ├── stores/ # Zustand state management
│ │ └── types/ # TypeScript interfaces
│ └── package.json
├── docker-compose.yml # Production Docker setup
├── docker-compose.dev.yml # Development Docker overrides
├── Makefile # Development shortcuts
└── LICENSE # MIT License
```
## Saliency Models
OliVAS uses the [DeepGaze](https://github.com/matthias-k/DeepGaze) family of saliency prediction models:
| Model | Architecture | Best For | Reference |
|-------|-------------|----------|-----------|
| **DeepGaze IIE** (recommended) | Ensemble of several ImageNet-pretrained backbones (including ResNeXt and DenseNet) | Best accuracy on benchmarks | [Linardos et al., ICCV 2021](https://arxiv.org/abs/2105.12441) |
| **DeepGaze III** | DenseNet features plus a scanpath network conditioned on previous fixations | Complex layouts with many elements | [Kümmerer et al., J. Vision 2022](https://doi.org/10.1167/jov.22.5.7) |
| **DeepGaze I** | AlexNet features | Quick preliminary analysis | [Kümmerer et al., ICLR 2015](https://arxiv.org/abs/1411.1045) |
These models are trained on human fixation data from eye-tracking datasets and are among the top-performing models on the [MIT/Tübingen Saliency Benchmark](https://saliency.tuebingen.ai/).
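A saliency map on its own does not give a viewing order. One classical way to derive a fixation sequence from it, described by Itti et al. (1998), is winner-take-all selection with inhibition of return: pick the strongest location, suppress its neighbourhood, and repeat. The sketch below illustrates that heuristic in plain Python. It is not necessarily what `gaze_sequence.py` does (DeepGaze III, for instance, models scanpaths directly), and the function name and `radius` parameter are hypothetical.

```python
def gaze_sequence(
    saliency: list[list[float]],
    n_fixations: int = 5,
    radius: int = 1,
) -> list[tuple[int, int]]:
    """Return up to `n_fixations` (x, y) points in predicted viewing order.

    Greedy winner-take-all with inhibition of return on a row-major grid.
    """
    grid = [list(row) for row in saliency]  # work on a copy
    rows, cols = len(grid), len(grid[0])
    fixations: list[tuple[int, int]] = []

    for _ in range(n_fixations):
        # Winner-take-all: the location with the highest remaining saliency.
        # Ties resolve deterministically toward the larger (y, x).
        value, y, x = max(
            (grid[r][c], r, c) for r in range(rows) for c in range(cols)
        )
        if value <= 0:
            break  # nothing salient left to visit
        fixations.append((x, y))

        # Inhibition of return: zero a square window around the chosen point
        # so the next iteration moves on to a different region.
        for r in range(max(0, y - radius), min(rows, y + radius + 1)):
            for c in range(max(0, x - radius), min(cols, x + radius + 1)):
                grid[r][c] = 0.0

    return fixations
```

In practice `radius` would be scaled to the image size (for example, roughly the extent of foveal vision at the assumed viewing distance). The value of 1 here only suits tiny toy grids.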
## AI Design Analysis
When an Anthropic API key is configured, OliVAS can send the original image and heatmap overlay to **Claude Sonnet 4.6** for context-aware design analysis. The AI references specific visual elements in your design and provides actionable recommendations.
- Cost per analysis is tracked and displayed (typically $0.01-0.05 per image)
- AI insights are saved to the database and included in PDF reports
- This feature is entirely optional — rule-based insights always work without an API key
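Per-analysis cost can be estimated from the token counts the Anthropic Messages API reports in each response (`usage.input_tokens` and `usage.output_tokens`). The sketch below shows the arithmetic only. The function name is hypothetical and the per-million-token rates are caller-supplied placeholders, not actual pricing, so check the current Anthropic price list before relying on any figure.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    usd_per_mtok_in: float,
    usd_per_mtok_out: float,
) -> float:
    """Estimate the cost of one request from its token usage.

    Rates are expressed in US dollars per million tokens and must be
    supplied by the caller; no real pricing is hard-coded here.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * usd_per_mtok_in + output_tokens * usd_per_mtok_out
    return total / 1_000_000


# Placeholder rates for illustration only: 3.0 USD/Mtok in, 15.0 USD/Mtok out.
example = estimate_cost_usd(4_000, 800, usd_per_mtok_in=3.0, usd_per_mtok_out=15.0)
```

With those placeholder rates, a request using 4,000 input tokens and 800 output tokens comes to about $0.024, inside the typical range quoted above.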
## API Endpoints
| Method | Endpoint | Description |
|--------|---------|-------------|
| `POST` | `/api/projects` | Create a new project |
| `GET` | `/api/projects` | List all projects |
| `POST` | `/api/projects/{id}/analyses` | Upload image and start analysis |
| `GET` | `/api/analyses/{id}` | Get analysis details + insights |
| `GET` | `/api/analyses/{id}/status` | Poll analysis status |
| `GET` | `/api/analyses/{id}/images/{type}` | Get analysis images |
| `POST` | `/api/analyses/{id}/ai-insights` | Generate AI insights (on-demand) |
| `GET` | `/api/analyses/{id}/report` | Download PDF report |
| `POST` | `/api/analyses/{id}/aois` | Create Areas of Interest |
| `DELETE` | `/api/analyses/{id}` | Delete an analysis |
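Because analysis runs asynchronously, a client typically uploads an image and then polls the status endpoint until the job finishes. The sketch below shows such a polling loop using only the standard library. The endpoint path and port come from this README. The JSON field name `status` and the terminal values `"completed"` and `"failed"` are assumptions, so check the backend's Pydantic schemas for the real ones.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "http://localhost:8000"  # backend port used in the Quick Start

# Assumed terminal states; the actual values may differ.
TERMINAL_STATES = {"completed", "failed"}


def fetch_status(analysis_id: str) -> dict:
    """GET /api/analyses/{id}/status and decode the JSON body."""
    url = f"{BASE_URL}/api/analyses/{analysis_id}/status"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def wait_for_analysis(
    analysis_id: str,
    fetch: Callable[[str], dict] = fetch_status,
    interval: float = 1.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the analysis reaches a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch(analysis_id)
        if payload.get("status") in TERMINAL_STATES:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError(f"analysis {analysis_id} did not finish in {timeout}s")
        sleep(interval)
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised in tests without a running server. Once the status is terminal, `GET /api/analyses/{id}` returns the full details and insights.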
## Academic References
- Kümmerer, M., Theis, L., & Bethge, M. (2015). "Deep Gaze I: Boosting Saliency Prediction with Feature Maps Trained on ImageNet." *ICLR 2015*. [arXiv:1411.1045](https://arxiv.org/abs/1411.1045)
- Kümmerer, M., Wallis, T.S.A., & Bethge, M. (2016). "DeepGaze II: Reading fixations from deep features trained on object recognition." [arXiv:1610.01563](https://arxiv.org/abs/1610.01563)
- Linardos, A., Kümmerer, M., Press, O., & Bethge, M. (2021). "DeepGaze IIE: Calibrated prediction in and out-of-domain for state-of-the-art saliency modeling." *ICCV 2021*. [arXiv:2105.12441](https://arxiv.org/abs/2105.12441)
- Kümmerer, M., Bethge, M., & Wallis, T.S.A. (2022). "DeepGaze III: Modeling free-viewing human scanpaths with deep learning." *Journal of Vision*, 22(5):7. [DOI:10.1167/jov.22.5.7](https://doi.org/10.1167/jov.22.5.7)
- Itti, L., Koch, C., & Niebur, E. (1998). "A Model of Saliency-Based Visual Attention for Rapid Scene Analysis." *IEEE TPAMI*, 20(11), 1254-1259. [DOI:10.1109/34.730558](https://doi.org/10.1109/34.730558)
## License
MIT License. See [LICENSE](LICENSE) for details.
---
Built with care by **OLIVER** creative teams.