# OliVAS — Open-Source Visual Attention Software

**OliVAS** (OLIVER Visual Attention Suite) is an open-source web application that predicts where humans will look in an image during the first 3-5 seconds of viewing. Built for creative teams, designers, and marketers at OLIVER, it provides saliency heatmaps, gaze sequence predictions, hotspot analysis, and actionable design insights — all without needing physical eye-tracking hardware.

## Features

- **Saliency Heatmap** — Interactive heatmap overlay showing predicted attention intensity with adjustable opacity and colormap (Jet, Viridis, Inferno, etc.)
- **Gaze Sequence Prediction** — Numbered fixation points showing the predicted order viewers will scan the image
- **Hotspot Detection** — Top 5 attention regions ranked by intensity with bounding boxes
- **Attention Score** — Overall 0-100 concentration score measuring how focused or diffuse the predicted attention is
- **Areas of Interest (AOI)** — Draw rectangles over design elements to measure attention %, area %, and attention density
- **Rule-Based Insights** — Automatic analysis of attention concentration, focal dominance, gaze entry point, spatial balance, edge risk, and drop-off
- **AI Design Analysis** — Optional Claude Sonnet 4.6-powered insights that reference specific visual elements in your design with actionable recommendations
- **PDF Reports** — Professional downloadable reports with Montserrat typography, all visualizations, metrics, and insights (both rule-based and AI)
- **Multi-Model Support** — Choose between DeepGaze I, DeepGaze IIE (recommended), and DeepGaze III
- **Project Organization** — Group analyses into projects for easy management
- **Comparison View** — Side-by-side comparison of two analyses with metrics
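
As a rough illustration of the AOI metrics and the attention score described above, the sketch below assumes the saliency map is a 2-D NumPy array of non-negative values. The definitions used here are assumptions chosen for illustration: attention % is the share of total saliency inside the rectangle, density is attention % divided by area %, and the concentration score is entropy-based. The function names are hypothetical, and OliVAS may compute these values differently.

```python
import numpy as np


def aoi_metrics(saliency: np.ndarray, x: int, y: int, w: int, h: int) -> dict:
    """Attention %, area %, and attention density for one rectangular AOI.

    `saliency` is a 2-D array of non-negative values (rows = y, columns = x).
    """
    total = float(saliency.sum())
    if total <= 0:
        raise ValueError("saliency map must contain positive values")
    region = saliency[y:y + h, x:x + w]
    attention_pct = 100.0 * float(region.sum()) / total
    area_pct = 100.0 * region.size / saliency.size
    # Density above 1 means the AOI draws more attention than its size alone predicts.
    density = attention_pct / area_pct if area_pct > 0 else 0.0
    return {"attention_pct": attention_pct, "area_pct": area_pct, "density": density}


def concentration_score(saliency: np.ndarray) -> float:
    """Assumed definition: 0 = uniform attention, 100 = all attention on one pixel."""
    p = saliency.astype(float).ravel()
    total = p.sum()
    if total <= 0 or p.size < 2:
        raise ValueError("need at least two pixels and some positive saliency")
    p = p / total
    nonzero = p[p > 0]
    entropy = -np.sum(nonzero * np.log(nonzero))
    max_entropy = np.log(p.size)  # entropy of a perfectly uniform map
    return float(100.0 * (1.0 - entropy / max_entropy))
```

For example, `aoi_metrics(saliency, x=0, y=0, w=200, h=100)` would describe a 200x100 rectangle anchored at the top-left corner of the image.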

## Tech Stack

| Layer | Technology |
|-------|-----------|
| **Frontend** | React 18, TypeScript, Vite, Tailwind CSS, Zustand, React Router |
| **Backend** | FastAPI, Python 3.12, SQLAlchemy (async), Pydantic v2 |
| **Database** | PostgreSQL 16 |
| **ML Models** | DeepGaze I / IIE / III via [deepgaze_pytorch](https://github.com/matthias-k/DeepGaze) |
| **AI Insights** | Anthropic Claude Sonnet 4.6 (optional) |
| **PDF Generation** | ReportLab with Montserrat font |
| **Deployment** | Docker Compose |

## Prerequisites

- Python 3.12+
- Node.js 18+
- Docker & Docker Compose (for PostgreSQL)
- Git

## Quick Start

### 1. Clone the repository

```bash
git clone git@bitbucket.org:zlalani/olivas.git
cd olivas
```

### 2. Start PostgreSQL

```bash
docker compose up -d postgres
```

This starts PostgreSQL on port **5453** with database `olivas`.

### 3. Set up the backend

```bash
cd backend
python3.12 -m venv .venv
.venv/bin/pip install --upgrade pip
.venv/bin/pip install -e ".[dev]"
.venv/bin/pip install "deepgaze-pytorch @ git+https://github.com/matthias-k/DeepGaze.git"
```

### 4. Configure environment (optional)

Create `backend/.env` for optional settings:

```env
# Only needed for the optional AI Design Analysis feature
ANTHROPIC_API_KEY=sk-ant-your-key-here

# Defaults (change if needed)
DATABASE_URL=postgresql+asyncpg://olivas:olivas@localhost:5453/olivas
DEVICE=auto
CORS_ORIGINS=http://localhost:1577
```
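
To make the role of these variables concrete, here is a minimal standard-library sketch of how such settings could be read, using the defaults shown above. It is not the project's actual `config.py`: the `Settings` class, the `load_settings` function, and the assumption that `CORS_ORIGINS` may hold a comma-separated list are all hypothetical.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    database_url: str
    device: str
    cors_origins: list[str]
    anthropic_api_key: Optional[str]

    @property
    def ai_insights_enabled(self) -> bool:
        # AI Design Analysis is only available when a key is configured.
        return bool(self.anthropic_api_key)


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from `env` (defaults to the process environment)."""
    env = os.environ if env is None else env
    # ASSUMPTION: multiple origins are separated by commas.
    origins = env.get("CORS_ORIGINS", "http://localhost:1577")
    return Settings(
        database_url=env.get(
            "DATABASE_URL",
            "postgresql+asyncpg://olivas:olivas@localhost:5453/olivas",
        ),
        device=env.get("DEVICE", "auto"),
        cors_origins=[o.strip() for o in origins.split(",") if o.strip()],
        anthropic_api_key=env.get("ANTHROPIC_API_KEY") or None,
    )
```

Passing the environment as an argument keeps the loader easy to test without touching real environment variables.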

### 5. Start the backend

```bash
cd backend  # from the repository root; skip if you are already in backend/
.venv/bin/uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
```

The backend loads all DeepGaze models on startup. The first run may take 30-60 seconds while the model weights are downloaded.

### 6. Set up and start the frontend

```bash
cd frontend  # from the repository root, in a second terminal
npm install
npm run dev
```

### 7. Open the app

Navigate to **http://localhost:1577** in your browser.

## Using the Makefile

For convenience, the project includes a Makefile:

```bash
make setup          # Install all backend + frontend dependencies
make dev-backend    # Start backend with hot reload
make dev-frontend   # Start frontend dev server
make db-up          # Start PostgreSQL container
make test           # Run backend tests
make lint           # Run ruff linter
make lint-fix       # Auto-fix linting issues
make clean          # Remove caches and virtual environments
```

## Docker Compose (Full Stack)

To run everything in Docker:

```bash
docker compose up --build
```

This starts PostgreSQL, the backend API, and the frontend. Access the app at **http://localhost:1577**.

## Project Structure

```
olivas/
├── backend/
│   ├── app/
│   │   ├── api/endpoints/       # FastAPI route handlers
│   │   ├── db/                  # Database session & connection
│   │   ├── models/              # SQLAlchemy ORM models
│   │   ├── schemas/             # Pydantic request/response schemas
│   │   ├── services/
│   │   │   ├── saliency/        # DeepGaze model manager & inference
│   │   │   ├── ai_insights.py   # Claude AI integration
│   │   │   ├── insights.py      # Rule-based insights engine
│   │   │   ├── report_generator.py  # PDF report generation
│   │   │   ├── heatmap.py       # Heatmap overlay generation
│   │   │   ├── gaze_sequence.py # Gaze sequence extraction
│   │   │   ├── image_processing.py  # Image resize & upscale
│   │   │   └── storage.py       # File storage abstraction
│   │   ├── config.py            # App settings (env vars)
│   │   └── main.py              # FastAPI app entry point
│   └── pyproject.toml
├── frontend/
│   ├── src/
│   │   ├── api/                 # Axios API client & endpoints
│   │   ├── components/
│   │   │   ├── analysis/        # Heatmap, gaze, hotspots, insights
│   │   │   ├── aoi/             # Area of Interest canvas & results
│   │   │   ├── common/          # Button, Card, LoadingSpinner
│   │   │   └── layout/          # Header, Sidebar, AppLayout
│   │   ├── hooks/               # React Query hooks
│   │   ├── pages/               # Dashboard, NewAnalysis, AnalysisView, Help, About
│   │   ├── stores/              # Zustand state management
│   │   └── types/               # TypeScript interfaces
│   └── package.json
├── docker-compose.yml           # Production Docker setup
├── docker-compose.dev.yml       # Development Docker overrides
├── Makefile                     # Development shortcuts
└── LICENSE                      # MIT License
```

## Saliency Models

OliVAS uses the [DeepGaze](https://github.com/matthias-k/DeepGaze) family of saliency prediction models:

| Model | Architecture | Best For | Reference |
|-------|-------------|----------|-----------|
| **DeepGaze IIE** (recommended) | Ensemble of multiple ImageNet-pretrained backbones | Best accuracy on benchmarks | [Linardos et al., ICCV 2021](https://arxiv.org/abs/2105.12441) |
| **DeepGaze III** | DenseNet features plus a fixation-history (scanpath) network | Complex layouts with many elements | [Kümmerer et al., J. Vision 2022](https://doi.org/10.1167/jov.22.5.7) |
| **DeepGaze I** | AlexNet features | Quick preliminary analysis | [Kümmerer et al., ICLR 2015](https://arxiv.org/abs/1411.1045) |

These models are trained on human gaze recordings covering thousands of images and rank among the top-performing models on the [MIT/Tübingen Saliency Benchmark](https://saliency.tuebingen.ai/).
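
DeepGaze models return a log-density over image pixels. The sketch below shows one way such an output could be turned into a probability map and then into an ordered list of fixation points. It uses a greedy winner-take-all selection with inhibition of return, in the spirit of Itti et al. (1998). This is an illustrative assumption, not necessarily what `gaze_sequence.py` does, and the function names, the default `radius`, and the default `n_fixations` are hypothetical.

```python
from __future__ import annotations

import numpy as np


def log_density_to_probability(log_density: np.ndarray) -> np.ndarray:
    """Convert a log-density map into probabilities that sum to 1."""
    shifted = log_density - log_density.max()  # keeps exp() numerically stable
    prob = np.exp(shifted)
    return prob / prob.sum()


def greedy_gaze_sequence(
    prob: np.ndarray, n_fixations: int = 5, radius: int = 40
) -> list[tuple[int, int]]:
    """Pick fixations in descending saliency order as (x, y) pixel coordinates.

    After each pick, a disc of `radius` pixels around it is zeroed out
    (inhibition of return) so the next pick lands somewhere else.
    """
    work = prob.astype(float).copy()
    rows, cols = np.ogrid[: work.shape[0], : work.shape[1]]
    fixations: list[tuple[int, int]] = []
    for _ in range(n_fixations):
        if work.max() <= 0:
            break  # nothing salient left outside the suppressed discs
        y, x = np.unravel_index(np.argmax(work), work.shape)
        fixations.append((int(x), int(y)))
        work[(rows - y) ** 2 + (cols - x) ** 2 <= radius ** 2] = 0.0
    return fixations
```

A larger `radius` spreads the predicted fixations further apart, while a smaller one lets several fixations cluster on the same element.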

## AI Design Analysis

When an Anthropic API key is configured, OliVAS can send the original image and heatmap overlay to **Claude Sonnet 4.6** for context-aware design analysis. The AI references specific visual elements in your design and provides actionable recommendations.

- Cost per analysis is tracked and displayed (typically $0.01-0.05 per image)
- AI insights are saved to the database and included in PDF reports
- This feature is entirely optional — rule-based insights always work without an API key
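
The per-analysis cost can be estimated from token usage, since the Anthropic Messages API reports `input_tokens` and `output_tokens` in each response's `usage` field. The sketch below is an assumption about how such tracking might work, not the project's implementation. `TokenPricing` and `estimate_cost_usd` are hypothetical names, and the rates must be supplied by the caller from Anthropic's current price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Prices in USD per million tokens; look these up, do not hard-code them."""
    input_usd_per_mtok: float
    output_usd_per_mtok: float


def estimate_cost_usd(input_tokens: int, output_tokens: int, pricing: TokenPricing) -> float:
    """Cost of one request given the token counts from the response's usage field."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * pricing.input_usd_per_mtok
        + output_tokens * pricing.output_usd_per_mtok
    )
    return total / 1_000_000
```

With placeholder rates of 2.0 and 10.0 USD per million tokens, a request using 3,000 input tokens and 800 output tokens would come to about $0.014.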

## API Endpoints

| Method | Endpoint | Description |
|--------|---------|-------------|
| `POST` | `/api/projects` | Create a new project |
| `GET` | `/api/projects` | List all projects |
| `POST` | `/api/projects/{id}/analyses` | Upload image and start analysis |
| `GET` | `/api/analyses/{id}` | Get analysis details + insights |
| `GET` | `/api/analyses/{id}/status` | Poll analysis status |
| `GET` | `/api/analyses/{id}/images/{type}` | Get analysis images |
| `POST` | `/api/analyses/{id}/ai-insights` | Generate AI insights (on-demand) |
| `GET` | `/api/analyses/{id}/report` | Download PDF report |
| `POST` | `/api/analyses/{id}/aois` | Create Areas of Interest |
| `DELETE` | `/api/analyses/{id}` | Delete an analysis |
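
Because analysis runs asynchronously, a client typically polls the status endpoint until the job finishes. The following standard-library sketch illustrates that loop. The response shape is an assumption: the `status` field and its `completed` / `failed` values are hypothetical, as are the helper names, so check the actual response schema before relying on them.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "http://localhost:8000"  # backend address used in the Quick Start


def fetch_status(analysis_id: str) -> dict:
    """GET /api/analyses/{id}/status and decode the JSON body."""
    url = f"{BASE_URL}/api/analyses/{analysis_id}/status"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def wait_for_analysis(
    analysis_id: str,
    get_status: Callable[[str], dict] = fetch_status,
    interval_s: float = 1.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the analysis reaches a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        payload = get_status(analysis_id)
        # ASSUMPTION: the payload has a "status" field with these terminal values.
        if payload.get("status") in ("completed", "failed"):
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError(f"analysis {analysis_id} did not finish in {timeout_s}s")
        sleep(interval_s)
```

Once `wait_for_analysis` returns a completed payload, the client can request `/api/analyses/{id}` for the full results or `/api/analyses/{id}/report` for the PDF.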

## Academic References

- Kümmerer, M., Theis, L., & Bethge, M. (2015). "Deep Gaze I: Boosting Saliency Prediction with Feature Maps Trained on ImageNet." *ICLR 2015*. [arXiv:1411.1045](https://arxiv.org/abs/1411.1045)
- Kümmerer, M., Wallis, T.S.A., & Bethge, M. (2016). "DeepGaze II: Reading fixations from deep features trained on object recognition." *arXiv preprint*. [arXiv:1610.01563](https://arxiv.org/abs/1610.01563)
- Linardos, A., Kümmerer, M., Press, O., & Bethge, M. (2021). "DeepGaze IIE: Calibrated prediction in and out-of-domain for state-of-the-art saliency modeling." *ICCV 2021*. [arXiv:2105.12441](https://arxiv.org/abs/2105.12441)
- Kümmerer, M., Bethge, M., & Wallis, T.S.A. (2022). "DeepGaze III: Modeling free-viewing human scanpaths with deep learning." *Journal of Vision*, 22(5):7. [DOI:10.1167/jov.22.5.7](https://doi.org/10.1167/jov.22.5.7)
- Itti, L., Koch, C., & Niebur, E. (1998). "A Model of Saliency-Based Visual Attention for Rapid Scene Analysis." *IEEE TPAMI*, 20(11), 1254-1259. [DOI:10.1109/34.730558](https://doi.org/10.1109/34.730558)

## License

MIT License. See [LICENSE](LICENSE) for details.

---

Built with care by **OLIVER** creative teams.