Full-stack application for predicting where humans look in images using DeepGaze saliency models. Includes heatmap overlays, gaze sequence prediction, hotspot detection, AOI analysis, rule-based insights, optional Claude AI design analysis, and professional PDF report generation.

# OliVAS: Open-Source Visual Attention Software

**OliVAS** (OLIVER Visual Attention Suite) is an open-source web application that predicts where humans will look in an image during the first 3-5 seconds of viewing. Built for creative teams, designers, and marketers at OLIVER, it provides saliency heatmaps, gaze sequence predictions, hotspot analysis, and actionable design insights, all without physical eye-tracking hardware.

## Features

- **Saliency Heatmap**: Interactive heatmap overlay showing predicted attention intensity with adjustable opacity and colormap (Jet, Viridis, Inferno, etc.)
- **Gaze Sequence Prediction**: Numbered fixation points showing the predicted order in which viewers will scan the image
- **Hotspot Detection**: Top 5 attention regions ranked by intensity, with bounding boxes
- **Attention Score**: Overall 0-100 concentration score measuring how focused or diffuse the predicted attention is
- **Areas of Interest (AOI)**: Draw rectangles over design elements to measure attention %, area %, and attention density
- **Rule-Based Insights**: Automatic analysis of attention concentration, focal dominance, gaze entry point, spatial balance, edge risk, and drop-off
- **AI Design Analysis**: Optional Claude Sonnet 4.6-powered insights that reference specific visual elements in your design and give actionable recommendations
- **PDF Reports**: Professional downloadable reports with Montserrat typography, all visualizations, metrics, and insights (both rule-based and AI)
- **Multi-Model Support**: Choose between DeepGaze I, DeepGaze IIE (recommended), and DeepGaze III
- **Project Organization**: Group analyses into projects for easy management
- **Comparison View**: Side-by-side comparison of two analyses with metrics

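
The three AOI metrics listed above can be illustrated with a small calculation. The following is a minimal sketch, assuming the saliency map is a 2D grid of non-negative values and that density is defined as attention % divided by area %. The names `AoiMetrics` and `aoi_metrics` are hypothetical and not taken from the OliVAS codebase, whose actual definitions may differ.

```python
from dataclasses import dataclass


@dataclass
class AoiMetrics:
    attention_pct: float  # share of total predicted attention inside the AOI
    area_pct: float       # share of the image area covered by the AOI
    density: float        # attention_pct / area_pct (assumed definition)


def aoi_metrics(saliency, x, y, width, height):
    """Compute AOI metrics for a rectangle on a 2D saliency map.

    `saliency` is a list of rows of non-negative floats; (x, y) is the
    top-left corner of the rectangle in pixel coordinates.
    """
    rows, cols = len(saliency), len(saliency[0])

    # Clip the rectangle to the image bounds.
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(cols, x + width), min(rows, y + height)
    if x1 <= x0 or y1 <= y0:
        return AoiMetrics(0.0, 0.0, 0.0)

    total = sum(sum(row) for row in saliency)
    inside = sum(sum(row[x0:x1]) for row in saliency[y0:y1])

    attention_pct = 100.0 * inside / total if total > 0 else 0.0
    area_pct = 100.0 * (x1 - x0) * (y1 - y0) / (rows * cols)
    return AoiMetrics(attention_pct, area_pct, attention_pct / area_pct)
```

Under this definition, a density above 1.0 means the region attracts more attention than its size alone would suggest, and a density below 1.0 means it attracts less.
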
## Tech Stack

| Layer | Technology |
|-------|-----------|
| **Frontend** | React 18, TypeScript, Vite, Tailwind CSS, Zustand, React Router |
| **Backend** | FastAPI, Python 3.12, SQLAlchemy (async), Pydantic v2 |
| **Database** | PostgreSQL 16 |
| **ML Models** | DeepGaze I / IIE / III via [deepgaze_pytorch](https://github.com/matthias-k/DeepGaze) |
| **AI Insights** | Anthropic Claude Sonnet 4.6 (optional) |
| **PDF Generation** | ReportLab with Montserrat font |
| **Deployment** | Docker Compose |

## Prerequisites

- Python 3.12+
- Node.js 18+
- Docker & Docker Compose (for PostgreSQL)
- Git

## Quick Start

### 1. Clone the repository

```bash
git clone git@bitbucket.org:zlalani/olivas.git
cd olivas
```

### 2. Start PostgreSQL

```bash
docker compose up -d postgres
```

This starts PostgreSQL on port **5453** with database `olivas`.

### 3. Set up the backend

```bash
cd backend
python3.12 -m venv .venv
.venv/bin/pip install --upgrade pip
.venv/bin/pip install -e ".[dev]"
.venv/bin/pip install "deepgaze-pytorch @ git+https://github.com/matthias-k/DeepGaze.git"
```

### 4. Configure environment (optional)

Create `backend/.env` to override the defaults or to enable AI Design Analysis:

```env
# Only needed for the optional AI Design Analysis feature
ANTHROPIC_API_KEY=sk-ant-your-key-here

# Defaults (change if needed)
DATABASE_URL=postgresql+asyncpg://olivas:olivas@localhost:5453/olivas
DEVICE=auto
CORS_ORIGINS=http://localhost:1577
```

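
To make the role of these variables concrete, here is a minimal sketch of how a backend could read them and resolve `DEVICE=auto`. It uses only the standard library; the real `config.py` may well use Pydantic settings instead. The names `Settings`, `load_settings`, and `resolve_device`, the comma-separated parsing of `CORS_ORIGINS`, and the CUDA-then-MPS-then-CPU fallback order are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Settings:
    database_url: str
    device: str
    cors_origins: List[str]
    anthropic_api_key: Optional[str]


def load_settings(env=None):
    """Build Settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    origins = env.get("CORS_ORIGINS", "http://localhost:1577")
    return Settings(
        database_url=env.get(
            "DATABASE_URL",
            "postgresql+asyncpg://olivas:olivas@localhost:5453/olivas",
        ),
        device=env.get("DEVICE", "auto"),
        # Assumption: multiple origins are separated by commas.
        cors_origins=[o.strip() for o in origins.split(",") if o.strip()],
        # An empty or missing key leaves the AI feature disabled.
        anthropic_api_key=env.get("ANTHROPIC_API_KEY") or None,
    )


def resolve_device(setting, cuda_available, mps_available=False):
    """Turn DEVICE=auto into a concrete device name (assumed fallback order)."""
    if setting != "auto":
        return setting
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```

With PyTorch installed, the availability flags could come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.
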
### 5. Start the backend

```bash
# From the repository root
cd backend
.venv/bin/uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
```

The backend loads all DeepGaze models on startup. The first run may take 30-60 seconds while the model weights are downloaded.

### 6. Set up and start the frontend

```bash
# From the repository root, in a second terminal
cd frontend
npm install
npm run dev
```

### 7. Open the app

Navigate to **http://localhost:1577** in your browser.

## Using the Makefile

For convenience, the project includes a Makefile:

```bash
make setup          # Install all backend + frontend dependencies
make dev-backend    # Start backend with hot reload
make dev-frontend   # Start frontend dev server
make db-up          # Start PostgreSQL container
make test           # Run backend tests
make lint           # Run ruff linter
make lint-fix       # Auto-fix linting issues
make clean          # Remove caches and virtual environments
```

## Docker Compose (Full Stack)

To run everything in Docker:

```bash
docker compose up --build
```

This starts PostgreSQL, the backend API, and the frontend. Access the app at **http://localhost:1577**.

## Project Structure

```
olivas/
├── backend/
│   ├── app/
│   │   ├── api/endpoints/          # FastAPI route handlers
│   │   ├── db/                     # Database session & connection
│   │   ├── models/                 # SQLAlchemy ORM models
│   │   ├── schemas/                # Pydantic request/response schemas
│   │   ├── services/
│   │   │   ├── saliency/           # DeepGaze model manager & inference
│   │   │   ├── ai_insights.py      # Claude AI integration
│   │   │   ├── insights.py         # Rule-based insights engine
│   │   │   ├── report_generator.py # PDF report generation
│   │   │   ├── heatmap.py          # Heatmap overlay generation
│   │   │   ├── gaze_sequence.py    # Gaze sequence extraction
│   │   │   ├── image_processing.py # Image resize & upscale
│   │   │   └── storage.py          # File storage abstraction
│   │   ├── config.py               # App settings (env vars)
│   │   └── main.py                 # FastAPI app entry point
│   └── pyproject.toml
├── frontend/
│   ├── src/
│   │   ├── api/                    # Axios API client & endpoints
│   │   ├── components/
│   │   │   ├── analysis/           # Heatmap, gaze, hotspots, insights
│   │   │   ├── aoi/                # Area of Interest canvas & results
│   │   │   ├── common/             # Button, Card, LoadingSpinner
│   │   │   └── layout/             # Header, Sidebar, AppLayout
│   │   ├── hooks/                  # React Query hooks
│   │   ├── pages/                  # Dashboard, NewAnalysis, AnalysisView, Help, About
│   │   ├── stores/                 # Zustand state management
│   │   └── types/                  # TypeScript interfaces
│   └── package.json
├── docker-compose.yml              # Production Docker setup
├── docker-compose.dev.yml          # Development Docker overrides
├── Makefile                        # Development shortcuts
└── LICENSE                         # MIT License
```

## Saliency Models

OliVAS uses the [DeepGaze](https://github.com/matthias-k/DeepGaze) family of saliency prediction models:

| Model | Architecture | Best For | Reference |
|-------|-------------|----------|-----------|
| **DeepGaze IIE** (recommended) | Ensemble of several ImageNet-pretrained CNN backbones | Best accuracy on benchmarks | [Linardos et al., ICCV 2021](https://arxiv.org/abs/2105.12441) |
| **DeepGaze III** | DenseNet features with a scanpath network conditioned on previous fixations | Complex layouts with many elements | [Kümmerer et al., J. Vision 2022](https://doi.org/10.1167/jov.22.5.7) |
| **DeepGaze I** | AlexNet features | Quick preliminary analysis | [Kümmerer et al., ICLR 2015](https://arxiv.org/abs/1411.1045) |

These models are trained on recorded human eye-tracking data and are among the top-performing models on the [MIT/Tübingen Saliency Benchmark](https://saliency.tuebingen.ai/).

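
A saliency map on its own does not give a viewing order. One classic way to derive an ordered gaze sequence from a map is winner-take-all selection with inhibition of return, in the spirit of Itti et al. (1998). The sketch below illustrates that technique on a plain 2D grid. It is not necessarily how `gaze_sequence.py` works, and the function name, the square suppression window, and the `radius` parameter are assumptions.

```python
def gaze_sequence(saliency, n_fixations=5, radius=1):
    """Greedy fixation order: pick the maximum, suppress its neighbourhood, repeat.

    `saliency` is a list of rows of non-negative floats. Returns a list of
    (x, y) points in predicted viewing order.
    """
    grid = [list(row) for row in saliency]  # work on a copy
    rows, cols = len(grid), len(grid[0])
    fixations = []

    for _ in range(n_fixations):
        # Winner-take-all: locate the current maximum (first one wins ties).
        best_r, best_c = 0, 0
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] > grid[best_r][best_c]:
                    best_r, best_c = r, c
        if grid[best_r][best_c] <= 0:
            break  # nothing salient is left

        fixations.append((best_c, best_r))

        # Inhibition of return: zero out a square window around the winner
        # so that the next fixation lands somewhere else.
        for r in range(max(0, best_r - radius), min(rows, best_r + radius + 1)):
            for c in range(max(0, best_c - radius), min(cols, best_c + radius + 1)):
                grid[r][c] = 0.0

    return fixations
```

On a real image the suppression radius would typically be scaled to the image size, so that consecutive fixations do not cluster on the same element.
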
## AI Design Analysis

When an Anthropic API key is configured, OliVAS can send the original image and heatmap overlay to **Claude Sonnet 4.6** for context-aware design analysis. The AI references specific visual elements in your design and provides actionable recommendations.

- Cost per analysis is tracked and displayed (typically $0.01-0.05 per image)
- AI insights are saved to the database and included in PDF reports
- This feature is entirely optional: rule-based insights always work without an API key

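
Per-analysis cost for a token-billed API is usually computed from the input and output token counts and the per-token rates. The following is a minimal sketch of that arithmetic. The function name is hypothetical, and the rates in the usage example are placeholders rather than actual Anthropic prices, so check the current price list before relying on any figure.

```python
def estimate_cost_usd(input_tokens, output_tokens, usd_per_mtok_in, usd_per_mtok_out):
    """Estimate the cost of one request from token counts.

    Rates are given in US dollars per million tokens.
    """
    total = input_tokens * usd_per_mtok_in + output_tokens * usd_per_mtok_out
    return total / 1_000_000


# Placeholder numbers for illustration only.
example = estimate_cost_usd(
    input_tokens=3000,
    output_tokens=500,
    usd_per_mtok_in=2.0,
    usd_per_mtok_out=10.0,
)
print(f"${example:.3f}")
```
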
## API Endpoints

| Method | Endpoint | Description |
|--------|---------|-------------|
| `POST` | `/api/projects` | Create a new project |
| `GET` | `/api/projects` | List all projects |
| `POST` | `/api/projects/{id}/analyses` | Upload image and start analysis |
| `GET` | `/api/analyses/{id}` | Get analysis details + insights |
| `GET` | `/api/analyses/{id}/status` | Poll analysis status |
| `GET` | `/api/analyses/{id}/images/{type}` | Get analysis images |
| `POST` | `/api/analyses/{id}/ai-insights` | Generate AI insights (on-demand) |
| `GET` | `/api/analyses/{id}/report` | Download PDF report |
| `POST` | `/api/analyses/{id}/aois` | Create Areas of Interest |
| `DELETE` | `/api/analyses/{id}` | Delete an analysis |

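
Because analysis runs asynchronously, a client typically uploads an image and then polls the status endpoint. Here is a minimal sketch of the polling half, using only the standard library. The `status` field name and the `"completed"` / `"failed"` values are assumptions, since the response schema is not documented here, and `get_json` and `wait_for_analysis` are hypothetical helper names.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8000"  # backend port used in the Quick Start


def get_json(path, base_url=BASE_URL):
    """GET a JSON document from the backend."""
    with urllib.request.urlopen(base_url + path) as response:
        return json.load(response)


def wait_for_analysis(analysis_id, fetch=get_json, interval=1.0, timeout=120.0,
                      sleep=time.sleep):
    """Poll the status endpoint until the analysis finishes or fails.

    The "status" key and its terminal values are assumed; adjust them to
    match the backend's actual response schema.
    """
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch(f"/api/analyses/{analysis_id}/status")
        status = payload.get("status")
        if status in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"analysis {analysis_id} still {status!r} after {timeout}s"
            )
        sleep(interval)
```

Once the status is terminal, the details, images, and PDF report can be fetched from the other `GET` endpoints in the table. The `fetch` and `sleep` parameters exist so the loop can be exercised without a running server.
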
## Academic References

- Kümmerer, M., Theis, L., & Bethge, M. (2015). "Deep Gaze I: Boosting Saliency Prediction with Feature Maps Trained on ImageNet." *ICLR 2015*. [arXiv:1411.1045](https://arxiv.org/abs/1411.1045)
- Kümmerer, M., Wallis, T.S.A., & Bethge, M. (2016). "DeepGaze II: Reading fixations from deep features trained on object recognition." [arXiv:1610.01563](https://arxiv.org/abs/1610.01563)
- Linardos, A., Kümmerer, M., Press, O., & Bethge, M. (2021). "DeepGaze IIE: Calibrated prediction in and out-of-domain for state-of-the-art saliency modeling." *ICCV 2021*. [arXiv:2105.12441](https://arxiv.org/abs/2105.12441)
- Kümmerer, M., Bethge, M., & Wallis, T.S.A. (2022). "DeepGaze III: Modeling free-viewing human scanpaths with deep learning." *Journal of Vision*, 22(5):7. [DOI:10.1167/jov.22.5.7](https://doi.org/10.1167/jov.22.5.7)
- Itti, L., Koch, C., & Niebur, E. (1998). "A Model of Saliency-Based Visual Attention for Rapid Scene Analysis." *IEEE TPAMI*, 20(11), 1254-1259. [DOI:10.1109/34.730558](https://doi.org/10.1109/34.730558)

## License

MIT License. See [LICENSE](LICENSE) for details.

---

Built with care by **OLIVER** creative teams.