Full-stack application for predicting where humans look in images using DeepGaze saliency models. Includes heatmap overlays, gaze sequence prediction, hotspot detection, AOI analysis, rule-based insights, optional Claude AI design analysis, and professional PDF report generation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|---|---|---|
| backend | ||
| frontend | ||
| .env.example | ||
| .gitignore | ||
| .python-version | ||
| docker-compose.dev.yml | ||
| docker-compose.yml | ||
| LICENSE | ||
| Makefile | ||
| README.md | ||
OliVAS — Open-Source Visual Attention Software
OliVAS (OLIVER Visual Attention Suite) is an open-source web application that predicts where humans will look in an image during the first 3-5 seconds of viewing. Built for creative teams, designers, and marketers at OLIVER, it provides saliency heatmaps, gaze sequence predictions, hotspot analysis, and actionable design insights — all without needing physical eye-tracking hardware.
Features
- Saliency Heatmap — Interactive heatmap overlay showing predicted attention intensity with adjustable opacity and colormap (Jet, Viridis, Inferno, etc.)
- Gaze Sequence Prediction — Numbered fixation points showing the predicted order viewers will scan the image
- Hotspot Detection — Top 5 attention regions ranked by intensity with bounding boxes
- Attention Score — Overall 0-100 concentration score measuring how focused or diffuse the predicted attention is
- Areas of Interest (AOI) — Draw rectangles over design elements to measure attention %, area %, and attention density
- Rule-Based Insights — Automatic analysis of attention concentration, focal dominance, gaze entry point, spatial balance, edge risk, and drop-off
- AI Design Analysis — Optional Claude Sonnet 4.6-powered insights that reference specific visual elements in your design with actionable recommendations
- PDF Reports — Professional downloadable reports with Montserrat typography, all visualizations, metrics, and insights (both rule-based and AI)
- Multi-Model Support — Choose between DeepGaze I, DeepGaze IIE (recommended), and DeepGaze III
- Project Organization — Group analyses into projects for easy management
- Comparison View — Side-by-side comparison of two analyses with metrics
Tech Stack
| Layer | Technology |
|---|---|
| Frontend | React 18, TypeScript, Vite, Tailwind CSS, Zustand, React Router |
| Backend | FastAPI, Python 3.12, SQLAlchemy (async), Pydantic v2 |
| Database | PostgreSQL 16 |
| ML Models | DeepGaze I / IIE / III via deepgaze_pytorch |
| AI Insights | Anthropic Claude Sonnet 4.6 (optional) |
| PDF Generation | ReportLab with Montserrat font |
| Deployment | Docker Compose |
Prerequisites
- Python 3.12+
- Node.js 18+
- Docker & Docker Compose (for PostgreSQL)
- Git
Quick Start
1. Clone the repository
git clone git@bitbucket.org:zlalani/olivas.git
cd olivas
2. Start PostgreSQL
docker compose up -d postgres
This starts PostgreSQL on port 5453 with database olivas.
3. Set up the backend
cd backend
python3.12 -m venv .venv
.venv/bin/pip install --upgrade pip
.venv/bin/pip install -e ".[dev]"
.venv/bin/pip install "deepgaze-pytorch @ git+https://github.com/matthias-k/DeepGaze.git"
4. Configure environment (optional)
Create backend/.env for optional settings:
# Required for AI Design Analysis feature (optional)
ANTHROPIC_API_KEY=sk-ant-your-key-here
# Defaults (change if needed)
DATABASE_URL=postgresql+asyncpg://olivas:olivas@localhost:5453/olivas
DEVICE=auto
CORS_ORIGINS=http://localhost:1577
5. Start the backend
cd backend
.venv/bin/uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
The backend will load all DeepGaze models on startup (this may take 30-60 seconds on first run as model weights are downloaded).
6. Set up and start the frontend
cd frontend
npm install
npm run dev
7. Open the app
Navigate to http://localhost:1577 in your browser.
Using the Makefile
For convenience, the project includes a Makefile:
make setup # Install all backend + frontend dependencies
make dev-backend # Start backend with hot reload
make dev-frontend # Start frontend dev server
make db-up # Start PostgreSQL container
make test # Run backend tests
make lint # Run ruff linter
make lint-fix # Auto-fix linting issues
make clean # Remove caches and virtual environments
Docker Compose (Full Stack)
To run everything in Docker:
docker compose up --build
This starts PostgreSQL, the backend API, and the frontend. Access the app at http://localhost:1577.
Project Structure
olivas/
├── backend/
│ ├── app/
│ │ ├── api/endpoints/ # FastAPI route handlers
│ │ ├── db/ # Database session & connection
│ │ ├── models/ # SQLAlchemy ORM models
│ │ ├── schemas/ # Pydantic request/response schemas
│ │ ├── services/
│ │ │ ├── saliency/ # DeepGaze model manager & inference
│ │ │ ├── ai_insights.py # Claude AI integration
│ │ │ ├── insights.py # Rule-based insights engine
│ │ │ ├── report_generator.py # PDF report generation
│ │ │ ├── heatmap.py # Heatmap overlay generation
│ │ │ ├── gaze_sequence.py # Gaze sequence extraction
│ │ │ ├── image_processing.py # Image resize & upscale
│ │ │ └── storage.py # File storage abstraction
│ │ ├── config.py # App settings (env vars)
│ │ └── main.py # FastAPI app entry point
│ └── pyproject.toml
├── frontend/
│ ├── src/
│ │ ├── api/ # Axios API client & endpoints
│ │ ├── components/
│ │ │ ├── analysis/ # Heatmap, gaze, hotspots, insights
│ │ │ ├── aoi/ # Area of Interest canvas & results
│ │ │ ├── common/ # Button, Card, LoadingSpinner
│ │ │ └── layout/ # Header, Sidebar, AppLayout
│ │ ├── hooks/ # React Query hooks
│ │ ├── pages/ # Dashboard, NewAnalysis, AnalysisView, Help, About
│ │ ├── stores/ # Zustand state management
│ │ └── types/ # TypeScript interfaces
│ └── package.json
├── docker-compose.yml # Production Docker setup
├── docker-compose.dev.yml # Development Docker overrides
├── Makefile # Development shortcuts
└── LICENSE # MIT License
Saliency Models
OliVAS uses the DeepGaze family of saliency prediction models:
| Model | Architecture | Best For | Reference |
|---|---|---|---|
| DeepGaze IIE (recommended) | ResNet + DenseNet ensemble | Best accuracy on benchmarks | Linardos et al., ICCV 2021 |
| DeepGaze III | Transformer-based | Complex layouts with many elements | Kummerer et al., J. Vision 2022 |
| DeepGaze I | AlexNet features | Quick preliminary analysis | Kummerer et al., ICLR 2015 |
These models are trained on thousands of real eye-tracking experiments and are among the top-performing models on the MIT/Tubingen Saliency Benchmark.
AI Design Analysis
When an Anthropic API key is configured, OliVAS can send the original image and heatmap overlay to Claude Sonnet 4.6 for context-aware design analysis. The AI references specific visual elements in your design and provides actionable recommendations.
- Cost per analysis is tracked and displayed (typically $0.01-0.05 per image)
- AI insights are saved to the database and included in PDF reports
- This feature is entirely optional — rule-based insights always work without an API key
API Endpoints
| Method | Endpoint | Description |
|---|---|---|
POST |
/api/projects |
Create a new project |
GET |
/api/projects |
List all projects |
POST |
/api/projects/{id}/analyses |
Upload image and start analysis |
GET |
/api/analyses/{id} |
Get analysis details + insights |
GET |
/api/analyses/{id}/status |
Poll analysis status |
GET |
/api/analyses/{id}/images/{type} |
Get analysis images |
POST |
/api/analyses/{id}/ai-insights |
Generate AI insights (on-demand) |
GET |
/api/analyses/{id}/report |
Download PDF report |
POST |
/api/analyses/{id}/aois |
Create Areas of Interest |
DELETE |
/api/analyses/{id} |
Delete an analysis |
Academic References
- Kummerer, M., Theis, L., & Bethge, M. (2015). "Deep Gaze I: Boosting Saliency Prediction with Feature Maps Trained on ImageNet." ICLR 2015. arXiv:1411.1045
- Kummerer, M., Wallis, T.S.A., & Bethge, M. (2016). "DeepGaze II: Reading fixations from deep features trained on object recognition." arXiv:1610.01563
- Linardos, A., Kummerer, M., Press, O., & Bethge, M. (2021). "DeepGaze IIE: Calibrated prediction in and out-of-domain for state-of-the-art saliency modeling." ICCV 2021. arXiv:2105.12441
- Kummerer, M., Bethge, M., & Wallis, T.S.A. (2022). "DeepGaze III: Modeling free-viewing human scanpaths with deep learning." Journal of Vision, 22(5):7. DOI:10.1167/jov.22.5.7
- Itti, L., Koch, C., & Niebur, E. (1998). "A Model of Saliency-Based Visual Attention for Rapid Scene Analysis." IEEE TPAMI, 20(11), 1254-1259. DOI:10.1109/34.730558
License
MIT License. See LICENSE for details.
Built with care by OLIVER creative teams.