OVHserver

SamoilenkoVadym 5f0be9a504 fix: optimize Prometheus and Alertmanager alerting		2025-11-21 16:08:03 +00:00

Fixed critical problems with excessive notifications:

1. Alertmanager (config.yml):
- group_wait: 10s → 30s (less spam from repeated alerts)
- group_interval: 10s → 5m (alerts are now grouped correctly)
- repeat_interval: 1h → 4h (repeat notifications are sent every 4 hours)
- Added grouping by severity and instance
- Fixed the Slack template so that alert details are displayed

2. Prometheus rules (alerts.yml):
- ContainerHighMemory: threshold 90% → 95%, for: 2m → 5m
- WebsiteDown: for: 1m → 10m (synchronized with scrape_interval)
- Added detailed descriptions to alerts

Result: the number of notifications dropped from 90+ to a minimum, and notifications now contain full information about the problem.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
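
The settings named in this commit could look roughly like the fragment below. This is a minimal sketch, not the repository's actual files. The timing values, the `severity` and `instance` grouping labels, the alert names, the thresholds, and the `for:` durations come from the commit message. Everything else is an assumption made for illustration: the receiver name `slack-notifications`, the rule group name, the `alertname` grouping label, the `severity` label values, the annotation wording, and both `expr` queries (which assume cAdvisor and blackbox exporter metrics).

```yaml
# --- Alertmanager: config.yml (fragment; receivers section omitted) ---
route:
  receiver: slack-notifications        # hypothetical receiver name
  group_by: ['alertname', 'severity', 'instance']  # 'alertname' is an assumption
  group_wait: 30s        # was 10s
  group_interval: 5m     # was 10s
  repeat_interval: 4h    # was 1h
---
# --- Prometheus: alerts.yml (fragment) ---
groups:
  - name: infrastructure               # hypothetical group name
    rules:
      - alert: ContainerHighMemory
        # Assumed query based on cAdvisor metrics; the real expression is not shown in the commit.
        expr: (container_memory_usage_bytes / container_spec_memory_limit_bytes) * 100 > 95
        for: 5m                        # was 2m; threshold was 90%
        labels:
          severity: warning            # assumed label value
        annotations:
          description: "Container {{ $labels.name }} on {{ $labels.instance }} uses more than 95% of its memory limit."
      - alert: WebsiteDown
        # Assumed query based on the blackbox exporter; the real expression is not shown in the commit.
        expr: probe_success == 0
        for: 10m                       # was 1m; aligned with scrape_interval per the commit
        labels:
          severity: critical           # assumed label value
        annotations:
          description: "{{ $labels.instance }} has failed its probe for 10 minutes."
```

A longer `for:` window keeps a rule in the pending state until the condition has held for the whole duration, so short spikes no longer fire. The larger `group_interval` and `repeat_interval` reduce how often an already-firing group is re-sent.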
grafana/dashboards	chore: initial infrastructure setup with Syncthing, Git and documentation	2025-11-05 16:41:12 +00:00
monitoring	fix: optimize Prometheus and Alertmanager alerting	2025-11-21 16:08:03 +00:00
portainer	chore: initial infrastructure setup with Syncthing, Git and documentation	2025-11-05 16:41:12 +00:00
uptime-kuma	feat: update Uptime Kuma to version 2.0.2	2025-11-20 21:19:30 +00:00
watchtower	chore: initial infrastructure setup with Syncthing, Git and documentation	2025-11-05 16:41:12 +00:00