vault backup: 2026-05-03 18:46:49

Vadym Samoilenko 2026-05-03 18:46:49 +01:00
parent b0fcc4008c
commit d000f646f0
3 changed files with 37 additions and 81 deletions

@ -4,7 +4,7 @@
"syncFolder": "Hoarder",
"attachmentsFolder": "Hoarder/attachments",
"syncIntervalMinutes": 60,
"lastSyncTimestamp": 1777826360368,
"lastSyncTimestamp": 1777829889811,
"updateExistingFiles": false,
"excludeArchived": true,
"onlyFavorites": false,

@ -6,78 +6,8 @@ Commands that need to be run on servers. Move to **Done** after confirmation.
## Pending
### P0 — Do immediately
#### pve: Update router DHCP DNS
On TP-Link AX72 web UI → Network → DHCP Server → Primary DNS:
```
Change: 192.168.1.62 → 192.168.1.225
Secondary DNS: 1.1.1.1
```
_Why: CT101 (192.168.1.62) is legacy, CT102 Docker AdGuard is the new DNS_
#### pve: Fix CT105 Immich GPU (2 commands)
```bash
ssh pve "pct set 105 --delete dev1 && pct set 105 --delete dev2"
ssh pve "pct start 105"
```
_Why: Host only has card1 + renderD128. CT105 conf had dev1=renderD129 and dev2=card0 which don't exist_
#### pve: Free CT102 thin-pool (fstrim)
```bash
ssh pve "pct exec 102 -- fstrim -av"
ssh pve "pct exec 102 -- systemctl enable --now fstrim.timer"
```
_Why: vm-102-disk-0 shows 99.39% thin-pool allocation, df shows only 36% — blocks were written then deleted but not trimmed_
#### pve: Destroy legacy LXCs (after router DNS update)
```bash
# First verify CT101 is no longer needed as DNS (router updated)
ssh pve "pct stop 101 && pct destroy 101"
```
_Why: CT101 is legacy AdGuard, replaced by CT102 Docker AdGuard_
#### CT102: Fix dead NPM proxy hosts (disable/delete)
```bash
# In NPM admin (http://192.168.1.225:81):
# Delete or disable: id=5 (flow.ai-impress.com → dead), id=6 (ssh → dead),
# id=8 (grafana → dead), id=12 (auth → Authentik deleted)
# Update: id=10 (dns.ai-impress.com) → change backend to 192.168.1.225:8053
```
#### CT102: Fix Stirling-PDF OIDC
```bash
ssh pve "pct exec 102 -- bash -lc 'cd /opt/services/stirling-pdf && docker compose down && \
sed -i \"s/SECURITY_OAUTH2_ENABLED=true/SECURITY_OAUTH2_ENABLED=false/g\" docker-compose.yml && \
docker compose up -d'"
# If no env var, manually edit docker-compose.yml: set SECURITY_OAUTH2_ENABLED=false
```
### P1 — This week
#### CT102: Restrict docker-socket-proxy to localhost only
```bash
# Edit /opt/services/<socket-proxy compose> or wherever it's defined
# Change: "0.0.0.0:2376:2375" → "127.0.0.1:2376:2375"
# Then: docker compose up -d --force-recreate
```
_Why: Exposes Docker API to entire LAN on 0.0.0.0:2376 — security risk_
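The port change above could be scripted roughly as follows. This is a minimal sketch: the function name `bind_proxy_to_localhost` and the compose file path passed to it are hypothetical, and it assumes the mapping appears in the file exactly as the quoted string `"0.0.0.0:2376:2375"`.

```bash
# Sketch only: rewrite the published port so the Docker API proxy listens on loopback.
# Assumption: the mapping is written exactly as "0.0.0.0:2376:2375" in the compose file.
bind_proxy_to_localhost() {
  compose_file="$1"
  # Keep a .bak copy so the change is easy to revert.
  sed -i.bak 's/"0\.0\.0\.0:2376:2375"/"127.0.0.1:2376:2375"/' "$compose_file"
  # Print the resulting mapping for a quick visual check.
  grep -n '2376:2375' "$compose_file"
}
```

After the rewrite, recreate the container with `docker compose up -d --force-recreate` as described above; `ss -tlnp | grep 2376` inside CT102 should then show the listener on 127.0.0.1 only.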
#### pve: Enable Tailscale subnet-router (LAN access remotely)
```bash
ssh pve "tailscale up --advertise-routes=192.168.1.0/24 --accept-routes"
# Then: approve the subnet in Tailscale admin console (https://login.tailscale.com/admin/machines)
```
_Why: Currently no subnet route — LAN-only services not accessible when remote_
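A subnet router also needs kernel IP forwarding enabled on the host, otherwise the advertised route carries no traffic. The sketch below persists the two forwarding settings; the function name `enable_ip_forwarding` is hypothetical, and the default path `/etc/sysctl.d/99-tailscale.conf` is the drop-in file recorded in the Done table.

```bash
# Sketch only: persist IP forwarding for a Tailscale subnet router.
# Assumption: settings live in /etc/sysctl.d/99-tailscale.conf (override via $1).
enable_ip_forwarding() {
  conf="${1:-/etc/sysctl.d/99-tailscale.conf}"
  for key in net.ipv4.ip_forward net.ipv6.conf.all.forwarding; do
    # Append each setting only once so reruns stay idempotent.
    grep -qs "^$key = 1\$" "$conf" || echo "$key = 1" >> "$conf"
  done
}
```

Run as root on pve, this would be followed by `sysctl -p /etc/sysctl.d/99-tailscale.conf` to apply the settings without a reboot.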
#### CT102: Configure Promtail for Loki
```bash
# Create /opt/monitoring/promtail-config.yml
# Add to /opt/monitoring/docker-compose.yml: promtail service
# Loki URL: http://loki:3100
```
_Why: Loki running but no Promtail — logs not aggregated_
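One possible shape for the Promtail config is sketched below, written out by a small shell helper. The function name `write_promtail_config`, the listen port, the positions file location, and the `container` label are assumptions for illustration; only the Loki address `http://loki:3100` comes from the notes above. It relies on Promtail's Docker service discovery, so the Promtail container would need the Docker socket mounted.

```bash
# Sketch only: emit a minimal Promtail config that ships Docker container logs to Loki.
# Assumptions: Docker socket at /var/run/docker.sock, positions file under /tmp.
write_promtail_config() {
  cat > "$1" <<'EOF'
server:
  http_listen_port: 9080
positions:
  filename: /tmp/positions.yaml
clients:
  - url: http://loki:3100/loki/api/v1/push
scrape_configs:
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
    relabel_configs:
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: container
EOF
}
```

For example, `write_promtail_config /opt/monitoring/promtail-config.yml` would create the file named in the step above.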
#### CT102: Add CrowdSec bouncer for NPM
```bash
# Install nginx-proxy-manager bouncer for crowdsec
@ -85,24 +15,42 @@ _Why: Loki running but no Promtail — logs not aggregated_
```
_Why: CrowdSec running but no bouncer — IPS observing but not blocking_
#### pve: vzdump restore drill
```bash
# Test restoring a backup to verify backups work
ssh pve "qmrestore /mnt/usb-backup/dump/<latest-vm200-backup>.vma.zst 299 --storage local-lvm"
ssh pve "qm start 299 && qm status 299"
ssh pve "qm stop 299 && qm destroy 299"
```
_Why: vzdump runs daily but restore procedure never tested_
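For containers the equivalent drill uses `pct restore` rather than `qmrestore`. The helper below only prints the commands so they can be reviewed before anything is run on pve. The function name `restore_drill_plan`, the spare ID 999, and the `local-lvm` storage are assumptions; the backup archive path is supplied by the caller.

```bash
# Sketch only: print the commands for a container restore drill (nothing is executed).
# Assumptions: spare ID 999 and storage local-lvm; the archive path comes from the caller.
restore_drill_plan() {
  archive="$1"
  test_id="${2:-999}"
  echo "pct restore $test_id $archive --storage local-lvm"
  # Inspect the restored config (hostname, rootfs) without starting the container.
  echo "pct config $test_id"
  echo "pct destroy $test_id --purge"
}
```

Leaving the restored container stopped keeps it from colliding with the original's IP address while the config is checked.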
### P2 — Phase 3: Review all app configs
- Review each service in CT102 /opt/services/*/docker-compose.yml against checklist (restart, healthcheck, logging, networks, secrets)
- Special: Nextcloud (cron container), Vaultwarden (SIGNUPS_ALLOWED=false), Paperless (OCR_LANGUAGE=eng+rus)
### P3 — Phase 4: *arr stack + Russian content
- Add Bazarr (CT111), Recyclarr (CT111), Readarr (CT111)
- Configure Sonarr/Radarr custom formats for Russian audio (score +200)
- Configure Prowlarr: add rutracker, kinozal, rutor, NNM-Club
- qBit: set listening port 50000, add router Virtual Server 50000 TCP+UDP → 192.168.1.230:50000
- Jellyfin: add Sonarr/Radarr connect webhooks for instant library scan
- Jellyfin: set default audio/subtitle language to Russian
### P4 — Phase 5: Dashboards A/B/C
- Rebuild Glance (4 pages: Home/Infrastructure/Media/Monitoring), add power widget (RAPL/Prometheus)
- Deploy Dashy on port 8086 at dashy.ai-impress.com
- Deploy Dashbrr on port 8087 at dashbrr.ai-impress.com
- After comparison: keep 1-2, destroy others
---
## Done
_(Move entries here after confirmation)_
| Date | Command | Result |
|------|---------|--------|
| 2026-05-03 | Live audit of pve server | Completed — all files updated in Obsidian |
| 2026-05-03 | Router DNS updated | 192.168.1.62 → 192.168.1.225 (done by user) |
| 2026-05-03 | CT105 Immich GPU fix | Already fixed (native LXC, dev1/dev2 removed, immich running) |
| 2026-05-03 | CT102 fstrim | 99.39% → 35.81%, issue_discards=1 enabled in lvm.conf |
| 2026-05-03 | CT101 destroyed | pct stop 101 && pct destroy 101 --purge |
| 2026-05-03 | NPM dead proxies removed | id=5,6,8,12,20 deleted; id=10 updated to :8053; id=26 trimmed |
| 2026-05-03 | Stirling-PDF OIDC | Already fixed (SECURITY_ENABLELOGIN=false, no OAuth in compose) |
| 2026-05-03 | docker-socket-proxy → localhost | Recreated with -p 127.0.0.1:2376:2375 |
| 2026-05-03 | rpcbind :111 closed | systemctl disable --now rpcbind rpcbind.socket |
| 2026-05-03 | Tailscale subnet-router | 192.168.1.0/24 advertised + approved in admin console; IP forwarding enabled in /etc/sysctl.d/99-tailscale.conf |
| 2026-05-03 | Promtail for Loki | Added to /opt/monitoring/docker-compose.yml, container running, Docker targets discovered |
| 2026-05-03 | vzdump restore drill | CT102 backup restored as CT999, hostname verified, CT999 destroyed |
---
@ -111,3 +59,5 @@ _(Move entries here after confirmation)_
- Commands for CT102 Docker services always via: `ssh pve "pct exec 102 -- bash -lc '...'"` or `ssh pve "pct exec 102 -- docker compose -f /path/to/compose.yml ..."`
- After any DNS change: flush the DNS cache on clients and wait for DHCP lease renewal (24h default)
- NPM admin: http://192.168.1.225:81 (password: check ~/.secrets/ on local machine)
- fstrim for CT disks (run on the pve host; the container may stay running): `mount /dev/mapper/pve-vm--<ID>--disk--0 /mnt/trim-ct<ID> && fstrim -v /mnt/trim-ct<ID> && umount /mnt/trim-ct<ID>`
- Tailscale subnet approved in admin: https://login.tailscale.com/admin/machines → pve → Edit route settings
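
The fstrim note above can be wrapped in a small helper for use on the pve host. This is a sketch under stated assumptions: the function names `ct_disk_path` and `trim_ct_disk` and the `DRY_RUN` switch are hypothetical, and the device naming and mount point follow the pattern in the note. By default it only prints the commands.

```bash
# Sketch only: trim a container disk from the pve host, following the note above.
# Assumptions: LVM naming pve-vm--<ID>--disk--0 and mount point /mnt/trim-ct<ID>.
# Dry run by default: commands are printed unless DRY_RUN=0 is set.
ct_disk_path() {
  printf '/dev/mapper/pve-vm--%s--disk--0\n' "$1"
}

trim_ct_disk() {
  id="$1"
  dev="$(ct_disk_path "$id")"
  mnt="/mnt/trim-ct$id"
  run="echo"
  [ "${DRY_RUN:-1}" = "0" ] && run=""
  $run mkdir -p "$mnt" && $run mount "$dev" "$mnt" || return 1
  # Always unmount, even if fstrim itself fails.
  $run fstrim -v "$mnt"; status=$?
  $run umount "$mnt"
  return "$status"
}
```

`trim_ct_disk 102` prints the four commands for CT102; `DRY_RUN=0 trim_ct_disk 102` would execute them as root.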

@ -38,3 +38,9 @@ tags: [daily]
- 17:57 (19min) | `aimpress`
- **Asked:** Audit PVE server containers, services, and network configurations to document setup and identify improvements | infrastructure/_index.md, homelab/_index.md, Known Issues section
- **Done:** Indexed all containers and services, documented local vs internet-accessible resources, and updated infrastructure documentation with current live data | infrastructure/_index.md, homelab/_index.md
- 18:45 (30min) | `aimpress`
- **Asked:** Complete Proxmox homelab audit, document all containers/services, identify issues and create improvement plan with focus on *arr stack, qBittorrent, and Glance dashboard setup.
- **Done:** Conducted comprehensive server inventory, validated Tailscale configuration parameters, executed successful restore drill, and documented completed tasks in Obsidian.
- 18:45 | `aimpress`
- **Asked:** Conduct a comprehensive audit of the Proxmox homelab server, document all containers and services with their configurations, and identify issues and duplicates.
- **Done:** Completed Phase 1 and Phase 2 improvements including container cleanup, storage optimization, security fixes, and monitoring setup across 10+ tasks.