obsidian/wiki/infrastructure/server-pve.md
2026-05-03 17:54:04 +01:00

tags: infrastructure, server, proxmox, homelab, personal
updated: 2026-05-03
last_verified: 2026-05-03 (live audit)

pve — Proxmox VE Homelab

SSH alias: pve → root@192.168.1.48:22
Key: ~/.ssh/id_ed25519
Web UI: https://192.168.1.48:8006
Tailscale: 100.122.192.8 (remote access)

Overview

Home Proxmox VE server (HP EliteDesk 800 G3). Runs LXC containers for personal self-hosted services and homelab experimentation.

  • Platform: Bare-metal (home server)
  • OS: Proxmox VE 9.1.9 (kernel 6.17.13-3-pve)
  • CPU: Intel i5-7500 (4c/4t, VT-x + VT-d, no HT) — running at load avg ~2.5 (2026-05-03)
  • RAM: 24 GB DDR4 — 9.7 GB used, 2.2 GB free, 12 GB buff/cache (2026-05-03)
  • IP: 192.168.1.48 (LAN) / 100.122.192.8 (Tailscale)

Storage

| Pool | Type | Total | Used | % | Notes |
| --- | --- | --- | --- | --- | --- |
| data-hdd | LVM-thin | 5.6 TB | ~390 GB | 6.99% | HDD: CT102 data (300G), CT105 upload+data (250G), CT111 media (500G) |
| local | dir | 68 GB | 8.3 GB | 13% | NVMe: PVE OS |
| local-lvm | LVM-thin | 141 GB | ~83 GB | 58.85% | NVMe: all CT/VM root disks |
| usb-backup | dir | 916 GB | 345 GB | 37.58% | USB Toshiba: vzdump backups |

⚠ LVM-thin thin-pool alert — CT102 root disk

vm-102-disk-0 (CT102 root, 20 GB) shows 99.39% thin-pool data allocation in lvs output, while df inside the container reports only 36% used (6.7G/20G). The gap exists because fstrim has not been run: the disk was nearly full in April 2026, and although the files were deleted, the freed blocks were never returned to the thin pool.

Fix (when needed):

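# One-off trim of all mounted filesystems inside CT102
ssh pve "pct exec 102 -- fstrim -av"
# Or enable periodic trim. A single systemctl call keeps the whole command
# inside the container (a "&&" after pct exec would run on the host instead):
ssh pve "pct exec 102 -- systemctl enable --now fstrim.timer"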
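To catch a volume like this before it reaches 99%, the data-allocation column of lvs can be filtered against a threshold. The following is only a sketch: the function name `flag_full_thin_lvs` and the 90% default are assumptions for illustration, not part of the current setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print thin volumes whose data allocation exceeds a threshold.
# Input (stdin): lines of "<lv_name> <data_percent>", as produced by
#   lvs --noheadings -o lv_name,data_percent
# Volumes without a data_percent value (non-thin LVs) have no second field and are skipped.
flag_full_thin_lvs() {
  local threshold="${1:-90}"   # assumed default, adjust to taste
  awk -v limit="$threshold" \
    'NF >= 2 && ($2 + 0) > (limit + 0) { printf "%s %s%%\n", $1, $2 }'
}

# Usage from the workstation (not executed here):
#   ssh pve "lvs --noheadings -o lv_name,data_percent" | flag_full_thin_lvs 90
```

The function reads from stdin so it can be fed either live lvs output over SSH or a saved copy. With the values recorded above, it would print vm-102-disk-0.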

Virtual Machines

| VMID | Name | Status | RAM | Disk | Onboot |
| --- | --- | --- | --- | --- | --- |
| 200 | kali-linux | stopped | 8 GB | 60 GB | no (manual only) |

LXC Containers

| VMID | Name | IP | RAM | Cores | Status | Role |
| --- | --- | --- | --- | --- | --- | --- |
| 101 | adguard | 192.168.1.62 | 512 MB | 1 | running | Legacy: native AdGuard Home. Currently serves LAN DNS. Pending destroy. |
| 102 | docker | 192.168.1.225 | 9 GB | 4 | running | Main Docker host, 55+ containers |
| 105 | immich | 192.168.1.71 | 8 GB | 4 | stopped | Immich photos, GPU bug (see below) |
| 111 | media | 192.168.1.230 | 4 GB | 4 | running | Jellyfin + *arr + qBit (Intel iGPU) |
| 112 | n8n | 192.168.1.232 | 2 GB | 2 | running | n8n workflow automation |

CT103/104/107/109/110 already destroyed (beszel/vaultwarden/homarr/grafana/uptime-kuma — all migrated to CT102 Docker).

CT105 — GPU Fix (PENDING)

Host /dev/dri contains only: card1 and renderD128 (Intel HD 630). CT105 conf references renderD129 and card0 which do not exist → container fails to start.

Fix:

# Remove dev1 (renderD129) and dev2 (card0) — they don't exist on host
ssh pve "pct stop 105"  # if running
ssh pve "pct set 105 --delete dev1 && pct set 105 --delete dev2"
ssh pve "pct start 105"
# Keep: dev0 (renderD128) + dev3 (card1)
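Before deleting entries, it can help to confirm which devN lines in the container config point at device nodes that are missing on the host. A minimal sketch, assuming entries of the form `devN: /dev/path[,options]`; the function name `missing_lxc_devs` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list devN entries in an LXC config whose device node is missing.
# Input (stdin): container config text, e.g. the contents of /etc/pve/lxc/105.conf
# Must run on the Proxmox host itself, because paths are checked on the local filesystem.
missing_lxc_devs() {
  local line key path
  while IFS= read -r line; do
    case "$line" in
      \[*) break ;;              # stop at the first snapshot section
      dev[0-9]*:*)
        key="${line%%:*}"        # "dev1"
        path="${line#*:}"        # everything after the first colon
        path="${path# }"         # drop the single leading space, if any
        path="${path%%,*}"       # drop trailing options such as gid=...
        [ -e "$path" ] || printf '%s %s\n' "$key" "$path"
        ;;
    esac
  done
}

# Usage on the host (not executed here):
#   missing_lxc_devs < /etc/pve/lxc/105.conf
```

Each output line names a config key and the path it expects. Given the state described above, dev1 and dev2 would be the entries listed.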

Host Ports

| Port | Service | Binding |
| --- | --- | --- |
| 22 | SSH | 0.0.0.0 (all interfaces) |
| 8006 | Proxmox Web UI (HTTPS) | * (all) |
| 3128 | SPICE proxy | * |
| 9101 | node_exporter | * |
| 45876 | Beszel agent | * |
| 111 | rpcbind (NFS leftover) | 0.0.0.0; consider disabling if NFS is not in use |
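Leftover listeners such as rpcbind can be spotted by comparing the host's listening TCP ports against the expected list. This is a sketch under assumptions: the function name `unexpected_ports` is hypothetical, and it expects `ss -tlnH` output on stdin, where the fourth field is the local address and port.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report listening TCP ports that are not in an allow-list.
# Input (stdin): output of `ss -tlnH` (field 4 is "local-address:port").
# Arguments: the expected port numbers.
unexpected_ports() {
  local allowed=" $* " port
  # Take the text after the last ":" so IPv4, "*" and "[::]" addresses all work.
  awk 'NF >= 4 { n = split($4, a, ":"); print a[n] }' | sort -un |
    while IFS= read -r port; do
      case "$allowed" in
        *" $port "*) ;;                  # expected, stay quiet
        *) printf '%s\n' "$port" ;;      # not in the allow-list
      esac
    done
}

# Usage from the workstation (not executed here):
#   ssh pve "ss -tlnH" | unexpected_ports 22 8006 3128 9101 45876 25
```

Loopback-only listeners (for example Postfix on 127.0.0.1:25) also appear in ss output, so they need to be in the allow-list or they will be reported.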

Key Services on Host

  • Tailscale — remote access (100.122.192.8). No subnet-router advertised (as of 2026-05-03).
  • Beszel agent — system monitoring (:45876)
  • node_exporter — Prometheus metrics (:9101)
  • Postfix — local mail relay (127.0.0.1:25 only)

GPU passthrough

Host /dev/dri: card1 (Intel HD 630, minor 1) + renderD128 (Intel render node, minor 128). The AMD Radeon HD 8490 is detected as card0 but does NOT appear in /dev/dri (not loaded for passthrough).

| CT | GPU device | Status |
| --- | --- | --- |
| CT111 | card1 + renderD128 | working (QuickSync for Jellyfin) |
| CT105 | card1 + renderD128 (dev0+dev3) | ⚠️ blocked, needs dev1+dev2 removed |

Backup

vzdump job: daily at 12:20, mode snapshot, zstd compression, all VMs/LXCs → usb-backup. Config: /etc/pve/jobs.cfg

⚠ USB backup = single point of failure. Off-site backup (Backblaze B2 / Cloudflare R2 via Backrest) is pending (P2 task).
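Since the USB disk is the only backup target for now, a simple freshness check can flag a silently failing job. A minimal sketch, assuming the dump directory of the usb-backup storage is known (the path in the usage line is a placeholder); the function name `backup_is_fresh` and the 26-hour default are illustrative choices, not existing tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if a directory holds a vzdump archive
# modified within the last N minutes.
# Arguments: <dump_dir> [max_age_minutes]
#   Default 1560 min (26 h) is an assumed margin for a daily job.
backup_is_fresh() {
  local dir="$1" max_age="${2:-1560}" recent
  # Ignore the .log and .notes files that vzdump writes next to each archive.
  recent=$(find "$dir" -maxdepth 1 -type f -name 'vzdump-*' \
             ! -name '*.log' ! -name '*.notes' \
             -mmin "-$max_age" 2>/dev/null | head -n 1)
  [ -n "$recent" ]
}

# Usage on the host (placeholder path, not executed here):
#   backup_is_fresh /path/to/usb-backup/dump || echo "no vzdump archive in the last 26 h"
```

The check only looks at modification times, so it says nothing about whether an archive is restorable. It is meant as a cheap alarm, not a substitute for the pending off-site backup.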

Beszel Monitoring

Hub: CT102 Docker :8090 (beszel.ai-impress.com)

| System | IP | Port | Status |
| --- | --- | --- | --- |
| pve (host) | 192.168.1.48 | 45876 | up |
| adguard (CT101) | 192.168.1.62 | 45876 | up |
| docker (CT102) | 192.168.1.225 | 45879 | up |
| immich (CT105) | 192.168.1.71 | 45876 | ⚠️ stopped (CT stopped) |
| media (CT111) | 192.168.1.230 | 45876 | up |

Useful Commands

# List VMs and containers
ssh pve "qm list && pct list"

# Execute command in container
ssh pve "pct exec 102 -- docker ps"

# Start/stop container
ssh pve "pct start 105"
ssh pve "pct stop 101"

# Check storage utilisation
ssh pve "pvesm status && lvs --units g"

# Free thin-pool space (run after bulk deletes)
ssh pve "pct exec 102 -- fstrim -av"

# Check vzdump jobs
ssh pve "cat /etc/pve/jobs.cfg"

Key Takeaways (2026-05-03)

  • local-lvm at 58.85% — improved from 71% (old stale LXCs destroyed)
  • vm-102-disk-0 thin-pool 99.39% — needs fstrim in CT102 (not urgent, df shows 36%)
  • usb-backup 37.58% — grew from 12% to 37% — monitor retention policy
  • CT101 (old adguard) still running and serving LAN DNS — router DHCP still points to 192.168.1.62
  • CT105 Immich stopped — GPU fix is 2 commands (pct set --delete dev1/dev2)
  • Tailscale: no subnet router advertised, so LAN-only services are not reachable remotely
  • rpcbind :111 open on host — disable if NFS not in use