---
tags: [infrastructure, index]
updated: 2026-04-27
---
# Infrastructure — Index
Server inventory for all SSH-accessible machines. Last audited: 2026-04-24. Update this section whenever you SSH in and notice changes.
## Oliver Agency Servers (GCP)
| Article | Server | IP | Role |
|---------|--------|----|------|
| [[wiki/infrastructure/server-optical\|server-optical]] | optical-web-1 | 10.220.168.5 | Main AI prod — 35+ apps, systemd |
| [[wiki/infrastructure/server-optical-dev\|server-optical-dev]] | optical-dev | 10.220.168.9 | Docker staging — ppt-tool, cc-dashboard, semblance, 15+ apps |
| [[wiki/infrastructure/server-optical-prod\|server-optical-prod]] | optical-prod | 10.220.168.8 | Minimal / secondary prod |
| [[wiki/infrastructure/server-librechat\|server-librechat]] | librechat-dev + prod | 10.220.168.2 / .4 | LibreChat AI chat platform (both envs) |
| [[wiki/infrastructure/server-modocmms\|server-modocmms]] | modcomms-01 | 10.220.168.6 | ModoCMMS staging + prod (Apache) |
| [[wiki/infrastructure/server-baic\|server-baic]] | web-03 | 10.220.72.13 | Main web host — 40+ domains, oliver.agency |
| [[wiki/infrastructure/server-box-cli\|server-box-cli]] | box-cli-01 | 10.220.176.3 | Ford/L'Oréal hotfolder, CentOS 7, 1TB NFS |
## Personal / Aimpress
| Article | Server | IP | Role |
|---------|--------|----|------|
| [[wiki/infrastructure/server-aimpress\|server-aimpress]] | c2-15-uk1 | 57.128.160.249 | Aimpress VPS — Mailcow, n8n, Traefik |
| [[wiki/infrastructure/server-pve\|server-pve]] | pve | 192.168.1.48 | Proxmox homelab — 8 containers + Kali VM |
## Quick Reference
| Article | Purpose |
|---------|---------|
| [[wiki/infrastructure/ssh-aliases\|ssh-aliases]] | All aliases, IPs, keys, health-check one-liner |
| [[wiki/infrastructure/network-topology\|network-topology]] | Internet→router→NPM→services flow, LAN subnet map, DNS paths, Tailscale overlay |
## ⚠ Known Issues
> Add date when you discover an issue. Move to ✅ Resolved when fixed, then delete after 2 weeks.
### 🔴 Critical
- `optical` `2026-04-24` **DISK 99% FULL** — 5.9 GB free on 533 GB. Top offenders: `/opt/ferrero-opentext` 12 GB, `/opt/backups` 8.9 GB, `/opt/sandbox-notebookllamalm-nextjs` 8.5 GB — **action needed**
- `optical` `2026-04-24` **SSL cert expires May 8 2026** — ai-sandbox.oliver.solutions — renew before May 8
- `optical` `2026-04-24` **notebookllama-backend.service FAILED** — crashed, taking 8.5 GB disk
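
A minimal sketch for tracking the certificate deadline noted above, using only the Python standard library. The function names `days_until` and `fetch_not_after` are hypothetical helpers, and it is an assumption that ai-sandbox.oliver.solutions is reachable on port 443 from wherever the script runs.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until(not_after: str, now: datetime) -> int:
    """Whole days from `now` until a certificate's notAfter timestamp.

    `not_after` uses the format returned by getpeercert(),
    e.g. "May  8 12:00:00 2026 GMT". `now` must be timezone-aware.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the notAfter field of the certificate served by host:port.

    The default context verifies the chain, so an already expired
    certificate raises ssl.SSLCertVerificationError instead of returning.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


# Example call (needs network access):
#   days_until(fetch_not_after("ai-sandbox.oliver.solutions"),
#              datetime.now(timezone.utc))
```

Keeping the date arithmetic separate from the network call makes it possible to check the countdown logic offline.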
### 🟡 Capacity
- `librechat-prod` `2026-04-24` — data directory **197 GB** (484 GB total, 65%) — monitor growth
- `pve` usb-backup `2026-05-03` **37.58%** (345 GB/916 GB) — was 12% — growing fast, check vzdump retention
- `pve` vm-102-disk-0 `2026-05-03` — thin-pool 99.39% allocated — run `fstrim` in CT102 (df shows 36% — not urgent but should be cleaned)
- `aimpress` `2026-04-24` — 26.58 GB reclaimable Docker images (`docker image prune -a`)
- `baic` `2026-04-24` — large vhosts: ustudio.global 22 GB, ustudiostaging2 19 GB, ie.oliver.agency 13 GB
### 🟠 Security
- `optical` `2026-04-24` — All databases bound to `0.0.0.0`: Redis ×3 (:6379/:6380/:6399), PostgreSQL ×3 (:5432/:5433/:5437), MongoDB ×3 (:27017/:27019/:27021), Neo4j (:7474/:7475/:7687/:7688)
- `librechat-prod` `2026-04-24` — MongoDB :27017 on `0.0.0.0` — publicly exposed, no auth config found
- `baic` `2026-04-24` — PostgreSQL :5432 + rpcbind :111 on `0.0.0.0`
- `optical-dev` `2026-04-24` — PostgreSQL :5436/:5491/:5493 + olivas :8000 + cc-dashboard :8800 on `0.0.0.0`
- `baic` `2026-04-21` — Grafana default `admin:admin` password unchanged
- `pve` CT102 `2026-05-03` **docker-socket-proxy on 0.0.0.0:2376** — Docker API accessible on LAN (should be 127.0.0.1)
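
Findings like the ones above can be re-checked by scanning the output of `ss -tln` for sockets bound to a wildcard address. The sketch below assumes the usual iproute2 column layout (state, Recv-Q, Send-Q, local address); `wildcard_listeners` is a hypothetical helper name. A wildcard bind does not by itself prove public exposure, since a firewall may still block the port.

```python
WILDCARDS = {"0.0.0.0", "*", "[::]"}


def wildcard_listeners(ss_output: str) -> list[int]:
    """Return sorted TCP ports that listen on a wildcard address.

    `ss_output` is the text printed by `ss -tln`; header lines and
    anything that is not in LISTEN state are skipped.
    """
    ports = set()
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] != "LISTEN":
            continue
        # The local address is the fourth column, e.g. "0.0.0.0:6379".
        addr, _, port = fields[3].rpartition(":")
        if addr in WILDCARDS and port.isdigit():
            ports.add(int(port))
    return sorted(ports)


# Example call on a server:
#   import subprocess
#   out = subprocess.run(["ss", "-tln"], capture_output=True, text=True).stdout
#   print(wildcard_listeners(out))
```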
### 🔵 Maintenance
- `optical-dev` `2026-04-24` — hp-prod-tracker + dow-prod-tracker containers unhealthy (healthcheck misconfigured, apps running fine)
- `box-cli` `2026-04-24` — CentOS 7 EOL since Jun 2024 — needs OS migration
- `pve` CT105 `2026-05-03` **Immich STOPPED** — fix: `pct set 105 --delete dev1 && pct set 105 --delete dev2 && pct start 105`
- `pve` CT101 `2026-05-03` **Legacy AdGuard still running** — router DHCP DNS still points to 192.168.1.62, needs update to 192.168.1.225
- `pve` CT102 `2026-05-03` **Stirling-PDF broken** — OIDC points to deleted Authentik — fix: set `SECURITY_OAUTH2_ENABLED=false`
- `pve` CT102 `2026-05-03` **Loki without Promtail** — logs not flowing
- `pve` CT102 `2026-05-03` **CrowdSec without bouncer** — IPS observing but not blocking
- `pve` CT102 `2026-05-03` **5 dead NPM proxy hosts** — id=5,6,8,12 (delete), id=10 (change to CT102 AdGuard :8053)
- `pve` host `2026-05-03` — rpcbind :111 open on 0.0.0.0 — disable if no NFS: `systemctl disable --now rpcbind rpcbind.socket`
- `pve` `2026-05-03` — Tailscale no subnet-router — LAN not accessible remotely without port forwarding
### ✅ Resolved
- `pve` local-lvm `2026-05-03` — improved to 58.85% (was 71%) — old stale LXCs (CT103/104/107/109/110) destroyed
- `pve` CT102 (docker) — resolved 2026-04-24 — Docker data-root moved to `/mnt/data/docker`, now 51%
- `pve` CT105 (immich) — resolved 2026-04-24 — PostgreSQL + cache moved to data-hdd, now 62%
- `pve` — resolved 2026-04-24 — Proxmox security updates applied (libngtcp2, cluster libs)
- `optical` `2026-04-24` — SSL cert ai-sandbox.oliver.solutions — track separately (check if renewed)
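
The housekeeping rule for this section (delete resolved entries after 2 weeks) can be checked mechanically. The sketch below is one possible approach, not an existing tool: `stale_entries` is a hypothetical helper, and it assumes the first ISO date on each bullet is the date to measure from.

```python
import re
from datetime import date, timedelta

DATE_RE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def stale_entries(lines: list[str], today: date, max_age_days: int = 14) -> list[str]:
    """Return bullet lines whose first ISO date is older than `max_age_days`."""
    stale = []
    for line in lines:
        if not line.lstrip().startswith("- "):
            continue  # only list items count as entries
        match = DATE_RE.search(line)
        if match is None:
            continue  # undated entries are left for manual review
        entry_date = date(*map(int, match.groups()))
        if today - entry_date > timedelta(days=max_age_days):
            stale.append(line)
    return stale


# Example call: pass the lines under the Resolved heading and date.today().
```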