| title | aliases | tags | sources | created | updated |
|---|---|---|---|---|---|
| LXC ARP Cache — Stale Entries Causing API Failures | | | | 2026-04-19 | 2026-04-19 |
LXC ARP Cache — Stale Entries Causing API Failures
LXC containers can hold stale ARP cache entries that map an IP address to an outdated or incorrect MAC address. When a container tries to reach another host on the same subnet (e.g., a sibling LXC), packets are sent to the wrong MAC and silently dropped. The symptom is API calls failing inside the container even though the same API is reachable externally — creating a maddening "works from my machine, not from the container" debugging scenario.
Key Points
- The ARP cache in an LXC container can contain wrong MAC addresses for other LXCs on the same host, especially after container restarts, IP reassignments, or Proxmox migrations (containers migrate in restart mode, not live)
- The failure is silent: the container sends packets, the network layer accepts them, but they never reach the destination (wrong MAC)
- External tools (curl from the Proxmox host, a laptop, another machine) work fine — the issue is specific to the LXC's own ARP table
- ip neigh flush dev eth0 clears the ARP cache inside the container and forces re-resolution
- arp -d <IP> deletes a single stale entry, which is useful for surgical fixes
Details
Why This Happens
An LXC container fills its ARP cache on demand: the first packet to a new on-subnet IP triggers an ARP broadcast, and the reply is cached. If another container on the same Proxmox host later comes back with a different MAC address on the same IP (for example, after being destroyed and recreated), the cache can still hold the old MAC. Packets destined for that IP are wrapped in Ethernet frames addressed to the old (now invalid) MAC. No port on the bridge (vmbr0) owns that MAC any more, so no interface accepts the frames and they are lost.
This is especially common in homelab environments where:
- Containers are frequently created, destroyed, and recreated (new MAC on same IP)
- DHCP reservations keep the IP stable but MAC changes with each new container
- IP addresses are manually assigned in /etc/network/interfaces without DHCP ARP updates
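One way to confirm the mismatch described above is to compare the MAC that the affected container has cached with the MAC the target really uses. The following is a minimal sketch, not a definitive tool: the helper name cached_mac is hypothetical, and it assumes the usual `ip neigh show` line layout (`IP dev IFACE lladdr MAC STATE`).

```shell
# cached_mac IP
# Reads `ip neigh show` output on stdin and prints the MAC cached for IP
# (lowercased). Prints nothing if IP has no entry with a link-layer address.
# The function name is hypothetical, chosen for this example.
cached_mac() {
    awk -v ip="$1" '
        $1 == ip {
            for (i = 2; i < NF; i++)
                if ($i == "lladdr") { print tolower($(i + 1)); exit }
        }'
}

# Usage inside the affected container (192.168.1.100 is an example address):
#   ip neigh show | cached_mac 192.168.1.100
# Then compare with the real MAC, read inside the target container:
#   cat /sys/class/net/eth0/address
```

If the two values differ, the cached entry is stale in the sense used on this page.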
Symptoms
The pattern that indicates ARP cache issues:
- A service inside LXC can't reach another LXC by IP
- ping <IP> from inside the affected LXC fails or shows high latency
- The same ping <IP> from the Proxmox host works
- curl https://<IP>:<port> from outside the container succeeds
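The symptom pattern above boils down to comparing two pings: one from inside the affected container and one from the Proxmox host. A rough sketch of that comparison follows. The function name diagnose and its output labels are hypothetical, and the usage comment assumes that `pct exec` passes through the exit status of the command it runs.

```shell
# diagnose INSIDE_RC HOST_RC
# Interprets the exit codes of two pings to the same target:
#   INSIDE_RC: ping run inside the affected container
#   HOST_RC:   ping run on the Proxmox host
# Function name and labels are hypothetical, for illustration only.
diagnose() {
    if [ "$1" -ne 0 ] && [ "$2" -eq 0 ]; then
        echo "suspect-arp"          # fails inside, works from the host
    elif [ "$1" -ne 0 ]; then
        echo "target-unreachable"   # fails from both sides
    else
        echo "ok"                   # works from inside the container
    fi
}

# Usage on the Proxmox host (container ID and IP are example values):
#   pct exec 103 -- ping -c 3 -W 1 192.168.1.100; inside=$?
#   ping -c 3 -W 1 192.168.1.100; host=$?
#   diagnose "$inside" "$host"
```

A "suspect-arp" result is only a hint; a firewall rule inside the container would produce the same pattern, so it is worth checking the neighbor table before flushing.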
The Fix
# Inside the affected LXC: flush all ARP entries learned on eth0
ip neigh flush dev eth0
# Or delete just the problematic entry (arp is part of the net-tools package)
arp -d 192.168.1.100
# Verify the cache is cleared
ip neigh show
After flushing, the next network request triggers a fresh ARP broadcast, gets the correct MAC, and the cache is repopulated correctly.
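The delete, re-resolve, and verify sequence can also be done for a single address with iproute2 alone, which helps when arp is not installed. This is a sketch under assumptions: the helper names _run and refresh_neighbor and the DRY_RUN switch are hypothetical, and eth0 is assumed as the default interface.

```shell
# _run CMD...
# Executes CMD, or only prints it when DRY_RUN=1 (hypothetical helper).
_run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# refresh_neighbor IP [DEV]
# Drops the cached entry for IP, sends one ping to trigger a fresh ARP
# lookup, then prints the repopulated entry. DEV defaults to eth0.
refresh_neighbor() {
    _run ip neigh del "$1" dev "${2:-eth0}"
    _run ping -c 1 -W 1 "$1"
    _run ip neigh show to "$1" dev "${2:-eth0}"
}

# Usage inside the affected container, as root (example address):
#   refresh_neighbor 192.168.1.100
# Preview the commands without running them:
#   DRY_RUN=1 refresh_neighbor 192.168.1.100
```

Deleting one entry leaves the rest of the cache intact, so other connections from the container are not forced to re-resolve.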
For Persistent Issues
If the problem recurs frequently, add a oneshot systemd service (or an equivalent cron @reboot job) that flushes the ARP cache on startup:
# /etc/systemd/system/arp-flush.service
[Unit]
Description=Flush ARP cache on boot
# Wants= pulls network-online.target in; After= only orders against it
Wants=network-online.target
After=network-online.target
[Service]
Type=oneshot
ExecStart=/usr/sbin/ip neigh flush dev eth0
[Install]
WantedBy=multi-user.target
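The unit only takes effect once systemd has loaded and enabled it. A minimal sketch of that step, assuming the file is saved at the path given in its first comment line and that the commands run as root inside the container; the function name enable_arp_flush is hypothetical:

```shell
# enable_arp_flush
# Reloads systemd's unit files, enables the service for future boots,
# runs it once now, and reports whether it is enabled.
# Function name is hypothetical; run as root inside the container.
enable_arp_flush() {
    systemctl daemon-reload &&
    systemctl enable arp-flush.service &&
    systemctl start arp-flush.service &&
    systemctl is-enabled arp-flush.service
}

# Usage:
#   enable_arp_flush
```

Because the unit is Type=oneshot, it exits as soon as the flush completes, so an inactive state after a successful start is expected.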
Context: Homepage Widget Failures (2026-04-19)
Homepage dashboard running in LXC 103 showed "API Error" on all custom widgets despite the APIs responding correctly when called from outside the container. After many debugging attempts (fixing SSL, fixing Proxmox widget config, checking env vars), the root cause was traced to ARP cache issues: LXC 103's ARP table had stale entries for LXC 104 and 105 IPs. After ip neigh flush dev eth0, API calls began working.
Related Concepts
- wiki/concepts/homepage-proxmox-widget-quirks — the Homepage dashboard context where ARP cache was the root cause
- wiki/concepts/nodejs-ssl-system-trust-store — the other failure mode for Homepage widget API calls (SSL)
- wiki/homelab/_index — Proxmox and LXC infrastructure context
Sources
- daily/2026-04-19.md — Homepage in LXC 103 had widespread API failures; ARP cache entries for LXC 104/105 were stale;
arp -d and ip neigh flush dev eth0 resolved the widespread widget errors