
Mitigating VM downtime post Proxmox host migration

I had a Proxmox VM migration leave services unreachable for a few minutes. The VMs had moved hosts cleanly, but traffic still went to the old switch port. That delay usually points at the physical switch holding old MAC or ARP entries. The steps below are practical checks and fixes I use when a Proxmox VM migration breaks VM accessibility.

What you see

Symptoms of VM inaccessibility

  • VM pings fail immediately after migration.
  • SSH or application ports do not respond for a short while.
  • ARP shows the wrong MAC for the VM IP on the gateway or host.

Common error messages

  • ping: sendmsg: Operation not permitted
  • ping: transmit failed. General failure
  • ping: sendto: Host is unreachable
  • bridge or kernel logs showing unknown neighbour or stale entries.

Timeframes for downtime

  • The gap is usually seconds to several minutes.
  • If the switch MAC/ARP aging is long, downtime matches that timer.
  • Cheap or unmanaged switches can take longer to relearn MAC addresses.

Where it happens

Network switch involvement

  • Physical switches learn MACs per port. After a Proxmox VM migration, the VM’s MAC moves to a different host and port. If the switch still maps that MAC to the old port, traffic goes the wrong way.
  • Managed switches let you view and clear MAC and ARP tables. Unmanaged switches do not, and they can be slow to adapt.

Impact of ARP table updates

  • Hosts and routers cache IP-to-MAC mappings in ARP. If the ARP entry points at the old MAC or is missing, traffic fails until ARP is refreshed.
  • Linux hosts also cache neighbours. The command ip neigh shows the kernel ARP/NDP cache; a short sketch for watching it change in real time follows this list.
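
If you want to watch the cache react while a migration is in progress, iproute2 can print neighbour events as they happen. A minimal sketch; run it on the gateway or on another host that talks to the VM:

    # Print neighbour (ARP/NDP) cache changes live during the migration
    ip monitor neigh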

VM accessibility issues

  • On the destination host the VM is running fine locally. Networking fails because upstream devices still send frames to the previous port.
  • That makes it look like migration broke the VM. In reality the VM is running; the network path to it is stale. The capture sketch after this list makes that visible.
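
One way to confirm this split is to capture on the destination host's bridge while pinging the VM from elsewhere. A minimal sketch, assuming the bridge is vmbr0 and the VM's MAC is the placeholder aa:bb:cc:dd:ee:ff used throughout this post:

    # On the destination Proxmox host: capture frames to and from the VM's MAC
    tcpdump -ni vmbr0 ether host aa:bb:cc:dd:ee:ff

If the upstream switch still forwards to the old port, you see the VM's outbound frames here but little or no inbound traffic from the gateway.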

Find the cause

Diagnosing switch configurations

  • On a managed switch, list MAC table entries. Cisco example (IOS displays MACs in dotted form, so use aabb.ccdd.eeff rather than aa:bb:cc:dd:ee:ff):

    show mac address-table address aabb.ccdd.eeff

    Expected: MAC mapped to the new host port. Actual problem: MAC points to the old port.

  • For ARP on an IP router:

    show ip arp | include 192.0.2.10

    Expected: IP mapped to MAC of destination host. Actual: old MAC or no entry.

Checking ARP and MAC table settings

  • On Linux, check ARP/neighbor entries:

    ip neigh show | grep 192.0.2.10

    Example outputs:

    • Good: 192.0.2.10 dev vmbr0 lladdr aa:bb:cc:dd:ee:ff REACHABLE
    • Bad: 192.0.2.10 dev vmbr0 INCOMPLETE
    • Stale: 192.0.2.10 dev vmbr0 lladdr aa:bb:cc:dd:ee:ff STALE
  • Check bridge FDB on the host:

    bridge fdb show

    Expected: the VM MAC is listed on the host where the VM now runs (see the example entry below). Actual: MAC on the other host or missing.
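
For reference, a healthy entry looks roughly like the line below. The interface name tap100i0 is a hypothetical example for VM 100's first NIC; your tap or port names will differ:

    bridge fdb show | grep -i aa:bb:cc:dd:ee:ff
    # aa:bb:cc:dd:ee:ff dev tap100i0 master vmbr0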

Identifying host migration problems

  • Confirm the VM’s MAC after migration. On Proxmox, qm config lists the VM’s virtual NICs and their MACs (virsh belongs to libvirt and is not normally installed on Proxmox):

    qm config <vmid> | grep ^net
    ip link show dev vmbr0   # confirm the bridge itself is up

  • Confirm the service inside the VM is listening. From the VM’s console on the destination host:

    ss -tlnp | grep :22

  • If the VM listens locally but the network path fails, the problem is external to Proxmox. That points at switch MAC learning or ARP caching; the triage sketch below runs these checks together.
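
To localize the failure quickly, the checks above can be strung together. A rough sketch, assuming the example IP 192.0.2.10, MAC aa:bb:cc:dd:ee:ff, and bridge vmbr0; adjust to your environment:

    #!/bin/sh
    # Quick triage on the destination host: local state first, then reachability.
    VM_IP=192.0.2.10
    VM_MAC=aa:bb:cc:dd:ee:ff

    echo "--- bridge FDB entry for the VM MAC (should be on this host) ---"
    bridge fdb show | grep -i "$VM_MAC"

    echo "--- neighbour cache entry for the VM IP ---"
    ip neigh show | grep "$VM_IP"

    echo "--- reachability from this host ---"
    ping -c 2 -W 1 "$VM_IP"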

Fix

Steps to clear ARP tables

  • On the Linux hosts:

    ip neigh flush all
    bridge fdb flush dev vmbr0   # needs iproute2 with fdb flush support; on older versions delete entries individually

    Or target a single IP or MAC:

    ip neigh del 192.0.2.10 dev vmbr0
    bridge fdb del aa:bb:cc:dd:ee:ff dev <bridge-port> master   # <bridge-port> is where the entry was learned, e.g. the uplink NIC or the VM's tap device

    Expected result: ip neigh shows a new entry after traffic; bridge fdb shows MAC on the correct host.

  • On common switches:

    • Cisco:

    clear mac address-table dynamic address aabb.ccdd.eeff
    clear arp-cache

    • Other vendors use similar commands. If you cannot clear per-MAC, clear the dynamic table or reboot the switch port.

Forcing ARP updates from the VM or host

  • From the VM (preferred, since it advertises the VM’s own MAC) or from its host, send a gratuitous ARP:

    arping -c 3 -A -I eth0 192.0.2.10

    Or use:

    ip neigh replace 192.0.2.10 lladdr aa:bb:cc:dd:ee:ff dev eth0 nud reachable

    Expected: the gateway and switch relearn the new MAC quickly, and traffic is usually restored immediately once the gratuitous ARP is accepted. If you cannot log into the guest, see the sketch below for triggering it via the guest agent.
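
If the VM runs the QEMU guest agent, the gratuitous ARP can also be triggered from the Proxmox host without logging into the guest. A sketch, assuming the agent and arping are installed inside VM 100 and its interface is eth0:

    # Run arping inside the guest via the QEMU guest agent
    qm guest exec 100 -- arping -c 3 -A -I eth0 192.0.2.10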

Adjusting switch settings

  • Reduce MAC aging or ARP cache timers on the switch to speed relearning. Typical commands vary by vendor. Example:

    • On Cisco switches, lower the MAC address-table aging time (older IOS uses the hyphenated mac-address-table form):

    mac address-table aging-time 120

    • Match timers to how often you migrate VMs. If live migration is frequent, use a lower timer.
  • On unmanaged or cheap switches, replace with a managed unit if the problem recurs. Cheap switches can have buggy MAC learning.

Testing VM connectivity

  • After clearing entries and sending gratuitous ARP, test:

    ping -c 3 192.0.2.10
    arp -n | grep 192.0.2.10
    ip neigh show 192.0.2.10
    bridge fdb show | grep aa:bb:cc:dd:ee:ff

    Expected: ping success, ARP maps to correct MAC, bridge fdb lists MAC on destination host.

Check it’s fixed

Confirming VM accessibility post-fix

  • Run a sequence of checks after a migration (a scripted remote check follows the list):
    1. Confirm the VM is running on the destination host: qm status <vmid>
    2. Confirm VM network interface MAC on that host: bridge fdb show | grep aa:bb:cc:dd:ee:ff
    3. From a remote node, ping and perform a TCP connect: nc -vz 192.0.2.10 22
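
As a sketch, the remote-side checks can be scripted from any machine outside the destination host (192.0.2.10 and port 22 are the example values used above):

    #!/bin/sh
    # Post-migration verification from a remote machine
    VM_IP=192.0.2.10

    ping -c 3 "$VM_IP" || echo "ICMP still failing"
    nc -vz -w 3 "$VM_IP" 22 || echo "TCP port 22 still unreachable"
    ip neigh show | grep "$VM_IP"   # confirm this machine's ARP entry is fresh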

Monitoring ongoing performance

  • Watch for repeat occurrences. If downtime repeats on every Proxmox VM migration, the switch is the likely root cause.

  • Automate a small post-migration script on the destination host to send gratuitous ARP and flush local caches:

    #!/bin/sh
    # Post-migration network refresh on the destination host.
    # VM_IP must be set by the caller; vmbr0 is the default Proxmox bridge name.
    : "${VM_IP:?set VM_IP to the migrated VM's IP address}"

    ip neigh flush all
    bridge fdb flush dev vmbr0   # needs iproute2 with fdb flush support; skip on older hosts
    arping -c 3 -A -I vmbr0 "$VM_IP"

    Run this as a hook after migration; an invocation example follows.
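
For example, invoked by hand after migrating the VM at 192.0.2.10 (the script name post-migration-arp.sh is only an illustration):

    VM_IP=192.0.2.10 sh post-migration-arp.sh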

Documenting troubleshooting steps

  • Log the exact commands run and their outputs. Keep the switch MAC and ARP dumps with timestamps; they show where traffic was actually being forwarded during the event.
  • Note the switch model and firmware. If a particular switch model fails to relearn MACs, record that for replacement planning.

Root cause and remediation summary

  • Root cause is usually stale MAC or ARP entries on the physical switch or upstream router after Proxmox VM migration.
  • Remediation: clear MAC/ARP entries, send gratuitous ARP from the VM or host, or reduce aging timers. Replace unmanaged switches that do not relearn quickly.

If the VM still fails after these checks, probe further: capture traffic on the old and new switch ports, confirm VLAN configuration, and check for asymmetric routing. The steps above fix the common case where Proxmox VM migration succeeds but the physical network has not updated its MAC/ARP state.
