Hyper-V Live Migration is significantly more complex than live migration on ESXi. In this post we will go over the types of live migration, how to configure them, and how to handle common errors.
Types of Migration
There are 3 main types of live migrations:
- Hyper-V Manager Move VM. This can be a shared-nothing move. Non-HA VMs only
- Failover Cluster Manager Live Migration. HA VMs only
- SCVMM Migrate Virtual Machine. Any VM, only from SCVMM
Hyper-V Manager Move VM
PowerShell command: Move-VM
This is the most basic type of live migration, so it is commonly the first one people try. Unfortunately, it's also the most complex to configure.
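As a rough sketch of what the Move-VM command looks like in practice, the snippet below covers both the plain move and the shared-nothing variant. The VM name, destination host, and storage path are placeholders I made up for illustration, and the $sharedNothing switch is just a convenience for this example.

```powershell
# Hypothetical names and path: substitute your own.
$vmName        = 'test-vm'
$destHost      = 'hv-host-2'
$sharedNothing = $false

if ($sharedNothing) {
    # Shared-nothing move: migrate the VM and its storage together.
    Move-VM -Name $vmName -DestinationHost $destHost -IncludeStorage -DestinationStoragePath "D:\VMs\$vmName"
}
else {
    # Storage is already reachable from both hosts, so only the running VM moves.
    Move-VM -Name $vmName -DestinationHost $destHost
}
```

Neither form will work until the host and AD configuration described in this section is in place.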
Update 4/27: Check out my new post that does all this configuration in a simple script!
Hyper-V Manager Configuration
Open Hyper-V Manager and connect to your hosts. Click Hyper-V Settings in the right-hand pane.
Go to Live Migrations and make sure they are enabled. Ensure that the correct network is chosen for live migrations, and reorder the networks if needed.
You'll likely want to use Kerberos for authentication.
So what is the difference between Kerberos and CredSSP? With Kerberos you can initiate live migrations remotely, whereas with CredSSP you need to log in to the source host directly. Since you are managing these hosts remotely, you should use Kerberos. This does add a little extra configuration on the AD side, though.
Repeat this for each of the hosts and then proceed to the AD configuration.
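The same host settings can also be applied from PowerShell. This is a minimal sketch, meant to be run on each host (or remotely with -ComputerName); the subnet is a placeholder for whatever network you use for live migration.

```powershell
# Assumption: 10.0.10.0/24 stands in for your live migration subnet.
$migrationSubnet = '10.0.10.0/24'

# Turn on incoming and outgoing live migrations for this host.
Enable-VMMigration

# Use Kerberos, and only migrate over the networks listed explicitly.
Set-VMHost -VirtualMachineMigrationAuthenticationType Kerberos -UseAnyNetworkForMigration $false

# Register the live migration network.
Add-VMMigrationNetwork $migrationSubnet

# Confirm the result.
Get-VMHost | Format-List VirtualMachineMigrationEnabled, VirtualMachineMigrationAuthenticationType, UseAnyNetworkForMigration
Get-VMMigrationNetwork
```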
AD Configuration
When using Kerberos authentication, the source host's computer object in AD acts on behalf of your account to move the VM. To do that, it must be trusted for delegation. You don't want to give it unrestricted delegation (what if that computer got a virus?), so instead you constrain the delegation to specific services and hosts.
- Open Active Directory Users and Computers
- Navigate to the computer accounts for the hosts in the cluster and open the properties of each one
- Go to the "Delegation" tab. Once here, add "Microsoft Virtual System Migration Service" and "cifs" for all hosts in the cluster
- Live migration will not work until new tickets have been issued. To force this, run klist purge on the servers and log out of and back into your workstation. Make sure to fully log out, otherwise you won't get a new ticket!
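If you would rather script the delegation than click through it, something like the following should be equivalent. It is a sketch that assumes the ActiveDirectory module (RSAT) is installed and that you have rights to modify the computer objects; the host names and domain are made up. It writes the same service entries to the msDS-AllowedToDelegateTo attribute that the Delegation tab edits, using both the short and fully qualified names of every other host.

```powershell
Import-Module ActiveDirectory

# Hypothetical host names and DNS domain: substitute your own.
$hosts  = 'hv-host-1', 'hv-host-2'
$domain = 'corp.example.com'

foreach ($source in $hosts) {
    # Build the service names for every host except the one being configured.
    [string[]]$spns = foreach ($target in ($hosts | Where-Object { $_ -ne $source })) {
        "Microsoft Virtual System Migration Service/${target}"
        "Microsoft Virtual System Migration Service/${target}.${domain}"
        "cifs/${target}"
        "cifs/${target}.${domain}"
    }

    # Allow $source to delegate to those services only (constrained delegation).
    Set-ADComputer -Identity $source -Add @{ 'msDS-AllowedToDelegateTo' = $spns }
}

# Review what was written.
Get-ADComputer -Identity $hosts[0] -Properties 'msDS-AllowedToDelegateTo' |
    Select-Object -ExpandProperty 'msDS-AllowedToDelegateTo'
```

The ticket purge and the log out/log in are still needed afterwards.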
Failover Cluster Live Migration
PowerShell command: Move-ClusterVirtualMachineRole
If you are using HA VMs, you will not be able to move them with the Move-VM command or wizard. In this case, use Failover Cluster Manager to move VMs around.
No additional configuration is needed once the cluster is working (which is done as part of the setup_hyperv.py scripts). Simply open Failover Cluster Manager, right-click the VM role, and choose Move -> Live Migration -> Select Node. If the VM is not HA, you won't see it here (I'll have an article soon about making VMs HA).
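The PowerShell equivalent is short. In this sketch the VM role name and the target node are placeholders; run it on a cluster node, or add -Cluster to point at a remote cluster.

```powershell
Import-Module FailoverClusters

# List the VM roles the cluster knows about; non-HA VMs will not appear.
Get-ClusterGroup | Where-Object GroupType -eq 'VirtualMachine'

# Hypothetical role and node names: live-migrate the role to another node.
Move-ClusterVirtualMachineRole -Name 'test-vm' -Node 'hv-host-2' -MigrationType Live
```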
SCVMM Migrate VM
PowerShell command: Move-SCVirtualMachine
As with Failover Cluster Manager, most of the configuration here is just getting the hosts into SCVMM. This is done as part of the setup_hyperv.py script that is run during installation.
To initiate a move, simply right-click the VM and choose Migrate Virtual Machine.
One important thing to note is that when selecting the destination host, you can choose to make the VM HA or remove its HA status. This is the only way to change the HA status inside SCVMM after creating the VM.
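From the VMM PowerShell module, a migration that also changes the HA status might look like the sketch below. The VM name, host name, and path are invented for illustration, and I'm assuming the -HighlyAvailable parameter behaves the same way as the checkbox in the wizard, so try it on a test VM first.

```powershell
# Hypothetical VM, destination host, and storage path.
$vm     = Get-SCVirtualMachine -Name 'test-vm'
$vmHost = Get-SCVMHost -ComputerName 'hv-host-2'

# Migrate the VM and mark it HA on the destination (use $false to remove HA).
Move-SCVirtualMachine -VM $vm -VMHost $vmHost -Path 'C:\ClusterStorage\Volume1' -HighlyAvailable $true
```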
Troubleshooting Live Migration
Getting the Error Message
To find the specific error, there are a few places to look:
- The popup that appears when the migration fails
- The Events tab in SCVMM
- Show Critical Events in Failover Cluster Manager
- Get-WinEvent *vmms* -MaxEvents 5 | fl, run from PowerShell on the host (via Enter-PSSession, winsh, or RDP/IPMI)
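If you have PowerShell remoting enabled, the last option can be run from your workstation without opening a session first. A small sketch, with a made-up host name:

```powershell
# Hypothetical host name.
$hostName = 'hv-host-1'

# Pull the five most recent events from the VMMS logs on the remote host.
Invoke-Command -ComputerName $hostName -ScriptBlock {
    Get-WinEvent -LogName '*vmms*' -MaxEvents 5 |
        Select-Object TimeCreated, LogName, Id, LevelDisplayName, Message
} | Format-List TimeCreated, LogName, Id, LevelDisplayName, Message
```

Selecting the properties on the remote side and formatting locally keeps the output readable after it has crossed the remoting boundary.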
Once you have the error message it's time to investigate!
Problems With Hyper-V Configuration or Constrained Delegation
Affects: Move-VM
If you are having problems with this, the most common error you'll get is:
The Virtual Machine Management Service failed to establish a connection for a Virtual Machine migration with host
'cb-2': The credentials supplied to the package were not recognized (0x8009030D).
Failed to authenticate the connection at the source host: no suitable credentials available.
If you get this error message, make sure to do the following:
- Verify that Live Migration is set to Kerberos
- Verify that Constrained Delegation is set up correctly in AD
- Completely log out of the workstation and log back in
- Try rebooting the hosts and the management server
Target Principal Name Is Incorrect
This error indicates that the host tried to connect to a particular machine, but a different machine responded. If you get this error, check for the following:
- Duplicate IPs
- Duplicate DNS entries
- Using the incorrect network to do a live migration
Replication State is not initialized
In this case, simply restart the VMMS service: right-click the computer in Hyper-V Manager and choose Stop Service, then right-click it again and choose Start Service.
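The same restart can be done from PowerShell. A sketch, assuming remoting is enabled and using a placeholder host name:

```powershell
# Hypothetical host name. vmms is the Hyper-V Virtual Machine Management service.
Invoke-Command -ComputerName 'hv-host-1' -ScriptBlock {
    Restart-Service -Name vmms
    Get-Service -Name vmms   # should report Running again
}
```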
So that's Live Migration in a nutshell! I'll continue to update this article with more information as it becomes available (and as I hit new issues).
Failed to initialize the cluster role
Affects: SCVMM
This can happen if the VM was made HA/not-HA from outside of SCVMM. If this happens you can force remove just that VM from the SCVMM database with:
Get-SCVirtualMachine [VM Name] | Remove-SCVirtualMachine -Force
And then rescan the host. This won't affect the VM itself; it only removes the VM's record from the SCVMM database, and the rescan adds it back.
An existing connection was forcibly closed by the remote host
Affects: Any
This indicates that there was a problem on the networking side of things. Some things to check here are:
- If the hosts are configured for jumbo frames, verify that jumbo-sized pings get through between them with ping -l 8972 -f [IP Address]. You can check whether jumbo frames are configured with Get-NetAdapterAdvancedProperty -DisplayName "Jumbo Packet"
- Verify that VMQ is disabled, since it can cause problems in some live migrations. Check this with Get-NetAdapterVmq | fl Name,Enabled and disable it with Get-NetAdapter | Disable-NetAdapterVmq. There could be a short network outage when this happens, so don't do it during production hours.
- The computer account may have an expired password, or the host may have lost its trust relationship with the domain. To fix this, isolate the problematic host and shut down the CVM with shutdown_cvm now from the CVM's command line. Then try the following:
  - Run Reset-ComputerMachinePassword -Server <DCServerName> -Credential (Get-Credential) to reset the password. Reboot the server and try the live migration again.
  - If that doesn't work, reset the computer account password in Active Directory. Then log in to the host and run Test-ComputerSecureChannel -Repair -Credential (Get-Credential) -Verbose to repair the channel. Reboot the host.
  - You may need to repeat these steps on each node in the cluster
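Two of the checks above are easy to run before changing anything. The sketch below assumes a 9000-byte MTU on the live migration network, which is where the 8972 comes from: 9000 bytes minus a 20-byte IPv4 header and an 8-byte ICMP header. The peer address is a placeholder.

```powershell
# Hypothetical peer address on the live migration network.
$peer = '10.0.10.12'

# 9000-byte MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header.
$payload = 9000 - 20 - 8   # 8972

# -f sets Don't Fragment, so the ping fails if any hop cannot carry the full frame.
ping -l $payload -f $peer

# Test-ComputerSecureChannel returns $true when the host's trust with the domain is healthy,
# so the repair only runs when it is actually needed.
if (-not (Test-ComputerSecureChannel)) {
    Test-ComputerSecureChannel -Repair -Credential (Get-Credential) -Verbose
}
```

Run this on the host itself. If the secure channel test already passes, the computer account is probably not the cause and you can skip the password reset.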