I would like to share my experience with a vMotion problem that I ran into a while ago while upgrading a few hosts from ESXi 4.1 to ESXi 5.0.
The hosts were IBM x3850 M2 / x3950 M2 -[72334RG]- servers with an integrated dual-port Broadcom 5709C PCI Express Gigabit Ethernet controller, two dual-port ISP2432-based 4Gb Fibre Channel to PCI Express HBAs and four quad-port Intel Corporation 82571EB Gigabit Ethernet controllers. According to the VMware HCL, this hardware is fully compatible with vSphere 5.0.
First of all, vCenter and all its components were successfully upgraded to version 5.0.0, build 1917469. The next step was to upgrade one ESXi host in a cluster of three. For the upgrade I used the customized IBM image ESXi-5.0.3-1311175-IBM-20131205.iso (available here after a short registration), plus the IBM ESXi 5.x Customized Bundle Patch_for_VMware_ESXi5.0_version_1.1_20140618 and the latest updates available on 30.07.2014.
The upgrade completed successfully and the new ESXi 5.0 host seemed ready to run VMs. It was time to migrate VMs to it, so I started vMotion. What struck me as strange was that the task paused for several seconds at 3%, then for up to a minute at 8%, then again at 9%, before moving on and finishing successfully.
I migrated several VMs to the new host this way without errors. Together they represented around 20% of the host's load. When I tried to migrate the next VM, the vMotion task showed the same delays at 3% and 8%, but then failed at 9% with the following error:
“The vMotion failed because the destination host did not receive data from the source host on the vMotion network. Please check your vMotion network settings and physical network configuration and ensure they are correct.”
I tried several times to migrate other VMs, and vMotion failed with a different error:
“A general system error occurred.”
When the VMs were powered off, the migration succeeded in all cases.
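The pauses at 3%, 8% and 9% are easier to compare between hosts or images if they are measured instead of watched. Below is a minimal sketch, assuming you have polled the task progress yourself and written down (seconds, percent) samples. The function name `find_stalls`, the 5-second threshold and the sample values are hypothetical; they only illustrate the pattern described above and are not taken from my logs.

```python
from typing import List, Tuple


def find_stalls(samples: List[Tuple[float, int]],
                min_seconds: float = 5.0) -> List[Tuple[int, float]]:
    """Return (percent, duration) for each progress value held >= min_seconds.

    samples: (elapsed_seconds, progress_percent) pairs in chronological order.
    The duration of a value runs from the first sample showing it to the
    first sample showing a different value (or to the last sample).
    """
    stalls: List[Tuple[int, float]] = []
    if not samples:
        return stalls

    start_time, current = samples[0]
    last_time = start_time
    for t, pct in samples[1:]:
        if pct != current:
            # Progress moved on: close the interval for the previous value.
            duration = t - start_time
            if duration >= min_seconds:
                stalls.append((current, duration))
            start_time, current = t, pct
        last_time = t

    # Close the interval for the final value as well.
    duration = last_time - start_time
    if duration >= min_seconds:
        stalls.append((current, duration))
    return stalls


# Hypothetical samples shaped like the behaviour I observed.
samples = [(0, 0), (2, 3), (10, 3), (12, 8), (70, 8),
           (72, 9), (80, 9), (81, 15), (83, 60), (85, 100)]
for percent, seconds in find_stalls(samples):
    print(f"stalled at {percent}% for {seconds} s")
```

Running the same measurement before and after a change (new image, different vmnic) gives a simple number to compare, rather than relying on impressions of "slow" and "fast".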
I started troubleshooting from a KB article about diagnosing vMotion failures at 10% and higher. My failures were at 9%, and a few vMotions had succeeded, but I decided to check all the points anyway.
- The VMware HCL showed my hardware as fully compatible with the target ESXi release. The drivers for all hardware in the host were the latest available.
- I deleted the vSwitch and created a new one. I asked the network department to check the configuration of the physical switches. At that time only one vmnic (Intel) was available for vMotion; I requested cabling to another onboard vmnic (Broadcom), but this was going to take some time.
- I even re-installed the host from the plain customized IBM image ESXi-5.0.3-1311175-IBM-20131205.iso, without applying the IBM customized bundle or the VMware updates. The vMotion problem persisted.
- The next step was to open a support call with VMware. I appreciate the professionalism of the technician and their real willingness to help with the issue. After several online sessions with the VMware engineer and consultations with senior engineers, a solution had still not been found. The logs were sent to VMware, and they did not show any clear error related to the problem.
- There was nothing left to try but to install the host from the official VMware release of ESXi 5.0 U3.
Once ESXi was installed, the latest updates applied and the configuration done, I couldn't wait to migrate VMs. Finally, vMotion worked perfectly. I didn't observe any delays or slowness, vMotion operations took less time than before, and every attempt succeeded.
I hope this article will be useful to anyone who runs into the same trouble, and that it will save them some of the time I spent on the investigation.
Latest posts by Oleksii Orda (see all)
- vMotion fails at 9 % on ESXi 5.0U3 running on IBM x3850 M2 - January 28, 2015