VMs crashing after snapshot consolidation in ESXi 5.5 Update 3

Those of you who follow social media, or simply run ESXi 5.5 Update 3 and use snapshots, are already aware of the problem described in VMware KB 2133118.

After a snapshot consolidation task, virtual machines running on VMware ESXi 5.5 Update 3 hosts crash. These are the kinds of errors you will see in the virtual machine’s vmware.log file:

[YYYY-MM-DD] <TIME>Z| vcpu-0| I120: SNAPSHOT: SnapshotDiskTreeFind: Detected node change from 'scsi0:1' to 'snapshot0.disk1.node'.
[YYYY-MM-DD] <TIME>Z[+0.000]| vcpu-0| W110: Caught signal 11 -- tid 1272281 (addr 98)
[YYYY-MM-DD] <TIME>Z[+0.000]| vcpu-0| I120: Unexpected signal: 11.
[YYYY-MM-DD] <TIME>Z[+9.079]| vcpu-0| I120: Msg_Post: Error
[YYYY-MM-DD] <TIME>Z[+9.079]| vcpu-0| I120: [msg.log.error.unrecoverable] VMware ESX unrecoverable error: (vcpu-0)
[YYYY-MM-DD] <TIME>Z[+9.079]| vcpu-0| I120+ Unexpected signal: 11.
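If you want to check a batch of vmware.log files for this signature, a small script can do the first pass. The following is only a minimal sketch: the two patterns are taken from the excerpt above and may be worded differently in other builds, and the function names `find_crash_signature` and `scan_log` are made up for this illustration. A match is a hint to go and read KB 2133118, not a diagnosis.

```python
import re

# Patterns copied from the log excerpt in this post. Assumption: other
# builds may word these messages differently.
NODE_CHANGE = re.compile(r"SnapshotDiskTreeFind: Detected node change")
SIGNAL_11 = re.compile(r"Unexpected signal: 11")


def find_crash_signature(lines):
    """Return (node_change_line, signal_line) pairs, using 1-based line numbers.

    A pair is reported when an "Unexpected signal: 11" line follows a
    snapshot node-change line. Each node-change line is paired at most once.
    """
    hits = []
    pending = None  # line number of the latest unmatched node-change line
    for number, line in enumerate(lines, start=1):
        if NODE_CHANGE.search(line):
            pending = number
        elif pending is not None and SIGNAL_11.search(line):
            hits.append((pending, number))
            pending = None
    return hits


def scan_log(path):
    """Scan one log file (hypothetical helper; pass a copy of vmware.log)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_crash_signature(handle)
```

Calling `scan_log` on a copy of a VM's vmware.log returns an empty list when the signature is absent, and one pair of line numbers per suspected crash otherwise.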

 

The short timeline of the issue shows how likely it was to occur and how urgent it became:

  • 15-Sep-2015: VMware ESXi 5.5 Update 3 released
  • 20-Sep-2015: VMs crashing after snapshot deletion reported on the VMware Forums
  • 29-Sep-2015: Issue noted on Veeam Forum as well
  • 1-Oct-2015: VMware publishes KB 2133118 describing the issue and suggesting downgrade as the only workaround
  • 6-Oct-2015: VMware updates KB 2133118 and releases VMware ESXi 5.5 Update 3a where the issue is resolved

No wonder this bug affected a lot of environments: snapshots are a vital and commonly used feature in vSphere. Imagine running a backup solution that relies on snapshots, such as Veeam Backup. Not only would your VMs crash, but you would also be without backups for weeks unless you downgraded or decided to wait for the fix.

Nikolay Nikolov posted a nice review of Runecast Analyzer recently. Since one of its core features is VMware KB scan, I tested Runecast on an infrastructure with ESXi 5.5 Update 3 and it detected the potential issue immediately.

 


Runecast Analyzer – Configuration KBs discovered

This particular VMware KB went viral within our virtualization community, mainly due to its broad scope, high probability and impact. But VMware publishes and updates KBs every week, and it is difficult for an admin to keep track of them, especially those that relate to more specific configurations.

This is the whole reason we decided to develop Runecast Analyzer: after suffering numerous outages, we would spend hours or days troubleshooting and end up reading a KB every time.

I would be happy to hear your feedback.


About Stanimir Markov

Stan is a co-founder of Runecast, where he works on developing a new smart management solution for VMware environments. He spent long years in IBM, where he was a technical lead of the VMware CoE. Stan has experience with large scale transformation projects, including datacenter moves, cross country migrations and massive server virtualization and consolidation for enterprise customers. Stan worked on development of global cloud offerings and reference architectures. He is IBM Redbooks author, virtualization architect and a VMware Certified Instructor delivering authorized VMware classes. Stan holds the highest VMware certification - VMware Certified Design Expert (VCDX #74).