Corrupted Backups using CBT

A newly discovered issue may result in corrupted backups when vSphere CBT (Changed Block Tracking) is used for incremental backups. Basically all backup products that use the VMware API for backups (Virtual Disk Development Kit, VDDK) are affected.

This issue occurs when you expand a virtual disk (VMDK file) that has Changed Block Tracking (CBT) enabled past any 128GB boundary (the boundaries are listed further down). Once the disk is extended, the change tracking data becomes unreliable. Because of the faulty changed-block information, a backup might not capture some changed blocks, so restoring from such an incomplete backup could cause data loss.
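To make the failure mode concrete, here is a toy model of an incremental backup that copies only the blocks reported as changed. It is not how VMware or any backup product is implemented; the block contents, the `incremental_backup` helper and the "reported" set are all made up for illustration.

```python
def incremental_backup(previous_backup, disk, changed_blocks):
    """Copy only the blocks listed in changed_blocks on top of the previous backup."""
    backup = list(previous_backup)
    # Grow the backup image if the disk was extended since the last run.
    backup.extend([None] * (len(disk) - len(backup)))
    for index in changed_blocks:
        backup[index] = disk[index]
    return backup


# Full backup of a four-block disk.
disk = ["a0", "b0", "c0", "d0"]
full = list(disk)

# The disk is extended by two blocks and three blocks are written.
disk = ["a0", "b1", "c0", "d0", "e1", "f1"]
truly_changed = {1, 4, 5}
reported = {1, 4}  # faulty tracking data: block 5 is missing

good = incremental_backup(full, disk, truly_changed)
bad = incremental_backup(full, disk, reported)

# Blocks where a restore from the faulty backup would differ from the real disk.
lost = [i for i, (saved, real) in enumerate(zip(bad, disk)) if saved != real]

print(good == disk)  # True
print(lost)          # [5]
```

The backup job itself finishes without an error in both cases; the missing block only shows up when the restored data is compared with the original.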

The issue affects VMware ESXi 4.x and ESXi 5.x.

Backups can be inconsistent after a disk extension that passes any of these boundaries:
2^7          128GB
2^8          256GB
2^9          512GB
2^10         1024GB
2^11         2048GB
2^12         4096GB
2^13         8192GB
...and so on.
For example, if the initial disk size was 300GB and you extend it to 500GB, you are not affected. If you extend it from 300GB to 520GB, you cross the 512GB boundary and you are affected.
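The boundary check can be sketched in a few lines of Python. This is only an illustration of the rule described above; the function name `crossed_boundaries` is made up, sizes are assumed to be in whole GB, and treating a disk that starts exactly on a boundary as affected is a conservative assumption, not something the VMware KB states.

```python
def crossed_boundaries(old_gb, new_gb, first_boundary_gb=128):
    """Return the power-of-two boundaries (in GB) passed when growing a disk."""
    if new_gb < old_gb:
        raise ValueError("new size must not be smaller than old size")
    crossed = []
    boundary = first_boundary_gb
    # Walk 128, 256, 512, ... until the boundary is no longer below the new size.
    while boundary < new_gb:
        if boundary >= old_gb:
            crossed.append(boundary)
        boundary *= 2
    return crossed


print(crossed_boundaries(300, 500))  # []     -> not affected
print(crossed_boundaries(300, 520))  # [512]  -> affected
```

An empty list means the extension stayed between two boundaries; any non-empty result means the workaround should be applied to that disk.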

There is no fix available yet, only a workaround published by VMware.
After extending a CBT-enabled disk past a 128GB boundary:

  1. Turn off CBT
  2. Take a snapshot (Does suspend/resume)
  3. Delete the snapshot (To recover space and performance)
  4. Turn on CBT
  5. Take a snapshot (Again with the suspend/resume)
  6. Delete the snapshot

The next backup after toggling CBT will be a full backup of the virtual machine.

Note: Discard any backups that were captured after growing the disk, as they can be incomplete.
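The order of the six workaround steps can be sketched as follows. This is a minimal sketch, not VMware's script: the methods `set_cbt`, `create_snapshot` and `delete_snapshot` are hypothetical wrappers that you would have to implement against your own tooling (with pyVmomi they would presumably wrap `ReconfigVM_Task` with `changeTrackingEnabled`, `CreateSnapshot_Task` and `RemoveSnapshot_Task`). The `RecordingVM` class is only a stand-in so the sequence can be run without a vCenter.

```python
def reset_cbt(vm):
    """Apply the six workaround steps to one VM, in order."""
    vm.set_cbt(False)                     # 1. Turn off CBT
    vm.create_snapshot("cbt-reset-off")   # 2. Take a snapshot
    vm.delete_snapshot("cbt-reset-off")   # 3. Delete the snapshot
    vm.set_cbt(True)                      # 4. Turn on CBT
    vm.create_snapshot("cbt-reset-on")    # 5. Take a snapshot
    vm.delete_snapshot("cbt-reset-on")    # 6. Delete the snapshot


class RecordingVM:
    """Stand-in VM (hypothetical) that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def set_cbt(self, enabled):
        self.calls.append(("set_cbt", enabled))

    def create_snapshot(self, name):
        self.calls.append(("create_snapshot", name))

    def delete_snapshot(self, name):
        self.calls.append(("delete_snapshot", name))


vm = RecordingVM()
reset_cbt(vm)
for call in vm.calls:
    print(call)
```

Keeping the sequence in one function makes it harder to skip a snapshot cycle by accident when several VMs have to be processed after a round of disk extensions.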

I found a script that eases this work a bit: it needs just two PowerCLI commands after each disk extension. http://poshcode.org/4149

UPDATE: Veeam has already released a fix for this issue for B&R v7, and B&R v8 has the fix built in. You can find more information, including their script, in KB1940.

Sources:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2090639

http://www-01.ibm.com/support/docview.wss?uid=swg21689947

 

 

About Dusan Tekeljak

Dusan has over 6 years of experience in the virtualization field. He currently works as a Senior VMware Platform Architect at one of the biggest retail banks in Slovakia. He has a background in closely related technologies including server operating systems, networking and storage. He used to be a member of the VMware Center of Excellence at IBM and is a co-author of several Redpapers. His main scope of work consists of designing and performance-optimizing business-critical virtualized solutions on vSphere, including, but not limited to, Oracle WebLogic, MSSQL and others. He holds several leading IT industry certifications such as VCAP-DCD, VCAP-DCA and MCITP. Honored with #vExpert2015 and 2016 awards by VMware for his contribution to the community. Opinions are my own!

2 Comments

  1. Hi,

    I’ve written a blog post here which might help you with some more information, and how this actually affects certain backup vendors’ software.

    Regards

    Dean
