
Tuesday, October 31, 2017

vSAN 6.6 Rebalance & Resync Operations

In this post I will delve a little into day 2 operations with vSAN. Suppose you have had a maintenance window for rebooting and patching hosts, or perhaps vSAN health is telling you that a proactive disk rebalance is needed on your vSAN cluster. In either scenario VM components will be on the move: resyncing in the first case, rebalancing across the cluster in the second. This won't be a long post, but it will give you some idea of what to expect when you encounter either a disk resync or a proactive rebalance on your vSAN cluster.

First off, let's define the difference between the two operations. A resync replicates VM components across hosts in accordance with the VM's Storage Policy-Based Management (SPBM) policy: FTT=1, FTT=2, etc.

In this scenario you might have taken a single host out of the cluster and put it into maintenance mode, let's say for patching purposes. While that host is out of the cluster, and provided you chose the recommended maintenance mode option of ensure accessibility (see my article on updating a vSAN host Here), the disks in that host will not receive changes that are made elsewhere in the vSAN cluster. Once the host is back up after patching and is admitted back into the vSAN cluster, it has to resync those changes to restore redundancy on the affected components and catch up on the writes that landed on the other disks in the cluster. Let's say for argument's sake that you went long on your patch window and, for one reason or another, didn't make it back inside the 60-minute resync window. New in vSAN 6.6 is a feature that evaluates which path is quicker: pulling the returning disks back in and using them, or the full resync that would happen in vSAN 6.0 and earlier. There is now intelligent evaluation going on behind the scenes to determine the least costly method of getting the cluster back to a normal, redundant state.
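The evaluation described above can be pictured as a simple cost comparison. The sketch below is only an illustration of that idea and is not vSAN's actual algorithm: the names `RepairOption` and `cheapest_repair`, the byte counts, and the "copy the least data" rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RepairOption:
    name: str
    bytes_to_copy: int  # data that still has to be written to finish this path


def cheapest_repair(stale_delta_bytes: int, rebuild_remaining_bytes: int) -> RepairOption:
    """Pick the repair path that needs less data copied (hypothetical rule)."""
    options = [
        # Reuse the returning host's disks and copy only what changed while it was away.
        RepairOption("resync stale components", stale_delta_bytes),
        # Ignore the returning disks and finish rebuilding full copies elsewhere.
        RepairOption("finish full rebuild", rebuild_remaining_bytes),
    ]
    return min(options, key=lambda o: o.bytes_to_copy)


GIB = 1024 ** 3
choice = cheapest_repair(stale_delta_bytes=20 * GIB, rebuild_remaining_bytes=180 * GIB)
print(choice.name)
```

With only 20 GiB of changes to catch up on versus 180 GiB of rebuild still outstanding, the sketch picks the resync of the stale components, which matches the behaviour described for the returning host.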

Proactive rebalancing is different. In this scenario we have multiple disks inside of disk groups and we may not be using all of them evenly. This can happen for a variety of reasons, such as adding capacity devices to vSAN or removing them. For example, a disk group with 4 SSDs could have 3 disks that are almost full while the 4th is barely utilized. The proactive disk rebalance warning, a built-in alarm in the vSAN health stack, flags that imbalance so it can be corrected.
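To make the imbalance concrete, here is a minimal sketch that measures the spread between the fullest and emptiest disk in a group. The disk names, the usage figures, and the 30-point threshold are assumptions chosen for illustration; check the health check description in your own vSAN version for the value it really uses.

```python
from typing import Mapping


def disk_variance(used_pct: Mapping[str, float]) -> float:
    """Spread between the fullest and emptiest disk, in percentage points."""
    return max(used_pct.values()) - min(used_pct.values())


def needs_rebalance(used_pct: Mapping[str, float], threshold: float = 30.0) -> bool:
    # threshold is an assumed value for this example, not a documented constant
    return disk_variance(used_pct) > threshold


# Hypothetical 4-SSD disk group: three disks nearly full, one barely used.
disk_group = {"ssd1": 82.0, "ssd2": 79.0, "ssd3": 85.0, "ssd4": 12.0}
print(disk_variance(disk_group))
print(needs_rebalance(disk_group))
```

For this made-up disk group the spread is 73 percentage points, so the check flags it for a rebalance.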

The rebalance will take a while, depending on the size of the components it needs to move. Also keep in mind that not all of the traffic will cross your network: you may have disks rebalancing data across the internal HBA to another disk in that same host, as shown in the photo below. You will notice two different VMs that are syncing components to the same host. This traffic never hits your network since it is internal to the host.
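The split between host-internal and network traffic is easy to reason about if you list the component moves. This is a rough sketch with made-up VM names, host names, and sizes; `ComponentMove` and `split_traffic` are hypothetical helpers and are not part of any vSAN tooling.

```python
from typing import Iterable, NamedTuple, Tuple


class ComponentMove(NamedTuple):
    vm: str
    src_host: str
    dst_host: str
    size_bytes: int


def split_traffic(moves: Iterable[ComponentMove]) -> Tuple[int, int]:
    """Return (local_bytes, network_bytes) for a set of component moves."""
    moves = list(moves)
    # A move whose source and destination host match stays on the internal HBA.
    local = sum(m.size_bytes for m in moves if m.src_host == m.dst_host)
    network = sum(m.size_bytes for m in moves if m.src_host != m.dst_host)
    return local, network


GIB = 1024 ** 3
moves = [
    ComponentMove("vm-a", "esx01", "esx01", 40 * GIB),  # disk to disk, same host
    ComponentMove("vm-b", "esx01", "esx01", 25 * GIB),  # disk to disk, same host
    ComponentMove("vm-c", "esx02", "esx03", 60 * GIB),  # crosses the vSAN network
]
local_bytes, network_bytes = split_traffic(moves)
print(local_bytes // GIB, network_bytes // GIB)
```

Only the third move in this example would show up as load on the vSAN network; the first two stay inside esx01.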

The rebalance task will show 5% complete until it's actually done. This is a known, UI-only bug and will hopefully be addressed in a future release. You can see the accurate progress in RVC. You can also keep an eye on what the proactive disk rebalance is doing from the vSAN health check, on the vSAN Disk Balance screen under the cluster-level alarms. If you click the Disk Balance tab at the bottom of that screen, it shows which components, on which hosts and from which VMs, have yet to be balanced. Rebalance operations can also be stopped and resumed at any time.

If in either scenario the network is so heavily loaded that resync or rebalancing operations could pose a threat to the cluster, a new feature in vSAN 6.6 gives you control over how much bandwidth these operations take. Not only can you control the bandwidth, but before you even change that setting you can see how much bandwidth each host is using for the current operation. Keep in mind that VMware recommends changing the throttling only in extreme scenarios, and that the vSAN health check called Resync Operations Throttling will fail while throttling is enabled.
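Before throttling, it helps to estimate what the lower rate does to the completion time. The arithmetic below is a back-of-the-envelope sketch: the function name, the 450 GB figure, and the assumption that the throttle is expressed in megabits per second and fully sustained are all illustrative.

```python
def resync_hours(bytes_left: int, throttle_mbps: float) -> float:
    """Hours needed to move bytes_left at a sustained rate in megabits per second."""
    if throttle_mbps <= 0:
        raise ValueError("throttle_mbps must be positive")
    bits_left = bytes_left * 8
    seconds = bits_left / (throttle_mbps * 1_000_000)
    return seconds / 3600


remaining = 450_000_000_000  # 450 GB left to resync (made-up figure)
for mbps in (1000, 250):
    print(f"{mbps} Mbps -> {resync_hours(remaining, mbps):.1f} h")
```

Cutting the rate to a quarter stretches the same resync to four times the duration, which also means the cluster spends four times as long at reduced redundancy. That trade-off is a good reason to treat throttling as a last resort.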

I hope this post has helped show some of the differences between rebalancing and resyncing operations in a vSAN cluster, along with some helpful tips for working with both scenarios. That is it for now. Thanks for reading. Cheers!
