Vexpert


Wednesday, October 11, 2017

VSAN 6.2 Deployment Part 3: Implementation

Today we are getting back to the VSAN deployment series I started earlier this summer. It's time to continue with excerpts from the presentation I have been giving at VMUGs on my VSAN deployment. This post focuses on the implementation phase. Here is an overview of that process.




We started by getting our 4 new Lenovo servers racked, then installed ESXi 6.0 U2 on them. We then installed our 10GbE switches and configured our VLANs, putting VSAN traffic on its own VLAN. We also isolated that VLAN from the rest of the network, because we were told that the multicast traffic VSAN 6.2 requires can get quite chatty and could take down other switches on the network. (Note that this requirement was removed in VSAN 6.6, which now uses unicast.) The network topology in part 2 of this blog series showed the dual redundancy we planned for: 2 10GbE switches and 2 PCIe 10GbE interface cards per host.




We installed a brand new vCenter 6.0 U2 appliance onto legacy hosts (IBM BladeCenter). The appliance was a welcome addition to the production environment (no Windows updates, yay!) as I had been using it in my labs for several years already. We then disconnected all hosts from the legacy Windows vCenter, left all datastores and VMs in place, and connected the hosts to the new vCenter appliance. The next step was to recreate folders and resource pools in the new vCenter. I chose not to migrate any data over from the old Windows vCenter environment, as I wanted to start with a clean slate and not bring over any unnecessary logs from the legacy system.

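Next up was networking in preparation for turning VSAN on. We built our vDS. VSAN itself will also run on standard switches, but a vDS is required for NIOC, which we wanted for prioritizing VSAN traffic.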


We created the vDS port groups and NIOC shares, making VSAN traffic the highest priority. vMotion was next, then virtual machine traffic, and finally MGMT traffic was the lowest priority. For those that may not know what NIOC is, it stands for Network I/O Control and it does exactly what it sounds like: it prioritizes network traffic types, acting as a QoS mechanism over these shared links. The screenshot below shows how we set the NIOC value for each traffic type.
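To make the idea of shares a bit more concrete, here is a minimal Python sketch of how share values turn into bandwidth when a link is saturated. The share numbers are illustrative assumptions that only follow the priority order described above (they are not the values from our environment), and `bandwidth_under_contention` is a hypothetical helper, not part of any VMware tooling. Shares only matter under contention, and only the traffic types that are actually active compete with each other.

```python
# Illustrative share values only (assumption): they follow the priority
# order VSAN > vMotion > VM traffic > management, not our real settings.
SHARES = {
    "vsan": 100,
    "vmotion": 75,
    "virtual_machine": 50,
    "management": 25,
}


def bandwidth_under_contention(shares, link_gbps, active=None):
    """Split a saturated link among the active traffic types by share ratio."""
    active = list(shares) if active is None else list(active)
    total = sum(shares[name] for name in active)
    return {name: link_gbps * shares[name] / total for name in active}


if __name__ == "__main__":
    # All four traffic types competing for one 10GbE uplink.
    for name, gbps in bandwidth_under_contention(SHARES, 10.0).items():
        print(f"{name:16s} {gbps:.2f} Gbps")
```

If only VSAN and vMotion are busy, calling the helper with `active=["vsan", "vmotion"]` splits the whole link between just those two, which is why VSAN ends up with more than it gets when everything is competing.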


Now that we had all the preceding steps in place, it was time to turn VSAN on, which I found out is pretty exciting and anticlimactic at the same time. Once all the prerequisites are met, all you need to do is turn the service on in the cluster settings.



The next step was to claim our disks, which can be done 1 of 2 ways in VSAN 6.2. (Note that VSAN 6.6 and newer only have the manual disk claim option. This was done to prevent scenarios where you are trying to remove a disk from VSAN for whatever reason and the system auto claims it back into the disk group.) You can choose the auto disk claim option, and the cache and capacity disks will be auto detected and configured per node. The second option, and the one we chose because I wanted more control and wanted to see how it works, is the manual disk claim option. Both methods can be seen in the screenshot below:


We also chose to enable the Dedupe and Compression option to conserve space. This was a brand new feature in VSAN 6.2, and one we had decided in the planning phase that we definitely wanted to take advantage of. One other thing to note: because this was a brand new VSAN cluster without any data on it, we had no issues enabling this service. Just be aware that when you enable it, the disk groups get rolling reformats. While this is stated to be non-disruptive, I was glad I didn't have to test that out.


During the network validation phase, the system confirms that every host in the VSAN cluster has a VMkernel port set up for VSAN traffic, configured and ready to go, as a double check before VSAN is enabled.
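The check itself boils down to "does every host have at least one VMkernel adapter tagged for VSAN". Here is a rough Python sketch of that logic, assuming a made-up inventory dictionary. The host names, adapter names, and the `hosts_missing_vsan_vmk` helper are all hypothetical and this is not how the wizard or the vSphere API actually represents the data.

```python
def hosts_missing_vsan_vmk(hosts):
    """Return the names of hosts with no VMkernel adapter tagged for VSAN."""
    missing = []
    for name, vmks in hosts.items():
        if not any("vsan" in vmk["services"] for vmk in vmks):
            missing.append(name)
    return missing


# Hypothetical inventory (assumption): host and adapter names are made up.
INVENTORY = {
    "vsan01": [
        {"device": "vmk0", "services": {"management"}},
        {"device": "vmk2", "services": {"vsan"}},
    ],
    "vsan02": [
        {"device": "vmk0", "services": {"management"}},
        {"device": "vmk1", "services": {"vmotion"}},
    ],
}

if __name__ == "__main__":
    # vsan02 has no VSAN-tagged adapter, so it is the one reported.
    print(hosts_missing_vsan_vmk(INVENTORY))
```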



The next step is the actual disk claiming. As you can see in the screenshot below, you choose your disk type here (SSD or HDD) and then choose whether that disk is a cache or capacity device. Note that each VSAN node's storage is made up of logical units called Disk Groups, and each Disk Group is made up of a set of cache and capacity devices. The configuration maximums are 7 capacity devices per disk group and 5 disk groups per node. We had 1 disk group per node with 1 cache SSD and 5 capacity SSDs in our final design.
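If you are sketching out a disk layout on paper before you buy hardware, those maximums are easy to sanity check. Below is a small Python sketch that does this. The `validate_node` helper is hypothetical, and the rule of exactly 1 cache device per disk group is my understanding of how VSAN 6.2 disk groups work rather than something spelled out above, so treat it as an assumption.

```python
MAX_DISK_GROUPS_PER_NODE = 5
MAX_CAPACITY_DEVICES_PER_GROUP = 7


def validate_node(disk_groups):
    """Check one node's layout, given as a list of (cache, capacity) counts.

    Returns a list of problem descriptions; an empty list means the layout
    fits within the maximums.
    """
    problems = []
    if not 1 <= len(disk_groups) <= MAX_DISK_GROUPS_PER_NODE:
        problems.append(
            f"expected 1 to {MAX_DISK_GROUPS_PER_NODE} disk groups, "
            f"got {len(disk_groups)}"
        )
    for i, (cache, capacity) in enumerate(disk_groups, start=1):
        # Assumption: each disk group has exactly one cache device.
        if cache != 1:
            problems.append(f"disk group {i}: expected 1 cache device, got {cache}")
        if not 1 <= capacity <= MAX_CAPACITY_DEVICES_PER_GROUP:
            problems.append(
                f"disk group {i}: expected 1 to "
                f"{MAX_CAPACITY_DEVICES_PER_GROUP} capacity devices, got {capacity}"
            )
    return problems


if __name__ == "__main__":
    # Our final design: 1 disk group with 1 cache SSD and 5 capacity SSDs.
    print(validate_node([(1, 5)]))
    # A layout that breaks the capacity device maximum.
    print(validate_node([(1, 8)]))
```

Multiplying the two maximums also gives the ceiling of 35 capacity devices per node, which our 5 capacity SSDs per node sit well under.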



At this point VSAN was up and running. After all the hard work with preparation and planning, it was very easy to actually turn it on and start using it. The last thing we did was start the migration from the legacy environment over to the new VSAN environment.


As shown earlier in this blog series, the first node in the VSAN cluster was connected to the legacy storage via Fibre Channel, so I migrated compute for each VM over to the first VSAN host using vMotion. Once that was completed, the next step was to change the networking on the VM from the legacy vSS to the new vDS VM network port group. I could then do a Storage vMotion to get the files migrated fully over to VSAN. After that was complete, I would do another vMotion to move that VM to another node in the VSAN cluster, freeing up resources on the first node for more VMs to be migrated in. It took a few days to get around 100 VMs migrated this way.
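For anyone who wants to script the order of operations, here is a minimal Python sketch that just builds the per-VM step list described above. It does not talk to vCenter at all. The VM and host names are hypothetical, and spreading VMs round-robin across the other nodes is an assumption for illustration (I simply picked a node with room at the time).

```python
from itertools import cycle


def migration_plan(vms, landing_host, other_hosts):
    """Build the ordered migration steps for each VM.

    landing_host is the VSAN node that can still see the legacy storage;
    other_hosts are the remaining VSAN nodes, used round-robin (assumption).
    """
    targets = cycle(other_hosts)
    plan = []
    for vm in vms:
        final_host = next(targets)
        steps = [
            f"vMotion compute to {landing_host}",
            "change VM network from legacy vSS to vDS VM network port group",
            "Storage vMotion files to the VSAN datastore",
            f"vMotion compute to {final_host}",
        ]
        plan.append((vm, steps))
    return plan


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    demo = migration_plan(
        ["vm-a", "vm-b"], "vsan01", ["vsan02", "vsan03", "vsan04"]
    )
    for vm, steps in demo:
        print(vm)
        for number, step in enumerate(steps, start=1):
            print(f"  {number}. {step}")
```

The important part is the ordering: the network change has to happen while the VM sits on the first VSAN host, and the Storage vMotion has to finish before the VM leaves the only node that can see the legacy storage.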

That just about covers the implementation phase of this project. This ended up being a longer post, but I hope people enjoy it and learn from it. Next time we will go over some tips and tricks I learned along the way in this deployment. Thanks for reading! Cheers!






