SAN Be Nimble, SAN Be Quick

This is Part 5 of a multi-part VDI saga. In Part 4 we left off with a split environment: two four-year-old, overloaded SANs and a whole lot of virtual desktops running poorly on them. Since SAN IOPS and disk space were two of the biggest culprits in this evolving crisis, I went looking for a solution. Enter Nimble Storage.

Nimble CS260G

I was introduced to Nimble Storage by our County Office of Education. They were early adopters of Nimble and had only good things to say about them, so I took a hard look. At first their claims seemed like magic: VDI-class IOPS on 7200 RPM spindles. Impossible. In fairness to our current SAN provider, I looked at several options, including upgrading our existing system. However, to meet the actual demand of what our VDI environment had become, upgrading turned into replacement. And don't forget I had a single-controller issue that I needed to solve for our mission-critical servers. I was also trying to think long term, considering the memory and compute demands that would hit in April 2014, when Windows XP support ends and we'd be faced with upgrading the VDI environment to Windows 7.

On decision day, I had three proposals before me.

Option 1: Replace the current NetApp SAN with a flash-based, dual-controller model with some faster storage and continue using the existing four-year-old shelves for capacity. My main issues with this were complexity and cost: NetApp management requires specific SAN expertise, and the four-year-old shelves would need to be replaced in short order, which would be an additional cost.

Option 2: Replace the existing server blades and chassis with the latest and greatest in integrated storage. This was actually a very elegant solution that allowed plenty of overhead for future server and storage growth. However, growing the VDI infrastructure was not the direction I was headed, and adding more big iron just wasn't in my playbook.

Option 3: Install a 3U, 36TB Nimble CS260G array and run the student desktops on it. The administration was stupid simple, the cost was less than Option 1 or 2, and the solution would take next to no time to implement. The only question was: would this magical technology actually work?

Well, I decided to find out.

On install day, we rack-mounted the Nimble and connected the dual controllers to our 10Gbps storage VLAN. A Nimble engineer was onsite to walk us through configuring the interfaces and making sure everything was cabled correctly. We verified failover (a since-resolved bug that prevented failover to the shared iSCSI IP address took us a few hours to finally track down) and auto-support, and then we started making volumes. I have to say, making volumes was easy, and a good thing too, because for VDI, Nimble recommends no more than 50 VMs per volume. Divide 1,500 student desktops by 50 and that's 30 volumes. Right.
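The sizing guidance boils down to simple arithmetic. A quick sketch (the 1,500-desktop figure and the 50-VMs-per-volume recommendation are from this article; the ceiling division just handles counts that don't divide evenly):

```python
import math

def volumes_needed(desktops: int, vms_per_volume: int = 50) -> int:
    """Number of volumes/datastores required under a per-volume VM cap."""
    return math.ceil(desktops / vms_per_volume)

# 1,500 student desktops at 50 VMs per volume -> 30 volumes to create.
print(volumes_needed(1500))
```

Thirty volumes is a lot of clicking, so "making volumes was easy" mattered more than it might sound.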

Connecting VMware ESX 4.1 hosts to the Nimble involved some configuration on the ESX host command line, which is not necessary on VMware 5 hosts. Once that was done, the hosts connected without a problem. Then it was a simple matter of moving the VDI masters over to the new datastores and re-configuring the pools to use the new volumes. It all took less than two days to complete. Once the migrations were done, the student VMs were all running on the Nimble and, anecdotally, performance improved across the board.
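For the curious, the command-line work on ESX 4.1 is essentially software iSCSI port binding, which vSphere 5 later exposed in the client UI. A rough sketch of the kind of commands involved (the vmk/vmhba names and the discovery IP are placeholders, not values from our environment, and your adapter numbering will differ):

```shell
# Enable the software iSCSI initiator on the ESX 4.1 host
esxcfg-swiscsi -e

# Bind the iSCSI vmkernel ports to the software iSCSI adapter
# (vmk1/vmk2 and vmhba33 are example names - check yours first)
esxcli swiscsi nic add -n vmk1 -d vmhba33
esxcli swiscsi nic add -n vmk2 -d vmhba33

# Point dynamic discovery at the array's iSCSI discovery address
# (10.0.0.50 is a placeholder for the Nimble discovery IP)
vmkiscsi-tool -D -a 10.0.0.50 vmhba33

# Rescan so the new Nimble volumes show up
esxcfg-rescan vmhba33
```

This is a sketch of the general procedure under those assumptions, not the exact steps we ran; the Nimble and VMware documentation for your specific versions is the authority here.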

For the rest of the school year, storage performance faded into the background, at least for the student virtual desktops. For staff, it was another matter. Our staff virtual desktop situation progressively worsened, with desktops continuously running out of C: and D: drive space, running low on memory, and still delivering slow overall performance. The 8-to-10-year-old re-purposed desktops continued to fail with intermittent errors. Our servers were still running on a single-controller SAN, and server room air conditioning failures and power outages were yet in our future.

In Part 6 (perhaps the final installment) – what 120% CPU utilization on a Nimble array really means, and the final solution to VDI issues in the district.