Sunday 17 November 2013

Cost per TeraByte and the Spindle Count Trap

Just as a little reminder, I maintain this blog to jot down ideas and thoughts as I go. Any views expressed here are mine alone, and not those of any past or current employer. I will endeavour to correct omissions or mistakes as soon as I learn a better way, so please feel free to contact me with constructive feedback!

Designing and scoping storage solutions:

For any growing business that is consuming more and more storage, many metrics get thrown around as the "rules" by which they should purchase storage.

Many administrators who have dealt mainly with smaller environments, focused on file and print storage or archiving, will treat the cost per TB of storage as the primary metric on which they base their decision.

This is fine if the capacity increases are simply to hold more static data from backups, or to retain larger files for whatever regulatory period they are subject to. Once an organisation starts to look down the road of Virtual Desktop Infrastructure (VDI), databases (Exchange, SQL etc.), or any application with higher numbers of users or high transactional workloads, the cost per TB becomes secondary, and can often be a distraction from creating an environment that will achieve what is expected of it.

VDI is a great example of cost per TB being a secondary factor, as it is very easy to understand once explained.

When deploying virtual desktops, you are creating an environment which users can connect to from whatever location is allowed, where all, or most, of the processing is handled by the VDI server, rather than depending on the specifications of the user's machine. This allows even users with the most basic hardware to access a portion of an extremely powerful central server, delivering a consistent experience across devices.

These powerful central servers take over the role of being "the computer", performing the tasks requested on the central server then displaying the output on the device the end-user is connecting with. Most, if not all, of these tasks rely on storage, and when you are looking at tens to thousands of users all relying on the same storage, the performance of that storage affects every one of them.

Flawed logic:

To use only the cost per TB model, if each user needs ~30GB, logic might dictate that you could easily accommodate a thousand users on a single tray of RAID5 or RAID6 protected storage, especially with 4+ TB disks becoming more common. So for the sake of this example, let's say that we are working with 1000 users across 12x ~4TB spindles in RAID6.

Now, go back to the start of the example. If you handed each of the 1000 users their own workstation, with a single spinning disk, would they be able to be productive with that level of storage performance? What would happen if that disk was then reduced to 1/10th of its performance, would they still be productive? How about 1/100th? This is the risk you are going to be exposed to when you are trying to run 1000 users across the resources that might do for 10. If we wanted to provide each user with the equivalent performance of one desktop disk drive, we would need 1000 or so disks, plus those needed for parity. In practice, many environments will give each user 1/2 the performance of a dedicated desktop disk in anticipation of the environment not being fully utilised at any one time.
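To put rough numbers on the example above, here is a minimal Python sketch. It assumes every user is active at once and that load spreads evenly across the spindles, and it ignores parity overhead and caching. The function names are my own, purely for illustration.

```python
import math


def disk_share_per_user(users: int, spindles: int) -> float:
    """Fraction of one spindle's performance available to each user.

    Simplifying assumptions: all users are active at once, load is
    spread evenly, and RAID overhead and caching are ignored.
    """
    return spindles / users


def spindles_for_share(users: int, share_per_user: float) -> int:
    """Data spindles needed to give each user the given fraction of a disk.

    Parity or mirror disks are not included in the result.
    """
    return math.ceil(users * share_per_user)


# The example from the text: 1000 users on 12 spindles.
share = disk_share_per_user(1000, 12)
print(f"Each user gets about {share:.1%} of one disk")

print(spindles_for_share(1000, 1.0), "spindles for a full disk each")
print(spindles_for_share(1000, 0.5), "spindles for half a disk each")
```

On these assumptions each user ends up with roughly 1/100th of a disk, which is the bottom end of the scale described above.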

The number of spinning disks, or spindles, per user becomes vital as the above example illustrates. Having tonnes and tonnes of space is of little value if the space is unusable for the purpose it has been deployed.

Key factors:

When looking at VDI or high transaction workloads in virtualised environments, there are some key factors to consider. The number of IOPS per spindle and the number of spindles per user (or per workload) are vital to how the environment performs: they determine the throughput, IOPS, and latency it is capable of delivering.

RAID levels:

How the disks within the trays are configured is a significant factor in performance. RAID5 and RAID6 are quite effective in delivering good space efficiency and read performance for bulk storage of items that are put there once then accessed many times. They typically suffer significant disadvantages in write-heavy environments, such as development or record generation, because of their relatively poor performance on write operations.

In a simple environment where you had 12 disks in a single RAID6 array, you would have 10 disks' worth of capacity that should provide great read speeds, acceptable levels of redundancy, and good space efficiency (low RAID overhead).

In a simple environment where you had 12 disks in a single RAID10 array, you would have 6 disks' worth of available capacity that should provide very good read and write speeds and plenty of redundancy, but less space efficiency (50% of the capacity is used for redundancy).
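The capacity side of those two examples is simple arithmetic, sketched below in Python. The 4TB disk size is carried over from the earlier example, and the function name is my own.

```python
def usable_disks(total_disks: int, raid_level: str) -> int:
    """Number of disks' worth of usable capacity in a single array."""
    if raid_level == "RAID5":
        return total_disks - 1   # one disk's worth of parity
    if raid_level == "RAID6":
        return total_disks - 2   # two disks' worth of parity
    if raid_level == "RAID10":
        return total_disks // 2  # every disk is mirrored
    raise ValueError(f"unsupported RAID level: {raid_level}")


DISK_TB = 4  # assumed disk size, taken from the earlier example

for level in ("RAID6", "RAID10"):
    disks = usable_disks(12, level)
    print(f"{level}: {disks} disks usable, {disks * DISK_TB} TB, "
          f"{disks / 12:.0%} space efficiency")
```

Capacity alone says nothing about write behaviour, which is where the two layouts differ most.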

Applied to virtualization:

The two previous examples are common in un-virtualized and virtualized environments alike. The architecture of putting a large number of spindles in a single array was very common before virtualization and, as it provides acceptable performance, remains common now.

When virtualization is introduced, the demands on the disk change significantly. Instead of a single server, application or user making demands of the storage, there are now many users or applications (from a couple to thousands) requesting that the disks deliver or receive data at once.

This leads to a great increase in contention on the resources, and a shift in the way that efficient storage is laid out. Now, rather than one very fast block of disks with a single lane of access (IO queue), it can be much more effective to have many blocks of disk with many lanes of access. The end effect of this is that the queue any request has to wait in is much shorter, and requests can be distributed evenly amongst all of the storage resources.
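As a very rough illustration of the queue length point, the Python sketch below spreads a number of outstanding requests evenly over one or more queues. It is deliberately naive: it does not model service times, and a smaller array will also service each request more slowly than one large array, so the real benefit depends on the workload. The figure of 240 outstanding requests is a placeholder, not a measurement.

```python
import math


def deepest_queue(outstanding_requests: int, queues: int) -> int:
    """Depth of the longest queue when requests are spread evenly."""
    return math.ceil(outstanding_requests / queues)


OUTSTANDING = 240  # hypothetical number of requests in flight

for queue_count in (1, 2, 6):
    depth = deepest_queue(OUTSTANDING, queue_count)
    print(f"{queue_count} queue(s): up to {depth} requests waiting in each")
```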

Combining all of this information, it becomes sensible to first analyse the way data is used in the environment. Are there a large number of users or applications dependent on the storage? What does each application require to perform acceptably? What ratio of read to write traffic will there be? What RAID level suits the IO characteristics of the environment? What fabric can deliver the speed and latency that is needed?
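Once those questions have answers, a first-pass spindle count can be estimated. The Python sketch below uses the commonly quoted rule-of-thumb write penalties (back-end IOs per random front-end write) of 2 for RAID10, 4 for RAID5 and 6 for RAID6. The per-user IOPS, write fraction and IOPS per spindle are placeholder assumptions, not measurements, and should be replaced with figures from your own environment. Caching is ignored.

```python
import math

# Rule-of-thumb back-end IOs generated by one random front-end write.
WRITE_PENALTY = {"RAID10": 2, "RAID5": 4, "RAID6": 6}


def backend_iops(frontend_iops: float, write_fraction: float,
                 raid_level: str) -> float:
    """Back-end IOPS the disks must deliver for a given front-end load."""
    reads = frontend_iops * (1 - write_fraction)
    writes = frontend_iops * write_fraction
    return reads + writes * WRITE_PENALTY[raid_level]


def spindles_needed(users: int, iops_per_user: float, write_fraction: float,
                    raid_level: str, iops_per_spindle: float) -> int:
    """Rough spindle count for the workload, ignoring cache effects."""
    total = backend_iops(users * iops_per_user, write_fraction, raid_level)
    return math.ceil(total / iops_per_spindle)


# Placeholder workload: 1000 users, 10 IOPS each, half of it writes,
# on disks assumed to deliver 75 IOPS apiece.
for level in ("RAID6", "RAID10"):
    count = spindles_needed(1000, 10, 0.5, level, 75)
    print(f"{level}: about {count} spindles")
```

Even with generous placeholder figures, the spindle count is driven by the IO profile and RAID level rather than by the capacity each user needs.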

Once you have this information, you can then architect the basics of your solution. Once you have determined what is needed, then you can approach your suppliers and ask what prices they will offer for the storage that suits your needs.

In practice:

In practice, I have seen environments crippled, or taken completely offline, by a lack of awareness of these principles. A storage array is a tool and, if not used correctly, cannot deliver what anyone might have promised. The same array configured differently can be the difference between your solution being effective or worthless.

Many arrays, and now even Windows itself, allow users to pool storage. Whilst various arrays or technologies might operate differently, DataCore, the product I've spent the most time with thus far, allows a user to pool many RAID LUNs from virtually any device, and to spread the data amongst them to provide more IO queues and significantly reduce latency, by a factor of 3 to 5 when combined with the built-in DRAM caching.

For read-heavy environments with many concurrent demands, multiple, smaller, RAID5 or RAID6 configurations (many smaller arrays versus one large array) can deliver better performance through additional IO queues, with a fairly low RAID overhead, giving a balance between performance and the cost per TB.

For write-heavy environments, many RAID1 or RAID10 packs pooled can provide immediate and significant improvements to the usability of an environment, even before considering changing the underlying disk technology (from SATA to SAS or SSD for example). This comes down to being able to use the tools provided in the most effective way for the situation.

Final thoughts:

Everyone has a budget to work within. This is a fact of doing business. When tasked with delivering a solution, always be clear on exactly what is expected now, how the environment will be expected to perform, and how it should be expected to scale.

If an environment is optimised toward cost per TB, expect it to perform well at storing large quantities of data. Do not expect it to do what an environment optimised for low latency or high IOPS can do.

Be rational when building environments, as it's likely that a little research now could save much more than a few dollars in unnecessary equipment purchases and labour costs later on.

This is a massive topic, and this article just starts to scratch the surface. I'm planning on running some tests as proof points in a lab environment, once I have enough disks available, to illustrate and clarify.
