Tuesday 19 March 2013

A primer on availability in virtualized storage environments


Why is storage so important to any infrastructure, and why should it be highly available and virtualized?

Everyone has a way of managing data. Those ways are related, but they often differ greatly in recovery times, the impact of maintenance, the impact of failures, and how well the environment will stand the test of time as workloads and data volumes grow.

Availability

In servers

In a business that relies on its data for operations, payroll, and in many cases its actual offerings, availability is king.

A great example: if the server(s) go down, a business can lose tens of thousands of dollars an hour as employees sit twiddling their thumbs, unable to access their calendars, client contact details, or in some cases even their machines, and this issue is only escalating.
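
To put a rough number on that, here is a back-of-envelope sketch; the headcount, hourly cost and lost revenue figures are purely illustrative assumptions, not data from any real outage:

```python
# Back-of-envelope downtime cost estimate (all figures are illustrative assumptions).
employees_idled = 150           # staff who cannot work while the server is down
loaded_hourly_cost = 45.0       # average fully loaded cost per employee, per hour ($)
lost_revenue_per_hour = 5000.0  # sales/transactions that simply never happen ($)

cost_per_hour = employees_idled * loaded_hourly_cost + lost_revenue_per_hour
print(f"Estimated cost of downtime: ${cost_per_hour:,.0f} per hour")
# -> Estimated cost of downtime: $11,750 per hour
```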

With VoIP, virtual desktops, Exchange, SQL, Dynamics and more all running on virtualized servers and handling the majority of the day-to-day processing, clustering for high availability has become the standard. Citrix, Microsoft and VMware all supply hypervisors capable of keeping a business running even if one of its servers fails completely.
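
Conceptually, what these clusters do is simple: watch a heartbeat and restart the affected workloads on a surviving host. The toy sketch below shows only the shape of the idea, not how any particular hypervisor implements it; the host names, VM names and data structures are all invented:

```python
# Toy model of hypervisor high availability: when a host stops responding,
# its VMs are restarted on a host that is still alive. Everything here is
# illustrative; real HA also involves quorum, admission control and fencing.
hosts = {
    "host-a": {"alive": True, "vms": ["exchange", "sql"]},
    "host-b": {"alive": True, "vms": ["voip"]},
}

def failover(dead_host: str) -> None:
    survivors = [h for h, state in hosts.items() if state["alive"] and h != dead_host]
    if not survivors:
        raise RuntimeError("no surviving hosts - cluster is down")
    target = survivors[0]
    # This only works because the VMs' disks live on *shared* storage:
    # the surviving host sees exactly the same volumes as the failed one did.
    hosts[target]["vms"].extend(hosts[dead_host]["vms"])
    hosts[dead_host]["vms"] = []

hosts["host-a"]["alive"] = False      # simulate a complete host failure
failover("host-a")
print(hosts["host-b"]["vms"])         # ['voip', 'exchange', 'sql']
```

Note the comment in the middle of the sketch: the whole trick hinges on shared storage, which is exactly where the next section picks up.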

The beauty of these clustered systems is that most businesses can now use fairly standard hardware that fits their budget, and grow according to their needs (and budget) at the time.

However...

It all relies on storage

Consolidating all of these resources has created a unique set of storage challenges. Because these clustered servers all rely on shared storage to quickly bring a resource online on a completely separate physical machine, the storage simultaneously became the most useful tool in the infrastructure and the boat anchor holding it down.

Specialised, highly redundant devices that could tolerate a disk or two, maybe a power supply, or even a controller failing quickly became the "must have". SAN technology was fast, and Fibre Channel could deliver high-speed volumes over great distances, but it took a bit of work. Others went down the NAS route, providing massive shared directories that could be browsed and accessed over existing copper infrastructure.

Both of these approaches suffer the same glaring issues: a single point of failure, a lack of availability during maintenance or failures, and, for all the virtualization and hardware independence elsewhere in the stack, they are still highly proprietary devices.


To get around some of this, the hardware companies brought out highly specialised paired or clustered storage devices that could maintain exact copies of each other's internal disk arrays and could be placed in separate locations, so that a site outage would not take the entire infrastructure offline. These tended to be extremely expensive and would not play well with existing equipment, often requiring a full rip-and-replace even if the existing equipment had been purchased from that same vendor. In many cases, not much has changed since.
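
A quick back-of-envelope calculation shows why a mirrored pair is so attractive in availability terms. The 99.9% figure below is an assumed, illustrative number for a single device, not a quoted specification:

```python
# Availability of a single array vs. a synchronously mirrored pair.
# Assumes 99.9% availability per device and independent failures,
# both of which are simplifying assumptions for illustration only.
single = 0.999
pair = 1 - (1 - single) ** 2      # service survives if either copy survives

hours_per_year = 24 * 365
print(f"Single array : ~{(1 - single) * hours_per_year:.1f} hours down per year")
print(f"Mirrored pair: ~{(1 - pair) * hours_per_year:.2f} hours down per year")
# Single array : ~8.8 hours down per year
# Mirrored pair: ~0.01 hours down per year (roughly half a minute)
```

The catch, as the paragraph above points out, is the price tag and the rip-and-replace that usually comes with it.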

The weakness in these devices is that the firmware is physically bound to the hardware, limiting the consumer to a "like it or lump it" choice of equipment at an inflexible price point. On top of this comes the logical issue that although most have recognised the benefits of virtualization for servers, desktops and apps, many still have not made the connection that storage is the final stage of this, and that all of the benefits of virtualization can apply to storage as well.

 

Virtualizing Storage

The arrays, in the broadest sense, are just commodity chassis with clever firmware embedded. This is where a decade and a half of software development comes in. If you could take the "smarts" of these arrays out of the box and couple them with whatever hardware you wanted, you could build the exact environment you need, right?


Vendors like IBM, HP, EMC, Dell, Hitachi and others have taken some steps toward this by releasing physical appliances that can sit in front of various underlying storage, as long as you purchase their hardware with the embedded feature set. This half-step toward giving users real purchasing power still, unfortunately, locks an environment down in terms of how elaborate the fabric can be and how many ports it can scale across, and overall it looks like a compromise.

Why not separate it to make it more accessible?

Some very smart people have done this. There are software packages that run on top of Windows Server, such as SANsymphony-V from DataCore (allowing you to use pretty much any industry-standard hardware to get started), along with various customised Linux platforms that offer more basic functionality (with certain hardware restrictions based on available drivers, and often iSCSI only).

These software packages allow their users to pick and choose hardware based purely on capacity and performance, comparing "apples to apples" instead of sifting through the feature sets of different arrays with slightly different hardware that may be superseded in a matter of months or years. Of course, a business would still pick hardware of appropriate quality for the capacity it intends to run under the software; there is no substitute for building on a solid foundation.
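
That "apples to apples" comparison becomes a simple spreadsheet exercise. The two enclosures below, along with their prices and specifications, are hypothetical and only show the shape of the comparison:

```python
# Compare candidate enclosures purely on capacity and performance per dollar.
# Vendor names, prices and figures are made up for illustration.
candidates = {
    "vendor-x 12-bay SAS ": {"price": 9000.0, "usable_tb": 18.0, "iops": 2400},
    "vendor-y 24-bay SATA": {"price": 7500.0, "usable_tb": 36.0, "iops": 1800},
}

for name, spec in candidates.items():
    print(f"{name}: ${spec['price'] / spec['usable_tb']:.0f} per usable TB, "
          f"{spec['iops'] / spec['price']:.2f} IOPS per dollar")
```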

By taking the software (the "firmware") out of the physical hardware and offering all of its functionality over any device, someone who only has access to basic equipment now potentially has the same, or better, features formerly reserved for storage in large data centres (high availability, auto-tiering, snapshots, etc.). They can add technologies like SSD into their existing SATA / SAS mix without having to start over and migrate everything, and the benefits pile up year after year: new technologies from different vendors can be added easily, the useful life of equipment can be extended, and there is full control over whether iSCSI, Fibre Channel, or a mix of both is used.
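
Auto-tiering in particular is easy to picture: the software watches which blocks are "hot" and keeps them on the fastest media available. The sketch below is a deliberately simplified model of that idea, not any vendor's actual algorithm; the block names, tier names and SSD capacity are invented:

```python
from collections import Counter

# Simplified auto-tiering: promote the most frequently accessed blocks to the
# SSD tier and leave the rest on SATA/SAS. The access trace and the SSD
# capacity are invented for illustration.
SSD_CAPACITY_BLOCKS = 3

access_log = ["b1", "b7", "b1", "b3", "b1", "b7", "b2", "b7", "b9", "b3"]
heat = Counter(access_log)                      # how "hot" each block is

hot_blocks = {block for block, _ in heat.most_common(SSD_CAPACITY_BLOCKS)}
tiers = {block: ("ssd" if block in hot_blocks else "sata")
         for block in heat}

print(tiers)   # the hottest blocks (b1, b7, b3) land on the SSD tier
```

A real implementation works on much larger extents and demotes as well as promotes, but the principle is the same: the intelligence lives in software, not in a particular chassis.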

So what does this mean for me?

Those who run a lab at home, as many technologists do, can potentially have a more available and robust storage architecture than they ever imagined, all through parts scrounged off eBay or bought from the local PC store. You can start with a couple of white-boxes and a few NICs, and piece by piece take it up to a full-blown Fibre Channel switched SAN without the storage ever going completely offline to the hypervisor.


For businesses looking toward the era of "software defined storage", it means that the solution to the most common problems faced by anyone administering storage (performance, flexibility, availability, manageability, scalability) is already here; you just have to look.


 
Find me on Twitter, Google+, and popping up where I'm needed at the time. Feel free to get in touch!

4 comments:

  1. Good article, but based on the statements you made, I think they are addressing customer needs by introducing virtualization into the picture. IBM SVC, EMC Vplex, Datacore, HP 3par and others are giving customers the ability to create storage pools or virtual volumes built from volumes on various storage arrays (i.e. mdisks). The virtual controller (i.e. virtualized controller or front-end array) acts as a controller for all of the various vendors (i.e. Hitachi, EMC, HP, IBM, etc.), where LUNs are presented to servers in a mix of methods (NFS, FC, iSCSI, etc.). The solutions provide capabilities like DR, deduplication, HA, failover, reporting, replication, migration, disk encryption, NAS, SAN, IPv6, etc.

    I am not sure I would say take the firmware out of the device, that is a bit much, but I do see your point. There is firmware on the disks (called metadata) that gives the controllers better management control over the disks. This is essential, but I do see your point about using JBODs (i.e. Just a Bunch of Disks) to provide the same level of service in a virtualized storage environment.

    This article builds suspense, but I would like to see information about the virtualization technologies that exist and the capabilities they provide, perhaps as a comparison matrix.

    But very good start

    Todd (Sr. Partner, Enterprise Datacenter Architect)
    ITOTS Networks, LLC | www.itotsnetworks.com | 240-424-0112

    Replies
    1. Hi Todd,

      Thanks for your comments. The idea of a comparison matrix and going into more detail on the various solutions gives me something to work toward.

      Devices need firmware to operate at the most fundamental level; that should be a given. The tightly controlled OS that is bound to a specific appliance is, to my mind, so close to that definition that from a high level it is virtually indistinguishable. This second layer of software, which gives the average array its built-in smarts above and beyond what its internal RAID card delivers, is largely unnecessary once you look at having the presented LUNs virtualised.

      I wouldn't go as far as putting JBOD into production (lack of internal redundancy). However, any enclosure that can serve storage meeting my performance, redundancy, cost and capacity requirements up to my storage hypervisor would be of more value to me, if I were building an environment, than anything with all the bells and whistles that won't interoperate with anything else natively.

      "IBM SVC, EMC Vplex, DataCore, HP 3par" - one of these is not like the others in a very fundamental way. The SVC, Vplex and HP offerings are all bound up as an appliance (physical or virtual) or restricted in some significant way (SVC being iSCSI only is just one example), whereas something like DataCore is able to scale to the environment it is applied to, whether it be physical, virtual, iSCSI or FC.

      Again, thank you for your insight.

      Richard.

  2. IMHO, you are overly focused on the proprietary/non-proprietary aspects over the decoupling of the controllers from the disks. In my view the decoupling is the big breakthrough, allowing real scale-up and flexibility. The proprietary issues will always be around, as something with a proprietary function can always be added, and the people who need it will buy it.

    Replies
    1. Thanks for the feedback, David.

      I agree that divorcing the hardware from the software is the big step. There will indeed be something proprietary at some level, but with the move toward virtualization, anything proprietary is being shifted toward the software layer, where it can be the most flexible.
