Why is storage so important to any infrastructure, and why should it be highly available and virtualized?
Everyone has a way of managing data, and those ways are related but often very different in terms of recovery times, the impact of maintenance, the impact of failures, and how well the environment stands the test of time as workloads and data quantities grow.
Availability
In servers
In a business that relies on its data for operations, payroll, and in many cases its actual offerings, availability is king. A great example: if a server goes down, a business can lose tens of thousands of dollars an hour as employees sit twiddling their thumbs, unable to access their calendars, the contact details of clients, or in some cases even their own machines, and the stakes are only rising.
With VoIP, virtual desktops, Exchange, SQL, and Dynamics all running on virtualized servers and performing the majority of the day-to-day processing, clustering for high availability has become the standard. Citrix, Microsoft and VMware all supply a hypervisor capable of keeping a business running even if one of the servers completely fails.
The beauty of these systems is that most businesses can now use fairly standard hardware that fits their budget, allowing them to grow based on their needs at the time.
However...
It all relies on storage
Consolidating all of these resources has created a unique set of storage challenges. Because clustered servers all rely on shared storage to be able to quickly bring a resource online on a completely separate physical machine, the storage simultaneously became the most useful tool in the infrastructure and the boat anchor holding it down. Specialised, highly redundant devices that could tolerate a disk or two, a power supply, or even a controller failing quickly became the "must have". SAN technology was fast, and Fibre Channel could deliver high-speed volumes over great distances, but it took a bit of work. Others went down the NAS route, providing massive shared directories that could be browsed and accessed over existing copper infrastructure.
Both of these approaches suffer the same glaring issues: a single point of failure, a lack of availability during maintenance or failures, and, for all the benefits of virtualization and hardware independence elsewhere in the stack, they remain highly proprietary devices.
To get around some of this, the hardware companies brought out highly specialised paired or clustered storage devices that could maintain exact copies of each other's internal disk arrays, and could be placed in separate locations so that a site outage would not take the entire infrastructure off-line. These tended to be extremely expensive, and would not play well with existing equipment, often requiring a full rip-and-replace to happen, even if their existing equipment was purchased from that same vendor. In many cases, not much has changed up to now.
The weakness in this is that the firmware is physically bound to the hardware, limiting the consumer to a "like it or lump it" choice of equipment at an inflexible price point. On top of this comes a logical gap: although most have recognised the benefits of virtualization for servers, desktops and apps, many still have not made the connection that storage is the final stage of this, and that all of the benefits of virtualization can apply to storage as well.
Virtualizing Storage
The arrays, in the broadest sense, are just commodity chassis with clever firmware embedded. This is where the cleverness of a decade and a half of software development comes in. If you could take the "smarts" of these arrays out of the box and couple it with whatever hardware you wanted, you could build the exact environment you need, right?
Vendors like IBM, HP, EMC, Dell, Hitachi and others have taken some steps toward this by releasing physical appliances that can utilise various underlying storage, as long as you purchase their hardware with the feature set embedded. This half-step toward giving users real purchasing power unfortunately still locks an environment down in terms of how elaborate the fabric can be and how many ports it can scale across, and overall it looks like a compromise.
Why not separate it to make it more accessible?
Some very smart people have done this. There are software packages that run on top of Windows Server, such as SANsymphony-V from DataCore (allowing you to use pretty much any industry-standard hardware to get started), along with various customised Linux platforms that offer more basic functionality (with certain hardware restrictions based on driver availability, and often iSCSI only). These software packages let their users pick and choose hardware based purely on capacity and performance, comparing "apples to apples" across hardware instead of trying to sift through the feature sets of different arrays with slightly different hardware that may be superseded in a matter of months or years. Of course, a business would still pick hardware of appropriate quality for the capacity it intends to use with the software. There is no substitute for building on a solid foundation.
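To give a flavour of the Linux end of that spectrum, here is a minimal sketch of exporting a local block device as an iSCSI target using the kernel's LIO subsystem via `targetcli`. This is an illustration only, not a recommendation of any particular package: the device path, IQNs and backstore name are placeholder assumptions, and a production setup would add authentication and multipathing.

```shell
# Sketch: turn a spare disk on a Linux box into a shared iSCSI volume.
# Assumes targetcli (LIO) is installed and /dev/sdb is an unused disk.

# Register the raw disk as a block backstore (name "disk0" is arbitrary).
targetcli /backstores/block create name=disk0 dev=/dev/sdb

# Create an iSCSI target (the IQN here is a made-up example).
targetcli /iscsi create iqn.2015-01.com.example:shared-disk0

# Map the backstore as LUN 0 on the target's default portal group.
targetcli /iscsi/iqn.2015-01.com.example:shared-disk0/tpg1/luns \
    create /backstores/block/disk0

# Allow a specific initiator (hypervisor host) to connect.
targetcli /iscsi/iqn.2015-01.com.example:shared-disk0/tpg1/acls \
    create iqn.2015-01.com.example:hypervisor1

# Persist the configuration across reboots.
targetcli saveconfig
```

From the hypervisor's side, that LUN then looks like any other shared iSCSI volume, which is exactly the point: the "array" is now just software over commodity parts.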
By taking the software (firmware) out of the physical hardware, and allowing all of the functionality over any device, someone who may only have access to basic equipment, now potentially has access to the same, or better, features formerly only available to the storage in large data-centres (high availability, auto-tiering, snapshots etc.). They can add technologies like SSD into their existing SATA / SAS mix without having to start over and migrate everything, and the benefits pile up year after year as new technologies from different vendors can be added easily, the useful life of equipment can be extended, and there can be full control over whether iSCSI or Fibre Channel or a mix of both is utilised.
So what does this mean for me?
Those who run a lab at home, as many technologists do, can potentially have a more available and robust storage architecture than they ever imagined, all through parts scrounged off eBay or bought from the local PC store. You can start with a couple of white-boxes and a few NICs, and piece by piece take it up to a full-blown Fibre Channel switched SAN without ever taking the storage fully offline from the hypervisor.

For businesses looking toward the era of "software-defined storage", it means that the solution to the most common problems faced by anyone administering storage (performance, flexibility, availability, manageability, scalability) is already here; you just have to look.
Find me on Twitter, Google+, and popping up where I'm needed at the time. Feel free to get in touch!