Lately, there has been a lot of talk about the software-defined data center. By increasing management flexibility and lowering server costs significantly through increased utilization of existing CPU and memory resources, server virtualization brought significant benefits to IT organizations. But to complete the vision of the software-defined data center, both storage and networking need to be virtualized along similar lines.
There has already been movement in the industry toward the network virtualization goal. VMware's recent $1.26 billion acquisition of Nicira was a huge step in that direction, and Oracle's July 2012 agreement to buy Xsigo plays in the same sandbox.
It makes a lot of sense for the networking and storage components of the software-defined data center to operate as part of the hypervisor (or what today is called the server hypervisor). It’s pretty clear that’s what VMware was thinking when they bought Nicira. The ball is in Microsoft and Citrix’s court to respond to this strategic move. But what about similar developments on the storage side?
The storage hypervisor
The concept of the storage hypervisor has arisen recently. The basic idea is to do for storage what server hypervisor technology did for CPU and memory – virtualize it to increase its utilization and provide management flexibility. The promise of the right implementation here is to significantly lower storage infrastructure costs in virtual environments while improving manageability.
Storage is a cost problem in virtual environments today due to something called the “VM I/O blender.” The very random, very write-intensive I/O patterns generated by virtual hosts are much harder for legacy storage technologies to handle than the I/O patterns generated by most physical servers, where even a “write-intensive” workload was still 80 percent reads. The result is that latencies on spinning disks increase significantly, which at a macro level often translates into a 30-50 percent degradation in storage performance. When you need more IOPS, you generally either buy more spindles or buy faster storage. Both lead to significantly higher storage costs.
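To see why randomized I/O gets expensive, consider some rough spindle math. The per-disk IOPS figures below are illustrative assumptions (typical ballpark numbers for a 15K RPM drive), not vendor specs, and the 10,000-IOPS workload is a made-up example:

```python
# Illustrative spindle math for the "VM I/O blender."
# Per-disk IOPS numbers are rough assumptions, not vendor specs.
RANDOM_IOPS_PER_15K_DISK = 180    # assumed random IOPS for one 15K spindle
SEQ_IOPS_PER_15K_DISK = 2000      # assumed sequential IOPS for the same disk

def spindles_needed(target_iops, iops_per_disk):
    """How many disks a workload needs at a given per-disk IOPS rating."""
    return -(-target_iops // iops_per_disk)  # ceiling division

# A blended, randomized 10,000-IOPS workload from many VMs:
print(spindles_needed(10_000, RANDOM_IOPS_PER_15K_DISK))  # 56 disks
# The same 10,000 IOPS, if it arrived as a sequential stream:
print(spindles_needed(10_000, SEQ_IOPS_PER_15K_DISK))     # 5 disks
```

The gap between those two answers is the blender's cost: the data rate hasn't changed, only the access pattern has.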
This storage performance issue – which, by the way, is even worse in virtual desktop environments than in virtual server environments – has driven a lot of interest in solid state disk (SSD) technology. But whether it’s deployed with caching or tiering technology, the bottom line is that it addresses the performance issue by requiring you to buy faster, much more expensive storage.
Log, don’t cache
There’s another way to address this issue: use a logging architecture instead of a caching architecture. What makes a log different from a cache is that it turns the very random I/O pattern generated by hosts into a sequential stream, which any type of disk (spinning or SSD) handles much better. This approach uses your existing storage technology much more efficiently, delivering up to a 10x performance improvement without requiring that you buy any new hardware.
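The core trick can be sketched in a few lines. This is a minimal illustration of a log-structured block store in general, not any particular vendor's implementation: random logical writes become one sequential append stream, and an index maps each logical block to its latest position in the log.

```python
# Minimal sketch of a log-structured block store (illustrative only):
# random logical writes land as sequential appends; an in-memory index
# maps each logical block number to its latest position in the log.
class LogStore:
    def __init__(self):
        self.log = []      # append-only sequence of block contents
        self.index = {}    # logical block number -> position in log

    def write(self, block_no, data):
        self.index[block_no] = len(self.log)  # remember where it landed
        self.log.append(data)                 # always a sequential append

    def read(self, block_no):
        return self.log[self.index[block_no]]

store = LogStore()
for block in (907, 12, 4455, 12):   # a "blended" random write pattern
    store.write(block, f"data@{block}")
print(store.read(12))               # latest version wins: data@12
```

Whatever order the blocks arrive in, the disk only ever sees appends at the tail of the log – the pattern spinning disks and SSDs alike are best at. (A real implementation also needs garbage collection to reclaim superseded log entries, which this sketch omits.)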
There’s something else really interesting about log architectures. They address the write performance issue using up to 90 percent less capacity than caches require. When you’re paying $60–$85/GB for enterprise-class SSD, that savings can be meaningful.
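The arithmetic is straightforward. The prices come from the figures above; the 500 GB working set is an assumed example, not a number from the article:

```python
# Back-of-the-envelope SSD cost comparison. Prices are the article's
# $60-$85/GB range; the 500 GB working set is an assumed example.
cache_gb = 500                # SSD capacity a cache-based design might need
log_gb = cache_gb * 0.10      # log design: up to 90 percent less capacity
for price_per_gb in (60, 85):
    saved = (cache_gb - log_gb) * price_per_gb
    print(f"${price_per_gb}/GB: save ${saved:,.0f}")
# $60/GB: save $27,000
# $85/GB: save $38,250
```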
Gaining speed, and then some
But performance is not the only issue. To deliver on the promise of the software-defined data center, there are other key issues like space efficiency and provisioning speed. To keep up with the bar set by server hypervisors for server resources, a storage hypervisor also has to be able to almost instantly provision high performance, space-efficient storage. What good is being able to create a new VM in three seconds if it takes 20 minutes to provision the high performance storage it needs?
And the needs of the software-defined data center don’t stop there. The storage technology must support all virtual environments, not just virtual desktops, so it has to support high availability failover without data loss while still providing its storage performance speedups. And its performance characteristics must extend not just to virtual disks, but to any snapshots and clones of those devices (and the VMs that own them). If performance degradation were not an issue with current virtual snapshot technology, it would open up a number of new options and workflows that would simplify operations in virtual environments. Instead, administrators are either trained to take a snapshot, use it, and throw it away, or conditioned to believe they have to buy enterprise-class storage arrays to get high performance snapshot/clone technology.
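Log architectures make cheap snapshots plausible. Extending the earlier block-store sketch (again an assumed illustration, not any vendor's design), a snapshot is just a frozen copy of the block index: no data moves, and later writes to the live volume never touch the blocks the snapshot points at.

```python
# Why snapshots can be cheap in a log architecture (illustrative sketch):
# a snapshot is a frozen copy of the index, so zero data is copied and
# reads from the snapshot stay as fast as reads from the live volume.
class SnapshottingLog:
    def __init__(self):
        self.log, self.index = [], {}

    def write(self, block_no, data):
        self.index[block_no] = len(self.log)
        self.log.append(data)

    def snapshot(self):
        return dict(self.index)     # O(index size); no block data copied

    def read(self, block_no, index=None):
        view = self.index if index is None else index
        return self.log[view[block_no]]

vol = SnapshottingLog()
vol.write(7, "v1")
snap = vol.snapshot()
vol.write(7, "v2")                       # the live volume moves on...
print(vol.read(7), vol.read(7, snap))    # v2 v1 -- snapshot still intact
```

Because old log entries are never overwritten in place, the snapshot needs no copy-on-write bookkeeping on the write path, which is where traditional snapshot implementations pay their performance tax.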
Finally, this technology has to integrate transparently into existing hypervisors. A number of vendors are creating “virtual storage solutions” that are tied to specific hardware. If you believe that in the long run storage hypervisor technology must be absorbed into the server hypervisor, it’s clear that the right way to implement it is with a software-based approach that supports heterogeneous storage.
There’s no doubt that the industry is on the road to the software-defined data center. CPU and memory have been appropriately virtualized, virtualization in the network is starting to be absorbed by hypervisors, and it’s clear that storage is next in line.