Software-Defined Storage is the Missing Link in the Software-Defined Data Center

At VMworld 2012, held in San Francisco, the “software-defined data center” was a major topic. The idea is appealing: make the entire IT infrastructure completely fluid so that all resources (compute, network, and storage) can be quickly and easily allocated or reclaimed to meet dynamic business requirements, thereby extending the flexibility and cost savings that virtualization provides across the entire IT infrastructure.

Take what the server hypervisor did for compute resources as the analogy: it provides a single control point for dynamically allocating CPU and memory, and it keeps those resources consistently well utilized. That part of the puzzle has been “software-defined” and solved. The network side is well on its way. VMware’s recent acquisition of Nicira and Brocade’s purchase of Vyatta are steps toward allowing the networking infrastructure to be managed as purely virtual resources directly from the server hypervisor.

It’s clear that this same transformation needs to happen with storage. In essence, software-defined storage is the missing link, the last major hardware bridge to cross on the path to the software-defined data center.

What is software-defined storage?

Drawing on the analogy with the server hypervisor and what it did for compute resources, storage virtualization is clearly part of the answer but is not sufficient by itself. Server hypervisor technology not only made compute resources fluid and much easier to manage, it also significantly increased the utilization of those resources, cutting the cost of the compute hardware infrastructure. For servers, this huge boost in utilization came from the consolidation that virtualization enabled, which allowed customers to support the same application workloads with a much smaller and less costly compute infrastructure.

The VM I/O blender issue…

Storage resources clearly do have to be virtualized to solve this problem, but the “improved utilization of existing resources” question manifests itself a little differently. Those familiar with virtual environments already know that storage performance in most virtual deployments falls short of expectations based on experience with storage on physical servers. This “VM I/O blender” issue has been widely discussed in the industry, so I won’t go into it here. But the storage performance slowdown of 50 percent or more that virtual administrators experience in these environments affects not only runtime performance, but also the ability to use storage capacity optimization technologies like thin provisioning, valuable operational features like snapshots and clones, and instant provisioning of high performance storage on demand.

Case in point: in physical computing environments, 15K RPM disks generally handle around 180 I/Os per second (IOPS), assuming some RAID overhead and a random workload that includes 15-20 percent writes. The more random and the more write-intensive an I/O workload is, the worse spinning disks perform. Analysis of these disks in use in virtual environments often indicates that, from a guest VM’s point of view, they are delivering only between 30 and 45 IOPS. Virtual workloads are much more random and much more write-intensive than most physical workloads (often 80 percent writes or more in certain use cases like virtual desktop infrastructure, or VDI), which accounts for part of the slowdown. But if other potentially interesting technologies like thin provisioning (to make more efficient use of storage) and snapshot/clone technology (to feed a variety of supplemental but required administrative operations like backup, test/dev, maintenance, and versioning) are in use, they will often drop the IOPS per disk spindle to well below 30.
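
To make the gap concrete, the short Python sketch below works through the spindle arithmetic using the per-disk figures quoted above (180 IOPS on physical servers, 30 to 45 IOPS as seen by guest VMs). The 9,000 IOPS target workload and the function name are hypothetical, chosen only for illustration.

```python
import math


def spindles_needed(target_iops: int, iops_per_disk: int) -> int:
    """Number of disks needed to sustain target_iops at a given per-disk rate."""
    return math.ceil(target_iops / iops_per_disk)


TARGET_IOPS = 9_000  # hypothetical workload, not a figure from the article

physical = spindles_needed(TARGET_IOPS, 180)      # per-disk rate on physical servers
virtual_best = spindles_needed(TARGET_IOPS, 45)   # upper end of the guest-VM range
virtual_worst = spindles_needed(TARGET_IOPS, 30)  # lower end of the guest-VM range

print(physical, virtual_best, virtual_worst)  # 50 200 300
```

Under these assumptions, the same workload needs four to six times as many disks once it is virtualized, before thin provisioning or snapshots push the per-spindle figure lower still.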

There is a pure software approach that can improve the performance of existing disks by as much as 10x: the log architecture that has been shipping for decades in enterprise-class database products. A log removes the randomness from writes, enabling the log device to operate in sequential mode for a large percentage of the time. What’s most interesting is that, from the point of view of the guest VMs submitting those writes, the storage system behind the log appears to operate at the speed of the log. If a log architecture could be built into the storage layer, then existing storage in virtual environments could support lower write latencies and up to 10x the throughput without requiring any additional hardware purchases.
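
The idea can be sketched in a few lines of Python. This is a minimal in-memory illustration of a write log, not any vendor’s implementation: random block writes are appended in arrival order, and an index remembers where the latest copy of each block lives. The class and method names are hypothetical.

```python
from typing import Optional


class WriteLog:
    """Toy append-only log: every write lands at the tail, whatever block it targets."""

    def __init__(self) -> None:
        self._log = []    # (block number, data) tuples, in arrival order
        self._index = {}  # block number -> log position of the latest copy

    def write(self, block: int, data: bytes) -> int:
        """Append the write and return its log position (always the next slot)."""
        position = len(self._log)
        self._log.append((block, data))
        self._index[block] = position
        return position

    def read(self, block: int) -> Optional[bytes]:
        """Return the most recent data for a block, or None if never written."""
        position = self._index.get(block)
        return None if position is None else self._log[position][1]


log = WriteLog()
# Writes to scattered block addresses still land at consecutive log positions.
positions = [log.write(block, f"data-{block}".encode()) for block in (907, 12, 455, 12)]
print(positions)     # [0, 1, 2, 3]
print(log.read(12))  # b'data-12'
```

A real storage layer would also have to destage the log to the backing disks and reclaim space from superseded entries; those steps are left out of this sketch.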

The headroom is there: that same 15K RPM disk, if handling close to 100 percent sequential writes, can often deliver 2,000 or more IOPS. Given this order-of-magnitude greater performance potential, how can the innate power of storage hardware be unlocked and harnessed to the benefit of virtual environments? And it’s not just a pure performance issue. To fully deliver on the promise of the software-defined data center, we need to be able to provision high performance storage as fast as we can create a new VM (in a couple of seconds), we need that storage to exhibit the same type of space-saving efficiency that compute resources do when managed by a server hypervisor, and we need to manage that storage in a manner that is intuitive to virtual administrators, who think in terms of VMs, not storage LUNs.

Can software improve performance?

That is the other challenge that software-defined storage must address. It has to not only virtualize heterogeneous storage resources, but it also has to unlock the potential these devices have to deliver high performance, space-efficiency, rapid provisioning, and ease of management. And it should do this with a software layer (we call it software-defined storage for a reason) that does not require–but can accommodate and improve–higher performance storage devices incorporating flash memory technology.

The crucial impact of software & data services in software-defined storage

The word “software” is indeed critical in this definition. Central management consoles like VMware vCenter, Microsoft System Center, and Citrix XenCenter are designed to comprehensively manage virtual environments. Software-defined storage capabilities should integrate seamlessly into the hypervisors and management products around which each of these environments is based. They should also look like native storage objects (VMDKs for vSphere, VHDs for Hyper-V, etc.) to take maximum advantage of the wealth of existing tools and utilities these environments offer for live migration, failover, monitoring, quality of service, workload balancing, and other services.

One parting comment: data services must be a critical part of the software-defined storage offering. If data services like thin provisioning, de-duplication, snapshots, clones, encryption, auto-recovery, replication, tiering, and others can be applied as needed to reliable, heterogeneous storage, then storage pools can be defined to meet certain production requirements and VM classes can self-identify the storage they need as they are created. Automation and orchestration are a critical part of the data center of the future if we are indeed to believe that cloud-scale computing will be mainstream. This is where virtual computing is headed to deliver that reality.