Parting of the Clouds

Back in 2007, Wikibon warned its members to read the fine print in cloud SLAs. At the time, we were discussing Google’s terms with enterprise customers, and we noted:

Google’s SLA terms do leave lots of wiggle room for the consumer giant, such as:

▪    Promising penalties for downtime based on user error rates exceeding 5% but using server error rates as the adjudicating measure.

▪    Defining downtime as 10 minutes or more of contiguous interruption.

▪    Making credits contingent upon the customer’s timely notification of service interruption.

▪    Excluding planned downtime from service interruption.

▪    Offering three 9’s of availability for unplanned downtime (almost 9 hours/year).
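To put those nines in perspective, here is a minimal sketch of the downtime budget implied by an availability percentage. This is illustrative arithmetic only, not Amazon's or Google's SLA math; as the bullets above show, real SLAs often exclude planned maintenance and short interruptions from the calculation.

```python
# Downtime budget implied by an availability percentage ("nines").
# Illustrative only: actual SLAs typically exclude planned downtime
# and interruptions shorter than a defined threshold.

HOURS_PER_YEAR = 24 * 365  # 8,760 hours (ignoring leap years)

def downtime_hours_per_year(availability_pct: float) -> float:
    """Hours of permitted downtime per year at a given availability %."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

for label, pct in [("three nines", 99.9),
                   ("four nines", 99.99),
                   ("five nines", 99.999)]:
    print(f"{label} ({pct}%): {downtime_hours_per_year(pct):.2f} hours/year")
```

Three nines works out to roughly 8.76 hours of allowable downtime per year (the "almost 9 hours" above), while five nines permits only about five minutes.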

In the past few days the seemingly infinite amount of coverage on—and scrutiny of—the cloud computing sector has risen to levels never quite seen before. With the continued downtime of Amazon’s EC2 service, bloggers, media outlets, pundits, analysts and users are all chiming in with their take on what this massive cloud outage means in the long term for customers relying on—and planning to rely on—outsourced, pay-as-you-go services. As the Wikibon and SiliconAngle research teams were analyzing the data from hundreds of links being tweeted and re-tweeted across mobile devices around the world, one key point became clear—there is a clear difference between the Ruby on Rails, developer-class, good-for-consumer cloud, and the enterprise-class cloud that is becoming more prevalent as a direct result of inadequate SLA commitments from large cloud players.

The Implication: Customers will have to match their data type to the cloud tier/cloud provider that aligns with their specific business objectives. It’s not about “to the cloud,” it’s about “what kind of data should be placed in the cloud and which type of cloud is most appropriate?”

One article that caught our attention came from Jo Maitland of SearchCloudComputing, who wrote that “the AWS customer list is long, but it’s mostly companies that can afford a little downtime without losing business.” Or can they?

Upon looking at the list of companies that are currently down due to Amazon EC2’s outage, “a little downtime” may sound right on the surface. But then we saw a tweet from Jason Calacanis (@jason) that really brought to light the grim reality of the impact that the EC2 outage is having on companies with critical data in the Amazon cloud. Jason was quoting what is evidently an Amazon customer from the AWS forums:

“Life of our patients is at stake – I am desperately asking you to contact; we are a monitoring company and are monitoring hundreds of cardiac patients at home. We are unable to see their ECG signals since 21st of April. Can you please contact us?”

As we pointed out earlier this year, “Amazon has definitely taken once-expensive resources (e.g. compute and storage) and turned them into cheap commodities that come with a monthly bill. Amazon is creating a multi-billion dollar market for billable compute and storage by courting developers, startups and small enterprises—GREAT strategy, especially when you’re using your excess retail transaction capacity. BUT, for many enterprises, Amazon’s SLAs are not adequate. Amazon’s SLAs are either too limited or too expensive. EMC’s Joe Tucci characterizes Amazon’s SLAs as ‘we’ll do our best but if we fail don’t call us.’ And CIOs are telling their CEOs that Amazon is too risky, too insecure—i.e. good for the developer crowd but not for our world-class operation.”

The key point is that CIOs and data center managers—no matter the size of company—need to be aware of the specific type of cloud they are placing their data in. Does the cloud provider have 24x7x365 support personnel or are you stuck dealing with a web page when a crisis erupts? If you’re monitoring the hearts of patients—you need to have a provider that guarantees five nines of availability at a minimum. This isn’t an option; it’s the fundamental cost of doing business.

Not to miss a marketing opportunity, Nirvanix is one company trying to leverage the less-than-stellar SLAs of companies like Amazon. Seizing on the Amazon outages, Nirvanix’s CEO, Scott Genereux, sent out a statement to the media last week:

It’s always unfortunate when an outage or negative news happens in the cloud industry, but at times like these it’s important to understand what’s really going on. As the cloud became the buzz of the IT sector, both established and emerging vendors rushed products out to market. The result is offerings that are not purpose-built cloud architectures specifically designed for Enterprise IT organizations—organizations that cannot justify or tolerate any downtime.

Translation: “We bet our company on the premise that Amazon’s SLAs suck and we can do better.” How’s that for an opportunistic CEO? You have to give Genereux credit for having the cojones to stick his neck out in public on this topic. Basically, Nirvanix is saying that renting excess capacity (the genesis of Amazon’s business) doesn’t cut it for enterprise IT.  As my Cube co-host John Furrier says: “Nirvanix is like Amazon S3 on enterprise steroids.”

Don’t get me wrong. I love Amazon’s cloud services. It’s a brilliant concept that has completely changed our business. At Wikibon, our developers use Amazon Web Services…but not for everything. And it raises the question: should you trust your data to a startup like Nirvanix versus a giant like Amazon? Well, as Terri McClure points out, service and support are the top criteria for vendor selection. Vendor brand is next, which underscores the yin and the yang of doing business with startups. On the one hand, they have to have a better mousetrap; on the other hand, references with real experience are not as prevalent.

So to me it comes down to references that are willing to talk. Nirvanix claims to have more than 700 customers including the likes of Cisco, Logitech, Universal, GE, Fox, and Millennium Partners doing public, private or hybrid cloud deployments using the company’s service and/or file system (on-premises).

Five things I would ask cloud service providers include:

  1. Can I choose where my data is stored? Can I keep three or more copies in different physical locations?
  2. Can I inspect and audit the cloud service provider’s data center?
  3. Can I define a security incident to comply with my corporate edicts?
  4. Will you report incidents to me based on that definition?
  5. Will you put these commitments in writing?

As my friend Fred Moore says, “the information superhighway can be a dangerous place; don’t end up as road kill.” So in the end, companies need to understand UP FRONT what kind of cloud service they are dealing with. If you have data that isn’t business critical, then go with the developer-class/general-purpose cloud. Amazon straight out says its cloud storage “is designed to make web-scale computing easier for developers” and that its compute cloud is “designed to make web-scale computing easier for developers.” So the message is clear—if you aren’t a developer, then you should be hiring some to run your Amazon cloud, or you should be looking elsewhere. There is a definite segregation—a bifurcation—of cloud types and cloud tiers that will take place, and customers need to pay extremely close attention to where they are placing their data. Amazon’s EC2 debacle will not stop customers’ march to the cloud—the cost reductions, flexibility and speed are way too alluring to ignore. The key for customers is to do your own homework, understand the different cloud providers and pick your partners carefully.

Your company and its brand reputation may just depend on it.

David Vellante is a co-founder of the Wikibon Project, a community of business technology practitioners solving problems through an open source sharing of free advisory knowledge.