Hope is not engineering

My enterprise clients frequently want to know why fill-in-the-blank-cloud-IaaS only has a 99.95% SLA. “That’s more than four hours of downtime a year!” they cry. “More than twenty minutes a month! I can’t possibly live with that! Why can’t they offer anything better than that?”
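The clients' arithmetic checks out, by the way. A minimal sketch of the calculation, assuming a 365-day year and a 30-day month (the function name is just for illustration):

```python
def downtime_budget_minutes(sla_percent: float, period_hours: float) -> float:
    """Minutes of downtime an availability SLA permits over a period."""
    unavailable_fraction = 1.0 - sla_percent / 100.0
    return unavailable_fraction * period_hours * 60.0

HOURS_PER_YEAR = 365 * 24   # 8760, ignoring leap years (assumption)
HOURS_PER_MONTH = 30 * 24   # assumes a 30-day month

yearly = downtime_budget_minutes(99.95, HOURS_PER_YEAR)
monthly = downtime_budget_minutes(99.95, HOURS_PER_MONTH)

print(f"{yearly / 60:.2f} hours per year")    # 4.38 hours per year
print(f"{monthly:.1f} minutes per month")     # 21.6 minutes per month
```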

The answer is simple: there is a significant difference between engineering and hope. Many internal IT organizations, for instance, set service-level objectives based on what they hope to achieve, rather than on the level the solution is engineered to achieve and can be mathematically expected to deliver, given the calculated mean time between failures (MTBF) of each component of the service. Many organizations are lucky enough to achieve service levels higher than the engineered reliability of their infrastructure. IaaS providers, however, are likely to base their SLAs on their engineered reliability, not on hope.

If a service provider is telling you the SLA is 99.95%, it usually means they’ve got a reasonable expectation, mathematically, of delivering a level of availability that’s 99.95% or higher.
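To make "mathematically" a little more concrete, here is a rough sketch of one common textbook approach, not any particular provider's method. It assumes each component's steady-state availability is MTBF / (MTBF + MTTR), where MTTR (mean time to repair) is something I'm adding for the illustration, and that the components fail independently and sit in a serial chain. All of the numbers are hypothetical.

```python
from math import prod

def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: uptime as a fraction of total time."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical (MTBF, MTTR) pairs in hours; not taken from any real provider.
components = {
    "power": (20_000, 4),
    "network": (10_000, 2),
    "server": (30_000, 8),
}

per_component = {name: availability(*hours) for name, hours in components.items()}

# Serial chain: the service is up only when every component is up
# (assumes failures are independent).
system = prod(per_component.values())

for name, a in per_component.items():
    print(f"{name:8s} {a:.5%}")
print(f"{'system':8s} {system:.5%}")
```

Note that the chain as a whole comes out less available than any single component in it, which is why the engineered number is often lower than people hope.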

My enterprise client, with his data center that has a single UPS and no generator (much less dual power feeds, multiple carriers and fiber paths, etc.), with a single, non-HA, non-load-balanced server (which might not even have dual power supplies, dual NICs, etc.), will tell me that he’s managed to have 100% uptime on this application in the past year, so fie on you, Mr. Cloud Provider.

I believe that uptime claim. He’s gotten lucky. (Or maybe he hasn’t gotten lucky, but he’ll tell me that the power outage was an anomaly and won’t happen again, or that incident happened during a maintenance window so it doesn’t count.)
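How lucky? A back-of-the-envelope sketch, assuming failures arrive at a constant rate (an exponential model, which is a simplification) and a made-up MTBF of two years for the whole single-server setup:

```python
import math

def prob_no_failure(years: float, mtbf_years: float) -> float:
    """Chance of zero failures over a period, assuming a constant failure rate."""
    return math.exp(-years / mtbf_years)

MTBF_YEARS = 2.0  # hypothetical figure for the whole non-redundant stack

one_year = prob_no_failure(1.0, MTBF_YEARS)
three_years = prob_no_failure(3.0, MTBF_YEARS)

print(f"{one_year:.0%}")     # 61%
print(f"{three_years:.0%}")  # 22%
```

Under those assumptions, a clean year is more likely than not, so a single year of 100% uptime says very little about what the infrastructure is engineered to deliver.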

A service provider might be willing to offer you a higher SLA. It’s going to cost you, though, because past a certain point, improving your mathematically expected reliability starts to get really, really expensive.
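One way to see where the expense comes from is a small sketch. It assumes three layers in series (say power, network, and server), each 99.9% available on its own, independent failures, and perfect failover between redundant copies. Those assumptions are optimistic, and the numbers are illustrative only.

```python
from math import prod

def redundant(a: float, copies: int) -> float:
    """Availability of N independent copies where any one suffices."""
    return 1.0 - (1.0 - a) ** copies

def series(avails) -> float:
    """Availability of a chain in which every layer must be up."""
    return prod(avails)

LAYER = 0.999  # hypothetical availability of each single layer

baseline = series([LAYER] * 3)                            # 3 components
one_layer = series([redundant(LAYER, 2), LAYER, LAYER])   # 4 components
all_layers = series([redundant(LAYER, 2)] * 3)            # 6 components

for label, a in [("no redundancy", baseline),
                 ("one layer doubled", one_layer),
                 ("every layer doubled", all_layers)]:
    print(f"{label:20s} {a:.4%}")
```

Doubling a single layer barely moves the total, because the remaining single points of failure still dominate. Getting a real improvement means doubling everything, and real-world failover is never perfect, so the bill grows faster than the nines do.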

Now, that said, I’m not necessarily a fan of all cloud IaaS providers’ SLAs. But I encourage anyone looking at them (or at a traditional hosting SLA, for that matter) to ponder the difference between engineering and hope.

Posted on June 7, 2010, in Infrastructure. 2 Comments.

  1. The funny thing about SLAs is that they rarely have teeth anyway. Our new hosting provider has credited us a full month for an SLA breach, but most offer only prorated hourly credits up to a certain amount. Our new provider is also 3x more expensive than most managed hosters.

    For IaaS providers, you might be better off going for the cheaper 99.90% SLAs and using multiple providers with replication and failover.

  2. Hey, very cool web site! Beautiful, superb. I will bookmark your site and take the feeds also. I’m glad to find a lot of useful info right here in the post. We’d like to work out extra techniques in this regard. Thanks for sharing.
