Monthly Archives: December 2010
Gartner is hiring cloud experts! We’re going to be hiring two analysts who are based in Europe. They’ll cover cloud computing and networking — specifically, they’ll be European counterparts to myself and Ted Chamberlin. That means we’re looking for people who know cloud IaaS, hosting and colocation, and, if possible, have some background in networking as well.
These are Research Director roles, so we’re looking for people who are pretty senior — currently a director or VP (or equivalent), and therefore likely with 10+ years of experience (though people with really intensive start-up experience might work out with less).
If you’re interested, drop me a message on LinkedIn.
I’m pondering my poll results from the Gartner data center conference, and trying to understand the discontinuities. I spoke at two sessions at the conference. One was higher level and more strategic, called “Is Amazon or VMware the Future of Your Data Center?” The other was very focused and practical, called “Getting Real with Cloud Infrastructure Services”. The second session was in the very last slot, so you had to really want to be there, I suppose. The poll sample size of the second session was about half that of the first. My polling questions were similar but not identical, and that is the source of the difficulty in interpreting the differences in results.
I normally ask a demographic question at the beginning of my session polls, about how much physical server infrastructure the audience members run in their data centers. This lets me cross-tabulate the poll results by demographic, with the expectation that those who run bigger data centers behave differently than those who run smaller data centers. Demographics for both sessions were essentially identical, with about a third of the audience under 250 physical servers, a third between 250 and 1000, and a third with more than 1000. I do not have the cross-tabbed results back yet, unfortunately, but I suspect they won’t explain my problematic results.
In my first session, 33% of the audience’s organizations had used Amazon EC2, and 33% had used a cloud IaaS provider other than Amazon. (The question explicitly excluded internal clouds.) I mentioned the denial syndrome in a previous blog post, and I was careful to note in my reading of the polling questions that I meant any use, not just sanctioned use — the buckets were very specific. The main difference in Amazon vs. non-Amazon was that more of the use of Amazon was informal (14% vs. 9%) and there was less production application usage (8% vs. 12%).
In my second session, 13% of respondents had used Amazon, and 6% had used a non-Amazon cloud IaaS. I am not sure whether I should attribute this vast difference to the fact that I did not emphasize the “any use”, or simply because this session drew a very different sort of attendee, perhaps one who was farther back on the adoption curve and wanting to learn more basic material, than the first session.
The two audiences also skewed extremely differently when asked what mattered to them in choosing a provider (choose the top 3 from a list of options). I phrased the questions differently, though. In the first session, it was about “things that matter”; in the second session, it was about “the provider who is best-in-class at this thing”. The results diverged most radically on customer service. It was overwhelmingly the most heavily weighted item in the first session (“excellent customer service, responsive and proactive in meeting my needs”), but by far the least important in the second session (where I emphasized “best in class customer service” rather than “good enough customer service”).
Things like this are why I generally do not like to cite conference keypad polls in my research, preferring instead to rely on formal primary research that’s been demographically weighted and where there are enough questions to tease out what’s going on in the respondent’s head. (I do love polls for being able to tailor my talk, on the fly, to the audience, though.)
EdgeCast has announced the beta of its new application delivery network service. For those of you who are CDN-watchers, that means it’s leaping into the fray with Akamai (Dynamic Site Accelerator and Web Application Accelerator, bundles where DSA is B2C and WAA is B2B), Cotendo’s DSA, and CDNetworks’s Dynamic Web Acceleration. EdgeCast’s technology is another TCP optimization approach, of the variety commonly used in application delivery controllers (like F5’s Web Application Accelerator) and WAN optimization controllers (like Riverbed).
EdgeCast seems to be gaining traction with my client base in the last few weeks — they’re appearing on a lot more shortlists. This appears to be the direct result of aggressive SEO-based marketing.
What’s important about this launch: EdgeCast isn’t just a standalone CDN. It is also a CDN partner to many carriers, and it derives a growing percentage of its revenue from selling its software. That means that EdgeCast introducing ADN services potentially has ripple effects on the ecosystem, in terms of spreading ADN technology more widely.
For a while now, I’ve been talking to Gartner clients about what concerns keep them off public cloud infrastructure, and the diminishing differences between private and public cloud from service providers. I’ve been testing a thesis with our clients for some time, and I’ve been talking to people here at Gartner’s data center conference about it, as well.
That thesis is this: People will share a data center network, as long as there is reasonable isolation of their traffic, and they are able to get private non-Internet connectivity, and there is a performance SLA. People will share storage, as long as there is reasonable assurance that nobody else can get at their data, which can be handled via encryption of the storage at rest and in flight, perhaps in conjunction with other logical separation mechanisms, and again, there needs to be a performance SLA. But people are worried about hypervisor security, and don’t want to share compute. Therefore, you can meet most requirements for private cloud functionality by offering temporarily dedicated compute resources.
Affinity rules in provisioning can address this very easily. Simply put, the service provider could maintain a general pool of public cloud compute capacity — but set a rule for “pseudo-private cloud” customers that says that if a VM is provisioned on a particular physical server for customer X, then that physical server can only be used to provision more VMs for customer X. (Once those VMs are de-provisioned, the hardware becomes part of the general pool again.) For a typical customer with a reasonable number of VMs (most non-startups have dozens, usually hundreds), the wasted capacity is minimal, especially if live VM migration is used to optimize the utilization of the physical servers — and therefore the additional price uplift for this should be modest.
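The affinity rule described above can be sketched in a few lines. This is a minimal, hypothetical placement routine, not any real provider’s scheduler; the `Host` class and slot-counting model are illustrative assumptions.

```python
# Hypothetical sketch of the affinity rule: once a host runs a VM for
# customer X, only customer X may be placed there until the host is
# empty again, at which point it rejoins the general pool.

class Host:
    def __init__(self, host_id, slots):
        self.host_id = host_id
        self.slots = slots      # total VM slots on this physical server
        self.used = 0           # slots currently provisioned
        self.owner = None       # customer this host is pinned to, if any

def place_vm(hosts, customer):
    """Place one VM for `customer`, honoring the affinity rule."""
    # Prefer a host already pinned to this customer (minimizes waste).
    for h in hosts:
        if h.owner == customer and h.used < h.slots:
            h.used += 1
            return h
    # Otherwise take an unpinned host from the general pool and pin it.
    for h in hosts:
        if h.owner is None and h.used < h.slots:
            h.owner = customer
            h.used += 1
            return h
    return None  # out of capacity

def release_vm(host):
    """De-provision one VM; an empty host returns to the general pool."""
    host.used -= 1
    if host.used == 0:
        host.owner = None
```

The waste is bounded by one partially filled host per customer, which is why the uplift over pure multi-tenancy can stay modest for customers with more than a handful of VMs.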
That gets you public cloud compute scale, while still assuaging customer fears about server security. (Interestingly, Amazon salespeople sometimes tell prospects that you can use Amazon as a private cloud — you just have to use only the largest instances, which eat the resources of the full physical server.)
I’m at Gartner’s Data Center Conference this week, and I’m finding it to be an interesting contrast to our recent Application Architecture, Development, and Integration Summit.
AADI’s primary attendees are enterprise architects and other people who hold leadership roles in applications development. The data center conference’s primary attendees are IT operations directors and others with leadership roles in the data center. Both have significant CIO attendance, especially the data center conference. Attendees at the data center conference, especially, skew heavily towards larger enterprises and those who otherwise have big data centers, so when you see polling results from the conference, keep the bias of the attendees in mind. (Those of you who read my blog regularly: I cite survey data — formal field research, demographically weighted, etc. — differently than conference polling data, as the latter is non-scientific.)
At AADI, the embrace of the public cloud was enthusiastic, and if you asked people what they were doing, they would happily tell you about their experiments with Amazon and whatnot. At this conference, the embrace of the public cloud is far more tentative. In fact, my conversations not-infrequently go like this:
Me: Are you doing any public cloud infrastructure now?
Them: No, we’re just thinking we should do a private cloud ourselves.
Me: Nobody in your company is doing anything on Amazon or a similar vendor?
Them: Oh, yeah, we have a thing there, but that’s not really our direction.
That is not “No, we’re not doing anything on the public cloud”. That’s, “Yes, we’re using the public cloud but we’re in denial about it”.
Lots of unease here about Amazon, which is not particularly surprising. That was true at AADI as well, but people were much more measured there — they had specific concerns, and ways they were addressing, or living with, those concerns. Here the concerns are more strident, particularly around security and SLAs.
Feedback from folks using the various VMware-based public cloud providers seems to be consistently positive — people seem to uniformly be happy with the services themselves and are getting the benefits they hoped to get, and are comfortable. Terremark seems to be the most common vendor for this, by a significant margin. Some Savvis, too. And Verizon customers seem to have talked to Verizon about CaaS, at least. (This reflects my normal inquiry trends, as well.)
My Magic Quadrant for Cloud Infrastructure as a Service and Web Hosting is done. The last week has been spent in discussion with service providers over their positioning and the positioning of their competitors and the whys and wherefores and whatnots. That has proven to be remarkably interesting this year, because it’s been full of angry indignation by providers claiming diametrically opposed things about the market.
Gartner gathers its data about what people want in two ways — from primary research surveys, and, often more importantly, from client inquiry, the IT organizations who are actually planning to buy things or better yet are actually buying things. I currently see a very large number of data points — a dozen or more conversations of this sort a day, much of it focused on buying cloud IaaS.
And so when a provider tells me, “Nobody in the market wants to buy X!”, I generally have a good base from which to judge whether or not that’s true, particularly since I’ve got an entire team of colleagues here looking at cloud stuff. It’s never that those customers don’t exist; it’s that the provider’s positioning has essentially guaranteed that they don’t see the deals that fall outside their tunnel vision.
The top common fallacy, overwhelmingly, is that enterprises don’t want to buy from Amazon. I’ve blogged previously about how wrong this is, but at some point in the future, I’m going to have to devote a post (or even a research note) to why this is one of the greatest, and most dangerous, delusions a cloud provider can have. If you offer cloud IaaS, or heck, you’re a data-center-related business, and you think you don’t compete with Amazon, you are almost certainly wrong. Yes, even if your customers are purely enterprise — especially if your customers are large enterprises.
The fact of the matter is that the people out there are looking at different slices of cloud IaaS, but they are still slices of the same market. This requires enough examination that I’m actually going to write a research note instead of just blogging about it, but in summary, my thinking goes like this (crudely segmented, saving the refined thinking for a research note):
There are customers who want self-managed IaaS. They are confident and comfortable managing their infrastructure on their own. They want someone to provide them with the closest thing they can get to bare metal, good tools to control things (or an API they can use to write their own tools), and then they’ll make decisions about what they’re comfortable trusting to this environment.
There are customers who want lightly-managed IaaS, which I often think of as “give me raw infrastructure, but don’t let me get hacked” — which is to say, OS management (specifically patch management) and managed security. They’re happy managing their own applications, but would like someone to do all the duties they typically entrust to their junior sysadmins.
There are customers who want complex management, who really want soup-to-nuts operations, possibly also including application management.
And then within each of these segments, you can divide customers into those with a single application (which may be highly complex, with multiple components), and those with a whole range of applications that encompasses more general data center needs. Those differences drive different customer behaviors and different service requirements.
Claiming that there’s no “real” enterprise market for self-managed is just as delusional as claiming there’s no market for complex management. They’re different use cases in the same market, and customers often start out confused about where they fall along this spectrum, and many customers will eventually need solutions all along this spectrum.
Now, there’s absolutely an argument to be made that the self-managed and lightly-managed segments together represent an especially important segment of the market, where a high degree of innovation is taking place. It means that I’m writing some targeted research — selection notes, a Critical Capabilities rating of individual services, probably a Magic Quadrant that focuses specifically on this next year. But the whole spectrum is part of the cloud IaaS adoption phenomenon, and any individual segment isn’t representative of the total market evolution.
Cloud-savvy application architects don’t do things the same way that they’re done in the traditional enterprise.
Cloud applications assume failure. That is, well-architected cloud applications assume that just about anything can fail. Servers fail. Storage fails. Networks fail. Other application components fail. Cloud applications are designed to be resilient to failure, and they are designed to be robust at the application level rather than at the infrastructure level.
Enterprises, for the most part, design for infrastructure robustness. They build expensive data centers with redundant components. They buy expensive servers with dual-everything in case a component fails. They buy expensive storage and mirror their disks. And then whatever hardware they buy, they need two of. All so the application never has to deal with the failure of the underlying infrastructure.
The cloud philosophy is generally that you buy dirt-cheap things and expect they’ll fail. Since you’re scaling out anyway, you expect to have a bunch of boxes, so that any box failing is not an issue. You protect against data center failure by being in multiple data centers.
Cloud applications assume variable performance. Well-architected cloud applications don’t assume that anything is going to complete in a certain amount of time. The application has to deal with network latencies that might be random, storage latencies that might be random, and compute latencies that might be random. The principle of the distributed application of this sort is that just about anything that you’re talking to can mysteriously drop off the face of the Earth at any point in time, or at least not get back to you for a while.
Here’s where it gets funkier. Even most cloud-savvy architects don’t build applications this way today. This is why people howl about Amazon’s storage back-end for EBS, for instance — they’re used to consistent and reliable storage performance, and EBS isn’t built that way, and most applications are built with the assumption that seemingly local standard I/O is functionally local and therefore is totally reliable and high-performance. This is why people twitch about VM-to-VM latencies, although at least here there’s usually some application robustness (since people are more likely to architect with network issues in mind). This is the kind of problem things like Node.js were created to solve (don’t block on anything, and assume anything can fail), but it’s also a type of thinking that’s brand-new to most application architects.
Performance is actually where the real problems occur when moving applications to the cloud. Most businesses who are moving existing apps can deal with the infrastructure issues — and indeed, many cloud providers (generally the VMware-based ones) use clustering and live migration and so forth to present users with a reliable infrastructure layer. But most existing traditional enterprise apps don’t deal well with variable performance, and that’s a problem that will be much trickier to solve.