VMware takes stake in Terremark
I have been crazily, insanely busy, and my frequency of blog posting has suffered for it. On the plus side, I’ve been busy because a huge number of people — users, vendors, investors — want to talk about cloud.
I’ve seen enough questions about VMware investing $20 million in Terremark that I figured I’d write a quick take, though.
Terremark is a close VMware partner (and their service provider of the year for 2008). Data Return (acquired by Terremark in 2007) was the first to have a significant VMware-based utility hosting offering, dating all the way back to 2005. Terremark has since also gotten good traction with its VMware-based Enterprise Cloud offering, which is a virtual data center service. However, Terremark is not just a hosting/cloud provider; it also does carrier-neutral colocation. It has been sinking capital into data center builds, so an external infusion, particularly one directed specifically at funding the cloud-related engineering efforts, is probably welcome.
Terremark has been the leading-edge service provider for VMware-based on-demand infrastructure. It is to VMware’s advantage to get service providers to use its cutting-edge stuff, particularly the upcoming vCloud, as soon as possible, so giving Terremark money to accelerate its cloud plans is a perfectly good tactical move. I don’t think it’s necessary to read any big strategic message into this investment, although undoubtedly it’s interesting to contemplate.
The cloud computing forecast
John Treadway of Cloud Bzz asked my colleague Ben Pring, at our Outsourcing Summit, about how we derived our cloud forecast. Ben’s answer is apparently causing a bit of concern. I figured it might be useful for me to respond publicly, since I’m one of the authors of the forecast.
The full forecast document (clients only, sorry) contains a lot of different segments, which in turn make up the full market that we’ve termed “cloud computing”. We’ve forecasted each segment, along with subsegments within them. Those segments, and their subsegments, are Business Process Services (cloud-based advertising, e-commerce, HR, payments, and other); Applications (no subcategories; this is “cloud SaaS”); Application Infrastructure (platform and integration); and System Infrastructure (compute, storage, and backup).
Obviously, one can argue whether or not it’s valid to include advertising revenue, but a key point that should not be missed is that in the trend towards the consumerization of IT, it is the advertiser that often implicitly pays for the consumer’s use of an IT service, rather than the consumer himself. Advertising revenue is a significant component of the overall market, part of the “cloud” phenomenon even if you don’t necessarily think of it as “computing”.
Because we offer highly granular breakouts within the forecast, those who are looking for specific details or who wish to classify the market in a particular way should be able to do so. If you want to define cloud computing as just typical notions of PaaS plus IaaS, for instance, you can probably simply take our platform, compute, and storage line-items and add them together.
Is it confusing to see the giant number with advertising included? It can be. I often start off descriptions of our forecast with, “This is a huge number, but you should note that a substantial percentage of these revenues are derived from online advertising,” and then drill down into a forecast for a particular segment or subsegment of audience interest.
Giant numbers can be splashily exciting in conference presentations, but pretty much anyone doing anything practical with the forecast (like trying to figure out their market opportunity) looks at a segment or even a subsegment.
The perils of defaults
A Fortune 1000 technology vendor installed a new IP phone system last year. There was one problem: By IT department policy, that company does not change any defaults associated with hardware or software purchased from a vendor. In this case, the IP phones defaulted to no ring tone. So the phone does not ring audibly when it gets a call. You can imagine just how useful that is. Stunningly, this remains the case months after the initial installation — the company would rather, say, miss customer calls, than change the Holy Defaults.
A software vendor was having an interesting difficulty with a larger customer. The vendor’s configuration file, as shipped with the software, has defaults set up for single-server operation. If you want to run multi-server for high availability or load distribution, you need to change some of the defaults in the configuration file. They encountered a customer with the same kind of “we do not change any defaults” policy. Unsurprisingly, their multi-server deployment was breaking. The vendor’s support explained what was wrong, explained how to fix it, and was confounded by the policy. This is one of the things a custom distribution from the vendor can be used for, of course, but it’s a head-slapping moment and a grotesque waste of everyone’s time.
Now I’m seeing cloud configurations confounding people who have these kinds of policies. What is “default” when you’re picking from drop-down menus? What do you do when the default selection is something other than what you actually need? And the big one: Will running software on cloud infrastructure necessitate violating virgin defaults?
As an analyst, I’m used to delivering carefully nuanced advice based on individual company situations, policies, and needs. But here’s one no-exceptions opinion: “We never ever change vendor defaults” is a universally stupid policy. It is particularly staggeringly dumb in the cloud world, where generally, if you can pick a configuration, it is a supported configuration. And bluntly, in the non-cloud world, configurable parameters are also just that — things that the vendor intends for you to be able to change. There are obviously ways to screw up your configuration, but those parameters are changeable for a reason. Moreover, if you are just using cloud infrastructure but regular software, you should expect that you may need to tune configuration parameters in order to get optimal performance on a shared virtualized environment that your users are accessing remotely (and you may want to change the security parameters, too).
Vendors: Be aware that some companies, even really big successful companies, sometimes have nonsensical, highly rigid policies regarding defaults. Consider the tradeoffs between defaults as a minimalistic set, and defaults as a common-configuration set. Consider offering multiple default “profiles”. Packaging up your software specifically for cloud deployment isn’t a bad idea, either (i.e., “virtual appliances”).
IT management: Your staff really isn’t so stupid that they’re not able to change any defaults without incurring catastrophic risks. If they are, it’s time for some different engineers, not needlessly ironclad policies.
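To make the “multiple default profiles” idea concrete, here is a minimal sketch of what it might look like from the vendor’s side. Everything in it is hypothetical: the profile names, the setting keys, and the build_config helper are invented for illustration and are not taken from any real product. The point is only that a customer can pick a vendor-blessed profile wholesale, and that any override is explicit and checked against known settings.

```python
from typing import Optional

# Hypothetical vendor-shipped default profiles (names and keys are illustrative).
DEFAULT_PROFILES = {
    "single-server": {"cluster_enabled": False, "node_count": 1, "session_store": "local"},
    "multi-server": {"cluster_enabled": True, "node_count": 3, "session_store": "shared"},
}


def build_config(profile: str, overrides: Optional[dict] = None) -> dict:
    """Start from a named default profile, then apply explicit overrides."""
    if profile not in DEFAULT_PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    config = dict(DEFAULT_PROFILES[profile])  # copy so the shipped defaults stay untouched
    overrides = overrides or {}
    unknown = set(overrides) - set(config)
    if unknown:
        # Reject typos and unsupported settings instead of silently accepting them.
        raise KeyError(f"unsupported settings: {sorted(unknown)}")
    config.update(overrides)
    return config
```

A customer with a rigid policy could call build_config("multi-server") and still be running nothing but vendor defaults, while everyone else can pass overrides such as {"node_count": 5}.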
If you worry about hardware, it’s not cloud
If you need more RAM, and you have to call your service provider, they’ve got to order the RAM, wait until they receive it, and then put it in a physical server, before you actually get more memory, and then they bill you on a one-off basis for buying and installing the RAM, you’re not doing cloud computing. If you have to negotiate the price of that RAM each time they buy some, you are really really not doing cloud computing.
I talked to a client yesterday who is in exactly this situation, with a small vendor who calls themselves a cloud computing provider. (I am not going to name names on my blog, in this case.)
Cloud infrastructure services should not be full of one-offs. (The example I cited is merely the worst of the service provider’s offenses against cloud concepts.) It’s reasonable to hybridize cloud solutions with non-cloud solutions, but for basic things (compute cores, RAM, storage, bandwidth), if it’s not on-demand, seamless, and nigh-instant, it’s not cloud, at least not in any reasonable definition of public cloud computing. (“Private cloud”, in the sense of in-house, virtualized data centers, adopts some but not all traits of the public cloud to varying degrees, and therefore gets cut more slack.)
Cloud infrastructure should be a fabric, not individual VMs that are tied to specific physical servers.
Out clauses
I’m seeing an increasing number of IT buyers try to negotiate “out clauses” in their contracts — clauses that let them arbitrarily terminate their services, or which allow them to do so based on certain economy-related business conditions.
People are doing this because they’re afraid of the future. If, for instance, they launch a service and it fails, they don’t want to be stuck in a two-year contract for hosting that service (or colocating that service, or having CDN services for it, etc.). Similarly, if the condition of their business deteriorates, they have an eye on what they can cut in that event.
We’re not talking about businesses that are already on the chopping block — we’re talking about businesses that seem to currently be in good health, whose prospects for growth would seem good. (Businesses that are on the chopping block, or wavering dangerously near it, are behaving in different defensive ways.)
Providers who would previously have never agreed to such conditions are sometimes now willing to negotiate clauses that address these specific fears of businesses. But don’t expect such clauses to be common, especially if the service provider has an up-front capital expenditure (such as equipment for dedicated, non-utility hosting). If you’re trying to negotiate a clause like this, you’re much more likely to have success if you tie it to specific business outcomes that would result in you entirely shutting down whatever it is that you’re outsourcing, rather than trying to negotiate an arbitrary out.
The costs of user-generated content
When I first started this blog, I intended to write more about virtual worlds, following the general theme of massive scalability. In this instance, though, I want to muse upon the balance between maximizing your revenues, and adhering to principle, especially when you’re a public company with shareholders to worry about. Also, this involves the unintended consequences of user-generated content, and there are lessons to be learned here if you’re looking at UGC, whether in your own enterprise or for consumers in general. Similarly, there are perils in any customer-controlled environment. Bear with me, though, because this is long.
Massively multiplayer online games (MMOGs), and MMO roleplaying games (MMORPGs) in particular, all have distinct communities, but each such community is always full of players with conflicting interests. The development studio has to balance their own vision, as well as the sometimes-warring interests of different types of players, and the commercial needs of the game (whether it’s paid for in subscriptions, real-money trade, or other, there has to be revenue), in order to maximize long-term profit. Communities are particularly fragile, and widespread changes can lead to mass exodus, as Sony Online Entertainment discovered with Star Wars: Galaxies, where a thorough and expensive revamp instead caused more than a 50% drop in subscriptions. Players who depart are not individuals — they are part of a community of family, friends, and online acquaintances, and when key players leave, there’s a domino effect.
Enter NCsoft (SEO:036570), and one of its veteran properties, five-year-old City of Heroes. CoH is relatively small fry for NCsoft — it peaked at around 200,000 subscribers, and now has something in the 150,000 range, paying a base of $15/month in subscription fees. NCsoft’s Lineage and Lineage II, by contrast, each have about a million subscribers; for anyone that isn’t Blizzard and the juggernaut that is World of Warcraft, these are impressive numbers, but they’re down hugely from their all-time highs.
CoH currently enjoys a position as the only superhero-themed MMOG out there. However, Champions Online comes out this summer, designed by the same folks who originally created CoH, creating an imminent competitive threat. Paragon Studios (the studio within NCsoft that’s responsible for CoH) chose to do something smart: introduce user-generated content, allowing players to create their own missions (scenarios), complete with fully custom enemies to fight. (As an on-and-off CoH player with what I hope is a creative streak, I find UGC a deeply welcome feature, and lots of people are using it to do very entertaining things.)
As one would expect, players immediately went diligently to work to find ways to hyperoptimize UGC in order to maximize rewards for a given amount of play time. The game’s EULA specifies you’re not allowed to use exploits, but the difficulty created was this: What is an exploit, versus merely unintended levels of reward? There are methods in the game that generate very high rewards per unit time, for instance; UGC simply allowed players to generate optimal situations for themselves. The game’s programmers rapidly closed down some methods, but left other methods live for almost a full month. The hyper-efficient methods were well-known and broadly used by the player base, but the studio was essentially silent, with no communication to customers, other than a request for feedback.
Usually, in a virtual world, when there’s an exploit, the exploiters are limited to a handful of people; players normally know a bug when they see one, like the ability to duplicate a valuable object. This particular case is unusual because it affects a sizable percentage of the player base, and it’s unclear what is and is not an exploit.
Consequently, players have been shocked to see NCsoft announce that they’ve decided to react harshly, stating that players who have “abused” the reward system may lose the rewards they’ve gained, including losing access to the characters used. Since CoH is an MMORPG, characters may represent hundreds, even thousands, of hours of investment, so this is a serious threat. The real-world cash value of optimized characters is significant, too, although such sales and transfers are against the EULA.
It’s an extraordinary choice on NCsoft’s part. Other than the instructions not to “exploit” the system, as well as explicit rules forbidding players from creating exploitative UGC, there was never any warning to customers not to play UGC that might be exploitative, although CoH’s parent studio publicly communicates with customers on a daily basis through the game’s forums. NCsoft has recently been pushing sales of a new boxed set for new players, as well, leading to the high likelihood of inadvertent “abuse” by new players who would not necessarily know that these were exceptional levels of reward for the time.
Losing access to rewards and characters essentially represents nullifying the time investment of players, and the removal of avenues from which to have fun (the character represents the ability to access content). Thus, impacted customers, most of whom subscribe month-to-month, have a very high likelihood of cancelling. This represents a potential direct revenue hit at a time when the game is likely extremely vulnerable to competition, and the aforementioned domino effect of subscriber loss is real and must be considered. Yet, to not do anything is a compromise of principle, and potentially creates a whack-a-mole effect whereby players find new gray areas of high-reward generation and widely use them to gain rewards, while developers try to patch these as quickly as possible. Moreover, because virtual worlds have internal economies, exceptionally fast rewards create imbalances, so they have an impact beyond individual players. (This does not include the impact to “gold farmers” and “power-leveling services”, who offer in-game rewards and powerful characters in exchange for real money, a practice which is against nearly every MMOG’s terms of service, but is nonetheless a significant and growing business. Ironically, making it easier for players to gain quick rewards on their own devalues such services.)
NCsoft is facing the prospect of significant subscriber bleed due to the forthcoming Champions Online, so a decision that increases the likelihood of cancellations is an extraordinarily bold move. It’s unusual for public companies to be willing to choose principle over revenue. Implementing harsh penalties based on clear guidelines, possibly with an automated warning system (i.e., if a player has gotten more than X widgets per Y time, alert him to it), may be advisable, but retroactive imposition of penalties on one’s customer base is another matter. Creating “traps” for bad apples disguised as paying customers is certainly reasonable. Punishing ordinary customers for having done something gray, and which your company has failed to even suggest is black, may be a quick ticket to having to offer unpleasantly complex explanations to your shareholders. Industry-watchers may find the outcome of this to be instructive.
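The “more than X widgets per Y time” warning could be sketched as a simple sliding-window counter. This is purely an illustration of the idea, not anything NCsoft or any game studio is known to run; the class name, the thresholds, and the use of plain numeric timestamps in seconds are all assumptions.

```python
from collections import deque


class RewardRateMonitor:
    """Warn when a player earns more than max_rewards within window_seconds."""

    def __init__(self, max_rewards: int, window_seconds: float):
        self.max_rewards = max_rewards
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, amount) pairs, oldest first
        self._total = 0         # sum of amounts currently inside the window

    def record(self, timestamp: float, amount: int) -> bool:
        """Record a reward; return True if the player should be warned."""
        self._events.append((timestamp, amount))
        self._total += amount
        # Discard rewards that have aged out of the window.
        cutoff = timestamp - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, old_amount = self._events.popleft()
            self._total -= old_amount
        return self._total > self.max_rewards
```

A warning issued at the moment record() returns True tells the customer where the line is before any penalty is on the table, which is the whole point of communicating limits in advance.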
So here are the broader lessons:
A couple of months ago, I wrote about scaling and friendly failure. The same principle applies here: It’s not what the limits are. It’s how well you communicate them to your customers in advance of enforcing them. It applies whether you’re a gaming company, a cloud computing company, a network services provider, or an entirely non-tech company.
If you are providing an environment with user-generated content, expect that it will be abused, sometimes in subtle ways. Even in a corporate environment, there are potentials for abuse, particularly if the company gives employees goals or bonuses to work towards for completing UGC. Human nature being what it is, people optimize; in the work world, they’re careful not to optimize so much that they think they could get fired over it, but again, the boundaries are gray and hazy. Clear communication of what is and isn’t acceptable, in advance, is necessary.
Recent research
I’m at Gartner’s business continuity management summit (BCM2) this week, and my second talk, upcoming later this morning, is on the relevance of colocation and cloud computing (i.e., do-it-yourself external solutions) to disaster recovery.
My recent written research has all been focused on cloud, although plenty of my day-to-day client time has been spent on more traditional services: colocation, data center leasing, managed hosting, CDN services. Yet, cloud remains a persistent hot topic, particularly since it’s now difficult to have a discussion about most of the other areas I cover without also getting into utility/cloud and future data center strategy.
Here’s what I’ve published recently:
How to Select a Cloud Computing Infrastructure Provider. This is a lengthy document that takes you methodically through the selection process of a provider for cloud infrastructure services, and provides an education in the sorts of options that are currently available. There’s an accompanying Toolkit: Comparing Cloud Computing Infrastructure Providers, which is a convenient spreadsheet for collecting all of this data for multiple providers, and scoring each of them according to your needs.
Cool Vendors in Cloud Computing System and Application Infrastructure, 2009. Our Cool Vendors notes highlight small companies that we think are doing something notable. These aren’t vendor recommendations, just a look at things that are interesting in the marketplace. This year’s selections were AppZero, Engine Yard, Enomaly, LongJump, ServePath (GoGrid), Vaultscape, and Voxel. (Note for the cynical: Cool Vendor status can’t be bought, in any way, shape, or form; client status is not a consideration at any point, and these kinds of small vendors often don’t have the money to spend on research anyway.)
Key Issues for Managed and Professional Network Services, 2009. I’m not the primary author for this, but I contributed to the section on cloud-based services. This note is targeted at carriers and other network service providers, providing a broad overview of things they need to be thinking about in the next year.
I’m keeping egregiously busy. I recently did my yearly corporate work plan, showing my productivity metrics. I’ve already done a full year of work, based on our average productivity metrics, and it’s April. That’s the kind of year it’s been. It’s an exciting time in the market, though.
Next round, Akamai vs. Limelight
In CDN news this past weekend, a judge has overturned the jury verdict in the Akamai vs. Limelight patent infringement case. Akamai has said it intends to appeal.
The judge cited Muniauction v. Thomson Corp. as the precedent for a judgement of law, which basically says that if you have a method claim in a patent that involves steps performed by multiple parties, you cannot claim direct infringement unless one party exercises control over the entire process.
I have not read the court filing yet, but based on the citation of precedent, it’s a good guess that because the CDN patent methods generally involve steps beyond the provider’s control, it falls under this citation. Unexpected, at least to me, and for those IP law watchers among you, rather fascinating, since in our increasingly federated, distributed, outsourced IT world, this would seem to raise a host of intellectual property issues for multi-party transactions, which are in some ways inherent to web services.
McKinsey on cloud computing
McKinsey is claiming, in a report called Clearing the Air on Cloud Computing, that cloud infrastructure (specifically Amazon EC2) is as much as 150% more expensive than in-house data center infrastructure (specifically a set of straw-man assumptions given by McKinsey).
In my opinion, McKinsey’s report lacks analytical rigor. They’ve crunched down all data center costs to a “typical” cost of assets, but in reality, these costs vary massively depending upon the size of one’s IT infrastructure. They’ve reduced the cloud to the specific example of Amazon. They seem to have an inconsistent definition of what a compute core actually is. And they’ve simply assumed that cloud infrastructure gets you a 10% labor savings. That’s one heck of an assumption, given that the whole analysis is underpinned by that. The presentation is full of very pretty charts, but they are charts founded on what appears to be a substantial amount of guesswork.
Interestingly, McKinsey also talks about enterprises setting their internal SLAs at 99.99%, vs. Amazon’s 99.95% on EC2. However, most businesses meet those SLAs through luck. Most enterprise data centers have mathematical uptimes below 99.99% (i.e., calculated mean time between failure), and a single server sitting in one of those data centers certainly has a mathematical uptime below that point. There is a vast gulf between engineering for reliability, and just trying to avoid attracting the evil eye. (Of course, sometimes cloud providers die at the hands of their own engineering safeguards.) Everyone wants 99.99% availability — but they often decide against paying for it, once they find out what it actually costs to reliably mathematically achieve it.
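For a sense of what those percentages mean in practice, here is a small sketch of the arithmetic. It uses the standard steady-state formula, availability = MTBF / (MTBF + MTTR); the function names are my own, and the calculation assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Yearly downtime budget, in minutes, implied by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def availability_from_mtbf(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability (percent) from mean time between failures
    and mean time to repair."""
    return 100 * mtbf_hours / (mtbf_hours + mttr_hours)
```

A 99.99% target leaves about 53 minutes of downtime per year, while 99.95% leaves roughly 4.4 hours. Put another way, a system that takes one hour to repair has to run 9,999 hours between failures, which is more than a year, to hit 99.99%.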
In my December note, Dataquest Insight: a Service Provider Roadmap to the Cloud Infrastructure Transformation, I wrote that Gartner’s Key Metrics data for servers (fully-loaded, broken-out costs for running data centers of various sizes) showed that for larger IT infrastructure bases, cloud infrastructure represented a limited cost savings on a TCO basis — but that it was highly compelling for small and mid-sized infrastructures. (Note that business size and infrastructure size don’t correlate; that depends on how heavily the business depends on IT.) Our Key Metrics numbers — a database gathered from examining the costs of thousands of businesses, broken down into hardware, software, data center facilities, labor, and more — show internal costs far higher than McKinsey cites, even for larger, more efficient organizations.
The primary cost savings for cloud infrastructure does not come in the savings on the hard assets. If you do an analysis based on the assumption that this is where it saves you money, your analysis will be flawed. Changing capex to opex, and taking advantage of the greater purchasing power of a cloud provider, can and will drive significant financial benefits for small to mid-size IT organizations that use the cloud. However, a substantial chunk of the benefits come from reducing the labor costs. You cannot analyze the cost of the cloud and simply handwave the labor differences. The labor costs on a per-CPU basis do vary widely as well — for instance, a larger IT organization with substantial automation is going to have much lower per-CPU costs than a small business with a network admin who does everything by hand.
I’ve been planning to publish some research analyzing the cost of cloud infrastructure vs. the internal data center, based on our Key Metrics data. I’ve also been planning to write, along with one of my colleagues with a finance background, an analysis of cloud financial benefits from a cost of capital perspective. I guess I should get on that…
You are not dating your vendor
One of the ongoing refrains of the analyst job is listening to clients gripe, day in and day out, about the things they don’t like about their vendors. Sometimes these things are niggling annoyances. Sometimes, though, these things are rage-inducing, or, in clients who tend to take everything calmly in stride, at least a distinct issue that materially impacts the service that they receive.
Sometimes these issues are recurring problems with a given vendor. I can tell you, for instance, that Vendor X has a process and organizational structure in place which essentially incentivizes its operations staff to kick requests from department to department without anyone being accountable for problems being resolved; unsurprisingly, this results in long resolution times for complex cross-functional issues, and frustrated customers. If you are with Vendor X, it’s something that you have to live with, since Vendor X’s internal politics do not permit fixing the core problem.
Sometimes, however, these issues are out of the ordinary, and would benefit from escalation. However, the majority of the time, the customer has generally not said anything to their provider about the issues they’re having — even if they’re so unhappy they’re planning to leave. Or if they’ve said something, they haven’t escalated into management. They don’t want to rock the boat, or disrupt the “relationship”. They’d rather suffer.
Since I have executive-level contacts at most of the service providers that our clients use, I usually offer to put such clients in touch with someone at their vendor who can see to it that real attention gets paid to the problem. Generally, unless their project is on the brink of failure, clients refuse that offer. Sometimes, they’ll permit me to raise the issue with the vendor, in a more anonymous fashion — i.e., something that doesn’t identify them personally, but which might provide just enough of a hint that the vendor can figure out who it is they ought to be helping.
I don’t get this. You are not dating your vendor. If you wait for them to bring you roses and chocolate, you are going to be disappointed. They will not read your mind, or recognize that you are quietly sulking and waiting for them to notice just how hurt you are and beg you to love them again. You are paying what is sometimes an egregious amount of money for services, and you deserve to get what you’re paying for.
To the vendors who wonder why they get anonymized passed-on complaints from analysts: It’s because analysts can be sort of like a combination of newspaper advice columnists, girl-gossip circles, and therapists. We can only do so much to coax clients into being honest with their vendors.
To the IT buyers out there: When you’re dealing with vendor frustrations, why do you seethe in silence, rather than complaining and escalating?