Google App Engine updates
For those of you who haven’t been following Google’s updates to App Engine, I want to call your attention to a number of recent announcements. At the six-month point of the beta, I asked when App Engine would be enterprise-ready; now, as we approach the one-year mark, these announcements show progress toward, and a roadmap for, addressing many of the issues I raised in my previous post.
Paid usage. Google is now letting applications grow beyond the free limits. You set quotas for various resources, and you pay for what you use. I still have concerns about the quota model, but being able to bill for these services is an important step for Google. Google intends to be price-competitive with Amazon, but there’s an important difference — there’s still some free service. Google anticipates that the free quotas are enough to serve about five million page views per month. That’s a lot; it pretty much means that if you’re willing to write to the platform, you can easily host your hobby project on it for free. For that matter, many enterprises don’t get five million hits on an individual Web app or site each month — it’s just that the platform restrictions are a barrier to mainstream adoption.
Less aggressive limits and fewer restrictions. Google has removed or reduced some limits and restrictions that were significant frustrations for developers.
Promised new features. Google has announced that it’s going to provide APIs for some vital bits of functionality that it doesn’t currently allow, like the ability to run scheduled jobs and background processes.
Release of Python 3.0. While there’s no word on how Google plans to manage the 3.0 transition for App Engine, it’s interesting to see how many Python contributors have been absorbed into Google.
Speaking personally, I like App Engine. Python is my strongest scripting language skill, so I prefer to write in it whenever possible. I also like Django, though I appreciate that Google’s framework is easier to get started with than Django (it’s very easy to crank out basic stuff). Like a lot of people, I’ve had trouble adjusting to the non-relational database, but that’s mostly a matter of programming practice. It is, however, clear that the platform is still in its early stages. (I once spent several hours of a weekend tearing my hair out at something that didn’t work, only to eventually find that it was a known bug in the engine.) But Google continues to work at improving it, and it’s worth keeping an eye on to see what it will eventually become. Just don’t expect it to be enterprise-ready this year.
Amazon announces reserved instances
Amazon’s announcement du jour is “reserved instances” for EC2.
Basically, with a reserved instance, you pay an up-front non-refundable fee for a one-year term or a three-year term. That buys you a discount on the usage fee for that instance, during that period of time. Reserved instances are only available for Unix flavors (i.e., no Windows) and, at present, only in the US availability zones.
Let’s do some math to see what the cost savings turn out to be.
An Amazon small instance (1 virtual core equivalent to a 1.0-1.2 GHz 2007 Opteron or Xeon) is normally $0.10 per hour. Assuming 720 hours in a month, that’s $72 a month, or $864 per year, if you run that instance full-time.
Under the reserved-instance pricing scheme, you pay $325 for a one-year term, then $0.03 per hour. That usage works out to about $21.60 per month, or roughly $259 per year. Add in the reservation fee and you’re at about $584 for the year, averaging out to roughly $49 per month — a pretty nice cost savings.
On a three-year basis, unreserved would cost you $2,592; reserved, running full-time, is a $500 one-time fee plus usage, for a grand total of about $1,277. That’s a big savings over the base price, averaging out to roughly $35 per month.
This is important because at the unreserved prices, on a three-year cash basis, it’s cheaper to just buy your own servers. At the reserved price, does that equation change?
Well, let’s see. Today, in a Dell PowerEdge R900 (a reasonably popular server for virtualized infrastructure), I can get a four-socket server populated with quad-cores for around $15,000. That’s sixteen Xeon cores clocking at more than 2 GHz. Call it $1,000 per modern core; split up over a three-year period, that’s about $28 per month. Cheaper than the reserved price, and much less than the unreserved price.
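If you want to fiddle with the assumptions (hours per month, utilization, cost per core), here’s a minimal Python sketch of the arithmetic above. The instance rates are Amazon’s published prices at the time of writing; the $1,000-per-core server figure is the rough estimate from the preceding paragraph.

```python
# Back-of-the-envelope comparison of the three options discussed above,
# on a three-year, hardware-only cash basis (not a TCO model).

HOURS_PER_MONTH = 720
MONTHS = 36  # three-year horizon

def on_demand(rate_per_hour=0.10):
    """Unreserved EC2 small instance, running full-time."""
    return rate_per_hour * HOURS_PER_MONTH * MONTHS

def reserved(upfront=500.00, rate_per_hour=0.03):
    """Three-year reserved EC2 small instance, running full-time."""
    return upfront + rate_per_hour * HOURS_PER_MONTH * MONTHS

def owned_core(cost_per_core=1000.00):
    """One core of a purchased server, hardware cost only."""
    return cost_per_core

for label, total in [("on-demand", on_demand()),
                     ("reserved", reserved()),
                     ("owned core", owned_core())]:
    print(f"{label:>10}: ${total:,.0f} total, about ${total / MONTHS:,.0f}/month")

# Prints roughly: $2,592 total / $72 a month (on-demand), $1,278 total /
# $35 a month (reserved), and $1,000 total / $28 a month (one owned core).
```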
Now, this is a crude, hardware-only, three-year cash calculation, of course, and not a TCO calculation. But it shows that if you plan to run your servers full-time on Amazon, it’s not as cheap as “it’s just three cents an hour!” makes it sound.
Launch of Cotendo, a new CDN / ADN
Cotendo, a new CDN backed by VC heavyweights Sequoia Capital and Benchmark Capital, has launched. The technical founders are ex-Commtouch; the VPs of Ops and Marketing are ex-Limelight. Cotendo is positioning itself as a software company (rather than an infrastructure company, per the market shift I blogged about a few months ago), but it’s not a software pure-play — it’s got the usual megaPOP-model deployment. However, they’re taking more of a fourth-generation approach.
Three things make this launch notable — an ADN service similar to Akamai’s (thus breaking the monopoly Akamai has had since the Netli acquisition), a global load-balancing solution beefed up into an arbitrage service (for multiple delivery resources), and real-time analytics. Plus, all of us CDN-watchers can experience a wry sense of relief to see that Cotendo, unlike practically every other CDN to launch in the last two years, is not focused on video.
Again, I apologize for what is essentially a news blurb, but since I expect this is going to be a significant subject of client inquiry, I shouldn’t be giving away analysis on my blog. Gartner’s Invest clients are going to ask what this means in the e-commerce/enterprise space, and our mid-market IT buyer clients will want to know what it means for their options. As usual, I’m happy to take inquiry. Also, more information about this will be going out in a research note.
Fourth-generation CDNs and the launch of Conviva
First-generation CDNs use a highly distributed edge model, and include companies like Akamai and Sandpiper Networks (whose acquisition chain goes Digital Island, Exodus, Savvis, Level 3).
Second-generation CDNs basically try to achieve most of the performance of a first-generation CDN without needing hundreds of POPs, aiming for just a few dozen locations. Speedera (eventually acquired by Akamai) is the best example of a CDN of this type.
Third-generation CDNs follow a megaPOP model — two or three dozen huge points of presence, which they hope will be highly peered. Limelight, VitalStream (acquired by Internap), and the new entrants of the past two years are pretty much all megaPOP CDNs.
Fourth-generation CDNs are very different. They are a shift towards a more software-oriented model, and thus, these companies own limited (or even no) delivery assets themselves. Some of these are not (and will not be) so much CDNs themselves, as platforms that reside in the CDN ecosystem, or CDN enablers. Velocix (for their Metro product) and MediaMelon both reside in the fourth-generation space.
That gets us to the morning’s interesting announcement.
Conviva has come out of stealth mode with a powerhouse customer announcement — NBC Universal. Conviva is not a CDN in the traditional sense, but they’re part of the ecosystem for Internet video. Rather than owning delivery assets themselves, they’ve got a pure-play SaaS solution — a platform that can arbitrage resources from multiple content sources (multiple CDNs, data centers, etc.), as well as offer value-added services like real-time analytics and integration capabilities across those multiple sources. (From an ecosystem perspective, the closest analogue is probably Move Networks.)
What makes Conviva immediately notable is their ability to monitor the performance of every individual delivery in real time, and to seamlessly switch sources midway through playing a video, driven by metrics and business rules. That lets the customer deliver consistently good-enough performance (i.e., a target of no buffering or other degradation) at the lowest price point: cost-arbitraged QoS, in other words.
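Conviva hasn’t published its implementation details, but conceptually the decision logic works something like the toy Python sketch below: among the candidate sources, pick the cheapest one whose real-time metrics meet the quality target, and re-evaluate as the stream plays. The source names, metrics, and thresholds are hypothetical illustrations, not Conviva’s actual rules.

```python
# Illustrative only: a toy "pick the cheapest source that meets the quality
# bar" rule, re-evaluated while the stream plays. Sources, metrics, and
# thresholds are hypothetical; Conviva's actual business rules are not public.

from dataclasses import dataclass

@dataclass
class Source:
    name: str
    cost_per_gb: float      # what this CDN / data center charges
    rebuffer_ratio: float   # observed fraction of playback time spent buffering
    throughput_kbps: float  # observed delivery throughput to this viewer

def pick_source(sources, bitrate_kbps, max_rebuffer=0.0, headroom=1.2):
    """Cheapest source currently meeting the quality target for this stream."""
    good_enough = [s for s in sources
                   if s.rebuffer_ratio <= max_rebuffer
                   and s.throughput_kbps >= bitrate_kbps * headroom]
    if not good_enough:
        # No source meets the bar; fall back to the best-performing one.
        return max(sources, key=lambda s: s.throughput_kbps)
    return min(good_enough, key=lambda s: s.cost_per_gb)

# Re-run pick_source() every few seconds with fresh per-viewer measurements;
# if the answer changes, switch the stream over to the new source.
sources = [Source("cdn-a", 0.25, 0.00, 2400),
           Source("cdn-b", 0.15, 0.00, 1900),
           Source("origin-dc", 0.05, 0.02, 900)]
print(pick_source(sources, bitrate_kbps=1200).name)  # -> cdn-b
```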
I’ve been writing about the customer desire for control and the rise of the fourth-generation software “CDN” since last year. Conviva takes full advantage of the overlay model. I’d rate the significance of this launch on par with that of Netli’s (back in 2003), although obviously in a very different way.
Because it’s a particularly important launch, I know it’s going to be of substantial interest to Gartner’s Invest clients, and likely of significant interest to our media and telecommunications industry clients. As such, I’m refraining from blogging a detailed description or analysis of the company’s technology and strategy, its likely impact on the rest of the video delivery ecosystem (which goes beyond the CDNs themselves), and the more general impact of the conceptual shift that’s taking place with fourth-generation CDNs. If you have inquiry access, please feel free to use it. A note to clients will be published soon.
(Disclaimer: I was pre-briefed on this, and I am quoted in Conviva’s press release. As I almost always do, I wrote my own quote, rather than letting words be put in my mouth. As with all Gartner quotes in press releases, it is a statement about the market, and no endorsement of the vendor is implied.)
Smoke-and-mirrors and cloud software
On my more cynical, read-too-many-press-releases days, I wonder if there’s some hapless, tortured PR gnome at Amazon whose job consists solely of vetting one empty cloud fluff piece after another, proclaiming how such-and-such a vendor is now offering deployments on EC2, and how this therefore gives them an on-demand cloud offering (“please think of me as hip and visionary!”), when in reality, the vendor is doing nothing other than packaging up its software as an AMI (an Amazon machine image, basically a disk image of a server with the application installed).
Packaging something up as an AMI doesn’t make it a cloud service. It doesn’t make it massively scalable, automatically scalable, transparently scalable, on-demand, multi-tenant, or any one of a vast number of other terms that get fatuously lavished on anything with a whiff of cloudiness. If a piece of software doesn’t have any cloud traits when it’s deployed in your data center, it won’t have them when it’s deployed on EC2 (or any other cloud infrastructure service), either.
Cloud infrastructure services today, whether EC2 or from one of Amazon’s competitors, are basically servers in the sky. They are almost always a hypervisor-virtualized server with a normal operating system installation, on top of which you install normal applications. There is no magic cloud pixie dust that settles on these instances and turns them into application faeries of scalability and joy.
Building massively and horizontally scalable, multi-tenant software with elastic economics is hard. It’s even harder if you’re trying to take some legacy software package and re-engineer it. This is why practically no one does that kind of re-engineering, and why software vendors have to resort to puffed-up “yes, we run on EC2!” claims, rather than genuinely delivering on-demand cloud services.
Don’t be fooled.
Marketing and PR folks at software vendors: I forgive you for these releases because I know you’re under pressure to put something out, but every time I read them, I cringe on your behalf, and hope that you’re not genuinely entertaining the belief that releasing an AMI meaningfully moves you forward along the cloud path.
IT folks: When your CEO / CFO / CIO comes to you and asks why you aren’t taking advantage of your software vendor’s awesome new money-saving cloud service, you can tell him it’s because the press release is just artfully painting a unicorn — a mythical beast that everyone talks about but that doesn’t actually exist.
TCO tool for cloud computing
Gartner clients might be interested in my just-published piece of research, which is a TCO toolkit for comparing the cost of internal and cloud infrastructure.
On a different note, here’s a link that isn’t new, but that I nonetheless want to draw people’s attention to as much as possible: Yahoo’s best practices for speeding up your web site is a superb list of clearly articulated tips for improving your site’s performance and the user’s perception of performance (which goes beyond raw site performance). Recommended reading for everyone from the serious Web developer to the guy just throwing some HTML up for his personal pages.
On the similarly not-new but still-interesting front, Voxel’s open-source mod_cdn module for Apache is a cool little bit of code that makes it easy to CDN-ify your site — install the module and it’ll automatically transform your links to static content. For those of you who are dealing with a CDN that doesn’t provide CNAME support (like the Rackspace/Limelight combo), who are using Apache as your origin front-end, and who don’t want to fool with mod_rewrite, this might be an interesting alternative.
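To make concrete what that kind of module automates, here’s a rough Python sketch of the transformation: rewrite the URLs of static assets so they point at the hostname your CDN assigns you instead of at your origin. This is an illustration of the concept only, not mod_cdn’s actual code or configuration, and the hostname and extensions are made up.

```python
# Illustrative sketch of what a module like mod_cdn automates: rewriting
# static-asset URLs so they point at the CDN-provided hostname.
# Hostname and file extensions here are made up for the example.

import re

CDN_HOST = "http://c0012345.cdn.example.net"   # hostname your CDN assigns you
STATIC = re.compile(r'(src|href)="(/[^"]+\.(?:css|js|png|jpe?g|gif))"')

def cdnify(html: str) -> str:
    """Point relative links to static content at the CDN hostname."""
    return STATIC.sub(lambda m: f'{m.group(1)}="{CDN_HOST}{m.group(2)}"', html)

page = '<img src="/images/logo.png"><link rel="stylesheet" href="/css/site.css">'
print(cdnify(page))
# Both links now point at http://c0012345.cdn.example.net/... instead of the origin.
```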
Billing is hard?
Spreading a little linkage…
A blog post from Reuven Cohen of Enomaly offers some musings on billing, metering, and measuring the cloud, and specifically on Amazon’s current inability to offer real-time billing-related reporting for its cloud services.
A blog post from James Hamilton, of the Microsoft Windows Live platform team, provides some brief thoughts on the fact that service billing is hard. The comments thread holds a few things of interest, too.
There’s more to cloud computing than Amazon
In dozens of client conversations, I keep encountering companies — both IT buyers and vendors — who seem to believe that Amazon’s EC2 platform is the be-all and end-all of the state of the art in cloud computing today. In short, they believe that if you can’t get it on EC2, there’s no cloud platform that can offer it to you. (I saw a blog post recently called “Why, right now, is Amazon the only game in town?” that exemplifies this stance.)
For better or for worse, this is simply not the case. While Amazon’s EC2 platform (and the rest of AWS) is a fantastic technical achievement, and it has demonstrated that it scales well and has a vast amount of spare capacity available on demand, as it stands, it’s got some showstoppers for many mainstream adopters. But the rest of the market can fill many of the needs it doesn’t, such as:
- Not having to make any changes to applications.
- Non-public-Internet connectivity options.
- High-performance, reliable storage with managed off-site backups.
- Hybridization with dedicated or colocated equipment.
- Meeting compliance and audit requirements.
- Real-time visibility into usage and billing.
- Enterprise-class customer support and managed services.
There are tons of providers who would be happy to sell some or all of that to you — newer names to most people, like GoGrid and SoftLayer, as well as familiar enterprise hosting names like AT&T, Savvis, and Terremark. Even your ostensibly stodgy IT outsourcers are starting to get into this game, although the boundaries of what’s a public cloud service and what’s an outsourced private one start to get blurry.
If you’ve got to suddenly turn up four thousand servers to handle a flash crowd, you’re going to need Amazon. But if you’re like most mainstream businesses looking at cloud today, you’ve got a cash crunch you’ve got to get through, you’re deploying at most dozens of servers this year, and you’re not putting up and tearing down servers hour by hour. Don’t get fooled into thinking that Amazon’s the only possible option for you. It’s just one of many. Every cloud infrastructure services platform is better for some needs than others.
(Gartner clients interested in learning more about Amazon’s EC2 platform should read my note “Is Amazon EC2 Right For You?”. Those wanting to know more about S3 should read “A Look at Amazon’s S3 Cloud-Computing Storage Service”, authored by my colleagues Stan Zaffos and Ray Paquet.)
Amazon’s Kindle
I am a big fan of Amazon’s Kindle. My husband gave me one as a gift when they first came out, and I went from taking it on the road to read the occasional thing as a supplement to the paperbacks and magazines I was going through, to making it my sole form of reading material while out on travel, to wanting to read just about everything on it, period. The ability to change the font size, essentially allowing me to read every book as if it were in large print, is a big reason why — bigger print is easier to read when you’re in something that’s moving, since it creates less eyestrain.
I consume an enormous number of books (around a book a day if I’m traveling, and around half that if I’m not). Books are one of the most significant expenses in my household; my husband and I are both voracious consumers of fiction and non-fiction, and we mostly read different books. Kindle helps me spend a lot less on books, sort of — I pay less for the individual books, but because of the convenience, I also read even more than I normally would. And whereas I often used to wait for the paperback, now I buy books as soon as they come out in Kindle form. Plus, while business books are often grotesquely expensive for relatively limited value, especially when they’re in hardback, at Kindle prices, I don’t mind buying a book for the one cool idea in it, instead of standing around in the bookstore, flipping pages. Finally, rather than buying a ton of books that accumulate in piles and sometimes eventually disappear onto the shelves before I actually read them, I read every item I download onto my Kindle.
New York Times reviewer David Pogue understands the Kindle. But opinion columnist Roy Blount totally fails to get it, using the NYT megaphone to whine that the text-to-speech function potentially steals money from authors who would otherwise be able to sell audiobooks.
Seth Godin loves his Kindle. And he has a bunch of great suggestions for taking the Kindle service to the next level. Among other things, he points out that authors need to embrace these new models as a source for lots of new forms of revenue generation, rather than obtusely trying to cling to the way things are.
You can fear the future, or you can think different and embrace it. Devices like the Kindle open up a wealth of opportunities to authors who are willing to seize them.
Cloud failures
A few days ago, an unexpected side-effect of some new code caused a major Gmail outage. Last year, a small bug triggered a series of cascading failures that resulted in a major Amazon outage. These are not the first cloud failures, nor will they be the last.
Cloud failures are as complex as the underlying software that powers the cloud. No longer do you have isolated systems; you have complex, interwoven ecosystems, delicately orchestrated by a swarm of software programs. In presenting simplicity to the user, the cloud provider takes on the burden of dealing with that complexity itself.
People sometimes say that these clouds aren’t built to enterprise standards. In one sense, they aren’t — most aren’t intended to meet enterprise requirements in terms of feature set. In another sense, though, they are engineered to far exceed anything that the enterprise would ever think of attempting itself. Massive-scale clouds are designed to never, ever, fail in a user-visible way. The fact that they do fail nonetheless should not be a surprise, given the potential for human error encoded in software. It is, in fact, surprising that they don’t visibly fail more often. Every day, within these clouds, a whole host of small errors that would be outages if they occurred within the enterprise — server hardware failures, storage failures, network failures, even some software failures — are handled invisibly by the back-end. Most of the time, the self-healing works the way it’s supposed to. Sometimes it doesn’t. The irony in both the Gmail outage and the Amazon S3 outage is that both appear to have been caused by the very software components that were actively trying to create resiliency.
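As a caricature of that back-end automation (purely an illustration, not how Google or Amazon actually build their control planes), here’s a toy Python control loop that health-checks a fleet and replaces failed nodes. It also hints at the irony above: a bug in the checker or in the replacement logic turns the “healing” machinery itself into the outage.

```python
# Caricature of the self-healing automation described above, purely for
# illustration; real cloud control planes are vastly more sophisticated.
# The failure mode hinted at in the text: if health_check() develops a bug
# and starts reporting healthy nodes as dead, the "healing" loop itself
# would churn through and disrupt the fleet.

import time

def health_check(node):
    """Placeholder probe; a real one would check the node over the network."""
    return node.get("healthy", True)

def replace(node, fleet):
    """Tear down a failed node and provision a fresh one in its place."""
    fleet.remove(node)
    fleet.append({"id": f"{node['id']}-replacement", "healthy": True})

def control_loop(fleet, interval_seconds=30, iterations=1):
    for _ in range(iterations):
        for node in list(fleet):
            if not health_check(node):
                replace(node, fleet)   # invisible to users -- most of the time
        time.sleep(interval_seconds)

fleet = [{"id": "web-1"}, {"id": "web-2", "healthy": False}]
control_loop(fleet, interval_seconds=0)
print([n["id"] for n in fleet])  # ['web-1', 'web-2-replacement']
```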
To run infrastructure on a massive scale, you are utterly dependent upon automation. Automation, in turn, depends on software, and no matter how intensively you QA your software, you will have bugs. It is extremely hard to test complex multi-factor failures. There is nothing that indicates that either Google or Amazon are careless about their software development processes or their safeguards against failure. They undoubtedly hate failure as much as, and possibly more than, their customers do. Every failure means sleepless nights, painful internal post-mortems, lost revenue, angry partners, and embarrassing press. I believe that these companies do, in fact, diligently seek to seamlessly handle every error condition they can, and that they generally possess sufficient quantity and quality of engineering talent to do it well.
But the nature of the cloud — the one homogeneous fabric — magnifies problems. Still, that’s not isolated to the cloud alone. Let’s not forget VMware’s license bug from last year. People who normally booted up their VMs at the beginning of the day were pretty much screwed. It took VMware the better part of a day to produce a patch — and their original announced timeframe was 36 hours. I’m not picking on VMware — certainly you could find yourself with a similar problem with any kind of widely deployed software that was vulnerable to a bug that caused it all to fail.
Enterprise-quality software produced the SQL Slammer worm, after all. In the cloud, we ain’t seen nothing yet…