Blog Archives
The hardware-vendor cloud
Are hardware vendors the natural winners in cloud infrastructure services?
I don’t think so. Having the lowest cost of server hardware simply isn’t enough. The failure of both Dell and Intel to build successful hosting businesses ought to make that clear. (Dell sold to Sprint, and Intel sold to Savvis, at a significant loss.) Hosting is not an exact parallel, but it’s close enough to look to it for hard-won lessons. Like hosting, cloud infrastructure services are not and will not be just about the lowest costs possible. Cloud will still also be about service, support, and features.
Cloud tidbits in the press
A few tidbits of cloud computing in the press…
Sun announces plans for a cloud computing service. No details until March, other than a ZDNet interview comment about delivering SaaS and SaaS infrastructure.
Rackspace puts its own spin on cloud. Rackspace now has a “cloud 101” page, most interesting for the results of a recent survey that it commissioned on business awareness of “cloud hosting”. No clear definition is provided for that term, though.
Two IBM scientists write a cloud computing article in Dr. Dobb’s. This article made me blink a great deal, starting from the statement, “Currently, you can create cloud applications through two major implementations: Amazon Web Services (AWS) and Google Application Engine (GAE).” The reduction of the richness of the cloud infrastructure services space to just these two names is a little mind-boggling, and the article that follows is nearly a marketing glossy — accurate in its superficial overview, but offering nearly no information of actual usefulness to an engineer. I hope for better out of technical journals.
Seven years to SEAP, not to cloud in general
Gartner recently put out a press release titled “Gartner Says Cloud Application Infrastructure Technologies Need Seven Years to Mature”, based on a report from my colleague Mark Driver. That’s gotten a bunch of pickup in the press and in the blogosphere. I’ve read a lot of people commenting about how the timeline given seems surprisingly conservative, and I suspect it’s part of what has annoyed Reuven Cohen into posting, “Cloud computing is for everyone — except stupid people.”
The confusion, I think, is over what the timeline actually covers. Mark is talking specifically about service-enabled application platforms (SEAPs), not cloud computing in general. Basically, a SEAP is a foundation platform for software as a service. Examples of current-generation SEAP platforms are Google App Engine, Microsoft Azure, the Facebook application platform, Coghead, and Bungee Labs. (Gartner clients who want to drill into SEAP, see The Impact of SaaS on Application Servers and Platforms.) When you’re talking about SEAP adoption, you’re talking about something pretty complex, on a very different timeframe than the evolution of the broader cloud computing style.
Cloud computing in general already has substantial business uptake, with potential radical acceleration due to the economic downturn. I say “potential” because it’s very clear to me that existing public cloud services, at their current state of maturity, frequently don’t meet the requirements that enterprises are looking for right now. I have far more clients suddenly willing to consider taking even big risks to leap into the cloud, than I have clients who actually have projects well-suited to the public cloud and who will realize substantial immediate cost savings from that move.
On the flip side, for those who have public-facing Web infrastructure, cloud services are now a no-brainer. Expect cloud elasticity and fast provisioning to simply become part of hosting and data center outsourcing solutions. Traditional hosting providers who don’t make the transition near-immediately are going to get eaten alive.
The turf war of unified computing
The New York Times article about Cisco getting into the server market is a very interesting read, as is Cisco’s own blog post announcing something they call “Unified Computing”.
My colleague Tom Bittman has a thoughtful blog post on the topic, writing, “What is apparent is that the comfortable sandboxes in which different IT vendors sat are shattering.” Those shifts demand that computing become a much more flexible, unified fabric.
Tom ruminates on the vendors, but setting aside any opinion of Cisco’s approach (or any other vendor making a unified computing attempt), my mind goes to the people — specifically, the way that a unified approach impacts IT operations personnel, and the way that these engineers can help or hinder adoption of unified data center technologies.
Unified computing — unified management of compute, storage, and network elements — is not just going to shape up to be a clash between vendors. It’s going to become a turf war between systems administrators and network engineers. Today, computing and storage are classically the domain of the former, and the WAN the domain of the latter. The LAN might go either way, but the bigger the organization, the more likely it goes to the network guys. And devices like application delivery controllers fall into an uncomfortable in-between, but in most organizations, one group or the other takes them into their domain. The dispute over devices like that serves as the warning shot in this war, I think. (An ADC is a network element, but it is often closely associated with servers; it is usually an appliance, i.e. a purpose-built server, but its administration more closely resembles a network device than a server.) The more a given technology crosses turf lines, the greater the dispute over who manages it, whose budget it comes out of, etc.
(Yes, I really did make a lolcat just for this post.)
He who controls the entire enchilada — the management platform — is king of the data center. There will be personnel who are empire-builders, seeking to use platform control to assert dominance over more turf. And there will be personnel who try to push away everything that is not already their turf, trying to avoid more work piling up on their plate.
Unification is probably inevitable. We’ve seen this human drama play out once this past decade already — the WAN guys generally triumphed over the telephony guys in the VoIP convergence. But my personal opinion is that it’s the systems guys, not the network guys, who will be most likely to triumph in the unified-platform wars. In most organizations, systems guys significantly outnumber the network guys, and they tend to have a lot more clout, especially as you go up the management chain. Internal politics and whose vendor influence triumphs may turn out to influence solution selection as much as, or more than, the actual objective quality of the solutions themselves.
COBOL comes to the cloud
In this year of super-tight IT budgets and focus on stretching what you’ve got rather than replacing it with something new, Micro Focus is bringing COBOL to the cloud.
Most vendor “support for EC2” announcements are nothing more than hype. Amazon’s EC2 is a Xen-virtualized environment. It supports the operating systems that run in that environment; most customers use Linux. Applications run no differently there than they do in your own internal data center. There’s no magical conveyance of cloud traits. Same old app, externally hosted in an environment with some restrictions.
But Micro Focus (which is focused around COBOL-based products) is actually launching its own cloud service, built on top of partner clouds — EC2, as well as Microsoft’s Azure (previously announced).
Micro Focus has also said it has tweaked its runtime for cloud deployment. They give the example of storing VSAM files as blobs in SQL. This is undoubtedly due to Azure not offering direct access to the filesystem. (For EC2, you can get persistent normal file storage with EBS, but there are restrictions.) I assume that similar tweaks were made wherever the runtime needs to do direct file I/O. Note that this still doesn’t magically convey cloud traits, though.
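To make the flavor of that tweak concrete, the general pattern (keyed, record-oriented file I/O re-pointed at a database) can be sketched as follows. This uses Python and sqlite3 purely for illustration; Micro Focus hasn’t published its implementation, so the schema and helper names here are all assumptions, not their actual design.

```python
import sqlite3

# Illustrative sketch only: emulate keyed, record-style file storage on top
# of a SQL database, the general pattern behind storing VSAM files as blobs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vsam_store (dataset TEXT, record_key TEXT, record BLOB, "
    "PRIMARY KEY (dataset, record_key))"
)

def write_record(dataset, key, data):
    # REPLACE mimics a keyed (KSDS-style) write: insert or update by key.
    conn.execute(
        "REPLACE INTO vsam_store (dataset, record_key, record) VALUES (?, ?, ?)",
        (dataset, key, data),
    )

def read_record(dataset, key):
    row = conn.execute(
        "SELECT record FROM vsam_store WHERE dataset = ? AND record_key = ?",
        (dataset, key),
    ).fetchone()
    return row[0] if row else None

write_record("CUSTFILE", "00042", b"ACME CORP  ...fixed-width record...")
```

The point of the pattern is that the runtime’s file I/O layer reads and writes database rows instead of touching a filesystem, which is exactly what a platform without direct filesystem access forces on you.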
It’s interesting to see that Micro Focus has built its own management console around EC2, providing easy deployment of apps based on their technology, and is apparently making a commitment to providing this kind of hosted environment. Amidst all of the burgeoning interest in next-generation technologies, it’s useful to remember that most enterprises have a heavy burden of legacy technologies.
(Disclaimer: My husband was founder and CTO of LegacyJ, a Micro Focus competitor, whose products allow COBOL, including CICS apps, to be deployed within standard J2EE environments — which would include clouds. He doesn’t work there any longer, but I figured I should note the personal interest.)
More cloud news
Another news round-up, themed around “competitive fracas”.
Joyent buys Reasonably Smart. Cloud hoster Joyent has picked up Reasonably Smart, a tiny start-up with an APaaS offering based, unusually enough, on JavaScript and the Git version-control system. GigaOM has an analysis; I’ll probably post my take later, once I get a better idea of exactly what Reasonably Smart does.
DreamHost offers free hosting. DreamHost — one of the more prominent, popular mass-market and SMB hosting providers — is now offering free hosting for certain applications, including WordPress, Drupal, MediaWiki, and phpBB. There are a limited number of beta invites out there, and DreamHost notes that the service may become $50/year later. (The normal DreamHost base plan is $6/month.) Increasingly, shared hosting companies are having to compete with free application-specific hosting services like WordPress.com and Wikidot, and they’re facing the looming spectre of giants like Google giving away cloud capacity for free. And shared hosting is a cutthroat market already. So, here’s another marketing salvo being fired.
Google goes after Microsoft. Google has announced it’s hiring a sales force to pitch the Premier Edition of Google Apps to customers who are traditionally Microsoft customers. I’d expect the two key spaces where they’ll compete are email and collaboration, going after the Exchange and SharePoint base.
Cloud debate: GUI vs. CLI and API
In the greater blogosphere, as well as amongst the cloud analysts across the various research firms, there’s been an ongoing debate over the question, “Does a cloud have to have an API to be a cloud?”
Going beyond that question, though, there are two camps of cloud users emerging — those who prefer the GUI (control panel) approach to controlling their cloud, and those that prefer command-line interfaces and/or APIs. These two camps can probably be classified into the automated and the automators — those users who want easy access to pre-packaged automation, and those users who want to write automation of their own.
This distinction has long existed in the systems administration community — the split between those who rely on administrator GUIs to do things, vs. those who do everything via the command line, editing config files, and their own scripts. But the advent of cloud computing and associated tools, with their relentless drive towards standardization and automation, is casting these preferences into an increasingly stark light. Moreover, the emerging body of highly sophisticated commercial tools for cloud management (virtual data center orchestration and everything that surrounds it) means that in the future, even the more sophisticated IT operations folks who are normally self-reliant will end up taking advantage of those tools rather than writing stuff from scratch. That suggests that tools will also follow two paths — there will be tools that are designed to be customized via GUI, and tools that are readily decomposable into scriptable components and/or provide APIs.
I’ve previously asserted that cloud drives a skills shift in IT operations personnel, creating a major skills chasm between those who use tools, and those who write tools.
The emerging cloud infrastructure services seem to be pursuing one of two initial paths: exposure via an API, and thus high scriptability for the knowledgeable (e.g., Amazon Web Services), or a friendly control panel (e.g., Rackspace’s Mosso). While I’d expect that most public clouds will eventually offer both, I expect that both services and do-it-yourself cloud software will tend to emphasize capabilities one way or the other, focusing on either the point-and-click crowd or the systems programmers.
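For the automator camp, “exposed via an API” meant, concretely, something like Amazon’s query API: every request is an HTTP GET whose parameters are canonicalized and signed. A rough sketch in Python of building such a request; the credentials are fake, and the parameter spellings approximate the Signature Version 2 scheme of that era, so treat the details as assumptions rather than a reference.

```python
import base64
import hashlib
import hmac
import time
import urllib.parse

# Fake credentials, for illustration only.
ACCESS_KEY = "AKIDEXAMPLE"
SECRET_KEY = "secret"

def signed_query(action, params, host="ec2.amazonaws.com"):
    """Build a signed EC2-style query string (Signature Version 2 flavor)."""
    p = dict(params)
    p.update({
        "Action": action,
        "AWSAccessKeyId": ACCESS_KEY,
        "SignatureMethod": "HmacSHA256",
        "SignatureVersion": "2",
        "Version": "2008-12-01",
        "Timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    })
    # Canonicalize: sort the parameters and percent-encode keys and values.
    canonical = "&".join(
        f"{urllib.parse.quote(k, safe='')}={urllib.parse.quote(str(v), safe='')}"
        for k, v in sorted(p.items())
    )
    to_sign = "\n".join(["GET", host, "/", canonical])
    sig = base64.b64encode(
        hmac.new(SECRET_KEY.encode(), to_sign.encode(), hashlib.sha256).digest()
    ).decode()
    return canonical + "&Signature=" + urllib.parse.quote(sig, safe="")

query = signed_query("DescribeInstances", {})
```

Once you can sign requests, everything the control panel does becomes a scriptable primitive, which is precisely why this camp cares so much about the API’s existence.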
(A humorous take on this, via an old Craigslist posting: Keep the shell people alive.)
Touring Amazon’s management console
The newly-released beta of Amazon’s management console is reasonably friendly, but it is not going to let your grandma run her own data center.
I took a bit of a tour today. I’m running Firefox 3 on a Windows laptop, but everything else I’m doing out of a Unix shell — I have Linux and MacOS X servers at home. I already had AWS stuff set up prior to trying this out; I’ve previously used RightScale to get a Web interface to AWS.
The EC2 dashboard starts with a big friendly “Launch instances” button. Click it, and it takes you to a three-tab window for picking an AMI (your server image). There’s a tab for Amazon’s images, one for your own, and one for the community’s (which includes a search function). After playing around with the search a bit (and wishing that every community image came with an actual blurb of what it is), and not finding a Django image that I wanted to use, I decided to install Amazon’s Ruby on Rails stack.
On further experience, the “Select” buttons on this set of tabs seem to have weird issues: sometimes you’ll go to them and they’ll be grayed out and unclickable; sometimes you’ll click them and they’ll go gray, but you won’t get the little “Loading, please wait” box that appears before moving on to the next tab — leaving you stuck until you cancel the window and try again.
Once you select an image, you’re prompted to select how many instances you want to launch, your instance type, key pair (necessary to SSH into your server), and a security group (firewall config). More twiddly bits, like the availability zone, are hidden in advanced options. Pick your options, click “Launch”, and you’re good to go.
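For comparison, those same handful of choices are what a script would supply to the underlying RunInstances call. A hedged sketch of mapping the console’s options onto query parameters; the spellings loosely follow the EC2 query API of the time, so treat them as illustrative rather than authoritative.

```python
def run_instances_params(ami_id, count=1, instance_type="m1.small",
                         key_name=None, security_group=None,
                         availability_zone=None):
    """Map the console's launch choices onto RunInstances-style parameters."""
    params = {
        "ImageId": ami_id,
        "MinCount": str(count),
        "MaxCount": str(count),
        "InstanceType": instance_type,
    }
    if key_name:                  # needed to SSH into the instance later
        params["KeyName"] = key_name
    if security_group:            # the firewall config
        params["SecurityGroup.1"] = security_group
    if availability_zone:         # the "advanced" option the console hides
        params["Placement.AvailabilityZone"] = availability_zone
    return params
```

The console is, in effect, just a form that fills in this parameter set for you.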
From the launch window, your options for the firewall default to having a handful of relevant ports (like SSH, webserver, MySQL) open to the world. You can’t get more granular with the rules than this there; you’ve got to use the Security Group config panel to add a custom rule. I wish that the defaults would be slightly stricter, like limiting the MySQL port to Amazon’s back-end.
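Defaults like that are at least easy to audit for. A small illustrative checker; the rule representation and the “sensitive ports” list are my own invention, not anything Amazon provides.

```python
WORLD = "0.0.0.0/0"
SENSITIVE_PORTS = {22: "SSH", 3306: "MySQL"}  # illustrative list

def overly_permissive(rules):
    """Flag firewall rules that open sensitive ports to the whole Internet.

    Each rule is a (port, cidr) pair; a stricter default would scope the
    MySQL port to a trusted range instead of 0.0.0.0/0."""
    return [(port, SENSITIVE_PORTS[port])
            for port, cidr in rules
            if cidr == WORLD and port in SENSITIVE_PORTS]
```

Running a check like this against the launch-window defaults would flag both the SSH and MySQL rules, while leaving the webserver port alone.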
Next, I went to create an EBS volume for user data. This, too, is simple, although initially I did something stupid, failing to notice that my instance had launched in us-east-1b. (Your EBS volume must reside in the same availability zone as your instance, in order for the instance to mount it.)
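That constraint is easy to encode when scripting volume creation: derive the zone from the instance’s placement instead of picking one by hand. A small illustrative sketch (parameter names loosely follow the EC2 query API; treat them as assumptions):

```python
def create_volume_params(size_gb, instance_az):
    """Derive CreateVolume-style parameters from the instance's placement,
    so the volume lands in the same availability zone and can be attached."""
    return {"Size": str(size_gb), "AvailabilityZone": instance_az}

def can_attach(volume_az, instance_az):
    # EBS volumes only attach to instances in the same availability zone.
    return volume_az == instance_az
```

Had the console done this derivation for me, my us-east-1b mistake wouldn’t have been possible to make.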
That’s when I found the next interface quirk — the second time I went to create an EBS volume, the interface continued to insist for fifteen minutes that it was still creating the volume. Normally there’s a very nice Ajax bit that automatically updates the interface when it’s done, but this time, even clicking around the whole management console and trying to come back wouldn’t get it to update the status and thus allow me to attach it to my instance. I had to close out the Firefox tab, and relaunch the console.
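What the console’s updater should be doing, and what any script against the API has to do anyway, is poll with a deadline rather than spin forever. A generic illustrative sketch:

```python
import time

def wait_until_available(get_status, timeout_s=300, interval_s=5):
    """Poll a status function until it reports 'available', giving up after
    a deadline instead of hanging forever like a stuck progress indicator."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if get_status() == "available":
            return True
        time.sleep(interval_s)
    return False
```

A timeout like this is also the natural place to surface an error to the user, rather than leaving them to close the tab and relaunch the console.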
Then I remembered that the default key pair I’d been using had been created via RightScale, and I couldn’t remember where I’d stashed the PEM credentials. So that led me to a round of creating a new key pair via the management console (very easy), and having to terminate and launch a new instance using the new key pair (subject to the previously-mentioned interface quirks).
The same tendency of the interface to get stuck in an indeterminate state also seems to be a problem for other things, like the console “Output” button for instances — you get a blank screen rather than the console dump.
That all dealt with, I log into my server via SSH, don’t see the EBS volume mounted, and remember that I need to actually make a filesystem and explicitly mount it. All creating an EBS volume does is allocate you an abstraction on Amazon’s SAN, essentially. This leads me to trying to find documentation for EBS, which leads to the reminder that access to docs on AWS is terrible. The search function on the site doesn’t index articles, and there are far too many articles to just click through the list looking for what you want. A Google search is really the only reasonable way to find things.
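For anyone following along, the missing steps are ordinary sysadmin work: make a filesystem on the raw volume once, then mount it. A sketch that just assembles the commands (device name, filesystem type, and mountpoint are all illustrative; /dev/sdf was a typical attachment point at the time, and you’d run these as root on the instance):

```python
def ebs_first_mount_commands(device="/dev/sdf", mountpoint="/data"):
    """Return the shell commands a freshly attached EBS volume needs before
    it can hold files. The mkfs step is one-time only: rerunning it on a
    volume with data would destroy that data."""
    return [
        f"mkfs -t ext3 {device}",   # one-time: a new volume is raw block storage
        f"mkdir -p {mountpoint}",
        f"mount {device} {mountpoint}",
    ]
```

In other words, EBS gives you a disk, not a filesystem; the distinction is obvious in hindsight but easy to forget when the console makes volume creation look like one click.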
All that aside, once I do that, I have an entirely functional server. I terminate the instance, check out my account, see that this little experiment has cost me 33 cents, and feel reasonably satisfied with the world.
Sun buys Q-Layer
Today, Sun announced the acquisition of Q-Layer, a Belgium-based start-up of about two dozen people. Q-Layer is a virtualization orchestration vendor, with a focus that seems similar to 3Tera’s. For a similar acquisition parallel, look at Dunes Technologies, acquired by VMware in late 2007.
When people say “orchestrate virtual resources”, usually what they mean is, “make software handle the messy background details of the infrastructure, automatically, while allowing me to navigate through a point-and-click GUI to provision and manage my virtualized data center resources”. In other words, they’ve got a GUI that can be exposed to users, who can create, configure, manage, and destroy virtual servers (and related equipment) at whim.
Like 3Tera, Q-Layer targets the hosting market — notably, Q-Layer’s founders include folks from Dedigate, a small European managed hosting provider that was acquired by Terremark back in 2005. Unlike 3Tera, which has focused on Linux, Q-Layer has made the effort to support Sun technologies, like Solaris Containers. However, Q-Layer has virtually no market traction; it seems to have signed some small, country-specific managed hosting providers in Europe, who are offering a VMware-based Q-Layer solution. (3Tera’s notable hosting customers include Layered Technologies and 1-800-HOSTING, but despite relatively few hosting partners, it has done a good job of creating market awareness.)
Hosters who want to offer virtual data center hosting (“VDC hosting”) — blocks of capacity that customers can carve up into servers at whim — can buy an off-the-shelf orchestration solution, or, if they’re brave and sufficiently skilled, they can write their own (as Terremark has). It’s not a big market yet, but orchestration also has value for large enterprises deploying big virtualization environments and who would like to delegate the management down through the organization.
Sun’s various cloud ambitions are being expanded with this acquisition. Sun expects to derive near-term benefits from incorporating Q-Layer’s technologies into its product plans this year.
On a lighter note, last week, I had dinner with an old friend I hadn’t seen for some years. She’s a former Sun employee, and we were reminiscing about Sun’s heyday — I was Sun’s second-largest customer back in those days (ironically, only Enron bought more stuff from them). She joked that her Sun stock options had been priced so egregiously high that Sun would have had to invent teleportation for her to ever see a return on them. Then she stopped and said, “Of course, even if Sun did invent teleportation, they would still somehow have failed to make money from it. They’d probably have given it away for free to spite Microsoft.”
And there’s the rub: Sun is doing many interesting and cool things with technology, but seems to have a persistent problem actually generating meaningful revenue from those ideas. So the Q-Layer acquisition is reasonably logical and I know where I can expect it to fit into Sun’s product line, but I’m still feeling a bit like the plan is:
1. Buy company.
2. …
3. Profit!
Scaling limits and friendly failure
I’m on vacation, and I’ve been playing World of Goo (possibly the single-best construction puzzle game since 1991’s Lemmings by Psygnosis). I was reading the company’s blog (2D Boy), when I came across an entry about BlueHost’s no-notice termination of 2D Boy’s hosting.
And that got me thinking about “unlimited” hosting plans, throttling, limits, and the other challenges of running mass-market hosting — all issues also directly applicable to cloud computing.
BlueHost is a large and reputable provider of mass-market shared hosting. Their accounts are “unlimited”, and their terms of service essentially say that you can consume resources until you negatively impact other customers.
Now, in practice there are limits, and customers are sort of expected to know whether or not their needs fit shared hosting. Most people plan accordingly — although there have been some spectacular failures to do so, such as Sk*rt, a female-focused Digg competitor launched using BlueHost, prompting vast wonder at what kind of utter lack of thought results in trying to launch a high-traffic social networking site on a $7 hosting plan. Unlike Sk*rt, though, it was reasonable for 2D Boy to expect that shared hosting would cover their needs — hosting a small corporate site and blog. They were two guys who were making an indie garage game getting a gradual traffic ramp thanks to word-of-mouth, not an Internet company doing a big launch.
Limits are necessary, but no-notice termination of a legitimate company is bad customer service, however you slice it. Moreover, it’s avoidable bad customer service. Whatever mechanism is used to throttle, suspend service, etc. ought to be adaptable to sending out a warning alert: the “hey, if you keep doing this, you will be in violation of our policies and we’ll have to terminate you” note. Maybe even a, “hey, we will continue to serve your traffic for $X extra, and you have Y time to find a new host or reduce your traffic to normal volumes”. BlueHost does not sell anything beyond its $7 plan, so it has no upsell path; a provider with an upgrade path would hopefully have tried to encourage a migration, rather than executing a cold-turkey cut-off. (By the way, I have been on the service provider side of this equation, so I have ample sympathy for the vendor’s position against a customer whose usage is excessive, but I also firmly believe that no-notice termination of legitimate businesses is not the way to go.)
Automated elastic scaling is the key feature of a cloud, and consequently, limits and the way that they’re enforced technically and managed from a customer service standpoint, will be one of the ways that cloud infrastructure providers differentiate their services.
A vendor’s approach to limits has to be tied to their business goals. Similarly, what a customer desires out of limits must also be tied to their business goals. The customer wants reliable service within a budget. The vendor wants to be fairly compensated and ensure that his infrastructure remains stable.
Ideally, on cloud infrastructure, a customer scales seamlessly and automatically until the point where he is in danger of exceeding his budget. At that point, the system should alert him automatically, allowing him to increase his budget. If he doesn’t want to pay more, he will experience degraded service; degradation should mean slower or lower-priority service, or an automatic switch to a “lite” site, rather than outright failure.
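That policy is simple enough to state as decision logic. A minimal sketch, with a threshold and mode names that are purely my own illustration, not any provider’s actual behavior:

```python
def service_mode(spend_to_date, budget, alert_fraction=0.8):
    """Decide how to serve traffic as spend approaches a customer's budget:
    warn before the limit is hit, then degrade gracefully instead of
    failing outright. The 80% alert threshold is illustrative."""
    if spend_to_date < alert_fraction * budget:
        return "normal"
    if spend_to_date < budget:
        return "normal+alert"   # serve normally, but tell the customer
    return "lite"               # degraded or lower-priority service, not an outage
```

The important property is that there is no branch that returns “hard cutoff”: the worst case for a paying customer is slower or reduced service plus a warning, never a dead site.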
Perhaps when you get right down to it, it’s really about what the failure mode is. Fail friendly. A vendor has a lot more flexibility in imposing limits if it can manage that.