Daily Archives: November 8, 2011

Trialing a lot of cloud IaaS providers

I’ve just finished writing the forthcoming Public Cloud IaaS Magic Quadrant (except for some anticipated tweaks when particular providers come back with answers to some questions), which has twenty providers. Although Gartner normally doesn’t do hands-on evaluations, this MQ was an exception, because the easiest way to find out if a given service can do X was generally to get an account and attempt to do X. Asking the vendor sometimes requires a bunch of back-and-forth, especially if they don’t do X and are weaseling their reply, forcing you to ask a set of increasingly narrow, specific questions until you get a clear answer. Also, I did not want to constantly bombard the vendors with questions, since, come MQ time, it tends to result in a firedrill whether or not you intended the question as urgent or even particularly important. (I apologize for the fact that I ended up bombarding many vendors with questions, anyway.)

I’ve used cloud services before, of course, and I am a paying customer of two cloud IaaS providers and a hosting provider, for my personal hobbies. But there’s nothing quite like a blitzkrieg through this many providers all at once. (And I’m not quite done, because some providers without online sign-up are still getting back to me on getting a trial account.)

In the course of doing this, I have had some great experiences, some mediocre experiences, and some “you really sell this and people buy it?” experiences. I have online chatted with support folks for basic questions not covered in the documentation (like “if I stop this VM, does it stop billing me, or not?” which varies from provider to provider). I have filed numerous support tickets (for genuine issues, not for evaluation purposes). I have filed multiple bug reports. I have read documentation (sometimes scanty to non-existent). I have clicked around interfaces, and I have actually used the APIs (working in Python, and in one case, without using a library like libcloud); I have probably weirded out some vendors by doing these things at 2 am, although follow-the-sun support has been intriguing. Those of you who follow me on Twitter (@cloudpundit) have gotten little glimpses of some of these things.

Ironically, I have tried to not let these trials unduly influence my MQ evaluations, except to the extent that these things are indisputably factual — features, availability of documentation, etc. But I have taken away strong impressions about ease of use, even for just the basic task of provisioning and de-provisioning a virtual machine. There is phenomenal variation in ease of use, and many providers could really use the services of a usability expert.

Any number of these providers have made weird, seemingly boneheaded decisions in their UI or service design, which carry no penalty in MQ scoring but did occasionally make me stare and go, “Seriously?”

I’m reluctant to publicly call out vendors for this stuff, so I’ll pick just one example from a vendor that has open online sign-up, where it’s not a private issue that hasn’t been raised on a community forum, and they’re not the sort of vendor (I hope) to make angry calls to Gartner’s Ombudsman demanding that I take this post down. (Dear OpSource folks: Think of this as tough love, and I hope Dimension Data analyst relations doesn’t have conniptions.)

So, consider: OpSource has pre-built VMs, that come with a set amount of compute and RAM, bundled with an OS. Great. Except that you can’t alter a bundle at the time of provisioning. So, say, if I want their Ubuntu image, it comes only in a 2 CPU core config. If I want only 1 core, I have to provision that image, wait for the provision to finish, go in and edit the VM config to reduce it to 1 core, and then wait for it to restart. After I go through that song and dance once, I can clone the config… but it boggles the mind why I can’t get the config I want from the start. I’m sure there’s a good technical reason, but the provider’s job is to mask such things from the user.

The experience has also caused me to wholly revise my opinion of vCloud Director as a self-service tool for the average goomba who wants a VM. I’d always seen vCD as a demo being given by experts, where it looked like despite the pile of complex functionality, it was easy enough to use. The key thing is that the service catalogs were always pre-populated in those demos. If you’re starting from the bare vCD install that a vCloud Powered provider is going to give you, you face a daunting task. Complexity is necessary for that level of fine-grained functionality, but it’s software that is in desperate need of pre-configuration from the service provider, and quite possibly an overlay interface for Joe Average Developer.

Now we’ll see if my bank freezes my credit card for possible fraud when I’m hit with a dozen couple-of-cents-to-a-few-dollar charges come the billing cycle. (I used my personal credit card for this, not my corporate card, since Gartner doesn’t actually reimburse for this kind of work.) Ironically, once I spent a bunch of time on these sites, Google and the other online ad networks started displaying ads that consist of nothing but cloud providers, including “click here for a free trial” or “$50 credit” or whatever. Of course, you can’t apply those to existing accounts, which turns every little “hey, you’ve spent another ten cents provisioning and de-provisioning this VM” charge that I’m noting in the back of my head into something that will probably annoy me in aggregate come the billing cycle.

Some things, you just can’t know until you try it yourself.

Results of Symposium workshop on Amazon

I promised the attendees at my Gartner Symposium workshop, called “Using Amazon Web Services”, that I would post the notes from the session, so here they are, with some context for public consumption.

A workshop is a structured, facilitated discussion designed to assist participants in working through a problem, coming up with best practices, etc. This one had thirty people, all from IT organizations that were either using Amazon or planning to use Amazon.

Because I didn’t know what level of experience with Amazon the workshop attendees would have, I actually prepared two workshops in advance. One of them was a highly structured work-through of preparing to use Amazon in a more formal way (i.e., not a single developer with a credit card or the like), and the other was a facilitated sharing of challenges and best practices amongst current adopters. As the room skewed heavily towards people who already had a deployment well under way, this workshop focused on the latter.

I started the workshop with introductions: people, companies, current use cases. Then, I asked attendees to share their use cases in more detail in their smaller working groups. This turned into a set of active discussions that I allowed extra time for, before I asked each of the groups to make a list of their most significant challenges in adopting/using Amazon, and their solutions if any. Throughout, I circulated the room, listening and, rarely, commenting. Each group then shared their findings, and I offered some commentary and then did an open Q&A (with some more participant sharing of their answers to questions).

Broadly, I would say that we had three types of people in the room. We had folks from the public sector and education, who were at a relatively early stage in adoption; we had people who were test/dev oriented but in a significant way (i.e., formal adoption, not a handful of developers doing their thang); and we had people who were more e-business oriented (including people from net-native businesses like SaaS, as well as traditional businesses with a hosting type of need), although that could be test/dev or production. Most of the people were mid-level IT management with direct responsibility for the Amazon services.

Some key observations:

Dealing with the financial aspects of moving to the cloud is hard. Understanding the return on investment, accurately estimating costs in advance, comparing costs to internal costs, and understanding the details of billing were common challenges of the participants. Moreover, it raises the issue of “is capital king or is expense king?” Although the broader industry is constantly talking about how people are trying to move to expense rather than capital, workshop participants frequently asserted that it was easier for them to get capital than to up their recurring expenses. (As a side note, I have found that to be a frequent assertion in both inquiry and conference 1-on-1s.) Finally, user management, cost control, and turning resources on/off appropriately were problematic in the financial context.
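To make the "turning resources on/off" point concrete, here is a minimal back-of-the-envelope sketch in Python. It is an illustration only: the hourly rate, the schedule, and the function names are all hypothetical, and it assumes simple per-hour billing. The `billed_when_stopped` flag reflects the fact that whether a stopped VM keeps accruing charges varies from provider to provider.

```python
# Back-of-the-envelope monthly cost for one VM, in integer cents to avoid
# floating-point rounding. All rates and schedules here are made-up numbers.

def monthly_cents(rate_cents_per_hour, hours_per_day, days):
    """Cost of a VM that is billed for every hour it counts as running."""
    return rate_cents_per_hour * hours_per_day * days


def scheduled_cents(rate_cents_per_hour, work_hours, work_days,
                    days_in_month=30, billed_when_stopped=False):
    """Cost when the VM is only started during working hours.

    If the provider keeps billing a stopped VM (this varies by provider),
    stopping it saves nothing and the full month is charged.
    """
    if billed_when_stopped:
        return monthly_cents(rate_cents_per_hour, 24, days_in_month)
    return monthly_cents(rate_cents_per_hour, work_hours, work_days)


if __name__ == "__main__":
    rate = 10  # hypothetical: 10 cents per VM-hour
    always_on = monthly_cents(rate, 24, 30)
    office_hours = scheduled_cents(rate, work_hours=10, work_days=22)
    print(f"always on:    ${always_on / 100:.2f}")
    print(f"office hours: ${office_hours / 100:.2f}")
    print(f"difference:   ${(always_on - office_hours) / 100:.2f}")
```

Even a toy model like this shows why the billing details matter: the same schedule yields very different numbers depending on whether "stopped" actually means "not billed".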

Move low-risk workloads first. The workshop participants generally assessed Amazon as being suitable only to test/dev, non-mission-critical workloads, and things that had specifically been designed with Amazon’s characteristics in mind. Participants recommended a risk profile of apps, and moving low-risk apps first. They also saw their security organizations as being a barrier to adoption. Many had issues with their Legal departments either trying to prevent use of services or causing issues in the contracting process (what Amazon calls an Enterprise Agreement); participants recommended not involving Legal until after adopting the service.

Performance is a problem. Performance was cited as a frequent issue, especially storage performance, which participants noted was unsuitable for their production applications. One participant made the key point that many test/dev situations also require highly performant storage (something he had first discovered when his ILM strategy placed test/dev storage at a lower, more commodity tier and it impacted his developers).

Know what your SLA isn’t. Amazon’s limited SLAs were cited as an issue, particularly the mismatch in what many users thought the SLA was versus what it actually was, and what it’s actually turned out to be in practice (given Amazon’s outages this year). Participants also stressed business continuity planning in this context.

Integration is a challenge. Participants noted that going to test/dev in the cloud, while maintaining production in an internal data center, splits the software development lifecycle across data centers. This can be overcome to some degree with the appropriate tools, but still creates challenges and sometimes outright problems. Also, because speed of deployment is such a driving factor to go to the cloud, there is a resulting fragmentation of solutions. A service catalog would help with some of these issues.

Data management can be a challenge. Participants were worried about regulatory compliance and the “where is my data?” question. Inexperienced participants were often not aware that non-S3 data is generally local to an availability zone. But even beyond that, there’s the question of what data is being put where by the cloud users. Participants with larger amounts of data also faced challenges in moving data in and out of the cloud.

Amazon isn’t the right provider for all workloads in the cloud. Several workshop participants used other cloud IaaS providers in addition to Amazon, for a variety of other reasons — greater ease of use for users who didn’t need complex things, enterprise-grade availability and performance, better manageability, security capabilities, and so forth.

I have conducted cloud workshops and what Gartner calls analyst/user roundtables at a bunch of our conferences now, and it’s always interesting what the different audiences think about, and how much it’s evolving over time. Compared to last year’s Symposium, the state of the art of Amazon adoption amongst conference attendees has clearly advanced hugely.