Blog Archives

Amazon VPC

Today, Amazon announced a new enhancement to its EC2 compute service, called Virtual Private Cloud (VPC). Amazon’s CTO, Werner Vogels, has, as usual, provided some useful commentary on the release, along with his thoughts on private clouds in general. And as always, the RightScale blog has a lucid explanation.

So what, exactly, is VPC?

VPC offers network isolation to instances (virtual servers) running in Amazon’s EC2 compute cloud. VPC instances do not have any connectivity to the public Internet. Instead, they only have Internet VPN connectivity (specifically, an IPsec VPN tunnel), allowing the instances to seem as if they’re part of the customer’s private network.

For the non-techies among my readers: Think about the way you connect your PC to a corporate VPN when you’re on the road. You’re on the general Internet at the hotel, but you run a VPN client on your laptop that creates a secure, encrypted tunnel over the Internet, between your laptop and your corporate network, so it seems like your laptop is on your corporate network, with an IP address that’s within your company’s internal address range.

That’s basically what’s happening here with VPC — the transport network is still the Internet, but now there’s a secure tunnel that “extends” the corporate network to an external set of devices. The virtual instances get corporate IP addresses (Amazon now even supports DHCP options), and although of course the traffic is still coming through your Internet gateway and you are experiencing Internet performance/latency/availability, devices on your corporate WAN “think” the instances are local.

To set this up, you use new features of the Amazon API that let you create a VPC container (a logical construct for the concept of your private cloud), subnets, and gateways. When you actually activate the VPN, you begin paying 5 cents an hour to keep the tunnel up. You pay normal Amazon bandwidth charges on top of that (remember, your traffic is still going over the Internet, so the only extra expense to Amazon is the tunnel itself).
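
For the hands-on crowd, here is a rough sketch of what that provisioning sequence looks like using the Python boto library’s VPC support. The CIDR blocks, gateway IP, and BGP ASN are placeholders, and the exact method names may shift as boto catches up with the brand-new API:

  # Sketch only: assumes boto's VPC module; addresses and ASN are made up.
  from boto.vpc import VPCConnection

  c = VPCConnection()  # credentials come from the usual boto/AWS environment settings

  vpc = c.create_vpc('10.1.0.0/16')                  # the VPC "container"
  subnet = c.create_subnet(vpc.id, '10.1.1.0/24')    # a subnet within it

  # Your side of the tunnel: the public IP of your on-premise VPN device
  cgw = c.create_customer_gateway('ipsec.1', '203.0.113.12', 65000)

  # Amazon's side of the tunnel, attached to the VPC
  vgw = c.create_vpn_gateway('ipsec.1')
  c.attach_vpn_gateway(vgw.id, vpc.id)

  # The IPsec VPN connection itself; this is what the 5-cents-an-hour meter runs on
  vpn = c.create_vpn_connection('ipsec.1', cgw.id, vgw.id)

The response to that final call includes the configuration information for the customer (corporate) end of the IPsec tunnel, which you hand to whoever manages your network gear.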

When you launch an EC2 instance, you can now specify that it belongs to a particular VPC subnet. A VPC-enabled instance is not physically isolated from the rest of EC2; it’s still part of the general shared pool of capacity. Rather, the virtual privacy is achieved via Amazon’s proprietary networking software, which they use to isolate virtual instances from one another. (It is not intra-VM firewalling per se; Amazon says this is layer 2 network isolation.)
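
Concretely, that looks something like the following boto sketch; the AMI and subnet IDs are placeholders:

  # Sketch only: AMI and subnet IDs are placeholders.
  from boto.ec2.connection import EC2Connection

  ec2 = EC2Connection()
  reservation = ec2.run_instances('ami-12345678',
                                  instance_type='m1.small',
                                  subnet_id='subnet-9d4a7b6c')
  print(reservation.instances[0].id)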

At the moment, an instance can’t both be part of a VPC and accessible to the general Internet, which means that this doesn’t solve a common use case — the desire to use a private network for back-end administration or data, but still have the server accessible to the Internet so that it can be customer-facing. Expect Amazon to offer this option in the future, though.

As it currently stands, an EC2 instance in a VPC can communicate only with other instances within the VPC and with the corporate network, which addresses the use case of customers who are using EC2 for purely internally-facing applications and are seeking a more isolated environment. While some customers are going to want genuinely private network connectivity (i.e., the ability to drop an MPLS VPN connection into the data center), a scenario that Amazon is unlikely to support, the VPC offering is likely to serve many needs.

Note, by the way, that the current limitation on communication also means that instances in a VPC can’t reach other Amazon Web services, including S3. (However, EBS does work, as far as I know.) While monitoring is supported, load-balancing is not. Thus, auto-scaling functionality, one of the more attractive recent additions to the platform, is limited.

VPN connectivity for cloud servers is not a new thing in general, and part of what Amazon is addressing with this release is a higher-security option, for those customers who are uncomfortable with the fact that Amazon, unlike most of its competitors, does not offer a private VLAN to each customer. For EC2 specifically, there have been software-only approaches, like CohesiveFT’s VPN-Cubed. Other cloud compute service providers have offered VPN options, including GoGrid and SoftLayer. What distinguishes the Amazon offering is that the provisioning is fully automated, and the technology is proprietary.

This is an important step forward for Amazon, and it will probably cause some re-evaluations by prospective customers who previously rejected an Amazon solution because of the lack of connectivity options beyond public Internet only.

Cloud services are evolving with extraordinary rapidity. I always caution customers not to base deployment plans for one year out on the current state of the technology: every vendor is evolving so rapidly that the feature that’s currently missing and that you really want, assuming it’s not something wacky and unusual, has a pretty high chance of being available by the time you’re actually ready to start using the service a year from now.

Bits and pieces

Interesting recent news:

Amazon’s revocation of Orwell novels on the Kindle has stirred up some cloud debate. There seems to have been a thread of “will this controversy kill cloud computing”, which you can find in plenty of blogs and press articles. I think that question, in this context, is silly, and am not going to dignify it with a lengthy post of my own. I do think, however, that it highlights important questions around content ownership, application ownership, and data ownership, and the role that contracts (whether in the form of EULAs or traditional contracts) will play in the cloud. By giving up control over physical assets, whether data or devices, we place ourselves into the hands of third parties, and we’re now subject to their policies and foibles. The transition from a world of ownership to a world of rental, even “permanent” lifetime rental, is not a trivial one.

Engine Yard has expanded its EC2 offering. Previously, Engine Yard offered Amazon EC2 deployment of its stack via a product called Solo, aimed at low-end customers who only needed a single instance. Now, they’ve introduced a version called Flex, which is oriented around customers who need a cluster and associated capabilities, along with a higher level of support. This is notable because Engine Yard has been serving these higher-end customers out of its own data center and infrastructure. This move, however, seems to be consistent with Engine Yard’s gradual shift from hosting towards being more software-centric.

The Rackspace Cloud Servers API is now in open beta. Cloud Servers is essentially the product that resulted from Rackspace’s acquisition of Slicehost. Previously, you dealt with your Cloud Server through a Web portal; this new release adds a RESTful API, along with some new features, like shared IPs (useful for keepalived and the like). Also of note is the resize operation, which lets you scale your server size up or down, but this is really handwaving magic in front of replacing a smaller virtual server with a larger one, rather than expanding an already-running virtual instance. The API is fairly extensive and the documentation seems decent, although I haven’t had time to personally try it out yet. The API responses, interestingly, include both human-readable data and WADL (Web Application Description Language, which is machine-parseable).
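
For the curious, the resize flow looks roughly like the sketch below, using Python’s requests library; the endpoint paths, server ID, and flavor ID are written from memory and should be treated as illustrative rather than as the documented interface:

  # Illustrative only: paths, the server ID, and the flavor number are placeholders.
  import requests

  auth = requests.get('https://auth.api.rackspacecloud.com/v1.0',
                      headers={'X-Auth-User': 'myaccount', 'X-Auth-Key': 'myapikey'})
  token = auth.headers['X-Auth-Token']
  mgmt = auth.headers['X-Server-Management-Url']

  # "Resize" builds a replacement server at the new flavor (size) behind the scenes
  requests.post(mgmt + '/servers/12345/action',
                headers={'X-Auth-Token': token},
                json={'resize': {'flavorId': 4}})

  # Once you've verified the resized server, confirm (or revert) the operation
  requests.post(mgmt + '/servers/12345/action',
                headers={'X-Auth-Token': token},
                json={'confirmResize': None})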

SOASTA has introduced a cloud-based performance certification program. Certification is something of a marketing gimmick, but I do think that SOASTA is, overall, an interesting company. Very simply, SOASTA leverages cloud system infrastructure to offer high-volume load-testing services. In the past, you’d typically execute such tests using a tool like HP’s LoadRunner, and many Web hosters offer, as part of their professional services offerings, performance testing using LoadRunner or a similar tool. SOASTA is a full-fledged software as a service offering (i.e., it is their own test harness, monitors, analytics, etc., not a cloud repackaging of another vendor’s product), and the price point makes it reasonable not just for the sort of well-established organizations that could previously afford commercial performance-testing tools, but also for start-ups.

A hodgepodge of links

This is just a round-up of links that I’ve recently found to be interesting.

Barroso and Holzle (Google): Warehouse-Scale Computing. This is a formal lecture-paper covering the design of what these folks from Google refer to as WSCs. They write, “WSCs differ significantly from traditional data centers: they belong to a single organization, use a relatively homogenous hardware and system software platform, and share a common systems management layer. Often, much of the application, middleware, and system software is built in-house compared to the predominance of third-party software running in conventional data centers. Most importantly, WSCs run a smaller number of very large applications (or Internet services), and the common resource management infrastructure allows significant deployment flexibility.” The paper is wide-ranging but written to be readily understandable by the mildly technical layman. Highly recommended for anyone interested in cloud.

Washington Post: Metrorail Crash May Exemplify Automation Paradox. The WaPo looks back at serious failures of automated systems, and quotes a “growing consensus among experts that automated systems should be designed to enhance the accuracy and performance of human operators rather than to supplant them or make them complacent. By definition, accidents happen when unusual events come together. No matter how clever the designers of automated systems might be, they simply cannot account for every possible scenario, which is why it is so dangerous to eliminate ‘human interference’.” Definitely something to chew over in the cloud context.

Malcolm Gladwell: Priced to Sell. The author of The Tipping Point takes on Chris Anderson’s Free, and challenges the notion that information wants to be free. In turn, Seth Godin thinks Gladwell is wrong, and the book seems to be setting off some healthy debate.

Bruce Robertson: Capacity Planning Equals Budget Planning. My colleague Bruce riffs off a recent blog post of mine, and discusses how enterprise architects need to change the way they design solutions.

Martin English: Install SAP on Amazon Web Services. An interesting blog devoted to how to get SAP running on AWS. This is for people interested in hands-on instructions.

Robin Burkinshaw: Being homeless in the Sims 3. This blog tells the story, in words and images, of “Alice and Kev”, a pair of characters that the author (a game design student) created in the Sims 3. It’s a fascinating bit of user-generated content, and a very interesting take on what can be done with modern sandbox-style games.

Overpromising

I’ve turned one of my earlier blog entries, Smoke-and-mirrors and cloud software, into a full-blown research note: “Software on Amazon’s Elastic Compute Cloud: How to Tell Hype From Reality” (clients only). It’s a Q&A for your software vendor, whether they suggest that you deploy their solution on EC2, or you want to deploy there yourself and are wondering what vendor support you’ll get. The information is specific to Amazon (since most client inquiries of this type involve Amazon), but somewhat applicable to other cloud compute service providers, too.

More broadly, I’ve noticed an increasing tendency on the part of cloud compute vendors to over-promise. It’s not credible, and it leaves prospective customers scratching their heads and feeling like someone has tried to pull a fast one on them. Worse still, it could leave more gullible businesses going into implementations that ultimately fail. This is exactly what drives the Trough of Disillusionment of the hype cycle and hampers productive mainstream adoption.

Customers: When you have doubts about a cloud vendor’s breezy claims that sure, it will all work out, ask them to propose a specific solution. If you’re wondering how they’ll handle X, Y, or Z, ask them and don’t be satisfied with assurances that you (or they) will figure it out.

Vendors: I believe that if you can’t give the customer the right solution, you’re better off letting him go do the right thing with someone else. Stretching your capabilities can be positive for both you and your customer, but if your solution isn’t the right path, or it is a significantly more difficult path than an alternative solution, both of you are likely to be happier if that customer doesn’t buy from you right now, at least not in that particular context. Better to come back to this customer eventually, when your technology is mature enough to meet his needs, or to focus on the needs he has that do suit what you can offer right now. If you screw up a premature implementation, chances are that you won’t get the chance to grow this business the way that you hoped. There are enough early adopters whose needs you can meet that you should be going after them instead. There’s nothing wrong with serving start-ups and getting “foothold” implementations in enterprises; don’t bite off more than you can chew.

Almost a decade of analyst experience has shown me that it’s hard for a vendor to get a second chance with a customer if they screwed up the first encounter. Even if, many, many years later, the vendor has a vastly augmented set of capabilities and is managed entirely differently, a burned customer still tends to look at them through the lens of that initial experience, and often takes that attitude along to the various companies they move to. My observation is that in IT outsourcing, customers certainly hold vendor “grudges” for more than five years, and may do so for more than a decade. This is hugely important in emerging markets, as it can dilute early-mover advantages as time progresses.

Amazon’s CloudWatch and other features

Catching up on some commentary…

Amazon recently introduced three new features: monitoring, load-balancing, and auto-scaling. (As usual, Werner Vogels has further explanation, and RightScale has a detailed examination.)

The monitoring service, called CloudWatch, provides utilization metrics for your running EC2 instances. This is a premium service on top of the regular EC2 fee; it costs 1.5 cents per instance-hour. The data is persisted for just two weeks, but is independent of running instances. If you need longer-term historical graphing, you’ll need to retrieve and archive the data yourself. There’s some simple data aggregation, but anyone who needs real correlation capabilities will want to feed this data back into their own monitoring tools.
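
Pulling the raw numbers out for archiving or correlation is straightforward; here is a sketch using boto’s CloudWatch support (the instance ID is a placeholder) that grabs the last hour of CPU data so you can persist it yourself:

  # Sketch only: the instance ID is a placeholder.
  from datetime import datetime, timedelta
  from boto.ec2.cloudwatch import CloudWatchConnection

  cw = CloudWatchConnection()
  end = datetime.utcnow()
  start = end - timedelta(hours=1)

  datapoints = cw.get_metric_statistics(
      period=300,                          # five-minute buckets
      start_time=start,
      end_time=end,
      metric_name='CPUUtilization',
      namespace='AWS/EC2',
      statistics=['Average'],
      dimensions={'InstanceId': 'i-1234abcd'})

  # Amazon only keeps two weeks of this, so store it somewhere of your own
  for point in sorted(datapoints, key=lambda p: p['Timestamp']):
      print(point['Timestamp'], point['Average'])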

CloudWatch is required to use the auto-scaling service, since that service uses the monitoring data to figure out when to launch or terminate instances. Basically, you define business rules for scaling that are based on the available CloudWatch metrics. Developers should take note that this is not magical auto-scaling. Adding or subtracting instances based on metrics isn’t rocket science. The tough part is usually writing an app that scales horizontally, plus automatically and seamlessly making the other configuration changes necessary when you change the number of virtual servers in its capacity pool. (I field an awful lot of client calls from developers under the delusion that they can just write code any way they want, and that simply putting their application on EC2 will remove all worries about scalability.)
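
To make the “business rules” point concrete: the rule itself is usually nothing more exotic than threshold logic, as in the toy sketch below (the thresholds and fleet limits are invented, and this is not the Amazon API). The hard engineering is in the application, which has to tolerate instances appearing and disappearing.

  # Illustrative only: the shape of a scaling rule, not Amazon's API.
  def desired_change(avg_cpu_percent, current_instances,
                     min_instances=2, max_instances=10):
      """Return +1, -1, or 0 instances based on average fleet CPU utilization."""
      if avg_cpu_percent > 70 and current_instances < max_instances:
          return 1    # scale out
      if avg_cpu_percent < 20 and current_instances > min_instances:
          return -1   # scale in
      return 0

  print(desired_change(avg_cpu_percent=85, current_instances=4))   # 1: add an instance
  print(desired_change(avg_cpu_percent=15, current_instances=4))   # -1: remove an instance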

The new load-balancing service essentially serves both global and local functions — between availability zones, and between instances within a zone. It’s auto-scaling-aware, but its health checks are connection-based, rather than using CloudWatch metrics. However, it’s free to EC2 customers and does not require use of CloudWatch. Customers who have been using HAProxy are likely to find this useful. It won’t meet the requirements of those who need full-fledged application delivery controller (ADC) functionality and have been using Zeus or the like.
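
For a sense of the provisioning involved, here is a sketch using boto’s load-balancing support; the balancer name, zones, and instance IDs are placeholders:

  # Sketch only: names, zones, and instance IDs are placeholders.
  from boto.ec2.elb import ELBConnection, HealthCheck

  elb = ELBConnection()

  # One listener: port 80 on the balancer forwarding to port 80 on the instances
  lb = elb.create_load_balancer('www-lb',
                                zones=['us-east-1a', 'us-east-1b'],
                                listeners=[(80, 80, 'HTTP')])

  # Health checking is connection-oriented, not CloudWatch-driven
  lb.configure_health_check(HealthCheck(target='TCP:80', interval=30,
                                        healthy_threshold=3,
                                        unhealthy_threshold=5, timeout=5))

  lb.register_instances(['i-1234abcd', 'i-5678efgh'])
  print(lb.dns_name)   # point your CNAME at this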

As always, Amazon’s new features eat into the differentiating capabilities of third-party tools (RightScale, Elastra, etc.), but the “most, but not all, of the way there” nature of Amazon’s implementations means that third-party tools still add value on top of the baseline. That’s particularly true given that only the load-balancing feature is free.

AWS in Eclipse, and Azure announcements

Amazon’s announcement for today, with timing presumably associated with EclipseCon, is an AWS toolkit for the Eclipse IDE.

Eclipse, an open-source project created by IBM and now governed by the Eclipse Foundation (IBM also sells commercial products built on it), is one of the most popular IDEs (the other is Microsoft Visual Studio). Originally designed for Java development, it has since been extended to support many other languages and environments.

Integrating with Eclipse is a useful step for Amazon, and hopefully other cloud providers will follow suit. It’s also a competitive response to the integration that Microsoft has done between Visual Studio and its Azure platform.

Speaking of Azure, as part of a set of announcements, Microsoft has said that it’s supporting non-.Net languages on Azure via FastCGI. FastCGI is a webserver extension that basically compiles and loads your scripts once, instead of every time they’re accessed, resulting in a reduction of computational overhead. You can run most languages under it, including Java, but it doesn’t really give you the full featureset that you get with tight integration with the webserver through a language-specific extension. (Note that because .NET’s languages encompass anything that supports the CLR, users already had some reasonable access to non-C# languages on Azure — implementations like Ruby.NET, IronRuby, IronPython, etc.)

Also, in an interesting Q&A on a ZDnet blog post, Microsoft said that there will be no private Azure-based clouds, i.e., enterprises won’t be able to take the Azure software and host it in their own data centers. What’s not clear is whether or not the software written for Azure will be portable into the enterprise environment. Portability of this sort is a feature that Microsoft, with its complete control over the entire stack, is uniquely well-positioned to be able to deliver.

Linkage du jour

Tossing a few links out there…

In the weekend’s biggest cloud news, Microsoft’s Azure was down for 22 hours. It’s now back up, with no root cause known.

Geva Perry has posted a useful Zoho Sheet calculator for figuring out whether an Amazon EC2 reserved instance will save you money over an unreserved instance.

Craig Balding has posted a down-to-earth dissection of PCI compliance in the cloud, and the practical reality that cloud infrastructure providers tend to deal with PCI compliance by encouraging you to push the actual payment stuff off to third parties.

Amazon announces reserved instances

Amazon’s announcement du jour is “reserved instances” for EC2.

Basically, with a reserved instance, you pay an up-front non-refundable fee for a one-year term or a three-year term. That buys you a discount on the usage fee for that instance, during that period of time. Reserved instances are only available for Unix flavors (i.e., no Windows) and, at present, only in the US availability zones.

Let’s do some math to see what the cost savings turn out to be.

An Amazon small instance (1 virtual core equivalent to a 1.0-1.2 GHz 2007 Opteron or Xeon) is normally $0.10 per hour. Assuming 720 hours in a month, that’s $72 a month, or $864 per year, if you run that instance full-time.

Under the reserved instance pricing scheme, you pay $325 for a one-year term, then $0.03 per hour. That would be $21 per month, or $259 per year. Add in the reserve fee and you’re at $584 for the year, averaging out to $49 per month — a pretty nice cost savings.

On a three-year basis, unreserved would cost you $2,592; reserved, full-time, is a $500 one-time fee, and with usage, a grand total of $1277. Big savings over the base price, averaging out to $35 per month.

This is important because at the unreserved prices, on a three-year cash basis, it’s cheaper to just buy your own servers. At the reserved price, does that equation change?

Well, let’s see. Today, in a Dell PowerEdge R900 (a reasonably popular server for virtualized infrastructure), I can get a four-socket server populated with quad-cores for around $15,000. That’s sixteen Xeon cores clocking at more than 2 GHz. Call it $1000 per modern core; split up over a 3-year period, that’s about $28 per month. Cheaper than the reserved price, and much less than the unreserved price.
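
If you want to play with these assumptions, the arithmetic above fits in a few lines of Python (using the same round numbers):

  # Back-of-the-envelope math with the round numbers used above (2009 list prices).
  HOURS_PER_MONTH = 720
  MONTHS = 36

  on_demand = 0.10 * HOURS_PER_MONTH * MONTHS        # small instance at the list price
  reserved = 500 + 0.03 * HOURS_PER_MONTH * MONTHS   # three-year reserved instance
  owned_core = 1000.0                                # ~$15,000 R900 / 16 cores, rounded up

  for label, total in (('on-demand small instance', on_demand),
                       ('three-year reserved instance', reserved),
                       ('owned Xeon core (hardware only)', owned_core)):
      print('%-32s ~$%.0f/month over three years' % (label, total / MONTHS))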

Now, this is a crude, hardware-only, three-year cash calculation, of course, and not a TCO calculation. But it shows that if you plan to run your servers full-time on Amazon, it’s not as cheap as you might think when you think “it’s just three cents an hour!”

Smoke-and-mirrors and cloud software

On my more cynical, read-too-many-press-releases days, I wonder if there’s some hapless, tortured PR gnome at Amazon whose job consists solely of vetting one empty cloud fluff piece after another, proclaiming how such-and-such a vendor is now offering deployments on EC2, and how this therefore gives them an on-demand cloud offering (“please think of me as hip and visionary!”), when in reality, the vendor is doing nothing other than packaging up its software as an AMI (an Amazon machine image, basically a disk image of a server with the application installed).

Packaging something up as an AMI doesn’t make it a cloud service. It doesn’t make it massively scalable, automatically scalable, transparently scalable, on-demand, multi-tenant, or any one of a vast number of other terms that get fatuously lavished on anything with a whiff of cloudiness. If a piece of software doesn’t have any cloud traits when it’s deployed in your data center, it won’t have them when it’s deployed on EC2 (or any other cloud infrastructure service), either.

Cloud infrastructure services today, whether EC2 or from one of Amazon’s competitors, are basically servers in the sky. They are almost always hypervisor-virtualized servers with a normal operating system installation, on top of which you install normal applications. There is no magic cloud pixie dust that settles on these instances and turns them into application faeries of scalability and joy.

Building massively and horizontally scalable, multi-tenant software with elastic economics is hard. It’s even harder if you’re trying to take some legacy software package and re-engineer it. This is why practically no one does that kind of re-engineering, and why software vendors have to resort to puffed-up “yes, we run on EC2!” claims, rather than genuinely delivering on-demand cloud services.

Don’t be fooled.

Marketing and PR folks at software vendors: I forgive you for these releases because I know you’re under pressure to put something out, but every time I read them, I cringe on your behalf, and hope that you’re not genuinely entertaining the belief that releasing an AMI meaningfully moves you forward along the cloud path.

IT folks: When your CEO / CFO / CIO comes to you and asks you why you aren’t taking advantage of your software vendor’s awesome new money-saving cloud service, you can tell him it’s because the PR release is just artfully painting a unicorn — a mythical beast everyone talks about but doesn’t actually exist.

There’s more to cloud computing than Amazon

In dozens of client conversations, I keep encountering companies — both IT buyers and vendors — who seem to believe that Amazon’s EC2 platform is the be-all and end-all of the state of the art in cloud computing today. In short, they believe that if you can’t get it on EC2, there’s no cloud platform that can offer it to you. (I saw a blog post recently called “Why, right now, is Amazon the only game in town?” that exemplifies this stance.)

For better or for worse, this is simply not the case. While Amazon’s EC2 platform (and the rest of AWS) is a fantastic technical achievement, and it has demonstrated that it scales well and has a vast amount of spare capacity to be used on demand, as it stands, it’s got some showstoppers for many mainstream adopters. But that doesn’t mean that the rest of the market can’t fill needs like:

  • Not having to make any changes to applications.
  • Non-public-Internet connectivity options.
  • High-performance, reliable storage with managed off-site backups.
  • Hybridization with dedicated or colocated equipment.
  • Meeting compliance and audit requirements.
  • Real-time visibility into usage and billing.
  • Enterprise-class customer support and managed services.

There are tons of providers who would be happy to sell some or all of that to you — newer names to most people, like GoGrid and SoftLayer, as well as familiar enterprise hosting names like AT&T, Savvis, and Terremark. Even your ostensibly stodgy IT outsourcers are starting to get into this game, although the boundaries of what’s a public cloud service and what’s an outsourced private one start to get blurry.

If you’ve got to suddenly turn up four thousand servers to handle a flash crowd, you’re going to need Amazon. But if you’re like most mainstream businesses looking at cloud today, you’ve got a cash crunch you’ve got to get through, you’re deploying at most dozens of servers this year, and you’re not putting up and tearing down servers hour by hour. Don’t get fooled into thinking that Amazon’s the only possible option for you. It’s just one of many. Every cloud infrastructure services platform is better for some needs than others.

(Gartner clients interested in learning more about Amazon’s EC2 platform should read my note “Is Amazon EC2 Right For You?”. Those wanting to know more about S3 should read “A Look at Amazon’s S3 Cloud-Computing Storage Service”, authored by my colleagues Stan Zaffos and Ray Paquet.)
