Pricing transparency and CDNs

It is possible that I am going to turn out to be mildly wrong about something. I predicted that neither Amazon’s CloudFront CDN nor the comparable Rackspace/Limelight offering (Mosso Cloud Files) would really impact the mainstream CDN market. I am no longer as certain that’s going to be the case, as it appears that behavioral economics plays into these decisions more than one might expect. The impact is subtle, but I think it’s there.

I’m not talking about the giant video deals, mind you; those guys already get prices well below that of the cloud CDNs. I’m talking about the classic bread-and-butter of the CDN market, the e-commerce and enterprise customers, significant B2B and B2C brands that have traditionally been Akamai loyalists, or been scattered among smaller players like Mirror Image.

Simply put, the cloud CDNs put indirect pressure on mainstream CDN prices, and will absorb some new mainstream (enterprise but low-volume) clients, for a simple reason: Their pricing is transparent. $0.22/GB for Rackspace/Limelight. $0.20/GB for SoftLayer/Internap. $0.17/GB for Amazon CloudFront. And so on.

Transparent pricing forces people to rationalize what they’re buying. If I can buy Limelight service on zero commit for $0.22/GB, there’s a fair chance that I’m going to start wondering just what exactly Akamai is giving me that’s worth paying $2.50/GB for on a multi-TB commit. Now, the answer to that might be, “DSA Secure that speeds up my global e-commerce transactions and is invaluable to my business”, but that answer might also be, “The same old basic static caching I’ve been doing forever and have been blindly signing renewals for.” It is going to get me to wonder things like, “What are the actual competitive costs of the services I am using?” and, “What is the business value of what I’m buying?” It might not alter what people buy, but it will certainly alter their perception of value.
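
To make that rationalization concrete, here is the sort of back-of-the-envelope math that transparent pricing invites. The per-GB rates are the ones quoted above; the monthly volume is a hypothetical mid-sized enterprise workload, purely for illustration:

```python
# Back-of-the-envelope math that transparent pricing invites.
# Per-GB rates are from the published price lists quoted above;
# the volume is a hypothetical 5 TB/month of static delivery.
monthly_gb = 5_000

offers = {
    "Amazon CloudFront (zero commit)": 0.17,
    "Rackspace/Limelight (zero commit)": 0.22,
    "Akamai (multi-TB commit, illustrative)": 2.50,
}

for vendor, price_per_gb in offers.items():
    print(f"{vendor}: ${monthly_gb * price_per_gb:,.2f}/month")
```

Whether the roughly $11,000/month gap is justified depends entirely on whether you are buying DSA Secure or plain static caching, which is exactly the question the transparency forces.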

Since grim October, businesses have really cared about what things cost and what benefit they’re getting out of them. Transparent pricing really amps up the scrutiny, as I’m discovering as I talk to clients about CDN services. And remember that people can be predictably irrational.

While I’m on the topic of cloud CDNs: There have been two recent sets of public performance measurements for Rackspace (Mosso) Cloud Files on Limelight. One is part of a review by Matthew Sacks, and the other is Rackspace’s own posting of Gomez metrics comparing Cloud Files with Amazon CloudFront. The Limelight performance is, unsurprisingly, overwhelmingly better.

What I haven’t seen yet is a direct performance comparison of regular Limelight and Rackspace+Limelight. The footprint appears to be the same, but differences in cache hit ratios (likely, given that content on Cloud Files tends to get fewer eyeballs) and the like will create performance differences on a practical level. I assume there are no differences for testing purposes, though (i.e., the usual “put a 10k file on two CDNs” test), unless Limelight prioritizes Cloud Files requests differently.

Aflexi, a new CDN aggregator

Aflexi has announced its launch, which is slated for January of 2009.

Aflexi is a CDN aggregator, targeting small Web hosters much in the same way that Velocix’s Metro product targets broadband providers. (What’s old is new again: remember Content Bridge and CDN peering, a hot idea back in 2001?)

Here’s how it works: Aflexi operates a marketplace and CDN routing infrastructure (i.e., the DNS-based brain servers that tell an end-user client what server to pull content from), plus has Linux-based CDN server software.

Web hosters can pay a nominal fee of $150 to register with Aflexi, granting them the right to deploy unlimited copies of Aflexi’s CDN server software. (Aflexi is recommending a minimum of a dual-core server with 4 GB of RAM and 20-30 GB of storage for these cache servers. That is pretty much “any old hardware you have lying around.”) A hoster can put these servers wherever he likes, and is responsible for their connectivity and so forth. The Web hoster then registers his footprint, desired price for delivering a GB of traffic, and any content restrictions (like “no adult content”) on Aflexi’s marketplace portal.

Content owners can come to the portal to shop for CDN services. If they’re going through one of Aflexi’s hosting partners, they may be limited in their choices, at the hoster’s discretion. The content owner chooses which CDNs he wants to aggregate. Then, he can simply go live; Aflexi will serve his content only over the CDNs he’s chosen. Currently, the content routing is based upon the usual CDN performance metrics; Aflexi plans to offer price/performance routing late next year. Aflexi takes a royalty of 0.8 cents per GB (thus, under a penny); the remainder of the delivery fee goes to whatever hoster served a particular piece of content. Customers will typically be billed through their hoster; Aflexi integrates with the Parallels control panel (they’re packaging in the APS format).
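
For illustration, here is the revenue split as described, sketched in a few lines; the hoster’s asking price and the volume are hypothetical numbers, while the royalty figure is the one Aflexi quotes:

```python
# Revenue split as described: Aflexi keeps a flat royalty of $0.008/GB,
# and the remainder of the delivery fee goes to the hoster that served
# the content. The asking price and volume below are hypothetical.
AFLEXI_ROYALTY = 0.008   # dollars per GB (0.8 cents), per Aflexi

hoster_price = 0.10      # hypothetical: hoster asks $0.10/GB
gb_delivered = 2_000     # hypothetical monthly volume

aflexi_cut = AFLEXI_ROYALTY * gb_delivered
hoster_cut = (hoster_price - AFLEXI_ROYALTY) * gb_delivered

print(f"Aflexi royalty: ${aflexi_cut:.2f}")   # $16.00
print(f"Hoster payout:  ${hoster_cut:.2f}")   # $184.00
```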

Broadly, although the idea of aggregation isn’t new, the marketplace is an interesting take on it. This kind of federated model raises significant challenges in terms of business concerns — the ability to offer an SLA across a diversified base, and ensuring that content is not tampered with, are likely at the forefront of those concerns. Also, a $150 barrier to entry is essentially negligible, which means there will have to be some strenuous efforts to keep out bad actors in the ecosystem.

Aflexi sees the future of the CDN market as being hosters. I disagree, given that most hosters don’t own networks. However, I do believe that hosting and CDN are natural matches from a product standpoint, and that hosters need to have some form of CDN strategy. It’s clear that Aflexi wants to target small Web hosters and their small-business customers. They’re going to occupy a distinct niche, but I wonder how well that approach will hold up against Rackspace-plus-Limelight and Amazon’s CloudFront, which have solid credibility and are targeted at small customers. But the existence of Aflexi will offer small hosters a CDN option beyond pure resale.

Aflexi says its initial launch hosters will include ThePlanet. That in and of itself is an interesting tidbit, as ThePlanet (which is one of the largest providers of simple dedicated hosting in the world) currently resells EdgeCast.

One more odd little tidbit: The CEO is Whei Meng Wong, previously of UltraUnix, but also, apparently, previously of an interesting SpamHaus ROKSO record (designating a hoster who is a spam haven — willing to host the sites that spammers advertise). Assuming that it’s the same person, which it appears to be, that reputation could have significant effects upon Aflexi’s ability to attract legitimate customers — either hosters or content owners.

The company is funded through a Malaysian government grant. The CTO is Wai-Keen Woon; the VP of Engineering is Yuen-Chi Lian. Neither of them appears to have executive experience, or indeed, much experience period — the CTO’s Facebook profile says he’s an ’07 university graduate. The CEO’s blog seems to indicate he is also an ’07 graduate. So this is apparently a fresh-out-of-college group-of-buddies company — notably, without either a Sales or Marketing executive that they deemed worth mentioning in their launch presentation.

Bottom line, though: This is another example of CDN services moving up a level towards software overlays. The next generation of providers own software infrastructure and the CDN routing brain, but don’t deploy a bunch of servers and network capacity themselves.

Google builds a CDN for its own content

An article in the Wall Street Journal today describes Google’s OpenEdge initiative (along with a lot of spin around net neutrality, resulting in a Google reply on its public policy blog).

Basically, Google is trying to convince broadband providers to let it place caches within their networks — effectively, pursuing the same architecture that a deep-footprint CDN like Akamai uses, but for Google content alone.

Much of the commentary around this seems to center on the idea that if Google can use this to obtain better performance for its content and applications, everyone else is at a disadvantage and it’s a general stab at net neutrality. (Even Om Malik, who is not usually given to mindless panic, asserts, “If Google can buy better performance for its service, your web app might be at a disadvantage. If the cost of doing business means paying baksheesh to the carriers, then it is the end of innovation as we know it.”)

I think this is an awful lot of hyperbole. Today, anyone can buy better performance for their Web content and applications by paying money to a CDN. And in turn, the CDNs pay baksheesh, if you want to call it that, to the carriers. Google is simply cutting out the middleman, and given that it accounts for more traffic on the Internet than most CDNs deliver, it’s neither illogical nor commercially unreasonable.

Other large content providers — Microsoft and AOL notably on a historical basis — have built internal CDNs in the past; Google is just unusual in that it’s attempting to push those caches deeper into the network on a widespread basis. I’d guess that it’s YouTube, more than anything else, that’s pushing Google to make this move.

This move is likely driven at least in part by the fact that most of the broadband providers simply don’t have enough 10 Gbps ports for traffic exchange (and space and power constraints in big peering points like Equinix’s aren’t helping matters, making it artificially hard for providers to get the expansions necessary to put big new routers into those facilities). Video growth has sucked up a ton of capacity. Google, and YouTube in particular, is a gigantic part of video traffic. If Google is offering to alleviate some of that logjam by putting its servers deeper into a broadband provider’s network, that might be hugely attractive from a pure traffic engineering standpoint. And providers likely trust Google to have enough remote management and engineering expertise to ensure that those cache boxes are well-behaved and not annoying to host. (Akamai has socialized this concept well over much of the last decade, so this is not new to the providers.)

I suspect that Google wouldn’t even need to pay to do this. For the broadband providers, the traffic engineering advantages and the better performance to end-users might be enough. In fact, this is the same logic that explains why Akamai doesn’t pay for most of its deep-network caches. It’s not that this is unprecedented. It’s just that this is the first time that an individual content provider has reached the kind of scale where they can make the same argument as a large CDN.

The cold truth is that small companies generally do not enjoy the same advantages as large companies. If you are a small company making widgets, chances are that a large company making widgets has a lower materials cost than you do, because they are getting a discount for buying in bulk. If you are a small company doing anything whatsoever, you aren’t going to see the kind of supplier discounts that a large company gets. The same thing is true for bandwidth — and for that matter, for CDN services. And big companies often leverage their scale into greater efficiency, to boot; for instance, unsurprisingly, Gartner’s metrics data shows that the average cost of running servers drops as you get more servers in your data center. Google employs both scale and efficiency leverage.

One of the key advantages of the emerging cloud infrastructure services, for start-ups, is that such services offer the leverage of scale, on a pay-by-the-drink basis. With cloud, small providers can essentially get the advantage of big providers by banding together into consortiums or paying an aggregator. However, on the deep-network CDN front, this probably won’t help. Highly distributed models work very well for extremely popular content. For long-tail content, cache hit ratios can be too low for it to be really worthwhile. That’s why it’s doubtful that you’ll see, say, Amazon’s CloudFront CDN push deep rather than continuing to follow a megaPOP model.

Ironically, because caching techniques aren’t as efficient for small content providers, it might actually be useful to them to be able to buy bandwidth at a higher QoS.

Velocix Metro

CDN provider Velocix has announced the launch of a new product, called Velocix Metro. (I was first briefed on Metro almost eight months ago, so the official launch has been in the works for quite a while.)

Velocix Metro is essentially a turnkey managed CDN service, deployed at locations of an Internet service provider’s choice, and potentially embedded deep into that ISP’s network. The ISP gets a revenue share based on the traffic delivered via their networks from Velocix, plus the ability to do their own delivery from the deployed CDN nodes in their network. Velocix’s flagship customer for this service is Verizon.

You might recall that Velocix is a partner CDN to MediaMelon, which I discussed in the context of CDN overlays a few weeks ago. I believe that these kinds of federated networks are going to become increasingly common, because carriers are the natural choice to provide commoditized CDN services (due to their low network costs), and broadband service providers need some way to monetize the gargantuan and growing volumes of rich content being delivered to their end-user eyeballs.

The economics of the peering ecosystem make it very hard for broadband providers to raise the price of bandwidth bought by content providers, and intermodal competition (i.e., DSL/FiOS vs. cable) creates pricing competition that makes it hard to charge end-users more. So broadband providers need to find another out, and offering up their own CDNs, and thus faster access to their eyeballs, is certainly a reasonable approach. (That means that over the long term, providers that deploy their own CDNs are probably going to be less friendly about placing gear from other CDNs deep within their networks, especially if it’s for free.)

We are entering the period of the rise of the local CDN — CDNs with deep but strictly regional penetration. For instance, if you’re a broadcaster in Italy, with Italian-language programming, you probably aren’t trying to deliver to the world and you don’t want to pay the prices necessary to do so; you want deep coverage within Italy and other Italian-speaking countries, and that’s it. An overlay or federated approach makes it possible to tie together CDNs owned by incumbent regional carriers, giving you delivery in just the places you care about. And that, in turn, creates a compelling business case for every large network provider to have a CDN of their own. Velocix, along with other vendors who can provide turnkey solutions to providers who want to build their own CDN networks, ought to benefit from that shift.

New CDN research notes

I have three new research notes out:

Determine Your Video Delivery Requirements. When I talk to clients, I often find that IT is trying to source a video delivery solution without having much of an idea of what the requirements actually are. This note is directed at them; it’s intended to serve as a framework for discussions with the content owners.

Toolkit: Determining Your Content Delivery Network Requirements. This toolkit consists of three Excel worksheets. The first gathers a handful of high-level requirements, in order to figure out what type of vendor you’re probably looking for. The second helps you estimate your volume and convert between the three typical measurements used (Mbps, MPVs, or GB delivered); a rough sketch of that conversion math appears after this list. The third is a pricing estimator and converter.

Purchasing Content Delivery Network Services. This is a practical guide to buying CDN services, targeted towards mid-sized and enterprise purchasers.
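
For readers without the toolkit handy, here is a rough sketch of the volume conversions. It assumes decimal units (1 GB = 8,000 megabits), a 30-day month, and a hypothetical average page weight for the page-view conversion; real traffic is bursty, so peak Mbps will exceed the computed average:

```python
# Rough conversions between common CDN volume measurements.
# Assumptions: decimal units, 30-day month, hypothetical page weight.
SECONDS_PER_MONTH = 30 * 24 * 3600

def gb_per_month_to_avg_mbps(gb):
    return gb * 8_000 / SECONDS_PER_MONTH

def avg_mbps_to_gb_per_month(mbps):
    return mbps * SECONDS_PER_MONTH / 8_000

def million_page_views_to_gb(mpv, kb_per_view=500):
    # kb_per_view is a hypothetical average page weight.
    return mpv * 1_000_000 * kb_per_view / 1_000_000

print(gb_per_month_to_avg_mbps(10_000))   # 10 TB/month ~= 30.9 Mbps average
print(million_page_views_to_gb(1))        # 1 MPV at 500 KB/page = 500 GB
```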

CDN: software or infrastructure?

Akamai is a software company. It is not, fundamentally, an infrastructure company. You could blow up their network of servers from orbit, and most of the value of the company would be preserved. It’s a company that runs on its intellectual property, the breadth and depth of its feature set, and, ultimately, its ability to rapidly innovate software features. Like any SaaS company, it needs infrastructure upon which to deliver those features (which are mostly focused around content delivery), but the infrastructure itself is not really its value.

I used to be able to generalize this to the CDN market as a whole. Like most software markets, it was characterized by feature-set competition and the general expectation that whatever you invented today, your competitor would try to replicate within the next year.

I’ve come to the realization that this is no longer an accurate characterization. It’s true of Akamai and a very small number of competitors, but for everyone else, CDN has become an infrastructure play — about having a network and a lot of servers and just enough software to control it all efficiently.

Software and infrastructure businesses have very different characteristics, and the shift has a cascade of implications.

Basic CDNs: where are my logs?

I’m now trialing three different basic CDNs, having each of them deliver exactly one image off the same front page of a website: Amazon’s CloudFront, Rackspace’s Cloud Files CDN (a Limelight Networks partnership), and SimpleCDN.

SimpleCDN has turned out to be the overwhelming winner in terms of initial set-up — you can simply point it at a site and it will mirror your content, just like you’ll find in more full-featured CDNs. The downside is that it has basically no additional controls — from what I can tell, you can’t even delete a mirror once you set it up, and there’s no way to explicitly mark content with a Cache-Control header or the like. It also wins on the sheer price of data transfer — just a hair under 6 cents per GB. The downside is that, at that price, it doesn’t come close to touching Limelight’s footprint, or, for that matter, Amazon’s — there are no East Coast delivery locations, for instance. It’s also routed via anycast.

Cloud Files and SimpleCDN are both missing meaningful logging capabilities. Cloud Files can give me daily aggregate totals of disk space used, bandwidth in and out, and number of paid and free operations. (Just numbers. No graph.) SimpleCDN can give me aggregated traffic graphs (outbound bandwidth, hits per second, and hits per 15 minutes) for the past 2 days and the past 2 weeks, plus aggregate totals for the last 30 minutes, day, and week.

With Amazon, you get detailed usage reports, modulo the fact that they are indecipherably opaque. You can set up S3 to give you detailed server logs; I’m processing mine through S3stat, which is a service that will Webalizer your S3 logs for you. Amazon is promising such logs for CloudFront in the future. At the moment, I’m stuck with the enigmatic usage reports. Nothing I can find anywhere will tell me what the difference between a tier 1 and tier 2 request is, for instance. What I’m interested in finding out is what percentage of my requests end up falling back to the origin, but it looks like that is a mystery that will have to wait for CloudFront logging.
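
For reference, turning on S3 server access logging is a one-call affair. Here is a minimal sketch using boto3 (a modern convenience this post predates; the bucket names are hypothetical, and the target bucket must separately grant log-delivery permissions):

```python
# Minimal sketch: enable S3 server access logging so raw request logs
# accumulate in a separate bucket for later analysis (e.g., by S3stat).
# Bucket names are hypothetical placeholders.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_logging(
    Bucket="my-site-assets",                    # hypothetical origin bucket
    BucketLoggingStatus={
        "LoggingEnabled": {
            "TargetBucket": "my-access-logs",   # hypothetical log bucket
            "TargetPrefix": "s3-logs/",
        }
    },
)
```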

Lack of logging is not a particularly big deal if you are just trying to offload static content in the form of images and the like — presumably in that scenario you have decent analytics based off hits to the base page or a tracking counter or something like that. However, if you’re trying to track something like software downloads, it is certainly a much more significant problem. And importantly, logs let you verify exactly what is going on, which may be significant for troubleshooting as well as settling billing disputes.

Rackspace Cloud Files (Mosso) + Limelight CDN

It’s official: Rackspace’s Cloud Files can now be distributed via Limelight’s CDN.

The Cloud Files CDN works exactly as announced: drop your files into Cloud Files, push a button, have them served over the Limelight CDN. It’s 22 cents flat-rate, without origin fees. I’ve discussed this deal before, including in the context of Amazon’s CloudFront, and my previous commentary stands.

What will be interesting to see is the degree to which Limelight preserves infrequently-accessed content in its edge caches, vs. Amazon, as that will have a significant impact on both performance and cost-effectiveness vs. CloudFront. Limelight has a superior footprint, superior peering relationships, and much more sophisticated content routing. For people who are just looking to add a little bit of effortless CDN, their ability to keep infrequently-accessed content fresh on their edge caches will determine just how much of a performance advantage they have — or don’t have — versus CloudFront.

A Little Testing

In the spirit of empirical testing, I’ve created a Cloud Files account, too, and am now delivering the images of the front page of a tiny website from a combination of CloudFront and Cloud Files (which is under Rackspace’s Mosso brand).

It looks like my Cloud Files are being served from the same Limelight footprint as other Limelight customers, at least to the extent that I get the exact same edge resolutions for http://www.dallascowboys.com as I do for cdn.cloudfiles.mosso.com.

The Cloud Files deployment is reasonably elegant, and Mosso provides a GUI and thus is more directly user-friendly. You create a container (folder), drop your files into it, and click a button to make the files public, making them automatically and instantly available via Limelight. (The Amazon process is slower, although still within minutes.)

However, in a move that’s distinctly inferior to the CloudFront implementation, the base URL for the files is http://cdn.cloudfiles.mosso.com/some-hash-value/yourfile, vs. the more elegant CloudFront http://some-hash-value.cloudfront.net/yourfile. You can CNAME the latter to some nicely in-your-domain host (images.yourcompany.com, let’s say), and in fact, if you’ve used another CDN or have simply been smart and delivered static content off its own hostname, you’d already have that set up. The Mosso method means that you’ve got to go through all your content and munge your URLs to point to the CDN.
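
The URL munging itself is mechanical, at least. A sketch of the kind of one-shot rewrite involved, with a hypothetical container hash standing in for the real one:

```python
# Hypothetical one-shot rewrite of static asset references to point at
# the Mosso container URL, since a plain CNAME can't hide the hash.
import re

CDN_BASE = "http://cdn.cloudfiles.mosso.com/c-hash-value"  # hypothetical

html = '<img src="/images/logo.png">'
rewritten = re.sub(
    r'src="(/images/[^"]+)"',
    lambda m: f'src="{CDN_BASE}{m.group(1)}"',
    html,
)
print(rewritten)
# <img src="http://cdn.cloudfiles.mosso.com/c-hash-value/images/logo.png">
```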

Aside from being annoying, it makes the elegant fallback less elegant. Having the CNAME means that if for some reason CloudFront goes belly-up, I can trivially repoint the CNAME to my own server (and I can easily automate this task, monitoring CloudFront deliverability and swapping the DNS if it dies). I can’t do that as easily with Mosso (although I can CNAME cdn.cloudfiles.mosso.com to something in my domain, and set up a directory structure that uses the hash, or use a URL rewriting rule, and still get the same effect, so it is merely more awkward, not impossible).
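
A minimal sketch of that fallback automation, assuming a hypothetical probe object; the DNS update call is a placeholder, since the real call depends entirely on your DNS provider’s API:

```python
# Sketch: probe a known object on the CDN hostname, and repoint the
# CNAME at the origin if the probe fails. update_cname() is a stub;
# substitute your DNS provider's API. Hostnames are hypothetical.
import urllib.request

CDN_TEST_URL = "http://images.example.com/probe.gif"

def cdn_is_healthy(url, timeout=5):
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

def update_cname(host, target):
    raise NotImplementedError("call your DNS provider's API here")

if not cdn_is_healthy(CDN_TEST_URL):
    update_cname("images.example.com", "origin.example.com")
```

Run from cron every few minutes, this gets you the cheap failover that the CNAME scheme makes possible.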

That might be in part a way for Limelight to differentiate this from its normal CDN offering, which has the modern and elegant deployment mechanism of: simply point your website hostname at Limelight and Limelight will do the rest. In that scheme, Limelight automatically figures out what to cache, and does the right thing. That seems unlikely, though. I hope the current scheme is just temporary until they get a CloudFront-like DNS-based implementation done.

I’ll have more to say once I’ve accumulated some log data, and add SimpleCDN into the mix.

Quick CloudFront set-up

I’ve got a tiny low-traffic site that I’ve just set up to use Amazon’s CloudFront CDN. I chose to do it the trivial, painless way for now — through RightScale’s dashboard. It just took a couple of minutes, most of which was waiting for systems to recognize the changes. I frankly expect that my content will never be fresh in cache, but doing this will give me something to play with, without actually costing me some unpredictable amount of money. I’ve been meaning to do some similar tinkering with SimpleCDN, too, and eventually with the Rackspace/Limelight cloud CDN (which thus far lacks a snappy short name).

I still have to finish my cloud server testing, too, which I started a few months ago and which keeps being interrupted by other work and personal demands… I always feel a bit guilty keeping demo accounts for long periods of time.

Amazon’s CloudFront CDN

Amazon’s previously-announced CDN service is now live. It’s called CloudFront. It was announced on the Amazon Web Services blog, and discussed in a blog post by Amazon’s CTO. The RightScale blog has a deeper look, too.

How CloudFront Works

Basically, to use the CloudFront CDN, you drop your static objects (your static HTML, images, JavaScript libraries, etc.) into an S3 bucket. (A bucket is essentially the conceptual equivalent of a folder.) Then, you register that bucket with CloudFront. Once you’ve done this, you get back a hostname that you use as the base of the URL for your static objects. The hostname looks to be a hash in the cloudfront.net domain; I expect CloudFront customers will normally create a CNAME (alias) in their own domain that points to it.

The tasks use Amazon API calls, but like every other useful bit of AWS functionality, there are tools out there that can make it point-and-click, so this can readily be CDN for Dummies. There’s no content “publication”, just making sure that the URLs for your static content point to Amazon. Intelligently-designed sites already have static content segregated onto its own hostname, so it’s a simple switch for them; for everyone else, it’s just some text editing, and it’s the same method as just about every other CDN for the better part of the last decade.
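
Here is a minimal sketch of that workflow, using boto3 for convenience (which postdates this post); the bucket, file, and hostnames are hypothetical:

```python
# Minimal sketch: push a static object into the S3 origin bucket,
# publicly readable and with a long cache lifetime, then reference it
# via the CloudFront hostname or your CNAME to it. Names hypothetical.
import boto3

s3 = boto3.client("s3")
s3.upload_file(
    "logo.png",
    "my-site-assets",                     # hypothetical S3 origin bucket
    "img/logo.png",
    ExtraArgs={
        "ACL": "public-read",
        "CacheControl": "max-age=86400",  # let edges keep it for a day
        "ContentType": "image/png",
    },
)

# In your HTML, the object is then addressed as, e.g.:
#   http://d1234abcd.cloudfront.net/img/logo.png
# or, via a CNAME: http://static.example.com/img/logo.png
```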

In short, you’re basically using S3 as the origin server. The actual delivery happens from one of 14 edge locations — a limited footprint, but not a bad one in these days of the megaPOP CDN.

How Pricing Works

Pricing for the service is a little more complex to understand. If your content is being served from the edge, it’s priced similarly to S3 data transfers (same price at 10 TB or less, and 1 cent less for higher tiers). However, before your content can get served off the edge, it has to get there. So if the edge has to go back to the origin to get the content (i.e., the content is not in cache or has expired from cache), you’ll also incur a normal S3 charge.

You can think of this as being similar to the way that Akamai charged a few years ago — you’d pay a bandwidth fee for delivering content to users, but you’d also pay something called an “administrative bandwidth fee”, which was charged for edge servers fetching from the origin. That essentially penalized you when there was a cache miss. On a big commercial CDN like Akamai, customers reasonably felt this was sort of unfair, competitors like Mirror Image hit back by not charging those fees, and by and large the fee disappeared from the market.

It makes sense for Amazon to charge for having to go back to the (S3) origin, though, because the open-to-all-comers nature of CloudFront means that they would naturally have a ton of customers who have long-tail content and therefore very low cache hit ratios. On a practical basis, however, quite a few people who try out CloudFront will probably discover that the cache miss ratio for their content is too high; since a cache miss also confers no performance benefit, the higher delivery cost won’t make sense, and they’ll go back to just using S3 itself for static delivery. In other words, the pricing scheme will make customers self-regulate, over time.
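
You can model that self-regulation directly. Here is a sketch of the effective per-GB cost, using the first-tier rates mentioned above and treating the origin charge as a flat per-GB figure, which simplifies the actual tiering:

```python
# Effective CloudFront delivery cost per GB under the two-part pricing
# described above: every GB pays the edge rate, and cache misses also
# pay an S3 origin-transfer charge. Flat rates here simplify the
# actual tiered price sheet.
EDGE_RATE = 0.17     # $/GB served from the edge (first-tier figure)
ORIGIN_RATE = 0.17   # $/GB S3 origin transfer on a miss (same tier)

def effective_cost_per_gb(cache_hit_ratio):
    return EDGE_RATE + (1 - cache_hit_ratio) * ORIGIN_RATE

for hit in (0.95, 0.50, 0.10):
    print(f"hit ratio {hit:.0%}: ${effective_cost_per_gb(hit):.3f}/GB")
```

A long-tail site at a 10% hit ratio pays nearly double the edge rate for content that mostly isn’t accelerated anyway, which is exactly why drifting back to plain S3 becomes the rational move.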

Performance and CDN Server Misdirection

Some starting fodder for the performance-minded: My preliminary testing shows that Amazon’s accuracy in choosing an appropriate node can be extremely poor. I’ve noted previously that the nameserver is the location reference, and so the CDN’s content-routing task is to choose a content server close to that location (which will also hopefully be close to the end-user).

I live in suburban DC. I tested using cdn.wolfire.com (a publicly-cited CloudFront customer), and checked some other CloudFront customers as well to make sure it wasn’t a domain-specific aberration. My local friend on Verizon FiOS, using a local DC nameserver, gets Amazon’s node in Northern Virginia (suburban DC), certainly the closest node to us. Using the UltraDNS nameserver in the New York City area, I also get the NoVa node — but from the perspective of that nameserver, Amazon’s Newark node should actually be closer. Using the OpenDNS nameserver in NoVa, Amazon tries to send me to its CDN node in Palo Alto (near San Francisco) — not even close. My ISP, MegaPath, has a local nameserver in DC; using that, Amazon sends me to Los Angeles.
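
If you want to reproduce this test, the method is simply resolving the CDN hostname against different recursive nameservers and comparing the answers. A sketch using dnspython (a convenience library, not something the test requires); the first resolver address is a hypothetical stand-in for your local ISP’s:

```python
# Resolve a CloudFront hostname against several recursive nameservers
# and compare the answers. Requires dnspython >= 2.0.
import dns.resolver

NAMESERVERS = {
    "local ISP resolver": "198.51.100.1",   # hypothetical address
    "OpenDNS":            "208.67.222.222",
}

for label, ip in NAMESERVERS.items():
    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = [ip]
    answer = resolver.resolve("cdn.wolfire.com", "A")
    print(label, [rr.address for rr in answer])
```

Ping the returned addresses from your own machine to see which nameserver choice actually lands you on the closest node.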

That’s a serious set of errors. By ping time, the correct, NoVa node is a mere 5 ms away from me. The LA node is 71 ms from me. Palo Alto is even worse, at 81 ms. Both of those times are considerably worse than pure cross-country latency across a decent backbone. My ping to the test site’s origin (www.wolfire.com) is only 30 ms, making the CDN latency more than twice as high.

However, I have a certain degree of faith: Over time, Amazon has clearly illustrated that they learn from their mistakes, they fix them, and they improve on their products. I assume they’ll figure out proper redirection in time.

What It Means for the Market

Competitively, it seems like Rackspace’s Cloud Files plus Limelight may turn out to be the stronger offering. The price of Rackspace/Limelight is slightly higher, but apparently there’s no origin retrieval charge, and Limelight has a broader footprint and therefore probably better global performance (although there are many things that go into CDN performance beyond footprint). And while anyone can use CloudFront for the teensiest bit of delivery, the practical reality is that they’re not going to — the pricing scheme will make that irrational.

Some people will undoubtedly excitedly hype CloudFront as a game-changer. It’s not. It’s certainly yet another step towards having ubiquitous edge delivery of all popular static content, and the prices are attractive at low to moderate volumes, but high-volume customers can and will get steeper discounts from established players with bigger footprints and a richer feature set. It’s a logical move on Amazon’s part, and a useful product that’s going to find a large audience, but it’s not going to majorly shake up the CDN industry, other than to accelerate the doom of small undifferentiated providers who were already well on their way to rolling into the fiery pit of market irrelevance and insolvency.

(I can’t write a deeper market analysis on my blog. If you’re a Gartner client and want to know more, make an inquiry and I’d be happy to discuss it.)
