Blog Archives

Recently-published research

Here’s a quick round-up of some of my recently-published research.

Is Amazon EC2 Right For You? This is an introduction to Amazon’s Elastic Compute Cloud, written for a mildly technical audience. It summarizes Amazon’s capabilities, the typical business case for using it, and what you’ve got to do to use it. If you’re an engineer looking for a quick briefing, or you want to show a “what this newfangled thing is” summary to your manager, or you’re an investor trying to understand what exactly it is that Amazon does, this is the document for you.

Dataquest Insight: Web Hosting, North America, 2006-2012. This is an in-depth look at the colocation and hosting business, together with market forecasts and trends. (Investors may also want to look at the Invest Implications.)

Dataquest Insight: Content Delivery Networks, North America, 2006-2012. This is an in-depth look at the CDN market, segment-by-segment, with market forecasts and trends. (Investors may also want to look at the Invest Implications.)

You’ll need to be a Gartner subscriber (or purchase the individual document) in order to view these pieces.

Upcoming research (for publication in the next month): A pricing guide for Web hosting and cloud infrastructure services; a classification scheme and service provider roadmap for cloud offerings; a toolkit for CDN requirements gathering and price estimation; a framework for gathering video requirements; and a CDN selection guide.

The Microsoft hybrid-P2P CDN study

I noted previously that the Microsoft CDN study, titled “Measuring and Evaluating Large-Scale CDNs”, had disappeared. Now its lead author, Microsoft researcher Cheng Huang, has updated his home page to note that the study has been withdrawn.

Also removed from his home page, but still available from one of his collaborators, is a study from earlier this year, “Understanding Hybrid CDN-P2P: Why Limelight Needs Its Own Red Swoosh”. I assume the link was removed because it extensively details the CDN discovery methodology also used in the more recent Microsoft CDN study; if you missed the newer study while it was available, this slightly older paper is worth reading for those details.

I just read the P2P study, which confirms something I conjectured in my earlier analysis of the study’s blind spots: the visibility into Verizon was almost non-existent. The P2P study asserts that Akamai is present in just four locations inside Verizon’s network. This seems improbable. Verizon is Akamai’s most significant carrier reseller and one of its largest enterprise-focused resellers. It is also one of the largest broadband networks in the United States, and a significant global network service provider. It was also a close partner of Netli, which inked a deal making Verizon its primary source of bandwidth; I would expect that Akamai, even after integrating Netli into its own network post-acquisition, would have kept any strategic points of presence in Verizon’s network. One would expect the researchers to have questioned how likely it was that such a close partner would host so little Akamai footprint, especially when their own chart showed Limelight in 10 Verizon locations. (Remember that the charting methodology is much less accurate for a deep-footprint CDN.)

The researchers then go on to explore the effects of hybrid P2P using those Verizon nodes (along with AT&T, which also looks like an incomplete discovery). Unfortunately, they don’t tell us much of value about peer-assisted offload; the real world has made it amply clear that actual P2P effectiveness depends tremendously on the nature of your content and your audience.

The methodological flaws fundamentally undermine the hybrid-P2P paper’s conclusions. But like the other study, it is an interesting read.

Rackspace’s deal with Limelight

Rackspace announced yesterday, as part of a general unveiling of its cloud strategy, a new partnership with Limelight Networks.

Under the new partnership, customers of Rackspace’s Cloud Files (formerly CloudFS) service — essentially, a competitor to Amazon S3 — will be able to choose to publish and deliver their files via Limelight’s CDN. This places Rackspace/Limelight in direct competition with Amazon’s forthcoming S3 CDN.

CDN delivery won’t cost Cloud Files customers any more than Rackspace’s normal bandwidth costs for Cloud Files. Currently, that’s $0.22/GB for the first 5 TB, scaling down to $0.15/GB for volumes above 50 TB. Amazon S3, by comparison, is $0.17/GB for the first 10 TB, down to $0.10/GB for volumes over 150 TB; we don’t yet know what its CDN upcharge, if any, will be. As another reference point, Internap resold via SoftLayer is $0.20/GB, so we can probably take that as a reasonable benchmark for the base entry cost of CDN services sold without any commit.
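
For a rough sense of how those tiers compare at different volumes, here’s a minimal back-of-the-envelope calculator. Note that only the endpoint prices are quoted above; the tier tables below are simplified assumptions (flat between the published endpoints, 1 TB treated as 1,000 GB), not actual rate cards.

```python
# Back-of-the-envelope tiered bandwidth cost calculator.
# Tier tables use only the endpoint prices quoted above; the shape in between
# is a simplifying assumption, not a published rate card.

def tiered_cost(gb_transferred, tiers):
    """tiers: list of (tier_ceiling_gb, price_per_gb), ascending; None = unbounded."""
    cost, prev_ceiling = 0.0, 0
    for ceiling, price in tiers:
        if ceiling is None or gb_transferred <= ceiling:
            cost += (gb_transferred - prev_ceiling) * price
            break
        cost += (ceiling - prev_ceiling) * price
        prev_ceiling = ceiling
    return cost

# Assumed tier tables (published endpoints only, flat in between), in $/GB:
RACKSPACE_CLOUD_FILES = [(50_000, 0.22), (None, 0.15)]
AMAZON_S3 = [(150_000, 0.17), (None, 0.10)]   # before any CDN upcharge

for tb in (1, 10, 100):
    gb = tb * 1000
    print(f"{tb:>4} TB/mo: Cloud Files ~${tiered_cost(gb, RACKSPACE_CLOUD_FILES):,.0f}, "
          f"S3 ~${tiered_cost(gb, AMAZON_S3):,.0f}")
```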

It’s a reasonably safe bet that Limelight’s CDN is going to deliver better performance than Amazon’s S3 CDN, given its broader footprint and peering relationships, so the usual question of, “What’s the business value of performance?” will apply.

It’s a smart move on Rackspace’s part, and an easy way into a CDN upsell strategy for its regular base of hosting customers, too. And it’s a good way for Limelight to pre-emptively compete against the Amazon S3 CDN.

Akamai expands its advertising solutions

Akamai made an advertising-related announcement today, introducing something it calls Advertising Decision Solutions, and stating that it has agreed to acquire acerno for $95 million in cash.

acerno (which seems to belong to the e.e. cummings school of brand naming) is a small retailer-focused advertising network, but the reason Akamai acquired it is that it operates a data cooperative in which retailers share shopping data. This data in turn is used to create a predictive model — i.e., if a customer bought X, it’s likely they will also be shopping for Y and Z, and therefore you might want to show them related ads.
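
As a toy illustration of the general idea — not acerno’s actual algorithm; the item names and the simple co-occurrence approach below are purely hypothetical — a data cooperative lets you build a basic co-purchase model:

```python
from collections import Counter, defaultdict
from itertools import permutations

# Toy co-purchase model: "customers who bought X also shopped for Y and Z".
baskets = [
    {"running shoes", "gps watch"},
    {"running shoes", "water bottle"},
    {"gps watch", "heart rate strap"},
    {"running shoes", "gps watch", "socks"},
]

co_counts = defaultdict(Counter)
for basket in baskets:
    for bought, other in permutations(basket, 2):
        co_counts[bought][other] += 1

def likely_next_purchases(item, top_n=3):
    """Items most often co-purchased with `item` across the cooperative's data."""
    return [other for other, _ in co_counts[item].most_common(top_n)]

print(likely_next_purchases("running shoes"))
```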

Although Akamai states they’ll continue to operate the acerno business, don’t expect them to really push that ad network; Akamai knows where its bread is buttered and isn’t going to risk competing with the established large ad networks, which number amongst Akamai’s most significant customers. Instead, Akamai intends to use the acerno data and its data cooperative model to enhance the advertising-related capabilities that it offers to its customers.

This complements the Advertising Decision Solutions announcement. Basically, it appears that Akamai is going to begin exploiting its treasure-trove of user behavior data, and taking advantage of the fact that it delivers content on behalf of both the publishers and the ad networks. That position lets it insert elements into the delivery, such as cookies, enabling communication between cooperating Akamai customers without those customers having to manually set up such cooperation with their various partners.

This expansion of Akamai’s product portfolio is a smart move. With the cost of delivery dropping through the floor, Akamai needs new, high-value, high-margin services to offer to customers, as well as services that tie customers more closely to Akamai, creating a stickiness that will make customers more reluctant to switch providers to obtain lower costs. Note, however, that Akamai already dominates the online retail space; the new service probably won’t make much of a difference in a retail customer’s decision about whether or not to purchase Akamai services. It will, however, help Akamai defend and grow its ad network customers, and maintain its hold on core website delivery for the media and entertainment space. (This is true even in the face of video delivery moving to low-cost CDNs, since you don’t need to deliver the site and the video from the same CDN.)

I think this move signals that we’re going to see Akamai move into adjacent markets where it can leverage its distributed computing platform, its aggregated data (whether about users, content, systems, or networks), or its customer ecosystem. Because these kinds of services will tend to be decoupled from the actual price of bit delivery, they should also help Akamai broaden its revenue streams.

CDN overlays (and more on MediaMelon)

I was recently briefed by MediaMelon, a just-launched CDN offering a “video overlay network”. The implications of their technology are worth considering, even though I think the company itself is going to have a difficult road to travel. (MediaMelon has two customers thus far, and is angel-funded; it is entering an extremely tough, competitive market. I wish them luck, since their model essentially forces them to compete in the ever-more-cutthroat CDN price war, as their entire value proposition is tied up in lowering delivery costs.)

In brief, when a content provider publishes its video to MediaMelon, MediaMelon divides the video into small chunks, each of which is a separate file that can be delivered via HTTP, and relies upon the video player to re-assemble those chunks. This chunk-based delivery is conceptually identical to Move Networks’ streamlets. MediaMelon then publishes the content out to its CDN partners (currently Velocix plus an unannounced second partner). MediaMelon’s special sauce is that these chunks are then delivered via multiple sources — normally its P2P network, with a fallback to its CDN partners. Since the video is in chunks, the source can switch from chunk to chunk. The video player also reports its performance to MediaMelon’s servers, allowing MediaMelon to draw conclusions about how to serve content. As a delivery-focused company, MediaMelon has decided to leave the value-adds to its media platform partners, currently thePlatform.
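
In sketch form, per-chunk source switching looks roughly like the code below. This is a generic illustration of the technique, not MediaMelon’s actual client; the function and source names are hypothetical.

```python
import time

# `fetch` is whatever transport you have (HTTP GET to a CDN, a P2P request).
def play_video(chunk_ids, sources, fetch, report):
    """Yield chunks, trying `sources` in preference order for each chunk."""
    for chunk_id in chunk_ids:
        for source in sources:            # e.g. peers first, then CDN partners
            start = time.monotonic()
            try:
                data = fetch(source, chunk_id)
            except Exception:
                continue                  # this source failed; try the next one
            elapsed = max(time.monotonic() - start, 1e-6)
            report(chunk_id, source, len(data) / elapsed)  # observed bytes/sec
            yield data
            break
        else:
            raise RuntimeError(f"no source could deliver chunk {chunk_id}")

# Example with a fake transport: the 'p2p' source fails, the CDN succeeds.
if __name__ == "__main__":
    def fake_fetch(source, chunk_id):
        if source == "p2p":
            raise IOError("no peers have this chunk")
        return b"x" * 1024
    for chunk in play_video(range(3), ["p2p", "cdn-partner"], fake_fetch,
                            lambda c, s, bps: print(f"chunk {c} from {s}: {bps:.0f} B/s")):
        pass
```

The key point is that the decision is made per chunk, so a stalled peer costs you one small object rather than the whole stream.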

Whatever the challenges of their business model, though, the overlay model is interesting, and from a broader market perspective, MediaMelon’s technology highlights several things about video player capabilities that should be kept in mind:

  • You can carve up your video and let the player re-assemble it.
  • You can deliver using multiple sources, including P2P.
  • The player knows what kind of performance it’s getting, and can report it.

These three key things make it extremely clear that it is technically feasible to create a “neutral” CDN overlay network, without requiring the cooperation of the CDNs themselves. MediaMelon is halfway there. It just hasn’t put together all the pieces (the technical hurdles are actually nontrivial), and it is designed to work with partner CDNs rather than force them into competition.

Basically, a (non-anycast) CDN like Akamai or Limelight has a central engine that gathers network performance data and uses it to choose an individual CDN server, based on what it believes is best for you (where “you” is defined by where your nameserver is). That individual CDN server then delivers the content to you.

What an overlay would have is a central engine that gathers performance data directly from the video player, and has a list of sources for a given piece of content (where that list includes multiple CDNs and maybe a P2P network). Based on historical and currently-reported performance data, it would direct the player to the source that delivers acceptable performance for the least cost. Dividing the content into chunks makes this easier, but isn’t strictly necessary. What you’d effectively have is a CDN-of-CDNs, with the overlay needing to own no infrastructure other than the routing processor.
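
A minimal sketch of that routing decision — pick the cheapest source whose currently reported performance is acceptable — might look like this (the source names, prices, and throughput numbers are all hypothetical):

```python
SOURCES = [
    # (source,     $/GB,  recent median throughput in Mbps, as reported by players)
    ("cdn-a",      0.25,  6.5),
    ("cdn-b",      0.12,  4.8),
    ("p2p-swarm",  0.02,  2.1),
]

def pick_source(required_mbps, sources=SOURCES):
    """Cheapest source that currently sustains the required bitrate, else the fastest."""
    acceptable = [s for s in sources if s[2] >= required_mbps]
    if acceptable:
        return min(acceptable, key=lambda s: s[1])   # least cost among acceptable
    return max(sources, key=lambda s: s[2])          # nothing qualifies: best effort

print(pick_source(required_mbps=4.0))   # -> ('cdn-b', 0.12, 4.8)
print(pick_source(required_mbps=7.0))   # -> ('cdn-a', 0.25, 6.5)
```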

That is the next-generation CDN. If it were vendor-neutral, allowing the customer to choose whomever it wanted to work with, it would usher in an era of truly brutal price competition.

The Microsoft CDN study

The Microsoft/NYU CDN study by Cheng Huang, Angela Wang, et al., no longer seems to be available. Perhaps it’s simply been temporarily withdrawn pending its presentation at the upcoming Internet Measurement Conference. You can still find it in Google’s cache, HTMLified, by searching for the title “Measuring and Evaluating Large-Scale CDNs”, though.

To sum it up in brief for those who missed reading it while it was readily available: Researchers at Microsoft and the Polytechnic Institute of New York University explored the performance of the Akamai and Limelight CDNs. Starting from a set of IP addresses drawn from end-user clients of the MSN video service and from web hosts in Windows Live search logs, the researchers derived a set of vantage points based on the open-recursive DNS servers authoritative for those domains. They used these vantage points to chart the servers/clusters of the two CDNs. Then, using the King methodology, which measures the latency between DNS servers, they measured the performance of the two CDNs from the perspective of the vantage points. They also measured the availability of the servers. Finally, they drew some conclusions about the comparative performance of the CDNs and how to prioritize deployments of new locations.
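
For readers unfamiliar with King, the rough idea is to estimate the latency between two nameservers by subtracting your own round trip to an open-recursive server from the time it takes that server to recursively resolve a name hosted by the target. Here is a minimal dnspython sketch of that idea — an illustration only, not the study’s actual measurement code:

```python
import time
import dns.flags, dns.message, dns.query, dns.rdatatype  # dnspython

def query_rtt(nameserver_ip, qname, recurse=True, timeout=2.0):
    """Round-trip time (seconds) for a single DNS query sent to nameserver_ip."""
    q = dns.message.make_query(qname, dns.rdatatype.A)
    if not recurse:
        q.flags &= ~dns.flags.RD          # clear the recursion-desired bit
    start = time.monotonic()
    dns.query.udp(q, nameserver_ip, timeout=timeout)
    return time.monotonic() - start

def king_estimate(vantage_ns_ip, name_served_by_target, name_vantage_serves):
    """
    King-style estimate of vantage-to-target nameserver latency:
    (time for the vantage server to recursively resolve a name the target
    serves) minus (our own RTT to the vantage server). Assumes the vantage
    server is open-recursive and the target name isn't already cached.
    """
    rtt_to_vantage = query_rtt(vantage_ns_ip, name_vantage_serves, recurse=False)
    rtt_via_vantage = query_rtt(vantage_ns_ip, name_served_by_target, recurse=True)
    return max(rtt_via_vantage - rtt_to_vantage, 0.0)
```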

Both Akamai and Limelight pointed to flaws in the study, and I’ve done a series of posts that critique the methodology and the conclusions.

For convenience, here are the links to my analysis:
What the Microsoft CDN study measures
Blind spots in the Microsoft CDN study
Availability and the Microsoft CDN study
Assessing CDN performance

Hopefully the full PDF of the study will return to public view soon. Despite its flaws, it’s still tremendously interesting and a worthwhile read.

MediaMelon and CDN overlays

MediaMelon has launched, with what they call their “video overlay network”.

I haven’t been briefed by the company yet (although I’ve just sent a request for a briefing), but from the press release and the website, it looks like what they’ve got is a client that utilizes multiple CDNs (and other data sources) to pull and assemble segments of video prior to the user watching the content. The company’s website mentions neither board of directors nor management team, though the press release mentions the CEO, Kumar Subramian.

I’ll post more when I have some details about the company and their technology, but I’ll note that I think that software-based CDN overlay networks are going to be a rising trend. As the high-volume video providers increasingly commoditize their CDN purchases, the value-added services layer will move from CDN-provided and CDN-specific, to CDN-neutral software-only components.

The nameserver as CDN vantage point

I was just thinking about the nameserver as a vantage point in the Microsoft CDN study, and I remembered that for the CDNs themselves, the nameserver is normally their point of reference to the customer.

When a content provider uses a CDN, they typically use a DNS CNAME to alias a hostname to a hostname of the CDN provider. For instance, www.nbc.com maps to www.nbc.com.edgesuite.net; the edgesuite.net domain is owned by Akamai. That means that when a DNS resolver goes to try to figure out what the IP address of that hostname is, it’s going to query the CDN’s DNS servers for that answer. The CDN’s DNS server looks at the IP address of the querying nameserver, and tries to return a server that is good for that location.

Notably, the CDN’s DNS server does not know the user’s actual IP. That information is not present in the DNS query (RFC 1035 specifies the structure of queries).

Therefore, what nameserver you use, and its proximity to where you actually are on the network, will determine how good the CDN’s response actually is.

I did a little bit of testing, with some interesting results. I’m using a combination of traceroute and IP geolocation to figure out where things are.
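
If you want to reproduce this sort of test yourself, a rough dnspython-based sketch is below. The resolver addresses are placeholders for whatever resolvers you want to compare, and the ping parsing assumes Linux/macOS output.

```python
import subprocess
import dns.resolver  # dnspython

# Placeholder resolver addresses -- substitute the ones you want to compare.
RESOLVERS = {
    "resolver-a": "156.154.70.1",
    "resolver-b": "0.0.0.0",   # e.g. your ISP's resolver
}

def edge_server_for(hostname, resolver_ip):
    """Ask a specific resolver which CDN edge server it maps `hostname` to."""
    r = dns.resolver.Resolver(configure=False)
    r.nameservers = [resolver_ip]
    answer = r.resolve(hostname, "A")       # follows the CNAME chain to an edge IP
    return next(iter(answer)).address

def ping_ms(ip, count=3):
    """Average ping to the edge server (parses the 'avg' field on Linux/macOS)."""
    out = subprocess.run(["ping", "-c", str(count), ip],
                         capture_output=True, text=True).stdout
    return float(out.rsplit("=", 1)[-1].split("/")[1]) if "=" in out else None

for label, resolver_ip in RESOLVERS.items():
    edge = edge_server_for("www.nbc.com", resolver_ip)
    print(f"{label}: {edge} ({ping_ms(edge)} ms)")
```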

At home, I have my servers configured to use the UltraDNS “DNS Advantage” free resolvers. They return their own ad server rather than NXDOMAIN, which is an annoyance, but they are also very fast, and the speed difference makes a noticeable dent in the amount of time that my mail server spends in (SpamAssassin-based) anti-spam processing. But I can also use the nameservers provided to me by MegaPath; these are open-recursive.

UltraDNS appears to use anycast. The DNS server that it picks for me seems to be in New York. And www.nbc.com ends up mapping to an Akamai server that’s in New York City, 12 ms away.

MegaPath does not. Using the MegaPath DNS server, which is in the Washington DC area, somewhere near me, www.nbc.com ends up mapping to a server that’s directly off the MegaPath network, but which is 18 ms away. (IP geolocation says it’s in DC, but there’s a 13 ms hop between two points in the traceroute, which is either an awfully slow router or, more likely, genuine distance.)

Now, let’s take my friend who lives about 20 miles from me and is on Verizon FIOS. Using Verizon’s DC-area nameserver, he gets the IP address of a server that seems to live off Comcast’s local network — and is a mere 6 ms from me.

For Limelight, I’m looking up www.dallascowboys.com. From UltraDNS in NYC, I’m getting a Limelight server that’s 14 ms away in NYC. Via MegaPath, I’m getting one in Atlanta, about 21 ms away. And asking my friend what IP address he gets off a Verizon lookup, I get a server here in Washington DC, 7 ms away.

Summing this up in a chart:

My DNS       Akamai ping    Limelight ping
UltraDNS     12 ms          14 ms
MegaPath     18 ms          21 ms
Verizon      6 ms           7 ms

The fact that Verizon has local nameservers and the others don’t makes a big difference in the quality of a CDN’s guess about which server it ought to use. Here’s a callout to service providers: Given the increasing amount of content, especially video, now served from CDNs, local DNS infrastructure is really important to you. Not only will it affect your end-user performance, but it will also affect how much traffic you’re backhauling across your network or across your peers.

On the surface, this might make an argument for server selection via anycast, which is used by some lower-cost CDNs. Since you can’t rely upon a user’s nameserver actually being close to them, it’s possible that the crude BGP metric could return better results than you’d expect. Anycast isn’t going to cut it if you’ve got lots of nodes, but for the many CDNs out there with a handful of nodes, it might not be that bad.

I went looking for other comparables. I was originally interested in Level 3, and dissected www.ageofconan.com (because there was a press release indicating an exclusive deal), but in doing so discovered that Funcom actually uses CacheFly for the website. funcom.cachefly.net returns the same IP no matter where you look it up from (I tried it locally, and from servers I have access to in Colorado and California). But traceroute clearly shows it’s going to different places, indicating an anycast implementation. Locally, I’ve got a CacheFly server a mere 6 ms away. From California, there’s also a local server, 13 ms away. Colorado, unfortunately, uses Chicago, a full 32 ms away. Ultimately, this doesn’t tell us much, beyond the fact that CacheFly has a limited footprint; we’d need to look at an anycast CDN with enough footprint to see whether it actually returns better results than the nameserver method does.

So here’s something for future researchers to explore: How well does resolver location correspond to user location? How much optimization is lost as a result? And how much better or worse would anycast be?

Assessing CDN performance

This is the fourth and probably final post in a series examining the Microsoft CDN study. The three previous posts covered measurement, the blind spots, and availability. This post wraps up with some conclusions.

The bottom line: The Microsoft study is very interesting reading, but it doesn’t provide any useful information about CDN performance in the real world.

The study’s conclusions are flawed to begin with, but what’s of real relevance to purchasers of CDN services is that even if the study’s conclusions were valid, its narrow focus on one element — one-time small-packet latency to the DNS servers and content servers — doesn’t accurately reflect the components of real-world CDN performance.

Cache hit ratios have a tremendous impact upon real-world CDNs. Moreover, the fallback mechanism on a cache miss is also important — does a miss require going back to the origin, or is there a middle tier? This will determine how much performance is impacted by a miss. The nature of your content and the CDN’s architecture will determine what those cache hit ratios look like, especially for long-tail content.
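
To see why both the hit ratio and the miss path matter so much, here is a simple illustration with hypothetical numbers:

```python
# Hypothetical numbers illustrating how hit ratio and the cache-miss fallback
# path (origin vs. a middle tier) drive average delivered latency.

def expected_latency_ms(hit_ratio, edge_ms, miss_penalty_ms):
    """Average first-byte latency: hits served at the edge, misses pay the fallback."""
    return hit_ratio * edge_ms + (1 - hit_ratio) * (edge_ms + miss_penalty_ms)

edge = 20           # ms to the edge server
origin = 180        # extra ms when a miss goes all the way back to origin
middle_tier = 60    # extra ms when a miss is served from a parent tier

for hit_ratio in (0.99, 0.90, 0.60):
    print(f"hit ratio {hit_ratio:.0%}: "
          f"to-origin {expected_latency_ms(hit_ratio, edge, origin):5.1f} ms, "
          f"middle-tier {expected_latency_ms(hit_ratio, edge, middle_tier):5.1f} ms")
```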

Throughput determines how quickly you get a file, and how well a CDN can sustain a bitrate for video. Throughput is affected by many factors, and can be increased through TCP/IP optimizations. Consistency of throughput also determines what your overall experience is; start-stop behavior caused by jittery performance can readily result in user frustration.
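
As one concrete example of those factors: with a fixed TCP window, sustainable throughput is bounded by roughly window size divided by round-trip time, which is why both proximity (lower RTT) and window tuning matter for sustaining a bitrate. A quick illustration with hypothetical numbers:

```python
# Throughput upper bound for a single TCP connection: window_size / RTT.

def max_throughput_mbps(window_bytes, rtt_ms):
    return (window_bytes * 8) / (rtt_ms / 1000) / 1_000_000

for rtt in (10, 50, 100):
    print(f"RTT {rtt:>3} ms: 64 KB window -> {max_throughput_mbps(64*1024, rtt):6.1f} Mbps, "
          f"1 MB window -> {max_throughput_mbps(1024*1024, rtt):7.1f} Mbps")
```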

More broadly, the problem is that any method of testing CDNs from anywhere other than the edge of the network, using real end-user vantage points, is flawed. Keynote and Gomez provide the best approximations on a day-to-day basis, but they’re only statistical samples. Gomez’s “Actual Experience” service uses an end-user panel, but that introduces uncontrolled variables into the mix if you’re trying to compare CDNs, and it’s still only sampling.

The holy grail of CDN measurement, of course, is seeing performance in real-time — knowing exactly what users are getting at any given instant from any particular geography. But even if a real-time analytics platform existed, you’d still have to try a bunch of different CDNs to know how they’d perform for your particular situation.

Bottom line: If you want to really test a CDN’s performance, and see what it will do for your content and your users, you’ve got to run a trial.

Then, once you’ve done your trials, you’ve got to look at the performance and the cost numbers, and then ask yourself: What is the business value of performance to me? Does better performance drive real value for you? You need to measure more than just the raw performance — you need to look at time spent on your site, conversion rate, basket value, page views, ad views, or whatever it is that tells you how successful your site is. Then you can make an intelligent decision.
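
As a toy back-of-the-envelope model — every number below is hypothetical — the question of whether a faster but pricier CDN pays for itself can be framed like this:

```python
# Hypothetical inputs: does the conversion lift attributed to better
# performance cover the price premium of the faster CDN?

monthly_gb          = 50_000
price_delta_per_gb  = 0.05       # how much more per GB the faster CDN costs
monthly_visits      = 2_000_000
conversion_lift     = 0.001      # absolute conversion-rate lift from better performance
average_order_value = 80.00

extra_cdn_cost = monthly_gb * price_delta_per_gb
extra_revenue  = monthly_visits * conversion_lift * average_order_value
print(f"extra CDN cost: ${extra_cdn_cost:,.0f}/mo, "
      f"extra revenue: ${extra_revenue:,.0f}/mo, "
      f"net: ${extra_revenue - extra_cdn_cost:,.0f}/mo")
```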

In the end, discussions of CDN architecture are academically interesting, and certainly of practical interest to engineers in the field, but if you’re buying CDN services, architecture is only relevant to you insofar as it affects the quality of the user experience. If you’re a buyer, don’t get dragged into the rathole of debating the merits of one architecture versus another. Look at real-world performance, and think short-term; CDN contract lengths are getting shorter and shorter, and if you’re a high-volume buyer, what you care about is performance right now and maybe in the next year.

Availability and the Microsoft CDN study

This post is the third in a series examining the Microsoft CDN study. My first post examined what was measured, and the second post looked at the blind spots created by the vantage-point discovery method they used. This time, I want to look at the availability and maintenance claims made by the study.

CDNs are inherently built for resilience. The whole point of a CDN is that individual servers can fail (or be taken offline for maintenance), with little impact on performance. Indeed, entire locations can fail, without affecting the availability of the whole.

If you’re a CDN, then the fewer nodes you have, the more impact the total failure of a node will have on your overall performance to end-users. However, the flip side of that is that megaPOP-architecture CDNs generally place their nodes in highly resilient facilities with extremely broad access to connectivity. The most likely scenario that takes out an entire such node is a power failure, which in such facilities generally requires a cascading chain of failure (but can go wrong at single critical points, as with the 365 Main outage of last year). By contrast, the closer you get to the edge, the higher the likelihood that you’re not in a particularly good facility and you’re getting connectivity from just one provider; failure is more probable but it also has less impact on performance.

Because the Microsoft study likely missed a significant number of Akamai server deployments, especially local deployments, it may underestimate Akamai’s single-server downtime, if you assume that such local servers are statistically more likely to be subject to failure.

I would expect, however, that most wider-scale CDN outages are related not to asset failure (facility or hardware), but to software errors. CDNs, especially large CDNs, are extraordinarily complex software systems. There are scaling challenges inherent in such systems, which is why CDNs often experience instability issues as part of their growing pains.

The problem with the Microsoft study’s treatment of availability is that whether or not a particular server or set of servers responds to requests is not really germane to availability per se. What is useful to know is the variance in performance based upon that availability, and what percentage of the time the CDN selects a content server that is actually unavailable or returning poor performance. The variance plays into the edge-vs-megaPOP question, and the selection accuracy indicates the quality of the CDN’s software algorithms as well as real-world performance. The Microsoft study doesn’t help us there.

Similarly, whether or not a particular server is in service does not indicate what the actual maintenance cost of the CDN is. Part of the core skillset of a CDN company is the ability to maintain very large amounts of hardware without using a lot of people. They could very readily have automated processes pulling servers out of service, and executing software updates and the like with little to no human intervention.

Next up: Some conclusions.
