Assessing CDN performance

This is the fourth and probably final post in a series examining the Microsoft CDN study. The three previous posts covered measurement, the blind spots, and availability. This post wraps up with some conclusions.

The bottom line: The Microsoft study is very interesting reading, but it doesn’t provide any useful information about CDN performance in the real world.

The study’s conclusions are flawed to begin with, but what’s of real relevance to purchasers of CDN services is that even if the study’s conclusions were valid, its narrow focus on one element — one-time small-packet latency to the DNS servers and content servers — doesn’t accurately reflect the components of real-world CDN performance.

Cache hit ratios have a tremendous impact on real-world CDN performance. The fallback mechanism on a cache miss matters, too: does a miss require going all the way back to the origin, or is there a middle tier of caches? That determines how much a miss hurts. The nature of your content and the CDN’s architecture will determine what those cache hit ratios look like, especially for long-tail content.
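To make the arithmetic concrete, here’s a minimal sketch in Python of how the hit ratio and the miss path combine into an expected response time. The latency figures are purely illustrative assumptions, not numbers from the study or from any particular CDN:

```python
# Illustrative sketch: how cache hit ratio and the miss path combine
# into an expected response time. All numbers are hypothetical.

def expected_latency(hit_ratio, edge_ms, miss_ms):
    """Blend edge-hit latency with the cost of a miss."""
    return hit_ratio * edge_ms + (1 - hit_ratio) * miss_ms

EDGE_MS = 30       # served from the edge cache (assumed)
MID_TIER_MS = 90   # miss falls back to a middle (parent) tier (assumed)
ORIGIN_MS = 250    # miss goes all the way back to the origin (assumed)

for hit_ratio in (0.99, 0.95, 0.80):
    to_origin = expected_latency(hit_ratio, EDGE_MS, ORIGIN_MS)
    to_mid = expected_latency(hit_ratio, EDGE_MS, MID_TIER_MS)
    print(f"hit ratio {hit_ratio:.0%}: "
          f"origin-only fallback ~{to_origin:.0f} ms, "
          f"mid-tier fallback ~{to_mid:.0f} ms")
```

Even with these made-up numbers, the point is visible: once the hit ratio slips (as it does for long-tail content), the cost of the miss path starts to dominate, and a middle tier softens the blow considerably.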

Throughput determines how quickly you get a file, and how well a CDN can sustain a bitrate for video. Throughput is affected by many factors and can be increased through TCP/IP optimizations. Consistency of throughput also shapes the overall experience; the start-stop behavior caused by jittery performance can readily frustrate users.
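As a rough illustration, consider two hypothetical throughput traces with roughly the same average. The numbers below are made up, but they show why consistency matters as much as the headline figure:

```python
# Illustrative sketch: two hypothetical throughput traces (Mbps, sampled
# once per second) with similar averages but very different consistency.
from statistics import mean, pstdev

TARGET_BITRATE = 3.0  # Mbps needed to sustain the video stream (assumed)

steady = [3.4, 3.5, 3.3, 3.6, 3.4, 3.5, 3.3, 3.4]
jittery = [5.8, 1.2, 6.1, 0.9, 5.9, 1.1, 6.0, 0.6]

for name, trace in (("steady", steady), ("jittery", jittery)):
    shortfall = sum(1 for sample in trace if sample < TARGET_BITRATE)
    print(f"{name}: mean {mean(trace):.1f} Mbps, "
          f"stdev {pstdev(trace):.1f}, "
          f"seconds below target: {shortfall}/{len(trace)}")
```

Both traces average out to roughly the same throughput, but the jittery one spends half its time below the bitrate the video needs, which is exactly the start-stop experience that drives users away.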

More broadly, the problem is that any method of testing CDNs from anywhere other than the edge of the network, using real end-user vantage points, is flawed. Keynote and Gomez provide the best approximations on a day-to-day basis, but they’re only statistical samples. Gomez’s “Actual Experience” service uses an end-user panel, but that introduces uncontrolled variables into the mix if you’re trying to compare CDNs, and it’s still only a sample.

The holy grail of CDN measurement, of course, is seeing performance in real-time — knowing exactly what users are getting at any given instant from any particular geography. But even if a real-time analytics platform existed, you’d still have to try a bunch of different CDNs to know how they’d perform for your particular situation.

Bottom line: If you want to really test a CDN’s performance, and see what it will do for your content and your users, you’ve got to run a trial.

Then, once you’ve done your trials, look at the performance and the cost numbers and ask yourself: what is the business value of performance? Does better performance drive real value for you? You need to measure more than just raw performance: look at time spent on your site, conversion rate, basket value, page views, ad views, or whatever it is that tells you how successful your site is. Then you can make an intelligent decision.
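As a back-of-the-envelope illustration only (every figure below is a made-up assumption, not data from any trial), that comparison might look something like this:

```python
# Illustrative sketch: weigh a faster (pricier) CDN against the business
# value it drives. Every figure here is a hypothetical assumption.

MONTHLY_VISITS = 2_000_000
BASELINE_CONVERSION = 0.020      # conversion rate on the slower CDN
CONVERSION_UPLIFT = 0.001        # absolute uplift measured in the trial
AVG_BASKET_VALUE = 45.00         # dollars per converted visit
MONTHLY_PRICE_PREMIUM = 8_000.00 # extra cost of the faster CDN

extra_conversions = MONTHLY_VISITS * CONVERSION_UPLIFT
extra_revenue = extra_conversions * AVG_BASKET_VALUE
net_value = extra_revenue - MONTHLY_PRICE_PREMIUM

print(f"extra conversions/month: {extra_conversions:,.0f}")
print(f"extra revenue/month:     ${extra_revenue:,.0f}")
print(f"net value of the faster CDN: ${net_value:,.0f}/month")
```

The arithmetic is trivial; the hard part is getting honest numbers for the uplift, which is precisely why you run the trial and measure your own business metrics rather than relying on anyone else’s benchmark.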

In the end, discussions of CDN architecture are academically interesting, and certainly of practical interest to engineers in the field, but if you’re buying CDN services, architecture is only relevant to you insofar as it affects the quality of the user experience. If you’re a buyer, don’t get dragged into the rathole of debating the merits of one architecture versus another. Look at real-world performance, and think short-term; CDN contract lengths are getting shorter and shorter, and if you’re a high-volume buyer, what you care about is performance right now and maybe over the next year.

Posted on October 15, 2008, in Infrastructure.