What the Microsoft CDN study measures

Cheng Huang et al.’s Measuring and Evaluating Large-Scale CDNs, a collaboration between Microsoft Research and NYU, is worth a closer look. This is the first of what I expect will be a series of posts aiming to explain what was studied and what it means.

The study charts the Akamai and Limelight CDNs, and compares their performance. Limelight has publicly responded, based on questions from Dan Rayburn.

I want to begin by talking about what this study does and doesn’t measure.

The study measures two things: latency to the CDN’s DNS server, and latency to the CDN’s content server. This is latency in the purest network sense — the milliseconds of transit time between the origin measurement point (the “vantage point”) and a particular CDN server. The study uses a modified King methodology, which means the origin measurement points are open recursive DNS servers. In plain English, that means that the origin measurement points are ordinary DNS resolvers — the servers provided by ISPs, universities, and some businesses that have their resolvers outside the firewall. The paper states that 282,700 unique resolvers were used as vantage points.
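
The King-style estimate boils down to simple subtraction: a query relayed through an open resolver takes roughly one round trip to the resolver plus one round trip from the resolver to the target, so measuring the resolver’s own RTT separately lets you isolate the second leg. A minimal sketch of that arithmetic (function name and example numbers are mine, not the paper’s):

```python
def king_estimate_ms(resolver_rtt_ms: float, recursive_query_ms: float) -> float:
    """Estimate resolver-to-target latency, King-style.

    A DNS query sent through an open recursive resolver, for a name whose
    authoritative server is the measurement target, takes roughly
    (me -> resolver) + (resolver -> target) in round trips. Subtracting a
    directly measured RTT to the resolver leaves the resolver-to-target leg.
    """
    return max(0.0, recursive_query_ms - resolver_rtt_ms)

# Example: 18 ms to the resolver, 60 ms for the full recursive lookup
# leaves roughly 42 ms between the resolver and the CDN's DNS server.
print(king_estimate_ms(18.0, 60.0))  # 42.0
```

Note that the vantage point in this scheme is the resolver, not the machine issuing the query — which is exactly the limitation discussed below.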

Open recursive DNS servers (I’m just going to call them “resolvers” for short) are typically at the core of networks, not at the edge. They sit in the data centers of service providers and organizations; in the case of service providers, they may sit at major aggregation points. For instance, I’m a MegaPath DSL customer; the two MegaPath-based resolvers provided to me sit at locations with average ping times of 18 ms and 76 ms. The issues with this are particularly acute given the study’s resolver discovery methodology — open authoritatives found by a reverse DNS lookup. Among other things, this results in large, diverse networks being significantly under-represented.

So what this study emphatically does not measure is latency to the end user. Instead, think of it as latency to the core of a very broad spectrum of networks, where “the core” means a significant data center or aggregation point, and “networks” mean service provider networks as well as enterprise networks. This is going to be very important when we consider the Akamai/Limelight performance comparison.

Content delivery performance can typically be broken down into the “start time” — the amount of time that passes until the first byte of content is delivered to the user — and the “transfer time”, which is how long it takes for the content to actually get delivered.

The first component of the start time is the DNS resolution time. The URL is typically a human-readable name; this has to get turned into an IP address that a computer can understand. This is where CDNs are magic — they take that hostname and they turn it into the IP address of a “good”, “nearby” CDN server to get the content from. This component is what the study is measuring when it’s measuring the CDN DNS servers. The performance of this involves:


  • the network latency between the end-user and his resolver
  • the network latency between his resolver and the CDN’s DNS server
  • the amount of time it takes for the CDN’s DNS server to return a response to the query (the CDN has to figure out which server it wants to serve the content from, which takes some computational cycles to process; in order to cut down computational time, it tends to be a “good enough” server rather than “the optimal” server)
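
A back-of-envelope sum of the three components above, with purely illustrative numbers (these are not figures from the study):

```python
def dns_resolution_ms(user_to_resolver_ms: float,
                      resolver_to_cdn_dns_ms: float,
                      cdn_decision_ms: float) -> float:
    """Worst-case (cold-cache) DNS resolution time: one round trip from the
    user to their resolver, one from the resolver to the CDN's DNS server,
    plus the CDN's server-selection processing time. A warm resolver cache
    would skip the last two entirely until the record's TTL expires."""
    return user_to_resolver_ms + resolver_to_cdn_dns_ms + cdn_decision_ms

print(dns_resolution_ms(18.0, 40.0, 5.0))  # 63.0
```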

The start time has another component, which is how long it takes for the CDN content server to find the file it’s going to serve, and start spitting it out over the network to the end user. This is a function of server performance and workload, but it’s also a function of whether or not the content is in cache. If it’s not in cache, it’s got to go fetch it from the origin server. Therefore, a cache miss is going to greatly increase the start time. The study doesn’t measure this at all, of course.

The transfer time itself is dependent upon the server performance and workload, but also upon the network performance between the CDN’s content server and the end user. This involves not just latency, but also packet loss (although most networks today have very little packet loss, to the point where some carriers offer 0% packet loss SLAs). During the transfer period, jitter (the consistency of the network performance) may also matter, since spikes in latency may impact things like video, causing a stream to rebuffer or a progressive-download viewing to pause. In the end, the performance comes down to throughput — how many bytes can be shoved across the pipe, each second. The study measures latency to the content server, but it does not measure throughput, and throughput is the real-world metric for understanding actual CDN performance. Moreover, the study measures latency using a DNS packet — lightweight and singular. So it in no way reflects any TCP/IP tricks that a CDN might be doing in order to optimize its throughput.
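
One way to see why latency and throughput are different metrics: a single TCP connection can move at most one receive window per round trip, so the latency the study measures sets a throughput ceiling but doesn’t equal it, and window-related tuning can shift the ceiling without changing the measured latency at all. A sketch of that classic bound (window size and RTTs are illustrative):

```python
def tcp_throughput_ceiling_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on single-connection TCP throughput: one receive window
    delivered per round trip. Halving the RTT doubles the ceiling, which is
    why a one-packet latency measurement understates what edge proximity
    (or TCP tuning) is worth for actual transfers."""
    bytes_per_sec = window_bytes / (rtt_ms / 1000.0)
    return bytes_per_sec * 8 / 1_000_000

# Classic 64 KiB window: ~7.9 Mbps at 66 ms RTT vs ~15.9 Mbps at 33 ms.
print(round(tcp_throughput_ceiling_mbps(65536, 66.0), 1))  # 7.9
print(round(tcp_throughput_ceiling_mbps(65536, 33.0), 1))  # 15.9
```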

Now, let’s take all this in the context of the Akamai/Limelight comparison that’s being drawn. The study notes that DNS resolution time is 23% higher for Limelight than Akamai, and that Limelight’s content server latency is 114% higher. However, this includes regions for which Limelight has little or no geographic coverage. For instance, in North America, where both companies have good coverage, Akamai has a DNS server delay of 115.81 ms and a content server delay of 67.24 ms, vs. 78.64 ms and 79.03 ms respectively for Limelight. (It’s well-known that Akamai’s DNS resolution can be somewhat slower than competitors’, since its much more extensive and complex network results in greater computational complexity.)

The study theorizes that it’s comparing the Akamai “as far to the edge as possible” approach vs. the Limelight (and most other current-generation CDNs) “megaPOP” approach. In other words, the question being asked is, “How much performance difference is created by not being right at the edge?”

Unfortunately, this study doesn’t actually answer that question, because the vantage points — the open recursive DNS servers — are not at the edge. They’re at the core (whether of service provider or enterprise networks). They’re at locations with fast big-pipe connectivity, and likely located in places with excellent peering — both megaPOP-friendly. A CDN like Akamai is certainly also at those same megaPOP locations, of course, but the methodology means that a lot of vantage points are essentially looking at the same CDN points of presence, rather than the more diverse set that might otherwise be represented by actual end-users. It seems highly likely that the Akamai network performance difference, under conditions where both CDNs feel they have satisfactory coverage, is underestimated by the study’s methodology.

More to come…


Posted on October 13, 2008, in Infrastructure.
