Akamai article in ACM Queue
My colleague Nick Gall pointed out an article in the ACM Queue that I’d missed: Improving Performance on the Internet, by Akamai’s chief scientist, Tom Leighton.
There is certainly some amount of marketing spin in that article, but it is nonetheless a very good read. If you are looking for a primer on why there are CDNs, or are interested in understanding how the application delivery network service works, this is a great article. Even if you’re not interested in CDNs, the section called “Highly Distributed Network Design” has a superb set of principles for fault-tolerant distributed systems, which I’ll quote here:
- Ensure significant redundancy in all systems to facilitate failover.
- Use software logic to provide message reliability.
- Use distributed control for coordination.
- Fail cleanly and restart.
- Phase software releases.
- Notice and proactively quarantine faults.
One niggle: The article says, “The top 30 networks combined deliver only 50 percent of end-user traffic, and it drops off quickly from there, with a very-long-tail distribution over the Internet’s 13,000 networks.” That statement needs a very important piece of context: most of those networks do not belong to network operators (i.e., carriers, cable companies, etc.). Many of them are simply “autonomous systems” (in Internet parlance) owned by enterprises, or belonging to Web hosters, and so forth. That’s why the top 30 account for so much of the traffic, and that percentage would rise sharply if you also allocated to them the traffic of the enterprises that buy transit from them. (Those interested in looking at data to do a deeper dive should check out the Routing Report site.)
COBOL comes to the cloud
In this year of super-tight IT budgets and focus on stretching what you’ve got rather than replacing it with something new, Micro Focus is bringing COBOL to the cloud.
Most vendor “support for EC2” announcements are nothing more than hype. Amazon’s EC2 is a Xen-virtualized environment. It supports the operating systems that run in that environment; most customers use Linux. Applications run no differently there than they do in your own internal data center. There’s no magical conveyance of cloud traits. Same old app, externally hosted in an environment with some restrictions.
But Micro Focus (whose business is focused on COBOL-based products) is actually launching its own cloud service, built on top of partner clouds — EC2, as well as Microsoft’s Azure (previously announced).
Micro Focus has also said it has tweaked its runtime for cloud deployment. They give the example of storing VSAM files as blobs in SQL. This is undoubtedly due to Azure not offering direct access to the filesystem. (For EC2, you can get persistent normal file storage with EBS, but there are restrictions.) I assume that similar tweaks were made wherever the runtime needs to do direct file I/O. Note that this still doesn’t magically convey cloud traits, though.
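Micro Focus hasn’t published details of how the runtime does this, but the general trick — keeping record-oriented file data in a database blob rather than on a local filesystem — is easy to sketch. In the sketch below, sqlite3 is purely a stand-in for whatever SQL store they actually use, and the table name, column names, and sample record are all invented for illustration:

```shell
# Hypothetical sketch: persist a data file's contents as a blob in SQL
# instead of on the local filesystem. Schema and data are illustrative only.
rm -f /tmp/demo.db
sqlite3 /tmp/demo.db "CREATE TABLE vsam_files(name TEXT PRIMARY KEY, data BLOB);"
sqlite3 /tmp/demo.db "INSERT INTO vsam_files VALUES('customer.dat', CAST('CUST0001|ALICE|0042' AS BLOB));"
# The runtime would then serve the application's file I/O out of the blob:
sqlite3 /tmp/demo.db "SELECT length(data) FROM vsam_files WHERE name='customer.dat';"
```

The appeal of this design for a platform like Azure is that the database is a durable, network-reachable service, so the runtime never needs a writable local disk at all.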
It’s interesting to see that Micro Focus has built its own management console around EC2, providing easy deployment of apps based on their technology, and is apparently making a commitment to providing this kind of hosted environment. Amidst all of the burgeoning interest in next-generation technologies, it’s useful to remember that most enterprises have a heavy burden of legacy technologies.
(Disclaimer: My husband was founder and CTO of LegacyJ, a Micro Focus competitor, whose products allow COBOL, including CICS apps, to be deployed within standard J2EE environments — which would include clouds. He doesn’t work there any longer, but I figured I should note the personal interest.)
More cloud news
Another news round-up, themed around “competitive fracas”.
Joyent buys Reasonably Smart. Cloud hoster Joyent has picked up Reasonably Smart, a tiny start-up with an APaaS offering based, unusually enough, on JavaScript and the Git version-control system. GigaOM has an analysis; I’ll probably post my take later, once I get a better idea of exactly what Reasonably Smart does.
DreamHost offers free hosting. DreamHost — one of the more prominent, popular mass-market and SMB hosting providers — is now offering free hosting for certain applications, including WordPress, Drupal, MediaWiki, and phpBB. There are a limited number of beta invites out there, and DreamHost notes that the service may become $50/year later. (The normal DreamHost base plan is $6/month.) Increasingly, shared hosting companies are having to compete with free application-specific hosting services like WordPress.com and Wikidot, and they’re facing the looming spectre of giants like Google giving away cloud capacity for free. And shared hosting is a cutthroat market already. So, here’s another marketing salvo being fired.
Google goes after Microsoft. Google has announced it’s hiring a sales force to pitch the Premier Edition of Google Apps to customers who are traditionally Microsoft customers. I’d expect the two key spaces where they’ll compete are email and collaboration, going after the Exchange and SharePoint base.
Women, software development, and 10,000 hours
My colleague Thomas Otter is encouraging us to support Ada Lovelace Day — an effort to have bloggers showcase role-model women in technology on March 24th.
His comments lead me to muse upon how incredibly male-dominated the IT industry is, especially in the segment that I watch — IT operations. At Gartner’s last data center conference, for instance, women made up a tiny fraction of attendees, and many of the ones that I spoke with were in a “pink role” like Marketing with a vendor, or had side-stepped being hands-on by entering an IT management role out of an early career spent in some other discipline.
I’m all for raising awareness of role models, but I lean towards the belief that the lack of women in IT, especially in hands-on roles, is something more fundamental — a failure to be adequately prepared, in childhood, to enter a career in IT. Resources devoted to the girls-and-computers issue have long noted that a girl is less likely to get a PC in her bedroom (all the paranoia about letting kids privately use the Internet aside), less likely to be encouraged to start programming, and less likely to get encouragement from either peer or parental sources, compared to a boy. The differences are already stark by the high school level. And that’s not even counting the fact that the Internet can be a cesspool of misogyny.
Malcolm Gladwell’s Outliers has gotten a lot of press and interest of late. In it, he asserts a general rule that it takes 10,000 hours, within 10 years, to become truly expert at something.
There is a substantial chance that a boy who is interested in computers will get in his 10,000 hours of useful tinkering, possibly even focused programming, before he finishes college. That’s particularly true if he’s got a computer in his bedroom, where he can focus quietly on a project.
Conversely, a girl who initially starts with similar inclinations is vastly less likely to be encouraged down the path that leads to spending two or three hours a night, for ten years, mucking around with a computer.
I was very fortunate: even though my parents viewed computers as essentially fancy toys for boys, they nonetheless bought me computers (and associated things, like electronics kits), allowed me to have a computer in my bedroom, and tolerated the long stints at the keyboard that meant I accumulated 10,000 hours of actual programming time (on top of some admittedly egregious number of hours playing computer games), well within a 10-year timeframe. I majored in computer science engineering, and I did a lot of recreational programming in college, too, as well as paid systems administration and programming work, but the key thing is: College classes taught me very few practical IT skills. I already had the foundation by the time I got there.
Academic computer science is great for teaching theory, but if you only do enough programming to do well in your classes, you’re simply not spending that much time acquiring expertise. And that leads to the phenomenon where companies interview entry-level software development candidates who look pretty similar on paper, but some of whom have already put in 10,000+ hours learning the trade, and some of whom are going to have to spend the first five years of their careers doing so. Given the way the culture works (at least in the US), there’s enormous social pressure on girls and women not to nerd out intensively on their own time, and while that might be framed in ostensibly positive terms like “a more balanced lifestyle”, it absolutely hurts many women when they try to enter the IT workforce.
Cloud debate: GUI vs. CLI and API
In the greater blogosphere, as well as amongst the cloud analysts across the various research firms, there’s been an ongoing debate over the question, “Does a cloud have to have an API to be a cloud?”
Going beyond that question, though, there are two camps of cloud users emerging — those who prefer the GUI (control panel) approach to controlling their cloud, and those who prefer command-line interfaces and/or APIs. These two camps can probably be classified as the automated and the automators — those users who want easy access to pre-packaged automation, and those who want to write automation of their own.
This distinction has long existed in the systems administration community — the split between those who rely on administrator GUIs to do things, vs. those who do everything via the command line, config-file editing, and their own scripts. But the advent of cloud computing and associated tools, with their relentless drive towards standardization and automation, is casting these preferences into an increasingly stark light. Moreover, the emerging body of highly sophisticated commercial tools for cloud management (virtual data center orchestration and everything that surrounds it) means that in the future, even the more sophisticated IT operations folks who are normally self-reliant will end up taking advantage of those tools rather than writing stuff from scratch. That suggests that tools will also follow two paths — there will be tools designed to be customized via GUI, and tools that are readily decomposable into scriptable components and/or provide APIs.
I’ve previously asserted that cloud drives a skills shift in IT operations personnel, creating a major skills chasm between those who use tools, and those who write tools.
The emerging cloud infrastructure services seem to be pursuing one of two initial paths — exposure via API, and thus high scriptability for the knowledgeable (e.g., Amazon Web Services), or a friendly control panel (e.g., Rackspace’s Mosso). While I’d expect that most public clouds will eventually offer both, I expect that both services and do-it-yourself cloud software will tend to emphasize capabilities one way or the other, focusing on either the point-and-click crowd or the systems programmers.
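The “automator” mindset is easy to illustrate. In this sketch, launch_instance is a stub standing in for whatever provisioning call a real cloud exposes (an ec2-run-instances invocation, an API request, etc.), and the server and image names are invented:

```shell
# launch_instance is a stand-in for a real provisioning command; the point is
# that an API/CLI-driven cloud lets you treat capacity as something you script.
launch_instance() {
  echo "launched ${1} (image=${2})"
}

# Bring up a three-node web tier in a loop, instead of three trips through a GUI.
for i in 1 2 3; do
  launch_instance "web-${i}" "ami-example"
done
```

The GUI camp gets the same result by clicking a launch button three times; the automator camp gets something repeatable that can be checked into version control and grown into real tooling.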
(A humorous take on this, via an old Craigslist posting: Keep the shell people alive.)
The value, or not, of hands-on testing
At Gartner, we generally do not do hands-on testing of products, with the exception of some folks who cover consumer devices and the like. And even then, it’s an informal thing. Unlike the trade rags, for instance, we don’t have test labs or other facilities for doing formal testing.
There are a lot of reasons for that, but from my perspective, the analyst role has primarily been to advise IT management. Engineers can and will want to do testing themselves (and better than we can). Also, for the mid-size to enterprise market that we target, any hands-on testing that we might do is relatively meaningless vis-à-vis the complexity of the typical enterprise implementation.
Yet, the self-service nature of cloud computing makes it trivially cheap to do testing of these services, and without needing to involve the vendor. (I find that if I’m paying for a service, I feel free to bother the customer support guys, and find out what that’s really like, without being a nuisance to analyst relations or getting a false impression.) So for me, testing things myself has a kind of magnetic draw; call it the siren song of my inner geek. The question I’m asking myself, given the time consumed, is, “To what end?”
I think the reason I’m trying to do at least a little bit of hands-on with each major cloud is that I feel like I’m grounding hype in reality. I know that in superficially dinking around with these clouds, I’m only lightly skimming the surface of what it’s like to deploy in the environments. But it gives me an idea of how turnkey something is, or not, as well as the level of polish in these initial efforts.
This is a market that is full of incredible hype, and going through the mental exercise of “how would I use this in production?” helps me to advise my clients on what is and isn’t ready for prime time. An acquaintance once memorably wrote, when he was disputing some research, that analysts sit at the apex of the hype food-chain, consuming pure hype and excreting little pellets of hype as dense as neutronium. I remember being both amused and deeply offended when I first read that. Of course, I think he was very wrong — whatever we’re fed in marketing tends to be more than overcome by the overwhelming volume of IT buyer inquiry we do, which is full of the ugly reality of actual implementation. But the comment has stuck in my memory as a dark reminder that an analyst needs to be vigilant about not feeding at the hype-trough. Keeping in touch by being at least a little hands-on helps to inoculate against that.
However, I realized, after talking to a cloud vendor client the other day, that I probably should not waste their phone inquiry time talking about hands-on impressions. That’s better left to this blog, or email, or their customers and the geek blogosphere. My direct impressions are only meaningfully relevant to the extent that what I experience hands-on contradicts marketing messages, or indicates a misalignment between strategy and implementation, or otherwise touches something higher-level. My job, as an analyst, is to not get lost in the weeds.
Nevertheless, there’s simply nothing like gaining a direct feel for something, and I am, unfortunately, way behind in my testing; I’ve got more demos accumulating than I’ve had time to try out, and the longer it takes to set something up, the more it lags in my mental queue.
Touring Amazon’s management console
The newly-released beta of Amazon’s management console is reasonably friendly, but it is not going to let your grandma run her own data center.
I took a bit of a tour today. I’m running Firefox 3 on a Windows laptop, but everything else I’m doing out of a Unix shell — I have Linux and Mac OS X servers at home. I already had AWS stuff set up prior to trying this out; I’ve previously used RightScale to get a Web interface to AWS.
The EC2 dashboard starts with a big friendly “Launch instances” button. Click it, and it takes you to a three-tab window for picking an AMI (your server image). There’s a tab for Amazon’s images, one for your own, and one for the community’s (which includes a search function). After playing around with the search a bit (and wishing that every community image came with an actual blurb of what it is), and not finding a Django image that I wanted to use, I decided to install Amazon’s Ruby on Rails stack.
With further use, the “Select” buttons on this set of tabs turn out to have weird issues; sometimes you’ll go to them and they’ll be grayed out and unclickable, and sometimes you’ll click them and they’ll go gray, but you won’t get the little “Loading, please wait” box that appears before going on to the next tab — which leaves you stuck, forced to cancel the window and try again.
Once you select an image, you’re prompted to select how many instances you want to launch, your instance type, key pair (necessary to SSH into your server), and a security group (firewall config). More twiddly bits, like the availability zone, are hidden in advanced options. Pick your options, click “Launch”, and you’re good to go.
From the launch window, the firewall options default to having a handful of relevant ports (like SSH, webserver, MySQL) open to the world. You can’t make the rules more granular there; you’ve got to use the Security Group config panel to add a custom rule. I wish the defaults were slightly stricter, like limiting the MySQL port to Amazon’s back-end.
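Tightening the wide-open MySQL rule comes down to a revoke-and-reauthorize pair. A sketch using the command names from the 2009-era ec2-api-tools (check your version’s help output for exact flags); it’s shown through a print-only wrapper since the real commands need AWS credentials, and the internal CIDR is purely illustrative:

```shell
# Print-only wrapper so the commands can be inspected without AWS credentials;
# drop it to actually apply the changes on a configured system.
run() { echo "would run: $*"; }

# Drop the world-open MySQL rule from the 'default' security group...
run ec2-revoke default -p 3306 -s 0.0.0.0/0
# ...and re-add it restricted to an internal address range (illustrative CIDR).
run ec2-authorize default -p 3306 -s 10.0.0.0/8
```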
Next, I went to create an EBS volume for user data. This, too, is simple, although initially I did something stupid, failing to notice that my instance had launched in us-east-1b. (Your EBS volume must reside in the same availability zone as your instance, in order for the instance to mount it.)
That’s when I found the next interface quirk — the second time I went to create an EBS volume, the interface continued to insist for fifteen minutes that it was still creating the volume. Normally there’s a very nice Ajax bit that automatically updates the interface when it’s done, but this time, even clicking around the whole management console and trying to come back wouldn’t get it to update the status and thus allow me to attach it to my instance. I had to close out the Firefox tab, and relaunch the console.
Then I remembered that the default key pair I’d created had been generated via RightScale, and I couldn’t remember where I’d stashed the PEM credentials. So that led me to a round of creating a new key pair via the management console (very easy), and having to terminate and launch a new instance using the new key pair (subject to the previously mentioned interface quirks).
The same stuck-in-an-indeterminate-state problem also seems to affect other things, like the console “Output” button for instances — you get a blank screen rather than the console dump.
That all dealt with, I log into my server via SSH, don’t see the EBS volume mounted, and remember that I need to actually make a filesystem on it and explicitly mount it. All creating an EBS volume does, essentially, is allocate you an abstraction on Amazon’s SAN. This leads me to try to find documentation for EBS, which leads to the reminder that access to docs on AWS is terrible. The search function on the site doesn’t index articles, and there are far too many articles to just click through the list looking for what you want. A Google search is really the only reasonable way to find things.
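For the record, the missing step is ordinary Unix housekeeping once the volume is attached. A sketch — the device name and mount point are whatever you chose at attach time, and it’s shown print-only since the real commands need root on the instance:

```shell
# Print-only wrapper; drop it to actually run these as root on the instance.
run() { echo "would run: $*"; }

run mkfs -t ext3 /dev/sdh      # one-time: create a filesystem on the raw volume
run mkdir -p /mnt/data
run mount /dev/sdh /mnt/data   # the EBS volume is now usable at /mnt/data
```

(To have the volume come back after a reboot, you’d also add a line for it to /etc/fstab.)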
All that aside, once I do that, I have an entirely functional server. I terminate the instance, check out my account, see that this little experiment has cost me 33 cents, and feel reasonably satisfied with the world.
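That 33 cents is the metering model in microcosm. A back-of-the-envelope sketch, using then-current list prices as assumptions (a small Linux instance ran roughly $0.10/hour; the hour count below is illustrative, not my actual usage):

```shell
# All figures in cents; the rate is an assumption from 2009-era list pricing.
INSTANCE_HOURS=3
RATE_CENTS_PER_HOUR=10
instance_cost=$((INSTANCE_HOURS * RATE_CENTS_PER_HOUR))
echo "instance: ${instance_cost} cents"
# EBS is billed per GB-month, so a small volume kept for a few hours adds well
# under a cent; a short experiment lands squarely in tens-of-cents territory.
```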
News round-up
A handful of quick news-ish takes:
Amazon has released the beta of its EC2 management console. This brings point-and-click friendliness to Amazon’s cloud infrastructure service. A quick glance through the interface makes it clear that effort was made to make it easy to use, beginning with big colorful buttons. My expectation is that a lot of the users who might otherwise have gone to RightScale et al. to get an easy-to-use GUI will now just stick with Amazon’s own console. Most of those users would have just been using the free service, but there’s probably a percentage who would otherwise have been upsold and who will now stick with what Amazon offers.
Verizon is courting CDN customers with the “Partner Port Program”. It sounds like this is a “buy transit from us over a direct peer” service — essentially becoming explicit about settlement-based content peering with content owners and CDNs. I imagine Verizon is seeing plenty of content dumped onto its network by low-cost transit providers like Level 3 and Cogent; by publicly offering lower prices and encouraging content providers to seek paid peering with it, it can grab some revenue and improve performance for its broadband users.
Scott Cleland blogged about the “open Internet” panel at CES. To sum up, he seems to think that the conversation is now being dominated by the commercially-minded proponents. That would certainly seem to be in line with Verizon’s move, which essentially implies that they’re resigning themselves to the current peering ecosystem and are going to compete directly for traffic rather than whining that the system is unfair (always disingenuous, given ILEC and MSO complicity in creating the current circumstances of that ecosystem).
I view arrangements that are reasonable from a financial and engineering standpoint, that do not seek to discriminate based on the nature of the actual content, to be the most positive interpretation of network neutrality. And so I’ll conclude by noting that I heard an interesting briefing today from Anagran, a hardware vendor offering flow-based traffic management (i.e., it doesn’t care what you’re doing, it’s just managing congestion). It’s being positioned as an alternative or supplement to Sandvine and the like, offering a way to try to keep P2P traffic manageable without having to do deep-packet inspection (and thus explicit discrimination).
Open source and behavioral economics
People occasionally ask me why busy, highly-skilled, highly-compensated programmers freely donate their time to open-source projects. In the past, I’ve nattered about the satisfaction of sharing with the community, the pleasure of programming as a hobby even if you do it for your day job, the “just make it work” attitude that often prevails among techies, altruism, idealism, the musings of people like Linus Torvalds, or research like the Lakhani and Wolf MIT/BCG study of developer motivation. (Speaking for myself, I code to solve problems, I am naturally inclined to share what I do with others, and I derive pleasure from having it be useful to others. The times I’ve written code for a living, I’ve always been lucky to have employers who were willing to let me open-source anything that wasn’t company-specific.)
But a chapter in Dan Ariely’s book Predictably Irrational got me thinking about a simpler way to explain it: Programmers contribute to free software projects for reasons that are similar to the reasons why lawyers do pro bono work.
The book posits that exchanges follow either social norms or market norms. If it’s a market exchange, we think in terms of money. If it’s a social exchange, we think in terms of human benefits. It’s the difference between a gift and a payment. Mentioning money (“a gift worth $10”) immediately transforms something into a market exchange. The book cites the example of lawyers being asked to do pro bono work — offered $30/hour to help needy clients, they refused, but asked to do it for free, there were plenty of takers. The $30/hour was viewed through the mental lens of a market exchange, mentally compared to their usual fees and deemed not worthwhile. Doing it for free, on the other hand, was viewed as a social exchange, evaluated on an entirely separate basis from the dollar value.
Contributing to free software follows the norms of the social exchange. The normative difference is also interesting in light of Richard Stallman’s assertion of the non-equivalence of “free software” and “open source”, and some of the philosophical debates that simmer in the background of the open-source movement; Stallman’s “free software” philosophy is intricately tied into the social community of software development.
The book also notes that problems occur when one tries to mix social norms and market norms. For instance, if you ask a friend to help you move, but he’s volunteering his time alongside paid commercial movers, that’s generally going to be seen as socially unacceptable. Commercial open-source projects conflate these two things all the time — which may go far toward explaining why few commercially-started projects gain much of a committer base beyond the core organizations and the developers who care and are paid to do so (either directly, or indirectly via an end-user organization that makes heavy use of that software).
(Edit: I just discovered that Ariely has actually done an interview on open source, in quite some depth.)
Predictably Irrational
If you deal with pricing, or for that matter, marketing or sales in general, and you’re going to read one related book this year, read Predictably Irrational: The Hidden Forces That Shape Our Decisions, by Dan Ariely. (I mentioned an article by him in a previous post on the impact of transparent pricing for CDNs, and I’ve finally had time to read his book.)
The book deals with behavioral economics, which can be summed up as the science of the way we perceive value and make economic decisions. It’s an entertaining read, describing a variety of experiments, their outcomes, and the broader conclusions that can be drawn. The book does an excellent job of demonstrating that we do not make such decisions in a fully rational manner, even when we think we are — but because there’s a predictable pattern to this irrationality, you can market and sell accordingly.
Two thoughts, among many others that I’m mulling over as a result:
Ariely asserts that people don’t know what new things ought to cost — they have no basis for comparison. Thus, establishing a basis for comparison creates the sense of value, and can be used to manipulate people’s mental pricing baselines and influence their decisions. For instance, given a thing, an inferior but cheaper version of that thing, and some other less-similar thing, people will generally choose the thing. That’s relevant when you think about the way people compare CDN services, especially first-time enterprise buyers.
Ariely also shows that given a useful but brand-new thing, people might not know whether it’s a good value and thus may choose not to buy it — but establish a comparison in the form of a bigger but much more expensive form of that thing, and people will see the original as a good value and buy it. This is hugely relevant in the emerging cloud computing market, where people aren’t yet certain what the billing units should be and what they should cost.
Relating this to my usual topics of interest: Amazon has essentially established a transparent baseline in both the cloud computing and CDN markets, with clearly-articulated, readily-available pricing, and as a result, they have implicit control of the conversation around pricing. Broadly, any vendor who puts a public stake in the ground on prices is going to exert influence over a customer’s perception of not only their value, but every other comparable vendor.