DDoS season

We are, it seems, in the midst of a wave of distributed denial of service attacks. The victims include:

  • Neustar’s UltraDNS. (Problems with specific regional DNS clusters, with little customer-visible impact.)
  • Register.com. (Severe impact on Web hosting and email customers.)
  • GoGrid. (Severe impact on cloud hosting customers.)
  • ThePlanet. (Attack on their DNS servers, with severe impact on customers.)

The attack on ThePlanet is unusual in that it received minimal attention in the press, despite the company being one of the largest Web hosters, and having Cisco Guard (DDoS mitigation) appliances in place. Also, the status updates were eventually issued via Twitter, rather than a more expected form of customer communication. Here’s the full text, aggregated off Twitter:

Between 2:30am and 5:00am CDT on April 8, The Planet’s name servers were flooded again with a large brute force (DDoS) attack. Unlike the previous attack, this attack did not appear to be DNS-specific; instead, targeted resources indirectly supporting DNS services. Because the nature of this attack was different from the previous event, mirroring the response to the previous attack was ineffective. Once our investigation determined the nature of the attack, we applied filters throughout our DNS support system to alleviate the effects. The Planet’s network and DNS performance have been restored, and the attack originator has ceased actions. Any lingering issues may be indicative of a different problem that may have been exacerbated by the attack and should be resolved quickly. We are working on several projects to help mitigate similar attacks in the future. Once those plans are in order, we will update the DNS Status announcement thread in our community forums. We understand that other providers are experiencing similar events. We will reach out to them, pool our information and then work together to find consistencies between attacks. Our goal is to establish best practices as an industry to better respond to these recent events.

Jose Nazario of Arbor Networks claims these attacks are not Conficker at work, which makes this wave of attacks even more interesting.

The takeaway from this: Customers understand if you get DDoS’d. They don’t put up with a lack of communication. It’s enormously difficult to communicate with customers in the midst of a crisis, especially one that takes down customer-facing infrastructure in a customer-impacting way, but it’s also incredibly critical. Clearly, not everyone in the company is out trying to troubleshoot the problem, so you can usefully put them to work reaching out to your customers, if you have the policies and procedures in place to do so successfully.

Something to think about today, no matter who you are and who you work for: What policies do you have in place for customer communications when a crisis hits your company? (Book recommendation: Eric Dezenhall’s Damage Control, which is a hard-edged, realistic look at communication in a crisis, including coping with competitors who are deliberately fanning the negative-PR flames.)

Google App Engine and other tidbits

As anticipated, Java support on Google App Engine has been announced. To date, GAE has supported only the Python programming language. In keeping with the “phenomenal cosmic power, itty bitty living space” sandboxing that’s become common to cloud execution environments, GAE/Java has all the restrictions of GAE/Python. However, the already containerized nature of Java applications means that the restrictions probably won’t feel as significant to developers. Many Python libraries and frameworks are not “pure Python”; they include C extensions for speed. Java libraries and frameworks are, by contrast, usually pure Java; the biggest issues for porting Java into the GAE environment are likely to be the restrictions on system calls and the lack of threads. Generically, GAE/Java offers servlets. The other things that developers are likely to miss are support for JMS and JMX (Java’s messaging and monitoring, respectively).

Overall, the Java introduction is a definite plus for GAE, and is presumably also an important internal proof point for them — a demonstration that GAE can scale and work with other languages. Also, because there are lots of languages that now target the Java virtual machine (i.e., they’ve got compilers/interpreters that produce byte code for the Java VM) — Clojure and Scala, for instance — as well as ports of other languages, like JRuby, we’ll likely see additional languages available on GAE ahead of Google’s own support for those environments.

Google also followed through on an earlier announcement, adding support for scheduled tasks (“cron”). Basically, at a scheduled time, GAE cron will invoke a URL that you specify. This is useful, but probably not everything people were hoping it would be. It’s still subject to GAE’s normal restrictions; this doesn’t let you invoke a long-running background process. It requires a shift in thinking: for instance, instead of doing the once-daily data cleanup run at 4 am, you ought to be doing cleanup throughout the day, every couple of minutes, a bit of your data set at a time.
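To make that shift in thinking concrete, here is a minimal sketch in plain Python. It does not use the GAE APIs: the function name `cleanup_batch`, the in-memory list standing in for a datastore, and the batch size are all hypothetical. The idea is simply that each cron-triggered request deletes a small slice of expired data and returns quickly, so no single request has to run for long.

```python
import time

BATCH_SIZE = 3  # hypothetical; small enough that each request finishes quickly


def cleanup_batch(records, now, max_age_seconds, batch_size=BATCH_SIZE):
    """Remove up to batch_size expired records in place; return how many were removed.

    `records` is a list of (key, created_at) tuples standing in for a datastore.
    """
    removed = 0
    kept = []
    for key, created_at in records:
        expired = (now - created_at) > max_age_seconds
        if expired and removed < batch_size:
            removed += 1  # a real handler would delete the stored entity here
        else:
            kept.append((key, created_at))
    records[:] = kept
    return removed


if __name__ == "__main__":
    now = time.time()
    store = [("item-%d" % i, now - 7200) for i in range(8)]  # 8 stale records
    store.append(("fresh", now))                             # 1 current record
    runs = 0
    # Each loop iteration stands in for one cron-triggered request.
    while cleanup_batch(store, now, max_age_seconds=3600):
        runs += 1
    print("cleanup requests that did work:", runs)
    print("records left:", [key for key, _ in store])
```

Stale records that do not fit in one batch are simply left for the next scheduled invocation, which is the trade-off this style of cleanup accepts.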

All of that is going to be chewed over thoroughly by the press and blogosphere, and I’ve contributed my two cents to a soon-to-be-published Gartner take on the announcement and GAE itself, so now I’ll point out something that I don’t think has been widely noticed: the unladen-swallow project plan.

unladen-swallow is apparently an initiative within Google’s compiler optimization team, with a goal of achieving a 5x speed-up in CPython (i.e., the normal, mainstream, implementation of Python), starting from the 2.6 base (the current version, which is a transition point between the 2.5 used by App Engine, and the much-different Python 3.0). The developers intend to achieve this speed-up in part by moving from the existing custom VM to one built on top of LLVM. (I’ve mentioned Google’s interest in LLVM in the past.) I think this particular approach answers some of the mystery surrounding Google and Python 3.0 — this seems to indicate longer-term commitment to the existing 2.x base, while still being transition-friendly. As is typical with Google’s work with open-source code, they plan to release these changes back to the community.

All of which goes back to a point of mine earlier this week: Although programming language communities strongly resemble fandoms, languages are increasingly fungible. We’re a long way from platform maturity, too.

Scala, Ruby, cost, and development trends

A recent interview with some Twitter developers about Twitter’s use of Scala has touched off a fair amount of controversy in the Ruby community, and has prompted Todd Hoff of High Scalability to muse on an interesting statement: At some point, the cost of servers outweighs the cost of programmers.

We all know that the scripting languages that are frequently favored in Web development today (Ruby, Python, and PHP) do not perform as well as Java, and Java in turn can be outperformed by well-written native C/C++ code. However, these popular dynamic programming languages typically lead to better programmer productivity. The argument has been that it’s more cost-effective to have more productive developers than it is to buy less infrastructure. There is a point, though, when that scale equation can be flipped on its head: when the cost of the servers, due to the performance sacrifices, gets too high. (I would add that you can’t look at simple hardware spend alone, either. You’ve got an infrastructure TCO to look at. It’s not just about more people to maintain more servers, either; that equation is not linear, as a sysadmin can manage more systems if they’re all identical and there are good automation tools. But systems that are struggling due to performance issues soak up operations time with daily firefighting.)
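A back-of-the-envelope sketch of where that flip happens, with entirely made-up numbers: the salaries, per-server TCO, and the 3x server multiplier below are illustrative assumptions, not measurements from Twitter or anyone else. The model is deliberately linear, so it ignores the non-linear operations effects mentioned above.

```python
def annual_cost(developers, cost_per_developer, servers, cost_per_server):
    """Total yearly spend: people plus infrastructure (TCO per server, not just hardware)."""
    return developers * cost_per_developer + servers * cost_per_server


def breakeven_servers(developer_savings, cost_per_server, server_multiplier):
    """Baseline server count at which the extra servers cancel out the developer savings.

    server_multiplier: how many times more servers the slower language needs (> 1).
    """
    extra_per_baseline_server = (server_multiplier - 1) * cost_per_server
    return developer_savings / extra_per_baseline_server


if __name__ == "__main__":
    # All figures below are illustrative assumptions.
    dev_cost = 150_000    # fully loaded cost per developer per year
    server_cost = 5_000   # TCO per server per year
    multiplier = 3        # the slower language needs 3x the servers
    saved_devs = 2        # the more productive language needs 2 fewer developers

    threshold = breakeven_servers(saved_devs * dev_cost, server_cost, multiplier)
    print("break-even at", threshold, "baseline servers")
    for baseline in (10, 30, 100):
        fast = annual_cost(10, dev_cost, baseline, server_cost)
        productive = annual_cost(10 - saved_devs, dev_cost,
                                 baseline * multiplier, server_cost)
        verdict = "productive language cheaper" if productive < fast else "no longer cheaper"
        print(baseline, "baseline servers:", verdict)
```

Below the threshold, developer productivity wins; above it, the server bill dominates, which is the point at which a move like Twitter’s starts to make financial sense.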

Twitter’s developers are not advocating that people abandon what they know and love, but they’re forging a new path for themselves, with an open-source language developed in academia. Scala can be compiled to either Java or .NET bytecode, allowing it to interoperate bidirectionally with Java and CLR code; this is important for driving adoption because programmers generally like to work with languages that have a solid base of libraries (i.e., someone else has conveniently done the work of producing code for commonly-needed capabilities), and because this makes it possible for Scala to leverage the existing tools community for Java and .NET. Scala’s equivalent of Rails, i.e., a convenient framework, is Lift.

Scala doesn’t have much adoption now, but it’s worth noting that the rapid pace of Web 2.0 innovation is capable of driving extremely fast uptake of things that turn out to solve real-world problems. (For comparison: Not long ago, practically no one had heard of Hadoop, either, but it’s built quite a bit of buzz now.) That’s important for anyone contemplating the long-term future of particular platforms, particularly APaaS offerings that are tied to specific programming languages. The favored platforms can and do change in a tidal fashion — just look at the Google trend graph for Ruby on Rails to see just how aggressively interest can increase over a single year (2005 to 2006).

As a coda to all of this, Twitter’s Alex Payne has a smart blog post, noting that social media fills the vacuum between peer-reviewed journals and water-cooler conversations, yet deploring the fact that in these mediums, emotion can rule over what is measurable. The takeaway — whether you’re an IT manager, a marketing manager at a vendor, or an investor — from my perspective, is this: There’s an emotional context to programming language choice. These are not merely technical communities; these are fandoms, and they form part of a developer’s self-identity.

OnLive game streaming

OnLive debuted its gaming service at the Game Developers Conference in what was apparently a pretty impressive demonstration, to judge from the press and blogosphere buzz. Basically, OnLive will run games on its server infrastructure and then stream them live to users over the Internet, thus allowing users to play titles for multiple consoles, as well as games whose normal hardware specs exceed their own PCs, on whatever computers they want.

Forrester’s Josh Bernoff is enthused about both the announcement and the broader implications of “your life in the cloud”. His take is an interesting read, though I’m not sure I agree with all of it. However, I do think that the implications of OnLive’s technology are well worth thinking about in the context of hosted desktop virtualization.

In order for OnLive to be able to deliver graphics-intensive, high-resolution, fast-twitch games over long-haul Internet links, they have to have an amazing, very low-latency way to transmit screen images from their central servers to users at the edge. We know it has to be screen images because in their scheme, the end-user’s computer is not responsible for rendering anything. (This kind of display is a hard problem; previous attempts to display games via remote desktop have run into serious performance issues.) From the way this is written about, the trick is that it’s sending video, meaning that it can stream as quickly as live video in general can be streamed. Real-time screen update is theoretically awesome for business uses too, not just for gaming. So I am extremely curious about the underlying technology.

I’m not sure whether I’m really OnLive’s target audience. I own all three modern consoles (Xbox 360, PS3, Wii), and a lot of my games come with peripherals. So my primary interest in this is mostly the ability to truly get games on-demand. But I am enough of a performance hound to own a high-end gaming monitor, gaming keyboard, gaming mouse, etc. for my PC (although ironically, no high-end graphics card), so any compromise in latency might not be my cup of tea. But it is certainly a terribly interesting idea.

AWS in Eclipse, and Azure announcements

Amazon’s announcement for today, with timing presumably associated with EclipseCon, is an AWS toolkit for the Eclipse IDE.

Eclipse, which is an open-source project under the aegis of IBM (who also offers a commercial version), is one of the two most popular IDEs (the other is Microsoft Visual Studio). Originally designed for Java applications, it has since been extended to support many other languages and environments.

Integrating with Eclipse is a useful step for Amazon, and hopefully other cloud providers will follow suit. It’s also a competitive response to the integration that Microsoft has done between Visual Studio and its Azure platform.

Speaking of Azure, as part of a set of announcements, Microsoft has said that it’s supporting non-.NET languages on Azure via FastCGI. FastCGI is a webserver extension that basically compiles and loads your scripts once, instead of every time they’re accessed, resulting in a reduction of computational overhead. You can run most languages under it, including Java, but it doesn’t really give you the full feature set that you get with tight integration with the webserver through a language-specific extension. (Note that because .NET’s languages encompass anything that supports the CLR, users already had some reasonable access to non-C# languages on Azure, through implementations like Ruby.NET, IronRuby, IronPython, etc.)
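The load-once behavior of FastCGI described above can be illustrated with a toy model. This is not the FastCGI protocol and not anything Azure-specific; the `Script` class and both `serve_*` functions are hypothetical names, and the “expensive load” is just a counter.

```python
class Script:
    """Stand-in for an application script; counts how often it gets loaded."""
    loads = 0

    def __init__(self):
        Script.loads += 1  # pretend this is the expensive parse/compile step

    def handle(self, request):
        return "response to " + request


def serve_cgi_style(requests):
    """Classic CGI model: a fresh process, and therefore a fresh load, per request."""
    return [Script().handle(r) for r in requests]


def serve_fastcgi_style(requests):
    """FastCGI-like model: one long-lived worker loads the script once and reuses it."""
    worker = Script()
    return [worker.handle(r) for r in requests]


if __name__ == "__main__":
    requests = ["/a", "/b", "/c"]

    Script.loads = 0
    serve_cgi_style(requests)
    print("CGI-style loads:", Script.loads)

    Script.loads = 0
    serve_fastcgi_style(requests)
    print("FastCGI-style loads:", Script.loads)
```

Both functions return the same responses; the only difference is how many times the load step runs, which is where the reduction in overhead comes from.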

Also, in an interesting Q&A on a ZDNet blog post, Microsoft said that there will be no private Azure-based clouds, i.e., enterprises won’t be able to take the Azure software and host it in their own data centers. What’s not clear is whether or not the software written for Azure will be portable into the enterprise environment. Portability of this sort is a feature that Microsoft, with its complete control over the entire stack, is uniquely well-positioned to be able to deliver.

Gartner BCM summit pitches

I’ve just finished writing one of my presentations for Gartner’s Business Continuity Management Summit. My pitch is focused upon looking at colocation as well as the future of cloud infrastructure for disaster recovery purposes. (My other pitch at the conference is on network resiliency.)

When I started out to write this, I’d actually been expecting that some providers who had indicated that they’d have formal cloud DR services coming out shortly would be able to provide me with a briefing on what they were planning to offer. But that, unfortunately, turned out not to be the case in the end. So the pitch has been more focused on do-it-yourself cloud DR.

Lightweight DR services have appeared and disappeared from the market at an interesting rate ever since Inflow (many years and many acquisitions ago) began offering a service focused on smaller mid-market customers that couldn’t typically afford full-service DR solutions. It’s a natural complement to colocation (in fact, a substantial percentage of the people who use colo do it for a secondary site), and now, a natural complement to the cloud.

Research du jour

My newest research notes are all collaborative efforts.

Forecast: Sizing the Cloud; Understanding the Opportunities in Cloud Services. This is Gartner’s official take on cloud segmentation and forecasting through 2013. It was a large-team effort; my contribution was primarily on the compute services portion.

Invest Insight: Content Delivery Network Arbitrage Increases Market Competition. This is a note specifically for Gartner Invest clients, written in conjunction with my colleague Frank Marsala (a former sell-side analyst who heads up our telecom sector for investors). It’s primarily about Conviva and also touches on Cotendo; its key point, though, is to look not at particular companies but at technology-enabled long-term trends.

Cool Vendors in Cloud Computing Management and Professional Services, 2009. This is part of our annual “cool vendors” series highlighting small vendors who we think are doing something notable. It’s a group effort, and we pick the vendors via committee. (And no, there is no way to buy your way into the report.) This year’s picks (never a secret, since vendors usually do press releases) are Appirio, CohesiveFT, Hyperic, RightScale, and Ylastic.

Sun, IBM, and the cloud

The morning’s hot rumor: IBM and Sun are in acquisition talks. The punditry is in full swing in the press. My mailbox here at work is filling rapidly with research-community discussion of the implications, too. (As if Cisco’s Unified Computing Strategy wasn’t creating enough controversy for the week.)

Don’t let that buzz drown out Sun’s cloud announcement, though. An insider has useful detailed comments, along with links to the API itself. It’s Q-Layer inside, a RESTful API on top, and clearly in the early stages of development. I’ll likely post some further commentary once I get some time to read through all the documentation and think it through.

A little SourceForge frustration

SourceForge puzzles me. I think it’s the combination of what is obviously eager effort to improve the site, and the fumbling to get the basics right.

On the plus side, SourceForge recently made a very welcome addition — adding “hosted apps”, including WordPress and MediaWiki — as an option for all projects, for free. And the announcement of support for additional repository types, notably git, is also a nice move.

But SourceForge is plagued by sluggish response across its website, source code repositories, and so on (which is especially stark when compared to the consistent zippiness of Google Code), as well as occasional outages. And the continual redesign of the site, especially in its current bright-orange incarnation, hasn’t seemed like a positive to me. With every redesign, I’ve felt like SourceForge was becoming harder and harder to use. As an example, one redesign ago, the Project Admin menu got so long it was basically unusable on smaller screens (like laptops). To SourceForge’s credit, the next iteration promptly fixed it; unfortunately, the chosen fix was to bury vitally important functionality like the file release system under the “Feature Settings” page (found under Project Admin). That led me on a wild hunt through most of the UI before I finally stumbled upon the functionality I was looking for by accident.

SourceForge offers a tremendous amount of functionality for free, which is what’s allowing it to stay dominant against the proliferating number of alternative services out there. But not only does SourceForge need to innovate, it needs to make sure that it gets the basics right. It has to add functionality while still being fast and simple to use, and over the years, SourceForge seems to have grown tendrils of new features while the main octopod body has grown sessile and mottled with confusion.

Linkage du jour

Tossing a few links out there…

In the weekend’s biggest cloud news, Microsoft’s Azure was down for 22 hours. It’s now back up, with no root cause known.

Geva Perry has posted a useful Zoho Sheet calculator for figuring out whether an Amazon EC2 reserved instance will save you money over an unreserved instance.
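The arithmetic behind that kind of calculator is simple enough to sketch. This is not Geva Perry’s spreadsheet, and the prices below are round illustrative figures rather than Amazon’s price list; the function names are hypothetical. A reserved instance trades an upfront fee for a lower hourly rate, so it pays off only past a certain number of hours of use.

```python
def breakeven_hours(upfront_fee, reserved_hourly, on_demand_hourly):
    """Hours of use after which the reserved instance becomes the cheaper option."""
    if on_demand_hourly <= reserved_hourly:
        raise ValueError("reserved rate must be lower than the on-demand rate")
    return upfront_fee / (on_demand_hourly - reserved_hourly)


def savings(hours, upfront_fee, reserved_hourly, on_demand_hourly):
    """Positive when reserving saves money over the given number of hours."""
    on_demand_total = hours * on_demand_hourly
    reserved_total = upfront_fee + hours * reserved_hourly
    return on_demand_total - reserved_total


if __name__ == "__main__":
    # Illustrative prices in dollars; check the current price list before deciding.
    upfront, reserved_rate, on_demand_rate = 350.0, 0.03, 0.10
    hours_per_year = 24 * 365

    hours = breakeven_hours(upfront, reserved_rate, on_demand_rate)
    print("break-even after about %.0f hours (%.0f%% of a year)"
          % (hours, 100 * hours / hours_per_year))
    print("savings if run all year: $%.2f"
          % savings(hours_per_year, upfront, reserved_rate, on_demand_rate))
```

An instance that runs only occasionally never reaches the break-even point, which is why the answer depends on your expected utilization rather than on the prices alone.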

Craig Balding has posted a down-to-earth dissection of PCI compliance in the cloud, and the practical reality that cloud infrastructure providers tend to deal with PCI compliance by encouraging you to push the actual payment stuff off to third parties.
