CloudPundit: Massive-Scale Computing

the business of Internet infrastructure, cloud computing, and virtual worlds

Posts Tagged ‘security’

The cloud is not magic

Posted by Lydia Leong on June 8, 2010

Just because it’s in the cloud doesn’t make it magic. And it can be very, very dangerous to assume that it is.

I recently talked to an enterprise client who has a group of developers who decided to go out, develop, and run their application on Amazon EC2. Great. It’s working well, it’s inexpensive, and they’re happy. So Central IT is figuring out what to do next.

I asked curiously, “Who is managing the servers?”

The client said, well, Amazon, of course!

Except Amazon doesn’t manage guest operating systems and applications.

It turns out that these developers believed in the magical cloud — an environment where everything was somehow mysteriously being taken care of by Amazon, so they had no need to do the usual maintenance tasks, including worrying about security — and had convinced IT Operations of this, too.

Imagine running Windows. Installed as-is, and never updated since then. Without anti-virus, or any other security measures, other than Amazon’s default firewall (which luckily defaults to largely closed).

Plus, they also assumed that auto-scaling was going to make their app magically scale. It’s not designed to automagically scale horizontally. Somebody is going to be an unhappy camper.

Cautionary tale for IT shops: Make sure you know what the cloud is and isn’t getting you.

Cautionary tale for cloud providers: What you’re actually providing may bear no resemblance to what your customer thinks you’re providing.

Bookmark and Share

Posted in Infrastructure | Tagged: , , | Leave a Comment »

Credit cards and EA/Mythic’s epic billing mistake

Posted by Lydia Leong on April 11, 2010

Most of us have long since overcome our fear of handing over our credit cards to Internet merchants. It’s become routine for most of us to simply do so. We buy stuff, we sign up for subscriptions, it’s just like handing over plastic anytime else. For that matter, most of us have never really thought about all that credit card data laying around in the hands of brick-and-mortar merchants with whom we do business, until the unfortunate times when that data gets mass-compromised.

Bad billing problems plague lots of organizations, but Electronic Arts (in the form of its Mythic Entertainment studio, which does the massively multiplayer online RPGs Dark Ages of Camelot and Warhammer Online) just had a major screw-up: a severe billing system error that, several days ago, repeatedly charged customers their subscription fees. Not just one extra charge, but, some users say, more than sixty. Worse still, the error reportedly affected not just current customers, but past customers. A month’s subscription is $15, but users can pre-pay for as much as a year. And these days, with credit cards so often actually being checking-account debit cards, that is often an immediate hit to the wallet. So you can imagine the impact on even users with decent bank balances, being hit by multiple charges. (Plenty of people with good-sized savings cushions only keep enough money in the checking account to cover expected bills, so you don’t have to be on the actual fiscal edge to get smacked with overdraft fees.) EA is scrambling to get this straightened out, of course, but this is every company’s worst billing nightmare, and it comes at a time when EA and its competitors are all scrambling to shift their business models online.

How many merchants that you don’t do business with any longer, but used to have recurring billing permission on your credit card, still have your credit card on file? As online commerce and micropayments proliferate, how many more merchants will store that data? (Or will PayPal, Apple’s storefronts, and other payment services rule the world?)

Bookmark and Share

Posted in Industry | Tagged: , , , | Leave a Comment »

Link round-up

Posted by Lydia Leong on June 1, 2009

Recent links of interest…

I’ve heard that no less than four memcached start-ups have been recently funded. GigaOM speculates interestingly on whether memcached is good or bad for MySQL. It seems to me that in the age of cloud and hyperscale, we’re willing to sacrifice ACID compliance in many our transactions. RAM is cheap, and simplicity and speed are king. But I’m not sure that the widespread use of memcached in Web 2.0 applications, as a method of scaling a database, reflects the strengths of memcache so much as they reflect the weaknesses of the underlying databases.

Column-oriented databases are picking up some buzz lately. Sybase has a new white paper out on high-performance analytics. MySQL is plugging Infobright, a column-oriented engine for MySQL (replacing MyISAM, InnoDB, etc., just like any other engine).

Brian Krebs, the security blogger for the Washington Post, has an excellent post called The Scrap Value of a Hacked PC. It’s an examination of the ways that hacked PCs can be put to criminal use, and it’s intended to be printed out and handed to end-users who don’t think that security is their personal responsibility.

My colleague Ray Valdes has some thoughts on intuition-based vs. evidence-based design. It’s a riff on the recent New York Times article, Data, Not Design, Is King in the Age of Google, and a departing designer’s blog post that provides a fascinating look at data-driven decision making in an environment where you can immediately test everything.

Posted in Industry | Tagged: , , , | Leave a Comment »

DDoS season

Posted by Lydia Leong on April 13, 2009

We are, it seems, in the midst of a wave of distributed denial of service attacks. The victims include:

  • Neustar’s UltraDNS. (Problems with specific regional DNS clusters, with little customer-visible impact.)
  • Register.com. (Severe impact on Web hosting and email customers.)
  • GoGrid. (Severe impact on cloud hosting customers.)
  • ThePlanet. (Attack on their DNS servers, with severe impact on customers.)

The attack on ThePlanet is unusual in that it received minimal attention in the press, despite the company being one of the largest Web hosters, and having Cisco Guard (DDoS mitigation) appliances in place. Also, the status updates were eventually issued via Twitter, rather than a more expected form of customer communication. Here’s the full text, aggregated off Twitter:

Between 2:30am and 5:00am CDT on April 8, The Planet’s name servers were flooded again with a large brute force (DDoS) attack. Unlike the previous attack, this attack did not appear to be DNS-specific; instead, targeted resources indirectly supporting DNS services. Because the nature of this attack was different from the previous event, mirroring the response to the previous attack was ineffective. Once our investigation determined the nature of the attack, we applied filters throughout our DNS support system to alleviate the effects. The Planet’s network and DNS performance have been restored, and the attack originator has ceased actions. Any lingering issues may be indicative of a different problem that may have been exacerbated by the attack and should be resolved quickly. We are working on several projects to help mitigate similar attacks in the future. Once those plans are in order, we will update the DNS Status announcement thread in our community forums. We understand that other providers are experiencing similar events. We will reach out to them, pool our information and then work together to find consistencies between attacks. Our goal is to establish best practices as an industry to better respond to these recent events.

Jose Nazario of Arbor Networks claims these attacks are not Conficker at work, which makes this wave of attacks even more interesting.

The takeaway from this: Customers understand if you get DDoS’d. They don’t put up with a lack of communication. It’s enormously difficult to communicate with customers in the midst of a crisis, especially one that takes down customer-facing infrastructure in a customer-impacting way, but it’s also incredibly critical. Clearly, not everyone in the company is out trying to troubleshoot the problem, so you can usefully put them to work reaching out to your customers, if you have the policies and procedures in place to do so successfully.

Something to think about today, no matter who you are and who you work for: What policies do you have in place for customer communications when a crisis hits your company? (Book recommendation: Eric Dezenhall’s Damage Control, which is a hard-edged, realistic look at communication in a crisis, including coping with competitors who are deliberately fanning the negative-PR flames.)

Bookmark and Share

Posted in Infrastructure | Tagged: | 1 Comment »

Linkage du jour

Posted by Lydia Leong on March 16, 2009

Tossing a few links out there…

In the weekend’s biggest cloud news, Microsoft’s Azure was down for 22 hours. It’s now back up, with no root cause known.

Geva Perry has posted a useful Zoho Sheet calculator for figuring out whether an Amazon EC2 reserved instance will save you money over an unreserved instance.

Craig Balding has posted a down-to-earth dissection of PCI compliance in the cloud, and the practical reality that cloud infrastructure providers tend to deal with PCI compliance by encouraging you to push the actual payment stuff off to third parties.

Bookmark and Share

Posted in Infrastructure | Tagged: , , | Leave a Comment »

Anti-virus integration with cloud storage

Posted by Lydia Leong on December 12, 2008

Anti-virus vendor Authentium is now offering its AV-scanning SDK to cloud providers.

Authentium, unlike most other AV vendors, has traditionally been focused at the gateway; they offer an SDK designed to be embedded in applications and appliances. (Notably, Authentium is the scanning engine used by Google’s Postini service.) So courting cloud providers is logical for them.

Anti-virus integration makes particular sense for cloud storage providers. Users of cloud storage upload millions of files a day. Many businesses that use cloud storage do so for user-generated content. AV-scanning a file as part of an upload could be just another API call — one that could be charged for on a per-operation basis, just like GET, PUT, and other cloud storage operations. That would turn AV scanning into a cloud Web service, making it trivially easy for developers to integrate AV scanning into their applications. It’d be a genuine value-add for using cloud storage — a reason to do so beyond “it’s cheap”.

More broadly, security vendors have become interested in offering scanning as a service, although most have desktop installed bases to defend, and thus are looking at it as a supplement as opposed to a replacement for traditional desktop AV products; see the past news on McAfee’s Project Artemis or Trend Micro’s Smart Protection Network for examples.

Bookmark and Share

Posted in Infrastructure | Tagged: , , | Leave a Comment »

Google Native Client

Posted by Lydia Leong on December 10, 2008

Google announced something very interesting yesterday: their Native Client project.

The short form of what this does: You can develop part or all of your application client in a language that compiles down to native code (for instance, C or C++, compiled to x86 assembly), then let the user run it in their browser, in a semi-sandboxed environment that theoretically prevents malicious code from being executed.

Why would you want to do this? Because developing complex applications in JavaScript is a pain, and all of the other options (Java in a browser, Adobe Flex, Microsoft Silverlight) provide only a subset of functionality, and are slower than native applications. That’s one of the reasons that most applications are still done for the desktop.

It’s an ambitious project, not to mention one that is probably making every black-hat hacker on the planet drool right now. The security challenges inherent in this are enormous.

Adobe has previously had a similar thought, in the form of Alchemy, a labs project for a C/C++ compiler that generates code for AVM2 (the virtual machine inside the Flash player). But Google takes the idea all the way down to true native code.

The broader trend has been towards managed code environments and just-in-time compilers (JITs). But the idea of native code with managed-code-like protections is certainly extremely interesting, and the techniques developed will likely be interesting in the broader context of malware prevention in non-browser applications, too.

And while we’re talking about lower-level application infrastructure pies that Google has its fingers in, it’s worth noting that Google has also exhibited significant interest in LLVM (which stands for Low-Level Virtual Machine). LLVM is an open-source project now sponsored by Apple, who hired its developer and is now using it within MacOS X. In layman’s terms, LLVM makes it easier for developers to write new programming languages, and makes it possible to develop composite applications using multiple programming languages. A compiler or interpreter developer can generate LLVM instructions rather than compiling to native code, then let LLVM take care of dealing with the back-end, the final stage of getting it to run natively. But LLVM also makes it easier to do analysis of code, something that is going to be critical if Google’s efforts with Native Client are to succeed. I am somewhat curious if Google’s interests intersect here, or if they’re entirely unrelated (not all that uncommon in Google’s chaotic universe).

Bookmark and Share

Posted in Applications | Tagged: , , | 3 Comments »

Cloud risks and organizational culture

Posted by Lydia Leong on October 29, 2008

I’ve been working on a note about Amazon EC2, and pondering how different the Web operations culture of Silicon Valley is from that of the typical enterprise IT organization.

Silicon Valley’s prevailing Ops culture is about speed. There’s a desperate sense of urgency that seems to prevail there, a relentless expectation that you can be the Next Big Thing, if only you can get there fast enough. Or, alternatively, you are the Current Big Thing, and it is all you can do to keep up with your growth, or at least not have the Out Of Resources truck run right over you.

Enterprise IT culture tends to be about risk mitigation. It is about taking your time, being thorough, and making the right decisions and ensuring that nothing bad happens as the result of them.

To techgeeks at start-ups in the Valley (and I mean absolutely no disparagement by this, as I was one, and perhaps still would be, if I hadn’t become an analyst), the promise and usefulness of cloud computing is obvious. The question is not if; it is when — when can I buy a cloud that has the particular features I need to make my life easier? But: Simplify my architecture? Solve my scaling problems and improve my availability? Give me infrastructure the instant I need it, and charge me only when I get it? I want it right now. I wanted it yesterday, I wanted it last year. Got a couple of problems? Hey, everyone makes mistakes; just don’t make them twice. If I’d done it myself, I’d have made mistakes too; anyone would have. We all know this is hard. No SLA? Just fix it as quickly as you can, and let me know what went wrong. It’s not like I’m expecting you to go to Tahiti while my infrastructure burns; I know you’ll try your best. Sure, it’s risky, but heck, my whole business is a risk! No guts, no glory!

Your typical enterprise IT guy is struck aghast by that attitude. He does not have the problem of waking up one morning and discovering that his sleepy little Facebook app has suddenly gotten the attention of teenyboppers world-wide and now he needs a few hundred or a few thousand servers right this minute, while he prays that his application actually scales in a somewhat linear fashion. He’s not dealing with technology he’s built himself that might or might not work. He isn’t pushing the limits and having to call the vendor to report an obscure bug in the operating system. He isn’t being asked to justify his spending to the board of directors. He lives in a world of known things — budgets worked out a year in advance, relatively predictable customer growth, structured application development cycles stretched out over months, technology solutions that are thoroughly supported by vendors. And so he wants to try to avoid introducing unknowns and risks into his environment.

Despite eight years at Gartner, advising clients that are mostly fairly conservative in their technology decisions, I still find myself wanting to think in early-adopter mode. In trying to write for our clients, I’m finding it hard to shift from that mode. It’s not that I’m not skeptical about the cloud vendors (and I’m trying to be hands-on with as many platforms as I can, so I can get some first-hand understanding and a reality check). It’s that I am by nature rooted in that world that doesn’t care as much about risk. I am interested in reasonable risk versus the safest course of action.

Realistically, enterprises are going to adopt cloud infrastructure in a very different way and at a very different pace than fast-moving technology start-ups. At the moment, few enterprises are compelled towards that transformation in the way that the Web 2.0 start-ups are — their existing solutions are good enough, so what’s going to make them move? All the strengths of cloud infrastructure — massive scalability, cost-efficient variable capacity, Internet-readiness — are things that most enterprises don’t care about that much.

That’s the decision framework I’m trying to work out next.

I am actively interested in cloud infrastructure adoption stories, especially from “traditional” enterprises who have made the leap, even in an experimental way. If you’ve got an experience to share, using EC2, Joyent, Mosso, EngineYard, Terremark’s Infinistructure, etc., I’d love to hear it, either in a comment on my blog or via email at lydia dot leong at gartner dot com.

Bookmark and Share

Posted in Infrastructure | Tagged: , , , | 1 Comment »