Amazon’s Cloud Nightmare
April 22, 2011
The snafu at Amazon’s EC2 hosting service on Thursday, which knocked several big Web sites out of service, is being called a “black eye” for the cloud-computing business — a “we told you so” moment, according to cloud critics. But it could simply be a black eye for EC2.
It seems unlikely that this incident will cause startups to turn away from cloud computing, which for smaller companies is much cheaper than self-hosting. More likely, some of them will think twice about hosting with EC2, one of the industry leaders. That’s because this was a particularly nasty, widespread, and long-lasting outage. A whole bunch of sites were thrown totally or partially out of commission for most of the day Thursday, including Quora, Foursquare, Reddit, and HootSuite.
Technical glitches are bound to happen, of course, whether they’re in the cloud or in an expensive, staff-managed corporate server room. Sites go down all the time. But most often, the outages are brief and isolated. Apparently, this one jumped across various parts of Amazon’s cloud like lightning – in a way that Amazon had vowed it never could thanks to its “Availability Zones.”
“Availability Zones are distinct locations that are engineered to be insulated from failures in other Availability Zones,” Amazon (AMZN) says. The zones “protect your applications from failure of a single location.”
“It would seem they don’t exactly work as they’re designed,” notes The Register’s Cade Metz, who goes on to explain how Amazon’s cloud is (or is supposed to be) set up to avoid such failures. The company isn’t talking yet, but promises to publish a “postmortem” on the situation.
Some customers and observers said on Thursday that Amazon has a history of cloud trouble, though never anything this bad. Michael Hussey, chief executive of PeekYou, told Dow Jones that he’s been “seeing problems all year long” from Amazon’s Northern Virginia facility where the failure occurred. The problems haven’t been severe, he said, but they have caused his company to consider moving its data to a different cloud provider.
PeekYou didn’t go down because it also runs many of its own servers. Foursquare was knocked out entirely, but said in a blog post Thursday morning that Amazon is a “usually amazing” provider that was suffering “a few hiccups.”
The notion that this outage says anything about cloud computing’s utility as a whole seems far-fetched, though some critics are making that case. “We don’t think the cloud is enterprise-ready,” Jimmy Tam, general manager of Peer Software, a data-backup provider, told The New York Times. “Are you really going to trust your corporate jewels to these cloud providers?” he asked.
Well, sure. Why not? Lots of companies have been doing it for years now (indeed, as Oracle (ORCL) chief Larry Ellison has noted, since way before “cloud computing” became a popular term), with little trouble. And even this glitch, though nasty, didn’t put anybody’s “corporate jewels” at risk. Many companies, like PeekYou, use the cloud for routine data handling, and keep their more “mission critical” tasks in-house. It all comes down to quality and service: there’s nothing inherent in cloud computing that makes it less reliable than the alternative.