Companies Have their Head in the Cloud with Big Data – but is it Safe?
July 10, 2012
By David A. Kelly, Upside Research
There is no doubt that Big Data and Cloud Computing are, hand in hand, taking the enterprise computing landscape by storm. Big Data, the ability to effectively capture, manage, and utilize large swaths of differently structured data, has companies with their heads in the clouds, particularly since the first enterprise-grade version of a Big Data software platform became available in January.
Hadoop Version 1.0 is a scalable, distributed computing software framework that was developed under the open-source Apache Software Foundation. A six-year project, Version 1.0 was considered the first-available enterprise-class version of the technology.
It hasn’t taken long for Hadoop to capture the hearts and imaginations of Big Data gurus at Fortune 100 companies. Hadoop is quickly becoming the de facto data platform for enterprises to store, process and query voluminous quantities of data of different structures, a marked departure from the traditional data warehouse and data analytics efforts that most enterprises have employed since the 90s.
The computing industry has stepped up with support for Hadoop. A long list of leading technology companies, including Microsoft, Google, IBM, Amazon, Facebook and Yahoo, have been using the platform and helped develop and test the software before it was officially released for general use earlier this year.
So, the question remains, how does Hadoop fit into the Cloud, another hot emerging technology for enterprises, especially in light of recent, continued Cloud stumbling blocks?
Vendors are moving quickly to incorporate Hadoop distributions in their cloud offerings, and recent developments suggest this is a space that is only beginning to fill out. Hortonworks, a Yahoo spin-off, and Cloudera have quickly put themselves at the front of the pack with their offerings: Hortonworks Data Platform (HDP) and Cloudera's Distribution Including Apache Hadoop (CDH), respectively. But Amazon, which also has its own Hadoop distribution, is making strides as well, and last month it announced that MapR’s distribution is available to customers on temporary clusters. A few weeks later, MapR announced a private beta of its Hadoop distribution running on the Google Compute Engine cloud, bringing Google front and center in the Big Data/Cloud effort.
Big Data and the Cloud could well be a match made in heaven. The cloud provides flexibility in resource allocation that melds well with the distributed processing of large data sets across clusters of commodity servers. With eighty percent of the world’s data being unstructured, and the cloud providing a way for enterprises to more easily afford the infrastructure to handle the volumes of data that exist, it makes sense for the Cloud to be the stage on which Big Data will have its biggest performance yet.
There is one caveat, however, to this blockbuster event: the performance issues of the cloud. Recent service outages have sparked outrage among enterprise customers and raised questions about how capable cloud platform vendors are of delivering the service levels that enterprises demand. It remains to be seen whether these disruptions will have any impact on enterprise adoption of Hadoop in the cloud in the near future.