The Swinging of the Big Data Pendulum

IT Briefcase Exclusive Interview: The Swinging of the Big Data Pendulum with Jack Norris, MapR Technologies

September 4, 2013

In the interview below, Jack Norris from MapR Technologies discusses a paradigm shift in the way that data is viewed in the enterprise today, and outlines ways in which organizations can best keep up with this Big Data evolution.

Q. How do you see Big Data and open source technology transforming the way people view enterprise computing today?

A. Big Data and the open source technologies that deliver Big Data storage and processing capabilities mark a paradigm shift in how data is viewed in the enterprise. Data is no longer a by-product of transactions but a powerful force that can transform the business itself. There is a growing realization across industries that competitive advantage, and possibly survival, will depend heavily on how well an enterprise harnesses and benefits from Big Data. More enterprises have begun to harness the power of data to create new revenue-generating products and services and to implement changes that drive operational efficiencies.

Q. Do you feel the pendulum is swinging back from data and compute as separate entities to data and compute being unified again?

A. With rapidly growing data volumes, existing technologies are having trouble keeping up. It’s a combination of not only hardware and software scalability issues but also the associated costs of expansion. The pendulum is swinging from separate storage and compute farms to a new approach that combines data and compute in a distributed framework where processing is parallelized.

Q. How do you feel the role of IT is changing within this data management transformation?

A. IT is becoming more and more focused on the data. In datacenters today, the emphasis has been on the physical plant, the servers, the racks, and the network. In the future the focus will be increasingly on the data in the datacenter. As more and more CMOs, COOs, and emerging roles such as Chief Risk Officers and Chief Data Officers drive Big Data business goals, IT departments will need to serve as the data stewards of the organization. The onus will be not only to deliver the right applications on the right platforms, but also to deliver the right data.

Q. How integral has Hadoop been to this evolution?

A. Hadoop represents the biggest paradigm shift to impact enterprise computing that we’ve seen in decades. Hadoop is a distributed processing framework that enables organizations to leverage a greater volume, variety, and velocity of data. It is a platform that allows enterprises to process data not only more effectively, but also more economically than ever before.

Q. Can you give us a few examples of how MapR solutions work to increase speed, agility, analytic capabilities, and locational independence for enterprises today?

A. MapR’s Distribution for Hadoop has helped companies across industries benefit from Big Data.

– ComScore processes over 50 billion internet and mobile events per day using Hadoop to understand and forecast web behavior.

– A major financial services firm used Hadoop to create a new service within one quarter.

– Ancestry.com is able to process petabytes’ worth of data with new algorithms developed on Hadoop. Mission-critical services are supported on MapR’s 99.999% high availability platform.

Q. Can you please outline a few use cases exhibiting how these solutions have been successfully implemented?

A. A leading credit card company uses MapR’s Distribution for Hadoop to provide new products and services to consumers in real time. Advanced machine learning and statistical techniques are employed over data stored in a highly available Hadoop cluster. MapR delivers real-time data ingestion, high performance, and self-healing high availability to support mission-critical applications.

– A Fortune 500 firm uses MapR to offload ETL processing as well as data from their enterprise data warehouse. MapR provides a 50X cost advantage for storing and analyzing structured as well as unstructured data.

– Blizzard runs three triple-A games simultaneously, sharing resources among the games but not the data. MapR’s NFS file system and volume features enable them to keep the data separate and to update data much more easily than with alternative approaches.

Q. How will MapR’s recent release of M7 NoSQL Edition help users of NoSQL and Apache Hadoop™ applications increase overall dependability and performance?

A. Combining NoSQL functionality with Hadoop is a growing requirement for almost half of all Hadoop users today. HBase is a distributed NoSQL solution that is integrated into all Hadoop distributions.

HBase, however, has not reached its true adoption potential because of a complex architecture that results in availability challenges, recovery issues, and inconsistent performance.

MapR M7 has eliminated all of this complexity by delivering innovation that provides ease of use, dependability and performance advantages for NoSQL and Apache Hadoop applications. MapR M7 has removed the trade-offs organizations face when looking to deploy a NoSQL solution. M7 provides scale, strong consistency, reliability and continuous high performance.

About the Author

Jack Norris is the chief marketing officer of MapR Technologies and leads the company’s worldwide marketing efforts. Jack has over 20 years of enterprise software marketing and product management experience in defining and delivering analytics, storage, and information delivery products. Jack has also held senior executive roles with EMC, Rainfinity, Brio Technology, SQRIBE, and Bain and Company. Jack earned an MBA from UCLA Anderson and a BA in economics with honors and distinction from Stanford University.