Defining Big Data: The Missing “V”
August 2, 2012
By: Bill Franks, Chief Analytics Officer, Teradata
While the term big data was barely on the radar just two or three years ago, today it is one of the hottest topics among those with an interest in data and analytics. One common discussion thread centers on the best way to define what big data is. While that seems like an easy question, there is actually much debate about the answer.
People rarely define big data as a specific amount of data. Rather, the definitions provided indicate that what counts as big data is relative to the tools and technologies available to handle it. Characteristics of the data beyond size are also considered, typically some variation on 1) Volume, 2) Variety, 3) Velocity, and 4) Variability or Complexity. This makes the definition somewhat “squishy,” and those desiring a hard line in the sand can be disappointed.
Note that all of the preceding terms are focused upon inherent characteristics of the data from a technical perspective and do not address the business perspective at all.
The Missing V
Perhaps the most important “V” for defining big data is Value. While often overlooked, the only reason to worry about all the other characteristics is that you believe there is value in the data and that it is worth the effort to collect and analyze it. The only example I recall seeing where the idea of value was given equal billing is the paper IDC’s Worldwide Big Data Taxonomy, 2011.
My view is that there really is no point worrying about what defines big data. If you identify a data source that is not currently available to your organization’s analytic processes and you think it has high value, then collect it and make it available. This is true whether the data is big, small, or in between. It is true that a complex, unstructured source of data may require different tools and more effort to analyze than traditional data. However, that is an implementation consideration and needn’t be a concern until it is determined that the data has value. The first, most strategic question to consider with a data source is what value it can have for your business.
Big Data Is About Being Different More Than Big
Another interesting fact about big data is that many of the challenges in handling it aren’t really tied to the size and scale of the data at all. Rather, the challenges are caused by the fact that the data is inherently of a different type than many traditional data sources. Additionally, many of the tools and techniques required to handle massive amounts of a new type of data are still required to handle small amounts as well. Let’s consider an example.
Social media analysis is popular today. The collection of every Facebook posting or Tweet is widely considered to be a big data problem. However, perhaps the biggest challenge lies in the text analysis required to figure out the essence of what each posting or Tweet says. Most of the human effort needed to perform text analytics on billions of pieces of text is still needed to handle just a few pieces. If I want the sentiment score for ten tweets about an event today, the same tools and processes are required to parse that text as if I had ten billion tweets. The only difference is the scale at which the analytics need to be applied.
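To make the point concrete, here is a minimal Python sketch of a sentiment scorer. It is not a real text analytics tool: the word lists and the function names `sentiment_score` and `score_all` are hypothetical, chosen only for illustration. What it tries to show is that the parsing and scoring logic has to exist whether two tweets or ten billion are fed in.

```python
import re
from typing import Iterable, Iterator

# Hypothetical toy lexicon, for illustration only.
# Real sentiment tools rely on far richer language models.
POSITIVE = {"great", "love", "amazing", "good"}
NEGATIVE = {"bad", "awful", "hate", "boring"}


def sentiment_score(text: str) -> int:
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def score_all(texts: Iterable[str]) -> Iterator[int]:
    """Lazily score each text; the input can be a short list or a huge stream."""
    return (sentiment_score(text) for text in texts)


tweets = [
    "I love this event, amazing show!",
    "Awful sound, boring set.",
]
print(list(score_all(tweets)))  # prints [2, -2]
```

Nothing in `sentiment_score` changes as the volume grows. Only the way `score_all` is fed and executed (one machine versus many) would need to change, which is the scale question rather than the analytics question.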
Focus On The Value Big Data Analytics Can Drive
Organizations that prove to be successful with big data will approach it by first considering value and focusing on the analytics they can apply against the data. To be sure, there will be significant technical challenges for IT departments as organizations proceed in their big data journeys. Those challenges are much easier to tackle, and will receive funding much more easily, when the focus from the start is on demonstrating the value that can be achieved.
Instead of fearing big data, organizations should embrace the amazing analytics that can be created from the data and the value that can be obtained through those analytics. No pain, no gain certainly applies as always. However, the gains with big data have the potential to be much bigger than most. Define it however you like, but just be sure to make use of it.
Bill Franks is Chief Analytics Officer for Teradata’s global alliance programs, providing insight on trends in the Advanced Analytics space and helping clients understand how Teradata and its analytic partners can support their efforts. Bill also oversees the Business Analytic Innovation Center, which is jointly sponsored by Teradata and SAS and focuses on helping clients pursue innovative analytics. In addition, Bill works to help determine the right strategies and positioning for Teradata in the advanced analytics space.
Bill is the author of the book Taming The Big Data Tidal Wave (John Wiley & Sons, Inc., April, 2012). In the book, he applies his two decades of experience working with clients on large-scale analytics initiatives to outline what it takes to succeed in today’s world of big data and analytics. Bill is also a faculty member of the International Institute for Analytics, which was founded by leading analytics expert Tom Davenport. Bill is an active speaker and blogger. His blog can be found at the following address: http://iianalytics.com/category/faculty-blogs/bill-franks/.
Bill’s focus has always been to help translate complex analytics into terms that business users can understand and to then help an organization implement the results effectively within their processes. His work has spanned clients in a variety of industries for companies ranging in size from Fortune 100 companies to small non-profit organizations.
Bill earned a Bachelor’s degree in Applied Statistics from Virginia Tech and a Master’s degree in Applied Statistics from North Carolina State University. You can learn more about Bill at http://www.bill-franks.com.