4 Steps to a Big Data Strategy
May 1, 2013
Featured Article by Joe Caserta, President and CEO of Caserta Concepts
Is Big Data a business initiative or an IT initiative? For decades, businesses have been trying to make sense of disorganized transaction data sprinkled throughout the enterprise, relying heavily on human resources to analyze data via data warehouses. Steps traditionally include collecting, cleaning, conforming, consolidating and organizing the data for business analysts to perform ad-hoc queries to answer key questions such as “How are my sales, by region?” or “How is my inventory, by product?” Of course, to query data, it must be structured in a data warehouse and browsed via Business Intelligence tools, correct? Well, not anymore. Big Data has changed the rules. Before embarking on a new big data project, a policy for handling this data must be in place. Here are four steps to a big data strategy:
A big data project requires planning and sophisticated orchestration. Why? Because it is disruptive: it introduces new hardware, software, resources and data sets. It will involve technical toolsets never before used by the business or IT, and bring together data sets never before integrated. New policies, procedures, training and project plans need to be carefully provisioned.
Big data solutions include data warehouse data, raw transaction data and unstructured log data. A financial big data solution may have trade data, market data, position data, news feeds, customer reference data, weblogs and system logs. Repeatable processes must be established for the consumption of each data source. Techniques inherited from traditional data warehousing such as change-data-capture, micro-batch processing and real-time data streaming may still apply.
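To make two of the techniques named above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any particular product: change-data-capture is approximated by comparing two keyed snapshots of a source, and micro-batching by grouping a record stream into fixed-size lists. The function names `capture_changes` and `micro_batches`, and the trade records, are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, Any]
Change = Tuple[str, str, Row]  # (operation, key, row)


def capture_changes(previous: Dict[str, Row], current: Dict[str, Row]) -> List[Change]:
    """Snapshot-diff change capture: report how `current` differs from `previous`.

    Assumption: each snapshot is a dict keyed by a stable business key.
    Production systems often read a database log instead of diffing snapshots.
    """
    changes: List[Change] = []
    for key, row in current.items():
        if key not in previous:
            changes.append(("insert", key, row))
        elif previous[key] != row:
            changes.append(("update", key, row))
    for key, row in previous.items():
        if key not in current:
            changes.append(("delete", key, row))
    return changes


def micro_batches(records: Iterable[Any], batch_size: int) -> Iterator[List[Any]]:
    """Group a stream of records into lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[Any] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


if __name__ == "__main__":
    # Hypothetical trade snapshots taken at two points in time.
    yesterday = {"T1": {"qty": 100}, "T2": {"qty": 50}}
    today = {"T1": {"qty": 120}, "T3": {"qty": 10}}
    for batch in micro_batches(capture_changes(yesterday, today), batch_size=2):
        print(batch)
```

Feeding the captured changes through `micro_batches` shows how the two ideas compose: only the rows that changed are moved, and they are moved in small groups rather than one at a time or in one nightly load.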
The paradigm shift to big data introduces a new role in the corporate organization: the data scientist. This role requires deep understanding of advanced mathematics, system engineering, data engineering and domain (business) expertise. In practice, it’s common to utilize a data science team, where statisticians, technologists and business subject matter experts collectively solve problems and provide solutions.
Every big data strategy must include continuous monitoring and maintenance of the technical solution. As data volume and analytic requirements increase, the configuration of the solution must evolve and grow. The distributed system will need to have nodes added, data redistributed and balanced, and replication adjusted, with the configuration for all of the above continuously fine-tuned for optimal overall performance.
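As an illustration of why adding nodes forces data to be redistributed, the sketch below uses consistent hashing, one common way to decide which node owns a key. This is an assumption chosen for illustration; the article does not name a placement scheme, and real platforms each have their own. The `HashRing` class, the node names and the `trade-N` keys are all hypothetical.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


def _hash(value: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """A minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes: Iterable[str], vnodes: int = 64) -> None:
        self._vnodes = vnodes
        self._ring: List[Tuple[int, str]] = []  # sorted (position, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each physical node is placed at several positions to even out load.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def node_for(self, key: str) -> str:
        # The owner is the first ring position at or after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        if idx == len(self._ring):
            idx = 0
        return self._ring[idx][1]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    keys = [f"trade-{i}" for i in range(1000)]
    before = {key: ring.node_for(key) for key in keys}

    ring.add_node("node-d")  # the cluster grows
    after = {key: ring.node_for(key) for key in keys}

    moved = [key for key in keys if before[key] != after[key]]
    print(f"{len(moved)} of {len(keys)} keys moved to the new node")
```

With this scheme, the only keys that change owner are those claimed by the new node, so the rebalancing work grows with the share of data the new node takes on rather than with the size of the whole data set.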
Before a big data project is launched, a strategic readiness test should be performed to assess the adoption of the new paradigm. Business analysts will need to be retrained or repurposed. The goal of shifting to a big data platform may include changing from reactive analysis (did that campaign work?) to proactive analysis (what should our next campaign offer?). Now we can proactively influence non-buyers to follow the behavior patterns of loyal customers, or re-stimulate active customers when their behavior pattern begins to resemble that of a lost customer.
With a complete big data ecosystem in place, including recommendations created by data scientists, it’s possible to close the loop: feed the results of the analysis into the engines that create the customer experience, namely your website, marketing department, sales force, product development and customer service. Moreover, the big data machine can now consume the recommendations produced by its own analytics, correlate them with new customer behavior patterns and quantify their effectiveness.
As with any new initiative, there’s risk when implementing a new big data project. A tool, a language or a platform alone does not make a solution. To learn more about big data strategy steps for success, be sure to read our new article on the topic, “Big Data Strategy: A pragmatic approach.”
Joe Caserta is the President and CEO of Caserta Concepts, a consulting and technology services firm that specializes in data warehousing, business intelligence and big data analytics. As a veteran solution provider and co-author of the industry best seller, “The Data Warehouse ETL Toolkit,” Joe is an industry thought leader whose methods have helped Fortune 1000 companies manage, clean and access their data for actionable business results. Many of his solutions have been published in industry publications, as he continuously sets new standards for building cost effective, sustainable data warehouse solutions.