Of Dark Data, Beware You Must

April 4, 2013

Big data there is. To master it you must learn, but of dark data, beware you must.

A Data Padawan, on his quest to become a Data Jedi, many dangers he will encounter.  As big data slips from the peak of inflated expectations and into the trough of disillusionment at intergalactic speed, temptations to stray beyond the limits of the Trade Federation abound.  Dark data that beyond these limits resides, if properly mastered, incredible opportunities for Data Jedis will create, for the Force to unleash and for their organization’s bottom line to levitate.

Dark data is usually defined as data that is kept “just in case” but hasn’t (so far) found a proper usage, or data that can be harvested and leveraged beyond its primary (intended) usage.

Examples abound but could include:

- Measurements collected by the hundreds of sensors built into a car (or the Millennium Falcon). These measurements are handy for the mechanic (or for Chewbacca) when the car/spacecraft is in the shop. But the manufacturer can also use them to diagnose patterns of failures, optimize performance, or even perform preventive maintenance.

- Access logs from facility doors (or from the shield of the Death Star). Beyond their primary use (to prevent unauthorized access by Rebel vessels), such logs make it possible to analyze visitor flow, optimize elevator traffic, better regulate HVAC, protect from total destruction, etc.

- Unstructured data, such as audio, video, 3D holograms, Death Star blueprints, etc. – stored on servers, in the Cloud or in R2 droids, that can be mined for information beyond the intended message they mean to convey.
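To make the access-log example a little more concrete, here is a minimal sketch of how a Data Padawan might turn door logs into a picture of visitor flow. The log format (timestamp, door, badge), the sample entries and the function names are all assumptions made up for illustration; real access systems will differ.

```python
from collections import Counter
from datetime import datetime

# Hypothetical door-access log lines: "<ISO timestamp>,<door id>,<badge id>".
# The format and the values are invented for this sketch.
SAMPLE_LOG = [
    "2013-04-04T08:05:12,lobby,1001",
    "2013-04-04T08:47:30,lobby,1002",
    "2013-04-04T09:15:02,dock,1003",
    "2013-04-04T12:01:44,lobby,1001",
    "2013-04-04T12:30:09,lobby,1004",
    "2013-04-04T12:59:59,dock,1002",
]


def entries_per_hour(lines):
    """Count access events per hour of day, regardless of door or badge."""
    counts = Counter()
    for line in lines:
        timestamp, _door, _badge = line.strip().split(",")
        counts[datetime.fromisoformat(timestamp).hour] += 1
    return counts


def busiest_hour(lines):
    """Return the (hour, count) pair with the most access events."""
    return entries_per_hour(lines).most_common(1)[0]


if __name__ == "__main__":
    for hour, count in sorted(entries_per_hour(SAMPLE_LOG).items()):
        print(f"{hour:02d}:00  {count} entries")
```

The same hourly counts that were only ever meant to keep intruders out could then feed elevator scheduling or HVAC planning.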

The first challenge faced by the Data Padawan is to identify which data is available, and where. By definition, dark data is data that was not meant to be used in that particular way. It’s usually not stored in databases or systems managed by IT, and rarely inventoried in the enterprise’s metadata catalog (when such a catalog exists). Rather, logs are often kept as files stored on disk/in memory inside the system itself, or in an embedded database.  Another obstacle is dark data collection. Connectivity to the systems can be difficult, because of protocols, security/permissions, firewalls, or even simply lack of APIs.

The next step in the Data Padawan’s apprenticeship is to process this dark data, and to produce value – the kind of value that develops the Force of the organization. Thankfully, many tools and technologies are available. Hadoop and NoSQL databases, data integration and data quality tools generating native MapReduce code, optimized SQL query systems for Hadoop such as Hive/Stinger or Impala, all make the life of the Data Padawan easier. Because frankly, while a light saber may come in handy for slicing and dicing data, it is a bit crude for detailed analysis…

There remains one major obstacle on this quest: the dark data island. A dark data system is not, cannot be, an isolated system. Dark data must be used in conjunction with the rest of the information system. Dark data applications must be connected and must exchange data with other databases, applications, analytical platforms, etc. Only then will dark data embrace the Force, and forgo its Dark Side. To become simply data.

And only then, a Data Jedi the Padawan will become.

May the Force of data be with you.

Master Yves de Montcheuil is a Data Jedi and the Vice President of Marketing at Talend, the recognized leader in open source integration. Yves holds a master’s degree in electrical engineering and computer science and has 20 years of experience in software product management, product marketing and corporate marketing. He is also a presenter, author, blogger, social media enthusiast and Star Wars fan, and can be followed on Twitter: @ydemontcheuil.
