Why Big Data is now such a big deal
March 30, 2012
Computers are spewing forth data at astronomical rates about everything from astrophysics to internet shopping. And it could be hugely valuable.
One of the most famous quotes in the history of the computing industry is the assertion that “640KB ought to be enough for anybody”, allegedly made by Bill Gates at a computer trade show in 1981 just after the launch of the IBM PC. The context was that the original PC, built around the Intel 8088 processor, could only use 640 kilobytes of Random Access Memory (RAM), and people were questioning whether that limit wasn’t a mite restrictive.
Gates has always denied making the statement and I believe him; he’s much too smart to make a mistake like that. He would have known that just as you can never be too rich or too thin, you can also never have too much RAM. The computer on which I’m writing this has four gigabytes (GB) of it, which is roughly 6,500 times the 640KB ceiling of the original PC, yet even so it sometimes struggles with the software it has to run.
But even Gates could not have foreseen the amount of data computers would be called upon to handle within three decades. We’ve had to coin a whole new set of multiples to describe the explosion – from megabytes to gigabytes to terabytes to petabytes, exabytes, zettabytes and yottabytes (a yottabyte is roughly two to the power of 80 bytes, or about 1 followed by 24 noughts).
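For readers who want to check the arithmetic behind these multiples, here is a minimal sketch in Python. The function names are made up for illustration, and it assumes the usual two conventions: decimal multiples (steps of 1,000) and binary multiples (steps of 1,024).

```python
# Byte multiples, smallest to largest. Each step up the list is a factor
# of 1,000 (decimal convention) or 1,024 (binary convention).
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def decimal_bytes(prefix):
    """Bytes in one <prefix>byte, counting in powers of 1,000."""
    return 10 ** (3 * (PREFIXES.index(prefix) + 1))


def binary_bytes(prefix):
    """Bytes in one <prefix>byte, counting in powers of 1,024."""
    return 2 ** (10 * (PREFIXES.index(prefix) + 1))


# Four gigabytes of RAM against the 640-kilobyte ceiling of the original PC.
ram_ratio = (4 * binary_bytes("giga")) / (640 * binary_bytes("kilo"))

if __name__ == "__main__":
    for prefix in PREFIXES:
        print(f"{prefix}byte: {decimal_bytes(prefix):,} (decimal), "
              f"{binary_bytes(prefix):,} (binary)")
    print(f"4GB is about {ram_ratio:,.0f} times 640KB")
```

The two conventions drift apart as the units grow: at the yottabyte level the binary figure is about 21 per cent larger than the decimal one, which is why the two descriptions of a yottabyte do not match exactly.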
This escalating numerology has been necessitated by an explosion in the volume of data surging round our digital ecosystem from developments in science, technology, networking, government and business. From science, we have sources such as astronomy, particle physics and genomics. The Sloan Digital Sky Survey, for example, began amassing data in 2000 and collected more in its first few weeks than all the data collected before that in the history of astronomy. It’s now up to 140 terabytes and counting, and when its successor comes online in 2016 it will collect that amount of data every five days. Then there’s the Large Hadron Collider (LHC), which in 2010 alone spewed out 13 petabytes – that’s 13m gigabytes – of data.
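The figures quoted above can be put on a common footing with a few lines of Python. This is only a rough sketch: the helper names are invented for the purpose, and it assumes decimal units (1 petabyte = 1,000 terabytes = 1,000,000 gigabytes).

```python
# Decimal unit conversions (an assumption; binary units would give
# slightly different numbers).
GB_PER_TB = 1_000
TB_PER_PB = 1_000


def petabytes_to_gigabytes(petabytes):
    """Convert petabytes to gigabytes using decimal multiples."""
    return petabytes * TB_PER_PB * GB_PER_TB


def terabytes_per_day(total_terabytes, days):
    """Average daily volume when total_terabytes arrive over `days` days."""
    return total_terabytes / days


# The LHC's 13 petabytes for 2010, expressed in gigabytes.
lhc_gigabytes = petabytes_to_gigabytes(13)

# The sky survey successor: 140 terabytes every five days.
successor_tb_per_day = terabytes_per_day(140, 5)

if __name__ == "__main__":
    print(f"LHC, 2010: {lhc_gigabytes:,} GB")
    print(f"Survey successor: {successor_tb_per_day:g} TB per day")
```

On these assumptions the 13 petabytes do come to 13m gigabytes, and the successor survey would average 28 terabytes a day.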