Here’s a mind-boggling fact: in the time you took to read this sentence, approximately 100,000 GB (100 terabytes) of data was generated worldwide. In the snap of a finger, the Internet expanded by 0.0001%. We are at an annual data production rate of about 15 ZB (zettabytes), and it is estimated that by 2025 we will be at 175 ZB per annum.
Data has been generated since the very beginning of human civilization, of the Earth and, more generally, of the Universe itself. It is all around us. The data beyond life is more abstract, closer to what only a physicist would understand, but all of us can comprehend the importance of data in the formation of life. If I had to compare DNA (deoxyribonucleic acid) to something, I would say in a jiffy: “a data center”. This “data center” contains the information that defines life and gives us our characteristic looks, among much else. Although our DNA is large and complex, all of its data would fit on an average CD.
Now the question arises: if roughly 800 MB of data can define life, why aren’t we doing wonders with all the data in the world? We could, but the problem is that DNA is time-tested and tempered by nature, refined for a single purpose: life. The data we have, by contrast, is a random collection of pictures, text, and so on. One might still ask: among a billion gigabytes of generated data, is there not one megabyte that is exemplary and worth the effort to look at? There are tons of data (possibly world-changing) awaiting analysis. It is estimated that less than 1% of all the data we have is ever analyzed.
The artificial intelligence that we so greatly hail learns from this analyzed data. Some of the best examples of what AI can do with it are accurately predicting cardiac arrhythmia, estimating the chances of developing cancer, and diagnosing disease. Whenever we watch videos on YouTube, we are surrounded by an army of AIs competing to recommend more videos, show us relevant ads, and generate revenue for creators; the same thing happens on social media sites and everywhere else. If we can do this with only a fraction of the 1% of data that is analyzed, imagine what we could accomplish if, say, 10, 20, or even 50% of global data were analyzed.
Which brings us to our answer: we could do all of the above and more, but we are limited because not many people know what power data holds. Every industry in Silicon Valley has changed irrevocably due to this realization. This is the “era of data”: only those who know how to use data survive in the market, and it has just begun. The future belongs to those who are willing to plunge into the dirt and dig up the gold in it.
Arvinder Pal Singh Bali
A Data Science enthusiast