Doug Cutting, co-founder of Hadoop and Chief Architect of Cloudera.
As people, devices and things become increasingly connected, organisations will be flooded with data in all forms and from various sources. Big Data has thus been one of the hottest catchphrases since last year, as organisations are increasingly pressured to collect and connect data efficiently and cost-effectively, and to extract meaningful value from it to drive business decisions.
One piece of software that is nearly synonymous with Big Data is Hadoop. "This open source software framework allows organisations to take many normal computers and make them work together like a single computer so that they can store data across those computers and run processes over that data in parallel using all of the CPUs," said Doug Cutting, co-founder of Hadoop and Chief Architect of Cloudera.
"Since Hadoop runs on simple hardware for web servers, it allows organisations to affordably store large amounts of data, have effective access to that data and analyse the data over a short period of time. This is in contrast to the previous generation of database/enterprise software which needed to run on expensive exotic hardware that may not even be able to store that much data," he added.
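Cutting's description of spreading data across machines and processing it in parallel can be illustrated with a small local sketch. This is an analogy only, not Hadoop's API: the function names (`split_into_blocks`, `count_words`, `parallel_word_count`) are hypothetical, and worker threads on one computer stand in for the separate machines of a cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(lines, num_nodes):
    """Spread the lines round-robin across num_nodes hypothetical machines."""
    return [lines[i::num_nodes] for i in range(num_nodes)]


def count_words(block):
    """Work done independently on one machine's share of the data."""
    counts = Counter()
    for line in block:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines, num_nodes=3):
    """Count words per block in parallel, then merge the partial results."""
    blocks = split_into_blocks(lines, num_nodes)
    # Assumption: threads stand in for cluster nodes in this local sketch.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(count_words, blocks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


if __name__ == "__main__":
    sample = [
        "big data needs big storage",
        "hadoop stores data across many computers",
        "many computers work like a single computer",
    ]
    print(parallel_word_count(sample).most_common(3))
```

The point of the sketch is the shape of the work: each block is processed without reference to the others, so adding machines adds capacity, and only the small partial results have to be combined at the end.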
Time and time again, we've been told that the value of data lies not in the raw data itself but in the insights gained when relevant raw data is analysed. So why should organisations store more data in its raw form as part of their Big Data strategy? According to Cutting, doing so will enable organisations to continually experiment with their data to gain more insights.
"Typically, data will undergo the extract, transform and load (ETL) process before it is deposited into a data warehouse. This means that the final data set stored might not contain some details from the raw data. As such, organisations won't be able to do other queries that require the 'lost' details from the original data. Hadoop counters this by allowing organisations to affordably store the full raw data, providing them flexibility to modify queries [and experiment with their data]. Besides that, Hadoop enables organisations to analyse their raw data in its entirety without needing to undergo the ETL procedure."
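The trade-off Cutting describes can be seen in a toy example. The records, field names and functions below are invented for illustration and do not come from any real system: an ETL step keeps only sales totals per country, so a later question about devices can be answered only if the raw records were kept.

```python
from collections import defaultdict

# Hypothetical raw records, as they might arrive before any transformation.
RAW_EVENTS = [
    {"user": "ann", "country": "SG", "device": "mobile", "amount": 30.0},
    {"user": "bob", "country": "SG", "device": "desktop", "amount": 20.0},
    {"user": "cy", "country": "MY", "device": "mobile", "amount": 50.0},
]


def etl_sales_by_country(events):
    """ETL-style step: keep only total sales per country, dropping other fields."""
    totals = defaultdict(float)
    for event in events:
        totals[event["country"]] += event["amount"]
    return dict(totals)


def sales_by_device(events):
    """A later query that needs the 'device' field, which the ETL step discarded."""
    totals = defaultdict(float)
    for event in events:
        totals[event["device"]] += event["amount"]
    return dict(totals)


if __name__ == "__main__":
    warehouse = etl_sales_by_country(RAW_EVENTS)
    print("Warehouse table:", warehouse)
    # The warehouse table has no device column, so this new question
    # can only be answered by going back to the raw records.
    print("From raw data:", sales_by_device(RAW_EVENTS))
```

Once only the country totals are stored, no query over them can recover the per-device breakdown; keeping the raw records is what leaves room for questions that were not anticipated when the ETL step was designed.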
Slow adoption rate
Despite its benefits, Hadoop has yet to see mainstream adoption, and there is also little intent to adopt it. According to Gartner's 2015 Adoption Study, only 26 percent of the respondents said they were deploying, piloting or experimenting with Hadoop, while only 18 percent had plans to invest in Hadoop by 2017.
"A lot of organisations are structured according to how they use a particular set of technology. For instance, we had a customer that wanted to run Hadoop on an integrated storage and computing system. However, its team that oversaw storage was different from the team that dealt with processing and thus, they didn't know which team would own Hadoop if they were to deploy it. Solving this problem would require the customer to reorganise their business structure/units, which is probably why organisations are hesitant to implement Hadoop," said Cutting.