Is 2014 the Year of the 'Big Data Stack'?

Thor Olavsrud | Jan. 10, 2014
There is a dizzying array of big data reference architectures available today. 2014 may be the year we see a big data stack—similar to the LAMP stack that drove development of dynamic and interactive websites in the dotcom era—begin to coalesce.

In many ways, this layer of the stack has the most work ahead of it, Daley says. Relational and analytical databases have years of development behind them, but Hadoop and NoSQL technologies are still in their relatively early days.

"Hadoop and NoSQL, I have to say we are early," Daley says. ""We're over the chasm in terms of adoption—we're beyond the early adopters. But there's still a lot that needs to be done in terms of management, services and operational capabilities for both of those environments. Hadoop is a very, very complicated bit of technology and still rough around the edges. If you look at the NoSQL environment, it's kind of a mess. Every single NoSQL engine has its own query language."

'I' Is for the Integration Layer
The next layer up is the integration layer. This is where data prep, data cleansing, data transformation and data integration happen.

"Very seldom do we only pull data from one source," Daley says. "If we're looking at a customer-360 app, we're pulling data from three, four or even five sources. When somebody has to do an analytical app or even a predictive app, 70 percent of the time is spent in this layer, mashing the data around."

While this layer is the "non-glamorous" part of big data, it's also an area that's relatively mature, Daley says, with lots of utilities (like Sqoop and Flume) and vendors out there filling the gaps.
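
To make the "mashing the data around" step concrete, here is a minimal sketch of the kind of work the integration layer does for a customer-360 view: cleansing keys and fields, then merging records from several sources into one profile per customer. Everything in it is an assumption made for illustration. The three source extracts, their field names and the build_customer_360 helper are hypothetical and do not come from Daley or from any particular product.

```python
from collections import defaultdict

# Hypothetical extracts from three source systems (all names and values are made up).
crm_rows = [{"customer_id": " 42 ", "name": "Ada Lopez", "email": "ADA@Example.com "}]
billing_rows = [{"customer_id": "42", "balance": "19.50"}]
support_rows = [
    {"customer_id": "42", "ticket": "T-1"},
    {"customer_id": "42", "ticket": "T-2"},
]


def clean_id(raw):
    """Cleansing: strip stray whitespace so keys match across sources."""
    return raw.strip()


def build_customer_360(crm, billing, support):
    """Merge the three extracts into one profile per customer ID."""
    profiles = defaultdict(lambda: {"tickets": []})
    for row in crm:
        profile = profiles[clean_id(row["customer_id"])]
        profile["name"] = row["name"].strip()
        # Transformation: normalize email casing and whitespace.
        profile["email"] = row["email"].strip().lower()
    for row in billing:
        # Transformation: convert the balance from text to a number.
        profiles[clean_id(row["customer_id"])]["balance"] = float(row["balance"])
    for row in support:
        profiles[clean_id(row["customer_id"])]["tickets"].append(row["ticket"])
    return dict(profiles)


profiles = build_customer_360(crm_rows, billing_rows, support_rows)
```

Even in this toy case, most of the lines deal with reconciling formats rather than analyzing anything, which is consistent with Daley's estimate of where the time goes.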

'A' Is for the Analytics Layer
The next layer up is the analytics layer, where analytics and visualization happen.

"Now I've got the data. I've got it stored and ready to be looked at," Daley says. "I take a Tableau or Pentaho or Qlikview and visualize that data. Do I have patterns? This is where people—business users—can start to get some value out of it. This is also where I would include search. It's not just slice-and-dice or dashboards.

This area too is relatively mature, though Daley acknowledges there's a way to go yet.

"We've got to figure out as an industry how to squeeze more juice out of Hadoop—methods to get data faster," he says. "Maybe we acknowledge that it's a batch environment and we need to put certain data in other data sources? Vendors are working around the clock to make those integrations better and better."

'P' Is for the Predictive/Prescriptive Analytics Layer
The top layer of the stack is predictive/prescriptive analytics, Daley says. This is where organizations start to truly recognize the value of big data. Predictive analytics uses data (historical data, external data and real-time data), business rules and machine learning to make predictions and identify risks and opportunities.

