Why simplicity - not speed - is key to enterprise Hadoop strategies

Scott Carey | April 18, 2016
Microsoft, EMC and HPE execs highlight customer demands at Hadoop Summit.

'Innovate' is where companies look to do more with their data once it is all stored in a Hadoop cluster, for example through advanced analytics, machine learning and predictive modelling, which is HDF's remit.

Raghu Ramakrishnan, CTO for data at Microsoft, sees his customers as very much in the renovate stage, with companies wanting to combine their proprietary data with contextual information to drive insight.

"Fundamentally I think what we are seeing is much more data-centricity in all facets of business," he said.

"What that means is you have the kind of use cases that traditional relational databases enabled. But increasingly you are seeing them want to blend that data with other information from operational data sources that are not relational, from third-party sources like Twitter to IoT devices."

Ramakrishnan added: "So you need to make everything as easy as it can be, because these systems have gotten complex enough that enterprises are asking us for a platform that allows them to focus on the business logic, and to do all of this with the data uniformly governed and audited."

HPE's Goodfellow agreed that his customers are getting more savvy about what they can do with their data once it is consolidated.

"With traditional business intelligence (BI) you would want to use that information to see what your sales were like last week. But also they want to start doing things in real time."

Case study: Predicting traffic patterns

To show the benefit of blending proprietary data with contextual information, EMC's Stefan Radke gave the example of a government 'smart city' project that analysed traffic flows, which EMC helped facilitate.

"What kind of data would you have to have to predict traffic on the road?" asked Radke.

"That is not only how many cars on the street but how fast they go. So to predict things you would collect weather data, you would collect data from schools, when they open, when they close.

"All of these sources would be required to develop a predictive model. You have to have as much data as possible, store it in the data lake and decide on-demand on the schema that you want to use."
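Radke's point about deciding on the schema on demand is often called schema-on-read: raw records are stored untouched and a structure is only imposed when a question is asked. The following is a minimal Python sketch of that idea, not a description of the project's actual system. The record contents, field names, `RAW_LAKE`, `TRAFFIC_SCHEMA` and `read_with_schema` are all hypothetical and chosen purely for illustration.

```python
import json

# Hypothetical raw records, stored as-is with no schema enforced at write time.
# Note the last traffic record holds numbers as strings: nothing validated it.
RAW_LAKE = [
    '{"source": "traffic", "road": "A1", "cars": 120, "avg_speed_kmh": 42.5}',
    '{"source": "weather", "rain_mm": 3.2, "temp_c": 11}',
    '{"source": "school", "name": "North High", "opens": "08:00", "closes": "15:30"}',
    '{"source": "traffic", "road": "B7", "cars": "85", "avg_speed_kmh": "61"}',
]

# A schema chosen at read time: field name -> type to coerce the value to.
TRAFFIC_SCHEMA = {"road": str, "cars": int, "avg_speed_kmh": float}


def read_with_schema(raw_lines, source, schema):
    """Parse raw lines, keep one source, and project each record onto the schema."""
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        if record.get("source") != source:
            continue
        # Only the fields named in the schema are kept, each coerced to its type.
        rows.append({field: cast(record[field]) for field, cast in schema.items()})
    return rows


traffic_rows = read_with_schema(RAW_LAKE, "traffic", TRAFFIC_SCHEMA)
for row in traffic_rows:
    print(row)
```

A different model, say one relating rainfall to average speed, could read the same stored lines with a different schema, which is the flexibility the quote describes.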

Open source

Microsoft's Ramakrishnan channelled the tech giant's greater acceptance of open source under the leadership of Satya Nadella during the session.

"All of the products need to run on the same machines where the data is, because that customer is not going to be locked into any one solution," he said.

Ramakrishnan went on to speak about the architectural issues around Hadoop deployments.

"The openness of the sockets we build around is key. I think we are seeing a LEGO-style architecture for data management and analytics and two of the key blocks are the place where we keep the data and manage it: the store, and resource management, that allows us to co-locate our computation as close to the data as possible."

 
