Event shows the many faces, challenges of big data

Chris Kanaracus | Feb. 16, 2012
The burgeoning tech industry movement around big data is churning up a variety of new applications, but remains an evolving field that faces lingering challenges, judging from an event held Wednesday at a Microsoft research facility in Cambridge, Massachusetts.

For one, Hadoop "has terrible performance on data management," Stonebraker said. In addition, it is a low-level interface that requires people to program in Java, he said. "Forty years of research says high-level languages are good."

The problems Stonebraker cited could be mitigated over time, however, given that an array of vendors have been rolling out various tools meant to make Hadoop easier to use.

Meanwhile, EMC's Greenplum division is "building a platform for the future of big data," said George Radford, field CTO, during a panel discussion. That includes both row-based and columnar stores, integrated Hadoop storage, and integration with the Gemfire in-memory data grid for in-memory analytics, he said. This integration is crucial, according to Radford. "One of the problems with point solutions is with big data, the last thing you want to do is move it. You want to ingest it and analyze it in place."

But a new problem for big data is emerging even as companies like EMC Greenplum make these technological strides, Radford added. "Like everyone else here, we're looking for data scientists. As we solve the platform issues, people are going to be transformed from bit-tweakers and tuners to active partners with the business."

At another point, talk turned to big data's relationship with cloud computing, particularly public infrastructure offerings like Amazon Web Services, which offer raw compute power for developers.

Such systems present "an extremely challenging environment" for big data processing given the limited control users ultimately have over factors like the underlying network and storage, said Fritz Knabe, distinguished engineer at IBM's Netezza division.

But the public cloud does make sense for large processing jobs in some cases, Stonebraker said. "If you are doing month-end reporting and you need 1,000 processors for three hours, go ahead and do that on the [public] cloud. There's some low-hanging fruit."

