While big data is forecast by Gartner to drive US$34 billion of IT spending in 2013 and create 4.4 million IT jobs worldwide by 2015, its deployment requires painstaking planning and clear strategies to be successful.
According to Gartner research director Daniel Yuen, a major challenge is enterprises' lack of big data strategies. "That's a bigger problem than the inability to capture and analyze [big data]," he said.
Know what insights you need
Part of the problem: vendors' use of terminology. "When referring to big data, vendors [lack a] consistent definition, and that confuses customers," said Yuen.
"The point of collecting data is to glean insight from it," said Stanley Lam, managing director, SAS Hong Kong. "You must be sure about what types of insight you want from your data, then pick the right strategy and tools to achieve it."
A retailer wanting insight into its customers, for example, needs to make sense of their comments on social media networks, he said. "To do so, the retailer needs a text or sentiment analysis tool."
From his experience, many Hong Kong customers are interested in social media analytics. "All consumer-driven businesses including retailers find social media analytics relevant," Lam said. "Instead of manual search and study, you need a systematic way to understand comments on social media networks."
While data helps the business understand what matters to it, bigger isn't always better. Data grows constantly and much of it is disposable. According to Gartner's Yuen, the usefulness of data is defined by its relevance, but that relevance changes as time passes.
"With the cost of storage continuously dropping and the increasingly prevalent use of Massively Parallel Processing (MPP) computing (editor's note: In MPP databases, data is partitioned across multiple servers with each server having memory/processors to process data locally), multiple data re-use is possible at an affordable cost," said Yuen. "For instance, a customer navigation log on a website was once kept only for audit and compliance purposes. Now we can keep that data [for a longer period of time] for analysis of customer behavior, channel effectiveness, and costing structure."
But organizations still need to evaluate how relevance should be measured. To Lam, junk data is hard to define. "When it comes to data, there's one thing that organizations need to remember: we don't accumulate data like Facebook comments for accumulation's sake," he said. "We do it because we want insights."
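To make the editor's note on MPP more concrete, here is a minimal Python sketch of the idea: rows are partitioned across nodes, each node processes only its own slice, and the partial results are combined. It is an illustration under stated assumptions, not how any particular MPP database is implemented. The log fields, the function names, and the choice of hashing the customer ID as the partition key are all hypothetical, and the per-node step runs one node after another here, whereas a real MPP system would run it on separate servers in parallel.

```python
from collections import Counter
from zlib import crc32


def node_for(customer_id, num_nodes):
    """Pick the node that owns a row by hashing its partition key (assumed: customer ID)."""
    return crc32(customer_id.encode("utf-8")) % num_nodes


def partition(rows, num_nodes):
    """Distribute rows so that each node holds only its own slice of the data."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row["customer"], num_nodes)].append(row)
    return nodes


def local_page_counts(node_rows):
    """Work done on one node, using only the rows stored there."""
    return Counter(row["page"] for row in node_rows)


def page_views(rows, num_nodes=3):
    """Combine the per-node partial counts into one result.

    The loop below is sequential for simplicity; in an MPP system each
    node would compute its partial count at the same time.
    """
    total = Counter()
    for node_rows in partition(rows, num_nodes):
        total.update(local_page_counts(node_rows))
    return total


# Hypothetical website navigation log entries.
LOG = [
    {"customer": "c1", "page": "/home"},
    {"customer": "c2", "page": "/home"},
    {"customer": "c1", "page": "/checkout"},
    {"customer": "c3", "page": "/home"},
]

if __name__ == "__main__":
    print(page_views(LOG))
```

Because each node counts only its own rows and the partial counts are simply added together, the final totals do not depend on how many nodes the data is spread across.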
Five big data myths
Enterprises should also be aware of these myths that hinder effective big data deployment.
1. Big data analytics isn't big analytics
Lam pointed out that big data analytics isn't 'big analytics'. "BI is about generating reports. When you need to generate a report on lots of data, that's big analytics," he said. "Big data analytics is about using advanced analytics tools to look at a huge amount of data for insights generation and predictions."