Catalist replaced its "$100,000 and up" boxes with four NAS storage units at a cost of $40,000. "We quadrupled our capacity for about $10,000 each," Crigler says. "That was a year and a half ago," and the cost of storage has continued to go down.
Csaplar says he expects to see more lower-end storage systems on the market as more organizations find that they meet their needs. Big vendors like EMC see the writing on the wall and have been buying up smaller, boutique storage companies, he adds.
The Storage and Processing Gap
Data analytics workflow tools are allowing stored data to sit even closer to analytics tools, while their file compression capabilities keep storage needs under control. Hewlett-Packard's Vertica unit, for instance, offers in-database analytics functionality that lets companies run analytics computations without extracting information to a separate environment for processing. EMC's Greenplum unit offers similar features. Both are part of a new generation of columnar databases, which are designed to offer significantly better performance, I/O efficiency and storage footprint than row-based databases when it comes to analytic workloads. (In April, Greenplum became part of Pivotal Labs, an enterprise platform-as-a-service company that EMC acquired in March.)
Catalist opted for a Vertica database specifically for those features, Crigler says. Because the database is columnar rather than row-based, it looks at the cardinality of the data in each column and can compress it based on that. In this context, cardinality is the number of distinct values a column contains. A column with few distinct values relative to its number of rows, such as a state code, holds many repeats and therefore compresses especially well.
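To make the link between cardinality and compression concrete, here is a minimal Python sketch of two encodings commonly associated with columnar storage: dictionary encoding and run-length encoding. It is an illustration only, not Vertica's actual implementation; the `state` column and all function names are hypothetical.

```python
from itertools import groupby


def cardinality(column):
    """Return the number of distinct values in a column."""
    return len(set(column))


def dictionary_encode(column):
    """Replace each value with a small integer code.

    Returns (dictionary, codes), where dictionary[code] recovers the value.
    """
    dictionary = sorted(set(column))
    index = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [index[value] for value in column]


def run_length_encode(column):
    """Collapse runs of identical adjacent values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(column)]


# Hypothetical low-cardinality column, stored sorted so equal values are adjacent.
state = ["CA"] * 5 + ["NY"] * 3 + ["TX"] * 4

print(cardinality(state))        # distinct values in the column
print(dictionary_encode(state))  # small dictionary plus one integer code per row
print(run_length_encode(state))  # one (value, count) pair per run
```

In this toy column, twelve stored strings shrink to three (value, count) pairs. A row-based layout interleaves values from different fields, so long runs of the same value are much less likely to sit next to each other.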
The Right People for the Job
What skill sets will big data storage and analytics require? By 2015, 4.4 million jobs around the world will require big data skills, but only one-third of those jobs will be filled, according to Gartner. IT professionals must acquire the skills needed to connect, analyze and manage any type of information, in any location, using any interface, to help organizations fully realize the potential of big data, according to a report by the research firm.
Dealing with big data requires a unique set of skills that may be scarce in mainstream IT. For traditional data analysis, such as for finance and HR, it's easy to find people who are familiar with a business discipline, who know what each data field means and who can help create reports. But with big data, there's more to it.
"You definitely need someone with business domain expertise," but you also need people who know how to work with data to do machine learning and other techniques to, for example, build an algorithm or a transfer function, says Vince Campisi, CIO at GE Software. Having people with more specialized skills "allows you to stitch together this information and produce an analytic that tells you something you couldn't have otherwise seen," he adds.