"It's mostly for two reasons," Nadkarni explains. "Many times, these projects aren't done by IT. Second, because of the time to deploy and to go live, many business units find it easier to spin up a couple of instances in the cloud and get going, so it goes from a few weeks to a few days."
Campisi says most of the customers his unit supports are still storing and analyzing data on-site — for now. "We are transitioning to more and more using cloud technology and capabilities to support our strategy. From what I see from customers, it tends to be more of a traditional approach where they use their own internal corporate data center," he says.
For his part, Crigler is trying to figure out how to migrate all of Catalist's data to the cloud. The firm already replicates its database that matches voters' identities to the cloud "because it's a ton of data, and it's used on a very 'spikey' basis," he says. "Four to five months [before] an election, it gets hammered. So being able to expand processing capacity and throwing more disks and CPUs at it is really important."
He's also trying to devise a strategy that gets the best performance for the money, given the uneven demand on that data and the need to run analytic queries against historical data.
"It's a big challenge," Crigler says. For instance, "Amazon's Elastic Block [Store] is slow, and S3 is even slower. The best option is the most expensive, which is the attached dedicated storage on the very large Amazon boxes — and that's really expensive. So you have to have a way of analyzing your data and calculating the price-performance curve for different kinds and ages of data, and optimizing your storage based on your real needs."
Though many companies are still grappling with the early stages of their big data storage strategies, it won't be long before hyperscale computing environments like those at Google and Facebook become more commonplace.
"It's happening," says Nadkarni. "This whole server-based storage design is a direct result of department practices followed by Amazon, Facebook, Google" and the like.
In Silicon Valley, startups are offering big data storage systems based on those companies' principles. At VMware's recent VMworld virtualization conference, says Nadkarni, "there were at least a dozen companies with founders who used to be at Google and Facebook."
For legal reasons, the startups can't exactly replicate their former employers' magic, "but the principles are well entrenched in Silicon Valley," Nadkarni says. "In a few years you'll see this hyperscale principle make its way into the mainstream enterprise because there won't be any other way to do it."