A number of the attendees at the recent HP Big Data Conference in Boston weren't ready to lift and shift their big data to the cloud. Credit: HP
Is the cloud a good spot for big data?
That’s a controversial question, and the answer changes depending on who you ask.
Last week I attended the HP Big Data Conference in Boston and both an HP customer and an executive told me that big data isn’t a good fit for the public cloud.
I’ve heard many cloud vendors tell me the opposite.
CB Bohn is a senior database engineer at Etsy, and a user of HP’s Vertica database. The online marketplace uses the public cloud for some workloads, but its primary functions are run out of a co-location center, Bohn said. It doesn't make sense for the company to lift and shift its Postgres, Vertica SQL and Hadoop workloads into the public cloud, he said. It would be a massive undertaking for the company to port all the data associated with those programs into the cloud. Then, once its transferred to the cloud, the company would have to pay ongoing costs to store it there. Meanwhile, the company has a co-lo facility already set up and expertise in house to manage the infrastructure required to run those programs. The cloud just isn’t a good fit for Etsy’s big data, Bohn says.
Chris Selland, VP of Business Development at HP’s Big Data software division, says most of the company’s customers aren’t using the cloud in a substantial way with big data. Perhaps that’s because HP’s big data cloud, named Helion, isn’t quite as mature as say Amazon Web Services or Microsoft Azure. But still, Selland said there are both technical challenges (like data portability, and data latency) along with non-technical reasons, such as company executives being more comfortable with the data not being the cloud.
Bohn isn’t totally against the cloud though. For quick, large processing jobs the cloud is great. “Spikey” workloads that need fast access to large amounts of compute resources are ideal for the cloud. But, if an organization has a constant need for compute and storage resources, it can be more efficient to buy commodity hardware and run it yourself.
Public cloud vendors like Amazon Web Services make the opposite argument. I asked Amazon.com CTO Werner Vogels about private clouds recently and he argued that businesses should not waste time on building out data center infrastructure when AWS can supply it to them. Bohn argues that it’s cheaper to just buy the equipment than to rent it over the long-term.
Sign up for CIO Asia eNewsletters.