The single greatest distinction between Hadoop and Google Cloud Dataflow, though, lies in where and how each is most likely to be deployed. Data tends to be processed where it sits, and for that reason Hadoop has become a data store as much as a data processing system. Those eying Google Cloud Dataflow aren't likely to migrate petabytes of data into it from an existing Hadoop installation. It's more likely Cloud Dataflow will be used to enhance applications already written for Google Cloud, ones where the data already resides in Google's system or is being collected there. That's not where the majority of Hadoop projects, now or in the future, are likely to end up.
"I don't see this as a migration play," said Baer.
Sign up for CIO Asia eNewsletters.