That last feature is extremely important, Ghodsi says.
"It's actually really hard to transition between interactive computations and production pipelines," he says. "I think a lot of people have this mental model that there are two different things you can do: either you're doing interactive analysis or you're building data pipelines. That's not how developers work. While they're developing a data pipeline, they have to explore the data, debug and test to make sure the data pipeline is actually working. During this process, they need interactive analysis."
Moving among modes
And while you want your data pipelines to run without humans in the loop, if you run into problems, you need to be able to move seamlessly into an interactive mode to develop them further.
"We want to make sure that you can easily and seamlessly move between these two modes," Ghodsi says.
"Databricks' latest developments for data engineering make it exceedingly easy to get started with Spark — providing a platform that is apt as both an integrated development environment and deployment pipeline, "Brett Bevers, engineering manager, Data Engineering, at Dollar Shave Club, added in a statement Wednesday. "On our first day using Databricks, we were equipped to grapple with an entirely new class of data challenges."
The new offering is immediately available. It's priced based on data engineering workloads such as ETL and automated jobs ($0.20 per Databricks Unit plus the cost of AWS).