"If you need scalability, you need to give up consistency -- you have to give up one or the other," Thakur said.
That makes it tough to get a reliable snapshot of the big picture for point-in-time recovery. Not only is it more difficult to track which data might have moved where in a distributed database at any given moment, but the resiliency features that often come "baked" into newer distributed databases -- replication, for example -- won't protect you if data gets corrupted, said Simon Robinson, a research vice president with 451 Research.
"You just replicate that corrupted data," he said.
Earlier this month, Datos IO launched RecoverX to address those concerns through features including what it calls scalable versioning and semantic deduplication. The result is cluster-consistent backups that are both space-efficient and available in native formats, the company says.
Souvik Das, who until recently was CTO and managing vice president of engineering with Capital One Auto Finance, has felt the backup crunch first-hand.
After years of using traditional databases, Capital One underwent a "massive transformation" a few years back that included rolling out new distributed technologies such as Cassandra, said Das, who is now senior vice president of engineering at healthcare-focused startup Grand Rounds.
That meant looking for a new strategy for backup and recovery.
"Most of the backup vendors and software are typically tuned to the type of systems that they're backing up," he explained.
Using an older-style backup product with a newer distributed database could spell trouble, he said.
"Either that software would completely fail because it has no idea how to back up the new data stores, or it would work in a very suboptimal way," Das said. "We knew going in that we would have to have different backup solutions."
Capital One has been evaluating Datos IO as well as Talena, another major player in the space, Das said.
Vendors of more traditional backup products are gradually adjusting their own technologies for big data as well.
"It usually takes the incumbent backup vendors some time to support the newer technologies," 451 Research's Robinson said.
"Rewind 10 years and it was very difficult initially to easily do backups for VMware virtual machines," he added. "This opened the door for players like Veeam to enter and steal the VM backup market from under the noses of the incumbents."