This conversation I had with an end user at the same time I was drafting this article is representative of the NoSQL analytics dilemma.
* Query latency. Complex analytics can be challenging for NoSQL datastores, so many companies are forced to pre-compute results. Tapjoy found this to be the case with HBase and outlined their challenges at the In-Memory Computing Conference in San Francisco during their Hitchhiker’s Guide to Building a Data Science Platform presentation. This batch processing workflow introduces system latency and reduces that business value of data. Never mind that a batch oriented workflow means the results are inherently out of date and disqualifies the opportunity to deliver real-time analytics.
* Hardware sprawl. While scale, and in particular the number of nodes in a cluster, can be a badge of honor, the goal is not how many nodes can be deployed, but rather how few. Even more important is the efficiency of transactions for each node. When NoSQL solutions need to be coupled with additional SQL layers, or pre-computing must be completed before queries can be run, it adds to hardware sprawl and costs.
* Preserve the model, consolidate workloads. There are other options, recently referred by Gartner as the “avant-garde” of relational databases that provide solutions using relational properties of SQL, and the performance needed to scale, frequently through the use of in-memory technologies. Many of these avant-garde databases also incorporate capabilities like JSON to provide data models for structured and semi-structured data.
Today customers are discovering that what appeared like a novel lower cost solution of NoSQL is actually much higher than initially thought. Fortunately, those challenges can be solved with a database that provides the performance needed and the ability to perform comprehensive SQL analytics all in a single solution.
Many big data industry participants have noted that a revolution is underway in the way companies capture and process data. But perhaps the climate is best summarized by Gwen Shapira, a prominent spokesperson on big data:
The revolution will not be schema-less :)— Gwen (Chen) Shapira (@gwenshap) December 3, 2015
This tweet puts the NoSQL movement in perspective. While it appeared that schema-less data management options offered a panacea for the future, the reality has been quite different, with many recognizing the time-tested value of structure, schemas, and SQL.
Sign up for CIO Asia eNewsletters.