

Lies your database is telling you

Andrew C. Oliver | March 10, 2016
A wise person once said time is a device invented to keep everything from happening at once. Jonas Boner explains how the database world has abused time from the beginning.

Recently, Typesafe-turned-Lightbend CTO Jonas Boner has been giving a presentation attacking the general model used for most database systems. He takes aim at CRUD (create, read, update, delete) and at the ACID model used to achieve it. Such critiques are scarcely unusual, yet Boner's argument was uncommonly brilliant.

He began with double-entry accounting, which was born as early as the seventh century and still pervades every major business in the world. Simpler alternatives are rarely considered because their disadvantages are too great. The basic premise of double-entry accounting is that you can't change the past, only correct the present.

Database developers, however, thought they knew better, so they created the software equivalent of a time machine: The update statement and its awful cousin the delete statement. To be fair, when those statements were invented, a 5MB hard drive had to be loaded with a forklift, so other structures weren't necessarily feasible.

The database developers who had those great notions must have missed some of the best/worst episodes of "Star Trek," where you find out that time travel is generally a bad idea. With updates, you get concurrency control, mutexes, transactions, and other constructs that try to mitigate the negative effects of attempting to modify the same state while dealing with more than one thing happening at a time.

Now, there is an alternative: "insert only" structures. The trouble with those -- besides generating more instances, rows, attributes, or documents (like double-entry accounting) -- is that you never have a "consistent view" of the data. Boner asserts that this is OK because the consistent view is nothing more than a convenient fiction you have inconveniently created at the expense of adding more latency to your operational system.
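To make the contrast with update and delete concrete, here is a minimal sketch of an insert-only structure in the spirit of a ledger. It is an illustration, not anything taken from Boner's talk: the `Entry` and `Ledger` names, the cents-based amounts, and the sample data are all hypothetical. A mistake is never overwritten; it is offset by a later correcting entry, and the balance is computed from the full history.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Entry:
    """One immutable fact in the ledger (hypothetical shape)."""
    account: str
    amount: int  # in cents; positive adds to the account, negative subtracts
    memo: str


class Ledger:
    """Insert-only store: entries can be appended, never updated or deleted."""

    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def append(self, entry: Entry) -> None:
        self._entries.append(entry)

    def entries(self) -> Tuple[Entry, ...]:
        # Hand out a read-only copy so callers cannot rewrite history.
        return tuple(self._entries)

    def balance(self, account: str) -> int:
        # The current value is derived by summing every recorded fact.
        return sum(e.amount for e in self._entries if e.account == account)


ledger = Ledger()
ledger.append(Entry("alice", 10_000, "deposit"))
ledger.append(Entry("alice", -2_500, "withdrawal, recorded in error as 25.00"))
# The withdrawal was really 20.00. Instead of updating the row above,
# record a correcting entry that offsets the mistake.
ledger.append(Entry("alice", 500, "correction: withdrawal was 20.00"))

print(ledger.balance("alice"))
```

The cost the article mentions is visible here: three rows exist where an update-in-place design would keep one, and every read of the balance has to walk the history.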

According to Boner, not only is time an illusion, so is the present. It seems absurd, right? Now is the present. However, by the time you reached the end of that sentence and comprehended what you read, it was no longer true. If you try to mentally hold on to the present in more than a general sense, you find that you can't because the present is no more than a pointer that is always moving.

When we get to the level of larger data sets, however, determining totals "right now" is at the very least laborious in an insert-only structure. The "local present" is a set of "facts derived from multiple concurrent pasts." That is, if you look at all of the states that "were" generated in the system up until "now," you can arrive at a conclusion as to the state or value of now.
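As a rough sketch of what deriving a "local present" might look like, the example below folds several independent, append-only logs into a single set of totals. The `Fact` type, the `local_present` function, and the two warehouse logs are hypothetical names chosen for illustration; the point is only that "now" is computed from every fact seen so far rather than stored anywhere.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Fact:
    """An immutable record of something that already happened (hypothetical)."""
    source: str  # which log produced the fact
    seq: int     # position within that source's own log
    item: str
    delta: int   # change in quantity


def local_present(*logs: Iterable[Fact]) -> Dict[str, int]:
    """Derive current totals from every fact seen so far in each log."""
    totals: Dict[str, int] = {}
    for log in logs:
        for fact in log:
            totals[fact.item] = totals.get(fact.item, 0) + fact.delta
    return totals


# Two "concurrent pasts": each source appends to its own log independently.
warehouse_a = [Fact("a", 1, "widget", 10), Fact("a", 2, "widget", -3)]
warehouse_b = [Fact("b", 1, "widget", 5), Fact("b", 2, "gadget", 7)]

now = local_present(warehouse_a, warehouse_b)
print(now)
```

Because the deltas here are simple additions, the order in which the logs are merged does not affect the result. That is a property of this toy example, not of insert-only designs in general.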

Meanwhile, when you try to discover this "right now" state, you may find you don't have all of the information. In fact, you find that Donald Rumsfeld might have insight for you. Not only do you have known unknowns, but you have unknown unknowns. Why? Information has latency. There are facts you don't have yet. Even when we try to force a consistent view of the world, we add latency somewhere else, and our operational system becomes less concurrent and scales less well.
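The effect of information latency can be sketched by recording two times for each fact: when it happened and when the observer learned of it. Everything in the example is an assumption made for illustration, including the `TimedFact` type, the integer clock, and the sample amounts. The same question, asked at two different moments, gets two different answers, and neither answer was wrong given what was known at the time.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class TimedFact:
    """A fact with separate event and arrival times (hypothetical shape)."""
    occurred_at: int  # when the event happened, in logical clock ticks
    arrived_at: int   # when this observer received the fact
    amount: int


def total_as_known_at(facts: Iterable[TimedFact], clock: int) -> int:
    """Sum only the facts that had already arrived by the given clock value."""
    return sum(f.amount for f in facts if f.arrived_at <= clock)


facts = [
    TimedFact(occurred_at=1, arrived_at=1, amount=100),
    TimedFact(occurred_at=2, arrived_at=5, amount=-40),  # delayed in transit
    TimedFact(occurred_at=3, arrived_at=3, amount=25),
]

known_at_3 = total_as_known_at(facts, clock=3)  # the delayed fact is still unknown
known_at_5 = total_as_known_at(facts, clock=5)  # the delayed fact has now arrived

print(known_at_3, known_at_5)
```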

 

