Anyone who has waded through pages and pages of well-engineered code with layers and layers of abstractions knows that sometimes a bit of code with one simple input and one simple job description is better than a masterful pile of engineering and computer science. Junky but functional code can take 10 seconds to understand -- sophisticated architectures can take weeks.
It’s not that doing a good job is a bad thing. Many times, however, no one has the time or energy to unwrap all of the sophistication. When time is in short supply, sometimes quick and sloppy wins, and wins big.
Sometimes being smart isn’t worth the price. Nowhere is this more evident than when it comes to thoroughly studied algorithms with strong theoretical foundations.
Everyone knows the lessons from college. A smart data structure will do the job in time proportional to the size of the data. A bad one might take time proportional to the square or even the cube of the number of data elements. Some of the truly horrible ones get exponentially slower as the amount of data grows. The lessons from computer science class are important. Bad algorithms can be really slow.
The problem is that smart, theoretically efficient algorithms can be slow too. They often require elaborate data structures full of pointers and caches of intermediate values, and those caches chew up RAM. They can take months or years to get right. Sure, in the long run they’ll be faster, but what is it that economist John Maynard Keynes said? “In the long run we are all dead.”
Part of the problem is that most of the theoretical models analyze how algorithms behave when the data set grows very large. But even in the era of big data, we may not be dealing with a data set that’s large enough to enjoy all of the theoretical savings.
In these cases, it might be a good idea to toss together a sloppy algorithm, even if it’s potentially slow.
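As a rough illustration of the trade-off, the sketch below puts a quadratic-time insertion sort next to an O(n log n) merge sort. The function names, the test sizes, and the timing helper are all assumptions made for this example, not anything from a particular library. On small inputs the simpler routine often keeps up with or beats the smarter one because it has less bookkeeping, but the crossover point depends on the machine and the runtime, so measure before trusting either.

```python
import random
import timeit


def insertion_sort(values):
    """Quadratic-time sort: short, no recursion, no extra lists beyond the copy."""
    result = list(values)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger elements one slot right to make room for `current`.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


def merge_sort(values):
    """O(n log n) sort: better asymptotically, but more allocation and bookkeeping."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    left = merge_sort(values[:mid])
    right = merge_sort(values[mid:])
    merged = []
    i = j = 0
    # Merge the two sorted halves into one sorted list.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def time_sorts(n, repeats=50):
    """Return (insertion_seconds, merge_seconds) for `repeats` sorts of n random floats."""
    rng = random.Random(0)  # fixed seed so both sorts see the same data
    data = [rng.random() for _ in range(n)]
    t_insertion = timeit.timeit(lambda: insertion_sort(data), number=repeats)
    t_merge = timeit.timeit(lambda: merge_sort(data), number=repeats)
    return t_insertion, t_merge


if __name__ == "__main__":
    # Hypothetical sizes chosen only to show the trend; results vary by machine.
    for n in (8, 64, 512):
        t_insertion, t_merge = time_sorts(n)
        print(f"n={n:4d}  insertion={t_insertion:.4f}s  merge={t_merge:.4f}s")
```

If the data never grows past the size where the simple version is fast enough, the extra engineering in the smarter version may not pay for itself.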
Using a separate database server
When it comes to software performance, speed matters. A few milliseconds on the web can be the difference between early retirement and a total flop. The common wisdom goes: To speed up communications between the layers of your software, put your database on the same machine as the software that packages the results for the user. With your database code and presentation layer communicating quickly, you eliminate the latency of a network round trip to a separate machine.
Except it doesn’t always pay off, especially when the single machine can’t efficiently serve the needs of both the presentation and the database layer.