One technique they describe is "hedged requests," in which duplicate requests are sent to multiple servers and the first response to come back is the one that is used. Another is to set up micro-partitions, or multiple partitions on each machine, which allows for more fine-grained load balancing. A third is "latency-induced probation," in which slow servers are quickly spotted and assigned no additional work.
These techniques should "allow designers to continue to optimize for the common case while providing resilience against uncommon cases," the Google engineers wrote.
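To make the first of these ideas concrete, here is a minimal sketch of a hedged request in Python. It is an illustration under stated assumptions, not Google's implementation: the names `fake_server` and `hedged_request` are hypothetical, the "servers" are simulated with timed sleeps, and every duplicate is sent at once, as the description above suggests.

```python
import asyncio


async def fake_server(name, delay):
    # Hypothetical stand-in for a real network call: waits, then answers.
    await asyncio.sleep(delay)
    return f"response from {name}"


async def hedged_request(calls):
    # Start every duplicate request at the same time.
    tasks = [asyncio.ensure_future(call) for call in calls]
    try:
        # Block until at least one request has finished.
        done, _pending = await asyncio.wait(
            tasks, return_when=asyncio.FIRST_COMPLETED
        )
        # Use the first finished response. Simplification: if that
        # request failed, its exception is raised rather than retried.
        return next(iter(done)).result()
    finally:
        # Cancel the slower duplicates so they stop consuming resources.
        for task in tasks:
            task.cancel()


if __name__ == "__main__":
    replicas = [fake_server("replica-a", 0.2), fake_server("replica-b", 0.01)]
    print(asyncio.run(hedged_request(replicas)))
```

The trade-off is extra load: each hedged request costs as many server calls as there are duplicates, so a sketch like this suits cases where tail latency matters more than total work.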
Stoica noted that many of the techniques that Google described would be applicable to smaller IT operations as well, though the effect would not be as pronounced as in a deployment as large as Google's.