As each new generation of computer processors arrives with a larger number of computing cores, computer scientists grapple with how best to make use of this proliferation of parallel power.
Now researchers at the Massachusetts Institute of Technology have created a data structure that they claim can help large multicore processors churn through their workloads more effectively. Their trick? Do away with the traditional first-come, first-served work queue and assign tasks more randomly.
MIT's new SprayList algorithm allows processors with many cores to spread out their work so they don't stumble over one another, creating bottlenecks that hamper performance.
If it pans out in day-to-day operation, something like SprayList might pave the way for more effective use of new, many-core processors coming our way, such as Intel's new 18-core server chip,the E5 2600v3.
Multicore processors, in which a single processor contains two or more computational cores that operate simultaneously, can present a challenge in programming. The problem is that the work a computer needs to do must be distributed evenly across each of the cores for maximum performance.
When the first commodity two-core and four-core processors came out more than a decade ago, software researchers harnessed a simple and well-known computer science technique to dole out work, called priority queue, in which the task on top of the work queue is assigned to the next available core. The queue can be ordered through a combination of job priority and good old-fashioned first-come, first-served serialization.
Traditional implementations of priority queue work fine for up to eight cores. But performance suffers when additional cores are added, the researchers said.
Like having too many cooks in the kitchen, too many cores working on the top of a single priority queue can slow performance. Multiple cores hitting the queue at the exact same time can cause bottlenecks as each core contends for the top task. And because each core keeps its own copy of the priority queue in a cache, synchronizing the ever-changing queue across these multiple caches can be a headache — if processors could get headaches, that is.
So the researchers devised a new way of implementing priority queues in such a way that they will continue to be effective for up to 80 cores. Instead of each core being assigned the next task in the queue, the core is assigned a random task, reducing the chances of creating a bottleneck from two cores contending for the same task.
Random assignment has traditionally been frowned upon by those who think a lot about computer processors, the researchers noted in a paper explaining the work. A random scheduling algorithm takes longer to jump around the queue than a conventional one does. Caches can't be used to store upcoming work items. And if a set of tasks needed to perform one job are executed out of order, then the computer needs additional time to reassemble the final results.
Sign up for CIO Asia eNewsletters.