MIT has an easier way for multicore chips to use data structures


Credit: Flickr / Andrew Magill

A group of researchers from MIT’s Computer Science and Artificial Intelligence Laboratory have discovered a way for data structures to work more efficiently with mutlicore chips, according to an MIT announcement on Friday. The scientists will present their findings in February during the Association for Computing Machinery’s Symposium on Principles and Practice of Parallel Programming.

Although hardware developers have been boosting the speed of computer chips by adding more cores and processing power (this also led to rise of multicore programming a few years back), it’s been challenging for researchers to scale out some data structures along with the multiple cores.

At the center of the MIT team’s research are a type of data structure known as priority queues, which are used for applications like data compression, network scheduling and simulating events. This type of data structure is used to designate data items and jobs based on importance so that the processor accesses whatever is at the front of the line, so to speak, followed by the next, and so on.

It gets a bit confusing when you add multiple cores to the mix, however, because each core attempts to access whatever piece of data is prioritized as being first in the data structure, which results in a host of conflicts, according to the release. Supposedly, performance starts to stumble after eight cores.

To solve this dilemma, the researchers found a way to alter the priority queue data set so that each one of the cores is no longer forced to always hook up with the first data item. Using a linked list, which adds the ability for each data item to be traced with a unique memory address, the team was able to assign data items to be processed by different cores.

However, while the data items are now mapped to specific cores, a linked list still maintains the properties of the priority queue in that each core — regardless of what data set it is assigned to — still needs to scroll down the list of prioritized data items to find its corresponding data set. If a data set ahead in line is being processed by another core, the core has to wait for that job to be finished before it can start working.

The scientists then used what’s called a skip list, which supposedly makes for a more efficient process of moving down a linked list by building “a hierarchy of linked lists on top of” the main linked list, the release stated.

From the release:
[blockquote person=”MIT” attribution=”MIT”]The skip list begins with a linked list and builds a hierarchy of linked lists on top of it. Only, say, half the elements in the root list are included in the list one layer up the hierarchy. Only half the elements in the second layer are included in the third, and so on…To find a given item in the root list, you follow the pointers through the top list until you identify the gap into which it falls, then move down one layer and repeat the process.[/blockquote]

With the linked list now sliced up into several chunks of lists, the cores have less of a chance to run into the same data items, although MIT said that collisions still occur. Apparently, when the researchers tested algorithms that relied on the re-architected data structure, they saw a notable performance increase with each additional core added, “up to a total of 80 cores.”



sounds intuitively correct. One can always find corner cases with certain mix of jobs in the priority queue to induce inefficiencies, but in a normally distributed work queue, this makes sense.

I would have thought we would have arrived here years ago. I seem to recollect seeing similar techniques being adopted nearly 20 years ago.

John DeTreville

No, not at all. This applies to shared-memory multiprocessing.

Comments are closed.