As semiconductors try to get faster without breaking the laws of physics (not that researchers aren’t trying that, too), multicore processors have become all the rage. Quad-core chips are commonplace in servers nowadays, and six-core chips have launched this year. But past a certain point, adding more processor cores doesn’t improve performance on certain problems, because there isn’t enough memory and bandwidth in the right places on the chip.
This is becoming an issue in the supercomputing world right now, and it’s even rearing its head in multicore embedded chips for devices ranging from base stations to routers. It’s a problem that will meander its way down the computing food chain over the next few years to affect servers and even smartphones, possibly changing the way chips are designed.
The issue is that, while some tasks can be broken up so that each core solves part of the problem in parallel, other computing problems require each core to pull in large amounts of data to do its share of the work. That data sits in memory that may or may not be located on the chip itself.
It’s kind of like a mom dealing with the demands of half a dozen kids: those demands are hard to process and even harder to fulfill, thanks to physical limits like only having one pair of hands. And unlike kids, cores can’t shout louder to have their requests heard; they still have to go through a defined path on the silicon to get data from memory to the cores. As more cores request more information, the memory on the chip can’t hold all the data, and the channels of communication between the cores and the memory become bottlenecks.
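To make that bottleneck concrete, here’s a toy Python model (the bandwidth figures are hypothetical, chosen only for illustration, not drawn from any real chip): each busy core demands a fixed slice of memory bandwidth, but the chip’s memory system caps the total, so adding cores stops helping once the cap is hit.

```python
# Toy model of a bandwidth-limited multicore chip. The numbers below are
# made up purely to illustrate the bottleneck described above.
MEM_BANDWIDTH_GBPS = 12.0   # total bandwidth the memory system can supply
PER_CORE_DEMAND_GBPS = 4.0  # bandwidth one busy core wants

def effective_speedup(cores):
    """Speedup over one core on a memory-hungry workload.

    Cores scale linearly until their combined demand exceeds what the
    memory channels can deliver; after that, extra cores just wait.
    """
    return min(cores, MEM_BANDWIDTH_GBPS / PER_CORE_DEMAND_GBPS)

for n in (1, 2, 4, 8, 16):
    print(f"{n:2d} cores -> {effective_speedup(n):.1f}x speedup")
```

With these made-up numbers, speedup flattens at 3x no matter how many cores are added: the memory channels, not the core count, set the ceiling.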
Daniel Reed, Microsoft’s scalable and multicore computing strategist, calls this a hidden problem that’s just as big as the challenge of writing code that takes full advantage of multicore chips.
Chip firms are aware of the issue. Intel’s Nehalem server processor puts more memory on the chip for the multiple cores and improves communication among them. Firms such as Texas Instruments (TXN) have tweaked the designs of their ARM-based cell phone chips to address the issue as well. Freescale has built a “fabric” into some of its multicore embedded chips so that information can be shared more efficiently across a variety of cores.
But it’s possible that what’s needed is a straight redesign on the processor side. SiCortex, which makes a specially designed chip for the high-performance computing market, questions whether merely adding more memory, as Intel is doing, will solve the issue. Its approach is closer to building a communication fabric inside the device that scales with the number of cores that are added.
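A rough sketch of why a fabric scales differently (this illustrates the general idea, not SiCortex’s actual design, and the link speeds are invented): a shared bus divides a fixed amount of bandwidth among all the cores, while a fabric adds a link, and therefore bandwidth, with every core it connects.

```python
# Hypothetical numbers comparing a fixed shared bus with a fabric whose
# aggregate bandwidth grows as cores (and their links) are added.
BUS_GBPS = 16.0         # fixed total bandwidth of a shared bus
FABRIC_LINK_GBPS = 2.0  # bandwidth of one per-core fabric link

def bus_per_core(cores):
    # Every core gets a shrinking slice of the same fixed pipe.
    return BUS_GBPS / cores

def fabric_per_core(cores):
    # Each added core brings its own link, so the per-core share holds.
    return FABRIC_LINK_GBPS

for n in (2, 8, 32):
    print(f"{n:2d} cores: bus {bus_per_core(n):.2f} GB/s/core, "
          f"fabric {fabric_per_core(n):.2f} GB/s/core")
```

On these invented figures, a 32-core chip on a shared bus leaves each core a sliver of bandwidth, while the fabric’s per-core share stays constant, which is the scaling argument in a nutshell.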
The contention is that adding more memory is kind of like bringing Dad in to help handle the kids’ multiple demands: it doesn’t scale if you keep adding more kids. As chipmakers look for alternative ways to handle this, startups such as MetaRAM, which is pioneering dense memory; Acumem, whose tool spots the bottlenecks that slow an application; and those designing their own chips, such as SiCortex, could lead the way to a solution.