Compute and data must link

MIT researchers claim they have a way to make faster chips

A team of MIT researchers has discovered a possible way to make multicore chips a whole lot faster than they currently are, according to a recently published research paper.

The researchers’ work centers on a scheduling technique called CDCS, short for computation and data co-scheduling. The technique distributes both data and computations throughout a chip; in a 64-core chip, the researchers report, computational speeds rose by 46 percent while power consumption fell by 36 percent. The speed boost matters because multicore chips are becoming more prevalent in data centers and supercomputers as a way to increase performance.

The basic premise behind the new scheduling technique is that data has to be near the computation that uses it, and the best way to achieve that is a combination of hardware and software that distributes both the data and the computations throughout the chip more effectively than before.

Although current techniques like nonuniform cache access (NUCA) — which basically involves storing cached data near the computations — have worked so far, they don’t take into account the placement of the computations themselves.

The new research touts an algorithm that places the data and the computation together, rather than optimizing the placement of the data alone. That joint view lets the researchers anticipate where the data needs to be located.
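The intuition behind that joint placement can be sketched in a few lines. The toy model below is an illustrative assumption, not the paper's actual CDCS optimizer: cores sit on a mesh, cost is hop distance, a NUCA-style pass picks only the nearest cache bank for a fixed thread, while a co-scheduling pass is free to choose the thread's core and the data's bank together.

```python
# Toy model of data-only placement vs. computation-and-data co-placement.
# All names and the brute-force search are illustrative; the real CDCS
# algorithm in the paper is far more sophisticated.
from itertools import product

def hops(a, b):
    """Manhattan (mesh-hop) distance between two (x, y) core positions."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def nuca_only(thread_at, banks_free):
    """NUCA-style pass: the thread's core is fixed; only the data moves,
    to the nearest free cache bank."""
    bank = min(banks_free, key=lambda b: hops(thread_at, b))
    return thread_at, bank

def co_schedule(cores_free, banks_free):
    """Co-scheduling pass: choose the thread's core AND the data's bank
    together, minimizing hop distance between them."""
    return min(product(cores_free, banks_free),
               key=lambda cb: hops(*cb))

# On a 4x4 mesh: a thread pinned at (0, 0) whose only free bank is at
# (3, 3) pays 6 hops per access under data-only placement, but if a core
# at (3, 2) is free, co-scheduling moves the thread there and pays 1 hop.
thread_core, bank = nuca_only((0, 0), banks_free=[(3, 3)])
print(hops(thread_core, bank))          # 6 hops
thread_core, bank = co_schedule(cores_free=[(0, 0), (3, 2)],
                                banks_free=[(3, 3)])
print(hops(thread_core, bank))          # 1 hop
```

The point of the sketch is only the structural difference: optimizing one variable (where the data lives) leaves hops on the table that optimizing both variables jointly can recover.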

“Now that the way to improve performance is to add more cores and move to larger-scale parallel systems, we’ve really seen that the key bottleneck is communication and memory accesses,” said MIT professor and author of the paper Daniel Sanchez in a statement. “A large part of what we did in the previous project was to place data close to computation. But what we’ve seen is that how you place that computation has a significant effect on how well you can place data nearby.”

While the hardware CDCS adds to the chip accounts for 1 percent of the chip’s available space, the researchers believe that cost is worth the performance increase.

3 Responses to “MIT researchers claim they have a way to make faster chips”

  1. Let’s say somewhat related, or at least interesting: noticed a presentation Marvell has Monday at ISSCC about going modular and more:
    “If chip-design engineers had also looked at the financial optimizations of the overall process, they would have built things differently. They should have realized that certain functions are better grouped into highly specialized integrated circuits that could easily and seamlessly talk to each other without compromising the overall system costs. The key to making this happen is what I call the Lego Block approach of designing integrated circuits. However, in order for the Lego Block approach to materialize, we need to change the way we architect our devices. We need to do many things. Define a new chip-to-chip interconnect protocol, take advantage of multi-chip-module packaging and high-speed SerDes technology, redefine the memory hierarchy to take advantage of 3D solid-state memory instead of blindly increasing the DRAM size in our devices, repartition DRAM to serve different logical functions instead of building gigantic single-die DRAM to serve every function, change the way we build DRAM so that they are optimized more for performance and power efficiency instead of capacity, and redefine what should be done in hardware versus software. In short, we need to change our way of thinking and be brave enough to reject common wisdom. If we fail to take action, soon we will no longer see cost savings. On the other hand, if we succeed, we will see life beyond the end of Moore’s Law”

    • This got me thinking about an active interposer, or maybe a better name would be smart interposer, to move the “system agent” and the interconnect onto it, and maybe make it programmable so it can adapt to various configurations and receive software upgrades at any time. Know next to nothing about this so no clue who might be working on that and how close they could be. Maybe Cypress should explore that since they do programmable chips (PSoC) and their subsidiary Deca Tech does interposers.