While we waste four cores, scientists use a million at a time

Chances are, the quad-core processor powering your desktop computer or high-end laptop is vastly underworked. But it’s not your fault: Writing code that executes in parallel is difficult, so most consumer applications (save for some compute-intensive video games that really need help, for example) continue to run on just one core at a time. Which makes it all the more impressive that a group of Stanford researchers recently ran a jet-engine-noise simulation across 1 million cores simultaneously.

As anyone even casually familiar with parallel processing knows, running applications across more nodes means jobs execute faster because they’re able to share the computing workload. The more cores, the faster it runs. This is what makes Hadoop, for example, so great at processing large chunks of data. The MapReduce framework on which it’s based divvies up the work across nodes, and everything they find is stitched back together as the result of a job.
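To make the divide-and-stitch idea concrete, here is a minimal sketch in Python of a MapReduce-style word count. It is not Hadoop code: the function names (`map_chunk`, `merge_counts`, `word_count`) and the sample text are made up for illustration, and a thread pool stands in for the cluster nodes that a real job would run on.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_chunk(chunk):
    """Map step: count the words in one chunk of text."""
    return Counter(chunk.split())


def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def word_count(chunks, workers=4):
    """Count words across all chunks, one map task per chunk."""
    # Assumption: threads play the role of nodes. A real framework would
    # ship each chunk to a separate machine or process instead.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    # Stitch the partial results back together into a single answer.
    return reduce(merge_counts, partials, Counter())


# Hypothetical input, already split into three chunks.
chunks = ["big data big", "data big cluster", "cluster node"]
totals = word_count(chunks)
print(totals.most_common())
```

Each chunk is counted independently, so adding workers lets more chunks be processed at once. Only the final merge has to see every partial result.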

But even Hadoop can only scale to tens of thousands of nodes and, because of its focus on “nodes,” isn’t especially good at utilizing multi-core processors to their fullest (expect to hear more about the limitations of Hadoop at our Structure: Data conference March 20-21 in New York). The IBM-built Sequoia supercomputer (housed at Lawrence Livermore National Laboratory) that the Stanford team used consists of 98,304 processors (or nodes), each containing 16 computing cores. That’s a grand total of 1,572,864 cores, and the researchers were able to use the majority of them, which they claim is a record of some sort.

Sequoia, decomposed

But record or not, that’s an incredibly complex undertaking. Programming the jet-engine simulation meant figuring out how to divvy the code into more than a million different tasks that could run across tens of thousands of nodes and 16 cores within each of those nodes. If even one of those processes is buggy, it could slow down or ruin the whole simulation.
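As a rough illustration of what that bookkeeping involves, the Python sketch below maps a numbered task onto a node and a core, and hands each task a contiguous slice of the work. The node and core counts are Sequoia's; everything else (the function names `locate` and `block_bounds`, and the fill-each-node-in-order layout) is an assumption made for this example, not a description of how the Stanford team organized its simulation.

```python
NODES = 98_304                         # processors (nodes) in Sequoia
CORES_PER_NODE = 16                    # computing cores per node
TOTAL_CORES = NODES * CORES_PER_NODE   # 1,572,864


def locate(task_id):
    """Return the (node, core) pair for a task.

    Assumed layout: tasks fill the 16 cores of node 0 first, then node 1,
    and so on.
    """
    if not 0 <= task_id < TOTAL_CORES:
        raise ValueError("task_id out of range")
    return divmod(task_id, CORES_PER_NODE)


def block_bounds(n_items, n_tasks, task_id):
    """Return the [start, stop) range of work items owned by one task.

    Items are split into near-equal contiguous blocks; when the split is
    uneven, the first few tasks each take one extra item.
    """
    base, extra = divmod(n_items, n_tasks)
    start = task_id * base + min(task_id, extra)
    stop = start + base + (1 if task_id < extra else 0)
    return start, stop


print(TOTAL_CORES)             # total cores available
print(locate(17))              # which node and core runs task 17
print(block_bounds(10, 3, 1))  # slice of 10 items owned by task 1 of 3
```

Even this toy version shows why scale is hard: every task has to agree on the same layout, and an uneven split leaves some cores with more work than others.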

Even in the world of supercomputing, where systems now regularly contain hundreds of thousands of cores (some of them special-purpose GPU co-processors), there’s a shortage of programming talent to actually use them all to their fullest potential. As my colleague Stacey Higginbotham explained some time ago, the world of high-performance computing is hurtling toward exascale computing, but a bigger problem than energy consumption might be finding applications that need that much computing power and the algorithms capable of operating at that scale.

Still, the implications of advances in parallel programming are huge — like potentially life-altering huge. This is true not only because of the scientific questions we’ll soon be able to answer at speeds inconceivable even a decade ago, but also because of the computing power we’ll all soon be carrying around in our pockets and purses. If you think those multi-core smartphones and tablets are great now because they can run multiple applications at the same time, just wait until their processors are even bigger and badder and we have more applications — photo- and video-editing, computer-aided design, games and who knows what else — that can actually get the most out of them.