High-performance Computing Isn't Always About Performance

Cray's Jaguar Supercomputer

In data centers and on home PCs, the performance race has essentially stalled. Most people no longer focus on how fast a processor, or a set of processors, runs, because processor performance isn’t as much of a bottleneck as it used to be. Those in the high-performance computing sector, however, haven’t been as eager to give up on performance gains as those running web servers or desktop applications. Scientists and researchers are still trying to crunch huge data sets quickly and break the exaflop barrier.

But the emphasis on performance at all costs in the HPC sector may be changing, driven by power concerns and the ability to run high-performance computing jobs on publicly available clouds such as Amazon’s EC2. While performance is still key, in June last year the TOP500 list of the world’s most powerful supercomputers also began tracking how much performance those supercomputers generated per watt, a measure of energy efficiency. A month earlier, researchers at the Department of Energy’s Lawrence Berkeley National Laboratory proposed building a new type of supercomputer to model climate change, one that would use less-powerful processors that consume less energy.

Currently the average power efficiency of a TOP10 supercomputer is 280 megaflops/watt, up from 228 megaflops/watt six months ago. Average power consumption of a TOP500 system is 386 kilowatts and average power efficiency is 150 megaflops/watt. We’ll see how these numbers change over time.
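For a rough sense of what those efficiency figures mean, here is a minimal Python sketch. It uses only the numbers quoted above; the function names and the one-quadrillion-operation workload are illustrative assumptions, not anything published by TOP500. It relies on the fact that one megaflop per watt is one million floating-point operations per joule.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 million joules


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def energy_kwh(total_ops, megaflops_per_watt):
    """Energy needed to perform total_ops floating-point operations.

    Megaflops/watt equals millions of operations per joule, so
    joules = total_ops / (megaflops_per_watt * 1e6).
    """
    joules = total_ops / (megaflops_per_watt * 1e6)
    return joules / JOULES_PER_KWH


# Figures quoted in the article (megaflops/watt).
top10_previous = 228.0
top10_current = 280.0
top500_average = 150.0

gain = percent_change(top10_previous, top10_current)
print(f"TOP10 efficiency gain over six months: {gain:.1f}%")

# Hypothetical workload: 1e15 floating-point operations.
workload_ops = 1e15
top10_energy = energy_kwh(workload_ops, top10_current)
top500_energy = energy_kwh(workload_ops, top500_average)
print(f"Energy at TOP10 efficiency:  {top10_energy:.2f} kWh")
print(f"Energy at TOP500 efficiency: {top500_energy:.2f} kWh")
```

By this arithmetic the TOP10 average improved by a little under 23 percent in six months, and an average TOP10 system would need roughly half the energy of an average TOP500 system for the same amount of computation.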

On the cloud side, I’ve heard Werner Vogels, the CTO of Amazon, tout EC2 as a fine place to run high-performance computing jobs. But I’ve also heard contrary opinions from the folks who actually run supercomputers. And thanks to John West over at InsideHPC, I read a blog post by Ian Foster, associate division director for Mathematics and Computer Science at Argonne National Laboratory, who posited that perhaps the less-than-stellar performance of a public cloud isn’t such a bad thing, since the researcher gets access to it right away. Foster writes:

“However, before we conclude that EC2 is no good for science, I’d like to suggest that we consider the following question: what if I don’t care how fast my programs run, I simply want to run them as soon as possible? In that case, the relevant metric is not execution time but elapsed time from submission to the completion of execution. (In other words, the time that we must wait before execution starts becomes significant.)”

So in Foster’s scenario, hardware speed is less important than how quickly one can get access to the hardware, a key advantage of external clouds. As companies evaluate moving to cloud computing, a lot has been written about how it changes the underlying economics of providing computing horsepower. But the idea of flexibility is an important one, especially since many companies seem to be pinning their hopes on the emergence of private clouds, which confer fewer economic advantages but do offer agility.
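Foster's metric, elapsed time from submission to completion, is simply queue wait plus execution time. Here is a minimal Python sketch of that comparison. The wait and run times are made-up numbers for illustration only, not measurements from EC2 or any supercomputing center, and `turnaround_hours` is a hypothetical helper name.

```python
def turnaround_hours(queue_wait_hours, run_hours):
    """Elapsed time from job submission to completion."""
    return queue_wait_hours + run_hours


# Hypothetical scenario: a fast machine with a long batch queue
# versus a slower cloud that starts the job almost immediately.
supercomputer = turnaround_hours(queue_wait_hours=24.0, run_hours=2.0)
cloud = turnaround_hours(queue_wait_hours=0.1, run_hours=10.0)

print(f"Supercomputer turnaround: {supercomputer:.1f} hours")
print(f"Cloud turnaround:         {cloud:.1f} hours")

if cloud < supercomputer:
    print("The slower hardware finishes first once queue time is counted.")
else:
    print("The faster hardware still wins on turnaround.")
```

With these assumed numbers, the cloud run is five times slower in execution yet returns results sooner. Shorten the queue wait enough and the conclusion flips, which is why the answer depends on how busy the faster system is.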