On Friday I wrote about the rise of specialty computing clouds and AMD’s efforts to build a supercomputer that will essentially be a graphics rendering cloud. Today, insideHPC points me to a post from Josh Simons over at Sun Microsystems about his trip to the Oak Ridge National Laboratory (home of the Jaguar supercomputer) and its efforts toward cloud-based supercomputing. The lines between scalable corporate computing and high-performance computing have blurred, and Simons’ post details how virtualization could be brought to these petaflop machines to enable a true cloud environment.
The key benefit of virtualization for supercomputing is the ability to keep a job running despite a failed node, and the ways the post describes achieving this resiliency mimic a compute cloud. Beyond resiliency, many small virtual machines can be used to test how an HPC job might scale across the whole machine before devoting precious compute cycles to it. The downside appears to be a slight loss in performance, a massive blow to the ego in an industry where the competition to build and operate the fastest machine can take on the staged drama of a WWF match. However, as HPC moves from scientific to industrial applications, a move to HPC clouds might make sense in some cases.
Jaguar image courtesy of ORNL