The face of high-performance computing is changing. That means new technologies and new names, but also familiar names in new places. Sure, cluster management is still important, but anyone who doesn't have a cloud computing story to tell, and possibly a big data one too, might start looking really old really quickly.
We've been seeing the change happen over the past couple of years, as Amazon Web Services (s amzn) and Hadoop, in particular, have changed the nature of HPC by democratizing access to resources and technologies. AWS did it by making lots of cores available on demand, freeing scientists from the need to buy expensive clusters or wait for time on their organization's system. That story clearly caught on, and even large pharmaceutical companies and space agencies began running certain research tasks on AWS.
Amazon took things a step further by supplementing its standard virtual machines with Cluster Compute Instances built for raw speed. With a 10 GbE backbone, Intel Nehalem processors and the option of Nvidia Tesla GPUs, users can have a Top500-class supercomputer available on demand for a fraction of the cost of buying one. Cycle Computing, a startup that helps customers configure AWS-based HPC clusters, recently launched a 10,000-core offering that costs only $1,060 per hour.
Hadoop, for its part, made Google- or Yahoo-style parallel data processing available to anyone with the ambition to learn how to do it and a few commodity servers to run it on. It's not the be-all and end-all of the big data movement, but Hadoop is certainly driving the ship and has opened mainstream businesses to the promise of advanced analytics. Most organizations have lots of data, some of it not suitable for a database or data warehouse, and tools like Hadoop let them get real value from it if they're willing to put in the effort.
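For the curious, the MapReduce model at Hadoop's core is simple enough to sketch in a few lines of Python. This is a toy, single-machine illustration of the map, shuffle and reduce phases, not how Hadoop itself works; a real Hadoop job distributes each phase across a cluster.

```python
from collections import defaultdict

# A toy, single-process sketch of the MapReduce model Hadoop popularized:
# the classic word-count example.

def map_phase(records):
    """Emit a (word, 1) pair for every word in every input record."""
    for record in records:
        for word in record.split():
            yield (word, 1)

def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}

lines = ["big data big ideas", "data tools for big data"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)  # {'big': 3, 'data': 3, 'ideas': 1, 'tools': 1, 'for': 1}
```

The appeal is that the map and reduce phases are trivially parallelizable: each mapper and reducer can run on a different commodity box, which is exactly what makes the approach work on cheap hardware.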
This change in the way organizations think about obtaining advanced computing capabilities has opened the door for new players that operate at the intersection of HPC, cloud computing and big data.
One relative newcomer to HPC, and one that should give Appistry and everyone else a run for their money, is Microsoft (s msft). It only got into the space in the late '00s, so it didn't have much of a legacy business to disrupt when the cloud took over. In a recent interview with HPCwire, Microsoft HPC boss Ryan Waite details, among other things, an increasingly HPC-capable Windows Azure offering and "the emergence of a new HPC workload, the data intensive or 'big data' workload."
Indeed, Microsoft has been busy trying to accommodate big data workloads. It just launched an Azure-based MapReduce service called Project Daytona, and has been developing its on-premises Hadoop alternative, Dryad, for quite some time.
The latest company to get into the game is Appistry. As I noted when covering its $12 million funding round yesterday, Appistry actually made a natural shift from positioning itself as a cloud software vendor to positioning itself as an HPC vendor. Sultan Meghji, Appistry’s vice president of analytics applications, explained to me just how far down the HPC path the company already has gone.
Probably the most extreme change is that Appistry is now offering its own cloud service for running HPC computational or analytic workloads. It’s based on a per-pipeline pricing model, and today is targeted at the life sciences community. Meghji said the scope will expand, but the cloud service just “soft launched” in May, and life sciences is a new field of particular interest to Appistry.
The new cloud service is built using Appistry’s existing CloudIQ software suite, which already is tuned for HPC on commodity gear thanks to parallel-processing capabilities, “computational storage” (i.e., co-locating processors and relevant data to speed throughput) and Hadoop compatibility.
Appistry is also tuning its software to work with common HPC and data-processing algorithms, as well as some it’s writing itself, and is bringing in expertise in fields like life sciences to help the company better serve those markets.
"Cloud has become, frankly, meaningless," Meghji explained. Appistry had a choice between trying to be heard above the noise of countless other private-cloud offerings and adding distinct value in the areas its software has always suited best. It chose the latter, in part because Appistry's products are best taken as a whole. If you need just cloud, HPC or analytics, Meghji said, Appistry might not be the right choice.
One would be remiss to ignore AWS as a potential HPC heavyweight, too, although it seems content to simply provide the infrastructure and let specialists handle the management. However, its Cluster Compute Instances and Elastic MapReduce service do open the doors for other companies, such as Cycle Computing, to make their mark on the HPC space by leveraging that readily available computing power.
The old guard gets it
But the emergence of new vendors isn’t to say that mainstay HPC vendors were oblivious to the sea change. Many, including Adaptive Computing and Univa UD, have been particularly willing to embrace the cloud movement.
Platform Computing has really been making a name for itself in this new HPC world. It recently outperformed the competition in Forrester Research's comparison of private-cloud software offerings, and its ISF software powers SingTel's nationwide cloud service. Spotting an opportunity to cash in on the hype around Hadoop, Platform also has turned its attention to big data with a management product that's compatible with a number of other data-processing frameworks and storage engines.
Whoever the vendor, though, there's lots of opportunity. That's because the new HPC opens the door to an endless pipeline of new customers and new business ideas that could never justify buying a supercomputer or developing a MapReduce implementation, but that can enter a credit-card number or buy a handful of commodity servers with the best of them.