Is Virtualization a Cloud Prerequisite?

With the annual VMworld confab coming next week, we are bound to come away thinking that virtualization is the guiding light to take us to the promised land of cloud computing. The reality is different.

Virtualization and cloud computing aren’t always the answer. But we are slowly learning what applications work best in different types of environments. Let’s be clear: virtualization and cloud computing are two distinct, equally game-changing technologies. But given the hype and enthusiasm around them, it is always healthy to get a refreshing reminder of why they may not always be the best approach.

For instance, Basecamp, a popular web-based collaboration tool, released data in July showing that virtualization was slowing it down significantly compared with running on dedicated hardware. The full details are in a well-documented summary on its blog. Here are some highlights:

  • For months, we’ve been running our applications on virtualized instances.
  • For some time I’ve wanted to run some tests to see what the current performance of Basecamp on the virtualized instances was vs. the performance of dedicated hardware.
  • We host all our infrastructure with Rackspace.
  • To make a long story a little less long, we saw some pretty extreme performance improvements from moving Basecamp out of a virtualized environment and back onto dedicated hardware.
  • We were able to cut response times to about one-third of their previous levels even when handling over 20 percent more requests per minute.
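
To make the last excerpt concrete, here is a minimal arithmetic sketch. The baseline figures (300 ms, 1,000 requests per minute) are hypothetical and chosen only for illustration; the excerpts above give ratios ("about one-third", "over 20 percent"), not absolute numbers, and the function name is likewise made up.

```python
def after_move(baseline_ms, baseline_rpm, speedup=3.0, load_increase=0.20):
    """Apply the reported ratios to a baseline.

    speedup=3.0 models "response times cut to about one-third";
    load_increase=0.20 models "over 20 percent more requests per minute".
    Both are approximations of the quoted claims, not measured values.
    """
    new_ms = baseline_ms / speedup
    new_rpm = baseline_rpm * (1 + load_increase)
    return new_ms, new_rpm


# Hypothetical baseline, not taken from the Basecamp post.
new_ms, new_rpm = after_move(baseline_ms=300.0, baseline_rpm=1000.0)
print(f"{new_ms:.0f} ms per request at {new_rpm:.0f} requests/minute")
```

The point of the sketch is that both numbers improved at once: latency dropped while load rose, so the gain cannot be explained by lighter traffic on the dedicated hardware.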

This is only one instance compared with hundreds that have benefited extensively from virtualization. But in the case of newly designed web applications that can be configured for multithreading and deployed relatively easily with tools like Chef, any perceived virtualization benefits can be outweighed by a severe performance penalty.

When the Cloud’s Not the Answer

In late June, John Adams from Twitter operations gave an exceptional talk on “Fixing Twitter: Improving the Performance and Scalability of the World’s Most Popular Micro-blogging Site” (see video). In the presentation, John says cloud computing is not an option for Twitter. It’s not irrelevant for all — it’s just not practical for his company’s needs. Here are some excerpts from the video:

  • We really want raw processing power.
  • We want to be able to look at the metrics of the system and accurately predict what is going to happen.
  • There is a place for clouds, and I’m sure that they work for a lot of people.
  • Given the scale that we are under and the amount of traffic that we want to process it [cloud computing] is currently not working.
  • We’d rather spend the time working on the real, hard computer science problems that we are under.

Twitter operations manages back-end software performance, availability, metrics-driven capacity planning and configuration management. But it does not oversee the physical plant, and instead relies on NTT for managed services including the server and network deployment. This isn’t really cloud computing, and I’d hesitate to call this infrastructure as a service because that implies the presence of a virtualization layer between the service and the infrastructure (which is exactly what happens with Amazon EC2). This is more like infrastructure leased, not bought, and with a full-time mechanic ready to do the racking and stacking. For Twitter, it works, and it showcases the need to explore options before a wholesale leap to the cloud.

So virtualization and cloud-based infrastructure as a service are not universal answers. At times, more efficient, streamlined access to raw hardware simply provides the best bang for the buck. For those interested in another perspective, Stacey recently spoke with Erich Clementi, the head of IBM’s cloud computing efforts, about Big Blue’s cloud strategy. Clementi pointed out, “Many people equate cloud computing to virtualization. It is not virtualization.” He further said:

Google is not virtualized, and virtualization is not sufficient to qualify as a cloud — it’s a better use of the physical infrastructure, but the real bang comes from modeling out the whole data center and taking energy, labor, software and hardware and acting on all those levels. It’s like Google’s idea that the data center has become the computer.

I’d second Clementi’s comment about modeling the data center. But let’s not forget about modeling the application, the workload, and their impact on infrastructure. Virtualization and cloud computing are here to stay, and provide compelling benefits. At the same time, smart software on raw hardware can be an equally compelling proposition.