Blog Post

Is Virtualization a Cloud Prerequisite?

With the annual VMworld confab coming next week, we are bound to come away thinking that virtualization is the guiding light to take us to the promised land of cloud computing. The reality is different.

Virtualization and cloud computing aren’t always the answer. But we are slowly learning what applications work best in different types of environments. Let’s be clear — virtualization and cloud computing are two distinct, equally game-changing technologies. But given the hype and enthusiasm around them, it is always healthy to get a refreshing dose of why they may not always be the greatest approaches.

For instance, Basecamp, a popular web-based collaboration tool, released data in July showing that virtualization was slowing it down significantly compared with running on dedicated hardware. The full details are in a well-documented summary on its blog. Here are some highlights:

  • For months, we’ve been running our applications on virtualized instances.
  • For some time I’ve wanted to run some tests to see what the current performance of Basecamp on the virtualized instances was vs. the performance of dedicated hardware.
  • We host all our infrastructure with Rackspace.
  • To make a long story a little less long, we saw some pretty extreme performance improvements from moving Basecamp out of a virtualized environment and back onto dedicated hardware.
  • We were able to cut response times to about one-third of their previous levels even when handling over 20 percent more requests per minute.
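To put the last two figures together, here is a rough back-of-the-envelope sketch using Little's Law (average requests in flight = arrival rate × average response time). The baseline request rate and response time are hypothetical placeholders, not numbers from the Basecamp summary. Only the two ratios (about one-third the response time, over 20 percent more requests) come from the quote.

```python
# Little's Law: average requests in flight = arrival rate * average response time.
# The baseline values below are hypothetical; only the ratios come from the quoted summary.

def in_flight(requests_per_minute: float, response_time_s: float) -> float:
    """Average number of concurrent requests implied by Little's Law."""
    return (requests_per_minute / 60.0) * response_time_s

baseline_rpm = 6000.0        # hypothetical load on the virtualized setup
baseline_latency_s = 0.300   # hypothetical average response time

dedicated_rpm = baseline_rpm * 1.20            # "over 20 percent more requests per minute"
dedicated_latency_s = baseline_latency_s / 3   # "about one-third of their previous levels"

before = in_flight(baseline_rpm, baseline_latency_s)
after = in_flight(dedicated_rpm, dedicated_latency_s)
ratio = after / before

print(f"Requests in flight, virtualized: {before:.1f}")
print(f"Requests in flight, dedicated:   {after:.1f}")
print(f"Ratio (dedicated / virtualized): {ratio:.2f}")
```

Whatever baseline is plugged in, the ratio works out to 1.2 × 1/3 = 0.4: under these assumptions the dedicated hardware serves more traffic while holding roughly 40 percent as many requests open at any moment.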

This is only one case set against the hundreds of organizations that have benefited extensively from virtualization. But for newly designed web applications that can be configured for multithreading and deployed relatively easily with tools like Chef, the perceived benefits of virtualization can be offset by a severe performance penalty.

When the Cloud’s Not the Answer

In late June, John Adams from Twitter operations gave an exceptional talk on “Fixing Twitter: Improving the Performance and Scalability of the World’s Most Popular Micro-blogging Site” (see video). In the presentation, John says cloud computing is not an option for Twitter. It’s not irrelevant for all — it’s just not practical for his company’s needs. Here are some excerpts from the video:

  • We really want raw processing power.
  • We want to be able to look at the metrics of the system and accurately predict what is going to happen.
  • There is a place for clouds, and I’m sure that they work for a lot of people.
  • Given the scale that we are under and the amount of traffic that we want to process it [cloud computing] is currently not working.
  • We’d rather spend the time working on the real, hard computer science problems that we are under.

Twitter operations manages back-end software performance, availability, metrics-driven capacity planning and configuration management. But it does not oversee the physical plant, and instead relies on NTT for managed services including the server and network deployment. This isn’t really cloud computing, and I’d hesitate to call this infrastructure as a service because that implies the presence of a virtualization layer between the service and the infrastructure (which is exactly what happens with Amazon EC2). This is more like infrastructure leased, not bought, and with a full-time mechanic ready to do the racking and stacking. For Twitter, it works, and it showcases the need to explore options before a wholesale leap to the cloud.

So virtualization and cloud-based infrastructure as a service are not universal answers. At times, more efficient, streamlined access to raw hardware simply provides the best bang for the buck. For those interested in another perspective, Stacey recently spoke with Erich Clementi, the head of IBM’s cloud computing efforts, about Big Blue’s cloud strategy. Clementi pointed out, “Many people equate cloud computing to virtualization. It is not virtualization.” He further said:

Google is not virtualized, and virtualization is not sufficient to qualify as a cloud — it’s a better use of the physical infrastructure, but the real bang comes from modeling out the whole data center and taking energy, labor, software and hardware and acting on all those levels. It’s like Google’s idea that the data center has become the computer.

I’d second Clementi’s comment about modeling the data center. But let’s not forget about modeling the application, the workload and their impact on infrastructure. Virtualization and cloud computing are here to stay, and provide compelling benefits. At the same time, smart software on raw hardware can be an equally compelling proposition.

18 Responses to “Is Virtualization a Cloud Prerequisite?”

  1. IMO virtualization is not an essential, defining characteristic of cloud computing – but it is an enabling technology. Cloud computing is about the flexibility to create, destroy and relocate resources on demand. One can do that with a pool of physical servers, but greater provisioning difficulty and whole-machine granularity make the result a lot less appealing to users. Many cloud providers offer instances as low as a single 2GHz core with 256MB memory and 10GB of storage, which can be handy for testing or for the lower-horsepower parts of a multi-system complex, but it’s not very economical to offer such small instances without virtualization. Even though cloud computing and virtualization don’t strictly have to go together, there is a certain synergy between them.

  2. Great comments all around! Let me add some additional insight from our customer activities to help frame the discussion: only 10% of the private IaaS cloud pilots we’re involved with are using non-virtual environments. BUT: of the 90% that are starting with virtualized use cases, 80% are planning to add non-virtual apps into their cloud. So customers are generally starting with the virtualized low-hanging fruit to prove the cloud business model (with additional capabilities such as dynamic provisioning, self-service, service-offering definition, contracts, and billing) – and then expanding their infrastructure offerings to serve a broader set of application groups.

    So I emphatically agree that virtualization is not cloud – and this is why architects are seeking out technology-agnostic cloud management solutions that can allocate a variety of application workloads across any collection of hardware, operating systems, virtual machines, storage and networking. This is the only way to maximize existing resources and future infrastructure investments while avoiding vendor lock-in.
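As a quick check on what those percentages imply, here is a small sketch of the arithmetic. It assumes (this is not stated in the comment) that the 80% figure applies to the virtualized-first pilots as a share of pilots, and that the pilots already running non-virtual environments keep doing so. The function name is illustrative only.

```python
# Share of pilots that would end up running some non-virtual workloads,
# assuming the quoted percentages describe shares of pilots (an assumption).

def share_with_non_virtual(non_virtual_now: float, virtual_planning_to_add: float) -> float:
    """Combine pilots already non-virtual with virtualized pilots planning to add non-virtual apps."""
    virtual_now = 1.0 - non_virtual_now
    return non_virtual_now + virtual_now * virtual_planning_to_add

share = share_with_non_virtual(non_virtual_now=0.10, virtual_planning_to_add=0.80)
print(f"Pilots expected to include non-virtual workloads: {share:.0%}")
```

Under those assumptions, 10% plus 80% of the remaining 90% comes to roughly four out of five pilots eventually including non-virtual workloads.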

  3. A lot of the answers to these questions are not of a technical nature.

    Virtualisation can mean different things to different people in different scenarios; however, placed in the hands of the marketing department, the term becomes interchangeable.

    So you could argue that a cloud based service is virtual in that it lives “in the cloud” and not in your office. This has nothing to do with stuff like VMware.

    But the problem is that the IT world went for a number of years without anything really very new.

    VMware (and others) came along with something rather clever that met a real need; however, the marketing people (and others) hyped it to a level unseen for many years. In reality this kind of virtualisation is just a tool that meets some real IT challenges.

    But the hype started.

    So then along came the likes of Citrix, who brought us desktop virtualisation (VDI). Again this got caught up in the virtualisation hype (with a fair degree of bandwagon jumping), and suddenly a technology that Citrix (and others) had been pushing for many years was the next great thing again.

    But just as the hypervisor was about to become a commodity, the cloud came along and again the marketing team saw a new opportunity and a way to save their margins.

    Let us not forget that the biggest player in the hypervisor space is also owned by a company famed for the qualities of its marketing engine, which has for many years been able to persuade big business to buy overpriced disks that are no better than anybody else’s.

    Virtualisation has many meanings; however, all those marketing departments have a vested interest in seeing those meanings mixed together in order to maintain margins on something that should be a commodity.

    Can you tell why a certain vendor has not invited me to San Francisco this week?

  4. Dan Graham

    Cloud computing does NOT require the use of virtualization. It may seem that it does because so many public and private clouds depend on virtualization tools. But these are two distinct markets and technologies with wondrous futures ahead of them.

    But consider the Software-as-a-Service [SaaS] cloud vendors. Do all SaaS vendors use virtualization? Most do, and any one of them that needs to truly scale to huge server farms probably does. But the startups don’t need virtualization – they have more important issues on their minds. And it’s entirely feasible to buy a huge server and storage farm to support a few dozen multi-tenant subscriber companies without any virtualization software.

    Let’s say that a business user needs a database for analytics (a data mart). Imagine the user fills in a browser screen with what they need, and the application locates a server-storage pair, allocates it for them and hands them back a URL link. The user could then start proof-of-concept testing or sandbox analytics, never knowing where or what the database is running on. This would be an internal private cloud service. The database itself is an abstraction somewhere out in a private cloud. Like the SaaS vendor, it could be one of a few dozen databases on one big server and storage farm. The database is a virtual concept, probably running on a multi-tenant server, all handled via self-service capacity on demand. Some database products merely partition an existing database for multitenancy and, again, the user or developer need not know.

    Virtualization is, of course, useful when managing huge server and storage farms. As cloud servers start scaling up, virtualization is often what helps IT preserve their sanity and keep costs down. I’m 100% in favor of virtualization. But I have to agree with the IBM comment. “Many people equate cloud computing to virtualization. It is not virtualization.”

  5. Anonymous

    One thing missed here is that virtualization provides a flexible, evolutionary approach to cloud computing that eliminates the need to rewrite code for closed platforms from the likes of Microsoft, Amazon and Google. VMware gives you a way to build your own private cloud right now, run the apps you already have, and use the hardware you already own, or hosted infrastructure, or both.

    That said, virtualization, like any layer of abstraction, does incur latency, there’s no doubt about it.

      • If your application doesn’t contain tightly hardware-dependent code, it can run smoothly in a virtualized environment such as VMware or EC2.

        But virtualization adds another layer of complexity. That is why Twitter and other high-traffic systems don’t buy into it. In general, cloud computing pricing is higher than dedicated hosting, which is another reason for Twitter not to go to the cloud.

  6. Gary asks the question, “Is Virtualization a Cloud Prerequisite?” But where is it addressed in this article? Instead, Gary makes the fallacious remark, “I’d hesitate to call this infrastructure as a service because that implies the presence of a virtualization layer.”

    The article starts off interesting and correct with the statement, “Let’s be clear — virtualization and cloud computing are two distinct, equally game-changing technologies.” While I’m not convinced virtualization is really game-changing, it doesn’t have anything to do with cloud computing. Service providers like offer cloud computing with no virtualization whatsoever, so the problems posed by Basecamp and Twitter have absolutely nothing to do with cloud computing and shouldn’t even be mentioned in an article on cloud computing. They had problems with virtualization ONLY! What are these examples doing in this article?

  7. Infrastructure virtualization never promised performance. It’s about legacy software isolation (much software is designed to run on a particular platform), improved utilization (in spite of isolation) and ease of management. Legacy software has a surprisingly long life span, and virtualization is the only way to manage it on modern hardware and operating systems.

    Cloud computing is virtualization of the data center in a broader sense: computing resources behind a multi-level API, with infrastructure virtualization at the lowest level and SaaS at the highest.

    Virtualization in its narrow sense is definitely not a prerequisite for cloud computing. It’s one service a cloud platform can provide when it makes sense.

  8. I think the real point of virtualization is managing server sprawl: instead of buying and using lots and lots of underutilized machines, you can leverage existing infrastructure by hosting virtual machines instead. It’s no secret that if you really want performance or vertical scalability, you write your application to leverage as much of the hardware as possible. However, if you want to be able to handle spikes and grow as seamlessly as you can without investing heavily in capital expenditure, that’s where putting your application on the cloud makes sense.

    Think about it this way: suppose you were to deploy a web application with no idea what kinds of loads it will be facing at any given time (say, a Twitter clone). Today you’d want to write it in a manner that makes the most of the hardware and then deploy your solution in a horizontally scalable manner. To be cost effective and to be able to handle a surge (or multiple surges), you’d want to put it on something like Amazon’s EC2 and set up automatic scaling. Now if your service is mature enough and is “profitable”, then you can consider dedicated hardware to handle most of the load. Then when you experience spikes, you can offload to the cloud for expansion.
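The "dedicated hardware for the baseline, cloud for the spikes" arrangement described above can be sketched as a simple sizing calculation. This is a minimal illustration, not a capacity-planning method from the comment: the function name, the per-server throughput figures and the traffic numbers are all hypothetical.

```python
import math

def cloud_instances_needed(peak_rps: float,
                           dedicated_servers: int,
                           rps_per_dedicated: float,
                           rps_per_cloud_instance: float) -> int:
    """Number of cloud instances needed to absorb traffic beyond dedicated capacity.

    All throughput figures are hypothetical inputs supplied by the caller.
    """
    dedicated_capacity = dedicated_servers * rps_per_dedicated
    overflow = max(0.0, peak_rps - dedicated_capacity)
    return math.ceil(overflow / rps_per_cloud_instance)

# Hypothetical fleet: 10 dedicated servers at 500 requests/second each,
# with virtualized cloud instances assumed to handle 150 requests/second each.
normal_day = cloud_instances_needed(4000, 10, 500, 150)
spike = cloud_instances_needed(6250, 10, 500, 150)

print(f"Cloud instances on a normal day: {normal_day}")
print(f"Cloud instances during a spike:  {spike}")
```

The lower per-instance figure for cloud machines is only a stand-in for the virtualization overhead discussed in the article. In practice both numbers would have to come from load tests on the actual application.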

    Enterprises that deal primarily with data-parallel workloads would want to go the cloud route because, unless you run massively data-parallel batch jobs 24×7, you don’t want to be stuck with hardware that’s not fully utilized. In that situation you’d also want a virtualization strategy to leverage the underutilized hardware you already have.

  9. OldITGuy

    Having moved from enterprise IT to the consumer software market about six years ago, I’ve somehow missed the rationale for the high degree of virtualization being used in corporate data centers today. Perhaps one of the other GigaOM users can enlighten me. Seriously, wasn’t it the job of the OS (e.g. Windows, Linux, Solaris) to create essentially safe and exclusive environments for multiple users, allowing them to share common resources such as storage, processing, networking, etc.?

    I can understand if you want to run multiple OSes on common hardware, but why run so many instances of the same OS? Is it due to a lack of security and management at the OS layer or is there another reason that I’m just missing?