With the annual VMworld confab coming next week, we are bound to come away thinking that virtualization is the guiding light to take us to the promised land of cloud computing. The reality is different.

Virtualization and cloud computing aren’t always the answer, but we are slowly learning which applications work best in which environments. Let’s be clear — virtualization and cloud computing are two distinct, equally game-changing technologies. Given the hype and enthusiasm around them, though, it is always healthy to get a reminder of why they may not be the best approach in every case.

For instance, Basecamp, a popular web-based collaboration tool, released data in July showing that virtualization was slowing it down significantly compared with running on dedicated hardware. The full details are in a well-documented summary on its blog. Here are some highlights:

  • For months, we’ve been running our applications on virtualized instances.
  • For some time I’ve wanted to run some tests to see what the current performance of Basecamp on the virtualized instances was vs. the performance of dedicated hardware.
  • We host all our infrastructure with Rackspace.
  • To make a long story a little less long, we saw some pretty extreme performance improvements from moving Basecamp out of a virtualized environment and back onto dedicated hardware.
  • We were able to cut response times to about one-third of their previous levels even when handling over 20 percent more requests per minute.

This is only one example against the hundreds of deployments that have benefited extensively from virtualization. But for newly designed web applications that can be configured for multithreading and deployed with relative ease using tools like Chef, the perceived benefits of virtualization can be offset by a severe performance penalty.

When the Cloud’s Not the Answer

In late June, John Adams from Twitter operations gave an exceptional talk on “Fixing Twitter: Improving the Performance and Scalability of the World’s Most Popular Micro-blogging Site” (see video). In the presentation, John says cloud computing is not an option for Twitter. That doesn’t make it irrelevant for everyone; it’s just not practical for his company’s needs. Here are some excerpts from the video:

  • We really want raw processing power.
  • We want to be able to look at the metrics of the system and accurately predict what is going to happen.
  • There is a place for clouds, and I’m sure that they work for a lot of people.
  • Given the scale that we are under and the amount of traffic that we want to process, it [cloud computing] is currently not working.
  • We’d rather spend the time working on the real, hard computer science problems that we are under.

Twitter operations manages back-end software performance, availability, metrics-driven capacity planning and configuration management. But it does not oversee the physical plant; instead, it relies on NTT for managed services, including server and network deployment. This isn’t really cloud computing, and I’d hesitate to call it infrastructure as a service, because that term implies a virtualization layer between the service and the infrastructure (which is exactly what you get with Amazon EC2). This is more like infrastructure leased rather than bought, with a full-time mechanic ready to do the racking and stacking. For Twitter it works, and it showcases the need to explore options before making a wholesale leap to the cloud.

So virtualization and cloud-based infrastructure as a service are not universal answers. At times, more efficient, streamlined access to raw hardware simply provides the best bang for the buck. For those interested in another perspective, Stacey recently spoke with Erich Clementi, the head of IBM’s cloud computing efforts, about Big Blue’s cloud strategy. Clementi pointed out, “Many people equate cloud computing to virtualization. It is not virtualization.” He further said:

Google is not virtualized, and virtualization is not sufficient to qualify as a cloud — it’s a better use of the physical infrastructure, but the real bang comes from modeling out the whole data center and taking energy, labor, software and hardware and acting on all those levels. It’s like Google’s idea that the data center has become the computer.

I’d second Clementi’s comment about modeling the data center. But let’s not forget about modeling the application, the workload, and their impact on infrastructure. Virtualization and cloud computing are here to stay, and they provide compelling benefits. At the same time, smart software on raw hardware can be an equally compelling proposition.

Comments

  1. Having moved from Enterprise IT to the consumer software market about six years ago, I’ve somehow missed the rationale for the high degree of virtualization being used in corporate data centers today. Perhaps one of the other GigaOM users can enlighten me. Seriously, wasn’t the job of the OS (i.e., Windows, Linux, Solaris) to create essentially safe and exclusive environments for multiple users, allowing them to share common resources such as storage, processing, networking, etc.?

    I can understand if you want to run multiple OSes on common hardware, but why run so many instances of the same OS? Is it due to a lack of security and management at the OS layer, or is there another reason that I’m just missing?

  2. I think the point of virtualization is really managing server sprawl — instead of buying and running lots and lots of underutilized machines, you can leverage existing infrastructure by hosting virtual machines instead. It’s no secret that if you really want performance or vertical scalability, you write your application to leverage as much of the hardware as possible. However, if you want to handle spikes and grow as seamlessly as you can without investing heavily in capital expenditure, that’s where putting your application on the cloud makes sense.

    Think about it this way: suppose you were to deploy a web application whose load you cannot predict at any given time (say, a Twitter clone). Today you’d want to write it in a manner that makes the most of the hardware and then deploy it in a horizontally scalable manner. To be cost-effective and able to handle a surge (or multiple surges), you’d want to put it on something like Amazon’s EC2 and set up automatic scaling (a minimal sketch of this follows at the end of this comment). Once your service is mature enough and “profitable,” you can consider dedicated hardware to handle most of the load. Then, when you experience spikes, you can offload to the cloud for expansion.

    Enterprises that deal primarily with data-parallel workloads would like to go the cloud route, because unless you run massively data-parallel batch jobs 24×7, you don’t want to be stuck with hardware that’s not fully utilized. In that situation you’d also want a virtualization strategy to leverage the underutilized hardware you already have.
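
    To make that surge-handling setup concrete, here is a minimal sketch using Amazon’s boto3 SDK. The group name, AMI ID, instance type and scaling thresholds are placeholder assumptions, not anyone’s real configuration:

        # Minimal EC2 auto-scaling sketch (boto3); all names are placeholders.
        import boto3

        autoscaling = boto3.client("autoscaling")

        # Launch configuration: what each newly added instance looks like.
        autoscaling.create_launch_configuration(
            LaunchConfigurationName="web-app-lc",
            ImageId="ami-12345678",   # placeholder AMI with the app baked in
            InstanceType="m5.large",
        )

        # The group grows and shrinks between MinSize and MaxSize with load.
        autoscaling.create_auto_scaling_group(
            AutoScalingGroupName="web-app-asg",
            LaunchConfigurationName="web-app-lc",
            MinSize=2,    # baseline capacity, akin to dedicated hardware
            MaxSize=20,   # headroom for traffic surges
            AvailabilityZones=["us-east-1a", "us-east-1b"],
        )

        # Target tracking: add or remove instances to hold ~60% average CPU.
        autoscaling.put_scaling_policy(
            AutoScalingGroupName="web-app-asg",
            PolicyName="cpu-target-60",
            PolicyType="TargetTrackingScaling",
            TargetTrackingConfiguration={
                "PredefinedMetricSpecification": {
                    "PredefinedMetricType": "ASGAverageCPUUtilization"},
                "TargetValue": 60.0,
            },
        )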

  3. Infrastructure virtualization never promised performance. It’s about legacy software isolation (much software is designed to run on a particular platform), improved utilization (in spite of isolation) and ease of management. Legacy software has a surprisingly long life span, and virtualization is the only way to manage it on modern hardware and operating systems.

    Cloud computing is virtualization of the data center in a broader sense: computing resources behind a multi-level API, with infrastructure virtualization at the lowest level and SaaS at the highest (a toy sketch of that layering follows at the end of this comment).

    Virtualization in its narrow sense is definitely not a prerequisite for cloud computing. It’s one service a cloud platform can provide when it makes sense.
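
    As a toy illustration of that layering (hypothetical names only, not any real cloud API), each level exposes resources through its own interface, and infrastructure virtualization is just one possible implementation of the bottom layer:

        # Hypothetical multi-level cloud API: each layer abstracts the one below.
        class Infrastructure:
            """Lowest level: machines, which may be virtual or dedicated."""
            def provision_machine(self, cpus: int, ram_gb: int) -> str:
                # Could hand back a VM or a bare-metal box; callers can't tell.
                return "machine-42"

        class Platform:
            """Middle level: runtimes and databases on top of infrastructure."""
            def __init__(self, infra: Infrastructure):
                self.infra = infra

            def deploy_app(self, name: str) -> str:
                self.infra.provision_machine(cpus=4, ram_gb=8)
                return f"https://{name}.example.com"  # the host stays hidden

        class SaaS:
            """Highest level: a finished application consumed as a service."""
            def __init__(self, platform: Platform):
                self.url = platform.deploy_app("crm")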

  4. [...] Is Virtualization a Cloud Prerequisite? [...]

  5. Gary asks the question, “Is Virtualization a Cloud Prerequisite?” But where is it addressed in this article? Instead, Gary makes the fallacious remark, “I’d hesitate to call this infrastructure as a service because that implies the presence of a virtualization layer.”

    The article starts off interesting and correct with the statement, “Let’s be clear — virtualization and cloud computing are two distinct, equally game-changing technologies.” While I’m not convinced virtualization is really game-changing, it doesn’t have anything to do with cloud computing. Service providers like NewServers.com offer cloud computing with no virtualization whatsoever, so the problems posed by Basecamp and Twitter have absolutely nothing to do with cloud computing and shouldn’t even be mentioned in an article on cloud computing. They had problems with virtualization ONLY! What are these examples doing in this article?

  6. One thing missed here is that virtualization provides a flexible, evolutionary approach to cloud computing that eliminates the need to rewrite code for closed platforms from the likes of Microsoft, Amazon and Google. VMware gives you a way to build your own private cloud right now, run the apps you already have, and use the hardware you already own, or hosted infrastructure, or both.

    That said, virtualization, like any layer of abstraction, does incur latency; there’s no doubt about it.

    1. How does virtualization help eliminate rewriting code?

      1. If your code isn’t tightly coupled to the hardware, it can run smoothly in a virtualized environment such as VMware or EC2.

        But virtualization adds another layer of complexity. That is why Twitter and other high-traffic systems don’t buy it. In general, cloud computing pricing is higher than dedicated hosting, which is another reason for Twitter not to go cloud.

  7. It sounds like the same argument for keeping C++ or even assembly.

    Some want control, while most want productivity.

  8. [...] than trust an outside company. Plus, some companies simply don’t have the computational needs that tech firms have, so cloud computing is simply not as [...]

  9. Cloud computing does NOT require the use of virtualization. It may seem that it does, because so many public and private clouds depend on virtualization tools. But these are two distinct markets and technologies with wondrous futures ahead of them.

    But consider the Software-as-a-Service (SaaS) cloud vendors. Do all SaaS vendors use virtualization? Most do, and any that need to truly scale to huge server farms probably do. But the startups don’t need virtualization – they have more important issues on their minds. And it’s entirely feasible to buy a huge server and storage farm to support a few dozen multi-tenant subscriber companies without any virtualization software.

    Let’s say that a business user needs a database for analytics (a data mart). Imagine the user fills in a browser screen with what they need, and the application locates a server-storage pair, allocates it for them and hands them back a URL. The user could then start proof-of-concept testing or sandbox analytics, never knowing where or on what the database is running. This would be an internal private cloud service. The database itself is an abstraction somewhere out in a private cloud. Like the SaaS vendor’s setup, it could be one of a few dozen databases on one big server and storage farm. The database is a virtual concept, probably running on a multi-tenant server, all handled via self-service capacity on demand. Some database products merely partition an existing database for multi-tenancy, and again, the user or developer need not know (a minimal sketch of such a service follows at the end of this comment).

    Virtualization is, of course, useful when managing huge server and storage farms. As cloud servers start scaling up, virtualization is often what helps IT preserve their sanity and keep costs down. I’m 100% in favor of virtualization. But I have to agree with the IBM comment. “Many people equate cloud computing to virtualization. It is not virtualization.”
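
    To make that self-service flow concrete, here is a minimal sketch of what such an internal provisioning endpoint might look like. The Flask route, the allocate_pair helper, the pool contents and the returned URL are all hypothetical illustration, not any real product’s API:

        # Hypothetical self-service data-mart provisioning endpoint (Flask).
        # A user posts what they need; the service reserves a server-storage
        # pair from a pool and hands back a URL, hiding where the database runs.
        from flask import Flask, jsonify, request

        app = Flask(__name__)

        # Toy pool of pre-racked server-storage pairs in the private cloud.
        POOL = [{"host": "db-farm-01.internal", "free_gb": 500},
                {"host": "db-farm-02.internal", "free_gb": 200}]

        def allocate_pair(size_gb):
            """Find a pair with enough free storage and reserve the space."""
            for pair in POOL:
                if pair["free_gb"] >= size_gb:
                    pair["free_gb"] -= size_gb
                    return pair["host"]
            return None

        @app.route("/datamarts", methods=["POST"])
        def create_datamart():
            req = request.get_json()
            host = allocate_pair(req.get("size_gb", 50))
            if host is None:
                return jsonify({"error": "no capacity"}), 503
            # The caller only ever sees this URL, never the host behind it.
            return jsonify({"url": "https://datamarts.internal/" + req["name"]}), 201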

