What K-O’ed Skype Last Week

Skype, the Internet telephony service, went on the blink last week, stranding millions who use it for their communication needs. It took more than a day for the service to be restored. In conversation, CEO Tony Bates told me that the problem might lie with some errant Windows clients. Well, make that many errant Windows clients! Today Skype’s Chief Information Officer, Lars Rabbe, offers more details in a blog post.

In a nutshell, Skype says a bug in its Windows client software caused many clients to crash, including those acting as supernodes, which overloaded the remaining supernodes and set off a chain reaction of problems.

On Wednesday, December 22, a cluster of support servers responsible for offline instant messaging became overloaded. As a result of this overload, some Skype clients received delayed responses from the overloaded servers. Because of a bug identified in a version of the Skype for Windows client (version 5.0.0152), the delayed responses from the overloaded servers were not properly processed, causing Windows clients running the affected version to crash.

Around 50 percent of all Skype users globally were running that version of Skype for Windows, and the crashes caused approximately 40 percent of those clients to fail. These clients included 25–30 percent of the publicly available supernodes, which also failed as a result of this problem.
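As a quick back-of-the-envelope check on those figures, here is a small Python sketch. It assumes the 40 percent failure rate applies uniformly to everyone on the affected version, which Skype’s post does not spell out, and the variable names are mine.

```python
# Figures quoted from Skype's post. Treating the crash rate as uniform
# across everyone on the affected version is an assumption.
share_on_affected_version = 0.50   # share of all Skype users globally
crash_rate_on_that_version = 0.40  # share of those clients that failed

# Share of the entire user base whose client crashed.
share_of_all_clients_crashed = share_on_affected_version * crash_rate_on_that_version

print(f"Roughly {share_of_all_clients_crashed:.0%} of all Skype clients crashed")
```

Under that assumption, roughly one in five Skype clients worldwide went down at about the same time.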

I wonder if some of these problems were brought on by the recently introduced, aggressive “forced updates,” which have not gone down well with some users. Voxeo CEO Jonathan Taylor offered up the theory that buggy software pushed onto Windows users was to blame.

If you had the latest Skype for Windows, older versions of Skype for Windows (4.0 versions), Skype for Mac, Skype for iPhone, Skype on your TV, and Skype Connect or Skype Manager for enterprises, you were not initially affected by this problem. However, with 25–30 percent of Skype’s supernodes going down, it quickly became a network-wide problem.

A supernode is important to the P2P network because it takes on additional responsibilities compared to regular nodes, acting like a directory, supporting other Skype clients and establishing connections between them by creating local clusters of several hundred peer nodes per supernode.

Once a supernode has failed, even when restarted, it takes some time to become available as a resource to the P2P network again. As a result, the P2P network was left with 25–30 percent fewer supernodes than normal. This caused a disproportionate load on the remaining available supernodes. A significant proportion of users were also restarting crashed Windows clients at this time. This massively increased the load as they reconnected to the peer-to-peer cloud.
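To get a feel for what “disproportionate load” means here, the following Python sketch estimates the load on each surviving supernode. It is a simplified model, not Skype’s own: it assumes load is spread evenly across the remaining supernodes, and the `load_multiplier` function and its `demand_growth` parameter are hypothetical names introduced for illustration.

```python
def load_multiplier(lost_fraction: float, demand_growth: float = 0.0) -> float:
    """Per-supernode load relative to normal.

    Assumes (hypothetically) that total load is shared evenly by the
    supernodes that are still up. demand_growth models extra traffic,
    such as crashed clients reconnecting (0.0 means no extra traffic).
    """
    if not 0.0 <= lost_fraction < 1.0:
        raise ValueError("lost_fraction must be in [0, 1)")
    return (1.0 + demand_growth) / (1.0 - lost_fraction)


# The 25-30 percent range comes from Skype's post.
for lost in (0.25, 0.30):
    print(f"{lost:.0%} of supernodes lost -> {load_multiplier(lost):.2f}x load each")
```

Even before counting the reconnecting clients, each surviving supernode would carry about 1.33 to 1.43 times its usual load in this model. Any reconnect surge multiplies that further, which is consistent with the chain reaction Rabbe describes.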

To deal with the problem, Skype essentially introduced “thousands of instances” of the Skype software into its P2P network and created temporary supernodes. The biggest lessons learned from this, Rabbe writes, are:

  1. More investment in infrastructure so that the system becomes and stays reliable.
  2. More rigorous testing procedures that don’t let buggy software out into the market.

This is not the first time Skype’s systems have come under pressure because of software bugs. In August 2007, Skype had software problems as well, which in turn caused a flood of log-in requests and crashed the network.


12 Responses to “What K-O’ed Skype Last Week”

  1. A buggy Windows client may be the instigator, and the forced upgrade could have exacerbated the problem. But the lack of some operational procedures is more glaring:
    1. not ensuring that the population of supernodes is diverse (not same OS, not same version of app)
    2. not protecting overloaded supernodes from new additions to the network
    3. not preventing new nodes from being added, which would increase signalling traffic between the supernodes

  2. No, it’s a Skype problem. They rely on their end users to provide the computing power necessary (instead of running centralized servers), and bill you for the privilege when you want to use premium services. A brilliant business model.

      • True but their architecture is arguably more resilient and redundant than any major SP (though maybe their software dev/QA/update processes need some tuning).

        Question: do we know if the temporary supernodes were regular, user-owned computers, just promoted, or if they were in fact Skype owned/rented nodes?

  3. Oh come on, you can’t blame Windows for this one. It was a Skype problem with some badly written Skype code for the Windows client.
    So put down your Apple / Nix soap box and shut up.