Global Index, a technology owned by Skype co-creators Niklas Zennstrom and Janus Friis via their company JoltID, is the fulcrum of leverage in their ongoing dispute with current Skype owner eBay and its potential purchasers. If you were either the buyer or seller in this labyrinthine transaction, you’d likely be tempted to declare, “Let’s just rip out Global Index and use something different.”

Such a move would undoubtedly take the wind out of JoltID’s sails as Skype tries to find a new home outside of eBay. Indeed, many VoIP pundits insist that Session Initiation Protocol (SIP) could be Skype’s savior. But while it’s true that technologies like SIP and its stepchild XMPP achieve a lot of the same goals as Global Index, such an argument ignores the fact that Skype is as successful as it is because it has exponentially better operating economics than the rest of the VoIP industry, and Global Index is the singular reason why.

The Promise of SIP
In 2002, as Zennstrom and Friis were facing a bevy of lawsuits of indeterminate scope and the writing was on the wall as to the profitability prospects of a P2P file-sharing network, many in the then-fledgling VoIP industry were busily attempting to re-architect the telephone network in the Internet’s visage. A number of these services, including two from entrepreneur Jeff Pulver (Vonage and Free World Dialup), used SIP at their core. They were consumer-focused services that, in their own way, attempted to mimic the architecture, business structure and design of the public switched telephone network using Internet Protocol technologies.

SIP is now hugely significant in shaping how many telecom networks are architected. But while many of us initially thought that SIP might herald an era of person-to-person multimedia communications free from the control of large companies, it hasn’t exactly worked out that way.

The Drawbacks of SIP
For telecom companies and their vendors, the draw of SIP was that it could be used to transpose the proprietary SS7 signaling network onto the Internet while allowing the calls themselves to transit IP networks, both at a significant discount to the cost of switched telecom trunking. But even as a client-server phenomenon deployed on the public Internet, SIP is an incomplete solution. On its own it has no way of traversing firewalls or, more importantly, NAT, which is a critical oversight for a protocol created in the late 1990s for the IP address-starved modern Internet.

SIP user agents (such as that software on your computer or that phone on your desk) must also be manually configured to register themselves to a SIP proxy server if users are to be allowed to use them for differing networks. Furthermore, all traffic, addressing and routing decisions in a SIP network are typically handled at the network core or by equipment operated by the service provider. That includes the various workarounds such as STUN that enable folks behind firewalls or using private IP addresses to talk to each other, not to mention derivative (and much cooler) protocols and techniques such as XMPP and Jingle.
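
To illustrate why private addressing trips up a bare SIP deployment, here is a minimal sketch. It assumes a client simply advertises whatever address its own network interface reports; the function names are hypothetical and not part of any SIP library. A private address means nothing to a peer on the public Internet, so a workaround such as STUN or a provider-operated relay has to step in.

```python
import ipaddress


def advertises_private_address(address: str) -> bool:
    """True when the address is in a private range and so is
    unreachable from the public Internet without NAT help."""
    return ipaddress.ip_address(address).is_private


def media_path(caller: str, callee: str) -> str:
    """Illustrative decision only: real clients probe their NAT rather
    than just inspecting the address they were assigned."""
    if advertises_private_address(caller) or advertises_private_address(callee):
        return "needs traversal help"
    return "direct"


print(media_path("192.168.1.20", "8.8.8.8"))
print(media_path("1.1.1.1", "8.8.8.8"))
```

In a conventional SIP network the "traversal help" branch is served by equipment the provider runs, which is where the cost argument in this section comes from.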

If adapting SIP to the vagaries of the modern Internet sounds expensive, it is. By 2006, Vonage had already burned through nearly half a billion dollars. More importantly, SIP architectures are a critically flawed design starting point for a true P2P network. SIP and XMPP networks are really client-server networks masquerading as P2P.

Forgetting about the exponentially more costly business of sending voice or video data across networks, 70 percent of traffic on XMPP instant messaging networks is the result of presence updates. Some estimates are that as much as 60 percent of this information is itself redundant. Servicing the traffic on a network such as Skype’s, which consists of some 45 million daily users, would bury most startups in server and bandwidth expenses. Add to that the actual messages themselves, and having to handle the voice channel or video at the core, and it becomes clear that only the big boys get to play in distributed communications services.
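
As a rough worked example of those two figures, the sketch below multiplies them together. It assumes the 60 percent estimate applies to the presence share of traffic, and the function name is made up for illustration.

```python
def redundant_presence_share(presence_share: float, redundant_share: float) -> float:
    """Fraction of all traffic that is redundant presence data,
    assuming the redundancy estimate applies to presence traffic only."""
    return presence_share * redundant_share


share = redundant_presence_share(0.70, 0.60)
print(f"{share:.0%} of all traffic would be redundant presence updates")
```

Under that assumption, roughly two-fifths of everything a central XMPP operator carries would be presence information that did not need to be sent.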

Enter the Supernodes
Or do they? As it turns out, features like instant messaging, voice/video chat, and presence management are ideal applications for the technology that Skype’s founders had been playing with for years in the P2P world. The networks using their technology when Skype was founded in September 2002, Kazaa and Morpheus, handled massive volumes of data between peers with no real central core to speak of, but still significant domain control by FastTrack, the company created by Zennstrom and Friis to license technology using the same moniker. The key concept exploited by the services derivative of this technology is the distributed, auto-discovering, self-healing node-supernode model tied together by PKI encryption.

On any of the other VoIP and IM networks such as Gizmo or iChat, each user is a node: a logical endpoint in a cloud that connects to host computers operated by the service. But with Kazaa, Skype and even Joost, a small percentage of each service’s users unwittingly conspire to provide the network’s backbone in the form of supernodes. Which means that if you have some combination of a permissive firewall, really good port-forwarding on your router and a public IP address on your computer, you, too, can be a Skype supernode. When it comes to traversing firewalls, NAT, and handling distributed authentication and presence management, supernodes do all of the heavy lifting.

That is the reason Skype is able to service 45 million daily users on a fraction of the infrastructure that a SIP-based provider like Vonage needs to deploy. The workload that normally would be handled by equipment owned by the company is distributed among the users themselves.
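
The division of labor can be sketched roughly as follows. Skype’s actual selection criteria are proprietary, so the eligibility rule here (reachable either directly or through port forwarding) and the round-robin assignment are assumptions made purely for illustration, as are all of the names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    name: str
    public_ip: bool = False
    permissive_firewall: bool = False
    port_forwarding: bool = False


def can_be_supernode(peer: Peer) -> bool:
    # Assumption: a peer qualifies when inbound connections can reach it,
    # either directly or through port forwarding on its router.
    directly_reachable = peer.public_ip and peer.permissive_firewall
    return directly_reachable or peer.port_forwarding


def assign_to_supernodes(peers: list) -> dict:
    """Spread ordinary nodes across eligible supernodes, round-robin."""
    supernodes = [p for p in peers if can_be_supernode(p)]
    if not supernodes:
        raise ValueError("no eligible supernodes among these peers")
    ordinary = [p for p in peers if not can_be_supernode(p)]
    assignment = {s.name: [] for s in supernodes}
    for i, node in enumerate(ordinary):
        assignment[supernodes[i % len(supernodes)].name].append(node.name)
    return assignment


peers = [
    Peer("a", public_ip=True, permissive_firewall=True),
    Peer("b"),
    Peer("c", port_forwarding=True),
    Peer("d"),
    Peer("e"),
]
print(assign_to_supernodes(peers))
```

The point of the sketch is that every entry in the resulting table is a user’s machine, not a server the operator pays for.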

The Power of Global Index

In order to make this seamless to users, Skype implements a Service Discovery Protocol. Such technologies have always worked well on Local Area Networks (Apple’s implementation is called Bonjour) but often get confused on the public Internet because there is usually no central registry — and because the broadcast packets they use tend to get snubbed by access routers.

When you load Skype up, the client starts with a table of known supernodes and the address of the central Skype server. Skype’s only centralized involvement is in verifying your identity via PKI authentication and providing an update (if necessary) of friendly nearby supernodes. From that point on, your associated supernodes handle every piece of data you share on the network. An added bonus is that supernodes can redesignate the location of the master Skype hosts on your computer whenever necessary.
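
That startup sequence can be sketched as below. The real protocol is proprietary, so this only mirrors the order of steps described above; `authenticate` and `is_reachable` are hypothetical stand-ins supplied by the caller, not real Skype calls.

```python
from typing import Callable, Iterable, List, Optional


def bootstrap(
    cached_supernodes: Iterable[str],
    authenticate: Callable[[str], Optional[List[str]]],
    is_reachable: Callable[[str], bool],
    username: str,
) -> str:
    """Return the address of the supernode this client attaches to."""
    # Step 1: the central server verifies identity. It returns an updated
    # supernode list on success, or None when the user is rejected.
    fresh = authenticate(username)
    if fresh is None:
        raise PermissionError("identity could not be verified")
    # Step 2: prefer the updated list, then fall back to the cached table.
    candidates = list(fresh) + [s for s in cached_supernodes if s not in fresh]
    # Step 3: attach to the first supernode that answers.
    for address in candidates:
        if is_reachable(address):
            return address
    raise ConnectionError("no known supernode is reachable")


# Toy stand-ins for the central server and the network.
chosen = bootstrap(
    cached_supernodes=["sn-old"],
    authenticate=lambda user: ["sn-new"] if user == "alice" else None,
    is_reachable=lambda address: address == "sn-old",
    username="alice",
)
print(chosen)
```

After the single call to the central server, nothing in the sketch touches operator-owned infrastructure again, which is the property the article attributes to Global Index.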

Since the whole thing is encrypted, and the encryption keys of nodes and supernodes are all validated by Skype’s root key authority, everything on the network is trustworthy and virtually impossible to hack or otherwise corrupt. In other words, the Skype network is fully distributed, self-healing and largely decentralized, but still maintains all of the advantages of command and control desired by a service operator who actually wants to make money from integrating the service.

Thanks to Global Index, Skype operates at cost levels that are believed to be a fraction of those of even the most efficient SIP or XMPP-driven networks. It is this economic advantage that trumps the possibility of forklifting standards-based telephony technologies into the core of Skype’s network. If you truly wanted to replicate Skype’s ingenious (and very practical) design, you’d be better off looking at technologies like Napster, BitTorrent or Gnutella.

Ian Andrew Bell is creator of the team management service rosterbot.com

By Ian Andrew Bell
  1. Ian:

    “Forgetting about the exponentially more costly business of sending voice or video data across networks, 70 percent of traffic on XMPP instant messaging networks is the result of presence updates. Some estimates are that as much as 60 percent of this information is itself redundant. Servicing the traffic on a network such as Skype’s, which consists of some 45 million daily users, would bury most startups in server and bandwidth expenses. Add to that the actual messages themselves, and having to handle the voice channel or video at the core, and it becomes clear that only the big boys get to play in distributed communications services.”

    (a) SIP doesn’t require you to carry the voice via your bandwidth, just the signaling. There are some number of calls that would need the media to be sent via the company bandwidth. This paragraph exaggerates these costs.

    (b) The cost isn’t as high as you suggest. Skype could afford to have SIP infrastructure. I know this having worked, as you know, for two VoIP companies that did use SIP (Gizmo and Yahoo!), and did shoulder more of the costs for calls and video than Skype does…and it was profitable. Considering we didn’t have the same scale as Skype with Gizmo, we were still profitable. At Yahoo! we had probably similar scale as Skype as a whole and were very profitable on voice calls (free and premium combined).

    It is clear you have a deep technical understanding of the issues. But Skype could work fine with a SIP infrastructure. P2P is helpful in deferring some of the costs, but it is certainly not necessary to make the business work or even thrive. Skype has a bunch of great technology and smart engineers. Those smart people can make Skype work via a standard protocol with some of their magic, given the state of things today (less true when they started back in 2003/4).

  2. Sten Tamkivi – Skype’s chief evangelist was asked about moving to SIP at eComm in Amsterdam last week.
    He said “We would need a very good reason” and then explained the benefits of SIP (and Skype for Asterisk) as interconnects at the edges of the Skype cloud, not in the core.

    The technology in P2P that is relevant is Distributed Hash Tables, not the actual file transfer protocols themselves.

    I did some musing aloud about how you could solve these problems a couple of months back over on my babyis60 blog.

  3. Ian, what, no mention of Paradial? It was Paradial and GIPS that made Skype initially.

    I doubt that the IPR is much more than something of a tweak on what Paradial gave Skype initially.

    I also want to remind you that it’s hard to take this issue seriously on the IPR side. The issues between managements are something to take seriously, but the issue is nothing SIP can fix unfortunately.

  4. Skype going with SIP would simply cause too much headache for the average Joe/Jane with NAT.

  5. The concept of Skype as a distributed network is a bit overhyped in my opinion. To prove that, you simply have to block around 4 or 5 IP addresses from your computer and watch Skype fall over. The bootstrap process requires some known server/IPs in the internet, as well as a host of Skype-run supernodes for the traffic. I saw a talk on this at a SIP BOF during the IETF.

    As another reader pointed out, SIP isn’t about media, it’s simply used to establish a session between two peers. While traversing NATs is a challenging problem, it’s not really difficult these days. STUN and ICE are both fairly mature (although the latter was still in a draft stage a while ago). With those two technologies you can solve every NAT problem other than symmetric NAT to symmetric NAT, at which point you’ll need a tunnelling solution such as TURN. Unfortunately, TURN is still immature, so many companies have done proprietary tunnelling implementations, including CounterPath (X-Tunnels) and Yahoo! (who proxy traffic over their own web servers on port 80 in that scenario).

    I spent five years in the VOIP industry, and got out last January. My own take on it is there was (and is) a clear lack of vision. Telcos wanted to use SIP to replace the very architecture they already had, and nobody wanted to entertain ideas about what else you could do with this technology. Plus, the lack of QoS on the public internet basically guarantees that VOIP will almost always be sub-par to traditional telephony. You could argue that wideband codecs make up for that, but since you have to use a headset and not a traditional phone for that, it’s basically only reserved for Internet early adopters and geeks in front of their computers.

  6. On this narrow point:

    “Forgetting about the exponentially more costly business of sending voice or video data across networks, 70 percent of traffic on XMPP instant messaging networks is the result of presence updates. Some estimates are that as much as 60 percent of this information is itself redundant. Servicing the traffic on a network such as Skype’s, which consists of some 45 million daily users, would bury most startups in server and bandwidth expenses.”

    AOL/AIM, MSN Messenger, and possibly Yahoo have existing networks of this scope that handle large amounts of presence updates. There may be a business combination possible where this infrastructure with sunk costs can be used for signaling/call setup and presence, and P2P can be used for actual media ( audio/video). As noted before – media does not always have to be sent in a client-server model so the bandwidth equation does not materially change vs. Skype.

    There are portions of Skype that are not fully decentralized either – I am sure Authentication and Billing are “centralized”.

  7. Thanks for the article. The technical background on the way Skype works is fascinating.

  8. So is it possible to swap out the JoltID technology for SIP, modified SIP or something else?

    1. Sure, the SIP proxies replace the Skype supernodes. NAT seems to be the problem and is likely solvable. The cost for a network of proxies is manageable. It sounds like a fun project.

  9. Short answer, Patrick: No.

    Long answer… anything’s possible.

    SIP is so open-ended that when we talk about SIP networks today we’re really just talking about proprietary deployments which utilize SIP as an addressing/locator schema and media wrapper.

    SIP only knows how to establish a point-to-point connection between two public IP addresses. So while SIP is indeterminate insofar as media type, it is wholly responsible for determining how media gets from A to B. This is where it falls down.

    Technologies like STUN, TURN, ICE et al (from Paradial or any other vendor) are merely hacks which solve the NAT issue while allowing networks to still use SIP. They do this by connecting two connections and bridging them. I have a little knowledge in this area — I helped architect precisely the same thing for H.323 while at Cisco.

    And yes, while Gizmo Project and Yahoo! both implemented SIP (as does iChat), none of these is doing anywhere near the 8-9 billion minutes of peer-to-peer calls per month that Skype is. And if they did, the computational resources required to bridge them would be formidable.

    I too believe that Skype has significant investment in hosting their own supernodes but have yet to see anything that convinces me there’s a better way to scale that is addressed by another SIP-derived alternative.

  10. @Ian – that’s not really true. SIP has nothing to do with media, it’s just for establishing a session between two peers. As it typically relies on a publicly addressable rendezvous, it’s successful nearly 100% of the time at doing that.

    What you really mean to say is that the body of SIP messages, which can be anything (but is typically a SDP blob), fails to properly address how to get media from point A to point B. And like I pointed out, most people do it successfully using STUN, ICE and some form of relay. The computational resources are no worse than what Skype would have I imagine. In fact, if they’re doing a DHT type network, then each peer would have to talk to various peers before establishing a connection (which would make it less efficient).

    I spent a week at Google two years ago coding a peer to peer SIP library that used distributed hash tables. I’m not sure where the code ended up, but ideally you could use that to effectively replace the Skype backbone. I’m not arguing it’s better, but it would have been comparable technology with some of the same shortcomings as the Skype network.

