Earlier this month, Facebook announced a new data center networking architecture that it calls, fittingly, “data center fabric.” The company explained the design and the rationale in an engineering blog post, and Gigaom’s Jonathan Vanian covered the news, but it’s a big enough deal that we had Facebook Director of Network Engineering Najam Ahmad on the Structure Show podcast this week to talk about the new fabric in more detail.
It’s a must-listen interview for anybody into data center networking, especially those concerned with how they can architect around some of the constraints that have typically limited how information can be transferred and applications can be designed. Here are a few highlights that explain, at a high level, why efforts like Facebook’s fabric matter.
Maximizing data center space
One of the key architectural principles of Facebook’s fabric is a shift from server clusters to a core-and-pod approach that allows for much more bandwidth among machines. Clusters served Facebook well for a while, Ahmad explained, but they could grow only as large as the available switches allowed:
“The way Fabric is designed is you can deploy as many pods as you like. A pod is essentially a unit of compute, there are a bunch of servers or racks in it, and you can continue to deploy pods until you run out of physical space in the data center or you run out of power.”
And, Ahmad added, power is more of a concern than space. “Normally, we measure our data center capacity in terms of megawatts of power,” he explained. “Real estate, or building the building, is relatively cheap compared to everything else. So, yeah, it’s power that is our gating factor. For a change, network is not the gating factor.”
Dragging network vendors into this century
Although Facebook and the Open Compute Project it helped launch have already designed a top-of-rack switch and are now rethinking how data center networks are designed and managed (via a bottom-up SDN approach versus a top-down hardware-based approach), Ahmad says they’re talking to vendors and there’s always room for them to sell products that fit this new, modular network vision:
Better networks, better infrastructure, better applications
Although Facebook’s new fabric design will let the company take better advantage of its data center space, Ahmad said the bigger reason for the project was to help facilitate better applications:
One specific example Ahmad noted was that of memcached, the in-memory key-value store that Facebook relies on heavily as a low-latency cache in front of its MySQL database clusters. “Memcached is very, very chatty and needs a lot of really low-latency and high-performance, or high-bandwidth, network,” he explained, “and this sort of architecture facilitates that.”