Facebook's Insatiable Hunger for Hardware

Updated: Facebook these days is doing everything in its power to imitate Google, recruiting the search giant’s sales people, poaching its senior executives and — most importantly — using infrastructure as a competitive advantage. Like Google, Facebook has figured out that the right web infrastructure is the difference between user delight and dismay. And like Google, Facebook is finding out that it isn’t cheap.

I’ve been trying to get a handle on Facebook’s infrastructure for some time, but so far have been unable to get the company to open up. The last time I reached out to them, back in January, I was hearing that they had between 1,200 and 1,500 servers, along with storage and switches from EMC Corp. and Force 10 Networks, respectively. As it turns out, those figures weren’t even close to the total number of servers the company uses.

The company is running around 10,000 servers, according to Data Center Knowledge, citing comments made by Facebook VP of technology Jeff Rothschild at a recent MySQL user conference. (See video of the panel.) Of the 10,000 servers, some 1,800 are MySQL database servers and another 805 are memcached servers. In order to house its sprawling infrastructure, Facebook has leased data center space from DuPont Fabros in Ashburn, Va., and Digital Realty Trust in Santa Clara, Calif., DCK reports.
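
For those less familiar with that kind of stack, the memcached boxes typically sit in front of the MySQL tier as a read cache. Here is a minimal cache-aside sketch of that read path; the cache and database objects below are hypothetical in-memory stand-ins, not Facebook’s actual code.

    # Minimal cache-aside read path: check the cache tier first, fall back to
    # the database on a miss, then repopulate the cache. The dicts are
    # hypothetical stand-ins for memcached and MySQL.
    cache = {}                                # stand-in for the memcached tier
    database = {42: {"name": "Alice"}}        # stand-in for a MySQL shard

    def get_profile(user_id):
        key = "profile:%d" % user_id
        profile = cache.get(key)              # 1. try the cache first
        if profile is None:
            profile = database.get(user_id)   # 2. on a miss, fall back to the DB
            cache[key] = profile              # 3. repopulate the cache
        return profile

    print(get_profile(42))                    # first call misses, then caches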

How much is Facebook spending on its infrastructure? The company isn’t going to tell us, but there are clues. Server and storage company Rackable today reported first-quarter 2008 sales of around $69 million. Facebook is one of its largest customers, accounting for around 10 percent of Rackable’s sales (that number could be higher, but we’ll have to wait for Rackable’s 10-Q to get a clearer picture), so some quick, back-of-the-envelope math puts the social networking company’s spending at roughly $7 million for the quarter. A well-placed source of mine just let me know that Facebook is going to spend over $9 million more on servers this year. That should be good news for Rackable. Next on my list is an estimate of Facebook’s bandwidth and data center costs.
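
The arithmetic behind that $7 million figure, spelled out (the 10 percent share is the rough estimate cited above, not a disclosed number):

    # Back-of-the-envelope estimate of Facebook's quarterly spend with Rackable.
    rackable_q1_2008_sales = 69_000_000   # ~$69M reported for Q1 2008
    facebook_share = 0.10                 # Facebook's rough share of those sales
    print(rackable_q1_2008_sales * facebook_share)   # ~$6.9M, i.e. about $7 million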

The hardware spending by startups like Facebook will be a topic of discussion at our Structure 08 conference, where we are hoping to learn more about the infrastructure secrets of all of today’s top (and fast-growing) web players.

45 Comments

Pete Lundin

With that amount of server power running, I hope Facebook considers energy savings and environmental issues a top priority. Siting is definitely one of the most important factors affecting the energy consumption and environmental impact of a server farm. There are cooler climates than the continental US, with more green electricity available. I would recommend that anybody considering siting a data center take a look at the Finnish website on these issues: http://www.fincloud.freehostingcloud.com/

Facebook Quiz

Facebook is doing its best to scale up, and those hardships are going to be a hard juggle. I think they started off with their feet facing the wrong direction.

lenath

The original article mentions 2 DB admins for 1,800 MySQL servers… Ouch… I don’t know how legit this info is, but if it is actually the case, it is damn impressive!

lenath

What about the number of SysAdmins? Do you have any figures on this by any chance? Would be interesting to find out…

Ernie Oporto

@A.T.
Please expand on why the US not being a leader in any way invalidates following Google’s proven success. I’m just wondering what other examples of mega-success you’ve come across on the Internet that have somehow escaped the attention of the rest of the Internet.

A.T.

Strange. Why copy Google when Google’s strategy has been targeted at success in the US, and times have changed? The US is less and less of a leader, if it is one at all (search for “food rationing”; yes, a first in US history)… Do they have the capability at FB to think with their own brains at all?

Nazz

10,000 servers is ridiculous. For crying out loud, you can build a nice quad-quad core server with 64gb of RAM for $10k. With 20 of those I can power Facebook easily. That is 1.2 TB of RAM, or shit, buy 100 of those and you’ve got 6 TB of RAM and 1,600 processors. If you cannot power Facebook with that, then you don’t know how to architect for shit. Can’t do it with 100 servers? OK, buy 1,000 of them and you’ve got 60 TB of RAM and 16,000 processors. If you cannot power it with that, then you are probably mentally retarded.
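
For what it’s worth, Nazz’s arithmetic roughly checks out. A quick sanity check, assuming quad-socket, quad-core boxes with 64 GB of RAM each (his hypothetical $10k build):

    # Rough capacity figures for the hypothetical commodity boxes above.
    cores_per_server = 4 * 4      # four quad-core CPUs per box
    ram_gb_per_server = 64

    for n in (20, 100, 1000):
        print(n, "servers:", n * ram_gb_per_server / 1024.0, "TB RAM,",
              n * cores_per_server, "cores")
    # 20 -> ~1.25 TB, 320 cores; 100 -> ~6.25 TB, 1,600 cores;
    # 1000 -> ~62.5 TB, 16,000 cores -- roughly the figures in the comment.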

Mukul Kumar

It’s much better to run larger servers than smaller ones. This is because, I can bet, synchronizing memcached across 805 servers is much more difficult than synchronizing across, let’s say, 200 servers of 4x the power. Also, Apache and other web servers can scale very well on high-powered machines. Running Apache across many servers requires more code to run, causes much more heating, and consumes more space.

Mukul.
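
As an aside, memcached nodes don’t generally synchronize with one another at all; the client hashes each key to a single node in the pool, so a bigger pool mostly means more key-to-node mapping work on the client side. A minimal sketch of that client-side sharding, using a hypothetical three-node pool (real clients typically use consistent hashing rather than a plain modulo):

    import hashlib

    # Hypothetical pool of memcached nodes; a real client library does this
    # mapping for you, usually with consistent hashing.
    POOL = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]

    def node_for(key):
        # Hash the key and pick one node; each key lives on exactly one node,
        # so there is no server-to-server synchronization involved.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return POOL[int(digest, 16) % len(POOL)]

    print(node_for("user:12345:profile"))   # always maps to the same node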

Alan Wilensky

I have never been happy with Facebook’s latencies and load times, no matter where I access it from, although I have noticed some modest improvements to date. It may be that Facebook’s questionable utility to serious businesses, coupled with its weak ability to monetize, will be a serious drag on its growth heading into the weakening economy.

Also, all of this heavy iron, and the investment thereof, might just be one boat anchor of a liability (sucking up cash that might be put to better use). There are many more creative strategies for building capacity.

garry

I hope that I can come up with a simple business concept like Facebook and make it worth as much. Is Facebook really worth its crazy current multibillion-dollar valuation, even though there are plenty of valid reasons it could be worthless?

williamtm

That’s a hell of a lot of power going to waste. Think of the planet!

Talking of the planet, all the Facebook staff should drive Hummers.

James

Klaus: I usually find Facebook and most big US sites fine; sadly, though, I’ve often found US/Canadian transit providers with little or no global transit bandwidth: technically, they provide “transit”, but with shockingly high latency by very inefficient routes. I have a server colocated in California on the end of a Peer1 connection – which routes UK traffic via Amsterdam, of all places, resulting in atrocious performance! (The really stupid thing is that IIRC they and the UK ISPs I was testing from have direct connections into LINX, which should give far better performance than the detour via the Netherlands. Maybe they’re trying to meet a traffic quota at AMSIX?)

Klaus

I can tell you, accessing Facebook from Europe is always a real pain; it loads so slowly.

Om Malik

@ Rich Miller

I also updated the post with Facebook’s likely spending for the rest of the year: about $9 million.

GaryO

“between 1,200 and 1,500 servers from EMC Corp. and Force 10 Networks.”

EMC makes storage systems and Force 10 makes Gigabit Ethernet switches…not really “servers” in my opinion. Was this sentence supposed to read differently?

-gary

That strategy only works when power and space are cheap and processing power is expensive. We’re starting to see that shift in a major way. Big iron is making a comeback in power-efficient packaging.

Jay

Isn’t Google employing a strategy of 100 pieces of cheap hardware instead of 1 costly, powerful server? It would be interesting to confirm, because it’s working for Google, and maybe Facebook should go that route. (Although it’s a different story that Google loads up those cheap machines with its own version of Linux and its own file system for better performance.)

ces

No, Google doesn’t use 100 cheap computers instead of 1 powerful server. They *might* use a couple of slightly underpowered computers instead of 1 high-end server. Google uses cost-effective Intel and AMD processors. It uses standard PC disk drives spinning at 5,400 to 7,200 RPM instead of faster, more reliable SCSI. Google’s real strategy is that they customize the hardware and software for the problem at hand. They design and build their own motherboards, customize the OS, and supply their own cluster software: file system, map-reduce, database, remote procedure call, distributed locking, distributed cluster managers; the whole kit and caboodle.
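
For anyone unfamiliar with the map-reduce idea ces mentions, here is a toy, single-machine illustration of the two phases; Google’s real system distributes this work across a cluster with its own file system, scheduler and fault tolerance.

    from collections import defaultdict

    # Toy map-reduce word count: the map phase emits (word, 1) pairs and the
    # reduce phase sums the counts per word. A real system shuffles these
    # pairs across many machines between the two phases.
    def map_phase(document):
        return [(word, 1) for word in document.split()]

    def reduce_phase(pairs):
        counts = defaultdict(int)
        for word, n in pairs:
            counts[word] += n
        return dict(counts)

    docs = ["the cat sat", "the cat ran"]
    pairs = [p for d in docs for p in map_phase(d)]
    print(reduce_phase(pairs))   # {'the': 2, 'cat': 2, 'sat': 1, 'ran': 1}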

Brian

I’m going to dream about understanding what the hell you just said tonight
