Facebook is designing a new data center specifically to store all those photos of your baby from three years ago or your senior road trip from seven years ago for the long haul. It has to be cheap, and it has to be power efficient. And it’s a fundamentally different data center design and compute architecture than the big web companies use today.
Ahead of his talk with me later this month at our Structure:Europe conference in Amsterdam, I spoke with Jay Parikh, VP of infrastructure engineering at Facebook, about the computing challenges facing the giant social network. The one most on his mind at the moment is how to store users’ photos, videos and other digital bits so they can access them anytime they want. Like the piles of albums I have from my high school days, our digital photos have to live somewhere, so Facebook is trying to create a data center equivalent to that dusty old box in the attic that you only open when you move.
He says Facebook is rethinking the infrastructure for how it stores huge repositories of photos and videos in a way that’s accessible and convenient but also cost-effective. Unlike a business that might store records on tape, Facebook can’t afford to let users wait that long to access something, nor can it afford to build data centers that keep photos in caches next to the servers (those Fusion-io machines aren’t cheap!).
“The current data center and hardware design is actually very sub-optimal for that problem,” Parikh says. “People say to us we should just use tape, but I’d rather poke my eyes out with chopsticks.”
So instead he’s thinking about a new storage-based data center as described in this article, with servers that turn on and off as needed and take up a lot of floor space but consume very little power. He added that the company is building software to move data between different regimes of popularity, changing the properties of the stored files as they migrate from one location to another. The product team shouldn’t have to think about that, said Parikh; the infrastructure should.
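To make the idea of "regimes of popularity" concrete, here is a minimal sketch of how such tiering software might decide where a file belongs. It is purely illustrative and not Facebook's implementation: the tier names, the idle-day thresholds, and the `StoredFile`, `pick_tier` and `migrate` names are all assumptions, since Parikh gave no such details.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the article gives no actual numbers.
HOT_MAX_IDLE_DAYS = 30
WARM_MAX_IDLE_DAYS = 365


@dataclass
class StoredFile:
    name: str
    days_since_last_access: int
    tier: str = "hot"  # assumed tier names: "hot", "warm", "cold"


def pick_tier(days_idle: int) -> str:
    """Map how long a file has sat untouched to a storage tier."""
    if days_idle <= HOT_MAX_IDLE_DAYS:
        return "hot"
    if days_idle <= WARM_MAX_IDLE_DAYS:
        return "warm"
    return "cold"


def migrate(files):
    """Reassign each file's tier and return the (name, old, new) moves."""
    moves = []
    for f in files:
        target = pick_tier(f.days_since_last_access)
        if target != f.tier:
            moves.append((f.name, f.tier, target))
            f.tier = target
    return moves


files = [StoredFile("baby.jpg", 3 * 365), StoredFile("lunch.jpg", 2)]
moves = migrate(files)
```

In this toy version, the three-year-old baby photo would drift to the cold tier while yesterday's lunch stays hot, and the product code calling it never has to know which tier a file lives in.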
“Where we are going to be innovating at the data center, the hardware, the operating system and the kernel level,” he said. “All of that needs to be rethought. You can’t do this by shoving it into existing computing environments. You need a separate storage facility and other, bigger data centers, and different physical building and different design.”
Unlike today’s focus on getting the most performance from a watt of power, generally in a dense computing array, the cold-storage problem requires a lot of floor space and machines consuming the least power possible. So, in Prineville, Ore., next to its large web-serving data center, Facebook is essentially building out a 62,000-square-foot attic for your digital junk.