Facebook builds a database benchmark for a graph-powered world

If you’re building any sort of social-media application, take note of what Facebook just built. The company has created a benchmarking tool called LinkBench that measures the performance of databases serving graph-structured data, which is presumably the lifeblood of every startup concerned with who’s connected to whom.

Of all LinkBench’s features (you can read all about them in a Facebook Engineering wall post from Monday morning), probably the most important is that it’s open source and built to be extensible. One of the biggest problems with benchmarks in general is that they rarely align with the actual production workloads of the companies that are supposed to care about them. A benchmark that measured only how Facebook’s massive MySQL+memcached+Flashcache database architecture performs against Facebook’s own massive social graph and transaction activity, for example, would be all but worthless to anyone not planning to rebuild Facebook.


I’ve written in the past that perhaps crowdsourced benchmarks are the wave of the future: essentially a compiled set of statistics and best practices as more companies test different database (or Hadoop) technologies on different hardware setups against different workloads and publish the results. Everything will of course vary by the exact details within any given environment, but it would be a good way to get a sense of how a particular stack might, or perhaps should, fare.

But an open source benchmark tuned for a specific use case, social graphs, by probably the world’s foremost expert on that use case is interesting, too. Anyone else trying to serve data from their own social graphs can benefit from some of LinkBench’s more prominent features, such as its ability to generate “large synthetic social graphs,” while tuning it to the specifics of their own infrastructure. After all, your app might have different requirements around reading versus writing data, and it’s very possible you’re not using MySQL, either.
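To make the idea concrete, here is a minimal Python sketch of the two knobs mentioned above: generating a synthetic social graph and replaying a tunable mix of reads and writes against it. This is not LinkBench's code or its API. The function names (`generate_graph`, `run_workload`), the preferential-attachment generator, and the in-memory dictionary standing in for a database are all assumptions made for illustration.

```python
import random


def generate_graph(num_nodes, links_per_node, seed=0):
    """Build a synthetic directed graph by preferential attachment.

    Each new node links to `links_per_node` distinct existing nodes, picked
    with probability that grows with how many links they already have. That
    yields a skewed graph in which a few nodes are very popular.
    (Hypothetical generator; LinkBench's own approach may differ.)
    """
    rng = random.Random(seed)
    out_links = {node: set() for node in range(num_nodes)}
    # A node appears in `targets` once when created and once per incoming
    # link, so a uniform pick from the list is a popularity-weighted pick.
    targets = [0]
    for node in range(1, num_nodes):
        wanted = min(links_per_node, node)
        chosen = set()
        while len(chosen) < wanted:
            chosen.add(rng.choice(targets))
        for dst in sorted(chosen):
            out_links[node].add(dst)
            targets.append(dst)
        targets.append(node)
    return out_links


def run_workload(graph, num_ops, read_fraction, seed=0):
    """Replay a random mix of link reads and writes; return operation counts.

    `read_fraction` is the tunable read/write ratio. A real benchmark would
    send each operation to a database and record its latency; here a plain
    dict stands in for the store.
    """
    rng = random.Random(seed)
    nodes = list(graph)
    counts = {"read": 0, "write": 0}
    for _ in range(num_ops):
        src = rng.choice(nodes)
        if rng.random() < read_fraction:
            _ = len(graph[src])  # stand-in for a "fetch this node's links" query
            counts["read"] += 1
        else:
            graph[src].add(rng.choice(nodes))  # stand-in for an "add link" write
            counts["write"] += 1
    return counts


if __name__ == "__main__":
    graph = generate_graph(num_nodes=10_000, links_per_node=5, seed=42)
    print(run_workload(graph, num_ops=100_000, read_fraction=0.7, seed=42))
```

Changing `read_fraction` is the sketch's equivalent of tuning a benchmark to an app that is more read-heavy or more write-heavy than Facebook's.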

Or maybe you are using MySQL and want to see how a newer database technology might handle your graph workload. That, by the way, is one of the reasons Facebook built LinkBench, according to this post.

At any rate, the social web is all about graphs, and database performance really matters for anyone trying to build a service that stays online and provides a pleasant user experience. Say what you want about Facebook, but its services perform, so the bar is set high for anyone trying to dethrone it or at least to build something that can attract an equally large and devout following.