Summary:

Facebook has open sourced a new embedded database called RocksDB that’s meant to take advantage of all the performance flash has to offer, from right on the application server. It might be a sign of best practices to come.

Facebook's all-flash DragonStone server. Source: Facebook

Facebook is on an open source roll lately, and on Thursday it announced its latest open source project: an embedded key-value store called RocksDB. The company uses it to power certain user-facing applications that would suffer too much from having to access an external database over the network, and to eliminate certain problems caused by underutilized I/O performance on flash storage devices.

Facebook database engineer Dhruba Borthakur describes the design of and rationale behind RocksDB in some detail in a blog post, but the biggest factor leading to its creation might be the emergence of relatively inexpensive flash storage cards for servers (or, in Facebook’s case, custom-built servers packed entirely with flash).

“With the advent of flash storage, we are starting to see newer applications that can access data quickly by managing their own dataset on flash instead of accessing data over a network. These new applications are using what we call an embedded database.

“… When database requests are frequently served from memory or from very fast flash storage, network latency can slow the query response time. Accessing the network within a data center can take about 50 microseconds, as can fast-flash access latency. This means that accessing data over a network could potentially be twice as slow as an application accessing data locally.”
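The arithmetic behind that estimate can be sketched in a few lines of Python. The 50-microsecond figures are the approximate values from the quote; the function and constant names are illustrative only, and real latencies vary with hardware and load.

```python
# Approximate figures from Borthakur's quote, in microseconds.
NETWORK_ACCESS_US = 50  # one network access inside a data center
FLASH_READ_US = 50      # one read from fast flash


def local_read_us() -> int:
    """Embedded database: the application reads flash directly."""
    return FLASH_READ_US


def remote_read_us() -> int:
    """Networked database: a network access plus the same flash read."""
    return NETWORK_ACCESS_US + FLASH_READ_US


slowdown = remote_read_us() / local_read_us()
print(f"local:  {local_read_us()} us")
print(f"remote: {remote_read_us()} us ({slowdown:.1f}x slower)")
```

With equal network and flash latencies, the network access doubles the response time, which is where the "twice as slow" figure comes from.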

RocksDB was designed with these new hardware realities in mind, so it can take full advantage of the IOPS potential of flash memory as well as the computing power of many-core servers, Borthakur explains. Facebook has posted the results of a benchmark test running on a Fusion-io-powered server on the RocksDB GitHub page, and claims it’s significantly faster than Google’s LevelDB embedded key-value store.

From a broader IT perspective, RocksDB signals that the shifts in storage and computing economics that made the big data movement possible are now making their way into web application development, albeit using a storage medium most organizations wouldn’t consider using for storing “big data.” Facebook is performance hungry, but it’s also cost-sensitive, and it wouldn’t be storing “close to a petabyte of data across different applications,” as Borthakur writes, if the cost to do so were out of control.

He offered a handful of application types an embedded database like RocksDB is suitable for, including:

1. A user-facing application that stores the viewing history and state of users of a website.
2. A spam-detection application that needs fast access.
3. A graph-search query that needs to scan a data set in real time.
4. A cache for data from Hadoop, which allows an app to query Hadoop data in real time.
5. A message-queue that supports a high number of inserts and deletes.
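To make the first use case concrete, here is a minimal sketch of an application keeping per-user viewing history in an embedded key-value store. It uses Python's standard-library `dbm.dumb` module purely as a stand-in; it is not RocksDB's API, and the key format, file name, and helper functions are hypothetical.

```python
import dbm.dumb
import json
import os
import tempfile

# The store is a local file opened by the application process itself:
# there is no database server and no network round trip.
db_path = os.path.join(tempfile.mkdtemp(), "viewing_history")


def record_view(db, user_id: str, item_id: str) -> None:
    """Append item_id to the list stored under the user's key."""
    key = user_id.encode()
    history = json.loads(db[key]) if key in db else []
    history.append(item_id)
    db[key] = json.dumps(history).encode()


def get_history(db, user_id: str) -> list:
    """Return the user's viewing history, or an empty list if none exists."""
    key = user_id.encode()
    return json.loads(db[key]) if key in db else []


db = dbm.dumb.open(db_path, "c")  # "c": create the store if it is missing
record_view(db, "user:42", "video:1001")
record_view(db, "user:42", "video:1002")
history = get_history(db, "user:42")
db.close()
print(history)
```

The point of the sketch is the shape of the access path: reads and writes are ordinary function calls against local storage, which is the property the article attributes to embedded databases in general.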

In fact, Facebook has been finding all sorts of new ways to utilize flash as a stepping stone between slow disks on one hand and expensive-but-fast RAM on the other.

Facebook is no doubt an early adopter of flash-heavy application architectures, but it’s also probably serving as a guiding light for other companies and their developers who want to achieve Facebook-like performance. As flash prices continue to drop — and now that Amazon Web Services is offering a whole suite of flash-backed instances on EC2 (the prices of which should also drop) — it’s conceivable we’re approaching an era of ever-better web and mobile applications that communicate with the network and the hard drive as little as possible.

  1. This helps large-scale and/or computationally intensive applications truly resolve what had been a fairly big hurdle (in terms of complexity and cost) for the past couple of years. There were already some solutions out there, but an announcement like this tends to open the floodgates and give much-needed legitimacy to relatively new and emerging components. I can’t wait to see the rush of mind-boggling new applications built on this.

  2. There is lots of potential for this type of technology, especially for mission-critical enterprise applications, and it should make the cost of storing that data lower. I just wish more of the IT giants followed the example of Facebook, Google and Twitter and contributed to the open source movement in this way.

  3. Joseph Brunner Friday, November 22, 2013

    This is great news for the future and should have guys like Oracle and Microsoft quaking in their boots… the young, code-hungry devs of today working at Facebook and Google will be the CTOs and CIOs of the rest of the Fortune 500 of tomorrow.

    They won’t bring their love of big, bloated, slow-to-manage-and-deploy software with them… by 2025, tools like this will make big old software obsolete.

  4. > Facebook is no doubt an early adopter of flash-heavy application architectures

    It’s good to see that trend. BTW: Amazon’s DynamoDB (available as a service) is based on flash storage and was released two years ago.
