Facebook trapped in MySQL ‘fate worse than death’

According to database pioneer Michael Stonebraker, Facebook is operating a huge, complex MySQL implementation equivalent to “a fate worse than death,” and the only way out is “bite the bullet and rewrite everything.”

Not that it’s necessarily Facebook’s fault, though. Stonebraker says the social network’s predicament is all too common among web startups that start small and grow to epic proportions.

During an interview this week, Stonebraker explained to me that Facebook has split its MySQL database into 4,000 shards in order to handle the site’s massive data volume, and is running 9,000 instances of memcached in order to keep up with the number of transactions the database must serve. I’m checking with Facebook to verify the accuracy of those numbers, but Facebook’s history with MySQL is no mystery.

The oft-quoted statistic from 2008 is that the site had 1,800 servers dedicated to MySQL and 805 servers dedicated to memcached, although multiple MySQL shards and memcached instances can run on a single server. Facebook even maintains a MySQL at Facebook page dedicated to updating readers on the progress of its extensive work to make the database scale along with the site.
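To make the shard-plus-cache arrangement concrete, here is a minimal sketch in Python. It is not Facebook's code: the names `shard_for` and `ShardedStore` are hypothetical, plain dictionaries stand in for the MySQL shards and the memcached tier, and the 4,000 figure is simply Stonebraker's unverified number from above. It shows the two ideas only: a stable hash picks the shard that owns a user, and reads try the cache before touching the database.

```python
import hashlib

NUM_SHARDS = 4000  # figure quoted by Stonebraker; unverified


def shard_for(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user id to a shard index using a stable hash (hypothetical scheme)."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Toy model: dicts stand in for MySQL shards and for memcached."""

    def __init__(self, num_shards: int = NUM_SHARDS):
        self.num_shards = num_shards
        self.shards = {}   # shard index -> {user_id: profile}, created lazily
        self.cache = {}    # stand-in for the memcached tier
        self.db_reads = 0  # counts reads that had to hit a "database" shard

    def write(self, user_id: int, profile: dict) -> None:
        index = shard_for(user_id, self.num_shards)
        self.shards.setdefault(index, {})[user_id] = profile
        # Invalidate the cached copy so the next read sees the new value.
        self.cache.pop(user_id, None)

    def read(self, user_id: int):
        # Cache-aside: try the cache first, fall back to the owning shard.
        if user_id in self.cache:
            return self.cache[user_id]
        self.db_reads += 1
        index = shard_for(user_id, self.num_shards)
        profile = self.shards.get(index, {}).get(user_id)
        if profile is not None:
            self.cache[user_id] = profile
        return profile
```

Reading the same user twice touches the "database" only once. The catch, and part of what Stonebraker is complaining about, is that routing, cache invalidation and any cross-shard query all become the application's job.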

The widely accepted problem with MySQL is that it wasn’t built for webscale applications or those that must handle excessive transaction volumes. Stonebraker said the problem with MySQL and other SQL databases is that they consume too many resources for overhead tasks (e.g., maintaining ACID compliance and handling multithreading) and relatively few on actually finding and serving data. This might be fine for a small application with a small data set, but it quickly becomes too much to handle as data and transaction volumes grow.

This is a problem for a company like Facebook because it has so much user data, and because every user clicking “Like,” updating his status, joining a new group or otherwise interacting with the site constitutes a transaction its MySQL database has to process. Every second a user has to wait while a Facebook service calls the database is time that user might spend wondering if it’s worth the wait.

Not just a Facebook problem

In Stonebraker’s opinion, “old SQL (as he calls it) is good for nothing” and needs to be “sent to the home for retired software.” After all, he explained, SQL was created decades ago before the web, mobile devices and sensors forever changed how and how often databases are accessed.

But products such as MySQL are also open-source and free, and SQL skills aren’t hard to come by. This means, Stonebraker says, that when web startups decide they need to build a product in a hurry, MySQL is a natural choice. But then they hit that hockey-stick-like growth rate like Facebook did, and they don’t really have the time to re-engineer the service from the database up. Instead, he said, they end up applying Band-Aid fixes that solve problems as they occur, but that never really fix the underlying problem of an inadequate data-management strategy.

There have been various attempts to overcome SQL’s performance and scalability problems, including the buzzworthy NoSQL movement that burst onto the scene a couple of years ago. However, it was quickly discovered that while NoSQL might be faster and scale better, it did so at the expense of ACID consistency. As I explained in a post earlier this year about Citrusleaf, a NoSQL provider claiming to maintain ACID properties:

ACID is an acronym for “Atomicity, Consistency, Isolation, Durability” — a relatively complicated way of saying transactions are performed reliably and accurately, which can be very important in situations like e-commerce, where every transaction relies on the accuracy of the data set.
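For readers who want to see the "A" in ACID in action, here is a small sketch using Python's built-in sqlite3 module. The `accounts` table and the `transfer` and `balances` helpers are made up for illustration. The point is that a transfer that would overdraw an account fails as a whole: the credit that was already applied is rolled back along with the failed debit.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit above
            # is undone too, so no partial transfer is ever visible.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()
```

Calling `transfer(conn, "alice", "bob", 500)` returns False and leaves both balances untouched, while a transfer of 30 goes through. A store without transactions leaves that all-or-nothing bookkeeping to the application, which is the extra work Stonebraker refers to.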

Stonebraker thinks sacrificing ACID is a “terrible idea,” and, he noted, NoSQL databases end up only being marginally faster because they require writing certain consistency and other functions into the application’s business logic.

Stonebraker added, though, that NoSQL is a fine option for storing and serving unstructured or semi-structured data such as documents, which aren’t really suitable for relational databases. Facebook, for example, created Cassandra for certain tasks and also uses the Hadoop-based HBase heavily, but it’s still a MySQL shop for much of its core needs.

Is ‘NewSQL’ the cure?

But Stonebraker, an entrepreneur as much as a computer scientist, has an answer for the shortcomings of both “old SQL” and NoSQL. It’s called NewSQL (a term coined by 451 Group analyst Matthew Aslett) or scalable SQL, as I’ve referred to it in the past. Pushed by companies such as Xeround, Clustrix, NimbusDB, GenieDB and Stonebraker’s own VoltDB, NewSQL products maintain ACID properties while eliminating most of the other functions that slow legacy SQL performance. VoltDB, an online-transaction processing (OLTP) database, uses a number of methods to improve speed, including running entirely in-memory instead of on disk.

It would be easy to accuse Stonebraker of tooting his own horn, but NewSQL vendors have been garnering lots of attention, investment and customers over the past year. There’s no guarantee they’re the solution for Facebook’s MySQL woes — the complexity of Facebook’s architecture and the company’s penchant for open source being among the reasons — but perhaps NewSQL will help the next generation of web startups avoid falling into the pitfalls of their predecessors. Until, that is, it, too, becomes a relic of the Web 3.0 era.

Feature image courtesy of Flickr user jimw; error image courtesy of Flickr user rubenerd.

177 Responses to “Facebook trapped in MySQL ‘fate worse than death’”

  1. Rudy Green

    Great post on FB database shortcomings, which is a main reason FB doesn’t offer ability to edit Status and Comments. Instead, you need to use archaic methods such as delete and repost.

  2. Faizan Javed

    Stonebraker’s latest NewSQL product VoltDB is also open-source. It is a speedy in-memory database which is ACID compliant, but relies on stored procedures to avoid round-trips and reduce network traffic. It is not as partition tolerant as Cassandra, but I believe is at least able to scale-out as much as MySQL but with far better performance and provides ACID guarantees. The main sticking point with VoltDB is that it stores all data in RAM – this may not be the best approach for a huge site such as Facebook considering that RAM is cheap but not that cheap.

    But to me there is potential – maybe a “next-gen” VoltDB which can get around this in-memory limitation might be the perfect product. I would prefer to have a database product which is a blazing fast scalable version of MySQL instead of spending thousands of developer hours making MySQL do what it does not out-of-the-box.

    • I think you’re missing the fact that MySQL has had a solution that is very similar to VoltDB which has been used by high scale, high performance and high availability Telco environments for a decade. It is ACID, in-memory by default, but also has the ability to have some or all non-indexed columns stored on disk. This, of course, allows for a much larger data set than VoltDB. It also supports a number of direct NoSQL access connectors to avoid the overhead of mysqld. Also, in 7.2 it has incredibly (20-40x) improved support for JOINs which have traditionally been problematic in distributed or NoSQL databases.


  3. dave watson

    A comment was made that Craigslist is on MySQL. But actually they migrated to MongoDB.

    The SQL vs NoSQL classification is nonsense. Instead talk about different classes of DBMS: relational, document, graph, key value pair etc.

    Also, Mr Stonebraker’s statement about FB is misinformed. Just listen to their own people give talks about their infrastructure and use of technologies like Hive, HBase, HDFS etc.

      • dave watson

        Yes but others have done more complete migrations. Take a look at case studies such as and Guardian Newspaper. There is certainly a trend. Own tests for federal government agency comparing MongoDB with 2 leading commercial RDBMS (can’t name for legal reasons) showed considerable advantages in terms of flexibility of data access, productivity of development and performance.

  4. Martin Wondergem

    In other news, Stonebraker recommends getting rid of hammers saying: “old hammers (as he calls it) are good for nothing” and need to be “sent to the home for retired tools.”

    When building your next picnic table, Stonebraker recommends starting with concrete reinforced gauge 3 steel, just in case you end up with 750 million users at your party.

  5. In the one year it would take Facebook to rewrite everything (and one year is optimistic), the hardware will double in power. Meaning they can just buy more servers with more memory and more multicore processors, and just crunch MySQL.

    Even with today’s technology, if Facebook keeps up its steady growth to cover EVERYONE on the planet in about 5-10 years, they could just buy more of TODAY’S hardware and keep the site running.

  6. This article reminds me of the conversation that I had with Stonebraker soon after Informix acquired Illustra. He said that RDBMSs were history and Object Relational was the way to go – there would have been nothing else in 2-3 years time.

    It was 1996.

  7. Lee Tuck

    Nice advertising piece for Stoneblahblahblah.

    This issue probably has Facebook knocking down his door…

    This piece is like watching FAUX News fair and balanced! Are you guys owned by Murdoch?

  8. Patrick

    If we’re talking startups, what really counts in all this is that for every FB there are 1000’s that won’t make it. Stonebraker’s big fallacy is the assumption that those 1000’s can afford to pay big bucks for their software. Fact is, they can’t, so they must use whatever open source is available. What Stonebraker and his ilk need to do is change their pricing model. Maybe Datameer have the right idea.

  9. Kevin

    NASDAQ runs on MS SQL Server. I hesitate to guess but would suspect the NASDAQ deployment and the processing demands far exceed most deployments out there today. SQL is not the issue, planning and costs are. Startups typically deploy open source then get caught at the acquisition stage or when the technology maxes out…this is when they find themselves having to move to the big boys. Also, consider that the R&D money put in to MS SQL Server, Oracle etc far and away exceeds MySQL and the others mentioned in this article, so would you bet your company on something other than these giants? Use your favourite search engine to learn more about industrial strength databases.

  10. Jonas

    The problem with RDBMSs is not performance, it’s availability. When I design internet services that always should be available I want to use a persistence technology with support for active-active clustering with no SPOF (like Cassandra).
    I don’t know of a RDBMS with that functionality. It can be simulated with big $, but why throw away the money?

    / Jonas

      • Dimitri

        No SPOFs isn’t good enough. We had no single point of failures in one of our application’s architecture, but recently had two of our core switches fail simultaneously. Both switches were from the same vendor running the same firmware and had the same bug. All of our virtual machines lost connectivity to the SAN. We had dual nics, dual routes, dual everything including core switches, but sometimes, shit happens.

  11. I’m a great fan of Michael Stonebraker. He’s done a huge amount for the science of databases.

    When he invented Postgres he got many things exactly right. One of those things was making the source code open, allowing it to be developed into an increasingly high quality product over the following 20 years. So when he explains how MySQL is bad, he is in some ways also dissing his own previous ideas.

    The problem is that by making Postgres open Mr. Stonebraker no longer makes any money from the project.

    Taking those points together, I’m more inclined to believe that he knew what he was talking about the first time, but now wishes to gloss over that in order to make even more money.

    We don’t need to use new products to take advantage of new ideas. PostgreSQL is just as innovative now as it was 20 years ago, and we are adding new features at an incredible rate. Innovation and maturity makes the best solution.

    Open source doesn’t mean it’s good, but lack of a venture capital funded marketing budget doesn’t mean it’s no longer valid. If anything it shows we’ve entered a phase of efficiency where less hype is needed to sustain a growing user base.

    • Every product has its pros & cons. I like Postgres in general, but Postgres can barely do a count(*) which takes forever on large tables because it has to always do a full table scan. This is because they implemented MVCC in a brain dead way. Try counting 100 million + row tables and you will scratch your head. It wasn’t built for massive size databases. Postgres doesn’t do much in parallel either like creating indexes or parallel queries. Try indexing a 300 million row table sometime. Also, it doesn’t really use multiple cores. It is one connection per core, but a single connection can’t use more than one core. I got 24 cores and Postgres can use one for a big operation. The newer enterprise features like hot standby and streaming replication are nice, but Oracle has had these since 9i/10g. Postgres is definitely getting there and EnterpriseDB’s version with InfiniteCache is good, but every product has pros and cons like I said. Nothing is perfect. All of today’s databases are great at shoving data in, but none of them have any real archiving features to get data out and archive it in a nice way. Dumping a table is not archiving. Also, once you get into the terabyte range, most of the tools and utilities of all modern databases just outright break down.

  12. OK.. with the 750M user base and unknown number (at least for me) of staff (highly valued), FB and its users (did we hear any complaints about scalability?), and mySQL (which is owned by the top RDBMS company Oracle) couldn’t figure out… this guy figured out! Looks like a 2012 end of the world to me! Give FB, mySQL, and the ardent FB users a break and find a different way of selling whatever you are selling dude. Why can’t you write a post about how it’s not that easy (given the open source nature) to scale mySQL (not that you can’t scale… but scaling the resources to find the brains that can scale) rather than trying to ridicule a successful ecosystem that figured out how to scale?

  13. Boris Juric

    You might want to look into the rumor that Facebook is sacrificing smaller markets in order to save resources for important markets. Here’s the case: it’s been 1 month since Facebook bot stopped crawling websites on whole Croatian .hr TLD. It doesn’t read og tags anymore and the only visible thing on shared/liked links is URL. You can test it by trying to share anything from .hr domain, or by using lint tool. This could be happening on other domains also, without anyone important noticing.

  14. Terry Lambert

    I would have to say that for the front end, for things like status updates and so on, where the update propagation isn’t time sensitive, so long as it is (eventually) time-ordered, you could do worse than an OLTP system.

    I would probably use something like IBM’s MQSeries, or one of the other heavy-weights, rather than something new written in an interpreted language and backed only by fragile memory; instead, I’d probably do exactly what they are doing, and use a sharded SQL database of some kind for the persistent storage.

    Memory is a hard thing to trust, which is something VATech learned the hard way when they built their XServe based supercomputer with non-ECC memory, and had to divide up the calculations they were running and run them multiple times and “vote”. This was simply due to that amount of memory getting in the range of where cosmic rays start to become important to data-(non)integrity.

    Still, it’s interesting that by front-ending FaceBook with OLTP, you could probably resolve most of the data coherency issues fairly trivially by accepting a somewhat longer propagation pipeline delay, without having to resort to SQL transaction replays, so you could get a somewhat cleaner solution than you might get otherwise. This would probably work well for any other RSS-style application as well.

  15. “Stonebraker says the social network’s predicament is all too common among web startups that start small and grow to epic proportions.”

    All too common for him, maybe he reads too much TechCrunch? Let’s see the evidence of all the startups that really need more than one database server, in other words, rocketing to 100s of millions of visits/month.

  16. Doomsday prophecy for Facebook because they use MySQL. NoSQL a buzz and NewSQL the solution to save the world? How easy to say!
    I thought facebook is doing quite well with their infrastructure supporting 500M+ users with Apache hadoop cluster, Apache hive and RDBMS redistribution technology and …. MySQL!

  17. Looks like a lot of FB employees have lots of free time to comment on this blog. The negativity is staggering, get to work guys, what’s the next ‘awesome’ facebook announcement?

  18. Manuel Cantu

    Why not switch to Oracle Database? With GoldenGate they can switch to one or several Oracle Database instances running on 1 or 2 Exadata Machines depending on their needs. Then you add Times Ten for in-memory database cache.

  19. I was under the impression that Facebook is using Cassandra, a NoSQL developed by Facebook itself (inspired by Amazon Dynamo infrastructure and Google BigTable data model), now an Apache project as its datastore. Quite surprised to know that a massive web application like Facebook still uses MySQL !!!