Real World NoSQL: MongoDB at Shutterfly

Editor’s Note: This is the second in a multi-part series of posts exploring the use cases for NoSQL deployments in the real world. Other published case studies include HBase and Cassandra.

With all the excitement surrounding the relatively recent wave of non-relational – otherwise known as “NoSQL” – databases, it can be hard to separate the hype from the reality. There’s a lot of talk, but how much NoSQL action is there in the real world? In this series, we’ll take a look at some real-world NoSQL deployments.

Shutterfly is a popular Internet-based photo-sharing and personal publishing company that manages a persistent store of more than 6 billion images with a transaction rate of up to 10,000 operations per second. Data Architect Kenny Gorman accepted the task of helping Shutterfly select and implement a replacement for its existing relational database management system: Oracle’s RDBMS.

Initially, Shutterfly considered open-source databases like MySQL and PostgreSQL. However, during the evaluation and concurrent re-architecting of the application, it became apparent that a non-relational database might be a better fit for Shutterfly’s data needs, potentially improving programmer productivity as well as performance and scalability. “There are tradeoffs, so we had to convince ourselves that a less mature non-transactional data store would work,” says Gorman.

Shutterfly looked at a wide variety of alternative database systems, including Cassandra, CouchDB and BerkeleyDB, before settling on the MongoDB document-oriented database. MongoDB stores data in a variant of the JSON (JavaScript Object Notation) format; each document is self-describing and can have a complex internal structure.
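To make the document model concrete, here is a minimal sketch in Python of what a self-describing photo-metadata document might look like. The field names and values are hypothetical, invented purely for illustration; the article does not describe Shutterfly’s actual schema. Plain JSON from the Python standard library stands in here for MongoDB’s JSON variant.

```python
import json

# Hypothetical photo-metadata document. Field names are illustrative only,
# not Shutterfly's real schema.
photo_doc = {
    "photo_id": "p-1001",
    "owner": "user-42",
    "caption": "Beach trip",
    "tags": ["beach", "family"],
    "dimensions": {"width": 4000, "height": 3000},
    "albums": [
        {"album_id": "a-7", "position": 3},
        {"album_id": "a-9", "position": 1},
    ],
}

# The document carries its own structure, so it round-trips through JSON
# text without any separate table definition.
encoded = json.dumps(photo_doc)
decoded = json.loads(encoded)

# Nested fields are reached by walking the document itself.
album_ids = [entry["album_id"] for entry in decoded["albums"]]
```

In a relational design, the tags and album memberships in this sketch would typically live in separate tables joined by key; in the document they travel with the photo record.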

The document approach matched the Shutterfly XML format while providing scale-out and failover replication. Moving to a document model wasn’t that big a step, according to Gorman: “If you are at the kind of scale where you would be looking at MongoDB, then you probably already have figured out you need to de-normalize your data.”

Like most NoSQL solutions, MongoDB provides a very different model for transactions and consistency: it generally offers neither immediate consistency nor multi-object transactions. Consequently, Shutterfly has deployed MongoDB only for those parts of the application where strict consistency isn’t critical, such as the metadata associated with uploaded photos. For those parts of the application that require stronger consistency or a richer transactional model (billing and account management, perhaps), the traditional RDBMS is still in place. Those parts of the application that were moved to MongoDB were re-engineered with the simpler transactional model in mind.

Despite the effort and risk that accompany such a major architectural shift, Shutterfly reports significant payoffs in terms of time-to-market, cost and performance. Furthermore, MongoDB has relieved a mismatch between the object model used by the application and the underlying database model. In the relational world, this mismatch is usually hidden by an object-relational mapping (ORM) layer, which translates between the object and relational models. However, this indirection can lead to performance and manageability issues. With MongoDB “you have an optimized stack, no ORM complexity, and you have better overall performance,” says Gorman. “At least that’s the hope.”
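The point about the object model lining up with the storage model can be illustrated with a short Python sketch. It assumes a hypothetical Photo class (not taken from Shutterfly’s code) and shows that a nested application object converts to a document-shaped dictionary, and back, with no mapping layer in between. No database is involved; the dictionary simply represents what a document store would be handed.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Dimensions:
    width: int
    height: int


@dataclass
class Photo:
    # Hypothetical application object; the fields are illustrative only.
    photo_id: str
    caption: str
    dimensions: Dimensions
    tags: List[str] = field(default_factory=list)


def to_document(photo: Photo) -> dict:
    """Convert the object to a nested dict with the same shape as the object."""
    return asdict(photo)


def from_document(doc: dict) -> Photo:
    """Rebuild the object from the document, restoring the nested type."""
    return Photo(
        photo_id=doc["photo_id"],
        caption=doc["caption"],
        dimensions=Dimensions(**doc["dimensions"]),
        tags=list(doc["tags"]),
    )


photo = Photo("p-1001", "Beach trip", Dimensions(4000, 3000), ["beach"])
document = to_document(photo)
restored = from_document(document)
```

Because the document mirrors the object’s nesting, the conversion is a few lines of code rather than a set of table mappings and joins, which is the kind of simplification Gorman describes.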

The compromises required by MongoDB – changes to the data model and transactional paradigms in particular – have required Shutterfly to make significant engineering investments. But so far, Shutterfly is happy with its decision. “I am a firm believer in choosing the correct tool for the job, and MongoDB was a nice fit, but not without compromises,” says Gorman. “In our case, those compromises were relatively small.”

Guy Harrison is a director of research and development at Quest Software, and has over 20 years of experience in database design, development, administration and optimization. He can be reached by e-mail at [email protected] and is @guyharrison on Twitter.
