Top executives at NoSQL startups are putting on a brave face in response to Amazon Web Services’ new DynamoDB offering. They roundly cite the new product (as well as Oracle’s October entrance into the space) as validation for the technology NoSQL companies have been pushing for years, while generally dismissing the competitive ramifications of having major vendors now playing in the same pool. But is that confidence justified?
Validation is good
Dwight Merriman, CEO of MongoDB proprietor 10gen, summed up the general sentiment of his peers in an email response to my request for a comment:
The Amazon Dynamo DB announcement is further validation that NoSQL is a big deal, and we are excited to see large players like Oracle and Amazon recognizing the need for alternatives to the relational database. Their entry into the field makes it clear to all large enterprises that this is an important trend – as we have seen that traditional databases do not fit well with cloud computing. New database technologies will be needed in the cloud, and also in the enterprise private cloud.
DataStax CEO Billy Bosworth makes a similar argument on his blog, as did new Cloudant CEO Derek Schoettle during a Friday-morning phone call. Schoettle said DynamoDB is “awesome” and Cloudant is “excited about it.” “[AWS] will be a competitor by default,” he said, “but their success will be our success.” As the saying goes, and as GigaOM Pro’s Jo Maitland explains in a research note on DynamoDB (subscription req’d), a rising tide lifts all boats.
But is competition really good?
Still, there are plenty of reasons for NoSQL-based startups to fear these new big-name competitors. When competing against Oracle, the challenge will be to convince large enterprises that third-party NoSQL databases are a better fit with existing Oracle ecosystems than Oracle’s own custom-built offering is. Nobody ever got fired for buying Oracle, and if it’s offering NoSQL as part of an integrated data environment that also includes a relational database, data warehouse and Hadoop, there might be a natural inclination to just go with Oracle.
With AWS and DynamoDB, however, NoSQL companies find themselves fighting for the websites and other web-based customers that are now their bread and butter. Sid Anand, who helped transition Netflix from Oracle to AWS’s SimpleDB to Cassandra and who now is on the LinkedIn infrastructure team, wrote on his blog earlier this week that “[i]f [your NoSQL database] is not hosted (e.g. by AWS), be prepared to hire a fleet of ops folks to support it yourself. If you don’t have the manpower, I recommend AWS’[s] DynamoDB.”
It appears some are following his advice. One commenter on a blog post by Apache Cassandra chairman (and DataStax co-founder) Jonathan Ellis detailing the technical differences between Cassandra and DynamoDB wrote, “Cassandra’s tech is superior, as far as I can tell. But we’ll probably be using DynamoDB until there is an equivalent managed host service for Cassandra. Moving to Cassandra is simply too expensive right now.”
And AWS’s DynamoDB is built atop a solid-state-drive infrastructure, which helps ensure predictable performance that isn’t always available if you’re running a NoSQL database on cloud computing instances unless data is stored in-memory. In August, 10gen’s Merriman wrote a brief blog post simply asking “where are the SSDs in the cloud?” Now we know: AWS has them, and, as of now, no one else can use them.
It depends whom you ask
As with most cloud services, at least in their initial incarnations, DynamoDB definitely favors simplicity over lots of features and fine-grained control. Amazon CTO Werner Vogels explains as much in his post announcing the service. If those things are important, users are almost certainly better off choosing a full-featured database.
Ellis’ aforementioned post lays out the reasons one might choose Cassandra. A spokesperson for Basho, which develops the Riak database, sent me a list of three questions everyone should ask when choosing a NoSQL option:
- Is this solution proprietary or open-source?
- Is my data secure? Is the solution fault tolerant?
- What are the querying capabilities for search and indexing?
Basho might very well argue that Riak is superior to DynamoDB on all counts, and CTO Justin Sheehy said via email that Riak runs on any infrastructure and very likely will cost less to run over time. Assuming that’s true, it’s really just an extension of the discussion of the tradeoffs of choosing cloud-based servers or relational databases, now applied to a NoSQL database.
Cloudant CEO Schoettle acknowledges there’s “about 60 percent overlap” between DynamoDB and Cloudant, but he says companies dealing with large data sets and trying to solve complex problems would be better off choosing his company’s hosted CouchDB-based service. While DynamoDB is “essentially a key-value store with a hash methodology,” Cloudant offers integrated search, replication and advanced data analysis capabilities. It also offers SSDs if customers need them.
There also are a handful of hosted MongoDB options available, including MongoHQ and MongoLab, and MongoDB instances are available through a number of IaaS and PaaS providers. DataStax’s Cassandra database is currently in private beta on the Heroku platform.
So perhaps NoSQL vendors really are right to welcome Amazon’s DynamoDB with open arms. “You can perhaps get a little weak in the legs [when you hear you're competing with Amazon],” Schoettle said, but Amazon will go a long way toward educating potential customers on NoSQL, generally. When they realize they need something more, the existing camp of NoSQL will be there to help.
Image courtesy of Flickr user Jo Naylor.