17 Comments

Summary:

The discussion around NoSQL seems to have evolved from abolishing SQL databases, to coexisting with them, to SQL actually regaining momentum. Is SQL regaining favor, even among webscale types? Was it ever out of favor?

We saw evidence of this momentum shift back to SQL-based databases this week, with Facebook’s Jonathan Heiliger joining the advisory board of clustered SQL startup Clustrix. Facebook famously invented the NoSQL Cassandra database but still relies on the venerable MySQL-plus-memcached combination for the bulk of its critical operations. Additionally, Xeround now offers a scalable MySQL database on Amazon EC2, and database guru Michael Stonebraker recently launched his latest SQL-based startup, VoltDB. Will a scalable SQL option always win out against a NoSQL option? Even for unstructured data?

Once we’re no longer talking about serving data, but rather just about storing large volumes of it, NoSQL can seem nearly obsolete. For organizations willing to pay for data warehousing and analysis tools, the options are limitless: massively parallel software, data warehouse appliances, distributed file systems, and the list goes on. Pick your poison. Have lots of unstructured data to analyze and don’t want to pay for software? Try Hadoop. Plus, it might very well work with your existing data management software.

None of this is to say that NoSQL databases aren’t quality options. They actually vary greatly in terms of ideal uses. Aside from Membase, projects like Cassandra, CouchDB, MongoDB and Riak are maturing fast and gaining in popularity. But they’ve also been the cause of some noteworthy outages of late. Perhaps these are just growing pains, but try telling that to most CIOs.

It’s a case of familiar versus unfamiliar, and the voices backing a better version of the status quo are getting louder. It will be tough, but not impossible, for NoSQL to be heard.

Image courtesy of Flickr user popculturegeek.com.

  1. Perhaps what is actually needed is a database with multiple “personalities”: one that can appear to be a set of relational tables and be queried using SQL when needed, but can also behave as a NoSQL database, so you get the best of all possible worlds. Preferably it would be something tried and tested in the real world, to please those CIOs. Perhaps something like this: http://www.mgateway.com/docs/universalNoSQL.pdf

  2. Derrick,

    I think that what we’re witnessing is more of a convergence of trends rather than a split into two camps.

    In other words, ideas from the NoSQL world, such as decentralized data structures, relaxed consistency, dynamic schemas and map-reduce, are starting to merge with the semantics of SQL.

    This reminds me of the emergence of object databases back in the ’90s. When OO came into the world, we started with specialized OO databases, but in the end it was the O/R mapping tools that won the war. This is very similar to what we can already see with Google, which is using a JPA facade on top of its Bigtable datastore; the same goes for Hive and Hadoop, and so forth.

    So the right strategy would be to have a NoSQL back end with a SQL front end on one side, and SQL engines starting to add support for dynamic data structures, map/reduce and decentralized deployment on the other.

    You can find more details on that analysis in my post YeSQL (http://www.dzone.com/links/r/yesql_an_overview_of_the_various_query_semantics.html) and in a new one that I just published yesterday, NoCAP.

    Nati S
    GigaSpaces

  3. Hi Derrick,

    Very interesting analysis. Trends are always coming and going; in the end, the good technologies stick.
    If SQL scalability interests you, I suggest you take a look at ScaleBase, which built the first database load balancer, allowing any SQL database to reach unlimited scaling.

  4. EMC’s new appliance, built on general-purpose Intel Westmere EP processors and Greenplum technology, is also a product that supports the view that SQL isn’t going away. Oracle’s Exadata, Teradata’s 2580 and Netezza’s TwinFin appliances also cater to the big data market with SQL databases. So far, NoSQL seems to be a niche play with mostly support- and consulting-based business models.

  5. Just make better connections to articles that have been posted on GigaOM.

    Previously, it was all about 2015 being the year that machine data explodes past user-generated data (look back at your IDC report coverage). And it is an actual exponential explosion.

    Think about that for a minute. All Google does is gather user-generated data over HTTP, archive it, index it and make it available. But a single, easily accessible machine like a modern DNA sequencer, in the hands of a small number of scientists, could right now exhaust the entire storage device market of 2010 (not the enterprise storage market, ALL storage devices made) entirely on its own.

    So what we’re looking at now is not SQL vs NoSQL.

    What we’re seeing now is all the swirly mess that goes along with people actually thinking about their data over the next few years, and how it should be stored, accessed and relayed.

    We’re experimenting with different ways of storing data. Different ways of accessing it. And I don’t know anyone who’s completely replacing SQL with “NoSQL”. Everyone is doing both. Everyone is doing all of it. Google is a great example of a company that has MySQL and in-house solutions across the whole “NoSQL” market.

    Everyone is starting to think about how to put the right data in the right spot for the right reasons, rather than simply using MySQL because it was already installed or Oracle because it’s “enterprise”.

  6. It doesn’t matter if NoSQL will ultimately “win” or not. If NoSQL made the “regular” SQL guys stop sitting on their asses and start thinking about problems they left untouched for decades, that’s good enough :)

  7. A couple thoughts:

    (1) In quite a few cases, we are seeing developers prefer MongoDB over a traditional (RDBMS) approach because it makes application development easier: in large part because schema flexibility greatly increases the agility of their development, and also because, for some problems, the data models map more naturally. So it’s worth noting that “NoSQL” is not all about scale: there are other potential benefits. A good test is to do one project with MongoDB and then see if one, as a software developer, wants to go back to the traditional approach.

    (2) It’s pretty clear to me that “one size fits all” is over. We already have products specifically for reporting and BI (Hadoop, Greenplum, Aster, Vertica, Netezza, …). What I think we will see next is a little bit of specialization on the “online” side of the database world: traditional RDBMS for highly transactional problems (such as banking), and NoSQL for rapid development and easy scaling.

    But why not just test drive these things and see for one’s self? Almost everything in the space is open source, so I would encourage developers to kick the tires a bit.

    dwight/mongodb

  8. I would add CloudB (http://code.google.com/p/cloudb) to the list, although it’s a hybrid system that can [ideally] also support SQL databases.

  9. hi, two thoughts

    (1) a lot of people are liking NoSQL (and the one I know about, MongoDB) because it makes application development easier. The flexible schema characteristics and document-oriented model can be quite helpful for speeding development. Thus I think “it’s not all about scale”. Do one project with it and you may not go back!

    (2) The one thing I know for sure is that MongoDB has great momentum and every month is bigger than the last. The run rate is over 1MM downloads/year.

    thanks
    dwight/mongodb

  10. Derrick Harris Sunday, October 17, 2010

    I couldn’t agree more with most of the comments, especially as they relate to a combination of data solutions for different applications. My point with the post (actually, for the original, expanded version on GigaOM Pro) is that we’ve been seeing a SQL resurgence after a period where all we heard was that NoSQL is the future, with scalability often being cited as the primary differentiation.

    @arnon: I think that might be exactly what happened. The question now is whether NoSQL options can get a fair chance if SQL proves itself scalable enough.

  11. “…we’ve been seeing a SQL resurgence after a period where all we heard was that NoSQL is the future, with scalability often being cited as the primary differentiation.”

    This line was, and perhaps still is, popular amongst analysts and journalists, not practitioners. Those building and operating such systems have always recognized that no single database would suit all purposes and that RDBMSes are extremely useful. Hybrid/composite systems are the rule, not the exception, for large-scale services.

    “The question now is whether NoSQL options can get a fair chance if SQL proves itself scalable enough.”

    Many people consistently confuse SQL and RDBMSes. They are not the same thing. A number of systems exist, for example VoltDB, that use SQL but have restrictions on what can be expressed and how. This is to be expected as the RDBMS model implies a number of semantic constraints that are nigh impossible to implement in a performant, reliable manner at extremely large scale. Hence EC systems, sharding, etc.

    SQL, particularly constrained variants of it, is popular in large part because it is familiar. There is nothing magical about it. This alleged conflict between “SQL” and “NoSQL” systems is an illusion born of ignorance, not the reality for builders and operators.

    b

  12. I work for a company that sees an enormous amount of data and traffic (>8m registered users). I am responsible for one of the core data stores. This core store is a mix of MongoDB and MySQL.

    Right from the start, I only shipped code that had been load-tested. By my benchmarks, all SELECT queries using IN worked better on MySQL by a factor of at least 10. There is a need to pick the strengths of both types of databases.

    The problem of making fair assessments is made tougher by the huge number of fanboy articles from each camp.

  13. Very interesting development and realization. As they say, “you get what you pay for,” and that has never been more relevant. NoSQL is a great concept provided you’re aware of the inherent limitations. If reality strikes and relational and transactional capabilities are indeed a mandatory requirement for your application, then NoSQL is probably not for you. On the flip side, if your application can do with optimized variations of map-reduce in a highly distributed environment, then NoSQL may be an option to consider.
    We believe there’s a way to enjoy both worlds…
    Razi Sharir (blog.xeround.com)

  14. [...] GigaOM Will Scalable Data Stores Make NoSQL a Non-Starter? [...]

  15. Hi, Derrick.

    There are a few points in this article that might benefit from a little clarification. I hope that I can help.

    You said that NoSQL “projects like Cassandra, CouchDB, MongoDB and Riak” have been the “cause of some noteworthy outages.” There are two significant oversimplifications here. The first is that failures are generally more complicated than that implies: the reference you link to is about the Facebook and FourSquare outages… but the Facebook outage had nothing to do with anything NoSQL, and the FourSquare outage (while MongoDB was at the center of it) was exactly the kind of poor capacity planning and lack of monitoring/management that also bites many users of sharded MySQL. The second is that your grouping of systems also includes some other databases, Couch and Riak, neither of which has any connection that I know of to any public outages at all.

    The other aspect of the article that I found a bit confusing was the idea that NoSQL was ever about “abolishing SQL databases.” Perhaps you have misunderstood someone saying “NoSQL is the future” as meaning that this was in opposition to the SQL language. Ask nearly anyone building something lumped into the NoSQL category and two things you’re almost certain to hear are:

    • It has little to do with using or not using the SQL language.

    • It wasn’t ever about abolishing anything, but rather about adding things.

    If there is any unifying factor throughout NoSQL, it is the idea that business needs should drive choices — real, differentiated choices — in selecting data management software just as they drive choices in other technology areas. For the few decades preceding the past few years, nearly any new software system would simply choose an “Oracle Style” RDBMS (including MS SQL, Postgres, MySQL, etc) for all data storage and management. The resurgence of many old and well-understood ideas in the form of new database implementations is providing the flexibility that many businesses need today. By running different parts of their data infrastructure on different systems that are well-suited to the tasks, people can manage cost, agility, and availability far better than before.

    If all of those “voices backing a better version of the status quo” just want a faster horse, I am certain someone will oblige them. The rest of us are busy working to build other things that people need. That need is palpable, even based only on the people that come to us at Basho every week asking for help building the systems that weren’t possible before or replacing the ones that are falling down.

    This “non-starter” is already running.

    -Justin

  16. In an ideal world, it would be better to use the same database engine for more uses rather than using a specialized one for each purpose. Clearly NoSQL solutions can’t be widely deployed given their limitations. What, then, should we add to a SQL database to make it more applicable to a wider variety of data sets and workloads? I posted my thoughts on this here:

    http://www.clustrix.com/blog/2010/10/22/sql-is-not-the-problem/

  17. I suppose people in general like a good argument, so there need to be articles like this questioning the viability of NoSQL. It seems that (just like the Java application server growth in the ’90s) this space is quickly evolving and there *will* be a tier of the web/cloud application stack that includes an in-memory data layer. Call it what you will. Let’s let the vendors in this space innovate and customers will drive the direction it takes. I will bet it’s not called “NoSQL” 24 months from now, but whatever it’s called, the players involved will be a part of it.
