Qubole, the cloud-based Hadoop service launched by Hive creators Ashish Thusoo and Joydeep Sen Sarma in 2012, is now offering users access to Presto, Facebook’s system for interactive SQL queries on data stored in Hadoop. Facebook first announced it had created Presto in June, and then open sourced the technology in November.
The type of capability Presto provides — fast, interactive SQL queries on Hadoop — has been all the rage over the past year. It has become so popular because people got tired of having to move data to and from Hadoop to query it using analytic SQL databases, and they didn’t want to wait for the MapReduce-based Hive to return results when timeliness is an issue.
It’s actually difficult to keep track of how many SQL-on-Hadoop offerings are in the market right now, but here’s a short list of companies selling products or spearheading projects in this space: Cloudera, Hortonworks, MapR, Pivotal, IBM and Hadapt. Here is a now-outdated, but still kind of useful list that shows the market for this capability as of about this time last year. One company that I know of, Drawn to Scale, isn’t around anymore.
I wrote at the time Facebook open sourced Presto that it could potentially have a material impact on the commercial SQL-on-Hadoop market because it’s both available and has already been proven at scale. Qubole offering Presto as a service only underscores that opinion, although it is hardly definitive proof given Qubole’s startup stature and relatively small (but growing) user base.
Still, the idea is sound: There are a lot of good open source Hadoop innovations coming out of companies such as Facebook, Twitter and LinkedIn, and it’s probably a good idea for companies selling Hadoop to use these technologies to their advantage where it makes sense. But don’t take my word on what’s good for the Hadoop market: Come to Structure Data and ask the CEOs of Cloudera, Hortonworks and Pivotal yourself.