Cloud databases 101: Who builds ’em and what they do

Remember when there were just two or three cloud computing platforms to choose from, and just about as many cloud databases? Well, as clouds have proliferated, so have the database services built on top of them. In fact, it’s getting hard to keep up with what’s actually available.

Here’s a primer highlighting the available services (note, we’re talking managed database services, not database instances that users still need to manage and administer) and where they’re running. It’s intended to be thorough, but that can be easier said than done, so please note any omissions in the comments.

SQL services

  • Amazon Relational Database Service: One of the first cloud database services, Amazon Web Services’ RDS is now one of the most complete, too. Like most AWS (s amzn) services, it’s tied into the AWS management interface and is compatible with a large majority of AWS’s countless other cloud computing services. Initially just an AWS-hosted and -managed MySQL service, RDS now lets users choose Microsoft SQL Server and Oracle Database, as well.
  • Clustrix Database as a Service: Database vendor Clustrix just got into the cloud game on Wednesday, but it came to play. Its service, which runs on the Rackspace Cloud, gives users the high performance of solid-state drives, the peace of mind of single-tenant deployment, and the scalable MySQL capabilities of its flagship on-premise product. The company claims it’s suitable for both OLTP and OLAP applications, and that it monitors system health across more than 2,500 metrics.
  • EnterpriseDB Postgres Plus Cloud Database: EnterpriseDB is the primary company commercializing the PostgreSQL database, and this is the cloud-based version of its flagship Postgres Plus offering. Targeting enterprise developers more than weekend hackers, Postgres Plus Cloud includes features such as high-availability clusters, high connection counts and compatibility with Oracle environments.
  • FathomDB: Some GigaOM readers might remember FathomDB as the partner that was supposed to give Rackspace a chance to compete against AWS’s then-new RDS. Well, times have changed. FathomDB still exists, but it has open-sourced its original technology to help developers build anything as a service and currently isn’t offering a hosted database service. However, the company claims to be working on a next-generation database service, so stay tuned.
  • Google Cloud SQL: It’s not the most feature-rich database around, but Google Cloud SQL does have its benefits. For one, it’s integrated with the rest of Google’s cloud services for easy interaction. And, as is Google’s (s goog) claim to fame in the cloud, Cloud SQL is geographically replicated for maximum availability. Currently, though, it only supports Java and Python applications, and instances are limited to 10GB in storage capacity.
  • Heroku Postgres: Heroku Postgres is the public-facing implementation of platform-as-a-service darling Heroku’s (s crm) internal PostgreSQL database. It’s designed for reliability and data protection — Heroku claims 99.99 percent uptime and a design targeted to hit 99.999999999 percent data durability — and tries to bring the Heroku experience to developers that can’t use its PaaS offering. One of its more interesting features is called Data Clips, which lets users send the results of a SQL query to someone else via a URL.
  • HP Cloud Relational Database for MySQL: What is there to say about this service that the name doesn’t already? For starters, it’s presently in private beta, so there’s still a lot of work to be done and a lot of features to be added. It’s also built atop an OpenStack-based MySQL distribution, which, in theory, should make it easier to move one’s database business from cloud to cloud if need be.
  • IBM SmartCloud Application Services: Like HP (s hpq), IBM’s (s ibm) cloud database is still very much a work in progress. Details on specific features are sparse right now, other than that the service is based on IBM’s DB2 Server technology and is part of the SmartCloud Application Services (read “PaaS”) offering that’s currently in a pilot phase.
  • Microsoft SQL Database: Formerly known as SQL Azure, SQL Database is a critical component of Microsoft’s (s msft) new focus on hybrid cloud computing. Yes, it can operate as a standalone cloud database, but it also provides a shared user experience with Microsoft SQL Server and allows for data sharing with on-premise SQL Server databases. There’s also an option for syncing between other SQL Database deployments elsewhere within a company’s cloud infrastructure.
  • Oracle Database Cloud Service: It’s not for everyone, but existing Oracle database users that want a cloud-hosted option certainly should appreciate the Oracle Database Cloud Service. After all, it claims all the features and performance of Oracle Database 11g Release 2, of which there are a lot. Pricing isn’t made clear, but it’s a monthly rate based on the size of your database, with no long-term contracts.
  • Rackspace Cloud Databases: The latest addition to Rackspace’s (s rax) line of cloud offerings, Cloud Databases is the first built from its inception atop the OpenStack platform. The service is still in early access mode, so users won’t get SLAs or a host of features (such as monitoring, backups or a GUI) that are slated for the GA edition, but they will get promises of high performance and reliability thanks to the service’s container-based virtualization and storage-area network-based architecture.
  • Xeround: Save for Amazon RDS, Xeround might be the most-popular cloud database around. It’s also the most flexible in terms of where it can be deployed — the MySQL service can run atop pretty much any public cloud, including AWS, Rackspace, Joyent, Heroku … you get the picture. Xeround claims auto-scaling as one of its primary strengths and is architecturally unique in that it’s essentially a MySQL frontend atop a foundation that theoretically could support a variety of database options.
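To put an availability claim like the 99.99 percent uptime quoted for Heroku Postgres in perspective, here is a rough back-of-the-envelope sketch in Python that converts an uptime percentage into the downtime it allows per year. It assumes a 365-day year, and the function name is a hypothetical helper, not anything a vendor publishes.

```python
# Assumption: a non-leap, 365-day year.
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(uptime_percent: float) -> float:
    """Return the yearly downtime, in minutes, implied by an uptime percentage."""
    return (1 - uptime_percent / 100) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Compare a few common "nines" levels side by side.
    for claim in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(claim)
        print(f"{claim}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

By this arithmetic, 99.99 percent uptime leaves room for a little under an hour of downtime a year.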

NoSQL services

  • Amazon DynamoDB: DynamoDB is AWS’s managed NoSQL service based upon the original Dynamo key-value data store the company developed years ago for its internal purposes. Designed for web or big data applications needing fast access to data and potentially having to scale in a hurry, DynamoDB is built atop an SSD architecture and scales automatically as data is added to the system.
  • Amazon ElastiCache: It’s not technically a NoSQL service, but ElastiCache does fulfill a similar need by giving developers managed Memcached to make sure their MySQL deployments are serving user data as fast as possible. Memcached is used by many web applications, including Facebook (s fb), that are built upon disk-based relational databases but want to keep certain data in an in-memory cache.
  • Cloudant: Although it’s based on the open source CouchDB database, Cloudant doesn’t call itself a NoSQL service per se, but rather a Data Layer. Built across a collection of cloud-provider resources spanning the globe, it offers not only a predictably scalable NoSQL data store, but also a built-in MapReduce analytics engine. That’s one reason agribusiness giant Monsanto uses Cloudant to underpin its genomics infrastructure.
  • Database.com: Salesforce.com’s standalone database service, Database.com, isn’t exactly NoSQL, but it isn’t exactly a relational database, either. What it is for sure is the same multitenant database architecture that has been underneath Salesforce.com’s CRM service and Force.com platform for years. It stores a variety of data types, including the unstructured variety, and is designed for (although not limited to) applications tying into existing Salesforce.com services.
  • Microsoft Windows Azure Table Storage: This is the NoSQL data store for Windows Azure, which is designed for easily querying terabytes of non-relational data. Because it’s part of the overall Windows Azure Storage family, though, total database size is limited to 100TB overall (Blob, Table and Queue storage) per account.
  • MongoHQ/MongoLab: MongoDB is by far the most-popular NoSQL database around, but it can be a bear to manage in the cloud. As a result, there are numerous hosted MongoDB services around, although MongoHQ and MongoLab are probably the most widely known. The pitch for both is simple: fast deployment, thorough monitoring and reliability you probably can’t achieve yourself. Both services try to appeal to a broad range of users with both shared and dedicated offerings.
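The caching approach described in the ElastiCache entry above, keeping certain data in memory in front of a disk-based relational database, is commonly called cache-aside. The following is a minimal sketch of that pattern, not ElastiCache’s or Memcached’s actual API: plain Python dicts stand in for both the cache and the database table, and names such as `get_user`, `FAKE_DATABASE` and `stats` are hypothetical.

```python
from typing import Dict, Optional

# Stand-in for a table in a disk-based relational database (hypothetical data).
FAKE_DATABASE: Dict[int, str] = {1: "alice", 2: "bob"}

# Stand-in for an in-memory cache such as Memcached.
cache: Dict[str, str] = {}

# Counts how often a lookup falls through to the "database".
stats = {"db_reads": 0}


def query_database(user_id: int) -> Optional[str]:
    """Simulate the slow path: a read against the relational database."""
    stats["db_reads"] += 1
    return FAKE_DATABASE.get(user_id)


def get_user(user_id: int) -> Optional[str]:
    """Cache-aside read: try the cache first, then the database."""
    key = f"user:{user_id}"
    if key in cache:
        return cache[key]          # cache hit: no database work
    value = query_database(user_id)
    if value is not None:
        cache[key] = value         # populate the cache for later reads
    return value


if __name__ == "__main__":
    get_user(1)  # miss: reads the database and fills the cache
    get_user(1)  # hit: served from memory
    print("database reads:", stats["db_reads"])
```

A real deployment would also need to expire or invalidate cached entries when the underlying rows change, which this sketch leaves out.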

Feature image courtesy of Shutterstock user Oleksly Mark.

17 Comments

ScaleGrid

Derrick, also check out ScaleGrid (http://www.scalegrid.net). While everyone is offering Database as a service on the public cloud, ScaleGrid offers it on the private cloud. It supports VMware, SCVMM, CloudStack and OpenStack.

PS: I am one of the founders of ScaleGrid

ObjectRocket

Nice writeup Derrick.

Also, ObjectRocket (http://www.objectrocket.com) provides a unique spin on MongoDB hosting by building out infrastructure specifically geared towards MongoDB performance and availability.

Full disclosure, I am a founder at ObjectRocket Inc. ;-)

Dev

Derrick – Great article on cloud databases.

Question: Are you aware of any market research on the future market share of cloud databases vs. on-premise installs, as well as new entrants vs. established players like Oracle and Microsoft?

There seems to be a lot of talk about database as a service in Silicon Valley but from what we see working day to day with mid-enterprise companies, this is still a very small market.

Derrick Harris

You’re probably right on that. Like most cloud computing services (esp. IaaS and PaaS), I think database services are used primarily for new apps and are nowhere near replacing Oracle, SQL Server or DB2 in most companies. As more apps move to the cloud, though, that should change.

Matt Aslett

451 Research has been looking at the emerging options in this space. Given many of the ‘cloud database’ providers are in early adopter/free beta phases it is a little difficult to predict overall cloud database vs. on-premises share at this stage.

However, one of our findings was that initial adoption of cloud-based MySQL-as-a-service offerings might have been focused on development-and-test environments and new apps, but increased competition, lower prices and more entry-level offerings – combined with increased confidence in the cloud itself as a platform for mainstream applications – is likely to increase the adoption of DBaaS as a complement to traditional on-premises deployment.

One data point: we estimated that MySQL-as-a-service providers, including Amazon and Google, accounted for just 7% of all MySQL ecosystem revenue ($171m in total) in 2011. That figure is expected to rise to 22% (of $664m) by 2015.

More details here: http://blogs.the451group.com/information_management/2012/05/22/mysql-nosql-newsql/

Dev

Matt — A very interesting study and a set of findings.

One question: How would you estimate the current and future presence of Microsoft SQL Server vs. MySQL within mid-enterprise companies?

Your findings state that “MySQL was once the default database for new Web applications.” From what we see, MySQL’s leadership only applies to SMBs and startups. In the world of SMEs and mid-tier companies, Microsoft stack/Microsoft SQL Server remains a platform of choice for new Web applications.

I think it would be very interesting to see a study that incorporates not just MySQL vs. NoSQL but also Microsoft SQL Server vs. MySQL vs. NoSQL vs. NewSQL, especially as Microsoft is getting ready to integrate Big Data/Hadoop into its SQL Server offering.

Derrick — Would love to get your insights on this as well.

Mason

Hi Derrick, I will take you up on your offer to mention others in the comments. :-) I’d like to point out StormDB, although it is still in (free) beta.

It uses a traditional SQL interface (having a PostgreSQL-based heritage), is fully ACID and, unlike some of the solutions here which are limited to just one virtual machine or server, StormDB scales out across multiple servers, all while running on bare metal (no VMs).

Also unlike some of the other solutions that offer either only read scalability or write scalability, StormDB offers OLTP read and write scalability and BI/DW scalability with MPP parallelism.

Kent Abendroth

great summary for the storage sales guy like me.. just curious and showing my limited insight but.. here to learn.. where does Cassandra fit into this summary?

Derrick Harris

Cassandra is a NoSQL database. Some people call those “cloud” databases, but more so because of their distributed architecture. As far as I know, no one’s doing a hosted version of Cassandra, although DynamoDB is probably the closest thing.

Cashton Coleman

Derrick, I’m a little surprised that you didn’t add ClearDB to that list (disclaimer, I work there). After all, ClearDB runs MySQL for Microsoft on Windows Azure and has a higher uptime rating than just about every vendor on that list, due to our multi-regional design.

Even GigaOm’s @GigaBarb wrote a post that includes what we’re doing on Azure (http://gigaom.com/cloud/microsoft-paints-azure-with-open-source-brush/).

I encourage you to have a look at ClearDB, if you haven’t already: http://www.cleardb.com

Derrick Harris

Sorry. But, like I said, I figured I’d miss a few — it’s a big market now. That’s what the comments are for; we don’t have to wait for the next edition ;-)

Cashton Coleman

It is a big market, isn’t it? Lots of great companies tackling lots of various database challenges. It’s going to be very exciting to see how the cloud database as a service market grows and changes over the next 16-24 months.

Steve Ardire

Derrick you covered SQL, NoSQL but not RDF cloud databases so let me help

RDF Cloud databases and semantic web
http://goo.gl/izJgg

How Google and Microsoft taught search to “understand” the Web | Ars Technica http://arstechnica.com/information-technology/2012/06/inside-the-architecture-of-googles-knowledge-graph-and-microsofts-satori/

Google’s Knowledge Graph derives from Freebase, a proprietary graph database acquired by Google in 2010 when it bought Metaweb.

Microsoft’s Satori (named after a Zen Buddhist term for enlightenment) is a graph-based repository that comes out of Microsoft Research’s Trinity graph database and computing platform. It uses the Resource Description Framework and the SPARQL query language, and it was designed to handle billions of RDF “triples” (or entities).
