Database startup Citus Data has open sourced a tool, called pg_shard, that lets users scale their PostgreSQL deployments across many machines while maintaining performance for operational workloads. As the name suggests, pg_shard is a Postgres extension that evenly distributes, or shards, the database as new machines are added to the cluster.
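For readers unfamiliar with sharding, the general idea can be sketched in a few lines of Python. This is a generic illustration of hash-based sharding, not pg_shard's actual implementation; the shard count, the worker names and every function name below are hypothetical.

```python
import hashlib
from typing import Dict, List

SHARD_COUNT = 16  # hypothetical number of logical shards


def shard_for_key(key: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a row's distribution key to a logical shard with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def place_shards(workers: List[str], shard_count: int = SHARD_COUNT) -> Dict[int, str]:
    """Assign logical shards to workers round-robin so each holds an even share."""
    return {shard_id: workers[shard_id % len(workers)] for shard_id in range(shard_count)}


def worker_for_key(key: str, placement: Dict[int, str], shard_count: int = SHARD_COUNT) -> str:
    """Find the worker that stores the row with the given key."""
    return placement[shard_for_key(key, shard_count)]


if __name__ == "__main__":
    placement = place_shards(["worker-1", "worker-2", "worker-3", "worker-4"])
    print(worker_for_key("customer:42", placement))
```

Because rows map to a fixed set of logical shards rather than directly to machines, adding a machine in this sketch only means computing a new placement and moving some shards; the key-to-shard mapping itself does not change.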
Earlier this year, Citus developed and open sourced an extension called Cstore that lets users add a columnar data store to their Postgres databases, making them more suitable for interactive analytic queries.
It’s all part of a move to transition Citus Data from being just another analytic database company into a company that’s helping drive advanced uses of Postgres, co-founder and CEO Umur Cubukcu said. Citus launched in early 2013 promising to let Postgres users use the same SQL to query Hadoop, MongoDB and other NoSQL data stores, but has come to realize that its customers aren’t as excited about those capabilities as they are enamored with Postgres.
As Postgres undergoes something of a renaissance among web startups (it’s also the database foundation of PaaS pioneer Heroku and its managed database service), Cubukcu thinks there’s a big opportunity to provide tooling that lets developers take advantage of everything they love about Postgres without having to worry about outgrowing it or bringing on another database to handle their analytic workloads.
The NoSQL connectivity is still there, but Cubukcu acknowledges that running analytics on those workloads might be a job best left for the technologies (e.g., Spark) focused on that world of data.
And whether or not pg_shard or Citus Data is the ultimate answer for scale-out Postgres, Cubukcu is definitely onto something when he talks about how the narrative around SQL and scalability has changed over the past few years. His company’s work, along with that of startups such as MemSQL and Tokutek, and open-source projects such as WebScaleSQL and Postgres-XL, has shown that SQL can scale. Developers no longer have to trade relational capabilities for the scale of NoSQL.
Rather, Cubukcu thinks the new tradeoff is between open-source ecosystems and proprietary software as companies try to scale out their relational databases. At least when it comes to Postgres, he said, “Our take is, ‘You don’t have to do this.’”