Keep those datasets in sync

ClusterHQ rakes in $12M to make containers play nice with data

Big data startup ClusterHQ sees a lot of opportunity in container and database technology and, with a $12 million Series A funding round that the company plans to announce on Thursday, it has a nice chunk of cash to pursue it.

ClusterHQ’s flagship technology is its open-source Flocker project, which the company released in August. Flocker aims to let users load datasets into containers inside the Docker Hub so that the data stays in sync with the application or its components, which are also stored in containers.

A developer would use Flocker to store, inside Docker containers, the kinds of datasets that power stateful services: essentially the databases, message queues or key-value stores that need constant updating to keep the application supplied with the most reliable data.

Currently, applications built with Docker containers can be connected to the type of datasets used for stateful services, but those datasets have to be hosted outside the Docker environment, explained ClusterHQ CEO Mark Davis. Because of this, the Docker Hub “doesn’t know anything about [those datasets]” and there is “no notion of picking up the external service and moving it around” as one can with the application components that are stored in containers.

ClusterHQ team

The upside of keeping these constantly updated datasets outside of Docker is that if something causes the application to falter, the dataset doesn’t go down with the rest of the system and any transmitted data is retained.

What ClusterHQ wants to do is make these datasets as portable as the application components housed in containers, so that when the two are deployed together, the application is faster and more responsive to the user. Flocker’s technology, powered by the Sun Microsystems-developed Zettabyte File System (ZFS), can supposedly replicate changes across containerized databases and create backups in case something breaks.

“We are all trying to get to a highly scalable world,” said Davis. “We want to get to the point where we don’t care about individual services and application services.”

The Bristol, England-based company currently has 17 employees, a number Davis said he wants to double “as fast as we can” with the investment round. Davis, a Silicon Valley veteran, plans to eventually open a ClusterHQ office in the Bay Area so the company can be closer to the enterprise infrastructure landscape and in contact with companies like Docker and CoreOS, as well as proponents of technologies like Google’s Kubernetes and Apache Mesos.

Accel Partners London drove the funding round along with Canaan Partners and previous investors. Kevin Comolli of Accel Partners will take a seat on ClusterHQ’s board.