If storage vendors thought they were immune to the disruption that open source hardware is bringing to the server industry, Netflix’s new Open Connect content-delivery network might make them think again. While Open Connect directly targets commercial CDNs, it’s based upon (or at least inspired by) open source storage designs first released by Backblaze almost three years ago. That Backblaze’s design has evolved and expanded its range into the data centers of a Fortune 1000 company is significant in the same way the evolution of modern man was for Neanderthals.
By way of background, Backblaze is a cloud storage provider focused solely on backing up lots of data for cheap (like $5 a month for unlimited capacity cheap). In order to do that, it had to build a storage system that could hold massive amounts of data without breaking the bank. As of last July, Backblaze’s architecture had evolved to a point where a 135TB pod cost less than $7,400 to build from scratch.
Understandably, the architecture generated a lot of interest from companies and organizations wanting to leverage it to soothe their own IT budgets, but none of them are Netflix. EMC’s Pat Gelsinger said recently that the storage component of Facebook’s Open Compute Project, called Open Vault, isn’t yet ready for primetime because nobody is running (or would run) mission-critical workloads on it. That might be true of Open Vault today, since the project just launched earlier this year, but it likely won’t be for long. If you consider a CDN that serves Netflix streaming video mission-critical, the criticism is already invalid for Backblaze’s designs as Netflix has adapted them.
It’s worth noting, too, that open source hardware isn’t the only piece of the stack threatening legacy storage vendors such as EMC. Someone experienced in building out large-scale cloud infrastructure suggested to me recently that the Hadoop Distributed File System has the potential to become the default file system for large infrastructures once it works out some of its limitations around performance and availability. One of the biggest of those limitations, the NameNode as a single point of failure, has been addressed in the latest version of Apache Hadoop, and the fix is already integrated into Cloudera’s new CDH4 release.
Can storage deal with the open source disruption?
As with Open Compute’s effects on the server industry, though, open source storage doesn’t need to spell doom for legacy vendors if they’re willing to adapt. One reason is that, at least in the short term, there are still plenty of customers that don’t operate at Facebook or Netflix scale and can afford to pay a premium on smaller deployments that offer the features (and vendor support) those customers demand.
If server shipments tell us anything, though, it’s that the rise of cloud computing and web giants will ultimately take a toll on the storage market, too. Fewer, but very large, customers will be responsible for a greater percentage of sales, and they won’t necessarily want all the bells and whistles that make enterprise storage products so expensive. And if VMware is correct, even mainstream enterprises will soon want to follow the examples of web giants like Google and Facebook by running relatively dumb hardware managed by really smart software.
If this scenario plays out, storage vendors will have to reassess how they deliver value and earn their money. That might mean adopting open source designs in their own gear while shifting their focus a lot more heavily toward software and services, or perhaps unlocking their storage-management software from the hardware and certifying it to run on open source gear.
Perhaps we’ll get some ideas for what the future storage market looks like at our Structure conference June 20 and 21, where we’ll dive into the topic with Facebook’s Frank Frankovsky, Netflix’s Adrian Cockcroft and VMware’s Steve Herrod. Whatever the case, it looks like something will have to give.
Feature image courtesy of Shutterstock user Zadorozhnyi Viktor.