Are You Ready for Open-Source Hardware?

According to chaos theory, in a giant system that has lots of interconnections, even the smallest action can have a massive impact. It’s more simply described as the butterfly effect. This theory has taken its toll on the software business, thanks to the rise of open-source software platforms. Today, I learned about a move by Backblaze, a small San Francisco-based online backup service, that could cause a similar disruption in the storage industry.

The company, whose primary business is selling online storage to consumers for a small monthly fee, announced today that it’s giving away the design of its storage cluster for anyone to use, modify and build upon. The design allows anyone to build large storage clusters, from a few terabytes to over a petabyte. What’s so disruptive about this? What if I told you that you could build a petabyte-sized cluster for around $120,000?

Now compare that to a couple of million dollars via a storage company like EMC Corp. (s EMC) or a server maker such as Sun Microsystems (s SUNW). The image below actually does a much better job of making a comparison between the Backblaze solution and other commercial storage options.

[Image: costofapetabyte.gif]

Actually, if this works, companies like NetApp (s NTAP) and EMC could be in trouble. Just as Linux slowly eroded the premiums charged by the likes of Sun, these storage giants could see their businesses suffer. As the IT world transitions to cloud-based computing, the need for web-scale storage systems is going to increase. Google, for instance, has shown that you can build gigantic storage systems out of commodity parts and smart software.

In their GigaOM Pro report (subscription required), “Will Storage Go the Way of The Server?,” analysts Juergen Urbanski and George Gilbert pointed out that: “The long-term future of storage is about smart software that manages a large pool of cheap interchangeable hardware. Despite being one of the fastest growing technology sectors in terms of capacity, the economics for many participants are deteriorating.”

Looks like Backblaze just wants to accelerate that by giving away its designs. “We are hoping that people who are in the hardware business will take our design and build devices by improving on our design and in turn selling large amounts of storage at an affordable price,” Gleb Budman, CEO and co-founder of the company, told me during a conversation earlier this morning.

“At a fundamental level, we are a software company and we don’t want to build hardware,” he said. The company had to build its own mousetrap because it didn’t have much of a choice. The Amazon S3 offering wasn’t a feasible option, and the high-end systems from NetApp and EMC were way too expensive. Today the company has about a petabyte and a half of storage space, putting it in the league of Facebook’s photo storage system.

[Image: backblazecluster.gif]

At Backblaze, we provide unlimited storage to our customers for only $5 per month, so we had to figure out how to store hundreds of petabytes of customer data in a reliable, scalable way, and keep our costs low. After looking at several overpriced commercial solutions, we decided to build our own custom Backblaze Storage Pods: 67 terabyte 4U servers for $7,867.

A Backblaze Storage Pod is a self-contained unit that puts storage online. It’s made up of a custom metal case with commodity hardware inside. Specifically, one pod contains one Intel Motherboard with four SATA cards plugged into it. The nine SATA cables run from the cards to nine port multiplier backplanes that each have five hard drives. [The Backblaze Blog]
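The numbers quoted above are easy to sanity-check. The following is a rough back-of-the-envelope sketch in Python, not anything Backblaze published: it uses only the figures from the quote (67 TB and $7,867 per pod, nine backplanes of five drives each) and assumes decimal units, so that 1 PB = 1,000 TB. The function names are made up for illustration, and the result covers raw pod hardware only, with no allowance for redundancy, power, space or labor.

```python
import math

# Figures quoted in the post.
POD_CAPACITY_TB = 67        # "67 terabyte 4U servers"
POD_COST_USD = 7_867        # "for $7,867"
BACKPLANES_PER_POD = 9      # nine port multiplier backplanes
DRIVES_PER_BACKPLANE = 5    # five hard drives on each backplane

# Assumption (not stated in the post): decimal units, 1 PB = 1,000 TB.
TB_PER_PB = 1_000


def drives_per_pod() -> int:
    """Number of hard drives in a single pod."""
    return BACKPLANES_PER_POD * DRIVES_PER_BACKPLANE


def tb_per_drive() -> float:
    """Implied raw capacity of each drive, in terabytes."""
    return POD_CAPACITY_TB / drives_per_pod()


def cost_per_tb() -> float:
    """Raw hardware cost per terabyte, in dollars."""
    return POD_COST_USD / POD_CAPACITY_TB


def pods_for(target_tb: float) -> int:
    """Pods needed to reach target_tb, rounded up to whole pods."""
    return math.ceil(target_tb / POD_CAPACITY_TB)


def hardware_cost(target_tb: float) -> int:
    """Raw pod hardware cost in dollars for target_tb of capacity."""
    return pods_for(target_tb) * POD_COST_USD


if __name__ == "__main__":
    print(f"Drives per pod: {drives_per_pod()}")
    print(f"Implied TB per drive: {tb_per_drive():.2f}")
    print(f"Cost per TB: ${cost_per_tb():.2f}")
    print(f"Pods for 1 PB: {pods_for(TB_PER_PB)}")
    print(f"Hardware cost for 1 PB: ${hardware_cost(TB_PER_PB):,}")
```

Under those assumptions, a pod holds 45 drives, and a petabyte takes 15 pods costing $118,005, which lines up with the “around $120,000” figure mentioned earlier.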

Is this the perfect solution for everyone? Who knows? Gary Orenstein, a storage industry veteran and a guest columnist for us, points out that the company has done a clever job of building a nice, lightweight wrapper around a bunch of drives to produce a system that is fine-tuned for Internet-driven uploads and downloads. Orenstein, who was co-founder of storage startup Nishan Systems, says that the big challenge with a system like this is managing drive failures over time and developing on the platform.

Nevertheless, this move by Backblaze is interesting because it addresses the current logjam in the hardware business. At our Structure 09 conference, Facebook’s VP of engineering, Jonathan Heiliger, lamented how the chip industry and hardware makers fail to address the needs of the big spenders: web companies. Facebook had to build its own mousetraps to meet its specific needs. If your startup has open-source hardware designs that meet the needs of today’s web-based businesses, you can easily do an end run around the incumbents.

Budman said that the company is giving away the design without any licenses because it doesn’t want anything to come between the design and cheap storage. As long as companies keep innovating, the company is happy with the karma points it notches up. “We hope that the people actually contribute back to the community created around this hardware design,” he said.

Will Backblaze’s big dream come true? It’s hard to say. What is safe to say is that if more companies start contributing their hardware designs in an open source manner, we can expect to see more innovation in the hardware business. So far, hardware innovation has been hampered by the high costs that go hand in hand with innovative products. A few more open-source hardware designs and we’ll soon see tinkering minds get to work.

I would recommend you read our research report, The Future of Data Center Storage. ($79-a-year subscription required.)

29 Responses to “Are You Ready for Open-Source Hardware?”

  1. Roger Weeks

    I wonder what the power and cooling requirements for these shelves are. How “green” are they? I don’t see how you’d replace a defective drive without taking a whole shelf offline, either.

  2. I’ve read this article as well as their blog and had a few questions:

    * Even with 6 fans, I wonder how much heat each box generates and whether 6 fans are enough to cool the system.
    * From the vimeo video as well as the flickr snapshots, I noticed that the disks are very close to one another. I wonder how easy maintenance is for these types of setup. I also didn’t notice a way for the drives to be ejected easily (in case of drive failure).

    Great article and product / idea though. I’d be very interested to learn more how they handle outages (if any)

  3. Interesting hardware design and it’s good they are publishing the details. However, it would be a lot more useful if they would also distribute the open-source stack they are using, including its configuration: Linux, JFS, etc.

    And of course they are not giving away the real secret sauce referred to at the end: the higher level software stack that maps a backup request into specific encrypted blocks on a storage server, including de-duplication, incremental storage, etc.

  4. This is really an interesting topic. Specifically, I find it fascinating to witness the adoption curve of these sorts of things.

    I did some work for Vyatta (enterprise grade routing & security on Intel hardware) on business development into the ISPs and service providers (similarly, large consumers of networking as well as storage). It quickly comes down to a few things:

    – Customers will not be early adopters simply because of disruptive economics; the entire solution needs to be there before they jump… as in all the esoteric features which make the products truly deployable & manageable.

    – Despite closed hardware/systems being ridiculously expensive on a $$/horsepower basis, users have grown accustomed to their “form” and have a hard time getting past this sort of “if it walks like a duck, it must be a duck” mentality. For example, because servers typically use hard drives for storage (even if RAID/redundant), people have a hard time considering them for use as a networking utility… because routers don’t have drives! It’s all psychological, as often a router sits next to a server with an equally mission critical function running on it.

    – Not surprisingly, this “nobody gets fired for using Cisco, EMC, NetApp” phenomenon leads “bigger” customers to be slower… opens the door for groups like Backblaze to disrupt…

    From what I have seen firsthand @ big customers, there is a ton of opportunity to build “closed-open hardware appliances” for these sorts of open solutions, i.e., adaptations of Intel hardware into form factors which more closely resemble their closed-system counterparts.

    • Hahnfield

      It is clear that Backblaze expects some ODMs in Asia or other new hardware start ups to build on their design so that more people will have a chance to buy big storage cheaper.

      The way the company described to me, they are giving away the whole shebang including the code and SDK so people can build on the whole system. I think that is what makes this more disruptive than usual.

      Thanks for your awesome comment. Enjoyed learning from you.

      • I just wanted to clarify that we are giving away the complete design for the Backblaze Storage Pods (which is the hardware plus the software stack that brings it online) but not an SDK or the code for the online backup service itself.

        Appreciate the perspectives and comments,

        Gleb Budman
        CEO, Backblaze

    • The big part missing from the analysis is support: NetApp, Sun (with ZFS and storage products) and others will all deliver bugfixes with some sort of SLA, as well as hardware support, none of which you get with a build-your-own approach. Backblaze provides its own hardware and software support, which makes sense as its volumes are so enormous. Those with less need for storage may find commercial offerings are better.

      This is also why people pay for Red Hat Enterprise Linux (RHEL) support when they can get virtually identical software (down to the bugs and fixes) from CentOS, who rebuild every release of RHEL from source. Part of the support payments to Red Hat is to fund Linux kernel developers. Novell does something similar of course with SUSE Linux.

      Depending on volume of storage and in-house guruhood, it can pay to in-source support of hardware and/or software.

  5. Is this chart on an apples-to-apples basis?

    My understanding is that when you buy a ‘gigabyte’ on S3 you are actually getting multiple redundant copies of that gigabyte, whereas if you buy a gigabyte of RAID, it is only 1 gigabyte.

    So you would need to multiply any of the storage solutions by a multiple of 2x to 3x to even it up.

    Then there are labor costs, etc, but that is footnoted correctly in the chart.

  6. During my storage-networking days a few years back, I remember working on a storage product called the “Cube”, similar in concept in that storage is in modular, self-contained blocks, although not in the business model….
    The difference between Raw & EMC is astronomical, so there is certainly an opportunity for the model to permeate. Maybe it could be a spin on the Freemium model (a post from Om earlier talked about the Freemium model –

  7. Market disruption aside, Backblaze has effected a marketing coup. Did you ever hear about them before? I had no inkling I could get unlimited online storage for $5 a month. WOW and more WOW! If my instincts are right, people will burst their door jambs out to get in as customers.