Blog Post

Does big data really need custom hardware?

IBM (s ibm) and Cisco (s csco) are betting big on big data with new boxes designed specifically to store a lot of data, with the networking capabilities to move that data around really quickly. As the glut of information inside businesses grows and the desire to analyze it becomes more pressing, big enterprise IT shops see an opportunity.

Where the generic server market has been commodified by low-end x86 servers, companies like Teradata (s tdc) and EMC (s emc) are doing their best to hold onto their hardware margins with specially designed systems. And it looks like IBM and Cisco have decided this is an opportunity not to be missed, and are taking it further. Cisco has released a unified computing system specifically designed to run SAP’s HANA database. Oracle (s orcl) is also heading down this path.

This is clearly aimed at enterprise customers who can afford the SAP licenses as well as the Cisco gear. The two companies worked with NetApp (s ntap) to pre-validate the box, so customers can just plug it in without worrying about building their own Hadoop cluster or pulling off other technical feats that would require time and IT talent to get right.

IBM’s big box for big data.

IBM’s new PureData System, an addition to its older PureSystems converged hardware family, is an effort to cram security features for HIPAA and PCI compliance in at the chip level. eWeek’s coverage of the box notes that IBM is planning even more boxes:

IBM officials said the PureData System is the next step forward in the company’s overall strategy to deliver a family of systems with built-in expertise that leverages its decades of experience to reduce the cost and complexity associated with information technology. According to IBM, users can have the system up and running in 24 hours and handle more than 100 databases on a single system.

Both of these boxes are advertised as being specialized to tackle big data, but do big data workloads need such highly custom boxes? There are many who think that data processing will require something above and beyond a typical x86 setup, such as a SeaMicro or Calxeda machine with low-power cores that are networked to work in parallel, parsing large amounts of data in small chunks. Others are thinking farther ahead and envision new architectures that mimic the human brain.

Rather than representing new hardware for big data, these two boxes really represent the capitulation of the major hardware vendors to a services model. Technically these boxes may have different chips when compared with commodity servers, but what these guys are actually selling is the plug-and-play aspect. Sure, a customer can buy cheaper boxes and download Hadoop or other open source software (or pay a licensing fee and have someone like Cloudera manage it for them), but these customers want something that works with little or no effort.

So these boxes aren’t about the whiz-bang tech inside; they’re an admission that services wrapped in a box are the main opportunity ahead for larger vendors. The question is, how long will that be enough? Especially as the cloud, whether public or private, continues its advance.

Database image courtesy of Shutterstock / z0w.

4 Responses to “Does big data really need custom hardware?”

  1. Interesting article and a good question: do Big Data workloads really need custom hardware? We don’t think so. That is why Compuverde has developed a hardware-independent Big Data solution. With Compuverde’s software solution, load is distributed evenly across all storage nodes instead of passing through a single gateway, which eliminates the bottleneck and improves access speed at the same time. As a result, companies can confidently use cheaper hardware that consumes up to 50% less energy. I would be interested to hear what you think about our technology:

  2. Customizing hardware resources is definitely a win-win for big data. Conventional system architectures and hardware resources cannot support the processing of petabytes of data, because these systems slow down under load. How, then, can we expect these hardware specs to provide smooth processing for big data, which is far larger in volume?

  3. Specialized hardware makes sense in some cases; we even designed some:

    But the big cabinet-size boxes being offered by larger manufacturers are very inflexible, from specs to pricing. The problem with big data is that no two tasks are the same. Natural language processing, log crunching, and fraud detection all require different combinations of processing power, I/O capabilities, and storage capacity. A monolithic “appliance” would work well for some tasks but not others.

    What can succeed is a modular architecture, where a cluster is assembled from Lego-like blocks based on specific job requirements.

  4. thierryhubert

    I think that Big Data is also a challenge for consuming information on highly time-sensitive social networks like Twitter. I actually discovered this article via the TechCrunch SAP Big Data Startup of the Year. This is the report that used and created for Big Data. The issue of storage and scalable infrastructure for Big Data analysis in this space remains uncertain at this juncture. I am struggling to manage costs on AWS, and I am looking for the tipping point at which I should invest in my own infrastructure rather than remain in the cloud.