Online dating service eHarmony is using SeaMicro’s specialized Intel Atom-powered servers as the foundation of its Hadoop infrastructure, demonstrating that big data might be a killer app for low-powered micro servers. The general consensus is that specialized gear from startups such as SeaMicro and Calxeda — which can save money and power by using processors initially designed for netbooks and smartphones instead of servers — will need to attract both applications and big-name users before it really catches on. Big data looks like it might bring both.
Calxeda CEO Barry Evans explained to me via e-mail why big data and micro servers are such a great match. “Big data is a great fit for us and ARM servers for three key reasons,” he wrote. “First, it is an inherently scale-out application, requiring a lot of efficient processors. Second, it is a fast-growing market place without a lot of requirements for legacy baggage. Third, the application software is widely available to run on ARM today.”
There is arguably a big difference between the ARM-based Calxeda and the x86 (Atom)-based SeaMicro in terms of the availability of software designed to run atop their respective architectures, but Evans’ first two factors apply across the micro server ecosystem.
Because Hadoop (and big data in general) is a new undertaking for many organizations, most likely don’t have any infrastructure set aside for it, and it requires a scale-out architecture to get the full performance benefit of parallel processing. Hadoop specifically also doesn’t require the kind of high-end, high-powered gear that typically powers enterprise data warehouse offerings. In this situation, micro servers present a compelling argument because they provide lots of cores and high efficiency in a small footprint.
SeaMicro packs 512 Intel Atom cores into a 10U appliance built around a 1.28-terabit-per-second fabric and boasts a 75 percent reduction in energy usage compared with traditional servers. Calxeda, for its part, is putting 120 quad-core ARM Cortex A9 processors into a 2U box that it claims consumes only 5 watts per node. As Stacey Higginbotham pointed out when discussing Calxeda’s plans, “Intel and AMD boxes using the x86 architecture can consume about 80 to 130 watts for a quad-core machine, while low-power versions of x86 chips can consume 30 watts.” For users wanting to stick with traditional server chips, Dell sells a line of cloudscale micro servers featuring Intel’s 30-watt Xeon processors.
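To put those figures side by side, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above and makes two assumptions that the vendors have not confirmed here: that one Calxeda "node" is one quad-core processor, and that the quoted x86 wattages can be divided evenly across four cores. The constant and function names are illustrative, and the results say nothing about per-core performance, which differs widely between Atom, ARM and Xeon parts.

```python
# Back-of-the-envelope comparison using only figures quoted in the article.
# Assumption: one Calxeda "node" equals one quad-core processor.
CALXEDA_NODES = 120
CORES_PER_NODE = 4
CALXEDA_WATTS_PER_NODE = 5      # vendor claim
CALXEDA_RACK_UNITS = 2

SEAMICRO_CORES = 512
SEAMICRO_RACK_UNITS = 10

X86_QUAD_CORE_WATTS = (80, 130)  # quoted range for a quad-core x86 machine
LOW_POWER_X86_WATTS = 30         # quoted low-power x86 figure


def watts_per_core(watts, cores=4):
    """Spread a wattage figure evenly across cores (a simplification)."""
    return watts / cores


def cores_per_rack_unit(cores, rack_units):
    """Core density: how many cores fit in one rack unit."""
    return cores / rack_units


calxeda_cores = CALXEDA_NODES * CORES_PER_NODE
calxeda_box_watts = CALXEDA_NODES * CALXEDA_WATTS_PER_NODE

print("Calxeda cores per 2U box:", calxeda_cores)
print("Calxeda watts per box (nodes only):", calxeda_box_watts)
print("Calxeda watts per core:", watts_per_core(CALXEDA_WATTS_PER_NODE))
print("x86 watts per core (range):",
      tuple(watts_per_core(w) for w in X86_QUAD_CORE_WATTS))
print("Low-power x86 watts per core:", watts_per_core(LOW_POWER_X86_WATTS))
print("SeaMicro cores per U:",
      cores_per_rack_unit(SEAMICRO_CORES, SEAMICRO_RACK_UNITS))
print("Calxeda cores per U:",
      cores_per_rack_unit(calxeda_cores, CALXEDA_RACK_UNITS))
```

Under those assumptions, the Calxeda claim works out to 1.25 watts per core, against 20 to 32.5 watts per core for the quoted quad-core x86 range. That gap is the efficiency argument in a nutshell, though it is not an apples-to-apples measure of work done per watt.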
eHarmony’s story aligns with Evans’ theory and is a prime example of why micro servers are so well suited for big data applications such as Hadoop. eHarmony began its Hadoop foray by running batch-processing jobs in the cloud, but soon found that cloud computing can get very expensive when users are running many instances and have to transfer huge amounts of data to and from the cloud. Having never invested in hardware for its Hadoop cluster, eHarmony was free to look at a brand-new architecture such as the one SeaMicro provides. As Rich Miller of Data Center Knowledge reported, “the switch reduced its operating expenses by ‘tens of thousands of dollars a month,’ and its total cost of ownership (TCO) by 74 percent.”
Given the now-public success at eHarmony, it’s possible we’ll start seeing OEM deals pop up between server makers like SeaMicro and Calxeda and big-data software vendors such as Cloudera. As both big data and micro servers catch on among mainstream organizations, it would make sense for vendors to ride the wave together by pushing pre-integrated systems in which the software and hardware have been specifically tuned to work together. Hadoop in a box, if you will.
We’ll be tackling the subject of next-generation distributed architecture in depth at Structure 2011 next week, including during a panel featuring Anant Agarwal, co-founder and CTO of fellow micro-server maker Tilera, and HP Labs’ Partha Ranganathan. The concept of packing lots of low-power cores into a small form factor has applications outside of big data — powering web applications probably being chief among them — and it should be very interesting to hear what new use cases might find themselves well suited for this architecture and how vendors such as Tilera, SeaMicro, Calxeda and even traditional server vendors must evolve to address this broader ecosystem of apps.