Why crowdsourced computing benchmarks are the future

Performance benchmarks for computing systems might make for good television (there’s always plenty of heated debate, talk of world records and sometimes even a little ethical drama), but in the end there’s not a lot of substance.

More often than not, they’re conducted on highly optimized systems running workloads that don’t necessarily mirror what anyone actually runs in production. If the results in question come from vendors rather than third parties, there’s a good chance they’ve only been published because the vendor was able to achieve the desired result. Thinking your experience will be just as fast is like watching a fishing show on television and then hitting the water expecting bite after bite.

However, cloud computing and the advent of popular open source software such as Hadoop and NoSQL databases could change the way we do benchmarks. With relatively little cost and effort, anyone can conduct their own tests to see how their specific applications and configurations run on their specific infrastructure. Throw in a platform to share these results, and you have crowdsourced performance benchmarks free from vendor hype and the vacuum-like conditions of standardized tests.
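To make the do-it-yourself idea concrete, here is a minimal sketch of what such a test might look like in Python. Everything in it is an assumption for illustration: the `run_benchmark` helper, the field names in the result record and the stand-in `sort_numbers` workload are hypothetical, and a real test would swap in a job that resembles what you actually run in production.

```python
import json
import platform
import statistics
import time


def run_benchmark(workload, repeats=5):
    """Time a callable several times and return a shareable result record."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return {
        # Field names are hypothetical; no standard schema is implied.
        "workload": getattr(workload, "__name__", "unknown"),
        "repeats": repeats,
        "median_seconds": statistics.median(timings),
        "min_seconds": min(timings),
        "max_seconds": max(timings),
        "python_version": platform.python_version(),
        "machine": platform.machine(),
    }


def sort_numbers():
    """Stand-in workload; replace it with a job that resembles production."""
    sorted(range(200_000, 0, -1))


if __name__ == "__main__":
    record = run_benchmark(sort_numbers)
    # JSON keeps the record easy to post, compare and aggregate.
    print(json.dumps(record, indent=2))
```

Reporting the median alongside the minimum and maximum, rather than a single best run, is a deliberate choice: it makes it harder to publish only the flattering number.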

Ideally, it ends up working a lot like the crowdsourced medical platforms I’ve come across lately, PatientsLikeMe and the forthcoming Lucine Biotechnology. Rather than rely on claims from drug companies or even doctors whose knowledge is limited to published research, users share their own real-world experiences with drugs, symptoms and side effects, and learn from others like them what they might expect.

It might get worse before it gets better

However, with so many Hadoop distributions on the market now, and so much money at play, I’d prepare to hear a lot more chest-beating in the months to come about whose implementation is actually fastest. It has been going on for a while (in December, I detailed a series of purported and contested records from SGI, MapR and HPCC Systems on the Terasort benchmark), and not all the voices have been heard yet. Like SGI before them, hardware partners such as Dell, HP and Cisco probably want to prove their reference architectures are the best.

And VMware has already published a study claiming that Hadoop performs faster on its vSphere hypervisor than on bare metal. If Hadoop workloads really do move to virtual machines, Hadoop vendors are going to have to prove themselves there, too. VMware’s study ran CDH3 (Cloudera’s third-generation distribution), but Hortonworks has been working closely with VMware lately and might have something to say. Of course, EMC Greenplum sits under the same corporate umbrella as VMware and can’t afford to be seen as slower than the competition on virtualized servers.

In cloud computing, too, providers have spent the past few years arguing against the idea of cloud servers as commodities by claiming their systems offer the best performance. There have been plenty of boasts (sometimes, perhaps, misleading) and quite a few attempts to benchmark cloud system and network performance (GigaOM Pro subscription req’d). With that much of the cloud market still up for grabs, we’re not yet done hearing about whose cloud is the biggest and the fastest (see, for example, Google’s emphasis on performance when it launched Compute Engine last month).

But it should get better

Despite all the effort by vendors and cloud providers to claim superiority, though, the truth is that it’s easier than ever for users of next-generation software and services to run their own tests. Hadoop and NoSQL databases aren’t expensive Oracle software that needs to run on scale-up, big-iron systems; they’re free to download and can run on small clusters of commodity boxes. For applications that are going to run in the cloud, renting a few instances from a cloud provider might only cost a few bucks.

Sure, it might take a little time to configure everything (although software vendors might be willing to help), but isn’t that effort worth it in the end?

Use cases and test results that demonstrate the value of crowdsourcing in-the-wild performance metrics are everywhere on corporate technology blogs across the web. Earlier this month, for example, Medialets discussed how it tested its Hadoop workload (on a cluster of rented physical machines) and found that Cloudera actually outperformed the supposedly faster MapR for that job. This week, video-transcoding service Zencoder shared some interesting (if not always surprising) results when it compared Amazon Web Services’ highest-powered cloud instances to Google Compute Engine’s best.

Aggregated and indexed on a single platform, these types of experiences could help quiet the boasts from vendors and industry organizations touting their latest benchmark results. I’d argue it matters a lot more to a systems architect to know the production throughput of someone running a similar application on similar resources than to know how fast a generic workload ran in a lab on a setup he doesn’t have. Add some analytics to this information, and ideal configurations for different application types and data volumes might begin to emerge.
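As a rough illustration of what "add some analytics" could mean, the sketch below groups user-submitted results by application type and configuration, then picks the configuration with the highest median throughput for each application type. The record format, the `best_configs` function and every number in `SUBMISSIONS` are invented for this example; they are not real benchmark results, and a real platform would also need to account for data volume, hardware and much more.

```python
from collections import defaultdict
from statistics import median

# Hypothetical crowd-submitted records; all values are made up for illustration.
SUBMISSIONS = [
    {"app_type": "log-analytics", "config": "config-a", "throughput_mb_s": 80.0},
    {"app_type": "log-analytics", "config": "config-a", "throughput_mb_s": 90.0},
    {"app_type": "log-analytics", "config": "config-b", "throughput_mb_s": 70.0},
    {"app_type": "log-analytics", "config": "config-b", "throughput_mb_s": 110.0},
    {"app_type": "log-analytics", "config": "config-c", "throughput_mb_s": 200.0},
    {"app_type": "transcoding", "config": "config-a", "throughput_mb_s": 40.0},
    {"app_type": "transcoding", "config": "config-a", "throughput_mb_s": 50.0},
    {"app_type": "transcoding", "config": "config-b", "throughput_mb_s": 30.0},
    {"app_type": "transcoding", "config": "config-b", "throughput_mb_s": 34.0},
]


def best_configs(submissions, min_reports=2):
    """Return {app_type: (config, median_throughput)} for the top config.

    Configurations with fewer than `min_reports` submissions are ignored,
    so a single outlier report cannot win on its own.
    """
    grouped = defaultdict(list)
    for rec in submissions:
        key = (rec["app_type"], rec["config"])
        grouped[key].append(rec["throughput_mb_s"])

    medians = defaultdict(dict)
    for (app_type, config), values in grouped.items():
        if len(values) >= min_reports:
            medians[app_type][config] = median(values)

    return {
        app_type: max(configs.items(), key=lambda item: item[1])
        for app_type, configs in medians.items()
    }


if __name__ == "__main__":
    for app_type, (config, value) in sorted(best_configs(SUBMISSIONS).items()):
        print(f"{app_type}: {config} (median {value} MB/s)")
```

Using the median and requiring a minimum number of reports is one simple way to keep a single lab-style outlier (the lone `config-c` entry here) from dominating the picture.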

Cloud computing and open source software have freed IT practitioners from so much legacy vendor baggage over the past few years. Isn’t it time to free them from inane benchmark boasting, too?

Feature image courtesy of Shutterstock user Suzanne Tucker.