The most interesting part of yesterday’s announcement that Groupon is using the Cloudera Distribution of Hadoop wasn’t the actual use (Groupon has been a known Hadoop user for a while) but, rather, the insight that Groupon is “building a world-class infrastructure” of which Hadoop will be a key part. That raises the question of whether Groupon, which certainly has the funds necessary to grow, can attract the engineering talent it needs to build its infrastructure, especially in the competitive Hadoop-hiring space.
I explained last month in my GigaOM Pro report on Hadoop (subscription required) that the difficulty of hiring qualified personnel is one of the biggest inhibitors to Hadoop adoption. Hadoop is still a fairly difficult set of tools to learn, and even if organizations have the minimum level of internal knowledge necessary to get their Hadoop efforts off the ground, at least one survey has shown it’s difficult to hire the additional talent needed to evolve into more-advanced or wider-scale Hadoop usage. Companies certainly are searching for Hadoop skills (Indeed.com, a job board, shows Groupon looking to fill three positions where Hadoop knowledge is preferred), but it’s difficult to gauge how successful companies are in actually hiring employees with the desired skills.
Well, that’s not entirely true. According to a February article in The Register, Facebook and Google are having no problems hiring skilled, big-data-savvy talent, and that’s making life difficult for startups without as much money, including Cloudera, whose sole business is Hadoop. One would assume the same holds for companies without Google’s or Facebook’s cachet among the young, skilled set. As the article points out, rich, infrastructure-centric companies like Netflix are also offering high salaries to attract engineers.
Groupon claims its infrastructure goals have attracted a large number of talented engineers, including in the data-management area, and there’s no reason to doubt that’s accurate. After all, it has raised more than $1 billion in funds so far, and is presently valued at $25 billion. If Groupon wants to hire Hadoop talent, it should be able to afford the high salaries those engineers might demand, and it has that potentially forthcoming IPO to make any stock options all the more appealing. However, Groupon still isn’t as rich as Google or Facebook, and those firms are only going to keep growing their infrastructures and continue hiring new talent to help them do so. Further, Groupon and other fast-growing web companies, such as Zynga, are competing against one another for talent, too.
So, maybe that’s why Groupon chose Cloudera over the stock Apache Hadoop distribution that Facebook uses. As I explained last week, Cloudera’s distribution is all about integrating the full set of Hadoop tools in a single distribution and offering commercial support on top of that. That support contract probably costs less than hiring a team of engineers. As long as Groupon doesn’t feel the need to create entirely new tools, as Facebook did with Hive or Yahoo did with Pig, those high-paid engineers might not be worth the investment yet. But if Groupon views its big data efforts as mission-critical, then it’s game on for hiring, and Groupon, or anyone else in its position, likely will have to pay up.