Tapjoy does the math, moves from bare metal to OpenStack cloud

These are interesting times for IT pros. The pressure is on to assess how their company’s tech is running and which deployment model will work best going forward. And they are inundated with claims that a) public cloud is best for everything, b) a mix of public and private resources is best, c) bare metal is faster than cloud, d) co-location is cheapest once you have a grip on your workload … the list goes on. As is usually the case, the truth is somewhere in the middle of that scrum.

Tapjoy, a mobile app marketing firm based in San Francisco, did its due diligence and decided to move a big chunk of its workload from bare-metal servers running at IBM-owned SoftLayer to OpenStack, specifically an OpenStack cloud managed for it by Metacloud.

Here’s the thing, according to Tapjoy Head of Operations Wes Jossey, who manages devops for the company: “We wanted the efficiency and the flexibility of a cloud environment but OpenStack was too complicated to do on our own without hiring a lot of new people.”

And that’s the rub on OpenStack: It can probably do everything you want, provided you know how to set it up and deploy its many modules properly. It is complicated, and upgrades and migrations from one version to the next can be a bear. Not to mention that hiring OpenStack experts is expensive.

Bare metal vs. cloud brawl

The discussion of bare metal vs. cloud is common these days. Virtualization, a key cloud underpinning, can take its toll on performance, while bare metal can “wring every last bit of performance from your gear,” Jossey said. Many companies prefer to run I/O-intensive applications, including databases, on bare metal partly for that reason. But bare metal is also less flexible and forgiving than cloud, Jossey noted in an interview.

“We decided that what we lose in performance, we gain back in predictability, in scale and in flexibility,” he said.

Tapjoy first assessed a couple of open-source cloud technologies, including CloudStack and Eucalyptus, and ended up opting for OpenStack largely because of the huge community that has coalesced around that effort. Then it talked to an array of OpenStack providers. Jossey wouldn’t say what other companies were on the short list, but he did say Tapjoy ruled out the big legacy IT guys pushing OpenStack because they were not at all price competitive and they all seemed to promote lock-in, a key concern. Instead, it focused on smaller, younger entities.

Priority: Offloading maintenance

Metacloud won the day largely because of its team and the promise that the team would handle all infrastructure management. Tapjoy made the decision after a final, three-hour discussion with Metacloud, after which the teams adjourned to a bar for a less formal (but arguably equally productive) bonding session that lasted the rest of the night.

It’s important to note that this is not an all-or-nothing proposition. A portion of the company’s work still runs on bare metal, and Tapjoy will continue to run significant loads in Amazon Web Services, which Jossey views as a strategic place to try out new infrastructure. He’s also open to evaluating the Google and Microsoft public clouds for those kinds of jobs.

For this in-house deployment, Tapjoy bought its hardware (Metacloud supports all the major chips and server brands), but the implementation was rolled out and is supported by Metacloud so Jossey and his small team can focus on more critical things, like Tapjoy-specific work (or perhaps arm wrestling).


5 Responses to “Tapjoy does the math, moves from bare metal to OpenStack cloud”

  1. So this is actually a hybrid deployment, which is the optimum way to take advantage of cloud flexibility and co-lo cost economics. They should publish their figures to show exactly where the savings are – there aren’t enough real figures around at the moment!

      • Scott Sanchez

        Good quote from Wes that speaks to the numbers… “Supporting hundreds of millions of active users per month requires massive infrastructure. Over the years, we had built out dedicated hardware alongside virtualized infrastructure in AWS. Once we realized that our capacity could grow by over 5X for the same cost as we were spending today by switching to our own private cloud solution, we knew we needed to find a partner to help make that dream a reality.”

      • Weston Jossey

        Hi David,
        If we factor in labor, power, long-term contracts, and a three-year depreciation on our gear, we expect somewhere around 3x – 5x savings over our old deployment. As with any deployment of this size, there are tradeoffs to be had which can drive those savings up, or down. We consistently chose to not fully optimize for cost in every situation, and rather optimize for redundancy, flexibility, and predictability. I have a feeling we could have driven our multiples higher (9x isn’t an unheard-of figure), but the effort & timeline to accomplish that figure was outside of what we were looking to do.

        We tend to take a well-rounded approach in our engineering & infrastructure, which has served us well. Once you’re the size of a Google or Facebook, it’s understandable to take many more liberties around how complex your gear becomes, because the cost savings can greatly outweigh labor investments.

        There’s a great stealth startup in SF working right now on answering some of these broader questions. I have a feeling these queries won’t be quite so hard to answer in the near future.
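
For readers who want to run this kind of comparison themselves, here is a minimal Python sketch of the arithmetic Jossey describes in the comments: annual cost built from labor, power, contracts and straight-line hardware depreciation, then compared per unit of capacity. Every number and name below (`Deployment`, `savings_multiple`, the dollar amounts, the capacity units) is a made-up placeholder chosen for illustration. None of it comes from Tapjoy or Metacloud, and a real model would likely need more inputs.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Cost inputs for one deployment. All values used here are hypothetical."""
    hardware_purchase: float    # up-front spend on gear (0 if hardware is rented)
    depreciation_years: int     # straight-line depreciation period
    labor_per_year: float
    power_per_year: float
    contracts_per_year: float   # hosting, co-location or support contracts
    capacity_units: float       # any consistent capacity measure

    def annual_cost(self) -> float:
        # Spread the hardware purchase evenly over the depreciation period.
        depreciation = self.hardware_purchase / self.depreciation_years
        return (depreciation + self.labor_per_year
                + self.power_per_year + self.contracts_per_year)

    def cost_per_unit(self) -> float:
        return self.annual_cost() / self.capacity_units


def savings_multiple(old: Deployment, new: Deployment) -> float:
    """How many times cheaper the new deployment is per unit of capacity."""
    return old.cost_per_unit() / new.cost_per_unit()


# Placeholder figures only: a rented bare-metal setup vs. an owned private cloud.
old_deployment = Deployment(
    hardware_purchase=0, depreciation_years=3, labor_per_year=200_000,
    power_per_year=0, contracts_per_year=1_000_000, capacity_units=100,
)
new_deployment = Deployment(
    hardware_purchase=900_000, depreciation_years=3, labor_per_year=300_000,
    power_per_year=100_000, contracts_per_year=500_000, capacity_units=400,
)

print(f"Savings multiple: {savings_multiple(old_deployment, new_deployment):.1f}x")
```

With these placeholder inputs both deployments cost the same per year, but the second delivers four times the capacity, so the multiple comes out at 4x. Changing the depreciation period or the labor line shows how quickly the tradeoffs Jossey mentions can push that multiple up or down.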