Summary:

Parse.ly Co-founder and CTO Andrew Montalenti shares his views on how startups can best keep their costs down and options open by using cloud computing wisely. But it’s a fast-moving market, so they have to keep abreast of what’s happening.

Andrew Montalenti
photo: Andrew Montalenti

There are a lot of questions a startup has to answer when making decisions about its cloud infrastructure. Private cloud or public cloud? Amazon Web Services, Google Compute Engine or Rackspace? DynamoDB or open source Cassandra? And after all those questions are finally answered, prudent CTOs might find themselves asking them again as new products hit the market and hourly rates continue to fall.

This is why one investor and adviser told Parse.ly Co-founder and CTO Andrew Montalenti that, when it comes to the cloud, “life is a series of ever-changing alternatives.”

Montalenti, whose startup provides analytics to some of the biggest web publishers around, shared that insight and more about the economics of cloud computing on the Structure Show podcast this week. Here are some highlights of that interview, but you’ll want to listen to the whole thing for more details on what Parse.ly does and which open source technologies it uses to do it. (And, for a detailed take on the company’s cloud computing history and plans, you can read this recent story, too.)

Saying ‘No, thank you’ to colocation

“Basically, there’s no good reason anymore, at least not from a price-performance standpoint that I can find, to run your own data center,” Montalenti said. Even though, he acknowledged later, “When you walk through the equations, there might be some ways that you can find running your own colo will save some money on some metrics.”

In fact, he recently advised another startup CTO concerned that his company’s cloud costs had hit $20,000 a month that squeezing more savings out of the cloud provider (via Amazon Web Services reserved instances, for example) is a better option than putting servers in a colocation space. Economies of scale are there if you’re running a “robo data center,” he said, but probably not when you’re running a few dozen servers.

“You’re not going to actually save money today if you bring it in-house, you just think you are,” Montalenti told his peer. (Not everyone agrees with this sentiment, however.)

The colo servers Parse.ly is trying to do away with.

Keeping tabs on cloud prices

“[H]onestly, this ecosystem has gotten so complicated that any startup that’s running any system at scale really needs to have someone who’s monitoring the ecosystem to really understand where we can get the best bang for our buck over time,” Montalenti said.

To solve this problem at Parse.ly, he hired a “devops guy” who is tasked with keeping tabs on what’s available where, and for how much. Parse.ly, for example, is most concerned with the price per gigabyte of RAM per month, something that used to cost about $40 with Rackspace and now costs around $33, he said. It’s about $10 with AWS and $13 with Google Compute Engine.
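To make the per-gigabyte comparison concrete, here is a minimal sketch that multiplies the prices quoted above by a memory footprint. The 1,000 GB footprint and the function names are assumptions chosen for illustration, not figures or tooling from Parse.ly, and the sketch ignores CPU, disk, bandwidth and reserved-instance discounts.

```python
# Prices per GB of RAM per month in USD, as quoted in the article.
PRICE_PER_GB_MONTH = {
    "Rackspace (earlier)": 40.0,
    "Rackspace (current)": 33.0,
    "Google Compute Engine": 13.0,
    "Amazon Web Services": 10.0,
}


def monthly_ram_cost(ram_gb, price_per_gb):
    """Return the monthly cost in USD of ram_gb gigabytes at the given rate."""
    return ram_gb * price_per_gb


def compare(ram_gb):
    """Return (provider, monthly cost) pairs sorted from cheapest to priciest."""
    costs = {
        name: monthly_ram_cost(ram_gb, price)
        for name, price in PRICE_PER_GB_MONTH.items()
    }
    return sorted(costs.items(), key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical footprint, assumed only for this example.
    ASSUMED_RAM_GB = 1000
    for name, cost in compare(ASSUMED_RAM_GB):
        print(f"{name:<24} ${cost:>10,.0f} / month")
```

Under these assumptions the spread between providers grows linearly with the footprint, which is why a memory-heavy workload makes this one metric worth tracking.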

“[T]he prices are dropping so fast, and the question is, ‘Have we already hit a price-drop plateau, or are they going to continue to drop?'” he said. “And the truth is that prices aren’t really dropping for running your own data center.”

Cloud computing is great, lock-in isn’t

“I basically hate lock-in with a passion. So as a result, all the Amazon services that are highly Amazon-specific, I tend to shy away from,” Montalenti explained. “If it’s not built on open source technology, I’ll tend to say that it’s sort of off limits for us to use, at least in production.”

The one exception he noted is Elastic MapReduce, one of AWS’s big data services. Building and maintaining Hadoop clusters is hard, he said, but “using EMR is just so damn easy.”

Parse.ly used an open source library called Libcloud to move workloads between Rackspace data centers earlier this year, he noted, and is going to rely on it again when the company ports a large portion of its Rackspace workloads to AWS. Although services like Amazon DynamoDB might work great, Montalenti said, “You don’t want to couple your tools to any one provider.”

 

  1. I wonder what kind of scale Parse.ly is running at because this does not match what I’m seeing as the direction companies go.

    It would seem the way companies grow is usually starting on cloud, then maybe migrating to dedicated, but almost every company eventually moves to running their own hardware. It’s just significantly cheaper to do it that way given how much it costs to pay the likes of AWS or Rackspace on a monthly basis for long-running cloud compute workloads.

    We can see this from companies like Moz, New Relic, Cloudflare and it’s something I’m doing now with my company, Server Density.

    The hybrid model seems to be a good end point for most companies – running their normal workloads with their own colo and tapping into elastic cloud for specific workloads like batch processing or handling spikes.

    There’s some good analysis of RAM pricing by cloud providers at http://www.theregister.co.uk/2013/12/04/cloud_comparison_analysis/ and AWS do have very good pricing compared to other cloud providers. But it’s still insanely high compared to buying your own.

    1. Hey David,

      The scale we’re running is about 60-70 production nodes with about 1.5 TB of RAM among them.

      The colo footprint we have right now is 5 hefty servers at 144GB of RAM each. I’m ignoring the CPU and disk not because they don’t matter, but because they tend to scale with the size of machines and don’t matter *that much* to us. As discussed in the earlier linked GigaOM article, Parse.ly moved into this colo in order to control cloud costs and get vertically-scaled high-memory machines. So, I’m well aware what an effective tool “rolling your own” can be to stabilize costs. But, when I did that move (Summer 2012), not only were high-memory instances not available in any public cloud, but the memory that was available was actually terribly expensive. That has changed rapidly in the last 18 months with Amazon’s availability of high-memory instances, and even with Google’s and Rackspace’s most recent offerings.

      I agree with the other commenter that “Price per RAM is a simple and somewhat useful metric, but it does not paint a complete picture”. However, I never meant for this metric to paint a complete picture. Instead, it’s a metric that was important to Parse.ly and its workload, due to the fact that the system we meant to scale was an in-memory statistics / time series database. You might argue, “Couldn’t you avoid it being an in-memory system?” Of course, with significant re-engineering. But in the meanwhile, we were able to acquire big gobs of RAM cheaply by rolling our own, and stabilize our costs. We did this with a full understanding that when the marketplace caught up, we’d move back into it because, in our view, a software company has no business running its own hardware infrastructure *unless it absolutely needs to*.

      With new pricing, colo has now become an “almost-par” option for us and our workload. Not dramatically cheaper like it used to be, just “slightly better”. Well, that’s not good enough, and I know which way the market’s going.

      There is some niceness to the fact that these colo machines are single-tenant. But other considerations — like multiple region support, free network access to internal services, elasticity, scriptability, easier backups/snapshotting, simpler node build-outs, etc. all weigh in the public cloud’s favor.

      In short, with the new instances & reserved instance pricing, I can almost achieve the cost savings I achieved by rolling my own, but I get all these other benefits. So, why continue to maintain my own? Why bother to procure new hardware and have my DevOps guy set it up, when he can run a script and build the whole cluster at similar cost, instead? (And then, run the script once more, and build a duplicate setup in a region 1,000 miles away?) What’s his time worth? What’s mine worth?

      That’s essentially the decision I’m faced with, and so it’s clear why I’m choosing the way I am, isn’t it?

      1. Thanks – perhaps you could share some cost calculations (maybe on your blog if not in a comment reply)? I’d be interested to see why it’s so expensive, especially projected out 1-3 years. All the calculations I’ve seen from my own company and others suggest buying your own hardware and colocating it is significantly cheaper, even with all costs considered.

        I recently wrote about this at http://gigaom.com/2013/12/07/want-to-reduce-your-cloud-costs-70-percent-heres-how/

  2. Andrew Clay Shafer Sunday, December 8, 2013

    This is the kind of bad advice that costs organizations hundreds of thousands of extra dollars (or more) for underperforming cloud wishful thinking.

    Price per RAM is a simple and somewhat useful metric, but it does not paint a complete picture. It’s certainly a misleading way to measure price/performance.

    I would love to see some price comparisons based on modeling real workloads instead of assertions and hand waving.

    Cloud has advantages. Price is rarely one of them.
