Summary:

Charlie Oppenheimer may be a fan of Amazon Web Services. But, as he explains here, he’s long felt that the economics of the choice between self-hosted and cloud provider had more texture to it than the patently attractive-sounding “10 cents an hour.”

Updated. Amazon Web Services (AWS), as the trailblazing provider of Infrastructure as a Service (IaaS), has changed the dialog about computing infrastructure. Today, instead of simply assuming that you’ll buy and operate your own servers, storage and networking, you always have AWS as an option to consider, and for many new businesses, it’s simply the default choice.

I’m a huge fan of cloud computing in general and AWS in particular. But I’ve long had an instinct that the economics of the choice between self-hosted and cloud provider had more texture to it than the patently attractive-sounding “10 cents an hour,” particularly as a function of demand distribution. As a case in point, Zynga has made it known that, for economic reasons, it now uses its own infrastructure for baseline loads and uses Amazon for peaks and variable loads surrounding new game introductions.

An analysis of the load profiles

To tease out a more nuanced view of the economics, I’ve built a detailed Excel model that analyzes the relative costs and sensitivities of AWS versus self-hosted in the context of different load profiles. By “load profiles,” I mean the distribution of demand over the day/month as well as relative needs for bandwidth versus compute resources. The load profile is the key factor influencing the economic choice because it determines what resources are required and how heavily these resources are utilized.

The model provides a simple way to analyze various load profiles and allows one to skew the load between bandwidth-heavy, compute-heavy or any combination. In addition, the model presents the cost of operating 100 percent on AWS, 100 percent self-hosted as well as all hybrid mixes in between.

In a subsequent post, I will share the model and describe how you can use it for scenarios of interest to you. But for this post, I will outline some of the conclusions that I’ve derived from looking at many different scenarios. In most cases, the analysis illustrates why intuition is right (for example, that a highly variable compute load is a slam dunk for AWS). In other cases, certain high-sensitivity factors become evident and drive the economic answer. There are also cases where a hybrid infrastructure is at least worthy of consideration.

To frame an example analysis, here is the daily distribution of a typical Internet application. In the model, traffic distribution is an input from which bandwidth requirements are computed. The distribution over the day reflects the behavior of the user base (in this case, one with a high U.S. business-hour activity peak). Computing load is assumed to follow traffic according to a linear relationship, i.e. higher traffic implies higher compute load.

Note that while labor costs are included in the model, I am leaving them out of this example for simplicity. Because labor is a mostly fixed cost for each alternative, it will tend not to impact the relative comparison of the two alternatives. Rather, it will impact where the actual break-even point lies. If you use the model to examine your own situation, then of course I would recommend including the labor costs on each side.

For this example, to compute costs for Amazon, I have assumed Standard Extra Large instances and ELB load balancer for the Northern California region. The model computes the number of instances required for each hour of the day. Whenever the economics dictate it, the model applies as many AWS Reserved Instances (capacity contracts with lower variable costs) as justified and fills in with on-demand instances as required. Charges for data are computed according to the progressive pricing schedule that Amazon publishes.

To compute costs for self-hosting, I assume co-location with the peak number of Std-XL-equivalent servers required, each loaded to no more than 80 percent of capacity. The costs of hardware are amortized over 36 months. Power is assumed to be included with rackspace fees. Bandwidth is assumed to be obtained on a 95th percentile price basis.
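The reserved-versus-on-demand choice described above can be sketched in a few lines of Python. This is not the Excel model itself, only a minimal illustration of the idea. The function names and all prices are made up for the example, and it assumes that a reserved instance pays its amortized fee every month but its hourly rate only for the hours it actually runs.

```python
from typing import Sequence


def monthly_cost(hourly_demand: Sequence[int], n_reserved: int,
                 reserved_hourly: float, reserved_fee_monthly: float,
                 on_demand_hourly: float, days: int = 30) -> float:
    """Monthly cost of serving a daily demand profile.

    hourly_demand holds the number of instances needed in each hour of a
    typical day. Reserved instances are used first; on-demand fills the gap.
    """
    # Amortized up-front fee is paid for every reserved instance (assumption).
    cost = n_reserved * reserved_fee_monthly
    for needed in hourly_demand:
        reserved_used = min(needed, n_reserved)
        on_demand_used = needed - reserved_used
        cost += days * (reserved_used * reserved_hourly
                        + on_demand_used * on_demand_hourly)
    return cost


def best_reserved_count(hourly_demand: Sequence[int],
                        reserved_hourly: float, reserved_fee_monthly: float,
                        on_demand_hourly: float, days: int = 30) -> int:
    """Try every reserved count from 0 to peak demand and keep the cheapest."""
    candidates = range(max(hourly_demand) + 1)
    return min(candidates,
               key=lambda n: monthly_cost(hourly_demand, n, reserved_hourly,
                                          reserved_fee_monthly,
                                          on_demand_hourly, days))


if __name__ == "__main__":
    # Hypothetical profile: 2 instances overnight, 10 during business hours.
    demand = [2] * 12 + [10] * 12
    # Hypothetical prices, not Amazon's published rates.
    n = best_reserved_count(demand, reserved_hourly=0.5,
                            reserved_fee_monthly=200.0, on_demand_hourly=1.0)
    print("reserved instances:", n)
    print("monthly cost:", monthly_cost(demand, n, 0.5, 200.0, 1.0))
```

With these made-up prices, only the instances that run around the clock earn back their reservation fee, so the sketch reserves the baseline and leaves the daytime peak to on-demand.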

Now let’s look at a sensitivity analysis. Notice in the example above that a bit more than half of the total cost for each alternative is for bandwidth/data transfer charges ($35,144 for self-hosted at $8/Mbps and $36,900 for AWS). This is important because while Amazon pricing is fixed and published, 95th percentile pricing is highly variable and competitive.
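For readers unfamiliar with 95th percentile billing, here is a minimal Python sketch of how such a bill is commonly computed: usage is sampled at regular intervals (often every five minutes), the top 5 percent of samples are discarded, and the highest remaining sample is the billed rate. The function names and the sample data are illustrative assumptions, and individual providers may differ in the details.

```python
from typing import Sequence


def percentile_95_mbps(samples: Sequence[float]) -> float:
    """Billed rate: the highest sample left after dropping the top 5 percent."""
    if not samples:
        raise ValueError("need at least one bandwidth sample")
    ordered = sorted(samples)
    # Integer ceiling of 0.95 * n, converted to a zero-based index.
    idx = (95 * len(ordered) + 99) // 100 - 1
    return ordered[idx]


def monthly_bandwidth_cost(samples: Sequence[float],
                           price_per_mbps: float) -> float:
    """Monthly charge under 95th percentile pricing."""
    return percentile_95_mbps(samples) * price_per_mbps


if __name__ == "__main__":
    # Hypothetical month: steady 100 Mbps with a few short bursts to 1,000 Mbps.
    samples = [100.0] * 95 + [1000.0] * 5
    print("billed Mbps:", percentile_95_mbps(samples))
    print("monthly cost at $8/Mbps:", monthly_bandwidth_cost(samples, 8.0))
```

In this made-up month the short bursts fall inside the discarded 5 percent, so they do not raise the bill. That forgiveness only covers brief spikes, which is one reason the shape of the load matters so much.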

The chart above shows total costs as a function of co-location bandwidth pricing. AWS costs are independent of this and thus flat. What this chart shows is that self-hosting costs less for any bandwidth pricing under about $9.50 per Mbps per month. And if you can negotiate a price as low as $4, you’d save more than 40 percent by self-hosting. I’ll leave discussion of the hybrid to another post.
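The sensitivity above is just a straight line crossing a flat one: self-hosted cost rises with the negotiated bandwidth price while the AWS total stays put. A rough Python sketch of that break-even calculation follows. The function names are illustrative, and the inputs are round hypothetical figures chosen for easy arithmetic, not the numbers from the model.

```python
def self_hosted_total(fixed_monthly: float, billed_mbps: float,
                      price_per_mbps: float) -> float:
    """Self-hosted monthly cost: non-bandwidth costs plus the bandwidth bill."""
    return fixed_monthly + billed_mbps * price_per_mbps


def break_even_price(aws_total: float, fixed_monthly: float,
                     billed_mbps: float) -> float:
    """Bandwidth price per Mbps at which self-hosting costs the same as AWS."""
    return (aws_total - fixed_monthly) / billed_mbps


if __name__ == "__main__":
    # Hypothetical inputs, not taken from the article's model.
    aws_total = 60_000.0      # flat: does not depend on co-location pricing
    fixed_monthly = 20_000.0  # servers, rack space and other non-bandwidth costs
    billed_mbps = 4_000.0     # 95th percentile rate

    print("break-even $/Mbps:",
          break_even_price(aws_total, fixed_monthly, billed_mbps))
    for price in (4.0, 8.0, 12.0):
        total = self_hosted_total(fixed_monthly, billed_mbps, price)
        print(f"${price:.0f}/Mbps -> self-hosted ${total:,.0f}")
```

Below the break-even price self-hosting wins, and above it AWS wins. The heavier the bandwidth share of the bill, the more a negotiated price moves the answer.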

This should provide a bit of a feel for how I’ve been conducting these analyses. Above is a visual summary of how different scenarios tend to shake out. The intuitive conclusion is confirmed: the spikier the load, the better the economics of the AWS on-demand solution. Similarly, the flatter or less variable the load distribution, the more self-hosting appears to make sense. And if you’ve got a situation that uses a lot of bandwidth, you need to look more closely at the savings that self-hosting could deliver with a negotiated reduction in bandwidth pricing.

Update (Feb. 14): This post has garnered a lot of much appreciated attention. From the comments, I see that two clarifications would be helpful:

  1. The key point here is that a comparison of the cost of cloud hosting versus self-hosting needs to be based on the profile of your load. It is not that Amazon (or any other provider) is more expensive than self-hosting, as this is often not the case. Rather, it depends on the profile of your load. Moreover, where exactly your break-even point lies matters less than knowing the main sensitivities (e.g. bandwidth cost, CPU load, storage, etc.) for your situation, so that you can understand which differences could flip the decision. The results here are for this example only and other examples will produce different results, some in favor of cloud and some in favor of self-hosting.
  2. The specific use case I’ve chosen is for a business that’s pretty far along. But some people have been wondering how this example applies to startups. That’s a great question.

While I’ve referred to “spiky” loads, there’s another way to say that which is “variable,” “unknown” or “unpredictable,” which describes the situation that a startup (or other new business endeavor) usually finds itself in. In those cases, the fact that you cannot forecast very well is a reason why it’s highly unlikely you’ll save money by self-hosting…because you’re very unlikely to buy the right amount of capacity. You’ll either overprovision and waste money on unused capacity, or you’ll buy too little and compromise the business. So while you might not call your startup load “spiky,” the fact that it’s unpredictable gives it a similar profile in the model and hence the economic conclusion would tell you to go with the cloud infrastructure route.

Another not-strictly-economic factor that needs to be considered for startups (and others) is the benefit of focusing one’s attention on primary value-creating activities versus commodity activities (relative to the business) that one might not be very good at anyway. In addition, AWS and other cloud providers give us the highly valuable ability to experiment with little downside. This is especially important for the highly iterative and trial-and-error nature of building successful Internet businesses.

The point of this particular example is that if you have a significant amount of load that is well known and predictable then you may be able to save some money by bringing a portion or all of that inside.

Charlie Oppenheimer is a serial CEO and currently an executive-in-residence at venture-capital firm Matrix Partners. His most recent company, Digital Fountain, was acquired by Qualcomm, and his previous company, Aptivia, was acquired by Yahoo. He blogs at stratamotion.com.

  1. adrian cockcroft Saturday, February 11, 2012

    Interesting analysis, however if bandwidth costs are dominant you should be factoring a CDN based solution into the model, regardless of whether it’s datacenter or cloud hosted compute.

    1. @adrian: Not necessarily. While static content can certainly be CDN hosted quite easily, more dynamic content cannot. This is especially true of user-specific data. In that case, direct peering with the big boys is your best bet.

  2. Interesting. Rackspace has been pushing hybrid all along!

  3. It isn’t clear how you’re accounting for the self-service aspects of cloud computing. How much money did I save when I pushed a button to launch an XL server in 60 seconds? What about assigning it an Elastic IP in 15 seconds? Or attaching storage in 10 seconds?

    The point that Zynga, Google and others have made about long-running workloads typically applies to running *arrays of the same workload* (1 application running on 500 servers). The cloud excels when you run many different applications and need on-demand agility (500 different workloads running across 350 servers).

    This analysis is misleading, IMHO.

    1. you didn’t read the part about spiky workloads. any time you have variation in demand, you want to take advantage of some kind of pooling. that’s the cloud’s main pitch: that it pools demand across many companies so they can share a fixed pool. there’s no reason that fixed pool should be AWS, of course: depending on scale, you could obtain the same effects purely internally, by aggregating demand between groups within a single company.

    2. Also, where is the cost in the personnel to maintain the self-hosted server capacity? It’s bundled in the Cloud / dedicated hosting price.

      How about the cost of spares for quick replacement of self-hosted server capacity that fails? All hardware fails.

      Where’s the cost in downtime as your expensive admins rush from home or a vacation to fix your self-hosted solution, or do you have enough staff to cover 24*7*365?

      Where’s the cost in your admins figuring out how to bring back online your self-hosted solution as they are not doing so day-in-day out?

      Lots of left out costs here.

      1. Absolutely. This was the first thing that comes to mind after reading the article.

        This kind of analysis is common for very young startups on a shoestring budget, with a bunch of do-it-all know-it-alls wearing all hats, including sysadmin. However, they would typically not have workloads of the dimension discussed. Exceptions, of course, may exist.

      2. I agree, self-hosting requires staff. But putting applications on EC2 requires some work as well, though less, because the virtual machines have to be created, loaded with applications, etc.

        But in general, I estimate the cost for sys-admins to be far higher than Amazon-hosted.

      3. You don’t have Sysadmins looking after your EC2 instances? I admire your courage ….

      4. Kind of more replying to the “You don’t have Sysadmins looking after your EC2 instances?” comment.

        Look at the numbers: the above analysis saves you $10k/month, or $120k/year. For 131 servers!

        So you’ve got the money for an extra one to one and a half sysadmins. That’s not enough to cover the difference in handling 131 machines with all their maintenance, and 131 machines in the cloud.

        No question, this is NOT a savings.

    3. Absolutely agree. How do you factor in disaster recovery, cost of labor, service level requirements, etc. in the simplistic cost breakout being used? One example: the $12,227 per month space calc, which includes power, is way too low. The cost of power alone probably eats 60-70% of that figure, and if you factor in the cost of multiple FTEs handling sys admin, storage admin, general support, etc., the dollars per month go significantly above the per-month cost allocated. In addition, the self-hosted cost doesn’t even account for a refresh of those servers, which increases the monthly cost by 25% if you count one refresh over 6 years. The cost of the hosted solution will decrease over time, as the number of instances needed to handle the core computing requirements is reduced and as the price for storage, IO, etc. goes down.

  4. I found myself wondering how reserved instance utilization would affect the pricing model for AWS over a 36-month period. If you know you’re going to have a stable core set of instances, it seems this could make a significant difference. Amazon claims, “Reserved Instances can provide savings of nearly 60% compared to using On-Demand Instances.”

    This is something I’m exploring for some of my AWS using clients.

    1. Two factors that seem really interesting are:
      – how to select the number of reserved instances, and
      – how discount schedules alter costs

      What happens when one is willing to over-reserve, and can negotiate a sweetheart deal with Amazon? That could move the “Amazon Hosted” line significantly.

      1. On reserved instances, the model calculates the number of reserved instances required (if any) to minimize total costs. It uses the load distribution to choose this number so that the total, ((#-reserved * reserved-per-hour) + amortized-reserved-fee) + (#-on-demand * on-demand-per-hour), is minimized.

        Discount schedules are applied according to the published volume schedules.

        If you can negotiate a better deal than published, then those new numbers go into the model.

        Charlie Oppenheimer

    2. Exactly my thought as well. What about the reputation/executive risk of a cloud provider? I will be content proposing use of Amazon as the cloud vendor, not so much for the lesser-known ones.

    3. We are using AWS reserved instances and it’s quite a savings. The good thing is, it’s not tied to a specific instance. It’s tied to the instance type. If you reserved a small instance, then as long as you have a small instance running, you will get the savings.

  5. Michael Richardson Saturday, February 11, 2012

    Just on the level of storage and CDN services, it would be nice to see a comparison between Amazon’s S3 and other CDNs.

    1. S3 is not a CDN. Amazon’s CDN product is called CloudFront.

      You can compare CloudFront with other CDNs here: http://www.cdnplanet.com/cdns/cloudfront/ (on the right side, choose another CDN to compare CloudFront with)

    2. S3 is not a CDN. CloudFront is Amazon’s CDN.

  6. If you are serving up 650TB/mo you are making so much revenue that $60-70k/mo is insignificant.

    1. True, but I think 8 USD/Mbps at 5 Gbps usage is very high-priced bandwidth; USD 2-3/Mbps or less is easily achievable at that scale with self-hosting or dedicated server hosting. Savings of 30K – 35K per month can be significant. As Daniel points out below, server amortization is not the only way; apart from the cloud, one can very well rent dedicated servers for cheap.

  7. There are a bunch of choices – cloud (NOT only AWS), dedicated hosting, colocation with self-hosting.

    I certainly agree that an AWS-only approach is not good. However, when did AWS become the only cloud? And when did dedicated server offerings go away? AWS is not affordable for many base loads, but there are a number of options.

    1. “However, when did AWS become the only cloud?”

      They might not be the only player, but they are leagues ahead of any other cloud provider in terms of offerings (they also offer PaaS, SaaS and CDN, not just IaaS), APIs, capacity and so forth. If you were to compare solely on the IaaS aspect, Rackspace would be the only option, and their pricing is not very “elastic” to meet some types of demand/usage.

  8. Joyent, SoftLayer, and 100tb.com are some (far) less expensive bandwidth options.

  9. Giuseppe Miriello Saturday, February 11, 2012

    I tend to agree with your analysis. I work in a datacenter and I have noticed that people transition from shared cloud to self-hosting (or dedicated cloud) if they have few or no traffic spikes.

    Many of them remain self-hosted and, like Zynga, use AWS to absorb traffic/computation spikes.

    1. Zynga is moving away from AWS; it would be interesting to study how much that would cost.
