Cloud infrastructure is the natural starting point for any new project because unknown requirements are one of its two ideal use cases. The other is elasticity: running workloads at large scale for short periods, or handling traffic spikes. The problem comes months later, when you know your baseline resource requirements.
Let’s consider a high-throughput database as an example. Most web applications have a database storing customer information behind the scenes, but whatever the project, the requirements are very similar: you need a lot of memory and high-performance disk I/O.
Evaluating pure cloud
Looking at the costs for a single instance illustrates the requirements. In the real world you would need multiple instances for redundancy and replication, but we will just work with a single instance for now:
Amazon EC2 c3.4xlarge (we can’t consider m2.2xlarge because it is not SSD backed)
= 30GB RAM, 320GB SSD storage
= $1.20/hr, or $3,726 upfront + $0.298/hr heavy utilization reserved
Rackspace Cloud 30GB Performance
= 30GB RAM, 300GB SSD storage
= $1.36/hr
Databases also tend to exist for a long time and so don’t generally fit into the elastic model. This means you can’t take advantage of the per-hour or per-minute pricing that makes cloud infrastructure cheap in short bursts.
So let’s extend those costs to an annual basis:
Amazon EC2 c3.4xlarge heavy utilization reserved
= $3,726 + ($0.298 * 24 * 365)
= $6,336.48
Rackspace Cloud 30GB Performance
= $1.36 * 24 * 365
= $11,913.60
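As a rough sketch, the annual arithmetic above can be reproduced in a few lines of Python. The function name `annual_cost` and the constant `HOURS_PER_YEAR` are illustrative choices rather than anything from a provider's tooling, and the rates are simply the list prices quoted in this article.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, the same basis used in the text


def annual_cost(hourly_rate, upfront=0.0):
    """Cost of running one instance all year: upfront fee plus hourly charges."""
    return upfront + hourly_rate * HOURS_PER_YEAR


# Rates as quoted in the article (list prices at the time of writing).
ec2_reserved = annual_cost(0.298, upfront=3726)
rackspace = annual_cost(1.36)

print(f"EC2 c3.4xlarge reserved:    ${ec2_reserved:,.2f}")  # $6,336.48
print(f"Rackspace 30GB Performance: ${rackspace:,.2f}")     # $11,913.60
```

Keeping the upfront fee as a separate parameter makes it easy to swap in other reserved or hourly-only price points.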
Another issue with databases is that they tend not to behave nicely if you’re contending for I/O on a busy host, so both Rackspace and Amazon let you pay for dedicated instances. On Amazon this has a separate fee structure, and on Rackspace you effectively have to get their largest instance type. Calculating those costs out for our annual database instance looks like this:
Amazon EC2 c3.4xlarge dedicated heavy utilization reserved
= $4,099 + ($0.328 + $2.00) * 24 * 365
= $24,492.28
Rackspace Cloud 120GB Performance
= $5.44 * 24 * 365
= $47,654.40
(The extra $2 per hour on EC2 is charged once per region)
Note that because we have to go for the largest Rackspace instance, the comparison isn’t direct: you’re paying Rackspace for 120GB RAM and 4 x 300GB SSDs. On one hand this isn’t a fair comparison because the specs are entirely different, but on the other hand, Rackspace doesn’t have the flexibility to give you a dedicated 30GB instance.
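Because the extra $2 per hour on EC2 is charged once per region rather than per instance, the dedicated cost per instance falls as you add nodes. The sketch below works through that using the prices quoted above. The function names are hypothetical, the three-instance case is only an illustration of a small replicated setup, and it assumes every instance runs in the same region for the full year.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours

# Figures quoted in the article for EC2 c3.4xlarge dedicated heavy utilization reserved.
EC2_UPFRONT = 4099.00   # per instance
EC2_HOURLY = 0.328      # per instance, per hour
EC2_REGION_FEE = 2.00   # per hour, charged once per region

RACKSPACE_120GB_HOURLY = 5.44  # per instance, per hour


def ec2_dedicated_annual(instances=1):
    """Annual cost of dedicated EC2 instances that all share one region."""
    per_instance = EC2_UPFRONT + EC2_HOURLY * HOURS_PER_YEAR
    return instances * per_instance + EC2_REGION_FEE * HOURS_PER_YEAR


def rackspace_dedicated_annual(instances=1):
    """Annual cost of Rackspace 120GB Performance instances."""
    return instances * RACKSPACE_120GB_HOURLY * HOURS_PER_YEAR


for n in (1, 3):
    ec2 = ec2_dedicated_annual(n)
    rax = rackspace_dedicated_annual(n)
    print(f"{n} instance(s): EC2 ${ec2:,.2f} (${ec2 / n:,.2f} each), "
          f"Rackspace ${rax:,.2f}")
```

With one instance this gives the same totals as the formulas above; with three, the regional fee is spread across all of them while the Rackspace cost scales linearly.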
Consider the dedicated hardware option…
Given the annual cost of these instances, the next logical step is to consider dedicated hardware, where you rent the resources and the provider is responsible for upkeep. At my company, Server Density, we use SoftLayer, now owned by IBM, and have dedicated hardware for our database nodes. IBM is becoming very competitive with Amazon and Rackspace, so let’s add a similarly spec’d dedicated server from SoftLayer, at list prices.
To match a similar spec we can choose the Dual Processor Hex Core Xeon 2620 – 2.0GHz Sandy Bridge with 32GB RAM, 32GB system disk and 400GB secondary disk. This costs $789/month or $9,468/year. That is 80 percent cheaper than the dedicated Rackspace instance and 61 percent cheaper than the dedicated Amazon instance, before you add data transfer costs. SoftLayer includes 5,000GB of data transfer per month, which would cost $600/month on both Amazon and Rackspace, a saving of $7,200 per year.
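The percentages can be checked with a short calculation. This is a sketch that assumes the comparison is against the dedicated instance prices from the previous section, which is the reading the arithmetic supports. The helper `percent_cheaper` is an illustrative name and all figures are the list prices quoted in this article.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours

# Annual costs, using the prices quoted in the article.
softlayer = 789 * 12                                    # $789/month
rackspace_dedicated = 5.44 * HOURS_PER_YEAR             # 120GB Performance
ec2_dedicated = 4099 + (0.328 + 2.00) * HOURS_PER_YEAR  # dedicated reserved


def percent_cheaper(option, baseline):
    """How much cheaper `option` is than `baseline`, as a percentage."""
    return (1 - option / baseline) * 100


# Bundled data transfer that would otherwise cost $600/month.
transfer_saving = 600 * 12

print(f"SoftLayer per year: ${softlayer:,}")
print(f"vs Rackspace: {percent_cheaper(softlayer, rackspace_dedicated):.0f}% cheaper")
print(f"vs Amazon:    {percent_cheaper(softlayer, ec2_dedicated):.0f}% cheaper")
print(f"Data transfer saving per year: ${transfer_saving:,}")
```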
… or buy your own
There is another step you can take as you continue to grow: purchasing your own hardware and renting data center space, i.e. colocation. We’ll look into the tradeoffs of that scenario in a post to come.