Written by Alistair Croll, vice president of product management and co-founder of Coradiant
Virtualization and on-demand computing are giving companies new reasons to worry about code efficiency.
Once upon a time, lousy coding didn’t matter. Coder Joel and I could write the same app, and if mine consumed 50 percent of the machine’s CPU while his consumed a mere 10 percent, it wasn’t a big deal. We each paid for our own computer, rack space, bandwidth, and power.
Joel’s code wasn’t measurably “better” than mine (or vice-versa) as long as the apps were the same to the end user. Any advantages were hidden by the step function of physical hardware: Computing costs didn’t grow linearly with the amount of processing consumed.
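To make the step function concrete, here is a minimal sketch in Python. It assumes each app runs on its own dedicated hardware and that machines can only be bought whole; the function name `dedicated_machines` is hypothetical and used purely for illustration.

```python
import math

def dedicated_machines(cpu_fraction: float) -> int:
    """Whole machines an app needs when hardware comes in whole units.

    cpu_fraction is the app's load expressed in machines, so 0.5 means
    half of one machine's CPU. Even a tiny app still needs one box.
    """
    return max(1, math.ceil(cpu_fraction))

# The two apps from the example above: 50 percent versus 10 percent of a CPU.
for owner, load in [("mine", 0.50), ("Joel's", 0.10)]:
    print(owner, "needs", dedicated_machines(load), "machine(s)")
```

Both apps land on the same step, so the hardware bill is identical. The difference only shows up once the load crosses a whole-machine boundary.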
Modern applications, however, are changing in several important ways:
- One, virtualization lets applications scale across multiple machines. Many companies are consolidating their server infrastructures, and decommissioning hundreds of machines in the process. A 2006 Yankee Group study of 700 firms estimated that 76 percent had already deployed server virtualization in the data center or planned to do so.
- Two, power is the limiting factor for many data centers. A typical Google (GOOG) data center, which puts 10,000 computers into 30,000 square feet, is likely located in a wet state near a power source. For example, the new Google data center site in The Dalles, Ore., was chosen largely for its proximity to hydroelectric power.
- And three, Software-as-a-Service platforms let us run sophisticated applications on someone else’s infrastructure. Salesforce.com’s (CRM) recently unveiled Force platform is a good example of this, and Amazon’s (AMZN) EC2 and S3 provide lower-level computing on demand.
These three changes mean that bad code matters. Now, with ten instances of my application installed in a data center, I’m using five machines — while Joel only needs one. I’m five times as bad for the planet as Joel.
This hits my wallet, too: Amazon’s Elastic Compute Cloud (EC2) charges 10 cents per processing hour for a “typical” server, plus storage and bandwidth costs.
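The arithmetic behind that comparison can be sketched in a few lines of Python. This is a back-of-the-envelope illustration, not a pricing model: it assumes instances pack perfectly onto shared machines, uses the 10-cents-per-hour figure quoted above, and ignores storage and bandwidth. The function names are hypothetical.

```python
import math

RATE_CENTS_PER_MACHINE_HOUR = 10  # the per-hour figure quoted above
INSTANCES = 10                    # ten copies of the application

def machines_needed(instances: int, cpu_percent: int) -> int:
    """Machines required if instances pack perfectly onto shared hardware."""
    return math.ceil(instances * cpu_percent / 100)

def hourly_cost_cents(instances: int, cpu_percent: int) -> int:
    """Compute-only cost per hour, in cents (storage and bandwidth excluded)."""
    return machines_needed(instances, cpu_percent) * RATE_CENTS_PER_MACHINE_HOUR

for owner, cpu_percent in [("mine", 50), ("Joel's", 10)]:
    print(
        owner,
        machines_needed(INSTANCES, cpu_percent), "machine(s),",
        hourly_cost_cents(INSTANCES, cpu_percent), "cents per hour",
    )
```

Under these assumptions my code needs five machines to Joel’s one, and the hourly bill scales by the same factor of five.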
Inefficiency doesn’t just come from writing bad code. Modern applications are written with several tiers of abstraction. The latest Web 2.0 app is a layer cake of complexity: Adobe (ADBE) Flex, within an AJAX framework, dynamically rendered by a Java app running inside a monitoring layer like Glassbox, which is loaded on a Sun (JAVA) JVM, which runs on a virtual OS managed by a VMware (VMW) hypervisor.
That’s a lot of distance — and computing overhead — between my code and the electricity of each processor cycle. Architecture choices, and even programming language, matter.
To anyone who’s worked on mainframes, this should look familiar. Administrators relied on tools like IBM’s (IBM) Workload Manager to measure processing usage in shared environments, and billed usage back to a company’s departments. But where mainframe operators had lots of instrumentation, in today’s environment each layer is hidden from those beneath it. This dramatically limits visibility.
We have a common language for most of the variables behind an application: gigabytes of storage, vertical inches of server space, kilowatts consumed, and so on. But we don’t have a good way of talking about processing workload. Some applications have their own domain-specific metrics; Microsoft (MSFT) Exchange, for example, uses megacycles per mailbox. But there’s no universal term for describing efficiency across the myriad platforms and frameworks of Web 2.0.
In 2004, Michael S. Malone argued that we need to think about the overall efficiency of an electronic system, rather than a simple doubling of processing power.
As we move towards shared, on-demand infrastructure, we need to find ways to talk about “green” code. Until then, we’re at the mercy of bad coders and heavy applications.