As folks increasingly store and access information online, the data centers powering cloud services need to be managed more like a single computing entity than a bunch of servers, according to a Google white paper (Google calls it a mini-book) released today.
The paper lays out the concept of warehouse-scale computers (which we have previously referred to as both web-scale computing and mega data centers), specifically how to build out the infrastructure to support Internet services managed across thousands of servers. Google’s Luiz Barroso, a distinguished engineer, and Urs Hölzle, SVP of operations, both of whom help manage and build out Google’s data centers, lay out their definition of WSCs:
“The name is meant to call attention to the most distinguishing feature of these machines: the massive scale of their software infrastructure, data repositories, and hardware platform. This perspective is a departure from a view of the computing problem that implicitly assumes a model where one program runs in a single machine. In warehouse-scale computing, the program is an Internet service, which may consist of tens or more individual programs that interact to implement complex end-user services such as email, search, or maps. These programs might be implemented and maintained by different teams of engineers, perhaps even across organizational, geographic, and company boundaries (as is the case with mashups, for example).”
The 100-page paper also notes that there are a few other characteristics of these warehouse-scale computers, which are owned by companies such as Microsoft, Google, Amazon and Yahoo. These computers are mostly built in-house, are fairly flexible and are focused on cost efficiency. The Google engineers’ overview includes chapters analyzing the power consumption and costs associated with running such a warehouse-scale computer, and how to architect the hardware and software in these machines.
Commodity hardware gets a boost here, which doesn’t bode well for Cisco’s new blade servers or IBM’s efforts to create specialized hardware for different workloads in the cloud. The paper also discusses the challenges of building software that can run across a heterogeneous hardware layer, as well as programming for redundancy and resiliency. Given how much publicity Google’s various failures and traffic slowdowns have received, the chart below illustrating the causes behind such glitches is notable.
Many in the industry are positioning warehouse-scale computing as the future of IT, but this paper underscores how it may be an evolving style of computing that will likely coexist with more traditional models for some time to come. In the same way mammals coexisted with dinosaurs for millions of years, perhaps.