
We’re in the midst of a computing implosion: a re-centralization of resources driven by virtualization, many-core CPUs, GPU computing, flash memory, and high-speed networking. Some have predicted, only half-jokingly, that we will be able to buy a mainframe in a pizza box server that fits in a small fraction of a data center rack. That possibility — and in my opinion, inevitability — means we have a lot to watch over the next few years: what I like to call the coming of the Super Server.

The business drivers for the Super Server span power, management, new workloads, and big data needs. Let’s examine each briefly.

Power

Rising data center power bills, combined with a macro push toward environmental friendliness, have led to a slew of power-optimized servers. Today, the three-year power bill for data center equipment can often equal or exceed the original capital cost. And with cloud architectures spawning mega data centers costing hundreds of millions of dollars, there’s plenty of room to reshape servers for power savings. We’ve already begun to see the impact with the announcements of Windows running on ARM processors, and with emerging server vendors such as Calxeda and SeaMicro focusing on lower-power chips that still deliver data center performance.
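
As a rough sanity check on that claim, here is a back-of-the-envelope sketch; the server draw, PUE, electricity price, and purchase price below are illustrative assumptions, not figures from this article.

```python
# Back-of-the-envelope: three-year power bill vs. purchase price for one server.
# All numbers are illustrative assumptions, not figures from the article.

server_watts = 400       # assumed average draw of a two-socket server
pue = 1.8                # assumed Power Usage Effectiveness (cooling and other overhead)
price_per_kwh = 0.10     # assumed electricity price, USD per kWh
purchase_price = 3000    # assumed server capital cost, USD
hours = 24 * 365 * 3     # three years of continuous operation

three_year_kwh = (server_watts / 1000) * pue * hours
three_year_power_cost = three_year_kwh * price_per_kwh

print(f"Three-year power cost: ${three_year_power_cost:,.0f}")
print(f"Purchase price:        ${purchase_price:,.0f}")
print(f"Power / capex ratio:   {three_year_power_cost / purchase_price:.2f}")
```

Even with these fairly conservative assumptions, power is already roughly two-thirds of the capital cost; pricier electricity, heavier cooling overhead, or a cheaper box pushes the ratio past 1.0, the scenario described above.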

Management

In addition to power, space, and cooling costs, operating expenses are the other major post-purchase data center equipment cost. Data center administrators usually look to minimize the server count, or server image count, requiring oversight. Through virtualization, architects can minimize the number of physical machines they manage while keeping the same number of server instances available. Since virtualization tends to be memory- and storage-hungry, placing more CPU, memory, and storage resources within a single server allows that physical server to host more virtual machines. Administrators can tackle the same, or a greater, number of applications and workloads with less physical equipment: a management and administrative win.
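
To make the consolidation math concrete, here is a minimal sketch of how the VM count per host is bounded by the scarcer of CPU and memory; all host and VM sizes, and the overcommit ratio, are hypothetical examples rather than numbers from the article.

```python
# Rough VM-consolidation estimate: how many VMs fit on one physical host?
# All sizes and the overcommit ratio are hypothetical; real capacity planning
# also accounts for hypervisor overhead, I/O limits, and failover headroom.

def vms_per_host(host_cores, host_ram_gb, vm_vcpus, vm_ram_gb, cpu_overcommit=4.0):
    """Return the VM count allowed by the scarcer of CPU and memory."""
    by_cpu = int(host_cores * cpu_overcommit // vm_vcpus)
    by_ram = int(host_ram_gb // vm_ram_gb)
    return min(by_cpu, by_ram)

# A modest host vs. a denser "Super Server"-style host, same 2 vCPU / 4 GB VMs.
small = vms_per_host(host_cores=8,  host_ram_gb=48,  vm_vcpus=2, vm_ram_gb=4)   # memory-bound: 12
dense = vms_per_host(host_cores=32, host_ram_gb=512, vm_vcpus=2, vm_ram_gb=4)   # CPU-bound: 64

print(f"8-core / 48 GB host:   {small} VMs")
print(f"32-core / 512 GB host: {dense} VMs")
```

The point of the denser box is density rather than raw speed: one physical server to rack, patch, and monitor instead of several, for the same set of workloads.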

New Workloads and Applications

Our computing habits continue to evolve with Internet development, and new web businesses spur the need for supporting infrastructure. As an example, those running cloud data centers don’t care at all about having CD drives or extra USB ports on their servers, but they do need ways to handle fast and furious updates, millions of video downloads, or voluminous click-tracking. Web application areas like social networking, online video, advertising, and mobile applications require server architectures optimized for transactions, capacity, and web serving, all while minimizing power and management costs.

Big Data Needs

Our Internet-enabled information age has put us in a race to capture, process, and distill more data than ever. When Hadoop emerged as a dominant, open-source implementation of MapReduce, it forced a rethinking of storage infrastructure. Previously, many applications requiring large amounts of data used centralized storage in large arrays connected over storage protocols. With the Hadoop Distributed File System, data is intended to live close to the CPU, on disks within individual servers. So we’ve actually seen a move to pull storage out of the centralized array and back into the server. This has triggered a return of larger servers with more internal drive slots to accommodate the storage capacity for MapReduce operations, a key element of the Super Server.
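
For readers who haven’t seen the MapReduce model that drives this shift, here is a minimal word-count sketch written in the style of Hadoop Streaming, where the mapper and reducer are ordinary programs reading stdin and writing tab-separated key/value pairs; the file name and the local simulation pipeline below are my own illustrative choices.

```python
# wordcount_streaming.py -- minimal word-count sketch in the Hadoop Streaming style.
# Simulate the pipeline locally (the file name and this pipeline are illustrative):
#   cat input.txt | python wordcount_streaming.py map | sort | python wordcount_streaming.py reduce
# On a cluster, Hadoop prefers to run the map tasks on the nodes that hold the
# input's HDFS blocks, so the data is read from the servers' local disks.

import sys
from itertools import groupby

def mapper():
    # Emit <word, 1> for every word on stdin.
    for line in sys.stdin:
        for word in line.split():
            print(f"{word.lower()}\t1")

def reducer():
    # Input arrives sorted by key, so identical words are adjacent.
    pairs = (line.rstrip("\n").split("\t") for line in sys.stdin)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        print(f"{word}\t{sum(int(count) for _, count in group)}")

if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```

On a real cluster the same two functions would be submitted through the Hadoop Streaming jar; the local pipe above only mimics Hadoop’s shuffle-and-sort step with sort. That preference for scheduling work where the HDFS blocks already sit is exactly why internal drive slots matter.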

Want to learn more about big data and the impact on infrastructure? Be sure to check out Structure Big Data, March 23, 2011, in New York City.

Gary Orenstein is host of The Cloud Computing Show.

Comments

  1. Hey Gary,

    Great article. I do agree with you on most of it, except the big data part. How does distributed storage for big data go along with the advent of a Super Server? Either we have distributed storage on multiple independent, small(-ish) nodes with their CPUs used to process the data, or we have a Super Server or a cluster of a few Super Servers handling the load, but I don’t understand your point that Hadoop = Super Servers. Can you elaborate on what you mean?

    I do agree that management is a key thing when it comes to data center operations, but I believe advanced software stacks are going to solve this problem more efficiently than Super Servers.

    In the end, power is probably going to be the most important part of this new era in computing. Green power sure is something very cool, but it’s not available to everyone. I didn’t know about Calxeda and SeaMicro, thanks for the links ;-D

    So will we see a move of data centers to countries like Iceland, where geothermal power saves the day and helps create zero-carbon-footprint DCs? I’m not so sure, but it sounds appealing for the planet ;)

    my 2 cents
    @mastachand

    1. Marc,
      I’ve seen distributed storage architectures swing the pendulum from smaller servers to larger servers with more internal storage. Hadoop, with the underlying distributed file system, seems to be leading folks in this direction. Not always, but often. I am broadly categorizing the larger, denser servers as Super Servers.
      Gary

      1. Hey Gary,
        Thanks for your answer.

        Where did you see storage going from smaller servers to bigger ones with distributed storage? I mean, IMHO, legacy storage is basically large, dense arrays, which I would call super servers.

        If you’re talking about the application servers that mount those network storage arrays through FC or iSCSI, and which typically had very little local storage, then fine. But we’re not comparing apples to apples that way.

        It is true that nodes for distributed storage usually contain more local storage than app servers, but usually way less than large storage arrays.

        I might not understand what your point is, but anyway, I believe that distributed storage over independent, commoditized nodes probably helps limit data center operations and per-server energy utilization, if not the number of servers in a DC.

        It’s a very interesting conversation. Thanks for the article again!

        -marc

  2. It’s not about the number of servers one admin manages; it’s about what level of intelligence you can surface from your compute infrastructure and what you can do with it (and how quickly) so that the system can self-manage: automation, dynamically right-sizing the infrastructure, self-healing, self-updating, maximizing efficiency, and minimizing cost.

    90% of the cost of a data center is borne after it’s been built!

  3. I agree on the power comment (below). In U.S. industry (50% of electric power consumption in the U.S.), over half of the power is used by large electric motors. This has been the mix historically, and motor manufacturers have done a lousy job of improving power consumption, save for some automatic tuning features. Data server farms are beginning to catch up to this, and that’s the rub. The risk is that, since most of these installations are in concentrated places (gigantic campuses or warehouses), they unbalance the electric grid and create problems even for any new smart-grid strategy. It’s not unreasonable to think that some “crazy” ideas will actually have to be brought into play:

    Crazy idea 1 = put server farms in low Earth orbit for cooling and solar power
    Crazy idea 2 = colocate a nuclear power plant with a server installation

    Maybe not so crazy? What do you think?

    -g

    1. Paul Miller Sunday, March 6, 2011

      Hi Gary

      For ‘Crazy Idea 1,’ what would the bandwidth – up and down – to your LEO-based server farm look like… and could the storage handle the vibration from lift-off, or the radiation from the sun?

      ‘Crazy idea 2’ – not so crazy, and the same thinking that already drives locating data centers next to geothermal (Iceland), hydro-electric (some US states), etc. You still need the bandwidth to get data in and out of those server farms, of course… and geothermal has a nasty tendency to also mean geologically unstable…

  4. Ryan Anguiano Sunday, March 6, 2011

    You can see an explanation of Hadoop and HDFS here:

    http://hackedexistence.com/project-hadoop.html

    There is also an example with sample code using Hadoop in the Netflix Prize Contest.

  5. Creepy development. When is enough, ENOUGH??? http://www.telusconnect.net/

  6. If one day ARM and other mobile CPUs can run servers, is Intel under threat?

  7. Hi Gary,

    Very good post. I agree with your vision. I hope to catch up with you next time I’m in your neck of the woods. We have a lot to catch up on.

