How to Solve Performance Problems at Petabyte-Scale

Table of Contents

  1. Summary
  2. The Problem
  3. A Deeper Look at the Problem
  4. The Storage System Solution
  5. How It Works
  6. Why It Is Important
  7. An Example of Performance at Scale
  8. Final Notes
  9. About Enrico Signoretti

1. Summary

Today’s huge storage systems, on the order of many petabytes, are associated more with capacity than with performance, but that perception is changing. Until recently, the most requested storage feature was active archiving, but the cloud, new technologies, and the growth of mobile applications now demand performance as well as capacity.

The three usual measurements of storage-system performance are input/output operations per second (IOPS), throughput, and latency. Delivering all three at a reasonable price is challenging, especially at high capacity. Even more demanding is the number of clients, applications, and workloads that contend for the resources of a multi-petabyte storage infrastructure. Adding to these demands is the challenge of achieving high performance from a distributed storage system that spans a geographically large, often global, area.
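The three measurements are linked by simple arithmetic, which the minimal sketch below illustrates. It assumes a fixed I/O size and a steady-state workload, and it uses Little's law to relate latency and concurrency to IOPS. The function names and the sample figures are illustrative only and do not come from the report.

```python
def throughput_mib_per_s(iops: float, io_size_kib: float) -> float:
    """Throughput (MiB/s) implied by an IOPS rate at a fixed I/O size (KiB)."""
    return iops * io_size_kib / 1024.0


def iops_from_latency(outstanding_ios: int, latency_ms: float) -> float:
    """Little's law at steady state: IOPS = outstanding I/Os / average latency."""
    return outstanding_ios * 1000.0 / latency_ms


if __name__ == "__main__":
    # Hypothetical workload: 32 I/Os in flight, 2 ms average latency.
    iops = iops_from_latency(outstanding_ios=32, latency_ms=2.0)
    print(f"IOPS: {iops:.0f}")

    # The same IOPS figure means very different throughput at different I/O sizes.
    for size_kib in (4, 64):
        mib_s = throughput_mib_per_s(iops, size_kib)
        print(f"{size_kib} KiB I/Os -> {mib_s:.1f} MiB/s")
```

Under these assumptions, lowering latency or raising concurrency increases IOPS, while throughput depends on I/O size as much as on IOPS. This is one reason a single headline number rarely describes a storage system well.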

The first report in this four-part series describes how a traditional network-attached storage (NAS) system can scale to a few hundred terabytes and sometimes a few petabytes. But some scale-out NAS systems, though very fast, still fall short for webscale and large-organization infrastructures, which must reach new levels of scalability and sustain high performance while serving tens of thousands of local and remote clients at massive aggregate throughput. An additional challenge is coping with long-distance data communication.

A deeper look at local and distributed performance helps illustrate the problem. For local performance, the clients are traditional servers and PCs, and connections are almost always reliable. For distributed performance, a variety of connections, protocols, and devices produce and consume data at blistering speeds, demanding efficiency and productivity.

Some next-generation multi-petabyte scale-out storage infrastructures have the feature set needed to serve performance and capacity workloads simultaneously, whether data is saved locally or distributed globally. Separate load balancing, smart-caching techniques, scale-out file-system interfaces, clever use of flash memory, and similar features work together to scale capacity at the back end while delivering the needed performance at the front end.

Image courtesy of scanrail/iStock.