Traditional storage-area network (SAN), network-attached storage (NAS), and unified-storage systems were never designed to handle exascale problems, which involve more than 1,000 petabytes (PB), or 1 million terabytes (TB), of data. In fact, most can barely manage the low end of petascale problems. Even scale-out SAN and NAS cannot solve exascale problems today.
Some IT pros think they have little to be concerned about and will deal with this problem when it hits their horizon. Yet that horizon is closer than it may appear: Service providers and an increasing number of enterprise-IT organizations are already dealing with exascale problems.
Users and service providers report that data storage consumption rates are accelerating, not decelerating. Current market consensus on data storage growth is approximately 62 percent CAGR. Simple arithmetic shows that storage capacities must double roughly every 18 months to keep pace (1.62^1.5 ≈ 2). Hard disk drives (HDDs) stopped keeping pace several years ago. Last year the largest HDD grew only from 2 TB to 3 TB, a 50 percent increase. This year it increased to 4 TB, a 33 percent increase. Next year the biggest HDD is expected to be 5 TB, a 25 percent increase.
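The doubling-time and HDD growth arithmetic above can be checked with a short Python sketch. The 62 percent CAGR figure and the 2, 3, 4, and 5 TB drive capacities come from the report; everything else here is illustration:

```python
import math

cagr = 0.62  # market-consensus annual data storage growth rate

# Doubling time in years: solve (1 + cagr)^t = 2 for t
doubling_years = math.log(2) / math.log(1 + cagr)
print(f"Doubling time: {doubling_years * 12:.1f} months")  # ~17 months, i.e. roughly every 18 months

# Year-over-year gains for the largest HDD, as cited in the report
for prev_tb, new_tb in [(2, 3), (3, 4), (4, 5)]:
    pct = 100 * (new_tb - prev_tb) / prev_tb
    print(f"{prev_tb} TB -> {new_tb} TB: {pct:.0f}% growth")
```

The point of the comparison is that even the best single-drive growth rate (50 percent) falls short of the roughly 62 percent needed to keep capacity doubling every 18 months.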
Solid-state drives (SSDs) are not doing any better. When the media cannot keep up with consumption, the result is larger storage systems, and a lot more of them. The current storage model remains manageable only while capacities stay in the low double-digit PBs; beyond that point, ever more systems are required.
Brain science has found that human beings do not think exponentially or even geometrically; they think linearly. That is why it is so difficult for many IT organizations to envision how unremitting storage growth will affect them. At a doubling every 18 months, capacity undergoes ten doublings in 15 years, a factor of 2^10 = 1,024. A single PB of storage today will thus grow to more than one exabyte (EB) in as little as 15 years. Even a relatively small amount of storage, such as a single high-density 60-drive drawer of 4 TB HDDs totaling 240 TB, will grow into more than 240 PB in those same 15 years. In short, IT storage requirements will be 1,024 times greater in 15 years than they are today, a problem that continues to build year over year. Exascale problems include data resilience, data durability, infrastructure, management, power, cooling, and total cost of ownership. While all of these are real issues, the capacity scale issue is by far the most crucial.
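The 15-year projection above can be sketched in a few lines of Python. The 240 TB drawer and the 18-month doubling period are the report's figures; the script simply compounds them (using binary units, 1 PB = 1,024 TB, to match the 1,024x factor):

```python
start_tb = 240            # one 60-drive drawer of 4 TB HDDs, per the report
years = 15
doubling_period = 1.5     # one capacity doubling every 18 months

doublings = years / doubling_period      # ten doublings in 15 years
factor = 2 ** doublings                  # 2^10 = 1,024
end_pb = start_tb * factor / 1024        # convert TB to PB (binary units)

print(f"Growth factor over {years} years: {factor:,.0f}x")  # 1,024x
print(f"{start_tb} TB grows to {end_pb:,.0f} PB")           # 240 PB
```

The same compounding applies at any starting size, which is why a single PB today lands beyond an EB on the same horizon.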
This research report details the exascale problems facing IT pros and decision makers and how four different object-storage software products go about solving them. Two of these products are open source (OpenStack Swift and Ceph) and two are commercial (Cleversafe Dispersed Storage and Scality Ring). Our analysis includes an overview of each product, a detailed look at how well it solves the exascale problems, and a chart that weighs the products against one another.
Market drivers behind this report
Currently there are best practices for implementing, operating, and managing data produced by trading systems, GPS, seismic processing, and automated analysis systems, among many other systems. However, these practices have become unsustainable when capacities move into the hundreds of PBs and EBs.
Capacity gains have been slowing precipitously while data storage growth accelerates. Applying workarounds to overcome a storage system's architectural breakdowns adds cost to systems that are already too expensive for an exascale environment. Traditional storage systems are therefore too costly at exascale.
Regulatory compliance requirements, in addition to new requirements for analytical capabilities that deliver actionable intelligence on large amounts of current and historical unstructured data, make data durability crucially important. However, the underlying storage media are not designed to last the requisite decades or centuries.