Summary:

As data moves into the cloud, many storage companies are evaluating their use of memory in the data center as they try to strike a balance between easily accessible cache memory powered by flash and slower-to-access disk memory powered by hard drives. At the same time, they’re trying to make their storage easier to provision and more reliable by looking at some form of virtualization. Both trends will change the dynamic for large storage vendors in the years to come.

As data moves into the cloud, storage companies are taking advantage of virtualization and adding more memory to the data center. Techniques such as storage virtualization can improve the usage of existing storage hardware and make provisioning easier, while adding memory to the data center can make accessing information faster.

As you move along the storage technology continuum, you’re trading price for speed. Retrieving information stored on tape, which is cheap, can take hours or days, while accessing something on flash, which costs a pretty penny, takes microseconds. Plus, solid-state drives using flash can’t possibly store all of the data people are creating. There’s also the question of how reliable flash is.

Given this, most companies requiring huge storage arrays rely on expensive machines from the likes of EMC or HP. Or they make their own “storage cloud” using commodity disk drives and a proprietary layer of software. Because that software layer lets companies allocate and provision the storage, it effectively virtualizes the storage array. It’s essentially the same model that underpins the storage services offered by Amazon S3 and Nirvanix.

Meanwhile, tier-one storage equipment vendors such as EMC, IBM and HP have recognized that cloud storage is the future of computing, and are attempting to ride that wave without cannibalizing their high-margin box business. For example, EMC is offering services for SMBs through its Mozy acquisition. IBM last year purchased XIV, which makes software that can be used to virtualize storage. Large companies such as NetApp and 3Par are attempting to virtualize storage as well.

But once the cloud is in place, there’s still the issue of calling up data and delivering it relatively quickly. For certain applications, such as those requiring instantaneous access to large quantities of data like seismic graphing or historical financial analysis, cloud storage may never replace a spinning drive connected to a server via Fibre Channel.

But for many applications, including media delivery and most application delivery, tweaking storage for the cloud means adding faster cache memory or optimizing the storage infrastructure by geographic location. Nirvanix, the startup providing hosted storage in competition with Amazon’s S3, touts its multiple storage clusters as a way to deliver faster access to stored content. It’s also looking to provide nodes on the customer premises called “NAS heads” that will basically allow frequently called-up “hot data” to be stored there.
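The idea of keeping “hot data” close to the customer can be illustrated with a small least-recently-used (LRU) cache sitting in front of a slower remote store. This is only a minimal sketch of the general technique, not a description of how Nirvanix or any other vendor implements it; the `HotDataCache` class, the `fake_cloud_fetch` function and the file names are all hypothetical.

```python
from collections import OrderedDict


class HotDataCache:
    """Small LRU read-through cache in front of a slower remote store."""

    def __init__(self, fetch_remote, capacity=2):
        self.fetch_remote = fetch_remote  # hypothetical slow call to cloud storage
        self.capacity = capacity          # how many objects fit in the local cache
        self.items = OrderedDict()        # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.items:
            self.items.move_to_end(key)   # mark as most recently used
            self.hits += 1
            return self.items[key]
        self.misses += 1
        value = self.fetch_remote(key)    # fall back to the remote store
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used object
        return value


def fake_cloud_fetch(key):
    # Stand-in for a remote storage request; not a real API.
    return f"contents of {key}"


cache = HotDataCache(fake_cloud_fetch, capacity=2)
for name in ["a.mp4", "b.mp4", "a.mp4", "c.mp4", "a.mp4"]:
    cache.get(name)
print(cache.hits, cache.misses)
```

In this toy run the repeatedly requested object stays local while a less popular one is evicted, which is the behavior a premises-based cache node would be aiming for.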

Alternatively, or possibly in conjunction with such a setup, a customer interested in amping up the speed of cloud storage might buy equipment from startups providing different levels of cache to aid in hasty data retrieval. We’ve covered some before, such as Atrato, which actually offers a box of disks attached to a controller that runs software designed to access and configure the hundreds of spinning disks. The result is the reliability of spinning disks with a faster information retrieval speed. Others that rely strictly on intelligently routing needed data to cache include Gear6 and Xiotech Corp.

Storage being served via the cloud is a foregone conclusion. It only remains to be seen if a startup like Nirvanix can grow to compete with the big players in storage or hosted computing, and how the larger storage vendors will walk the line of creating cloud products without jeopardizing their hardware business.

A far more interesting trend to watch will be how the growing amount of stored data is kept and delivered as quickly as possible. For proof that storage is relevant, check out Facebook’s hardware. A little more than 8% of its servers are devoted to the distributed caching system, memcached. The entire purpose of those servers is to speed delivery of information for the social network. In this age of instant gratification, we may find that cache is king.

  1. Hah…”cache is king”…nice piece, Stacy.

  2. you know, for some very interesting insights into all of this, perhaps consider a fup piece looking at the new world of database startups – nothing for years and then out of nowhere a group of companies get funded including luminaries like stonebraker (vertica), tan at greenplum, and so on…they’re all looking at how data stores scale for the web, the basic idea being that these “things” that were built 25 years ago for banks really no longer make sense.

    and i’m not just talking about interesting stuff like swiveldb and dabbledb, i also mean oracle, microsoft and ibm and hp…

    funny, business week just has a piece where hp’s own cto could not explain to the journalist reporting what “cloud” actually means! very comical, but darkly comical, right?

  3. Stacey Higginbotham Saturday, April 26, 2008

    @Dave, good idea. As for HP’s CTO, I would say that clouds are nebulous :)
    There’s even been some debate in our comment sections about defining the cloud.

  4. “In this age of instant gratification, we may find that cache is king”

    Another option is combining a pure in-memory storage layer, a.k.a. a Data Grid, with SimpleDB/S3 as back-end storage. In this way the application can benefit from the speed of memory storage while keeping the data backed by low-cost storage, and even get better scaling and performance than high-end storage devices. See an example of such a project here:
    http://www.openspaces.org/display/EDS/External+Data+Source+by+Amazon+SimpleDB

  5. nirvanix vs s3. can’t seem to decide. but check this out

    When you’re really pushing traffic, Amazon S3 is more expensive than a CDN

    http://joyeur.com/2007/08/16/when-youre-really-pushing-traffic-amazon-s3-is-more-expensive-then-a-cdn

  6. News Roundup 29/04/2008 | Pure Chaos Online Tuesday, April 29, 2008

    [...] Who Will Cache in on Cloud Storage? [...]

  7. Sun Brightens Storage Options With Flash – GigaOM Wednesday, June 4, 2008

    [...] covered startups in the past whose entire existence is based on figuring out how to get to existing data faster, either through appliances or compression. With users storing more data and expecting continual [...]

  8. Apple is Not Fruit » Who Will Cache in on Cloud Storage? Thursday, June 12, 2008

    [...] read more | digg story [...]

  9. Video Data Helps Generate Cache – GigaOM Thursday, June 19, 2008

    [...] which means keeping them requires a trade-off between fast access and cheap storage. A range of companies are trying to address these sorts of storage problems through compression, caching and even Flash memory in the data [...]

  10. Most Online Videos Are 3-Day Wonders | Alex McFarlane Thursday, July 3, 2008

    [...] which means keeping them requires a trade-off between fast access and cheap storage. A range of companies are trying to address these sorts of storage problems through compression, caching and even Flash memory in the data [...]
