Storing big data in the cloud is easy — getting it there is hard

Michelle Munson of Aspera, speaking at Structure Europe 2012

If there’s one thing the cloud is good at, it’s storing large quantities of data and making that data available to companies so they can use it in various ways. But as the amount of data continues to explode, one big problem remains: getting those huge quantities of data into and out of cloud-storage services.

Michelle Munson, co-founder and CEO of Aspera, told attendees at GigaOM’s Structure Europe conference that newer technologies like the ones her company provides are required to make up for the shortcomings of traditional internet transport protocols like TCP. Allowing all of that “big data” to be moved around more easily and more efficiently, she said, creates all kinds of new opportunities.
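One well-known shortcoming of standard TCP on long-distance links is that a single stream's throughput is capped by its window size divided by the round-trip time, regardless of how fast the underlying pipe is. The sketch below illustrates that bound with hypothetical numbers; it is a back-of-the-envelope model, not a description of Aspera's technology.

```python
# Illustrative sketch of the classic TCP bandwidth-delay-product limit:
# a single TCP stream can carry at most one window of data per round trip.
# Window size and RTT values below are hypothetical examples.

def max_tcp_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on single-stream TCP throughput, in megabits per second."""
    rtt_s = rtt_ms / 1000.0
    return (window_bytes * 8) / rtt_s / 1e6

# A traditional 64 KB window over a 100 ms intercontinental round trip
# tops out near 5 Mbps, no matter how big the physical link is:
print(round(max_tcp_throughput_mbps(64 * 1024, 100), 2))  # 5.24
```

This is why moving bulk data over high-latency paths with plain TCP is so slow: the sender spends most of its time waiting for acknowledgements rather than transmitting.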

One of the most obvious areas where moving large quantities of data becomes a problem is the video industry, Munson said. A company like Netflix deals with hundreds of terabytes a month in streaming movies and TV programs, and it has to transcode that content into multiple formats and then ship it to the hosting networks that deliver it to end users. The company wanted to outsource its storage to Amazon’s S3 and also wanted to use outsourced transcoding services, but it couldn’t find any way of moving all of that data easily apart from shipping physical hard drives on trucks — so it contacted Aspera.
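To see why shipping hard drives was ever a serious option, it helps to work out how long a bulk transfer takes at a given sustained throughput. The figures below are hypothetical illustrations, not numbers from the talk.

```python
# Back-of-the-envelope sketch (all figures hypothetical): how long does it
# take to move a dataset of a given size over a link at a sustained rate?

def transfer_days(data_tb: float, throughput_mbps: float) -> float:
    """Days needed to move `data_tb` terabytes at `throughput_mbps` megabits/s."""
    bits = data_tb * 1e12 * 8               # terabytes -> bits
    seconds = bits / (throughput_mbps * 1e6)
    return seconds / 86400.0                # seconds per day

# 100 TB at a sustained 100 Mbps takes roughly three months:
print(round(transfer_days(100, 100), 1))  # 92.6
```

At that pace, a courier carrying a box of disks genuinely beats the network, which is the gap high-speed transfer software aims to close.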

The company’s technology does two things, Munson said: it adds a layer on top of TCP that allows content such as video to be transmitted to the cloud more quickly, and it also transforms the data in a way that makes it easier for cloud providers like Amazon to ingest it and integrate it into their databases.

Making that process easier allows companies to develop new ways of using the data, the Aspera CEO said. For example, she said that the life-sciences research industry generates almost an order of magnitude more data than the video business — sequencing a single human genome can produce petabytes of data, and in the past that data was more or less trapped within a single research lab. With high-speed transport technologies, that data can be shared more easily and that allows for collaboration that might never have occurred before.

Check out the rest of our Structure Europe 2012 coverage here, and a video recording of the session follows below.

