How to manage ridiculous amounts of data and traffic without falling down

Frank Frankovsky, Facebook, at Structure: Europe 2013

We’ve wrapped up another Structure: Europe conference and, just like last year’s inaugural event, it was full of insights into the state of cloud computing in Europe and around the world. The fact that this was our first event since learning the full scale of the NSA’s internet-spying operations certainly played a role in many of the discussions — both onstage and in the hallways — as did a seemingly general consensus that Amazon Web Services cannot and will not be the only cloud computing provider that matters.

But what struck me most of all was the ridiculous amount of data that many of the speakers are dealing with. We’re not talking about a few terabytes here and there; rather, we’re talking about petabytes of storage and single companies — heck, single events — accounting for significant portions of all the world’s web traffic.

And then there’s CERN. The Swiss research institution is generating, processing and storing staggering amounts of data. Here are the highlights, but you’ll really want to watch Tim Bell’s entire presentation for the full and fascinating story behind why it’s growing so fast and how it’s doing so without crashing and burning:

  • CERN’s 100-megapixel cameras take 40 million pictures a second of proton collisions — creating 1 petabyte of data per second that needs to be filtered down to reasonable levels for analysis and retention.
  • CERN currently keeps about 35 petabytes of data per year, which scientists want to retain for 20 years. Its archival system consists of 45,000 tape drives.
  • CERN’s cloud environment currently consists of 50,000 cores and is expected to grow to about 300,000 cores by 2015.
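
To put those figures in perspective, here is a rough back-of-envelope sketch in Python. It uses only the numbers quoted above, plus a few assumptions that are mine rather than Bell's: decimal petabytes, a flat 35 petabytes retained every year, and (for the filtering ratio only) detectors that run nonstop all year, which they almost certainly don't. The function and variable names are purely illustrative.

```python
# Back-of-envelope arithmetic on the CERN figures quoted in the list above.
# Assumption: decimal units (1 PB = 10**15 bytes, 1 MB = 10**6 bytes).
PB = 10**15
MB = 10**6

RAW_BYTES_PER_SECOND = 1 * PB      # raw camera output (from the talk)
PICTURES_PER_SECOND = 40_000_000   # collision pictures per second (from the talk)
KEPT_PB_PER_YEAR = 35              # data retained per year (from the talk)
RETENTION_YEARS = 20               # desired retention period (from the talk)
CORES_NOW = 50_000                 # cloud cores today (from the talk)
CORES_2015 = 300_000               # expected cloud cores by 2015 (from the talk)


def summarize():
    """Return rough derived figures; every value is an estimate, not a CERN number."""
    # Average raw size of a single collision picture.
    mb_per_picture = RAW_BYTES_PER_SECOND / PICTURES_PER_SECOND / MB

    # Archive size after the full retention period.
    # Assumption: the yearly intake stays flat at 35 PB.
    archive_pb = KEPT_PB_PER_YEAR * RETENTION_YEARS

    # How many times larger the cloud is expected to get.
    core_growth = CORES_2015 / CORES_NOW

    # Upper bound on how aggressively raw data is filtered.
    # Assumption: detectors run every second of the year (they do not).
    seconds_per_year = 365 * 24 * 3600
    raw_pb_per_year = RAW_BYTES_PER_SECOND * seconds_per_year / PB
    filter_ratio = raw_pb_per_year / KEPT_PB_PER_YEAR

    return {
        "mb_per_picture": mb_per_picture,
        "archive_pb": archive_pb,
        "core_growth": core_growth,
        "filter_ratio_upper_bound": filter_ratio,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.1f}")
```

Under those assumptions, each picture averages about 25 megabytes, 20 years of retention adds up to roughly 700 petabytes, and the cloud is slated to grow six-fold. The filtering ratio is only a ceiling, since it depends entirely on the nonstop-operation assumption.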

A transcript of Bell’s talk is available here.

All told, however, Structure: Europe was a great conference and a few statistics don’t really do it justice. Check out the live coverage page, read the blog posts and, if you have time, watch the sessions. Whether your interests lie in big data or European cloud strategy, privacy or webscale infrastructure, I think it’ll be worth the time.
