Summary:

Netflix is building its own customized server boxes to handle the massive amount of streaming data the company has to deal with, but it is also looking to new technologies to make that job cheaper and easier.

David Fullagar (Netflix) and Brendan Collins (HGST, a Western Digital company) at GigaOM Structure:Europe 2013
photo: Anna Gordon/GigaOM


Session Name: Lessons from Beyond the Edge: Streaming 114,000 Years of Video Every Month.

Chris Albrecht
Jo Maitland
David Fullagar
Brendan Collins

Chris Albrecht 00:04

Thank you David, thank you panel, really appreciate it. Up next, this is going to be a good one. Well, that was a good one, they’re all good ones. Are they off the stage yet? Good. We have Lessons from Beyond the Edge: Streaming 114,000 Years of Video Every Month, and that’s going to be a discussion between Jo Maitland – she’s a research director for GigaOM Research, remember to visit the booth out there – and she’s going to be talking with Brendan Collins, the VP of product marketing for HGST, a Western Digital company, and David Fullagar, the director of content delivery architecture for Netflix, so please welcome our next panel to the stage.

Jo Maitland 00:49

Hey, good morning everybody, I think we’re about three sessions away from lunch – just a heads up on that. But before lunch let’s get into thinking about what Netflix does. Super interesting company, kind of pioneering in the infrastructure space. David here is from Netflix, and I guess I’d like to kick off first of all and just ask you what the difference is between your group and what, say, runs within the Amazon cloud. How do you fit in with what they’re doing?

David Fullagar 01:23

So Netflix basically has infrastructure in two different groups. The vast majority of the work we do is in Amazon’s cloud, and so that would be things like encoding video, doing a large amount of the calculations around personalization on the website and on devices, and the actual operation of the website. But for the content delivery, so that’s after people press play, how we get the video bits to them, because that’s a very scale-intensive business on the network side rather than the compute side, we historically used third-party CDNs, and over the last two years we’ve transitioned the bulk of that traffic to our own CDN. And so that’s the group that I’m part of, and so we develop our own software on our own specialized hardware platform. And we run a number of facilities where we exchange traffic in the countries we operate, and we also have the same hardware co-located inside service providers to provide a good quality of experience.

Jo Maitland 02:26

So the previous CDN companies would be like Akamai, Level 3, those guys. So what can you do in building your own CDN that you couldn’t get from, say, Akamai?

David Fullagar 02:38

So to a large extent what we do in our own CDN is a very specialized type of video streaming. When you’re trying to operate a large multi-tenant CDN you have to make compromises across different types of content, whether it’s software, video, webpages or images. By focusing on just our own video traffic we can make very efficient server hardware in terms of the footprint and the number of simultaneous streams per unit of hardware storage. And we also just have the scale in the countries we operate in. We’re often tens of percent of end users’ bandwidth. And because of that it makes it easier to put that highly concentrated traffic in a box that can then be placed further within networks to save costs.

Jo Maitland 03:27

Okay, and so just sort of talking about simultaneous streams, just give us an idea in Europe what the Netflix traffic looks like. How big is that load?

David Fullagar 03:39

Sure. So, we obviously started in the US domestically and we expanded into the rest of North and South America. And about a year and a half ago we launched in the UK and Ireland, and then followed that last summer in the Nordic countries and last week in the Netherlands. And basically it takes a period of one to two years to grow the traffic in each of those regions. But we now do more than a terabit of traffic in Europe, and that’s growing week by week as subscribers come online and as subscribers are using more and more hours of Netflix each week versus the other types of internet video that they watch.
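As a back-of-the-envelope check on the session title, the sketch below converts 114,000 years of video streamed per month into an implied average number of simultaneous streams and aggregate bandwidth. The average stream bitrate is an assumption for illustration only, not a figure given on the panel.

```python
# Rough arithmetic sketch: "114,000 years of video per month" implies roughly
# how many average simultaneous streams, and how much aggregate bandwidth,
# at an assumed (illustrative) average bitrate per stream.

SECONDS_PER_YEAR = 365.25 * 24 * 3600
SECONDS_PER_MONTH = SECONDS_PER_YEAR / 12

video_years_per_month = 114_000
avg_bitrate_mbps = 3.0  # assumed average stream bitrate, purely illustrative

video_seconds = video_years_per_month * SECONDS_PER_YEAR
avg_simultaneous_streams = video_seconds / SECONDS_PER_MONTH  # = 114,000 * 12
avg_aggregate_tbps = avg_simultaneous_streams * avg_bitrate_mbps / 1e6

print(f"average simultaneous streams: {avg_simultaneous_streams:,.0f}")
print(f"implied average aggregate traffic: ~{avg_aggregate_tbps:.1f} Tbps")
```

The result, on the order of a million-plus average streams, is consistent with the "couple million people streaming simultaneously" peak figure David mentions later in the conversation.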

Jo Maitland 04:30

So a terabit of traffic at any one time is traveling over the–

David Fullagar 04:33

Yeah, so our content is 24/7. We obviously have prime time peaks and we typically see more traffic on Friday, Saturday, Sunday nights, but it’s a pretty consistent amount of traffic. People do watch in the middle of the morning and kids programming in particular on weekends.

Jo Maitland 04:52

Right, so one of the things we want to ask you two about is this idea of building your own server, which you guys did, versus buying off the shelf. What were you able to do in building your own box that you couldn’t get off the shelf?

David Fullagar 05:08

So as part of the project we wanted to make a very simple building-block server. We didn’t want to have racks of multiple servers, and a switch, and have to hook all of those up. And so the server has multiple 10 gig connections to it. One of the key aspects of the project is that our catalog, the content we offer, is relatively large, and so there’s more than a petabyte of content available on Netflix. But to get decent offload of traffic, we preposition between 100 and 150 terabytes of content in our server. And to get that storage density, even with the highest-capacity 3.5-inch drives, which are 4 terabytes each, you need a big box. And we wanted to have a box that was somewhat manageable. It’s still over 100 pounds, so it’s pretty heavy, but we wanted that very large storage density. And we don’t have to do a large amount of computation. So we have very efficient software that’s taking the movies off the drives and putting them onto the network interface. There’s not an awful lot of calculation, we don’t have to do any re-encoding, and so we pre-compute it.
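A quick sketch of the storage-density arithmetic David describes: prepositioning 100 to 150 terabytes on 4-terabyte drives implies a chassis holding a few dozen drives, which is why the box ends up so large and heavy. The counts below are simple ceiling divisions, not the actual chassis layout.

```python
# How many 4 TB drives are needed to hold the prepositioned catalog sizes
# mentioned in the conversation (illustrative ceiling division only).

drive_capacity_tb = 4
for target_tb in (100, 150):
    drives_needed = -(-target_tb // drive_capacity_tb)  # ceiling division
    print(f"{target_tb} TB of content needs at least {drives_needed} x {drive_capacity_tb} TB drives")
```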

Jo Maitland 06:23

So, Brendan, what is it like for you guys at HGST when these kids go off and build their own boxes? I mean, your whole company, that’s what you do, and then Netflix is deciding, “Well, we can build it ourselves.” What does that mean for your–

Brendan Collins 06:36

Well, traditionally our customer base had been the bigger OEMs, the server OEMs, the subsystem OEMs. But over the past – I would say five to seven years – we’ve been partnering with some of the bigger hyperscale cloud service providers, the big social media companies. So those guys have been designing their own data centers, their own servers, their own file systems. So working with these guys is very, very similar. If you’re a hard drive supplier to the industry and you’re in a Wintel world, everything is very commoditized. With guys like Netflix, we get to differentiate products more. Since these guys have their own servers, they’re free to design whatever they want. We find that customers like Netflix, and Google and Facebook, are a lot more willing to single-source and differentiate. And there are more opportunities for us to add value.

Jo Maitland 07:36

So are they a single-source supplier to you guys on the drive side?

David Fullagar 07:45

They are for hard drives at the moment.

Jo Maitland 07:47

Wow. So tell me more about building that box. I’m just curious about some of the lessons you learned along the way of building your own.

David Fullagar 07:58

Sure. I should comment that it’s a custom chassis but all the components are standard PC commodity server components. The vast amount of space inside the box, and the cost of building the box, is the storage itself. It’s about 75 to 80% of the cost. And then we put a relatively generic Intel server CPU in front of that, and then as much cost-effective memory as we can. And then the other expensive component of the box is the fast network interface card with lots of ports to enable us to have good throughput. And we would prefer not to build our own custom chassis; it’s just one of the things that we have to focus on, and we don’t get the economies of scale of other people using it as well. So we talk about the box on our website, people can see how we build it, but the reality is in the future we’re hoping that we’ll be able to buy off-the-shelf, very compact high-storage units from one of the vendors, which hasn’t happened yet.

Jo Maitland 09:10

Brendan, how about other trends in the web-scale space that you see from the storage side? Things like cold storage, interesting things with flash. What’s the most exciting thing from your perspective for web-scale expansion?

Brendan Collins 09:30

I would say the big difference for us is when we look at the big hyperscale data centers. And these are big, so these are like 10 football fields with 100,000 servers in each one. A lot of the folks, whether it’s the social media or the e-mail giants, that are storing all this data, which is growing exponentially while their IT budgets are flat, are coming to us and saying, “As a hard-drive supplier, you’ve got to help us beyond just supplying capacity. You’ve got to help us address our OpEx budget.” So if you look at the makeup of cost at the data center, I would say 40% of it is storage, is hard drives. And another 40% is OpEx. So they’re asking us to innovate and come up with new technologies that, as well as just increasing capacity, can help them reduce power and cooling and footprint in the data center. So we’re off today developing new classes of storage that use helium as the gas inside a hard-disk drive. If you can imagine, today’s hard-disk drives spin in air, so there’s a lot of vibration and turbulence inside. And when you fill the drive with helium all of that vibration and turbulence goes away. So you can now put more disks in there, and you draw less power. So we can now deliver probably 40 to 50% more capacity and at the same time reduce power and cooling.
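To make the helium claim concrete, here is an illustrative comparison of building one petabyte of raw capacity with air-filled versus helium-filled drives. The per-drive capacity and power figures below are assumptions for the sake of the arithmetic, not HGST specifications.

```python
# Illustrative comparison only: "40 to 50% more capacity" per drive at lower
# power means fewer drives (and fewer watts) for the same total capacity.
# All per-drive numbers are assumed, not vendor data.

air_drive_tb, air_drive_watts = 4.0, 7.0   # assumed air-filled drive specs
helium_uplift = 0.45                       # midpoint of "40 to 50% more capacity"
he_drive_tb = air_drive_tb * (1 + helium_uplift)
he_drive_watts = air_drive_watts * 0.8     # assumed modest per-drive power saving

target_tb = 1000.0  # compare builds for one petabyte of raw capacity
air_drives = target_tb / air_drive_tb
he_drives = target_tb / he_drive_tb
print(f"air:    {air_drives:.0f} drives, ~{air_drives * air_drive_watts / 1000:.2f} kW")
print(f"helium: {he_drives:.0f} drives, ~{he_drives * he_drive_watts / 1000:.2f} kW")
```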

Jo Maitland 11:07

Is this the drive that’s completely sealed?

Brendan Collins 11:10

Yes it is.

Jo Maitland 11:11

Right, I think I saw that, yeah.

Brendan Collins 11:13

So that’s actually where a lot of the IP is. I’ve been in the hard-drive industry for 20 years. I’d say in the last 10 years, all of the hard-drive vendors have tried to develop helium as a technology but have never managed to adequately seal it, or been able to build it in volume at low cost.

Jo Maitland 11:31

And have you guys tested this yet? Have you seen it or–

David Fullagar 11:35

Yeah, we have seen [inaudible]. The big advantage for us is incrementally more storage density, with reasonable price economics, and the other nice advantage so far is we’re seeing nicely low power usage. Our goal when we build a box is to have a maintenance-free box. So drives will still fail over a 3 to 5 year lifespan, but they should fail in place in the box and we won’t have to do any field maintenance whatsoever. And so we want as much storage density as possible because we’re building in the fact that we won’t be able to go and hot-swap drives out. And it’s really just a design decision to make it as maintenance-free as possible, so we can put it into facilities where there isn’t 24/7 remote hands. That’s a huge advantage for us.

Jo Maitland 12:36

So just removing the cost of somebody having to go out there to the colo–

David Fullagar 12:40

And even the burden of contractually working out how to do it. When we talk with partners about that aspect of things, it’s just one less thing to have to discuss, the fact that the box should be very resilient. And if we do have any issues, things can go wrong with any of the components, we’ll actually advance-RMA a brand new box, then ship the old one back and fix it at base rather than trying to do any on-site maintenance.

Jo Maitland 13:07

So as you look out over the next few years for Netflix, clearly there’s no end in sight in terms of traffic growth. What are the challenges? Is it power and cooling that’s going to keep you up? Just the density issue of packing more terabytes in–

David Fullagar 13:26

Yeah, so as part of server design we’re really trying to optimize that power versus throughput footprint. And the space itself is somewhat important. We see a good evolution of drive technology, and we’re using more and more flash. We have flash in all our units as sort of a hybrid mechanism, with the most popular content existing there at the application level. And at very, very large sites we actually use flash-only units, because in that scenario we can generate enough demand off a single title to enable it to be on a flash-only unit.
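A minimal sketch of the kind of popularity-based tiering David describes, in which the hottest titles are kept on flash and everything else stays on spinning disks. The function name, threshold, and data structures here are hypothetical and are not Netflix’s design.

```python
# Hypothetical popularity-based placement: the top slice of titles by request
# count goes to flash, the rest to disk.

from typing import Dict, List

def place_titles(request_counts: Dict[str, int], flash_fraction: float = 0.05) -> Dict[str, List[str]]:
    """Put the top flash_fraction of titles (by requests) on flash, rest on disk."""
    ranked = sorted(request_counts, key=request_counts.get, reverse=True)
    cutoff = max(1, int(len(ranked) * flash_fraction))
    return {"flash": ranked[:cutoff], "disk": ranked[cutoff:]}

placement = place_titles({"title_a": 90_000, "title_b": 1_200, "title_c": 800, "title_d": 50})
print(placement)  # {'flash': ['title_a'], 'disk': ['title_b', 'title_c', 'title_d']}
```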

Jo Maitland 14:04

So I know this is a slightly different topic than the boxes, but I know people here are probably kind of interested: there’s been a lot of conversation about Netflix’s algorithm and how you guys decide what to show customers and stuff. How much – and part of the conference here is thinking about big data and analytics and things – how much of the decisions around what Netflix decides to create, now being a company that’s also creating content, is derived from looking at the algorithm and the results the algorithm is telling you about what people are interested in, versus the content people who write shows? You know there’s a lot of talk about Netflix and this algorithm, and I’m curious to just hear it from your perspective.

David Fullagar 14:52

So, as a company we use data an awful lot to make informed decisions. So the content group that’s purchasing either pre-produced content that already exists or, now with our original series, going out and hearing pitches and trying to work out the genres that would be really interesting to our audience, certainly uses a large amount of the user data that we have from the service. Originally that was DVD user data, now it’s predominantly streaming user data. And what we found is that, because we have this broad spectrum of subscribers, we can work out different niche content that would potentially find it hard to get on US broadcast television, with the pilot mechanism that exists where a show has to succeed very, very early on in its total lifespan. And so we can acquire a season of new television knowing that the show will have an audience after, say, 10 episodes. It doesn’t have to have an audience after 30 minutes.

Jo Maitland 15:57

So there’s that sort of business side of it, but then there’s also just thinking about what people actually are going to consume and what can be popular. But you’re basically saying that because you can run something for 13 episodes without worrying about the pilot, you can take that risk.

David Fullagar 16:17

Yeah, that’s right. And we often find that shows that have been broadcast before and not been as successful do very well on Netflix. So obviously cult-type shows work relatively well. And for television in particular, people seem to like watching things in batches together rather than spread out over multiple weeks or years.

Jo Maitland 16:42

So that’s this whole notion of binge-watching things. Does that culture, that dynamic that’s happening impact the way you’re thinking about building the network and the–

David Fullagar 16:54

It doesn’t really. Even the most popular shows are a fraction of a percent of traffic for us. And so we still have this very large middle tail of content. We have very little incredibly popular content, and we don’t really have a social media-style incredibly long tail. And so from a delivery point of view, we don’t think about caching content in memory as it goes onto the network. Even when we have a couple million people streaming Netflix simultaneously, very few people are watching the same title at the same place at the same bitrate in the same geography. And so with that, our system really is optimized to pull movies off hard drives directly onto the network interface. We’re not trying to cache it in any sort of way.
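A minimal sketch of serving bytes from disk straight onto a socket without an application-level cache, in the spirit of what David describes. It uses os.sendfile, a zero-copy path available on Linux and FreeBSD; this is an illustration under those assumptions, not the Open Connect implementation.

```python
# Stream a file from disk to an already-connected socket without buffering it
# in application memory, using the kernel's sendfile path.

import os
import socket

def serve_file(conn: socket.socket, path: str, chunk: int = 1 << 20) -> None:
    """Send the whole file to the socket, one chunk at a time."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        offset = 0
        while offset < size:
            sent = os.sendfile(conn.fileno(), f.fileno(), offset, chunk)
            if sent == 0:  # peer closed the connection
                break
            offset += sent
```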

Jo Maitland 17:51

Got it. We’re on time here guys, we’ve got to wrap up. Thank you very much.

Brendan Collins 17:55

Thank you.
