Netflix uses data for a lot more than just recommendations

Netflix is famous for the way it uses algorithms to determine what programs or movies its members might want to watch, but data plays a much broader role inside the company’s streaming service than just informing recommendations. In a blog post on Wednesday, the company explained how it analyzes data to do everything from optimizing playback quality to identifying poorly translated subtitles.

The post, written by Netflix director of streaming science and algorithms Nirmal Govind, highlights several areas in which better algorithms could improve the Netflix experience, focusing largely on how to ensure the best possible playback in any given situation or, at least, how to ensure users are getting the playback quality they expect. It might be easy enough to find the right theoretical tradeoff between bit rate and rebuffer rates on streaming videos, or to figure out where (geographically) to place which content on the Open Connect content-delivery network, but nothing is that simple in practice.

“[W]e need to determine a mapping function that can quantify and predict how changes in [quality of experience] metrics affect user behavior,” Govind wrote. He continued:

“With vast amounts of data, the mapping function discussed above can be used to further improve the experience for our members at the aggregate level, and even personalize the streaming experience based on what the function might look like based on each member’s ‘QoE preference.’ Personalization can also be based on a member’s network characteristics, device, location, etc.”
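To make the idea of a "mapping function" concrete, here is a minimal sketch, not Netflix's method: it fits a straight line relating one quality-of-experience metric to one behavior metric using ordinary least squares. The choice of metrics (rebuffers per hour, minutes watched), the function names and the numbers are all made up for illustration; a real mapping function would presumably involve many metrics and a far richer model.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope: float, intercept: float, x: float) -> float:
    """Apply the fitted mapping to a new QoE value."""
    return slope * x + intercept


# Hypothetical, hand-made data: rebuffer events per hour vs. minutes watched.
rebuffers_per_hour = [0.0, 1.0, 2.0, 3.0, 4.0]
minutes_watched = [60.0, 55.0, 50.0, 45.0, 40.0]

slope, intercept = fit_line(rebuffers_per_hour, minutes_watched)
print(f"each extra rebuffer per hour changes viewing by {slope:.1f} minutes")
print(f"predicted minutes at 2.5 rebuffers/hour: {predict(slope, intercept, 2.5):.1f}")
```

Personalization, in this toy framing, would amount to fitting a separate line per member or per device class rather than one line for everyone.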

However, the most interesting use of data Govind discussed might be how Netflix is using natural-language processing and text analysis to improve the actual quality of the movies and shows it streams. Audio and video quality may be paramount, but the accuracy of closed captions and subtitles is becoming a bigger problem as Netflix expands globally. Some of these issues are identified via Netflix’s own quality checks, but others are peppered throughout scores of member comments and feedback.

The supply chain through which Netflix is trying to optimize quality. Source: Netflix

Govind highlights a couple of ways Netflix is trying to solve these problems:

“[W]e can detect viewing patterns such as sharp drop offs in viewing at certain times during the show and add in information from member feedback to identify problematic content. Machine learning models along with natural language processing (NLP) and text mining techniques can be used to build powerful models to both improve the quality of content that goes live and also use the information provided by our members to close the loop on quality and replace content that does not meet the expectations of Netflix members.”
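The "sharp drop offs in viewing" Govind mentions can be illustrated with a very simple rule. The sketch below is an assumption about how one might start, not a description of Netflix's system: it takes a count of viewers still watching at each minute of a title and flags any minute where the audience shrinks by at least a chosen fraction compared with the minute before. The function name, the 20 percent threshold and the sample counts are hypothetical.

```python
from typing import List


def find_sharp_dropoffs(
    viewers_per_minute: List[int], min_drop_fraction: float = 0.2
) -> List[int]:
    """Return the minute indices where the audience fell by at least
    min_drop_fraction relative to the previous minute."""
    flagged = []
    for minute in range(1, len(viewers_per_minute)):
        previous = viewers_per_minute[minute - 1]
        current = viewers_per_minute[minute]
        # Skip minutes with no prior audience to avoid dividing by zero.
        if previous > 0 and (previous - current) / previous >= min_drop_fraction:
            flagged.append(minute)
    return flagged


# Made-up counts of viewers still watching at each minute of a show.
viewers = [1000, 980, 965, 950, 600, 590, 585]
print(find_sharp_dropoffs(viewers))
```

A flagged minute would only be a starting point. As the quote suggests, it would then be cross-checked against member feedback to decide whether the content itself (a bad subtitle track, say) is the cause.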

Improving subtitles and captions, and filtering through mountains of comments to find relevant ones, sound like good candidates for the deep learning models that Netflix is experimenting with.

Techniques aside, though, using data to improve the viewing experience is arguably more important to Netflix’s continued success than are accurate recommendations. Yes, the easier it is to find programs you want to watch, the easier it is to watch them. But at the end of the day, Netflix has the same issues as other seemingly invincible companies like Facebook and Google (issues that we’ll delve into in detail at our Structure conference next week): loyalty on the web can be easy come, easy go. If performance starts slipping, those users will start looking elsewhere.

Feature image courtesy of Shutterstock user Twin Design.