After years of struggling through a public identity crisis, it appears Yahoo has decided, for better or worse, that it’s a content company. There will be no Yahoo smartphones or operating systems, no Yahoo Fiber, and no Yahoo drones, robots or satellites. But that doesn’t mean the company can’t innovate.
When it comes to the future of web content, in fact — how we’ll find it, consume it and monetize it — Yahoo might just have the inside track on innovation. I spoke recently with Ron Brachman, the head of Yahoo Labs, who’s now managing a team of 250 (and growing) researchers around the world. They’re experts in fields such as computational advertising, personalization and human-computer interaction, and they’re all focused on the company’s driving mission of putting the right content in front of the right people at the right time.
Really, it’s all about machine learning
However, Yahoo Labs’ biggest focus appears to be on machine learning, a discipline that can easily touch nearly every part of a data-driven company like Yahoo. Labs now has a dedicated machine learning group based in New York; some of its members are working on what Brachman calls “hardcore science and some theory,” while others are building a platform that will open up machine learning capabilities across Yahoo’s employee base.
There’s a related data science group, also in New York, that’s doing more applied research with product teams, “and we’ve hired machine learning scientists into almost every other group we have,” Brachman said. They’re working on everything from advertising to data centers, from social-graph analysis to network security.
But advertising is what pays the bills at Yahoo, and if there’s nobody to view the content, there’s nobody to see the ads. That’s why a lot of machine learning research is focused on making it easier for Yahoo’s users to get what they need. That means making images and videos as searchable as web pages, and making everything more searchable using natural language.
And while Yahoo Labs has hired a large number of Ph.D.s since Marissa Mayer became CEO, some of its talent in the content space has come, fortuitously, from acquisitions made with little or no input from Brachman. One of Summly’s technical leaders, for example, joined Yahoo Labs after that acquisition and created a technology for summarizing multiple documents that Brachman says is integral to the Yahoo homepage. He said the SkyPhrase team, which Yahoo acquired in December 2013, is a natural fit for Yahoo Labs given its research background and its cutting-edge natural-language processing technology.
“We want people to be able to access Yahoo products from wherever they are, through whatever type of device they have,” Brachman said about the promise of NLP. It might be that users have vision problems, or aren’t in front of a screen but still need to know the answer to a question or track down a piece of content.
“I think in the longer-term future of Yahoo … natural language understanding is going to be very important,” he added.
Cranking up deep learning and artificial intelligence
Right now, though, when many people think of machine learning and the web, they’re thinking of computer vision. Whether it’s based on deep learning or some other set of techniques, everyone (Google, Microsoft, Facebook, Pinterest, Dropbox, Twitter) seems to be investing in figuring out how to make sense of all the visual content they have. And indeed, Brachman said, Yahoo Labs is “doing all kinds of cool stuff with Flickr and image search,” and is also working on ways of indexing and recommending videos.
He specifically called out work on computer vision algorithms that can determine what makes a good picture of a human (the right angles, right colors, etc.) and automatically curate search results to put the best images up top. Brachman’s team has also developed a method for producing video “thumbnails” so users can get a better sense of what the videos are about and what they contain. Then, of course, there are object-recognition efforts, led largely by researcher Jia Li, to automatically tag Flickr images so users don’t need to know how the images are titled or tagged in order to actually find them.
“We’re doing some of that, as well,” Brachman said, referencing public claims by Google and Microsoft about the gains their deep learning research has produced in the accuracy of image classification. However, he added, “We haven’t made a big public fuss about this like some of our colleagues out there have done.”
In fact, Bobby Jaros, the co-founder and CEO of LookFlow, a deep-learning-based computer vision startup Yahoo bought in October 2013, was originally embedded within the Flickr team but has joined Yahoo Labs in an effort to build out a deep learning team. Presumably, that team will work on applying those techniques in areas beyond what Li is already researching in computer vision. Brachman cited advertising, recommendations, personalization and privacy as other areas where “neural-net-flavored deep learning will be a nice additional tool in our toolbox.”
He’s excited, but realistic, about what deep learning and other new approaches to artificial intelligence could mean to a company like Yahoo over the next few years. “Back in the earlier days … we were doing research that felt more speculative than it does now, in a way, because so much of artificial intelligence has become real,” Brachman said.
For example, when he created the DARPA framework that ultimately led SRI to develop Siri, he was skeptical that the work could be completed in five years, and now “Siri is in the pocket of some humongous number of people around the world,” he said. These successes have inspired the machine learning community as a whole, which is coming around to the idea of scaling up all sorts of approaches previously confined to laboratories. Work that was deemed futuristic 20 years ago is now legitimate, which is not something Brachman necessarily would have predicted.
“People are starting to entertain those things because they’ve seen artificial intelligence impact society and business,” Brachman said.
But it won’t all be smooth sailing as today’s researchers try to take today’s hot techniques to the next level. Brachman said there are still “huge advances” that need to happen before we have full-scale AI systems, starting with relatively simple things such as connecting the results of a deep learning algorithm to a knowledge graph for the sake of correcting the learning model’s mistakes. “No one knows how to do that,” he said.
Tying it all together into a ubiquitous Yahoo
When everything comes together (everything Yahoo Labs is presently working on, and probably some stuff not yet on its plate), Brachman envisions a Yahoo that’s as ubiquitous as computers seem destined to be. Phones, watches, public terminals, brain implants: Yahoo wants to be able to deliver content to all of them. That probably means rethinking the inputs (e.g., typing, voice, and probably even video and location data) as well as what the applications actually look like and how users expect stuff to be delivered in any given situation to any given device.
“We really want to understand the human element of being mobile,” Brachman said. “If we ever build an integrated, useful Yahoo presence,” he added, “it needs to know about all these things.”
Of course, this being Yahoo, advertising will always play a role in how the company designs its future. Brachman said areas such as “advertising science” continue to be important, even though it has been years since Yahoo first claimed to have mastered the targeted ad. As devices and ad types evolve, design strategies and optimization algorithms need to evolve with them. One of Yahoo Labs’ bigger recent projects, for example, was a new advertising platform called Gemini that lets advertisers manage their mobile search and native advertising campaigns through a single system.
In the ubiquitous computing future Brachman predicts, Yahoo and its advertisers will have to figure out how to make ads something people actually desire. He points to current-day fashion and bridal magazines as media that people actually buy for the ads. Others might point to a digital platform like Pinterest.
“What we need to think about at a deep conceptual level is ‘What is content?’ ‘What are communications between humans?’ ‘What is advertising?’” Brachman said. “… All in the abstract, all independent of how it gets to a person.”
Update: This post was updated at 5:26 to correct the LookFlow co-founder who has joined Yahoo Labs. It is Bobby Jaros, not Simon Osindero.