In this episode, Byron and Jeff talk about AGI, machine learning, and healthcare.
Byron Reese: Hello, this is Voices in AI brought to you by Gigaom. I am your host, Byron Reese. Today we welcome Jeff Dean onto the show. Jeff is a Google Senior Fellow and he leads the Google Brain project. His work probably touches my life, and maybe yours, about every hour of every day, so I can’t wait to begin the conversation. Welcome to the show Jeff.
Jeff Dean: Hi Byron, this is Jeff Dean. How are you?
I’m really good, Jeff, thanks for taking the time to chat. You went to work for Google, I believe, in the second millennium. Is that true?
Yes, I did, in 1999.
So the company wasn’t even a year old at that time.
That’s right, yeah it was pretty small. We were all kind of wedged in the second-floor office area, above what is now a T-Mobile store in downtown Palo Alto.
And did it feel like a start-up back then, you know? All the normal trappings that you would associate with one?
We had a ping pong table, I guess. That also doubled as where we served food for lunch. I don’t know—yeah, it felt exciting and vibrant, and we were trying to build a search engine that people would want to use. And so there was a lot of work in that area, which is exciting.
And so, over the last seventeen years… Just touch on, it’s an amazing list of the various things you’ve worked on.
Sure. The first thing I did was put together the initial skeleton of what became our advertising system, and I worked on that for a little while. Then mostly for the next four or five years I spent my time with a handful of other people working on our core search system. That’s everything from the crawling system, which goes out and fetches all the pages on the web that we can get our hands on, to the indexing system that then turns that into a system that we can actually query quickly when users are asking a question.
They type something into Google, and we want to be able to very quickly analyze what pages are going to be relevant to that query, and return the results we return today. And then the serving system that, when a query comes into Google, decides how to distribute that request over lots and lots of computers to have them farm that work out and then combine the results of their individual analyses into something that we can then return back to the user.
And that was kind of a pretty long stretch of time, where I worked on the core search and indexing system.
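[Editor's sketch: the serving pattern Jeff describes, sending one query to many machines and merging their partial results, is often called scatter-gather. The toy below only illustrates that idea and is not Google's serving system. The shard contents, the term-count scoring, and the names `SHARDS`, `search_shard`, and `search` are all assumptions made up for this example, and threads stand in for separate computers.]

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical index shards: each maps a document ID to its text.
SHARDS = [
    {"d1": "neural networks for speech recognition", "d2": "ping pong table rules"},
    {"d3": "speech recognition with neural networks and more neural networks"},
    {"d4": "machine translation systems", "d5": "neural machine translation"},
]

def search_shard(shard, query_terms):
    """Score each document in one shard by counting query-term occurrences."""
    hits = []
    for doc_id, text in shard.items():
        words = text.split()
        score = sum(words.count(term) for term in query_terms)
        if score > 0:
            hits.append((score, doc_id))
    return hits

def search(query, top_k=3):
    """Scatter the query to every shard, then gather and rank the partial results."""
    terms = query.lower().split()
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        per_shard = pool.map(lambda shard: search_shard(shard, terms), SHARDS)
        all_hits = [hit for hits in per_shard for hit in hits]
    # Highest score first; ties are broken by document ID for a stable order.
    ranked = sorted(all_hits, key=lambda hit: (-hit[0], hit[1]))
    return [doc_id for _, doc_id in ranked[:top_k]]

results = search("neural networks")
print(results)
```

[Each shard scores only its own documents, so the merge step is the only place where the full result list exists.]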
And now you lead the Google Brain project. What is that?
Right. So, it’s basically we have a fairly large research effort around doing machine learning and artificial intelligence research, and then using the results of our research to make intelligent systems. Where an intelligent system may be something that goes into a product, it might be something that enables new kinds of products, it might be, you know, some combination of that.
When we’re working with getting things into existing products, we often collaborate closely with different Google product teams to get the results of our work out into products. And then we also do a lot of research that is sort of pure research, untied to any particular products. It’s just something that we think will advance the capabilities of the kinds of systems we’re able to build, and ultimately will be useful even if they don’t have a particular application in mind at the moment.
“Artificial intelligence” is that phrase that everybody kind of disowns, but what does it mean to you? What is AI? When you think about it, what is it? How would you define it in simple English?
Right, so it’s a term that’s been around since the very beginning of computing. And to me it means essentially trying to build something that appears intelligent. So, the way we distinguish humans from other organisms is that we have these higher-level intelligence capabilities. We can communicate, we can absorb information, and understand it at a very high level.
We can imagine the consequences of doing different things as we decide how we’re going to behave in the world. And so we want to build systems that embody as many aspects of intelligence as we can. And sometimes those aspects are narrowly defined, like we want them to be able to do a particular task that we think is important, and requires a narrow intelligence.
But we also want to build systems that are flexible in their intelligence, and can do many different things. I think the narrow intelligence aspects are working pretty well in some areas today. The broad, really flexible intelligence is clearly an open research problem, and it’s going to consume people for a long time—to actually figure out how to build systems that can behave intelligently across a huge range of conditions.
It’s interesting that you emphasize “behave intelligently” or “appear intelligent.” So, you think artificial intelligence, like artificial turf, isn’t really turf—so the system isn’t really intelligent, it is emulating intelligence. Would you agree with that?
I mean, I would say it exhibits many of the same characteristics that we think of when we think of intelligence. It may be doing things differently, because I think you know biology and silicon have very different strengths and weaknesses, but ultimately what you care about is, “Can this system or agent operate in a manner that is useful and can augment what human intelligence can do?”
You mentioned AGI, an artificial general intelligence. The range of estimates on when we would get such a technology are somewhere between five and five hundred years. Why do you think there’s such a disparity in what people think?
I think there’s a huge range there because there’s a lot of uncertainty about what we actually need. We don’t quite know how humans process all the different kinds of information that they receive, and formulate strategies. We have some understanding of that, but we don’t have deep understanding of that, and so that means we don’t really know the scope of work that we need to do to build systems that exhibit similar behaviors.
And that leads to these wildly varying estimates. You know, some people think it’s right around the corner, some think it’s nearly impossible. I’m kind of somewhere in the middle. I think we’ve made a lot of progress in the last five or ten years, building on stuff that was done in the twenty or thirty years before that. And I think we will have systems that exhibit pretty broad kinds of intelligence, maybe in the next twenty or thirty years, but I have high error bars on those estimates.
And the way you describe that, it sounds like you think an AGI is an evolution from the work that we’re doing now, as opposed to it being something completely different we don’t even know. You know, we haven’t really started working on the AGI problem. Would you agree with that or not?
I think some of what we’re doing is starting to touch on the kind of work that we’ll need to build artificial general intelligence systems. I think we have a huge set of things that we don’t know how to solve yet, and that we don’t even know that we need yet, which is why this is an open and exciting research problem. But I do think some of the stuff we’re doing today will be part of the solution.
So you think you’ll live to see an AGI, while you’re still kind of in your prime?
Ah well, the future is unpredictable. I could have a bike accident tomorrow or something, but I think if you look out fifteen or twenty years, there will be things that are not really imaginable, that we don’t have today, that will do impressive things ten, fifteen, twenty years down the road.
Would that put us on our way to an AGI being conscious, or is machine consciousness a completely different thing which may or may not be possible?
I don’t really know. I tend not to get into the philosophical debates of what is consciousness. To my untrained neuroscience eye, consciousness is really just a certain kind of electrical activity in the neurons in a living system—that it can be aware of itself, that it can understand consequences, and so on. And so, from that standpoint consciousness doesn’t seem like a uniquely special thing. It seems like a property that is similar to other properties that intelligent systems exhibit.
So, absent your bicycle crash, what would that world look like, a world twenty years from now where we’ve made incredible strides in what AI can do, and maybe have something that is close to being an AGI? How do you think that plays out in the world? Is that good for humanity?
I think it will almost uniformly be good. I think if you look at technological improvements in the past, major things like the shift from an agrarian society to the one that the Industrial Revolution fueled, what used to be ninety-nine percent of people working to grow food is now a few percent of people in many countries working on producing the food supply. And that has freed up people to do many, many other things, all the other things that we see in our society, as a result of that big shift.
So, I think like any technology, there can be uses for it that are not so great, but by-and-large the vast set of things that happen will be improvements. I think the way to view this is, a really intelligent sidekick is something that would really improve humanity.
If I have a question, a very complicated thing—that today I can do via search engine, if I sit down for nine hours or ten hours and really think through and say, “I really want to learn about a particular topic, so I need to find all these papers and then read them and summarize them myself.” If I had an intelligent system that could do that for me, and I could say, “Find me all the papers on reinforcement learning for robotics and summarize them.” And the system could go back, and in twenty seconds do that, that would be hugely useful for humanity.
Oh absolutely. So, what are some of the challenges that you think separate us from that world? Like what are the next obstacles we need to overcome in the field?
One of the things that I think is really important today in the field of machine learning research, that we’ll need to overcome, is… Right now, when we want to build a machine learning system for a particular task we tend to have a human machine learning expert involved in that. So, we have some data, we have some computation capability, and then we have a human machine learning expert sit down and decide: Okay, we want to solve this problem, this is the way we’re going to go about it roughly. And then we have the system that can learn from observations that are provided to it, how to accomplish that task.
That’s sort of what generally works, and that’s driving a huge number of really interesting things in the world today. And you know this is why computer vision has made such great strides in the last five years. This is why speech recognition works much better. This is why machine translation now works much, much better than it did a year or two ago. So that’s hugely important.
But the problem with that is you’re building these narrowly defined systems that can do one thing and do it extremely well, or do a handful of things. And what we really want is a system that can do a hundred thousand things, and then when the hundred thousand-and-first thing comes along that it’s never seen before, we want it to learn from its experience to be able to apply the experience it’s gotten in solving the first hundred thousand things to be able to quickly learn how to do thing hundred thousand-and-one.
And that kind of meta learning, you want that to happen without a human machine learning expert in the loop to teach it how to do the hundred thousand-and-first thing.
And that might actually be your AGI at that point, right?
I mean it will start to look more like a system that can improve on itself over time, and can add the ability to do new novel tasks by building on what it already knows how to do.
Broadly speaking, that’s transfer learning, right? Where we take something in one space and use that to influence the other one. Is that a new area of study, or is that something that people have thought about for a long time, and we just haven’t gotten around to building a bunch of…
People have thought about that for quite a while, but usually in the context of, I have a few tasks that I want to do, and I’m going to learn to do three of them. And then, use the results of learning to do three, to do the fourth better with less data, maybe. Not so much at the scale of a million tasks… And then completely new ones come along, and without any sort of human involvement, the system can pick up and learn to do that new task.
So I think that’s the main difference. Multitask learning and transfer learning have been done with some success at very small scale, and we need to make it so that we can apply them at very large scales.
And the other thing that’s new is this meta learning work, that is starting to emerge as an important area of machine learning research—essentially learning to learn. And that’s where you’ll be able to have a system that can see a completely novel task and learn to accomplish it based on its experience, and maybe experiments that it conducts itself about what approaches it might want to try to solve this new task.
And that is currently where we have a human in the loop, to try different approaches and where we think this ‘learning to learn’ research is going to make faster progress.
There are those who worry that the advances in artificial intelligence will have implications for human jobs. That eventually machines can learn new tasks faster than a human can, and then there’s a group of people who are economically locked out of the productive economy. What are your thoughts on that?
So, I mean I think it’s very clear that computers are going to be able to automate some aspects of some kinds of jobs, and that those jobs—the things they’re going to be able to automate—are a growing set over time. And that has happened before, like the shift from agrarian societies to an industrial-based economy happened largely because we were able to automate a lot of the aspects of farm production, and that caused job displacement.
But people found other things to do. And so, I’m a bit of an optimist in general and I think, you know, politicians and policymakers should be thinking about what the society structures we want to have in place should be if computers can suddenly do a lot more things than they used to be able to. But I think that’s largely a governmental and policy set of issues.
My view is, a lot of the things that computers will be able to automate are these kinds of repetitive tasks that humans currently do because, until now, they’ve been too complicated for our computers to learn how to do.
So am I reading you correctly, that you’re not worried about a large number of workers displaced from their jobs, from the technology?
Well I definitely think that there will be some job displacement, and it’s going to be uneven. Certain kinds of jobs are going to be much more amenable to automation than others. The way I like to think about it is, if you look at the set of things that a person does in their job, if it’s a handful of things that are all repetitive, that’s something that’s more likely to be automatable, than someone whose job involves a thousand different things every day, and you come in tomorrow and your job is pretty different from what you did today.
And within that, what are the things that you’re working on—on a regular basis—in AI right now?
Our group as a whole does a lot of different things, and so I’m leading our group to help provide direction for some of the things we’re doing. Some of the things we’re working on within our group that I’m personally involved in are use of machine learning for various healthcare related problems. I think machine learning has a real opportunity to make a significant difference in how healthcare is provided.
And then I’m personally working on how can we actually build the right kinds of computer hardware and computer software systems that enable us to build machine learning systems which can successfully try out lots of different machine learning ideas quickly—so that you can build machine learning systems that can scale.
So that’s everything from working with our hardware design team, to make sure we build the right kind of machine learning hardware, to TensorFlow, an open source package that our group has produced and open-sourced about a year and a half ago. It is how we express our machine learning research ideas, and we use it for training machine learning systems for our products. And we’ve now released it, so lots of people outside Google are using this system as well, and working collaboratively to improve it over time.
And then we have a number of different kinds of research efforts, and I’m personally following pretty closely our “learning to learn” efforts, because I think that’s going to be a pretty important area.
Many people believe that if we build an AGI, it will come out of a Google. Is that a possibility?
Well, I think there’s enough unknowns in what we need to do that it could come from anywhere. I think we have a fairly broad research effort because we think this is, you know, a pretty important field to push forward, and we certainly are working on building systems that can do more and more. But AGI is a pretty long-term goal, I would say.
It isn’t inconceivable that Google itself reaches some size where it takes on some emergent properties which are well, I guess, by their definition unforeseeable?
I don’t quite know what that means, I guess.
People are emergent, right? You’re a trillion cells that don’t know who you are, but collectively… You know none of your cells have a sense of humor, but you do. And so at some level the entire system itself acquires characteristics that no parts of it have. I don’t mean it in any ominous way. Just to say that it’s when you start looking at numbers, like the number of connections in the human brain and what not, that we start seeing things of the same sort of orders in the digital world. It just invites one to speculate.
Yeah, I think we’re still a few orders of magnitude off in terms of where a single human brain is, versus what the capabilities of computing systems are. We’re maybe at like newt or something. But, yes, I mean presumably the goal is to build more intelligent systems, and as you add more computational capability, those systems will get more capable.
Is it fair to say that the reason we’ve had such a surge in success with AI in the last decade is this, kind of, perfect storm of GPUs, plus better algorithms, plus better data collection (so better training sets), plus Moore’s Law at your back? Is it nothing more complicated than that? That there have just been a number of factors that have come together? Or did something happen, some watershed event that maybe passed unnoticed, that gave us this AI Renaissance that we’re in now?
So, let me frame it like this: A lot of the algorithms that we’re using today were actually developed twenty, twenty-five years ago during the first upsurge in interest in neural networks, which is a particular kind of machine learning model. One that’s working extremely well today, but twenty or twenty-five years ago showed interesting signs of life on a very small problem… But we lacked the computational capabilities to make them work well on large problems.
So, if you fast-forward twenty years to maybe 2007, 2008, 2009, we started to have enough computational ability, and data sets that were big enough and interesting enough, to make neural networks work on practical interesting problems—things like computer vision problems or speech recognition problems.
And what’s happened is neural networks have become the best way to solve many of these problems, because we now have enough computational ability and big enough data sets. And we’ve done a bunch of work in the last decade, as well, to augment the sort of foundational algorithms that were developed twenty, thirty years ago with new techniques and all of that.
GPUs are one interesting aspect of that, but I think the fundamental thing is the realization that neural nets in particular, and these machine learning models, really have different computational characteristics than most code you run today on computers. And those characteristics are that they essentially mostly do linear algebra kinds of operations—matrix multiply vector operations—and that they are also fairly tolerant of reduced precision. So you don’t need six or seven digits of precision when you’re doing the computations for a neural net—you need many fewer digits of precision.
Those two factors together allow you to build specialized kinds of hardware for very low-precision linear algebra. And that’s what’s kind of augmented our ability to apply more computation to some of these problems. GPUs being one thing, Google has developed a new kind of custom chip called the Tensor Processing Unit, a TPU, that uses lower precision than GPUs and offers significant performance advantages, for example. And I think this is an interesting and exploding area. Because when building specialized hardware that’s tailored to a subset of things, as opposed to very general kinds of computations like a CPU does, you run the risk that that specialized subset is only a little bit of what you want to do in a computing system.
But the thing that neural nets and machine learning models have today is that they’re applicable to a really broad range of things. Speech recognition and translation and computer vision and medicine and robotics—all these things can use that same underlying set of primitives, you know, accelerated linear algebra to do vastly different things. So you can build specialized hardware that applies to a lot of different things.
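[Editor's sketch: to make the "tolerant of reduced precision" point concrete, the toy below runs a matrix-vector multiply twice, once in ordinary floating point and once with the weights and inputs rounded to 8-bit integers. The 8-bit width, the single shared scale factor per tensor, and the names `quantize`, `matvec`, and `quantized_matvec` are assumptions chosen for illustration. Jeff does not say which number formats the TPU uses, and this is not a description of that hardware.]

```python
import random

def quantize(values, bits=8):
    """Map floats to signed integers that share one scale factor."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for 8 bits
    max_abs = max(abs(v) for v in values) or 1.0    # avoid a zero scale
    scale = max_abs / qmax
    return [round(v / scale) for v in values], scale

def matvec(matrix, vector):
    """Plain matrix-vector product: one dot product per row."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def quantized_matvec(matrix, vector, bits=8):
    """Same product, but the multiply-accumulate runs on small integers."""
    n_cols = len(matrix[0])
    flat = [w for row in matrix for w in row]
    q_flat, w_scale = quantize(flat, bits)
    q_matrix = [q_flat[i:i + n_cols] for i in range(0, len(q_flat), n_cols)]
    q_vector, x_scale = quantize(vector, bits)
    # Integer accumulation, followed by a single rescale per output value.
    return [acc * w_scale * x_scale for acc in matvec(q_matrix, q_vector)]

rng = random.Random(0)
W = [[rng.uniform(-1, 1) for _ in range(64)] for _ in range(4)]
x = [rng.uniform(-1, 1) for _ in range(64)]

exact = matvec(W, x)
approx = quantized_matvec(W, x)
max_error = max(abs(a - b) for a, b in zip(exact, approx))
print(f"largest absolute difference: {max_error:.4f}")
```

[The integer version keeps only about two decimal digits per value, yet its outputs stay close to the floating-point ones, which is the property that lets hardware trade precision for throughput.]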
I got you. Alright, well I think we’re at time. Do you have any closing remarks, or any tantalizing things we might look forward to coming out of your work?
Well, I’m very excited about a lot of different things. I’ll just name a few…
So, I think the use of machine learning for medicine and healthcare is going to be really important. It’s going to be a huge aid to physicians and other healthcare workers to be able to give them quick second opinions about what kinds of things might make sense for patients, or to interpret a medical image and give people advice about what kinds of things they should focus on in a medical image.
I’m very excited about robotics. I think machine learning for robotics is going to be an interesting and emerging field in the next five years, ten years. And I think this “learning to learn” work will lead to more flexible systems which can learn to do new things without requiring as much machine learning expertise. I think that’s going to be pretty interesting to watch, as that evolves.
Then, beneath all the machine learning work, this trend toward building customized hardware that is tailored to particular kinds of machine learning models is going to be an interesting one to watch over the next five years, I think.
One final thought, I guess, is that I think the field of machine learning has the ability to touch not just computer science but lots and lots of fields of human endeavor. And so, I think that it’s a really exciting time as people realize this and want to enter the field, and start to study and do machine learning research, and understand the implications of machine learning for different fields of science or different kinds of application areas.
And so what’s been really exciting to see over the last five or eight years is that more and more people from all different kinds of backgrounds are entering the field and doing really interesting, cool new work in this field.
Excellent. Well I want to thank you for taking the time today. It has been a fantastically interesting hour.
Okay thanks very much. Appreciate it.
Byron explores issues around artificial intelligence and conscious computers in his upcoming book The Fourth Age, to be published in April by Atria, an imprint of Simon & Schuster.