Medicine & Transfer Learning

Voices in AI – Episode 5: A Conversation with Daphne Koller

In this episode, Byron and Daphne talk about consciousness, personalized medicine, and transfer learning.

Byron Reese: This is Voices in AI, brought to you by Gigaom. I’m Byron Reese. Today our guest is Daphne Koller. She’s the Chief Computing Officer over at Calico. She has a PhD in Computer Science from Stanford, which she must have liked a whole lot, because she shortly thereafter became a professor there for eighteen years. And it was during that time that she founded Coursera with Andrew Ng. She is the recipient of so many awards, I would do them an injustice to try to list them all. Two of them that just stick out are the Presidential Early Career Award for Scientists and Engineers, and, famously, The MacArthur Foundation Fellowship.

Welcome to the show, Daphne.

Daphne Koller: Good to be here, Byron. Thank you for inviting me.

I watched a number of your videos, and you do a really interesting thing where you open up by defining your terms often, so that everybody has, as you say, a shared vocabulary. So what is ‘artificial intelligence’ when you use that term?

Well, I think artificial intelligence is one of the harder things to define because in many ways, it’s a moving target. Things that used to be considered artificial intelligence twenty years ago are now considered so mundane that no one even thinks of them as artificial intelligence—for instance, optical character recognition.

So, there is the big lofty AI goal of general artificial intelligence, building a single agent that achieves human-level type intelligence, but I actually think that artificial intelligence should—and in many people’s minds I hope still does—encompass the very many things that five years ago would have been considered completely out of reach, and now are becoming part of our day-to-day life. For instance, the ability to type a sentence in English and have it come out in Spanish or Chinese or even Swahili.

With regard to that, there isn’t an agreed-upon definition of intelligence to begin with. So what do you think of when you think of intelligence, and secondly, in which sense is it artificial? Is it artificial like artificial turf, is it really turf, or it just pretends to be? Do you think AI is actually “intelligent,” or is it a faux imitation intelligence?

Boy, that’s a really good question, Byron. I think intelligence is a very broad spectrum that ranges from very common sense reasoning that people just take for granted, to much more specialized tasks that require what people might consider to be a deeper level of intelligence, but in many cases are actually simpler for a computer to do. I think we should have a broad umbrella of all of these as being manifestations of the phenomenon of intelligence.

In terms of it being false intelligence, no; I think what makes artificial intelligence “artificial” is that it’s humanly-constructed. That is, it didn’t organically emerge as a phenomenon, but rather we built it. Now you could question whether the new machine learning techniques are in fact organic growth, and I would say that you could make the case that if we build an architecture, that you put it in the world with the same level of intelligence as a newborn infant, and it really learns to become intelligent—maybe we shouldn’t call it artificial intelligence at that point.

But I think, arguably, the reason for the use of the word “artificial” is because it’s human-constructed as opposed to biologically-constructed.

Interestingly, McCarthy, the man who coined the phrase, later regretted it. And that actually brings to mind another question, which is: When five scientists convened at Dartmouth for the summer of 1956, to “solve the problem with artificial intelligence,” they really thought they could do it in a summer of hard work.

Because they assumed that intelligence was like, you know, in physical laws… We found just a few laws that explained all physical phenomena, and electricity just a few, and magnetism just a few, and there was a hope that intelligence was really something quite simple. You know, iteratively-complex but had just a few overriding laws. Do we still think that? Do you think that? Is it not like that at all?

That was the day of logical AI, and I think people thought that one could reason about the world using the rules of logic, where you have a whole bunch of facts that you know—dogs are mammals; Fido is a dog, therefore Fido is a mammal—and that all you would need is to write down those facts, and the laws of logic would then take care of the rest. I think we now understand that that is just not the case, and that there is a lot of complexity both on the fact side, and then how you synthesize those facts to create broader conclusions, and how do you deal with the noise, and so on and so forth.

So I don’t think anyone thinks that it’s as simple as that. As to whether there is a single, general architecture that you can embed all of intelligence in, I think some of the people who believe that deep neural networks are the solution to the future of AI would advocate that point of view. I’m agnostic about that. I personally think that that’s probably not going to be quite there, and you’re probably going to need at least one or two other big ideas, and then a heck of a lot of learning to fine-tune parts of the model to very different use models—in the same way that our visual system is quite different from our common sense reasoning system.
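The "logical AI" idea described in this answer, where you write down facts and let the laws of logic derive the rest, can be illustrated with a toy forward-chaining loop. This is only a minimal sketch: the `forward_chain` function, the tuple format for facts and rules, and the extra "mammal is an animal" rule are assumptions made for illustration, not something from the conversation.

```python
# Toy forward chaining over "is-a" facts.
# The data format and the function name are hypothetical, for illustration only.

def forward_chain(facts, rules):
    """Derive every (subject, category) pair implied by the rules.

    facts: set of (subject, category), e.g. ("Fido", "dog")
    rules: set of (narrower, broader), e.g. ("dog", "mammal")
    """
    known = set(facts)
    changed = True
    while changed:  # repeat until no new fact can be derived
        changed = False
        for subject, category in list(known):
            for narrower, broader in rules:
                if category == narrower and (subject, broader) not in known:
                    known.add((subject, broader))
                    changed = True
    return known


facts = {("Fido", "dog")}
# "dogs are mammals" is from the conversation; "mammals are animals" is an added assumption.
rules = {("dog", "mammal"), ("mammal", "animal")}

derived = forward_chain(facts, rules)
print(sorted(derived))
```

The sketch works only because every fact is clean and every rule is exact, which is the limitation the answer points to: real-world facts are noisy and far more numerous.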

And I do want to get on, in a minute, to the here and now, but just in terms of thinking through what you just said, it sounds like you don’t necessarily think that an AGI is something that we are on the way towards. You know, we can make one percent of it, and when algorithms get a little better, and computers get a little faster, and we get a little more data, we’ll evolve our way there.

It sounds like what you said is that there is some breakthrough that we need that we don’t yet have; that AGI is something very different than the, kind of, weak AI we have today. Would you agree with that?

I would agree with that. I wouldn’t necessarily agree with the fact that we are not on the right path. I think there has been a huge amount of progress in the last, well, not only in the last few years, but across the evolution of AI. But it is definitely putting us on the path there. I just think that we need additional major breakthroughs to get us there.

So with regard to the human genome, you know it’s x-number of billions of base pairs, which map to something like 700 megabytes. But most of that we share with all life, even like plants and bananas and all of that, and if you look at the part that makes us different than say a bonobo or a chimp, it may only be half of one percent.

So it may only be like three megabytes. So does that imply to you that to build an AGI, the code might be very… We are an AGI, and our intelligence is evidently built with those three megabytes of code. When working to build an AGI computer, is that useful, or is that a fair way to think about it? Or is that apples and oranges in your view?

Boy! Well, first of all, I think I would argue that a bonobo is actually quite intelligent, and a lot of the things that make us generally intelligent are shared with a bonobo. Their visual system, their ability to manipulate objects, to create tools and so on is something that certainly we share with monkeys.

Fair enough.

I think there is that piece of it. I also think that there is an awful lot of complexity that happens as part of the learning process, that we as well as monkeys and other animals go through as we encounter the world. It evolves our neural system, and so that part of it is something that emerges as well, and could be shared. So I think it’s more nuanced than that, in terms of counting the number of bits.
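The bit-counting in the question can be checked with a back-of-the-envelope calculation. The sketch below assumes roughly three billion base pairs and two bits per base (four possible nucleotides); both constants are approximations introduced here, and the half-of-one-percent figure is taken from the question as stated.

```python
# Rough size of the human genome as raw data.
# All constants are approximations chosen for illustration.

BASE_PAIRS = 3_000_000_000    # assumption: roughly 3 billion base pairs
BITS_PER_BASE = 2             # four nucleotides (A, C, G, T) fit in 2 bits
DIFFERENCE_FRACTION = 0.005   # "half of one percent", as stated in the question

total_bytes = BASE_PAIRS * BITS_PER_BASE // 8
total_mb = total_bytes / 1_000_000
difference_mb = total_mb * DIFFERENCE_FRACTION

print(f"whole genome: about {total_mb:.0f} MB")
print(f"half of one percent: about {difference_mb:.2f} MB")
```

Under these assumptions the totals land in the same range as the figures quoted in the question (several hundred megabytes overall, a few megabytes for the differing part). As the answer notes, a raw byte count says nothing about what is learned after birth.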

Right. So, we have this brain, the only AGI that we know of… And we, of course, don’t know how our brains work. We really don’t. We can’t even model a nematode worm’s 302 neurons in a computer, let alone our hundred billion. And then we have something we call the “mind,” which is a set of capabilities that the brain manifests that don’t seem to be (with the emphasis on seem to be) derivable from neurons firing.

And then you have consciousness, which of course… Nobody purports to say they know exactly how it is that hydrogen came to name itself. So, doesn’t that suggest that you need to understand the mind, and you need to understand consciousness, in order to actually make something that is intelligent? And it will also need those things.

You know, that’s a question that artificial intelligence has struggled with a lot. What is the mind, and to what extent does that emerge from neurons firing? And if you really dive into that question, it starts to relate to the notion of soul and religion, and all sorts of things that I’m not sure I am qualified to comment on. Most people wouldn’t necessarily agree with the others’ point of view on this anyway.

I think in this respect, Turing had it right. I don’t know that you’re conscious. All I can see is your observed behavior, and if you behave as if you are conscious, I take it on faith that you are. So if we build a general artificial intelligence that acts intelligent, that is able to interact with us, understand our emotions, express things that look like disappointment or anger or frustration or joy…

I think we should give it the benefit of the doubt that it has evolved a consciousness, regardless of our ability to understand how that came about.

So tell me about your newest gig, the Chief Computing Officer at Calico. Calico, according to their website, are aiming to devise interventions that slow aging and counteract age-related diseases. What’s your mission there, within that?

I came on board to create, at Calico, what you might call a second pillar of Calico’s efforts. One pillar being the science that we’re doing here, that drives toward an understanding of the basic concepts of aging, and the ability to turn that into therapeutics for aging and age-related diseases.

But we all know that biology, like many other disciplines, is turning into data science, where we (“we” being the community at large) have developed a remarkable range of technologies that can measure all sorts of things about biological systems, from the most microscopic level all the way up to the organismal level, along with interventions that allow us to perturb single genes or even single nucleotides.

And how do you take this enormity of data and really extract insights from it, is a computational question. There need to be tools developed to do this. And this is not something that biologists can learn on their own. It’s also something computer scientists can’t do on their own. It requires a true partnership between those two communities working together to make sense of the data using computational techniques, and so what I am building here at Calico is an organization within Calico that does exactly that—in partnership with our pre-existing world class biology team.

Do you think there is broad consensus, or any consensus about the bigger questions of what is possible? Like do humans need to have a natural life span? Are we going to be able to better tailor medicines to people’s genomes? What are some of those things that are, kind of, within sight?

I am very excited about the personalized medicine, precision medicine trajectory. I completely agree with you that that is on the horizon. I find it remarkably frustrating that we treat people as one-size-fits-all. And, you know, a patient walks into a doctor’s office, and there is a standard of care that was devised for a population of people that is sometimes very different from the specifics of the person… Even to the point that there are a whole bunch of treatments which were designed largely based on a cohort of men, and you have a woman coming into the doctor’s office, and it might not work for her at all. Or similarly with people of different ethnic origins.

I think that’s clearly on the horizon, and it will happen gradually over the course of the coming years. I think the ability to intelligently design medications, in a way that is geared towards achieving particular biological effects that we’re able to detect using mechanisms like CRISPR, for instance.

CRISPR, by the way, for those of you who’ve not heard of this, is a gene editing system that was developed over the last five or ten years—probably more like five—and is remarkably able to do very targeted interventions in a genome. And then one can measure the effects and say, “Oh, wait a minute, that achieved this phenotypic outcome, let’s now create a therapeutic around that.” And that therapeutic might be a drug, or it could—as we get closer to viral therapies or even gene editing—be something that actually does the exact same thing that we did in the lab, but in the context of real patients. So that’s another thing that is on the horizon, and all of this is something that requires a huge understanding of the amounts of data that are being created, and a set of machine learning artificial intelligence tools.

Now, prior to World War II, I read that we only had about five medicines. You had quinine, you had penicillin—well, you didn’t have penicillin—you had aspirin, you had morphine; and they were all, fortunately, very inexpensive.

And then Jonas Salk develops the Salk vaccine, and they ask him who owns the patent and he says, there is no patent, you can’t patent the sun. And so you know, you get the Salk vaccine, so inexpensive. Now, though, we have these issues that, you know, if you have hepatitis C and you need that full treatment, that’s $70,000. Are we not on a path to create ever more and more expensive medications and therapies that will create a huge gulf between the haves and the have-nots?

I think it’s a very important moral dilemma. I tend to be, rightly or wrongly, I guess, an optimist about this, in that I think some medications are really expensive because we don’t have productionized processes for creating a medication. And we certainly don’t have productionized processes, or even a template, for how to come up with a new medication for an indication that’s discovered.

But—and again, I am an optimist—as we get a better understanding of, for instance, the human genome and maybe the microbiome, and how different aspects of that and the environment come together to create both healthy and aberrant phenotypes, it will become easier to construct new drugs that are better able to cure people. And as we get better at it, I hope that costs will come down.

Now, that’s sort of a longer-term solution. In the shorter term, I think that it’s incumbent upon us as a society to help the have-nots who are sick to get access to the best medications, or at least to a certain common baseline of medications that are important to help people stay alive. I think that that’s a place where some societies do this well, and others maybe not so well. And I don’t think that’s fair.

Of course, you know, there are more and more people that hit the age of 100, but the number of supercentenarians (people who hit 110) seems stubbornly fixed. I mean, you can go to Wikipedia and read a list of all of them. And the number of people who’ve hit 125 seems to be, you know, zero. People who’ve hit 130, zero. Why is it that, although the number of centenarians goes way up, into the hundreds of thousands, the number of people who make it to 125 is zero?

That’s a topic that’s been highly discussed very recently. There’s been a series of papers that have talked about this. I think there’s a number of hypotheses. One that I find compelling is that what causes people to die, at a certain time in history, changes over time. I mean, there was a time, not that long ago, when women’s life spans were considerably shorter than those of men, because many of them died in childbirth. So the average lifespan of a woman was relatively shorter, until we realized that we needed to sterilize the doctor’s hands when they were delivering the baby, and now it’s different.

We discovered antibiotics, which allowed us to address many of the deaths that are attributed to pathogens, though not all of them. AIDS was a killer, and then we invented antiretroviral therapy, which allows AIDS patients to live a much longer life. So, over time, we get through additional bottlenecks that are killing people at later and later points in time. So right now, for instance, we don’t have a cure for Alzheimer’s and Parkinson’s and other forms of dementia, and that kills a lot of people.

It kills at a much later age than they would have died from in earlier cases, at earlier times in history. But I hope that at some point in the next twenty years, someone will discover a cure for Alzheimer’s, and then people will be able to live longer. So I think over time, we solve the thing that kills you next, and that allows the thing that’s next down the line to kill you next, and then we go ahead and try and cure that one.

You know, when you look at the task before you, if you are trying to do machine learning to help people live longer and healthier lives, it’s got to be frustrating that, like, all the data must be bad, because symptoms generally aren’t recorded in a consistent way. You don’t have a control, like for example twins who, five minutes into the world go down different paths.

Everybody has different genomes. Everybody eats different food, breathes different air. How much of a hurdle is that to us being able to do really good machine learning on things like nutrition, which seems, you know… We don’t even know if eggs are good for you or bad for you and those sorts of things.

It’s a huge hurdle, and I think it was one of the big obstacles to the advancement of machine learning in other domains, up until relatively recently, when people were able to acquire enough data to get around that. If you look at the earlier days of, for instance, computer vision, the data sets were tiny—and that’s not that long ago, we’re talking about less than a decade.

You had data sets with a few hundred, and a few thousand images was considered large, and you couldn’t do much machine learning on that because when you think about the variation of a standard category… Like, a wedding banquet that ranges from photos of a roomful of people milling around to someone cutting a wedding cake.

And so the variability there is extremely large, and if all you have is twenty images of a wedding banquet, you’re not going to get very far training on that. Now, the data is still as noisy—and arguably even noisier when you download it from Google Images or Flickr or such—but there’s enough of it that you get to explore a sufficient part of the space for a machine learning algorithm. So that you can, not counteract the noise, but simply accommodate it as a variability in your models.

If we get enough data on the medical side, the hope is that we’ll be able to get to a similar place where, yes, the variability will remain, but if you have enough of the ethnic diversity, and enough of the people’s lifestyle, and so on, all represented in your data set, then that will allow us to address the variability. But that requires a major data collection effort, and I think we have not done a very good job as a society of making that a priority to collect, consolidate, and to some extent clean medical data so that we can learn from it.

The UK, for instance, has a project that I think is really exciting. It’s the UK Biobank project. It’s 500,000 people that were genotyped, densely-phenotyped, and their records are tied to the UK National Health Service; so you have ongoing outcome data for them. It’s still not perfect. It doesn’t tell you what they eat every day, but they asked them that in the initial survey, so you get at least some visibility into that. I think it’s an incredibly exciting project, and we should have more of those.

They don’t necessarily have to use the exact same technique that the UK Biobank is using, but if we have medical data for millions of people, we will be able to learn a lot more. Now we all understand there are serious privacy issues there, and we have to be really thoughtful about how to do this.

But if you talk to your average patient, especially ones who are suffering from a devastating illness, you will find that many of them are eager to share some information about their medical condition to the benefit of science, so that we can learn how to treat their disease better. Even if it doesn’t benefit them, because it might be too late, it will benefit others.

So you just mentioned object recognition, and of course humans do that so well. I could show you a photograph of a little Tiki statue, or a raven, or something… And then you could instantly go through a bunch of photos and recognize it if it’s underwater, or if it’s dark, or if it’s inside, and all of that. And I guess it’s through transfer learning of some kind. How far along are we… Do we know how to do it, and we just don’t have the horsepower to do it, or do we not really even understand how that works yet?

Well, I think it’s not that there is one way to do this. There’s a number of techniques that have been developed for transfer learning, and I agree with you that transfer learning is hugely important. But right now, if you look at models—like the Computer Vision Inception Network that Google has developed, there is a whole set of layers in that neural network that were devised based on a large category of web images that have a broad range of categories. But that same set of layers is now taken, pre-trained, and with a relatively small amount of training data—sometimes even as little as zero training examples—can be used for applications that it was never intended for, like the retinopathy project, for instance, that they recently published. I think that’s happening.

Another example, also from Google, is in the machine translation realm, where they recently showed that you could use a network architecture to translate between two languages for which you didn’t have any examples of those two languages together. The machine was effectively creating an interlingua on its own, so that you’re translating a sentence in Thai into this interlingua and then producing a sentence in Swahili as an output, and you’ve never seen a pair of sentences in Thai and Swahili together. So I think we’re already seeing examples of transfer learning emerging in the context of specific domains and I think it’s incredibly exciting.
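The pattern described in this answer, reusing pre-trained layers and fitting only a small new piece on a handful of examples, can be sketched in a few lines. This is a toy under stated assumptions, not the Inception network or Google's method: `frozen_features` is a hypothetical stand-in for pre-trained layers, the "new task" (is a 2-D point near the origin?) is invented for illustration, and only the small logistic-regression head is trained.

```python
import math

# Toy transfer learning: a frozen feature extractor plus a small trainable head.
# Every name and number below is hypothetical and chosen for illustration.

def frozen_features(point):
    """Stand-in for pre-trained layers; fixed and never updated here."""
    x, y = point
    return [x * x, y * y]

def predict_proba(weights, bias, feats):
    """Logistic-regression head applied to the frozen features."""
    z = bias + sum(w * f for w, f in zip(weights, feats))
    return 1.0 / (1.0 + math.exp(-z))

def train_head(examples, epochs=2000, lr=0.1):
    """Fit only the head with plain stochastic gradient descent."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for point, label in examples:
            feats = frozen_features(point)
            error = predict_proba(weights, bias, feats) - label
            weights = [w - lr * error * f for w, f in zip(weights, feats)]
            bias -= lr * error
    return weights, bias

# Six labelled examples for the new task: 1 = near the origin, 0 = far away.
small_training_set = [
    ((0.1, 0.2), 1), ((-0.3, 0.1), 1), ((0.2, -0.4), 1),
    ((1.5, 0.2), 0), ((-0.3, 1.6), 0), ((1.2, -1.3), 0),
]

weights, bias = train_head(small_training_set)

def classify(point):
    return int(predict_proba(weights, bias, frozen_features(point)) >= 0.5)

print(classify((0.0, 0.0)), classify((1.5, 1.5)))
```

Six examples are enough here only because the frozen features already make the classes linearly separable. That is the role the pre-trained layers play in the answer above: the expensive representation is learned once on broad data, and the new task needs only a small amount of fitting on top.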

You mentioned CRISPR/Cas9 a few minutes ago. And of course it comes with the possibility of actually changing genes in a way that alters the line, right, where the children and the grandchildren have this new altered gene state. There is no legislative or ruling body that has any authority over any of that? CRISPR is cheap, and so can anybody do that?

I agree with you. I think there’s a very serious set of ethical questions there that we need to start thinking about seriously. So, in some ways, when people say to me, “Oh, we need to come up with legislation regarding the future of AI and the ethical treatment of artificial intelligence agents,” I tell them we have a good long time to think about that. I am not saying we shouldn’t think about it, but it’s not like it’s a burning question.

I think this is a much more burning question, and it comes up with editing the human genome, and I think it comes up at least as much in how do we prevent threats like someone recreating smallpox. That’s not CRISPR, that’s DNA synthesis, which also is a technology that’s here. So I think that’s a set of serious questions that the government ought to be thinking about, and I know that there is some movement towards that, but I think we’re behind the curve there.

Behind the curve in terms of we need to catch up?

Yeah, technology has overtaken our thinking about the legal and ethical aspects of this.

CRISPR would let you do transgenesis on a human. You could take a gene from something that glows in the dark, and make a human that glows in the dark, in theory. I mean, we are undoubtedly on the road to being able to use those technologies in horrific ways, very inexpensively. And it’s just hard to think, like, even if one nation can create legislation for it, it doesn’t mean that it couldn’t be done by somebody else. Is it an intractable problem?

I think all technology can be used for good or evil, or most technology can be used for good or evil. And we have successfully—largely successfully—navigated threats that are also quite significant, like the threat of a nuclear holocaust. Nuclear technology is another one of those examples that, it can be used for good, it has been used for good, it can also be used to great harm. We have not yet, fortunately, had a dirty bomb blow up in Manhattan, making all of Manhattan radioactive, and I am hopeful that will never happen.

So I am not telling you I have the solution to this, but I think that as a society, we should figure out what is morally permissible, and what is not, and then really try and put in guardrails both in terms of social norms, as well as in terms of legal and enforcement questions to try and prevent nations or individuals from doing things that we would consider to be horrific.

And I am not sure we have consensus as a society on what would be horrific. Is it horrific to genetically engineer a child that has a mutation that’s going to make their life untenable, or cut short after a matter of months, and make them better? I would say a lot of people would think that’s totally fine; I think that’s totally fine. Is it as permissible to make your child have superhuman vision, great muscle strength, stamina and so on? I think that’s in the gray zone. Is it permissible to make your child glow in the dark? Yeah, that’s getting beyond the pale, right? But those are discussions that we are not really having as a society, and we should be.

Yeah, and the tricky thing is, there is not agreement on whether you should use transgenesis on seeds, you know? You put Vitamin A in rice, and you can end Vitamin A deficiency, or diminish it, and we don’t seem to be able to get agreement on whether you should even do that.

Yeah. You know I find people’s attitudes here to be somewhat irrational in the sense that we’ve been doing genetic engineering on plants for a very long time, we’ve just been doing it the hard way. Most of the food that we eat comes from species of plants that don’t naturally grow in the wild. They have been very carefully bred to have specific characteristics in terms of resistance to certain kinds of pests, and growing in conditions that require hardier plants, and so on and so forth.

So even genetically engineering plants by very carefully interbreeding them, and doing various other things to create the kinds of food that, for whatever reason, we prefer—tomatoes that don’t spoil when you ship them in the bowels of a ship for three weeks—the fact that we are now doing it more easily doesn’t make it worse. In fact, you could argue that it might make it more targeted, and have fewer side effects.

I think when it comes to engineering other things, it becomes much more problematic, and you really need to think through the consequences of genetic engineering on a human, or genetic engineering on a bug.

Yeah, when x-rays came out, they would take a bunch of seeds and they would irradiate them, and then they would plant them, and very few would grow, but a few would grow poorly, and every now and then you would get some improvement, and that was the technique for much of the produce we eat today.

Indeed, and you don’t know what the radiation did, beyond the stuff that we can observe phenotypically, as in it grows better. So all of these things that are happening to all those other genes went unobserved and unmeasured. Now you are doing a much more precision intervention, in just changing the one gene that you care about. And for whatever reason some people view that as being inferior, and I think that’s a little bit of a misunderstanding of what exactly happened before, and is happening now.

It used to be that the phrase “cure aging” was looked at nonsensically. Is that something that is a valid concept, that we may be able to do?

So we do not use the term “cure aging” at Calico. What we view ourselves as doing is increasing healthspan, which is the amount of time that you live as a healthy, happy, productive human being. I think that we as a society have been increasing healthspan for a very long time. I’ve talked about a couple of examples in the past.

I don’t think that we are on the path to letting people live forever. Some people might think that that’s an achievable goal, but I think it’s definitely a worthy goal to make it so that you live healthy longer, and you don’t have people who spend twenty years of their lives in a nursing home being cared for by others because they are unable to care for themselves.

I think that’s a very important goal for us as a society, both for the people themselves, for their families, but also in terms of the cost that we incur as a society in supporting that level of care.

Well, obviously you’ve had a great impact, you know, presumably in two ways: One, with what you’ve done to promote education, and democratizing that, and then what you are doing in health. What are your goals? What do you hope to accomplish in the field? How do you want to be remembered?

So, let’s see. I think there’s a number of answers that I could give to that question at different levels. At one level, I would like to be—and not the only one, by any stretch, because there is a whole community of us working here—one of the people that really brought together two fields that it’s critical that we bring together: the field of machine learning and the field of biology, and really turning biology into a data science.

I think that’s a hugely important thing because it is not possible, even today and certainly going forward, to make sense of the data that is being accumulated using simple, statistical methods. You really need to build much deeper models.

Another level of answer is that I would like to do something that made a difference to the lives of individual people. One of the things that I really loved about the work that we did at Coursera was that daily deluge, if you will, of learner stories. Of people who say, “My life has been transformed by the access to education that I would never have had before, and by doing that I am now employed and can feed my children and I was not able to do that before,” for instance.

And so if I can help us get to the point where I get an email from someone who says, “I had a genetic disposition that would have made me die of Alzheimer’s at an early age, but you were able to help create technology that allowed me to avoid that.” To me that would be incredibly fulfilling. Now, that is a very aspirational goal, and I’m not assuming that it’s necessarily achievable by me—and, even if it’s achievable, will definitely involve the work of many others—but that, I think, is what we should aspire to, what I aspire to.

You know, you mentioned the importance of merging machine learning with these other fields, and Pedro Domingos, who actually was on the show not long ago, wrote a book called The Master Algorithm where he proposes that there must exist a master algorithm that can solve all different kinds of problems, that unite the symbolists and the Bayesians and all of the different, what he calls, tribes. Do you think that such a thing is likely to exist? Do you think that neural nets may be that, kind of a one-size-fits-all solution to problems?

I think neural nets are very powerful technology, and they certainly help address, to a certain extent, a very large bottleneck, which is how do you construct a meaningful set of features in domains where it’s really hard for people to extract those, and solve problems really well. I think their development, especially over the last few years, when combined with large data, and the power of really high-end computing, has been transformative to the field.

Do I think they are the universal architecture? Not as of now. I think that there is going to be—and we discussed this earlier—at least one or two big things that would need to be added on top of that. I wish I knew what they were, but I don’t think we are quite there yet.

So you are walking on a beach, and you find a lamp. You rub the lamp, and out pops a genie, and the genie says: “I will give you one of the following three things: new cunning and brilliant algorithms that solve all kinds of problems in more efficient ways, an enormous amount of data that’s clean and accurate and structured, or computers that are vastly faster, way beyond the speed of what we have now.” What would you choose?

Data. I would choose data.

It sounded like, when I set that question up earlier about, “Oh, data, it’s so hard,” you were like, “Tell me about it.” So that is the daily challenge, because I know my doctor still keeps everything in those manila folders that have three letters of my last name, and I think, “Wow, that’s it? That’s what’s going to be driving the future?” So that is your bottleneck?

I think it really is the bottleneck, and it’s not even just a matter of, you know, digitizing the records that are there. And, by the way, the problem is not just that they are being kept in manila folders. It’s also a matter of the extent to which different doctors write things in different ways, and some of them don’t write things at all and just leave it to memory, and so on.

But I think even beyond that, there is all the stuff that’s not currently being measured. I think we’re starting to see some glimmers of light in certain ways; for instance, I’m excited by the use of wearable devices to measure things like people’s walking pace and activity and so on. I think that provides us with a really interesting window on daily activity, whereas, otherwise people see the doctor once a year or once every five years, sometimes—and that really doesn’t give us a lot of visibility into what’s going on with their lives the rest of the time.

I think there is a path forward on the data collection, but if you gave me a really beautiful large clean data set that had, you know, genetics and phenotypes and molecular biomarkers, like gene expression and proteomics and so on and so forth… I am not saying I have the algorithms today that can allow me to make sense of all of that but, boy, there is a lot that we can do with that, even today. And it would spur the development of really amazing creative algorithms.

I think we don’t lack creativity in algorithms. There is a lot that would need to happen, but I think we’re, in many cases, stymied by the lack of availability in data as well as just the amount of time and effort in terms of grunge work that’s required to clean what’s there.

So there is a lot of fear wrapped up in some people about artificial intelligence. And just to set the question up, specifically about employment, there are three views about its effect: There’s one group of people who think we are going to enter into something like a permanent Great Depression, where there are people who don’t have the skills to compete against machines. Then there are those who believe that there’s nothing a machine can’t do eventually, and once they can learn to do things faster than we can, they’ll take every job. Then there is a third camp of people who say, look, every time we’ve had disruptive technologies, even electricity and steam power and machines, people just use those to increase their productivity and that’s how we got a rising standard of living. Which of those three camps—or a fourth one—do you identify with?

I probably would place myself—and again I tend to be an optimist, so factor that in—probably more in the third camp. Which is to say, each time that we’ve had a revolution, it has spurred productivity and people migrated from one job category into another job category that basically moves them in some ways, in many cases, further up the food chain.

So I would hope that that would be the case here; our standard of living will go up, and people will do jobs that are different. I do see the case of people saying that this revolution is different, because, over time, a larger and larger fraction of jobs will disappear and the number of jobs that are left will diminish. That is, you just won’t need that many people to do stuff.

Now, again from the optimist’s perspective, if we really have machines that do everything—from grow crops, to package them and put them in supermarkets, and so on, and basically take care of all of the day-to-day stuff that we need to exist—arguably you could imagine that a lot of us will live a life of partial leisure. And that will allow us to, at least, exist, and have food and water, and some level of healthcare and education, and so on, without having to work, and we will spend our time being creative and being artisans again or something.

Which of those is going to be the case, I think is an interesting question, and I don’t have a firm opinion on that.

So, I followed with a lot of interest Watson, when they took the cancer cases and the treatment that oncologists gave, and then Watson was able to match them ninety-some odd percent of the time, and even offered new ones because it read all of these journals and so forth.

So that’s a case of using artificial intelligence for treatment, but is treatment really fundamentally a much easier problem to solve than diagnosis? Because diagnosis is—you know, my eyes water when I eat potato chips—not very structured data.

I think that if you look back, even in the mid-’90s, which is a long way back now, there were diagnostic models that were actually pretty darn good. People moved away from that, partly because to really scale those out and make those robust, you just needed a lot more data, and also I think there are societal obstacles to the adoption of fully-automated diagnoses.

I think that’s actually an even more fundamental problem: the extent to which doctors, patients, and insurance companies are willing to take a diagnosis that’s provided by a computer. I don’t think that, fundamentally, from a technological perspective, that is an unsolvable problem.

So is diagnosis a case for an expert system? I think that’s what you are alluding to—you know, how do you tell the difference between a cold and the flu? Well, do they have a fever; do they have aches and pains?

Is that a set of problems where you would use relatively older technologies to build all that out? And even if we don’t switch to that, being able to have access to just that knowledge base, in some parts of the world, is a huge step forward.

I would agree. And by the way, the thing I was thinking back on is not the earliest version of expert systems, which were all rule-based; but rather the ones that came later, which used a probabilistic model that really incorporated things like the chances of a certain thing manifesting in a somewhat different way, and if you have this predisposing factor, or, like, if you visited a country that has SARS recently, then maybe that changes the probability that what you have is not the cold or the flu but rather something worse.

And so all that needs to be built into the model. And the probabilistic models really did accommodate that, and are easily… In fact, there is a lot of technology that’s already in place for how to incorporate machine learning so that you can make those models better and better over time.
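[Editor's illustration] The kind of probabilistic diagnostic model described here can be sketched, very roughly, as a naive Bayes calculation in which a predisposing factor such as recent travel shifts the prior over diagnoses. Everything in the sketch below is an assumption made for illustration: the function names, the three candidate diagnoses, and all of the probabilities are invented, are not clinical figures, and are not taken from any system discussed in the interview.

```python
# Minimal naive Bayes sketch of a probabilistic diagnostic model.
# All names and numbers are hypothetical and for illustration only.

def priors_given_exposure(recent_travel: bool) -> dict:
    """Prior over diagnoses; a predisposing factor shifts the prior."""
    if recent_travel:
        return {"cold": 0.60, "flu": 0.30, "sars": 0.10}
    return {"cold": 0.70, "flu": 0.2999, "sars": 0.0001}

# P(finding is present | diagnosis), invented values.
LIKELIHOODS = {
    "cold": {"fever": 0.10, "aches": 0.20},
    "flu":  {"fever": 0.80, "aches": 0.80},
    "sars": {"fever": 0.95, "aches": 0.70},
}

def posterior(priors: dict, findings: dict) -> dict:
    """Bayes' rule, assuming findings are independent given the diagnosis."""
    scores = {}
    for dx, prior in priors.items():
        score = prior
        for name, present in findings.items():
            p = LIKELIHOODS[dx][name]
            score *= p if present else (1.0 - p)
        scores[dx] = score
    total = sum(scores.values())
    # Normalize so the posterior probabilities sum to 1.
    return {dx: s / total for dx, s in scores.items()}

if __name__ == "__main__":
    findings = {"fever": True, "aches": True}
    for travel in (False, True):
        result = posterior(priors_given_exposure(travel), findings)
        print("recent travel:", travel, result)
```

With the same symptoms, the posterior probability of the rarer, more serious condition rises once the travel history is taken into account, which is the behavior the answer describes. In a learned version of such a model, the priors and likelihoods would be estimated from data rather than written by hand.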

I think that’s an area that one could easily go back to, and construct technology that would be hugely impactful, especially in parts of the world where they lack access to good medical care because there just aren’t enough doctors per capita, or the doctors are all concentrated in big cities. And you have people who are living in some rural village and can’t get to see a doctor.

I agree with you that there is a huge possibility there. I think there is also a huge possibility in treatment of chronic care patients, because those are ones that consume a huge fraction of the resources of a doctor’s time, and there just aren’t enough hours in the day for a doctor to see people as frequently as might be beneficial for keeping track of whether they are deteriorating.

So maybe by the time they come and see the doctor six months later, their condition has already deteriorated, and maybe if it had been caught earlier we could have slowed that down by changing treatment. So I think there are a lot of opportunities to apply a combination of modeling and machine learning in medical care that will really help make people’s lives better.

We’re almost out of time, so I have just two more questions for you. First, what is something that looks, for you, like the kind of problem in health that machine learning is going to be able to solve soon? What’s a breakthrough we can hope to pick up the newspaper and read about in five years, something really potentially big that is within our grasp, but just a little out of our reach?

I think there are a couple of areas that I see emerging which are already happening, and you’re starting to see that. Cancer—I think we talked earlier about the bottlenecks that are being addressed one after the other. And, you know, we have antibiotics and antiretrovirals and statins; and I think we are starting to see with areas like immuno-oncology, for instance, some actual cures for metastatic cancer which, by and large, is incurable using standard methods, with few exceptions. And I think that’s a big area that is really exciting.

I am seeing some really interesting developments on things that are in the context of specific diseases, that are more genetically-oriented therapies—be it CRISPR, be it viral therapies. We are seeing some others on the path to being approved in the next few years, and so I think that’s a place where, again, on the therapeutic side, there is a big opportunity.

I think the third one is the use of computers in the context of image-based diagnosis, and that’s an area that I used to work in when I was at Stanford, where you show the computer an image of a tumor biopsy sample, or a radiology image, or a 3D CAT scan of a patient, and it is able to discover things that are not visible to a physician. Or maybe only visible to a small subset of truly expert physicians, but in most cases, you’re not going to be lucky enough to be the one that they look at.

So I think that’s an area where we will also see big advancements. These are just three off of the top of my head in the medical space, but I am sure there are others.

And a final question: You seem to be doing a whole lot of things. How do people keep up with you, what’s your social media of choice and so forth?

Boy, I am not much of a social media person, maybe because I am doing so many other things. So I think most of my visibility happens through scientific publications. As we develop new ideas, we subject them to peer review, and when we are confident that we have something to say, that’s when we say it.

Which I think is important because there is so much out there, and I think people rush to talk about stuff that’s half baked, not well-vetted… There is a lot of, unfortunately, somewhat bogus science out there—not to mention bogus news. And I think if we had less stuff, that was higher-quality—and we were not flooded with stuff of dubious correctness through which we had to sift—I think we would all be better off.

All righty. Well, thank you so much for taking the time. It was a fascinating hour.

Thank you very much, Byron. It was a pleasure for me too. Thank you.

Byron explores issues around artificial intelligence and conscious computers in his upcoming book The Fourth Age, to be published in April by Atria, an imprint of Simon & Schuster.
