Watson goes to college: How the world’s smartest PC will revolutionize AI

In 2011, IBM achieved a quantum leap in artificial intelligence technology when its Watson computer program trounced human champions Ken Jennings and Brad Rutter in a three-day Jeopardy! tourney, taking home the million-dollar prize by outscoring the second-place competitor by a three-to-one margin.

Since then, Watson has shown its computing prowess in the world of medicine and in other business settings. However, as was recently announced, IBM decided Watson could use a college education, and so it will join us here at Rensselaer Polytechnic Institute. With its help, we hope to further advance artificial intelligence in a number of key areas.

The Watson program is already a breakthrough technology in AI. For many years it had been largely assumed that for a computer to go beyond search and really be able to perform complex human language tasks it needed to do one of two things: either it would “understand” the texts using some kind of deep “knowledge representation,” or it would have a complex statistical model based on millions of texts.

Watson uses very little of either of these. Rather, it uses a lot of memory and clever ways of pulling texts from that memory. Thus, Watson demonstrated what some in AI had conjectured but had until then been unable to prove: that intelligence is tied to an ability to appropriately find relevant information in a very large memory. (Watson also used a lot of specialized techniques designed for the peculiarities of the Jeopardy! game, such as producing questions from answers, but from a purely academic viewpoint that’s less important.)
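To make the idea of "pulling texts from memory" concrete, here is a toy sketch in Python. It is not Watson's actual retrieval method, which this post does not describe in detail. It assumes a simple TF-IDF word-overlap score, and the function names (build_memory, retrieve) and the three sample passages are invented purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_memory(passages):
    """Index passages: per-passage term counts plus document frequencies."""
    counts = [Counter(tokenize(p)) for p in passages]
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())  # each passage counts once per term
    return {"passages": passages, "counts": counts, "doc_freq": doc_freq}


def retrieve(memory, clue, top_k=2):
    """Return up to top_k passages ranked by a simple TF-IDF overlap score."""
    n = len(memory["passages"])
    clue_terms = set(tokenize(clue))
    scored = []
    for passage, counts in zip(memory["passages"], memory["counts"]):
        score = 0.0
        for term in clue_terms:
            if term in counts:
                # Rare terms get more weight than terms found everywhere.
                idf = math.log((1 + n) / (1 + memory["doc_freq"][term])) + 1.0
                score += counts[term] * idf
        scored.append((score, passage))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for score, passage in scored[:top_k] if score > 0]


# Hypothetical three-passage "memory"; a real one would hold millions of texts.
MEMORY = build_memory([
    "A myocardial infarction, or heart attack, occurs when blood flow to the heart is blocked.",
    "Jeopardy! is a quiz show in which contestants respond to clues in the form of a question.",
    "Rensselaer Polytechnic Institute is a university in Troy, New York.",
])

best = retrieve(MEMORY, "What happens during a heart attack?", top_k=1)
```

With this toy memory, the heart attack passage ranks first for the sample clue, and a clue that shares no words with any passage returns an empty list. The hard part, and the part that makes Watson interesting, is doing this appropriately at a vastly larger scale.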

Right now, to take Watson into a new domain — for example, to be able to answer questions about health and medicine — Watson works by reading texts. First, it needs a lot of information to go into its memory, which is generally provided by giving it a million or more documents to process from any particular area or discipline. Second, it needs to have information about the specialized terms used – for example, to be told that the word “attack” in “heart attack” is a noun and not a verb. Technical terms, such as, say, “myocardial infarction” also need to be identified. Finally, to hone its ability in the new area it needs a combination of questions and answers to train from.
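The second step above, telling the system about specialized terms, can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the lexicon, its labels, and the tag_terms function are hypothetical, and the post does not say how Watson actually stores or applies this information.

```python
# Hypothetical domain lexicon: multi-word terms mapped to a part-of-speech label.
LEXICON = {
    ("heart", "attack"): "NOUN",
    ("myocardial", "infarction"): "NOUN",
}


def tag_terms(sentence, lexicon=LEXICON):
    """Greedily merge known multi-word terms; other words get the label None."""
    words = sentence.lower().replace(".", "").replace(",", "").split()
    max_len = max((len(term) for term in lexicon), default=1)
    tagged, i = [], 0
    while i < len(words):
        for size in range(max_len, 0, -1):  # try the longest match first
            candidate = tuple(words[i:i + size])
            if candidate in lexicon:
                tagged.append((" ".join(candidate), lexicon[candidate]))
                i += len(candidate)
                break
        else:
            # No lexicon entry starts here, so keep the single word untagged.
            tagged.append((words[i], None))
            i += 1
    return tagged


result = tag_terms("The patient had a heart attack.")
```

Here "heart attack" comes back as a single unit labeled as a noun, so a later processing stage would not mistake "attack" for a verb. A term such as "myocardial infarction" is handled the same way.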

But on the Web, we can find much more than text. Watson will be more powerful if it can also take advantage of other resources. One of our first goals is to explore how Watson can be used in the big data context. As an example, in the research group I run, we have collected information about more than one million datasets that have been released by governments around the world. We’re going to see what it takes to get Watson to answer questions such as “What datasets are available that talk about crop failures in the Horn of Africa?” Or, at a more local level, “What government agency can help me find a job in New York?”

We’re also looking at how Watson can use other kinds of information; ideas our students have suggested include:

  • Technical domains such as chemistry and nuclear engineering, where specialized diagrams and formulas are needed.
  • “Commonsense” domains, such as what is happening around our college, where Watson would need to interact with Twitter, Facebook, and other kinds of social media.
  • “Artificial worlds,” such as role-playing games, where Watson would need to differentiate between objects that are real and others that are imaginary.

On a more theoretical level, we want to understand what Watson can teach us about artificial intelligence in general. We will explore how the memory-based techniques of Watson can be “embedded” into more complex reasoning systems. As humans, our memories are used as a component of much more complex cognition than is needed for playing Jeopardy! (a tough task in its own right, of course). Memory helps us to decide which of various approaches may be best when faced with a problem, by providing analogies between the current situation and ones we’ve seen in the past. We are also able to pull up separate memories in different contexts, based on situational need.

By exploring how Watson’s memory functions as part of a more complex problem solver, we may learn more about how our own minds work. To this end, my colleague Selmer Bringsjord, head of the Cognitive Science Department, and his students, will explore how adding a reasoning component to Watson’s memory-based question-answering could let it do more powerful things. Can Watson be made to solve word problems it has never seen before? Can we enable it to justify the answers it gives? Could it be made into a conversationalist rather than just a question-answerer?

As we understand Watson better, our students will also be exploring how to deepen its capabilities by programming new components.  They will learn how this new generation of “cognitive computing,” as IBM Senior Vice President and Director of Research John E. Kelly has called it, really works. They will come to understand the architectures, software, and hardware that power the approach, and they will program new modules to give Watson new abilities.

And finally, of course, there’s the blue-sky nature of what Watson may allow. Given such a potent new tool, we suspect this list of projects still just scratches the surface of what our students will come up with. As one said, he is “eager to teach Watson to daydream.”

James Hendler is a professor at Rensselaer Polytechnic Institute and head of the computer science department. Follow him on Twitter @jahendler.

9 Responses to “Watson goes to college: How the world’s smartest PC will revolutionize AI”

    • jahendler

      I agree – the author doesn’t get to title his column. I wanted it to be:

      Answer: It’s why the World’s Smartest Computer is going back to school.

  1. Dan McCreary

    Makes me wish I was back in grad school working for Jim! So many opportunities for innovation! Jim’s work at RPI converting the information at to RDF seems like a great starting point. I hope that some day the Apache UIMA frameworks benefit from these tools.

  2. jahendler

    Steve –
    While Watson does use those capabilities, saying it is just a collection of these is like saying that Deep Blue was just a search program. It’s the complexity of the task and level of performance that Watson achieved that showed something significant had happened. After examining the behaviors, it seems to me that the novel aspect was, as I wrote here, how the large memory, accessed appropriately, made the system work. It used learning primarily for adjusting weights on the answers of components, but those components worked to a large degree off the memory. So it is not that the three things you point out aren’t important, but rather it’s how they’re used in combination w/the store. It would be interesting to see a competitive solution that could function without the memory store – I certainly haven’t seen anything competitive to date.

    • Steve Ardire

      Hi Jim – agree the key is how NLP, hypothesis generation, evidence-based learning is used in combination w/the memory store. Also not saying there’s many competitive solutions ( because this is heady stuff ) just a select few that hold some promise ;)

  3. Steve Ardire

    Hi Jim,

    This is very exciting stuff so good for you and your team at RPI.

    You say most competitive solutions “understand” text using some kind of deep “knowledge representation” or use complex statistical models……Yes agree

    Then you say Watson uses a lot of memory and ‘clever ways’ of pulling texts from that memory and provide elaborate descriptions and examples especially for healthcare.

    However, I’m curious why couldn’t you simply say Watson focuses on three key capabilities
    1) NLP
    2) Hypothesis generation
    3) Evidence-based learning

    This is interesting but these 3 capabilities are not unique, i.e. there’s competitive solutions that can pretty much do the equivalent with the same and different techniques.