In our quest to build robots with artificial intelligence, we face five distinct challenges. The first of these is the problem of the robot being able to see. You can put a camera on a robot, but that only gets you data. The robot still has to figure out what all that data means.
When you look inside your refrigerator, you see a bunch of items, but a robot sees a bunch of pixels: millions of specks of light and color. It doesn’t know what a shelf is, or a drawer, or a can, or a jar. It just sees an undifferentiated mass of pixels. How do you even start to make sense of that? How do you go from those numbers to “That’s a gallon of milk”? It’s quite difficult.
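To make this concrete, here is a minimal sketch (using a made-up camera frame, not any particular robot's hardware) of what a camera actually hands the robot: a grid of numbers with no labels attached.

```python
import numpy as np

# A hypothetical 480x640 camera frame. To the robot, this is the whole
# refrigerator: just a grid of numbers, three per pixel (red, green, blue),
# each from 0 to 255. Nothing in the data says "shelf" or "milk".
frame = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)

print(frame.shape)   # (480, 640, 3) -- rows, columns, color channels
print(frame.size)    # 921600 individual numbers in one frame
print(frame[0, 0])   # a single pixel: three integers, no meaning attached
```

Going from those 921,600 numbers to “that’s a gallon of milk” is the entire problem: the semantics are nowhere in the data itself.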
What goes on in your brain when you look in the refrigerator is complex in the extreme. A description of how your mind performs that minor miracle would require pages of technobabble about polygons and cones and layers. The way that you can, in an instant, identify the architectural style of a house, or a duck in flight, or how you can tell twins apart, or any of the hundred other similar tasks that we do effortlessly is the envy of AI programmers everywhere.