Meet the algorithm that can learn “everything about anything”


The most recent advances in artificial intelligence research are pretty staggering, thanks in part to the abundance of data available on the web. We’ve covered how deep learning is helping create self-teaching and highly accurate systems for tasks such as sentiment analysis and facial recognition, but there are also models that can solve geometry and algebra problems, predict whether a stack of dishes is likely to fall over and (from the team behind Google’s word2vec) understand entire paragraphs of text.

(Hat tip to frequent commenter Oneasum for pointing out all these projects.)

One of the more interesting projects is a system called LEVAN, which is short for Learn EVerything about ANything and was created by a group of researchers out of the Allen Institute for Artificial Intelligence and the University of Washington. One of them, Carlos Guestrin, is also co-founder and CEO of a data science startup called GraphLab. What’s really interesting about LEVAN is that it’s neither human-supervised nor unsupervised (like many deep learning systems), but what its creators call “webly supervised.”


What that means, essentially, is that LEVAN uses the web to learn everything it needs to know. It scours Google Books Ngrams to learn common phrases associated with a particular concept, then searches for those phrases in web image repositories such as Google Images, Bing and Flickr. For example, LEVAN now knows that “heavyweight boxing,” “boxing ring” and “ali boxing” are all part of the larger concept of “boxing,” and it knows what each one looks like.
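To make that two-stage idea concrete, here is a minimal Python sketch of a webly supervised loop in the spirit of LEVAN. The function names, the toy ngram counts and the fake image "tags" below are placeholders of my own; they stand in for the real Google Books Ngrams lookup and web image search, and the coherence check is a crude stand-in for the actual image classifiers the researchers train.

```python
# Hypothetical sketch of a "webly supervised" concept learner.
# Nothing here is the LEVAN code or a real search API.
from collections import Counter

def ngram_phrases(concept):
    # Placeholder for a Google Books Ngrams lookup: candidate phrases
    # containing the concept word, with rough frequencies.
    toy_ngrams = {
        "boxing": {
            "heavyweight boxing": 120,
            "boxing ring": 300,
            "ali boxing": 45,
            "boxing day sale": 500,
        },
    }
    return toy_ngrams.get(concept, {})

def fetch_image_tags(phrase):
    # Placeholder for a web image search (Google Images / Bing / Flickr):
    # crude "visual" tags for images matching the phrase.
    toy_images = {
        "heavyweight boxing": ["ring"] * 7 + ["gloves", "crowd", "referee"],
        "boxing ring": ["ring"] * 8 + ["ropes", "canvas"],
        "ali boxing": ["ring"] * 6 + ["gloves"] * 3 + ["crowd"],
        "boxing day sale": ["storefront", "shopping", "crowd", "sale sign", "queue"] * 2,
    }
    return toy_images.get(phrase, [])

def visually_coherent(tags, threshold=0.5):
    # Keep a sub-concept only if its images look alike: the most common
    # tag must account for a large share of all tags. This replaces the
    # real system's classifier-consistency check.
    if not tags:
        return False
    top_count = Counter(tags).most_common(1)[0][1]
    return top_count / len(tags) >= threshold

def learn_concept(concept, min_freq=40):
    sub_concepts = {}
    for phrase, freq in ngram_phrases(concept).items():
        tags = fetch_image_tags(phrase)
        if freq >= min_freq and visually_coherent(tags):
            sub_concepts[phrase] = Counter(tags)
    return sub_concepts

if __name__ == "__main__":
    for phrase, model in learn_concept("boxing").items():
        print(phrase, "->", model.most_common(3))
```

Run as-is, the sketch keeps "heavyweight boxing," "boxing ring" and "ali boxing" as sub-concepts of "boxing" while rejecting the frequent but visually incoherent "boxing day sale," which is the kind of filtering the web supervision is meant to provide.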

More impressive still is that because LEVAN uses text and image references to teach itself concepts, it’s also able to learn when words or phrases mean the same thing. So while it might learn, for example, that “Mohandas Gandhi” and “Mahatma Gandhi” are both sub-concepts of “Gandhi,” it will also learn after analyzing enough images that they’re the same person.
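Here is a similarly hypothetical sketch of that merging step: when the visual models learned for two sub-concepts come out nearly identical, treat them as one concept. The tag histograms and the 0.8 threshold are invented for illustration; LEVAN's real models are trained image classifiers, not tag counts.

```python
# Illustrative only: merge two sub-concepts whose "visual models" agree.
from collections import Counter

def tag_similarity(a: Counter, b: Counter) -> float:
    # Jaccard-style overlap between two tag histograms, a crude stand-in
    # for comparing learned image classifiers.
    keys = set(a) | set(b)
    if not keys:
        return 0.0
    inter = sum(min(a[k], b[k]) for k in keys)
    union = sum(max(a[k], b[k]) for k in keys)
    return inter / union

mohandas = Counter({"glasses": 8, "white robe": 7, "spinning wheel": 3})
mahatma = Counter({"glasses": 9, "white robe": 6, "spinning wheel": 4})

if tag_similarity(mohandas, mahatma) > 0.8:
    print("merge: 'Mohandas Gandhi' and 'Mahatma Gandhi' look like one concept")
```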


So far, LEVAN has modeled 150 different concepts and more than 50,000 sub-concepts, and has annotated more than 10 million images with information about what’s in them and what’s happening in them. The project website lets you examine its findings for each concept and download the models.

According to a recent presentation by one of its creators, LEVAN was designed to run nicely on the Amazon(s AMZN) Web Services cloud — yet another sign of how fast the AI space is moving. Computer science skills and math knowledge are one impediment to broadly accessible AI, but those can be addressed by SDKs, APIs, and other methods of abstracting complexity. However, training AI models can require a lot of computing power, something that is easily available to the likes of Facebook(s fb) and Google(s goog) but that for everyday users might need to be offloaded to the cloud.



Well, you know — nobody’s claiming that what it learns is true… but it seems pretty good at being politically correct. I note that it hasn’t learned that Mohandas K. Gandhi was a disgusting racist and that Indira Gandhi was into genital mutilation… but there you go: nuances, as you say…

Fritz Lenker

And there are limits to what can be solved algorithmically. Read Roger Penrose’s “The Emperor’s New Mind”, especially the chapter on Kurt Gödel, who proved mathematically that no more than 50% of proofs can be solved algorithmically.


For others reading this, I’m almost certain that Goedel’s Theorem does not talk about any percentage, i.e. “no more than 50%” is false.


I don’t see a great benefit for everyday users of PCs at a normal level.
Artificial Intelligence is exactly that: “artificial.”
A computer can never reach the level of analysis, understanding, comprehension and creation of a human mind.
If a power failure stops a computer functioning, it will always need the human hand to restart it or restore power!
Yes, there have been enormous advances in AI, but the human brain, as you already know, is only using 5 to 10% of its mental power. Can you imagine what it would be like if we discovered a way to increase that level to higher percentages?


I wonder how scientists calculated the percentage of “mental power” used by the typical brain-user…
Even if it’s true, I would risk the statement that it is naturally limited to these low percentages because of the “poor design” of energy delivery to the brain (glucose & O2) and problems with heat when too many neurons are switching simultaneously… The skull was very helpful in the early stages of mankind, but it is limiting our CPU usage… (The Mayans, I believe, knew something about it :)

Horst JENS

Is there a license (hopefully Creative Commons share-alike) for this blog post so that I can repost it in other media?


Indira Gandhi was not the wife of Mahatma Gandhi. LOL.


No one is saying she was. You saw the pictures side-by-side and “Mrs. Gandhi” and came to an incorrect conclusion. You, as a faulty human, saw a pattern which does not exist. Likely a computer would not make the same mistake. :)

Patrick McCormack, SPHR

Depends upon how it is programmed. When a human has every bit of information needed to make a sane and rational decision, including a matrix of results for each possible decision, wrong decisions continue to be made. One must also consider that not all bad decisions are wrong, nor are all wrong decisions bad. All it takes is one stray gamma ray or one heartstring plucked to make a decision that is not the optimal choice.

Citizen Pariah

My main question is: how do you teach it what is canon? There is so much non-authoritative data out there that humans can’t tell what’s true anymore.

Still for basic hierarchical organizing of subjects/objects it’s an innovative start.

Rob Mac Hugh

Pretty soon, the machine will say: “Who’s that idiot who wants to borrow my circuits?
Go upload yourself to a light switch, pinhead!”


It is impressive to know that computers are now made to learn by themselves.

But do those people ever think about where the fine line of such a project lies?

You know, I hate to say it, but… we would not want to end up in the world of The Matrix…


Satish Sharma

Well I am not impressed with the Gandhi example.

Indira Gandhi wasn’t inspired by anything in her life — particularly anything written, because she was fairly illiterate despite her father’s best efforts to send her to the best schools.

And of course she WAS related to the other Gandhi, having been quasi-adopted by him — her father refused to walk her down the marriage plank — and the older man, who had a penchant for young girls, did… well, a computer will never know all this nuanced stuff for at least another 100 years… so I am not impressed!


Lots of great computer vision projects going on! Can’t wait to see where we’ll be in a few years…

Here’s another interesting paper to look at:

(See the results in section 5.) Looks like the kind of result that will have loads of amazing applications (improving things across the board) — only time will tell.

John Sun

Interesting paper indeed! But there was a very good reason why Martens came up with Hessian-Free… Scaling down MNIST??? Well, interesting nevertheless, and it might be possible to apply the ideas to methods that use an approximation of the Hessian. Hopefully.

Come join the Google+ Deep Learning community – it’s a much better place to discuss such things :)

Fernando Olmos

It’s intelligence using only one facet of our human abilities – to recognise patterns. We still do it much better than LEVAN or any other AI ever will.

Derrick Harris

True in many respects, but speed and scale are what make AI so compelling for certain things, especially classification. People can only remember so many names, faces, facts, etc.

Steve Ardire

LEVAN looks interesting and appears ‘similar’ to NELL: Never-Ending Language Learning at CMU (@cmunell: “I am a machine reading research project at Carnegie Mellon”).

>However, training AI models can require a lot of computing power, something that is easily available to the likes of Facebook and Google

NELL is supported by DARPA, Google, and Yahoo.
