Blog Post

The Real Reason Powerset Sold (Out)

When visiting Israel in the middle of summer, it’s generally not a good idea to go for a walk in the afternoon, even if it is along the sea. The heat and humidity sap your energy, making you feel as if you had spent nearly three hours in the gym. But that wasn’t enough to stop me from writing a post about Microsoft buying Powerset for what is rumored to be around $100 million.

I’ve been unable to stop wondering why founder Barney Pell decided to take the money and run. After all, he used to turn blue in the face telling people how superior Powerset’s approach to search was. If it was so superior, why sell now? Mike Masnick of Techdirt put it best when he wrote that “[T]he exit certainly falls well short of the hype around Powerset. If Powerset was actually seeing any traction at all it never would have agreed to sell at that price.”

To some extent, Mike is right, but I would add another reason: infrastructure, specifically how expensive it is to build. At our Hadoop meet-up earlier this year, Chad Walters, director of engineering at Powerset, noted that their search “requires 100 times more processing than simple keyword searching and indexing (about one second per sentence is required for processing).”

Powerset used some pretty nifty technologies to build out their system, but in order to really scale, they would have needed more money — a lot of it.

And Powerset would have had to scale; there’s no other way to compete with search’s 800-pound gorilla, Google. That’s why Microsoft is building a gigantic data center in the Chicago area focused almost entirely on search. (Which it can now use to help roll out Powerset’s search technology to a larger audience.)

This is an object lesson for every startup looking to get into the business of search: No matter how good your algorithms are, you still have to deal with the cost of queries, which need to be low enough to be offset by some kind of advertising in order to make a profit. (The conspiracy theorist in me says that if your results are really good you won’t be able to generate enough inventory to serve up ads that bring in the dollars, but maybe I’m just too cynical.)

One of our readers believes that it is possible to build a search engine that surpasses Google’s. Nevertheless, as I’ve noted in the past, “[P]rocess-optimized infrastructure ensures that Google’s cost of executing a query keep going down” — and that allows the company to wring more dollars from the system.

Given all that, Powerset has done a good job of wringing a hundred million from Microsoft. Not that there’s anything wrong with that.

Bonus Link: Don Dodge of Microsoft explains the logic behind the deal.

21 Responses to “The Real Reason Powerset Sold (Out)”

  1. TJGodel

    The object lesson for every startup looking to get into the business of search should not be about scale; the lesson is to never go head-to-head with an 800-pound gorilla, as Powerset/Microsoft are trying to do. Compete within a niche and do it better by creating value that the 800-pound gorilla can’t match with radical change.

  2. Don,

    Thanks for the response.

    I don’t believe that I’ve said anywhere that it is easy or simple; in fact, I would contend the opposite: it is really difficult to get this right. But that does not mean it cannot be done.

    That said, what exactly did Powerset accomplish? It was basically a PoC done over a VERY controlled data set. PoCs done over controlled data sets are at best misleading. If they are having trouble dealing with what is essentially semantically structured data, they’ll have very little hope when it has to work on the open internet.

    The point about declining cost curves is one that holds true for anything, not just for Powerset.

    Incidentally, it is “triples” and not “tuples.” It was used with a particular intent and was not used in conjunction with databases.


  3. Jenkins, The response is just above your comment. Let me repeat it: “If it (Powerset) was already done and proven the price tag would have been billions of dollars,” not millions.

    Powerset started with a grand vision. That vision couldn’t possibly be completed in a couple of years. They have made great progress and proven the vision on a small scale: Wikipedia. The next step is scaling it up to handle the whole web. That will take many millions of dollars and a lot of time, more time than startup investors are willing to wait. Microsoft is willing to invest for the long term.

  4. jenkins

    I noticed that Don never responded to my post, which is based on tidbits I heard from investors who turned down Powerset. They were promising the moon and, instead, gave us a slightly better version of Wikipedia Search.

  5. Don Dodge

    Shyam, No dithering here…it does both. Most people only focus on the query parsing, but that is actually the easy part. The post processing of the index is the really hard part that takes all the science and all the capital for the compute infrastructure.

    The technology involves a lot more than crawling “tuples” and parsing queries. There are decades of research invested in this, and hundreds of man-years of development from Powerset too. It is not as simple as “slapping a parser on top of….” But believe what you will. Many things are simple in concept, but nearly impossible in practice.

    The compute infrastructure required to do this semantic processing is huge and expensive…about 100 times more expensive. The cost curves must continue to come down and the technology must continue to improve to realize the full potential of Powerset. If it was already done and proven the price tag would have been billions of dollars. Time will tell if Powerset was a bargain or not.

  6. Om & Don,

    Let us see a bit of objectivity here. What exactly happened with the Powerset sale? Microsoft bought a company that was indexing Wikipedia and created a query parser that supposedly understands your intent.

    I read Don’s post on the matter yesterday; he was dithering between “it is the query parsing” and “it is the index!” as the value that Powerset is bringing to MS with the acquisition.

    But for all practical purposes (promise/potential does not have an automatic turnaround into realization anywhere in the world), it was still technology that worked demonstrably on a single website from which people are already crawling RDF triples (

    Slapping a good query parser on top of that and getting paid $100 million for it is what Visa would have called “priceless.”

  7. Om, Good story. Thanks for the link. I think you have analyzed the situation perfectly. Barney put together a great team and went for the big win. He knew it would be a long haul, and he understood the cost curves and how they come down over time.

    The technology is great. But, it takes enormous amounts of capital to execute the plan, apparently more than startup investors are willing to bet. Powerset got a great exit for investors and employees.

    How many companies have achieved an exit of that size in the last 12 to 24 months? Not many. Barney and the Powerset team did a great job, and will continue to do great things at Microsoft.

    Enjoy your time in Israel…and stay out of the noonday sun :-)


  8. jenkins

    Let’s be honest and stop sugarcoating things here. Powerset was failing. They were having trouble getting new capital at an increased valuation because they FAILED to launch their beta and, instead, launched a beta for Wikipedia, which was not the original plan. If they had come to me for money, I would’ve been very concerned with their progress.

  9. Om- I don’t completely agree with you on this point. I think it’s certainly the case that you need to worry about the per-query computation cost. But I don’t think that means you need to have your revenue model figured out.

    Google certainly did not for a very long time. That said, paid search is certainly the standard model for monetizing search. I can imagine that the index for a search engine like Powerset’s might afford some other interesting possibilities for monetization (e.g. we can now monetize phrases like “near Paris” vs “in Paris”).

    Ultimately I think the most important thing is to first build a great search engine. That definitely requires infrastructure for highly computational query processing… but I don’t think it requires a revenue stream.

  10. Interesting points. This sounds like something that might actually be most useful for an enterprise search product. Not a bad thing, since enterprise search often sucks compared to web search. For one thing, you don’t have the same sort of implicit metadata about content that can be used for ranking (a la PageRank).

  11. Having worked on the enterprise side of the semantic, NLP search business, I can say you’re pretty much spot on. Query intent, and creating the meta-tag ontologies and taxonomies to support it, take a massive amount of both processing power and storage space. On the consumer side of the business, it would have been nearly impossible for Powerset to challenge Google, no matter how superior their approach is/was.

    Now we’ll have to wait and see what Microsoft does with this new toolset.