When visiting Israel in the middle of summer, it’s generally not a good idea to go for a walk in the afternoon, even if it is along the sea. The heat and humidity sap your energy, making you feel as if you spent nearly three hours in the gym. But that wasn’t enough to stop me from writing a post about Microsoft buying Powerset for what is rumored to be around $100 million.

I’ve been unable to stop wondering why founder Barney Pell decided to take the money and run. After all, he used to turn blue in the face telling people how superior Powerset’s approach to search was. If it was so superior, why sell now? Mike Masnick of Techdirt put it best when he wrote that “[T]he exit certainly falls well short of the hype around Powerset. If Powerset was actually seeing any traction at all it never would have agreed to sell at that price.”

To some extent, Mike is right, but I would add another reason: infrastructure, specifically how expensive it is to build. At our Hadoop meet-up earlier this year, Chad Walters, director of engineering at Powerset, noted that their search “requires 100 times more processing than simple keyword searching and indexing (about one second per sentence is required for processing).”
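To get a feel for what that quote implies, here is a rough back-of-the-envelope sketch in Python. Only the "one second per sentence" and "100 times" figures come from the quote above; the corpus size, the machine count, and the `wall_clock_days` helper are hypothetical inputs chosen purely for illustration, not Powerset's actual numbers.

```python
# Back-of-the-envelope estimate of indexing time at the quoted rate of
# about one second of processing per sentence.
# ASSUMPTION: the corpus size and machine count are invented for
# illustration; they are not figures from Powerset or Microsoft.

SECONDS_PER_SENTENCE = 1.0   # from the quote: about one second per sentence
KEYWORD_FACTOR = 100         # from the quote: 100x more work than keyword indexing
SECONDS_PER_DAY = 86_400


def wall_clock_days(sentences, machines, seconds_per_sentence):
    """Days needed to process `sentences`, split evenly across `machines`."""
    total_seconds = sentences * seconds_per_sentence
    return total_seconds / machines / SECONDS_PER_DAY


hypothetical_sentences = 1_000_000_000   # assumed corpus size
hypothetical_machines = 1_000            # assumed cluster size

semantic_days = wall_clock_days(
    hypothetical_sentences, hypothetical_machines, SECONDS_PER_SENTENCE
)
keyword_days = wall_clock_days(
    hypothetical_sentences,
    hypothetical_machines,
    SECONDS_PER_SENTENCE / KEYWORD_FACTOR,
)

print(f"semantic indexing: {semantic_days:.1f} days")
print(f"keyword indexing:  {keyword_days:.2f} days")
```

Under these made-up inputs, the same cluster that finishes a keyword index in a few hours would need more than a week for the semantic one, and that gap has to be paid for again every time the index is rebuilt.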

Powerset used some pretty nifty technologies to build out their system, but in order to really scale, they would have needed more money — a lot of it.

And Powerset would have had to scale; there’s no other way to compete with search’s 800-pound gorilla, Google. That’s why Microsoft is building a gigantic data center in the Chicago area focused almost entirely on search. (Which it can now use to help roll out Powerset’s search technology to a larger audience.)

This is an object lesson for every startup looking to get into the business of search: No matter how good your algorithms are, you still have to deal with the cost of queries, which needs to be low enough to be offset by some kind of advertising in order to make a profit. (The conspiracy theorist in me says that if your results are really good you won’t be able to generate enough inventory to serve up ads that bring in the dollars, but maybe I’m just too cynical.)

One of our readers believes that it is possible to build a search engine that surpasses Google’s. Nevertheless, as I’ve noted in the past, “[P]rocess-optimized infrastructure ensures that Google’s cost of executing a query keep going down” — and that allows the company to wring more dollars from the system.

Given all that, Powerset has done a good job of wringing a hundred million from Microsoft. Not that there’s anything wrong with that.

Bonus Link: Don Dodge of Microsoft explains the logic behind the deal.

  1. Having worked on the enterprise side of the semantic, NLP search business, I can say you’re pretty much spot on. Query intent, and creating the meta-tag ontologies and taxonomies to support it, take a massive amount of both processing power and storage space. On the consumer side of the business, it would have been nearly impossible for Powerset to challenge Google, no matter how superior their approach is/was.

    Now we’ll have to wait and see what Microsoft does with this new toolset.

  2. Interesting points. This sounds like something that might actually be most useful for an enterprise search product. Not a bad thing, since enterprise search often sucks compared to web search. For one thing, you don’t have the same sort of implicit metadata about content that can be used for ranking (a la PageRank).

  3. Om, I don’t completely agree with you on this point. I think it’s certainly the case that you need to worry about per-query computation cost. But I don’t think that means you need to have your revenue model figured out.

    Google certainly did not for a very long time. That said, paid search is certainly the standard model for monetizing search. I can imagine that the index for a search engine like Powerset’s might afford some other interesting possibilities for monetization (e.g. we can now monetize phrases like “near Paris” vs “in Paris”)

    Ultimately I think the most important thing is to first build a great search engine. That definitely requires infrastructure for highly computational query processing… but I don’t think it requires a revenue stream.

  4. Google’s real product? The vast and reliable systems infrastructure.

  5. Your post reminds me of a comment by Gordon Moore about a very different business: “Capacity is destiny.”

  6. Let’s be honest and stop sugarcoating things here. Powerset was failing. They were having trouble getting new capital at an increased valuation because they FAILED to launch their beta and, instead, launched a beta for Wikipedia, which was not the original plan. If they had come to me for money, I would’ve been very concerned with their progress.

  7. eas is right, I believe; the integration of Powerset semantics within FAST enterprise search, on top of a diverse Sharepoint set of services, should make a really killer enterprise “information work” environment. See http://lewisshepherd.wordpress.com/2008/07/01/semantic-reality/

  8. Om, Good story. Thanks for the link. I think you have analyzed the situation perfectly. Barney put together a great team and went for the big win. He knew it would be a long haul, and he understood the cost curves and how they come down over time.

    The technology is great. But, it takes enormous amounts of capital to execute the plan, apparently more than startup investors are willing to bet. Powerset got a great exit for investors and employees.

    How many companies have achieved an exit of that size in the last 12 to 24 months? Not many. Barney and the Powerset team did a great job, and will continue to do great things at Microsoft.

    Enjoy your time in Israel…and stay out of the noon day sun :-)


  9. Om & Don,

    Let us see a bit of objectivity here. What exactly happened with the Powerset sale? Microsoft bought a company that was indexing Wikipedia and created a query parser that supposedly understands your intent.

    I read Don’s post on the matter yesterday; he was dithering between “it is the query parsing” and “it is the index!” as the value that Powerset is bringing to MS with the acquisition.

    But for all practical purposes (promise/potential does not have an automatic turnaround into realization anywhere in the world), it was still technology that worked demonstrably on a single website from which people are already crawling RDF triples (Dbpedia.org).

    Slapping a good query parser on top of that and getting paid $100 million for it is what Visa would have called “priceless.”


