Summary:

Paul Doscher, CEO of Lucid Imagination, wants you to know that when it comes to enterprise-class search, open-source Lucene is a contender, and a strong one that can face off against Google, Amazon and Microsoft in the big data search arena.

Paul Doscher, CEO of Lucid Imagination, wants you to know that when it comes to enterprise search — and search that can handle the big data wave — open-source Lucene is a contender.

Of course, as head of a company that offers both open-source and commercial versions of Lucene, Doscher is no neutral observer. At the company’s Lucene Revolution conference Tuesday in Cambridge, Mass., Doscher announced an application development stack that knits together Hadoop, Mahout, R and Lucene/Solr to handle search, machine learning, recommendation engines and analytics as a platform for enterprise search. That stack, called LucidWorks Big Data, is in beta and aims to make it faster and easier for developers to deploy enterprise-scale search.

“Most Hadoop instances now are one-off; they’re not scalable and not repeatable,” Doscher told me in an interview. “With our stack, all of which is available via APIs, you can use your own user interface and algorithms, and get productive much faster.”

Lucene, the product of an Apache Software Foundation project, is already used by a ton of e-commerce sites. Zappos, the big online shoe store, for example, uses Lucene for 63 million customer searches, according to Computerworld. That’s interesting since Amazon, which bought Zappos three years ago, is now transitioning from Lucene to its own A9 search. A9 is the technology underlying Amazon’s CloudSearch service, announced a few weeks back.

Other big users include EMC, which is replacing Microsoft’s FAST Search technology with Lucene in its Documentum document management system.

Searching for the right search

As structured and unstructured data proliferate, the need to index and search that data efficiently will only grow.

To be sure, Lucene (which is the core engine) and Solr (which is the more developer-friendly wrapper around that engine) are not the only dogs in this fight. Other players include the Google Search Appliance and HP Autonomy, which Doscher called the “800-lb gorilla.” And there’s Microsoft with FAST and now Amazon, which is continually building up its cloud-based services. Lucid Imagination, based in Redwood City, Calif., offers both on-premises and cloud-based versions of its Lucene/Solr-based search.

Lucid, which employs 9 of the 36 contributors to Lucene, seems to be the fan favorite of the open-source contingent, although there is a rival in Elasticsearch, said Lou Romm, senior program manager for Search Technologies, a consulting firm that helps businesses evaluate the best search for their needs.

Granted, it was a biased group at the conference, but several attendees, including one from the M.D. Anderson Cancer Research Center, said there really is no alternative to Lucene for their purposes. Lucene is able to handle all the data (images, text, structured, unstructured) that choke other solutions. “That’s a big deal when you’re trying to save lives,” the M.D. Anderson attendee said.

Photo courtesy of Flickr user suvodeb

  1. LexisNexis released its platform into the wild a while back; I wonder where that’s gotten to. They have search tech that makes Google look like a newspaper hat on Derby Day.

    1. That is a good question. My bet is LexisNexis is a lot more pricey than Lucene/Solr.

  2. Otis Gospodnetic Wednesday, May 9, 2012

    Lucene is distributed by Apache Software Foundation, NOT LUCID!

    1. LucidWorks is the Lucid version.

  3. Do you have a source for the statement that Amazon is transitioning from Lucene to A9?

  4. Isn’t Splunk competing in the same space?
