Two Startups Point To Semantic Search’s Future


The best way to find the needle you’re looking for in that haystack is to organize the hay. We need to understand how pieces of information relate to each other, or they’re not useful. Two new startups I spoke with recently — Semantifi and FindTheBest — are turning big, opaque datasets into online databases that can be queried. Both use human power to format thousands of topic-specific web applications, which help users manipulate data to find more complex and satisfying answers than what could be found with a regular search query.

Semantifi was one of the cooler companies I saw at the DEMO Conference this week. You can use the company’s SEC filing search app to query “Best Buy (s bby) Amazon sales last 2 years” and, within seconds, get a table comparing quarterly sales data for each company, as well as an automatically formatted graph (pictured below).

Stamford, Conn.-based Semantifi wants to provide access to structured datasets by enlisting people to “unlock the deep web” and use its web-based tools to create search apps. What Semantifi means by an app is a dataset that has been imported by a user, with the user teaching the system which cells are categories and a little bit about how they relate to each other. Semantifi has seeded its site with search apps to better access government and financial data.

Semantifi, which has raised $3.5 million in funding, is trying to enlist users to organize data, but it’s also striking partnerships with publishers; for example, Zacks Investment Research is helping create apps to query financial data. Publishers can turn their content and data into a platform and choose to only allow paying subscribers to access their apps.

How might a regular user participate? CTO Vishy Dasari showed me how to create an app from Amazon (S AMZN) product data (taken from a chart provided to affiliate partners) to answer queries like “silver digital camera under $400.” Answering that question through Amazon’s own site search and results pages would take many clicks.

Meanwhile, FindTheBest’s (FTB) first notable feature may be its founder and CEO, Kevin O’Connor, who also co-founded the ad network DoubleClick, and is now returning to entrepreneurship after a 10-year hiatus. “On the Internet, you can find any piece of information, but when you actually want to compare things and make a decision it’s very difficult,” said O’Connor in a telephone interview. His Santa Barbara, Calif.-based, bootstrapped company wants to help users make comparisons by giving them detailed charts made by its in-house researchers and contract workers.

FTB determines what new topics to tackle by looking at search data. That part sounds a bit like Demand Media, but rather than articles or videos, the company creates living databases that look a lot like a Kayak search results page — big chart in the middle of the page, sliders on the left to tweak what the chart is showing, and options to dive deeper and compare a few top items.

The company has created comparison apps to evaluate the best ski resorts based on things like snowfall and personal preferences for terrain difficulty. There are also some more unexpected examples within the 400 created so far, like a comparison app to find the best California medical marijuana dispensaries. O’Connor pointed to another app for kidney dialysis centers, created after the company determined that 200,000 people per month are actively searching for the term. Our readers might be interested in an app that compares 400 technology company acquisitions.

FTB stands out for how manually it’s created. If the hope for the so-called “semantic web” is to think more like a human, we should stop trying to conjure magic software and hire some people. The company has 15 full-time employees, eight interns and 20 outsourced workers between its research and product teams, O’Connor said. Any user can edit an app, but the change has to be approved by FTB editors.

Since new search destinations almost never gain market share, it will be important for both these companies to bring users in through searches on Google (s goog) and elsewhere. They’re both actively working on search engine optimization. If you search Bing (s MSFT) for “housing starts unemployed last 60 months,” a Semantifi page is one of the top results. O’Connor said FTB is focused on attracting searchers for long-tail queries, then exposing them to a broader range of comparison information than they might have known was available.

These aren’t the only companies focused on helping people create web databases. I recently wrote about Needlebase, a fascinating side project of the travel company ITA that Google just acquired, and Google also just bought Metaweb and its open database Freebase. Other competitors include Wolfram Alpha, Socrata and Factual.

These startups answer Tim Berners-Lee’s call to turn “raw data” into “linked data.” He made a passionate plea at TED last year to bring data online in order to mash it up “for multidisciplinary purposes, like combining genomics data and protein data to try to cure Alzheimer’s.” Having what you’re doing blessed by the creator of the web is a pretty good start.

Yuriy Guskov

The first startup is yet another way to pick a format and make your data compatible with it. The second startup is yet another way to make humans collect information. That is neither bad nor good; it is just another way. But both of these startups, like the old and mature search engines, miss one simple point: information always has missing, implied, and ignored parts. Therefore, we always meet the same sad scenario: someone creates information, then someone else tries to restore all the missing parts. The alternative is to have ways to identify information and make it more meaningful as early and as simply as possible. You might think of the Semantic Web, but, unfortunately, it has several serious shortcomings which prevent it from succeeding the conventional Web. Finally, the technology should be able to match a search query to a result exactly, whereas a statistical guess (which contemporary search engines make) should be the last resort, used only if other ways fail. You can read more at


Both are interesting services, but I like the structure of FTB; I’m sure I will be using it a lot.

Thanks, I wouldn’t have found these services otherwise.

