Factual Sees Open Data As Its Future

First there was open source. And now, here comes open data. Or that’s what Gil Elbaz thinks. He’s returning to the startup arena with a new Los Angeles-based company called Factual, which is building its business around the concept of open data. It wants to help build open databases that will run the gamut, from the addresses of all Thai restaurants in San Francisco to the fauna of Florida. And instead of chasing the elusive consumer, the company is going to woo app developers by targeting its offerings to them.

“We wanted to build an open data repository,” Elbaz explained, “because just as databases on computers are used by apps, we wanted to make the web computable.” Elbaz first earned his chops as the co-founder of Applied Semantics, a company that was acquired by Google in 2003 for about $102 million. Applied’s technology is now part of Google’s AdSense technology. He left Google in 2007 and since then has been focused on Factual, a company he has so far funded with his own cash.

When I asked him why he started Factual, he said he’d observed that when people didn’t have enough data to use, they couldn’t make smarter decisions. And what if the data available wasn’t accurate or easily accessible — would decisions made be the right ones? All of which are good points, but how does one go about addressing them?

Factual is seeding the system with databases built out of government data, Wikipedia and other such resources. Elbaz is hoping that over time, folks will start their own data projects and leverage their communities to help build these databases. And since these databases will be open, any app developer could tap into them for their own applications. In many ways Elbaz has taken his cue from Wikipedia.

Elbaz hopes that even larger companies will open up and start sharing their data sets with the community. “Companies will soon realize that they can clean up and grow their data sets without incurring huge costs associated with closed databases,” he said. From the Factual blog:

We think a good route to low cost and high quality data is the open data model.  By making data open to access (read) so that developers can create valuable new applications without complex data licensing restrictions, and by making the data open for opinion, comment, and debate (write) — we believe a groundswell of support for certain data verticals could emerge.

There have been a number of great open structured data projects that have positively impacted the web; ODP, MusicBrainz and OpenStreetMap are just a few examples. But we believe it’s just the beginning. Factual intends to build one of the largest repositories of open data by providing an open, collaborative environment where anyone can easily view, contribute, improve and share data.

We’ve been testing with several partners (see home page for list) who understand that Factual’s open data can help websites offer better data and tools for end users.  For example, we’ve partnered with Demand Media, on a cancer physician table on their Livestrong.com site.

The company plans to eventually charge for a suite of premium services, including access to for-fee premium APIs and quality-of-service guarantees. All that comes next, mostly because the company faces the uphill task of convincing developers and hundreds of data set creators to join Factual. It has competition from San Francisco-based Metaweb (with its Freebase products), amongst several other projects.

Factual’s hard challenges aside, I’m going to keep an eye on this company, mostly because of the pedigree of the founder and because data, or rather data analysis as a key strategic asset, is the next big thing.

Earlier this month, much to the chagrin of some of our readers, I equated the Hadoop-focused startup, Cloudera, to Red Hat. My argument was that in the late 1990s, open-source operating systems and web software proved to be major disruptors and helped Internet services grow exponentially. About a dozen years later, the future of Internet services revolves around data and data analytics.

And Hadoop, open-source software for distributed data storage and processing, has become a popular choice for everyone who relies on data crunching, from advertising companies to biotech giants. Folks at Google know this all too well, which is why they’ve invested extensively in their infrastructure systems. From that perspective, if Elbaz can convince enough people to sign up, his little Factual has a good chance of carving out a nice niche for itself.

Om’s Note: I am going to play around with their service more extensively before passing judgment on it.