Everyone likes to talk about big data, but few know how to make use of it. Thanks to cloud computing and the efforts of several companies, however, the ability to access and make sense of huge chunks of information is here. The question is whether there’s a business in providing intelligible data sets to information workers, application developers and analysts in a world where turn-by-turn directions and real-time financial quotes — which used to be expensive — are now free.
Microsoft (s msft) is hoping there is, and to that end has built a storefront, codenamed Project Dallas, for data sets that range from geolocation data to weather information. The project, which will become commercially available in the second half of the year, aims to provide access to data from information providers like InfoUSA, Zillow and Navteq so that developers can use it to build applications and information services. Other potential users of the information are researchers, analysts and information workers, from buyers at retail stores to competitive intelligence officers at big companies. Microsoft will take a cut of the fee charged by the information providers, but Dallas isn’t about profiting from data brokerage so much as it’s about showcasing Microsoft’s Azure cloud and making its Office products more compelling.
“The indirect monetization is potentially bigger than the direct monetization,” said Moe Khosravy, general product manager of Project Dallas, in a conversation last week. “That will cover some bandwidth and compute and the credit card surcharges for the transactions, but the real opportunity is that more developers will use Azure and Office because we’ve made it easy and will build support for Dallas into Office.”
I explore Microsoft’s efforts as well as those of a startup called Infochimps, which is also building a data marketplace, in a research note over on GigaOM Pro (sub req’d) called Big Data Marketplaces Put a Price on Funding Patterns. In it, I lay out how the ability to host and process large data sets on compute clouds has changed the way people can access and profit from data.
And while I spend a lot of time in the research note talking about business models and how to charge for data by the slice, Infochimps and Microsoft will both provide some data for free, much like Amazon.com (s amzn) and a startup called Bixo Labs are doing. Specifically, Khosravy said Microsoft may try to provide some municipal and federal data as a public service, or at least refrain from charging governments for hosting the data on Azure.
Figuring out how to get public information on data marketplaces is difficult. Governments have a lot of access to data, but it’s generally on paper or in old databases that may not translate automatically to the cloud. There’s a clear public interest in providing that data in a clean format for developers and citizens, but the costs could quickly add up — and governments don’t tend to have a lot of taxpayer dollars floating around to transfer their data to the cloud. That’s why Microsoft’s volunteering to host “a percentage” of public data for free might help.
And the benefits of such easy accessibility and the ability to mash up different data sets could be huge. As an example, Microsoft is working with the City of Miami on a new 3-1-1 line that uses mapping data and inputs from the city’s existing 3-1-1 hotline to create a map of where potholes and street problems are so city officials can tackle the issues in an organized way.
As data marketplaces grow, questions about data ownership and privacy will get resolved, because the financial incentive to address them is huge. Then folks can focus on what they can build using huge swaths of demographic, geographic, financial and even personal data. Read my full analysis.