The government doesn’t want to mess up on big data

The federal government talked a lot about grand scientific visions when it unveiled its big data agenda last week, but the government has consumers on its mind, too. The message from Washington seems to be that it doesn’t want to unduly hinder innovation on consumer-facing big data applications, and it might even be willing to help make them better.

According to Decide Vice President of Product and Marketing Michael Paulson, members of the company were invited to attend the White House’s big data summit, and one of the big takeaways was that Congress might be willing to admit that it doesn’t know what it doesn’t know with regard to big data. While discussions about privacy get a lot of attention, lawmakers don’t always know how data is being used to improve the user experience. There are a lot of bills about to be written governing data use, Paulson said, and part of the summit was an effort to make sure those writing the bills don’t hinder benign innovation with short-sighted laws.

Paulson also noted that several agency members approached Decide Co-Founder and CTO Oren Etzioni and a Decide software engineer and suggested they could provide datasets that might actually improve Decide’s buy-or-wait engine for electronics purchases. Decide works by analyzing large datasets on pricing and product-release information in order to give consumers an idea of whether now is the best time to buy a particular product or whether they should wait because the price is about to drop or a new model is about to come out. It also aggregates reviews, news and other info about a product so consumers can make informed decisions beyond Decide’s algorithmically generated predictions. According to Paulson, federal agencies have data on factors from energy efficiency to manufacturing location that they might be willing to share.

If you’re wondering what makes Decide such an innovative use case that it was invited to attend and speak at the summit, here are some highlights of how the service works:

  • The founders and many engineers came from airfare-prediction service Farecast, which Microsoft (s msft) bought in 2008. Co-Founder and CTO Oren Etzioni founded Farecast, as well as Clearforest, which Thomson Reuters acquired in 2007.
  • Decide gives consumers a simple “buy” or “wait” decision on their planned purchases, as well as greater detail into how much the price might rise or fall in what timeframe, or when a new version is likely to be released.
  • Paulson said Decide is 77 percent accurate on predictions and saves users an average of $54 per purchase. About 20 percent of the time, he said, there’s a good reason to wait.
  • It has a 100TB-and-growing pricing database and more than 8 billion price observations to drive pricing models.
  • It built the world’s first large “lineage database,” a massive store of information on which products and versions are related to each other. This, along with news and rumors, helps predict the likelihood of new models.
  • Decide relies on Amazon Web Services (s amzn) for its operations, including Elastic MapReduce for Hadoop jobs. Other big data tools include Apache Solr/Lucene and Weka, an open source “collection of machine learning algorithms for data mining tasks.”
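
To make the “buy” or “wait” idea concrete, here is a toy sketch in Python. It is not Decide’s method, which the company did not detail; it simply assumes a straight-line trend fitted to recent price observations and a flag for whether a new model is expected. The names linear_slope and buy_or_wait and the parameters horizon and min_saving are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Recommendation:
    action: str             # "buy" or "wait"
    expected_change: float  # projected price change over the horizon


def linear_slope(prices: Sequence[float]) -> float:
    """Least-squares slope of price against observation index."""
    n = len(prices)
    if n < 2:
        return 0.0  # not enough data to estimate a trend
    mean_x = (n - 1) / 2
    mean_y = sum(prices) / n
    num = sum((i - mean_x) * (p - mean_y) for i, p in enumerate(prices))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def buy_or_wait(
    prices: Sequence[float],
    horizon: int = 14,            # assumed: steps ahead to project
    min_saving: float = 5.0,      # assumed: smallest drop worth waiting for
    new_model_expected: bool = False,
) -> Recommendation:
    """Recommend waiting if the trend projects a large enough drop,
    or if a new model is expected; otherwise recommend buying."""
    expected_change = linear_slope(prices) * horizon
    if new_model_expected or expected_change <= -min_saving:
        return Recommendation("wait", expected_change)
    return Recommendation("buy", expected_change)


if __name__ == "__main__":
    # Illustrative prices only, not real data.
    print(buy_or_wait([100, 98, 96, 94, 92]))
    print(buy_or_wait([100, 100, 100, 100, 100]))
```

A real service would presumably draw on far richer signals, such as the product lineage data and the news and rumors mentioned above, but the shape of the decision is the same: project the price, compare the projected saving against a threshold, and factor in the chance of a new release.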

And we are only at the beginning when it comes to startups using data to build apps that will improve the consumer experience. Even for teams that aren’t stocked with Ph.D.s, analytic techniques such as natural-language processing and machine learning are slowly making their way into the mainstream. Just this morning, I had a call with the founders of Stazoo, a site that’s using sentiment analysis to help users determine what music and movies are hot among their friends and the greater world of Twitter users.

I have been adamant about the need for lawmakers to get a better grasp on where technology is headed so they don’t write laws that inadvertently stifle innovation down the line, and data is a particularly tricky area in that regard. If Congress and the administration are actually serious about getting educated on big data so they can ensure it flourishes, I say kudos to them.