
Summary:

Google may have more distributed data than any other company, but it still takes user input to create smarter machines. Google’s Voice Search speech recognition, for example, began to improve when the service started to train itself on end-user data.

Alfred Spector, Google, at Structure Big Data 2011

Making sense of vast amounts of data is made easier through processor improvements, faster networks and a growing amount of cloud storage capacity, but there’s another factor that’s accelerating the ability to sift through information: user communities. At the Structure Big Data event on Wednesday, Alfred Spector, a VP of Research and Special Initiatives at Google, illustrated how to combine low-level user data with the massive information stores and cloud computing services offered by his company.

Perhaps the most prominent example is Google’s geographic data, used in both the Google Maps and Google Earth products. The company harvests global information to build products that are useful in their own right, but each can be supplemented with localized user data. A modern data management web app makes it easy for Google to host and manage data tables or personalized maps, and to support collaboration on and publication of them. For example, Google Maps data combined with information from hospitals and doctors can easily show which nearby health-care providers have flu vaccines available.

Making large amounts of data usable and modifiable by end users has the potential to create solutions that Google hasn’t envisioned yet. It has already enabled what Spector calls a “hybrid intelligence,” in which users and computers do more together than either could do individually. Scientists who track global warming may have access only to limited datasets that show a small part of the overall situation. Google Earth, however, can augment its base data with sensor information from various satellites and data points, providing a more holistic view of global warming.

This combination of user communities and data is leading to smarter machines as well. The voice search features offered by Google are becoming more accurate thanks to speech recognition data provided by users. In effect, the speech service is training itself, because it learns from all of the incoming data.

Just as they can with Google Maps data, end users can put these smarter machines to work themselves. Spector said that end users could create a spam-killing blog moderator by training the system with both good blog posts and spam comments. Those inputs, combined with Google’s prediction APIs and Python scripts, would effectively create an intelligent automated moderator that could continuously improve its own performance.
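To make the idea concrete, here is a minimal sketch of such a moderator in plain Python. It is not Google's prediction API and not anything Spector showed: it swaps in a small naive Bayes text classifier, and the class name `NaiveBayesModerator`, its methods and the training sentences are all hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


class NaiveBayesModerator:
    """Hypothetical toy moderator: a naive Bayes classifier with add-one smoothing."""

    LABELS = ("ham", "spam")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        """Add one labeled example; can be called at any time to keep learning."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def score(self, text, label):
        """Return the smoothed log-probability of the text under the given label."""
        vocab = set(self.word_counts["ham"]) | set(self.word_counts["spam"])
        total_words = sum(self.word_counts[label].values())
        total_docs = sum(self.doc_counts.values())
        # Smoothed prior, so an unseen label never produces log(0).
        log_prob = math.log((self.doc_counts[label] + 1) / (total_docs + len(self.LABELS)))
        for word in tokenize(text):
            # The extra +1 in the denominator reserves mass for unknown words.
            log_prob += math.log(
                (self.word_counts[label][word] + 1) / (total_words + len(vocab) + 1)
            )
        return log_prob

    def is_spam(self, text):
        return self.score(text, "spam") > self.score(text, "ham")


# Made-up training data: good comments ("ham") and spam comments.
moderator = NaiveBayesModerator()
moderator.train("great post thanks for the detailed writeup", "ham")
moderator.train("interesting analysis of the data", "ham")
moderator.train("buy cheap pills online now", "spam")
moderator.train("cheap watches click here now", "spam")

print(moderator.is_spam("click here to buy cheap pills"))        # True
print(moderator.is_spam("thanks for the interesting analysis"))  # False
```

Because `train` only updates counts, every comment a human moderator later confirms or corrects can be fed back in, which is the sense in which such a system keeps improving on its own inputs.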

  1. Probably that is the reason of Google’s success and it will keep prospering at a rapid pace if it keeps working with data that way.

  2. Analytics in Financial Services: TDWI + NY Tech Council Present First of a New Series on Practical Industry Analysis http://bit.ly/gM1FrM

  3. The converging of large data stores with local information has huge potential in the real estate sector for sure. It would be a tremendous help to home buyers when we can unify real estate for sale with housing data, crime statistics, local amenities, neighborhood sales, and foreclosure trends.
