Summary:

With the firehose of information enabled by Facebook, location-based services, and other forms of social media, the era of Big Data is upon us. But in the next decade, much of that data won’t come from social networks but from sensor networks.

With the firehose of information enabled by Facebook, Twitter, location-based services, and other forms of social media, the era of Big Data is upon us. However, outside of the consumer world, the stakes are much higher: While advertisers and consumers are focused on monetizing sites that have hundreds of millions of users for a few pennies each, the ubiquity of connectivity and the growth of sensors have opened up a larger storehouse of information that will not only help businesses profit, but will also boost safety and enable environmental benefits.

For example, a Boeing jet generates 10 terabytes of information per engine every 30 minutes of flight, according to Stephen Brobst, the CTO of Teradata. So for a single six-hour, cross-country flight from New York to Los Angeles on a twin-engine Boeing 737 (the plane used by many carriers on this route), the total would be a massive 240 terabytes: 12 half-hour intervals, times two engines, times 10 terabytes. There are about 28,537 commercial flights in the sky in the United States on any given day. Counting only commercial flights, a single day’s worth of sensor data quickly climbs into the petabyte scale. Multiply that by weeks, months and years, and the scale of sensor data gets massive.
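The arithmetic behind those figures can be checked with a short back-of-envelope sketch. The per-engine rate and the daily flight count come from the article. The function names, the decimal conversion of 1,000 terabytes per petabyte, and the idea of applying the six-hour example to every flight are illustrative assumptions, so the daily total is a rough illustration rather than a measured number.

```python
# Back-of-envelope check of the figures quoted in the article.
TB_PER_ENGINE_PER_HALF_HOUR = 10   # quoted rate, attributed to Stephen Brobst
FLIGHTS_PER_DAY = 28_537           # quoted U.S. commercial flights per day
TB_PER_PB = 1_000                  # assumption: decimal units


def flight_terabytes(hours, engines,
                     tb_per_engine_per_half_hour=TB_PER_ENGINE_PER_HALF_HOUR):
    """Terabytes generated by one flight at the quoted per-engine rate."""
    half_hours = hours * 2
    return half_hours * engines * tb_per_engine_per_half_hour


def daily_petabytes(tb_per_flight, flights_per_day=FLIGHTS_PER_DAY):
    """Illustrative daily total, assuming every flight generates tb_per_flight."""
    return tb_per_flight * flights_per_day / TB_PER_PB


if __name__ == "__main__":
    per_flight = flight_terabytes(hours=6, engines=2)
    print(f"Six-hour twin-engine flight: {per_flight} TB")
    print(f"Daily total if every flight matched it: "
          f"{daily_petabytes(per_flight):,.0f} PB")
```

For the six-hour, twin-engine case this reproduces the 240-terabyte figure. Because most flights are shorter than six hours, the daily number overstates what the quoted rate would imply in practice, but it shows why even a fraction of it lands in the petabyte range.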

Brobst, whose company sells data warehousing appliances and analytics software, points out that the Internet of Things will dwarf social media sites in its ability to generate data. The stakes and potential for monetization are huge in a world where roads have sensors and can communicate with the vehicles passing over them to determine traffic patterns, find more sustainable ways to route cars and perhaps even generate data to be sold to insurance companies or other businesses seeking to tap transportation information.

Brobst says that within the next five years, the volume of sensor data will reach the crossover point with the unstructured data generated by social media. From there, sensor data will dominate, outpacing social media data by a factor of 10 to 20. However, using this data will be difficult for the time being, as there are no standards to ensure the data can be read by anyone other than those possessing the right software or algorithm. There’s also a question of who owns the data.

For example, if a roadway has sensors embedded in it, does the federal or state government own them? Plus, what software does the government need to talk to those sensors, and since highway projects tend to be bid out by the states, will one state be using the same sensors or software as another? Once someone adds private industry to the mix, such as trying to assemble traffic data from cars, or trying to optimize routes for fuel efficiency, the questions become: Should a car manufacturer place that information in the car? Should the consumer opt in via a cell phone? Or should a consumer buy an insurance policy at a discount in exchange for getting a black box that will deliver a stream of data back to the insurer?

Other than ownership and interoperability questions, there’s also the question of how long companies should store the data and who has access to it. With so many disparate sources of data, and no real vision right now for a way to get all the data formatted in a manner that could be used by any number of interested parties, larger providers like Microsoft are seeking to create marketplaces where data can be bought and sold among interested parties.

However, as Brobst says, the amount of data is only going to continue to rise, so figuring out how to manage it, what to keep and how to mine it for useful information will become increasingly important. Effectively utilizing this data, from energy to fuel consumption to weather data, could also provide valuable tools for environmental sustainability. Big Data is a big opportunity, but it’s also leading to big questions.

  1. Imagine, all that information…what to do…

  2. Interesting article, Stacey. Beyond the problems you discuss, there are also the speed and reliability of data issues that need to be addressed as we start deploying more wireless sensor nets. Reliability of data specifically is a key issue because it could severely hamper the purpose behind the deployment. Can you really offer discounts to insurance customers if data on their driving patterns is incomplete? Current sensor technology suffers from incomplete data streams, often due to packet loss or deformations of the topological space. Having said that though, in the grand scheme, these are solvable problems, and the sensor data deluge, as you say, will continue to be a huge part of our apparent reality.

  3. [...] Sensor Networks Top Social Networks for Big Data. [...]

  4. [...] just came across this post over on GigaOm about the data volume generated by social networks vs that generated by the various sensors that [...]

  5. [...] locked in their data stores. EMC recently bought GreenPlum for similar reasons as well. And with sensor-based networks and data those networks produce making their way into data warehouses soon, it makes perfect sense for IBM to make a bold [...]

  6. [...] that is hitting the corporate world.  We see him as a kindred spirit. Whether it is writing about the rise of sensor nets or webscale databases, big data is an area of focus for [...]

  7. Do airlines really store and process 240TB of engine data for every cross-country flight? Or is that what flows through an onboard computer that watches for anomalies and notifies the pilot with a warning light, throwing away most data so long as things are going well? Somehow I don’t picture an array of a hundred 3TB hard drives in the back of the plane.

  8. [...] But these speeds aren’t really about consumer applications today, but more for shifting the terabytes of data businesses are aiming to analyze in the near future and for medical imaging and other high-bandwidth needs. Believe it or not, at [...]

  9. [...] (terabytes) of data per run over a period of one to three days per run. "Sensor Networks.From http://gigaom.com/cloud/sensor-n… :"For example, a Boeing jet generates 10 terabytes of information per engine every 30 minutes [...]

