The internet of things isn’t producing a data deluge … yet

There are already billions of devices, from forks to jet engines, connected to the internet, and all signs point to a huge surge in the coming years. Cisco, for example, predicts 21 billion of them in 2018, up from 13 billion in 2013. But despite those numbers, the companies that will be storing all that device data are less concerned about sheer volume and more concerned about making it usable.

On the Amazon Web Services cloud, where anecdotal evidence suggests a large percentage of IoT applications run, all that connected data is just a drop in an enormous bucket. Matt Wood, the company’s general manager of data science, said the world of big data has matured so much over the past few years that, for example, customers regularly spin up thousands of cores to process large datasets. It’s not yet commonplace, but it’s common enough that “we get blasé talking about petabytes and tens of thousands of cores,” he said.

So with connected devices and IoT, he added, “It’s never really concerned me, the volume.” When IoT gets into full swing, it might already be entirely normal for customers to store hundreds of petabytes of data with their cloud providers. AWS has gotten used to growing right along with demand for capacity, Wood said, so the challenge brought by a world of connected devices is really more about providing abstractions, connections and tools to make that data easier to use.

Matt Wood at Structure Data 2013. Credit: Albert Chau / itsmebert.com

Even Splunk, the company behind the popular (and eponymous) software for analyzing machine-generated data, isn’t yet feeling the force of the IoT wave. Splunk has more than 8,000 paying customers around the world and many more users of its free software (it’s still downloaded 25,000 times per quarter, CEO Godfrey Sullivan claims), and the vast majority of the data they’re analyzing comes from server logs.

“I still haven’t figured [the internet of things] out from a volume perspective,” Sullivan said.

For now, there just aren’t enough tried and true use cases around which to build repeatable businesses. Sure, Splunk has some big customers collecting device data, including Coca-Cola with its “freestyle” soda machines and New York Air Brake analyzing control-system data from thousands of trains, but there aren’t many other companies that would need similar systems in place. Sullivan estimates about 5 percent of Splunk’s business might fall under the IoT umbrella.

Source: Coca-Cola

Sullivan thinks connected devices will probably be a noticeably bigger driver of data for Splunk users about five years from now, and that their growing comfort with cloud computing will probably play a big role in that shift. “If you can develop in the cloud and stay there, it just makes a lot of sense,” he said.

Looking at the types of long-tail systems currently being deployed on Amazon’s cloud might give Sullivan a sense of what that next generation of users might be doing. Wood mentioned a sushi restaurant in Japan that has RFID tags on its plates and monitors in real time what’s being eaten so it can ensure it’s serving fresh food. He noted an environmental monitoring startup that analyzes sensor data from construction sites in order to make sure things aren’t getting too loud or too dirty.

Source: Spark

And he mentioned a startup called Spark that’s building a microcontroller for connected devices as well as a cloud-based operating system for managing them and a set of tools for analyzing the data they produce. The easier it is for people to add sensors to their devices and start gathering the data, the faster we’ll see them finding their way onto everything. The days of the internet of things existing in its own little bubble, Wood said, are disappearing fast.

2 Comments

Peter Fretty

The key to success here is going to be in whether or not organizations have the strategies in place and the understanding of how to properly integrate all of the M2M data into analytical models. The potential is here, but as a recent IDG survey shows, most organizations lack effectiveness when it comes to most of the key components of a truly successful big data program.

Peter Fretty, IDG blogger working on behalf of SAS

Biff

What an enormously hyped and useless collection of data you do not need to provide to companies.
