
Summary:

Ahead of our Mobilize event Oct. 16 and 17, we asked experts how 50 billion connected devices and 6 billion people will change their industries. In this essay GE’s Bill Ruh tackles the topic of data.

Photo: Bill Ruh, GE (photo courtesy of GE)

Over the last 200 years, the world has experienced several waves of innovation, which successful companies learned to navigate. The Industrial Revolution brought machines and factories that powered economies of scale and scope, profoundly reshaping society and culture around the world. With the Internet Revolution came the rise of computing power, information sharing and data networks, fundamentally changing the way we connect (on whatever device).

And now we are on the cusp of another metamorphic change that will spawn new business models, new jobs and new operational efficiencies: the industrial internet, the convergence of contextual data, people and brilliant machines.

Whether we’re discussing the consumer internet or the industrial internet, a new challenge arises with the seemingly endless proliferation of connected devices and intelligent machines: Big data. Consider that 90 percent of the data in the world today was created within the last two years. And according to IDC’s Digital Universe study, between now and 2020 the amount of digital data is expected to double every two years – much of which will be ‘dark’ data that won’t ever be used.

Chart: IDC Digital Universe Study, projected data growth

To fully appreciate the potential for this data, it is important to consider how large the global industrial system has become. There are now millions of machines across the world, ranging from simple electric motors to highly advanced CT scanners. There are tens of thousands of vehicle fleets, ranging from the trucks fielded by utilities and governments to the aircraft that transport people and cargo around the world.

Tied to those fleets are thousands of complex networks ranging from power grids to railroad systems, all of which are spinning off enormous volumes of data daily. To put this in context, the information contained in all 33 million books in the Library of Congress equates to roughly 15 terabytes of digital data. Compare that to the almost six terabytes of data a single 500-turbine wind farm generates in one day. And that one farm makes up just a tiny component of the industrial internet. If we can harness all of this data, it will help reduce unplanned downtime of these machines, which in turn leads to savings for both the company and consumers.
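
To make that scale concrete, here is a quick back-of-the-envelope calculation in Python using only the figures quoted above; the 6 terabytes per day and 15 terabytes values come from the text, while the derived per-turbine rates are rough approximations.

    # Back-of-the-envelope arithmetic using the figures quoted above. The 6 TB/day
    # and 15 TB numbers come from the text; everything derived here is approximate.
    TB = 10**12                           # bytes per terabyte (decimal convention)

    farm_daily_bytes = 6 * TB             # one 500-turbine wind farm, per day
    turbines = 500
    library_of_congress_bytes = 15 * TB   # ~33 million books, digitized

    per_turbine_daily = farm_daily_bytes / turbines
    per_turbine_per_second = per_turbine_daily / 86_400   # seconds in a day

    print(f"per turbine: {per_turbine_daily / 10**9:.0f} GB/day, "
          f"~{per_turbine_per_second / 1_000:.0f} KB/s sustained")
    print(f"days for one farm to pass the Library of Congress: "
          f"{library_of_congress_bytes / farm_daily_bytes:.1f}")

By this arithmetic a single turbine produces roughly 12 GB a day, and one farm overtakes the Library of Congress total in about two and a half days.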

Big industry, big data, big analytics

Photo: South Point wind farm
By 2020, the biggest change for industrial businesses will be in how they use new software-based services to manage this influx of critical data.

Companies like Amazon, Facebook and Google understand that existing data platforms – the majority of which were designed to handle traditional back-office IT requirements – are not adequate to meet the on-demand needs of their customers. Thus they have developed new architectures (software and hardware) that can manage the ever-growing amount and pace of data generation. They have learned to manage large volumes of consumer information to provide new services, and in doing so, have fundamentally changed the consumer landscape.

But the industrial sector needs to improve on what the consumer internet started. The next leap forward for industry will be all about more agile development, mastering data science and becoming proficient at repeatable processes. There is a need to create software and a platform that is equivalent to, if not better than, what current consumer companies have built.

Photo: Oil refinery

When compared to other sectors (e.g., government, financial services, and retail), industrial data is different: it is expected to grow at twice the rate of any other big data segment over the next ten years. Its creation and use are faster, safety considerations are more critical, and security environments are more restrictive. Computation requirements are also different. Industrial analytics need to be deployed on machines (sometimes in remote locations) as well as run on massive cloud-based computing environments. The unique and proprietary nature of industrial applications and processes also makes it difficult to properly contextualize the data.

As a result, the integration and synchronization of data and analytics, often in real time, are needed more than in other sectors. Industrial businesses require a big data platform with common software and hardware standards, optimized for these unique characteristics.

Indeed, these requirements are changing how data and information will be used by industrial operators. There are six capabilities that industrial companies must adopt as part of their business strategy to be successful in this new era.

  • Data collection and aggregation. Industrial companies must collect and aggregate data and information from the widest possible range of industrial devices and software systems, as well as from enterprise and web-based systems. They must be able to integrate and normalize different data types (streaming sensor data vs. transactional enterprise data), different response times (once per ten milliseconds vs. once per day), and different business requirements (real-time process optimization vs. less real-time asset optimization), and reconcile their use at different levels of analysis. (A minimal sketch of this normalization step follows this list.)
  • Advanced analytics at the point of need. What’s required is a software-defined machine: the ability for assets to be abstracted into software running in connected virtual environments where analytics are continually tuned to the requirements of specific devices, business processes, and individual roles.
  • Cloud-agnostic deployment independence. Industrial companies need a highly flexible deployment architecture that allows them to mix and match technology deployment methods – and avoid vendor lock-in – as their needs and technological options change. For companies that are bound by regulatory requirements, this may mean supporting private cloud deployments; for other companies, it may mean supporting third-party public clouds from various providers.
  • Extensibility and customizability. Industrial companies need a big data platform that is highly extensible and based on standardized APIs and data models that allow it to adapt to new capabilities, new devices, new data types, and new resources as they become available, while still preserving the capabilities of the legacy systems that continue to impart value.
  • Orchestration. Industrial companies must support the orchestration of information, machine controls, analytics, and people in order to ensure that the different components of the industrial big data world interoperate effectively. For example, a machine-level analytic that detects and responds locally to an operational anomaly must also be able to set in motion other analyses and actions (e.g., rescheduling flights or moving spare parts) across the network in order to prevent knock-on problems throughout the system. This requires the ability to self-tune and adapt as data, processes, and business models change. (A sketch of this pattern also follows the list.)
  • Modern user experience. Industrial companies must deliver the above components within the context of a modern user experience that is no longer bound to a desktop; this includes supporting a wide range of mobile devices and user interaction models, as well as ensuring that the user experience is tailored to the individual’s role and requirements at a particular time and location.
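
As a rough illustration of the first capability above, the sketch below (in Python) normalizes two of the data types mentioned – high-frequency streaming sensor readings and low-frequency transactional enterprise records – into one common, time-stamped shape. The record layouts and field names are hypothetical examples chosen for illustration, not any particular GE or vendor API.

    # A minimal sketch of normalizing heterogeneous industrial data into one schema.
    # The record layouts and field names are hypothetical examples, not a real API.
    from dataclasses import dataclass
    from datetime import datetime, timezone

    @dataclass
    class Observation:
        """Common shape for downstream analytics, whatever the source."""
        asset_id: str
        timestamp: datetime
        metric: str
        value: float
        source: str               # "sensor_stream" or "enterprise_tx"

    def from_sensor(raw: dict) -> Observation:
        # Streaming sensor data: high frequency, epoch-millisecond timestamps.
        return Observation(
            asset_id=raw["turbine"],
            timestamp=datetime.fromtimestamp(raw["ts_ms"] / 1000, tz=timezone.utc),
            metric=raw["channel"],
            value=float(raw["reading"]),
            source="sensor_stream",
        )

    def from_enterprise(row: dict) -> Observation:
        # Transactional enterprise data: low frequency, ISO-8601 timestamps.
        return Observation(
            asset_id=row["asset"],
            timestamp=datetime.fromisoformat(row["posted_at"]),
            metric=row["measure"],
            value=float(row["amount"]),
            source="enterprise_tx",
        )

    # Both sources end up in the same stream of Observations for the analytics layer.
    observations = [
        from_sensor({"turbine": "WT-042", "ts_ms": 1381190400123,
                     "channel": "vibration_mm_s", "reading": 4.7}),
        from_enterprise({"asset": "WT-042", "posted_at": "2013-10-07T00:00:00+00:00",
                         "measure": "maintenance_cost_usd", "amount": 1250.0}),
    ]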
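
The orchestration capability can be sketched the same way: a machine-level check runs locally and, only when it detects an anomaly, sets follow-up actions in motion across the network. The threshold, handler names, and actions below are invented for illustration; they are not taken from any real system described in the article.

    # A minimal sketch of the orchestration idea: a machine-level check runs locally
    # and, only on an anomaly, sets network-level follow-up actions in motion.
    # The threshold, handler names, and actions are invented for illustration.
    VIBRATION_LIMIT_MM_S = 7.0            # assumed local threshold, not a real spec

    def schedule_inspection(asset_id: str) -> None:
        print(f"[network] inspection scheduled for {asset_id}")

    def reserve_spare_parts(asset_id: str) -> None:
        print(f"[network] spare parts reserved near {asset_id}")

    # Follow-up actions registered for each kind of local anomaly.
    FOLLOW_UPS = {"vibration_high": [schedule_inspection, reserve_spare_parts]}

    def on_machine_reading(asset_id: str, vibration_mm_s: float) -> None:
        """Runs on or near the machine; only anomalies propagate to the network."""
        if vibration_mm_s > VIBRATION_LIMIT_MM_S:
            for action in FOLLOW_UPS["vibration_high"]:
                action(asset_id)

    on_machine_reading("WT-042", vibration_mm_s=8.3)   # triggers both follow-ups
    on_machine_reading("WT-042", vibration_mm_s=4.1)   # normal reading, no action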

Humans and machines are more interconnected than they have ever been before, and data is the common language between us. The rate at which data is created from the intelligent devices and machines we use every day is forcing corporations to shift their business processes and strategies to adapt to this new connected reality. And the ability to manage this data will not only change the landscape for businesses, but will allow us to navigate successfully through this next wave of innovation.

Oil refinery photo courtesy of Shutterstock user alice-photo

Bill Ruh is VP of GE’s Software and Analytics Center. He will be speaking at GigaOM’s Mobilize conference on October 16th and 17th in San Francisco.

  1. J. Andrew Rogers Monday, October 7, 2013

    While the industrial Internet will drive a wave of innovation, the elephant in the room is that the existing big data platforms were designed for the requirements of relatively static, human-scale data sources. You cannot use these platforms to do machine-scale real-time sensor aggregation and analysis, even in theory, as many companies discover the hard way.

    It is a matter of technology as much as adoption. Traditional big data platforms have no support for the large-scale geospatial analytics that are so important for many types of sensor data; it is not an oversight, the platforms simply cannot support those data models. The join operations required to fuse data sources have famously poor support in current platforms. Many industrial data sources individually and continuously produce tens of millions of records per second that must be processed, stored, and analyzed at petabyte scales while that data is being ingested. Popular open source platforms like Hadoop do not even claim these kinds of capabilities, but they are critical to the basic functionality of the industrial Internet.

    As someone who has been involved in the infrastructure of the industrial Internet for a decade, I am enthusiastically in agreement with the general thrust of the article. However, people need to be aware that it is a different kind of big data problem than the tech industry has dealt with and that many existing tool chains are fundamentally inadequate for the purpose.

  2. I read your article and Andrew Rogers’s comment pointing out the inadequacy of the current consumer big data approach for the needs of the industrial Internet, namely that “traditional big data platforms have no support for the large-scale geospatial analytics important to sensor data”. I paused a bit to think about what I know of the consumer web.

    Aren’t Google Maps and Android devices that report data already doing this in a gradual, discovery-enabled way? First throttle the amount of data and processing you can handle, offer services on top of those inputs, and turn up the knob as you see fit.

    I wonder if this is more an issue of in-house technology know-how and culture than of the existing technology limits Andrew pointed out. Being in the San Francisco Bay Area, and tapped into numerous academic institutions, I get the sense that the really talented and determined people seek the path of “disruption” and the industrial world is playing catch-up to the rest of the fast-moving companies.

    Just my take on this.

  3. Curt Schacker Monday, October 14, 2013

    Adding the dimension of time, notably real time, to big data is a qualitative change to the requirements of the underlying infrastructure. It’s not a matter of making things “faster”; it’s a matter of meeting absolute time deadlines imposed by real-world physical events and the responses to those events. This is not something you increment toward; rather, you need to step back and reconsider the fundamental architecture and the infrastructural components used to implement it.

