The Android of the Smart Grid: openPDC

Can an open source data management system do for the smart grid what Android, Google’s (s GOOG) open mobile operating system, is doing for cell phones: spawn innovation and low-cost development? Execs at the Tennessee Valley Authority (TVA), the largest public power provider in the U.S., seem to think so. TVA analyst Josh Patterson says Android is a good analogy for openPDC, an open source version of a platform that aggregates and processes data about the health of the power grid, and which TVA has helped create. As Android has done for the mobile industry, openPDC will enable utilities and the power industry to develop their own versions of data management services with more flexibility and at a lower cost than proprietary systems, Patterson points out.

The comparison might be a tad abstract, but I think it holds water. OpenPDC could be as key to the deployment of the smart grid as Android has been for the recent sea-change in the mobile industry (see GigaOM Pro’s report on Google’s Mobile Strategy). As the $3.4 billion in stimulus funds are allocated to the winning utilities, the power industry will be rapidly turning to a variety of methods for collecting, storing and processing the petabytes of data that will be unleashed. While many utilities will want to stick with proprietary systems and already established data management players, early-adopter utilities (particularly outside the U.S.) are starting to look at openPDC for a more economical and flexible approach.

So how exactly does it work? TVA’s openPDC specifically looks at information collected by devices called phasor measurement units (PMUs), which gather information like voltage, current and frequency (and the accompanying location) several thousand times a second. The North American Electric Reliability Corp. (NERC), a regulatory agency, designated TVA’s PMU collection system as the national repository of such electrical data, and TVA now aggregates info from more than 100 PMU devices. The phasor data collection system is known as the Super Phasor Data Concentrator (SuperPDC), and last month TVA announced that it had made that system open source, based on Hadoop, an open-source software framework.

Originally developed to analyze large data sets generated by web sites, Hadoop is a low-cost and open way to manage this massive amount of data so that it can be accessed and processed by utilities when they need it. Hadoop has been designed to run on a lot of cheap commodity computers and uses distributed features that make the system more reliable and easier to use than competing proprietary systems for running processes on large sets of data. Computing and web companies like IBM (s IBM), Amazon (s AMZN), Yahoo (s YHOO) and Google (s GOOG) are turning to Hadoop as the open source underpinning for some of their new commercial products.

Not coincidentally, two of the distributed features of Hadoop (its Distributed File System and its Distributed Processing Framework) take a cue from Google: the Google File System, which distributes file system data across multiple servers and maintains multiple copies of all of it, and “MapReduce,” an algorithm popularized by Google that partitions compute jobs out to hundreds or thousands of nodes. In other words, openPDC and Hadoop have roots in the same ideology that created Google’s open source Android platform.
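To make the MapReduce idea a little more concrete, here is a minimal sketch in plain Python. It is not Hadoop's API and not how openPDC is actually implemented; the PMU names, the frequency values and every function name are hypothetical, chosen only to illustrate the pattern. The input is split into chunks as if each went to a separate node, each chunk is mapped to key/value pairs, the pairs are grouped by key, and a reduce step computes one result per key.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical PMU readings as (pmu_id, frequency_hz); values are invented.
readings = [
    ("pmu_a", 60.01), ("pmu_b", 59.98), ("pmu_a", 59.99),
    ("pmu_b", 60.02), ("pmu_c", 60.00), ("pmu_a", 60.00),
]

def partition(records, n_nodes):
    """Split the input into n_nodes chunks, standing in for separate machines."""
    return [records[i::n_nodes] for i in range(n_nodes)]

def map_phase(chunk):
    """Emit one (key, value) pair per record: (pmu_id, frequency)."""
    return [(pmu_id, freq) for pmu_id, freq in chunk]

def shuffle(mapped_chunks):
    """Group all emitted values by key across every chunk."""
    grouped = defaultdict(list)
    for chunk in mapped_chunks:
        for key, value in chunk:
            grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Collapse each key's values into a single result: the mean frequency."""
    return {key: mean(values) for key, values in grouped.items()}

def average_frequency(records, n_nodes=3):
    """Run the whole pipeline in-process; a real cluster would run chunks in parallel."""
    mapped = [map_phase(chunk) for chunk in partition(records, n_nodes)]
    return reduce_phase(shuffle(mapped))

result = average_frequency(readings)
```

Here everything runs on one machine; the point of a framework like Hadoop is that the map and reduce steps can be farmed out to many commodity machines, with the distributed file system keeping copies of the data close to where the work happens.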

That ideology calls for creating a platform that is open to any developer, easy to access and innovate around, and low cost. As analyst Phil Hendrix pointed out in this GigaOM Pro report, Android, Google’s open operating system for mobile, has removed licensing fees that developers normally have to pay and created a platform with significant programming efficiencies. Android had a slow start, but those advantages are now paying off: earlier this year HTC, the world’s fifth largest cell phone maker, said that more than half of its devices will be built around Android. Motorola (s MOTO) announced its first Android phone at our Mobilize conference earlier this year.

TVA’s Ritchie Carroll, who started working on the openPDC project back in 2004 and had a long history in the IT world before he joined the power industry, tells us that openPDC has garnered a lot of interest from utilities, particularly in Brazil, China, South Korea and Russia. In the U.S., TVA plans to build out nine nodes across the nation that will be able to collect PMU data and monitor the health of the entire grid. All that data will be processed and stored by openPDC. “NERC found it very important that a common nomenclature and common metadata was created about this data,” says Carroll.

Patterson and Carroll say they see utilities turning to Hadoop and open source data processing for other forms of smart grid data, like information from smart meters. Patterson doesn’t see utilities turning to traditional data storage and processing options used in the web world, because he says utilities have different needs. Instead, he envisions utilities building Hadoop clusters, like Google does, to store and process the information.

While I agree with Patterson that this would be the best way, the big stumbling block that I see is the utility learning curve. These are cutting-edge systems in the web and mobile industries, so developers at a utility will have to be particularly savvy to set up their own Hadoop clusters. Perhaps this will have to wait for the next generation of utility IT leaders? Carroll and Patterson agree, but think the future is coming soon: “the IT education for the power grid is already under way.”