
Summary:

The massive amount of data emerging from connected digital systems is fundamentally changing everything from Internet search to entertainment, disease management and energy consumption. Here are 10 case studies that highlight the power of big data.


How big data can curb the world’s energy consumption

By Katie Fehrenbacher

The age-old thesis for energy efficiency is “if you measure it, you can manage it.” Once you identify how much energy a person or a building uses, you can work on reducing that consumption. But in a world where a massive amount of energy data is suddenly emerging from sensors, devices and the Web, tapping into energy data will take on a whole new meaning, and big data tools could one day become a fundamental way to help the world curb energy consumption.

Opower’s big data plan

A few startups and early-adopter utilities are already turning to big data tools to deliver key aspects of energy efficiency. Opower, a venture-backed energy software startup with offices in Washington, D.C., and San Francisco, tells me it has been transitioning to using Hadoop, via startup Cloudera, to run heavy analytics on the data it crunches in the cloud.

Opower currently manages about 30 TB of information (and growing), which includes energy data from 50 million utility customers (across 60 utilities) as well as public and private data about weather and demographics, historical utility data, geographical data and much more. The data is stored and processed in a combination of over 20 MySQL databases and a production Hadoop cluster.

Most of Opower’s data is structured, with the exception of its systems-logs processing infrastructure. The data is handled in batch processes that access both MySQL and Hadoop. The current production Hadoop cluster has 12 nodes, which together provide 80 TB of usable space, 72 cores, 0.5 TB of memory and 120 spindles. The Opower analytics team also uses Pentaho analytics and R in its regular business intelligence work.

The result of all of these new tools is that Opower can help utilities’ customers shave about 2 percent off their home energy consumption. It does this by showing customers how well (or poorly) they are doing compared to their peers and neighbors (tapping into shame or guilt) and by suggesting tips such as switching to energy-saving lightbulbs.
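To make the neighbor comparison concrete, here is a minimal sketch in Python of how such a message could be computed. It is an illustration only, not Opower’s actual method: the function names, the message wording and the sample numbers are all assumptions.

```python
from statistics import mean


def peer_comparison(customer_kwh, neighbor_kwh):
    """Percent by which a home's usage is above (+) or below (-) the neighbor average."""
    avg = mean(neighbor_kwh)
    return (customer_kwh - avg) / avg * 100.0


def report_line(customer_kwh, neighbor_kwh):
    """Turn the comparison into a one-line message (hypothetical wording)."""
    diff = peer_comparison(customer_kwh, neighbor_kwh)
    if diff > 0:
        return f"You used {diff:.0f}% more energy than similar homes nearby."
    return f"You used {abs(diff):.0f}% less energy than similar homes nearby."


# Made-up monthly usage figures, in kilowatt-hours.
print(report_line(600, [400, 500, 600]))
```

A real system would presumably pick "similar homes" using the weather, demographic and geographic data described above rather than a hand-typed list.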

Thanks to big data tools like Hadoop and new analytics, Opower can crunch data faster and deliver better results. Opower’s director of West Coast Engineering, Drew Hylbert, and Alex Newman, Opower’s data architect, told me in an interview that the new Hadoop data architecture enables Opower to create new and better algorithms, and it helps the company compare and aggregate disparate data sets all in one place. Hylbert said new Opower services, like one that forecasts a customer’s monthly bill (using three years of historical data), rely heavily on the new data architecture.

Newman, who is helping lead the Hadoop transition, is a data architect wunderkind who came to Opower from Cloudera. Newman said that he joined Opower because it was inspiring to work on issues as important as energy efficiency.

Newman and Hylbert are the first to point out that their current data sets are not exactly “big data” compared to the data sets of huge Internet firms like Google, Facebook and Amazon. But Opower is rapidly growing, adding more utility customers, and it is also adding more data streams. Even with only 30 TB of data running through its system, Opower has been able to get its utility customers to save 700 million kilowatt-hours to date, which is the equivalent of 1 billion pounds of greenhouse gas emissions, or the annual emissions of 90,000 cars.

Big data for energy sensors

While Opower might be one of the firms leading this new trend, it isn’t the only energy project embracing the cloud and big data. An open-source project called the openPDC is a framework for collecting and storing data from power grid sensor devices several thousand times per second; that data includes voltage, current, frequency and location.

The Tennessee Valley Authority started working on an early version of the openPDC in 2004, and the open-source project officially launched in 2009. The developers of the framework realized they would need big data tools like Hadoop to manage and analyze such a large set of data. The openPDC embraces both the Hadoop Distributed File System and MapReduce, and the organizers of the program opted for HDFS because it could run on commodity hardware, which means a lower cost of deployment.
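For readers unfamiliar with the MapReduce pattern, the sketch below shows the idea in plain Python: a map step tags each reading with its sensor, and a reduce step combines the readings per sensor into an average. It does not use Hadoop or any openPDC code; the sensor names, the sample frequency values and the function names are assumptions for illustration.

```python
from collections import defaultdict


def map_reading(reading):
    """Map step: emit (sensor_id, (value, count)) for one reading."""
    sensor_id, frequency_hz = reading
    return sensor_id, (frequency_hz, 1)


def reduce_readings(pairs):
    """Reduce step: sum values and counts per sensor, then average."""
    totals = defaultdict(lambda: [0.0, 0])
    for sensor_id, (value, count) in pairs:
        totals[sensor_id][0] += value
        totals[sensor_id][1] += count
    return {sid: total / n for sid, (total, n) in totals.items()}


# Hypothetical grid-frequency readings: (sensor id, frequency in Hz).
readings = [("pmu-a", 60.0), ("pmu-a", 59.5), ("pmu-b", 60.5)]
averages = reduce_readings(map(map_reading, readings))
print(averages)
```

On a real cluster the map and reduce steps would run in parallel across many machines, which is what makes the pattern a fit for sensors reporting many times per second.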

Why big data about energy is important

The power grid is just beginning to add information technology that will enable computing, sensors, smart meters and software to collect energy data about consumption, available clean power and energy efficiency.

Smart meters — which can read your energy consumption every 15 minutes — are just being installed in major cities. Digital two-way thermostats are appearing on the shelves of big-box retailers like Best Buy. As these devices spread they will generate data that utilities will be able to use to better manage the load on the grid.
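A quick back-of-the-envelope calculation shows how fast those readings add up. The 15-minute interval comes from the paragraph above; the figure of one million meters is a hypothetical chosen only to illustrate scale.

```python
MINUTES_PER_DAY = 24 * 60
READ_INTERVAL_MINUTES = 15  # reading interval cited in the article


def readings_per_day(interval_minutes=READ_INTERVAL_MINUTES):
    """Number of readings one meter produces in a day."""
    return MINUTES_PER_DAY // interval_minutes


def readings_per_year(meters, interval_minutes=READ_INTERVAL_MINUTES, days=365):
    """Total readings produced by a fleet of meters in a year."""
    return meters * readings_per_day(interval_minutes) * days


HYPOTHETICAL_METERS = 1_000_000  # assumption, not a figure from the article

print(readings_per_day())                      # 96 readings per meter per day
print(readings_per_year(HYPOTHETICAL_METERS))  # 35,040,000,000 readings per year
```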

Decades down the road, when the power grid has gone truly digital, there will be an overwhelming explosion of energy data, and smart algorithms and software will be what crunches this wealth of data to help manage energy efficiently. Those managing such a large amount of data will inevitably need the next generation of big data tools. Big data, say hello to big energy.

  1. Reblogged this on Dots Of Color and commented:
    Big data big money!

    1. I don’t get it, what does Big Data have to do with a video card…or is this some lamesauce ad post?

    2. The emergence of this so-called big data phenomenon is also fundamentally changing everything from the way companies operate

      1. Yes, it does. Who controls the most data wins. At least Facebook would like to think so. ;-)

  2. Is gigabytes bytes more then a gigabyte?

    1. nope just a typo, fixed that, thanks!

      1. Typos happen.
        Gil’s “more thEn a gigabyte” is just plain ignorant.

    2. Grammar Police Thursday, March 15, 2012

      If you’re going to complain about a typo, make sure you don’t have any in your immature comment. When you have full mastery of the language, then you’ll be allowed to comment.

  3. infotech ideas Monday, March 12, 2012

    Great info! Bring the expo to SFO as well!

  4. remedy2020@gmail.com Monday, March 12, 2012

    Advertorial ! Advertorial! so fast you sold your soul!

  5. DataStax, more specifically Cassandra, can solve all big data problems.
    http://cassandra.apache.org/
    And its open sourced.

  6. SAP HANA to the rescue!

  7. Reblogged this on cu Lì! and commented:
    great info :)

  8. why do we have to click through so many pages. can you at least provide a way to read it in a single page? (like businessinsider) there is not even a print option and it doesn’t work with readability. i thought more of gigaom. disappointed.

  9. idiots…
    «“We want to unlock the black box of how an artist becomes a star,” White said»
    what makes the charts is good music, not $$$$$ pumped into it 8-X
    just like m$$$$$ can keep wasting billion$ on WP trying to make it a success, it won’t work. its crap, it doesn’t sell
    period

  10. Big Data is a tactical problem. Content and Business Analytics and Intelligence is the logistical problem. One must pay particular attention to Business Process Models, Entity-Relationship Models, and Data Modeling to be able to use ETL and Data Integration Technologies for developing your data storage organization, retrieval, formatting, clustering, Web Caching; and backup, recovery, and archival retention strategies.

  11. Edwin Ritter Wednesday, May 9, 2012

    Reblogged this on Ritter's Ruminations & Ramblings and commented:
    As 2012 reaches the halfway mark, here is a quick view on this hot topic. This is the first of three. Posts on the other trends will follow.

    So ‘big data’ is a hot topic. What is it? Simply stated, everything you do on the web is tracked and creates data. So much data is collected that 90% of the online data was created in just the last two years. This data is stored, sliced, diced and analyzed. The growth in data is due to several things such as proliferation of smart phones and tablets, lower storage costs and improved analytical tools. This article reveals 10 ways in which big data will have an impact.

  12. Thank you for the thoughts on this so far.
    The kate broadwell

  13. Thanks instead of the article. Blogging is replacing main onslaught news for various black people.
    The vpn port

