In her keynote at the Strata conference in Santa Clara, Calif., on Tuesday, Rebecca Shockey, global research leader for business analytics and optimization at IBM’s Institute for Business Value, asked why about a quarter of respondents to a recent IBM survey had not yet started engaging in “big data activities.” Making the business case and showing potential returns on investment turned out to be a major obstacle to adoption, she said. Later sessions at the conference offered some good ideas for those still on the fence.
Putting small energy data in perspective
Barry Fischer, head writer at the data blog from Opower, the company that crunches data for utility companies servicing almost half of all households in the United States, passed around sample bills that chart consumers’ year-to-year energy use and show how consumers compare with their neighbors. Besides the information for bills, Opower also provides alerts if consumers are on track to get a high energy bill and a Facebook app for consumers to compare their energy use with that of their friends. The data Opower collects — consumption figures from utilities, preferences from users and third-party weather, housing and demographic statistics — also enable Fischer and other company bloggers to present simple and consumer-friendly correlations, such as the fact that Yahoo Mail users typically pay $110 more per year in energy bills than those who use Google Mail. Taken together, Opower’s uses of data show how millions of people can benefit from contributing their own individual data.
In January, GigaOM’s Katie Fehrenbacher named Opower one of her 13 energy data startups to watch in 2013.
Etsy bakes a Funnel Cake
In another session, three data engineers from Etsy revealed how they use Hadoop to detect issues with various functions on the website, and they talked about building a program other employees use to optimize the parts of the site that generate the most revenue. Etsy is already a big Hadoop user; at one point, engineers ran 5,000 Hadoop jobs for a variety of purposes in a single month. A popular term there is the attribution funnel, or the path customers take as they buy products on the site. The data engineers wanted other employees to be able to identify the steps where customers get stuck before purchasing, such as email address verification to establish new accounts. So they built a program called Funnel Cake, which scales better and delivers real-time information faster than Hadoop, said engineer Wil Stuckey. Running Funnel Cake, employees can streamline the process and increase the percentage of site visitors who end up buying products. Beyond that, they can see which kinds of pages lead to the most sales and focus more or less attention on browsing and searching functions or storefronts from product makers.
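To make the funnel idea concrete, here is a minimal sketch in Python of how step-by-step drop-off might be computed from a log of visitor events. It is not Etsy's Funnel Cake, whose internals were not described in the session; the step names, the event format and the functions `funnel_report` and `biggest_drop` are all hypothetical and chosen only for illustration.

```python
# Hypothetical funnel steps, in the order a buyer would complete them.
FUNNEL_STEPS = ["view_listing", "add_to_cart", "verify_email", "purchase"]


def funnel_report(events, steps=FUNNEL_STEPS):
    """Summarize how many visitors reach each funnel step.

    events: iterable of (visitor_id, step_name) pairs (an assumed format).
    Returns a list of (step, visitors_reached, rate_from_previous_step),
    where the rate is None for the first step.
    """
    # Collect the set of steps each visitor has completed.
    completed = {}
    for visitor, step in events:
        completed.setdefault(visitor, set()).add(step)

    report = []
    previous = None
    for i, step in enumerate(steps):
        required = steps[: i + 1]
        # A visitor counts for a step only if every earlier step is also done.
        reached = sum(
            1 for done in completed.values() if all(s in done for s in required)
        )
        if previous is None:
            rate = None
        else:
            rate = reached / previous if previous else 0.0
        report.append((step, reached, rate))
        previous = reached
    return report


def biggest_drop(report):
    """Return the step with the lowest conversion rate from the step before it."""
    candidates = [row for row in report if row[2] is not None]
    return min(candidates, key=lambda row: row[2])[0]
```

Fed a log in which most visitors add an item to their cart but few get past email verification, `biggest_drop` would point to `verify_email`, the kind of sticking point the Etsy engineers described.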
Vending machines, advertisements and babies
Other use cases on display at the conference ranged from vending machines to babies. One company has installed sensors on its vending machines and now monitors the resulting data in real time to spot theft and cut down on purchasing new machines to replace stolen ones. An internet advertising company now uses highly scalable software based on Hadoop MapReduce, Apache Nutch and Apache Solr to detect traffic sources for advertisements, which brings in new revenue. And a hospital’s neonatal intensive care unit has implemented a visualization tool for real-time health statistics that shows signs of “baby crashing” and can thereby help reduce mortality rates.
Executives from Aetna, Williams-Sonoma, Facebook and other companies will discuss big data use cases at the GigaOM Structure:Data conference in New York on March 20-21.