And so the backlash begins. Hadoop, the open-source framework for handling tons of distributed data, does a lot, and it is a big draw for businesses wanting to leverage the data they create and the data that is created about them. That makes it a hot button as well for the IT vendors who want to capture those customers. Virtually every tech vendor from EMC to Oracle to Microsoft has announced a Hadoop-oriented “big data” strategy in the past few months.
But here comes the pushback. Amid the hype, some vendors are starting to point out that building and maintaining a Hadoop cluster is complicated and — given demand for Hadoop expertise — expensive. Larry Feinsmith, the managing director of JPMorgan Chase’s office of the CIO, told Hadoop World 2011 attendees recently that Chase pays a 10 percent premium for Hadoop expertise — a differential that others said may be low.
Manufacturing, which typically generates a ton of relational and nonrelational data from ERP and inventory systems, the manufacturing operations themselves, and product life cycle management, is a perfect use case for big data collection and analytics. But not all manufacturers are necessarily jumping into Hadoop.
General Electric’s Intelligent Platforms Division, which builds software for monitoring and collecting all sorts of data from complex manufacturing operations, is pushing its new Proficy Historian 4.5 software as a quicker, more robust way to do what Hadoop promises to do.
“We have an out-of-the-box solution that is performance comparable to a Hadoop environment but without that cost and complexity. The amount of money it takes to implement Hadoop and hire Hadoop talent is very high,” said Brian Courtney, the GM of enterprise data management for GE.
Proficy Historian handles relational and nonrelational data from product manufacturing and testing, including data like the waveforms generated in the process. GE has a lot of historical data about what happens in the production and test phases of building industrial equipment such as gas turbines, and that data could be put to good use in anticipating problems that can occur down the pike.
For example, the software can look at the electrical signature generated when a gas turbine starts up, said Courtney. “The question is what is the digital signature of that load in normal startup mode and then what happens if there’s an anomaly? Have you ever seen that anomaly before? Given this waveform, you can go back five years and look at other anomalies and whether they were part of a subsequent system failure.”
If similar anomalies caused a system failure, you can examine how much time it took after the anomaly for the failure to happen. That kind of data lets the manufacturer prioritize fixes.
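The lookup Courtney describes can be sketched in a few lines of Python. This is only an illustration of the idea, not how Proficy Historian works internally: the `AnomalyRecord` layout, the function names, the sample data, and the choice of cosine similarity with a 0.95 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import sqrt
from typing import List, Optional


@dataclass
class AnomalyRecord:
    """One past anomaly (hypothetical record layout, not a GE schema)."""
    seen_at: datetime
    waveform: List[float]
    failed_at: Optional[datetime] = None  # None if no failure followed


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Similarity of two equal-length waveforms (1.0 means same shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similar_anomalies(waveform, history, threshold=0.95):
    """Return past anomalies whose waveform resembles the new one.

    The threshold is an assumed cutoff chosen for this example.
    """
    return [
        rec for rec in history
        if len(rec.waveform) == len(waveform)
        and cosine_similarity(waveform, rec.waveform) >= threshold
    ]


def lead_times(matches):
    """Time from anomaly to failure, for matches that ended in a failure."""
    return [rec.failed_at - rec.seen_at
            for rec in matches if rec.failed_at is not None]


# Made-up history: two similar anomalies that preceded failures,
# and one differently shaped anomaly that did not.
first = datetime(2008, 3, 1)
second = datetime(2009, 6, 5)
history = [
    AnomalyRecord(first, [0.0, 1.0, 3.0, 1.0], first + timedelta(days=10)),
    AnomalyRecord(second, [0.0, 1.1, 2.9, 1.0], second + timedelta(days=4)),
    AnomalyRecord(datetime(2010, 1, 20), [1.0, 1.0, 1.0, 1.0]),
]

new_waveform = [0.0, 1.0, 3.1, 0.9]
matches = similar_anomalies(new_waveform, history)
times = lead_times(matches)
shortest = min(times) if times else None
print(f"{len(matches)} similar anomalies; shortest lead time: {shortest}")
```

The shortest lead time is the number that matters for prioritizing: the less warning a given kind of anomaly has historically provided before a failure, the sooner it needs attention.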
The new release of Proficy software seeks to handle larger volumes of data, supporting up to 15 million tags, up from two million in the previous release.
Joe Coyle, the CTO of Capgemini, the big systems integrator and consulting company, said big data is here to stay but that many businesses aren’t necessarily clued in to what that means. “After cloud, big data is question number two I get from customers. CIOs will call and say, ‘Big data, I need it. Now what is it?’”
Coyle agrees that Hadoop has big promise but is not quite ready for prime time. “It’s expensive and some of the tools aren’t there yet. It needs better analytics reporting engines. Right now, you really need to know what to analyze. Hadoop brings a ton of data, but until you know what to ask about it, it’s pretty much garbage in, garbage out.”
The companies that make the best use of big data are those that know what to ask about it.
For example, Coyle explained, “Victoria’s Secret harvests information from Facebook and can tell you in detail about every 24-year-old who bought this product in the last 12 months. It’s very powerful, but that’s because of the humans driving it.”