Is big data new, or have we forgotten its old heroes?


In the past few years, big data has essentially gone from zero to hero in the enterprise tech world. Except for one small thing: it hasn’t, really. Many seem to have forgotten that big data was around, and being put to good use, well before it became the buzzword du jour.

Without question, enterprise data volumes have grown immensely, and organizations have indeed begun to recognize the value hidden in these larger stores. According to a 2011 study by the Aberdeen Group, organizations that effectively integrate complex data are able to use up to 50 percent larger data sets for business intelligence and analytics, to integrate external unstructured data into business processes twice as successfully, and to cut error rates nearly in half. The connection between a company's success and its ability to leverage big data is very clear. In that sense, the media firestorm around big data has been completely valid.

Don’t believe the NoSQL hype

However, much of the buzz treats big data as though it remains to be conquered. The hype surrounding NoSQL, for instance, makes it seem like it’s the only database management system that can effectively manage big data and that, without it, immense value would remain untapped.

The first iterations of the NoSQL movement provided knee-jerk solutions for companies such as Amazon and eBay that needed to solve a crushing scaling problem – and fast. Although they solved those scaling issues, NoSQL technologies aren't the best solution for today's complex enterprise-class applications. For one, NoSQL technologies all use their own proprietary coding interfaces. So moving to NoSQL creates headaches and a drain on resources, because database administrators and programmers have to learn new skill sets.

Additionally, in Hadoop, for example, links between data are not automated as they are in relational or object-oriented engines; the developer must join data sets manually, writing custom code to perform the set operation. Plus, these newer technologies don't yet play well with enterprise-class management and monitoring protocols, making them a high-risk choice for mission-critical applications. Since these are only a few of NoSQL's pitfalls, thinking that a solution to big data problems still doesn't exist is in some ways warranted.
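To make the manual-join point concrete, here is a minimal sketch in plain Python of the "reduce-side join" pattern a Hadoop developer typically writes by hand. In SQL this would be a one-line JOIN; here the set operation is custom code. The record shapes (customers and orders) are hypothetical, chosen only for illustration:

```python
from collections import defaultdict

# Hypothetical input data sets, standing in for two files in HDFS.
customers = [("c1", "Acme Corp"), ("c2", "Globex")]
orders = [("o1", "c1", 250.0), ("o2", "c1", 75.0), ("o3", "c2", 120.0)]

# Map phase: tag every record with its join key (the customer id).
tagged = [(cid, ("customer", name)) for cid, name in customers]
tagged += [(cid, ("order", (oid, total))) for oid, cid, total in orders]

# Shuffle: group records by key, as the framework does between map and reduce.
groups = defaultdict(list)
for key, value in tagged:
    groups[key].append(value)

# Reduce phase: stitch each group back together by hand.
joined = []
for cid, values in groups.items():
    names = [v for tag, v in values if tag == "customer"]
    for tag, v in values:
        if tag == "order":
            oid, total = v
            joined.append((oid, names[0], total))

print(sorted(joined))
# [('o1', 'Acme Corp', 250.0), ('o2', 'Acme Corp', 75.0), ('o3', 'Globex', 120.0)]
```

Every new relationship a query needs means another round of this boilerplate, which is exactly the cost that relational and object-oriented engines absorb for you.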

The reality, however, is quite the opposite.

Other technologies have been tackling big data for years

The zero-to-hero perception of big data neglects the fact that many companies and industries jumped on the big data bandwagon long ago. When the amount and complexity of the data became too much for relational database management systems (RDBMSs) to handle, big data pioneers began using somewhat more obscure technologies such as object-oriented systems and object databases (ODBs).

At this point, you might be thinking, “Okay, but can a lesser-known technology really tackle big data better than the latest cutting-edge innovation?”  Based on the following three examples, I’d say the answer is “yes.”

1. Big data on the rails

The U.S. Federal Railroad Administration, expecting rail freight traffic to double by 2020, created the RailEdge Movement Planner application to perform analysis of highly complex object models more than 30 times faster than an RDBMS. In shipping, time really is money: fuel consumption and delivery times are determined almost exclusively by the availability of an aging infrastructure. Yet there is money to be made by mastering minutiae and effecting change in real time rather than relying on predicted outcomes.

RailEdge organizes minute details and readings from a vast network of information sensors and physical items — the number of engines and cars per train, payloads, rail traffic, congestion at depots, etc. — all against the backdrop of time. That is worth repeating: analyzing all this data against time creates really big data. RailEdge has improved average train velocity and fuel efficiency, saving about $200 million in annual capital and expenses.

2. Big data in the air

Processing airline tickets poses an even bigger big data challenge. The massive amount of transactional throughput of Travelocity.com and other online ticketing services puts huge pressure on databases to handle every detail quickly and with perfect accuracy.

To solve this problem, Travelocity uses the SabreSonic Inventory System, the world's most popular ticketing inventory solution. The big-data needs of the participating airlines — 30, at last count — require an object-oriented system to maintain high performance and minimize IT infrastructure costs.

Harnessing big data to quickly and accurately process millions of transactions per day has saved Travelocity.com money and boosted the brand's reputation. The system allowed the company to switch from multi-million-dollar mainframe hardware to relatively low-cost commodity infrastructure without sacrificing performance. Even more impressive? Reliability: in the almost four years since the system went live, it has never gone offline. Ever.

3. Big data on ice

My favorite story about the forgotten heroes of big data is one that delivered value beyond mere dollars and cents.

Tracking the effects of Arctic ice sheets on the world’s climate is an intricate process that requires scientists to monitor incredibly minute pieces of both historical and contemporary data in petabyte volumes. The National Snow and Ice Data Center’s (NSIDC) scientists needed to process billions of complex data objects to analyze how changes in Greenland’s ice sheet over time have affected global climate. To boot, the system also needed to enable real-time queries of the data to answer new questions about the ice sheet’s changes as the researchers made new inferences.

Traditional databases can’t do this. Hadoop can’t do this. In those systems, the relationships have to be rebuilt for every query. The only way for the NSIDC to do this was to drive an object-oriented model deep into the database’s architecture.
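The difference can be sketched in a few lines of Python. The classes below are hypothetical (not NSIDC's actual schema); they illustrate the general object-database idea that each object holds direct, persisted references to its related objects, so a new question is answered by navigating existing links rather than rebuilding a join for every query:

```python
class GridCell:
    """One cell of a hypothetical ice-sheet grid."""
    def __init__(self, cell_id):
        self.cell_id = cell_id
        self.readings = []  # direct references, persisted by the object database

class Reading:
    """One thickness measurement, linked both ways to its grid cell."""
    def __init__(self, cell, year, thickness_m):
        self.cell = cell              # navigable link back to the grid cell
        cell.readings.append(self)    # and forward from the cell to its readings
        self.year = year
        self.thickness_m = thickness_m

cell = GridCell("greenland-042")
Reading(cell, 1990, 2.31)
Reading(cell, 2010, 2.05)

# A new question — how much has this cell thinned? — is answered by
# following pointers, with no join rebuilt at query time.
thinning = cell.readings[0].thickness_m - cell.readings[-1].thickness_m
print(round(thinning, 2))  # 0.26
```

Baking these relationships into the storage layer is what "driving an object-oriented model deep into the database's architecture" buys: traversal cost stays constant per link, however many billions of objects sit in the store.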

Without the system that the NSIDC developed, processing this amount of information would have taken years, rendering any results a matter of historical record rather than actionable intelligence.

Harnessing big data to create real value is fast becoming a necessity across every industry. But while today's media hype surrounds predictive analytics and NoSQL technologies, we shouldn't forget that tried-and-true technologies are out there that have been the silent heroes of big data for years.

Robert Greene, vice president of technology at Versant Corporation, has more than 15 years experience working on high-performance, mission-critical software systems. He provides the technical direction for Versant’s database technology, used by Fortune 1000 companies, including Dow Jones, Ericsson and China Telecom. 


For more on this big data phenomenon, be sure to check out GigaOM’s Structure:Data Conference in New York City on March 21 and 22.
