
Summary:

Seemingly overnight, big data became the behemoth to conquer. But the truth is, tried-and-true technologies have been tackling the problem for years. Versant’s Robert Greene pays tribute to three unsung heroes of big data.


In the past few years, big data has essentially gone from zero to hero in the enterprise tech world. Except for one small thing: it hasn’t, really. Many seem to have forgotten that big data was around, and being put to good use, well before it became the buzzword du jour.

Without question, enterprise data volumes have grown immensely, and organizations have indeed begun to recognize the value hidden in these larger stores. According to a 2011 study by the Aberdeen Group, organizations that effectively integrate complex data are able to use up to 50 percent larger data sets for business intelligence and analytics, to integrate external unstructured data into business processes twice as successfully, and to cut error rates nearly in half. The connection between a company’s success and its ability to leverage big data is very clear. In that sense, the media firestorm around big data has been completely valid.

Don’t believe the NoSQL hype

However, much of the buzz treats big data as though it remains to be conquered. The hype surrounding NoSQL, for instance, makes it seem like it’s the only database management system that can effectively manage big data and that, without it, immense value would remain untapped.

The first iterations of the NoSQL revolution provided knee-jerk solutions for companies such as Amazon and eBay that needed to solve a crushing scaling problem – and fast. But although it solved those scaling issues, NoSQL isn’t the best solution for today’s complex enterprise-class applications. For one, NoSQL technologies all use their own proprietary coding interfaces, so moving to NoSQL creates headaches and a drain on resources: database administrators and programmers have to learn new skill sets.

Additionally, in Hadoop, for example, links between data are not automated as they are in relational or object-oriented engines; the developer must join them manually by writing custom code to perform the set operation. These newer technologies also don’t yet play well with enterprise-class management and monitoring protocols, making them high-risk for mission-critical applications. Given these and other pitfalls, it’s understandable that many still believe no real solution to big data problems exists.
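To see what that manual work looks like, here is a minimal sketch of a reduce-side join, written as plain Python in the spirit of a Hadoop Streaming job rather than actual Hadoop code. The customer and order record formats are hypothetical; the point is that the developer, not the engine, writes the tagging, sorting and matching logic that a relational or object-oriented engine performs automatically.

```python
# Minimal sketch of a manual reduce-side join, in the spirit of a
# Hadoop Streaming job. The "customers" and "orders" record formats
# are hypothetical; the developer, not the engine, writes the join.
from itertools import groupby
from operator import itemgetter

def mapper(source, line):
    """Emit (join_key, tagged_record); the tag records which input a row came from."""
    fields = line.split(",")
    if source == "customers":              # format: customer_id,name
        yield fields[0], ("C", fields[1])
    else:                                  # format: order_id,customer_id,amount
        yield fields[1], ("O", fields[0], fields[2])

def reducer(key, tagged_records):
    """Cross-match the customer and order records sharing one join key."""
    names = [r[1] for r in tagged_records if r[0] == "C"]
    orders = [r[1:] for r in tagged_records if r[0] == "O"]
    for name in names:
        for order_id, amount in orders:
            yield key, name, order_id, amount

customers = ["42,Acme Corp", "7,Globex"]
orders = ["1001,42,250.00", "1002,42,75.50", "1003,7,19.99"]

# Simulate the shuffle/sort phase Hadoop runs between map and reduce.
mapped = [kv for line in customers for kv in mapper("customers", line)]
mapped += [kv for line in orders for kv in mapper("orders", line)]
mapped.sort(key=itemgetter(0))

for key, group in groupby(mapped, key=itemgetter(0)):
    for row in reducer(key, [v for _, v in group]):
        print(row)  # e.g. ('42', 'Acme Corp', '1001', '250.00')
```

In a real Hadoop job the framework distributes the map, shuffle and reduce phases across a cluster, but the join logic itself remains the developer’s custom code.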

The reality, however, is quite the opposite.

Other technologies have been tackling big data for years

The zero-to-hero perception of big data neglects the fact that many companies and industries jumped on the big data bandwagon long ago. When the amount and complexity of the data became too much for relational database management systems (RDBMSs) to handle, big data pioneers began using somewhat more obscure technologies such as object-oriented systems and databases (ODBs).

At this point, you might be thinking, “Okay, but can a lesser-known technology really tackle big data better than the latest cutting-edge innovation?”  Based on the following three examples, I’d say the answer is “yes.”

1. Big data on the rails

The U.S. Federal Railroad Administration, expecting rail freight traffic to double by 2020, created the RailEdge Movement Planner application to perform analysis of highly complex object models more than 30 times faster than a relational database. In shipping, time really is money: fuel consumption and delivery times are determined almost exclusively by the availability of an aging infrastructure. Yet there is money to be made by mastering minutiae and effecting change in real time rather than relying on predicted outcomes.

RailEdge organizes minute details and readings from a vast network of sensors and physical items — the number of engines and cars per train, payloads, rail traffic, congestion at depots, etc. — all against the backdrop of time. That is worth repeating: analyzing all this data against time creates really big data. RailEdge has improved average train velocity and fuel efficiency, saving about $200 million in annual capital and expenses.
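As a rough illustration of what such a model looks like, here is a hypothetical, time-indexed object sketch; the classes below are illustrative, not RailEdge’s actual schema. The links from a train to its cars and its time-stamped readings are stored directly as object references, so analyzing the data against time becomes a matter of navigating the graph:

```python
# Hypothetical time-indexed object model; illustrative only,
# not RailEdge's actual schema.
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

@dataclass
class Reading:
    timestamp: datetime
    speed_mph: float
    fuel_rate_gph: float   # fuel burn, gallons per hour

@dataclass
class Car:
    car_id: str
    payload_tons: float

@dataclass
class Train:
    train_id: str
    engines: int
    cars: List[Car] = field(default_factory=list)
    readings: List[Reading] = field(default_factory=list)

    def avg_speed_between(self, start: datetime, end: datetime) -> float:
        """Navigate straight to the readings in a time window; in an ODB
        these links are persisted, so no join is rebuilt at query time."""
        window = [r.speed_mph for r in self.readings if start <= r.timestamp <= end]
        return sum(window) / len(window) if window else 0.0

train = Train("T-1234", engines=2)
train.readings.append(Reading(datetime(2012, 3, 1, 8, 0), 42.5, 180.0))
train.readings.append(Reading(datetime(2012, 3, 1, 9, 0), 47.0, 165.0))
print(train.avg_speed_between(datetime(2012, 3, 1, 0, 0), datetime(2012, 3, 2, 0, 0)))  # 44.75
```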

2. Big data in the air

Processing airline tickets poses an even bigger big data challenge. The massive transactional throughput of Travelocity.com and other online ticketing services puts huge pressure on databases to handle every detail quickly and with perfect accuracy.

To solve this problem, Travelocity uses the SabreSonic Inventory System, the world’s most popular ticketing inventory solution. The big-data needs of the participating airlines — 30, at last count — require an object-oriented system to maintain high performance and minimize IT infrastructure costs.

Harnessing big data to process millions of transactions per day quickly and accurately has saved Travelocity.com money and boosted the brand’s reputation. The system allowed the company to switch from multi-million-dollar mainframe hardware to relatively low-cost commodity infrastructure without sacrificing performance. Even more impressive? Reliability: since the system was turned on almost four years ago, it has never gone offline. Ever.

3. Big data on ice

My favorite story about the forgotten heroes of big data is one that delivered value beyond mere dollars and cents.

Tracking the effects of Arctic ice sheets on the world’s climate is an intricate process that requires scientists to monitor incredibly minute pieces of both historical and contemporary data in petabyte volumes. The National Snow and Ice Data Center’s (NSIDC) scientists needed to process billions of complex data objects to analyze how changes in Greenland’s ice sheet over time have affected global climate. The system also needed to support real-time queries of the data, so researchers could answer new questions about the ice sheet’s changes as they made new inferences.

Traditional databases can’t do this. Hadoop can’t do this. In those systems, the relationships have to be rebuilt for every query. The only way for the NSIDC to do this was to drive an object-oriented model deep into the database’s architecture.
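The contrast is easiest to see in a small, hypothetical sketch; the data shapes below are illustrative, not NSIDC’s actual model. The relational style re-derives the relationship between a grid cell and its measurements on every query, while the object style stores the reference once and simply follows it:

```python
# Hypothetical contrast between join-per-query and navigational access;
# illustrative data shapes, not NSIDC's actual model.
from typing import List, Tuple

# Relational style: the cell-to-measurement relationship is re-derived
# (scanned or joined) every time the question is asked.
measurements: List[Tuple[int, str, float]] = [  # (measurement_id, cell_id, thickness_m)
    (1, "G-001", 2100.0),
    (2, "G-001", 2085.5),
    (3, "G-002", 1500.2),
]

def thickness_for_cell_relational(cell_id: str) -> List[float]:
    return [t for (_, c, t) in measurements if c == cell_id]

# Object style: the reference is stored once, then simply followed.
class GridCell:
    def __init__(self, cell_id: str):
        self.cell_id = cell_id
        self.measurements: List[float] = []  # in an ODB, persisted object links

cell = GridCell("G-001")
cell.measurements.extend([2100.0, 2085.5])

print(thickness_for_cell_relational("G-001"))  # join work repeated on every query
print(cell.measurements)                       # direct navigation, no rebuild
```

With billions of objects, repeating that derivation for every query is exactly the cost the NSIDC could not afford.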

Without the system that the NSIDC developed, processing this amount of information would have taken years, rendering any results a matter of historical record rather than actionable intelligence.

Harnessing big data to create real value is fast becoming a necessity across every industry. But while today’s media hype surrounds predictive analytics and NoSQL technologies, we shouldn’t forget that tried-and-true technologies have been the silent heroes of big data for years.

Robert Greene, vice president of technology at Versant Corporation, has more than 15 years of experience working on high-performance, mission-critical software systems. He provides the technical direction for Versant’s database technology, which is used by Fortune 1000 companies including Dow Jones, Ericsson and China Telecom.


For more on this big data phenomenon, be sure to check out GigaOM’s Structure:Data Conference in New York City on March 21 and 22.

Comments

  1. Steve Ardire Sunday, March 11, 2012

    > Don’t believe the NoSQL hype
    Well yes, some good points here, and an attempted defense of object-oriented systems and databases (ODBs).

    However, the answer is that Big Data solutions will be offered across SQL, NoSQL (they have big $ and great hype machines) and ODB; my friends in the semtech community have big data solutions on RDF triplestores.

    see Breaking into the #NoSQL Conversation – http://bit.ly/yrnfBc by @chirping_gonzo #SemTechBiz #semanticweb #BigData

  2. Why no mention that these are all Versant customers? Humility, I assume.

  3. Good point. Even before it had a moniker, many of us identified the issue. Great to see the industry finally adopting the “3Vs” of big data over 11 years after Gartner first published them. For reference, and a copy of the original article I wrote in 2001, see: http://blogs.gartner.com/doug-laney/deja-vvvue-others-claiming-gartners-volume-velocity-variety-construct-for-big-data/. –Doug Laney, VP Research, Gartner, @doug_laney

    1. Robert Greene Monday, March 12, 2012

      Doug, hats off to you for nailing it well ahead of the general market. Truly amazing how off base the industry got by simply following a bean-counter approach to competitive business practices instead of letting our technical folks make the technical call. Well, the times are a-changing; free information flow is making us smarter. Let’s hope going forward we can have the good sense to use the right tool for the job at hand.

      -Robert

  4. Surprise! The seller of an object-oriented database touts object-oriented databases as the silver bullet.

    What we learned instead is that there is no silver bullet. Go polyglot, go multidatabase or die.

    1. Robert Greene Monday, March 12, 2012

      Hopefully, this did not come across to others in the same way, as that is not what I intended. There is no silver bullet in data management. It’s about using the right tool for the job. For many years, managers were forcing the one-hammer rule on their development organizations (a cost justification), spending millions on enterprise licenses and then telling the technical folks it’s “free” and therefore you must use it … without fully understanding the external factors such as time to market, competitive advantage, system footprint costs (more servers, power), administration costs, software license multiplier effects, etc. NoSQL has changed that one-hammer mentality, which is a good thing. The article is intended to speak to Big Data in the face of true information model complexity. The ODB is particularly well suited to solving that kind of problem. The ODB’s underlying architecture is key:value, but the implementation was specifically done to address the reality of richly linked data models. So, for that class of problem, you might say it’s a silver bullet, but it’s not the solution to every data management problem, just as Hadoop is not the solution to all data management problems.

      -Robert

  5. What’s a supposedly neutral industry analyst group doing promoting an advertorial like this?

