
More Data, More Problems: How Deduplication Can Drive Business Value

Across organizations in nearly every industry, data is growing at an exponential rate, and CIOs are being tasked with making sense of all this information; it’s what keeps many of them up at night. The result, as Gary Orenstein notes in a recent post, is that we’re in the midst of a data mining renaissance. But before businesses can make sense of the data and start using it smartly to deliver business value, they must first manage and consolidate the data being transmitted throughout their organizations.

Like the foundation of a house, a company’s data infrastructure should rest on reliable data and sound data governance. Governance standards provide the rules for structuring information so that data can be reliably and consistently read, sorted, indexed, retrieved and, most importantly, trusted by the end user. As the saying goes, “garbage in, garbage out.” But in many organizations, sound data governance is severely lacking, and in its absence companies are grappling with data inconsistencies.

Data is processed in operational systems and extracted into analytics repositories. The problem is that these repositories are often siloed within individual departments: pockets of the organization keep their own copy of the data, each holding slightly different information. In fact, some large telecoms and financial institutions have as many as a dozen disparate copies of their high-volume transactional data. The solution is to consolidate that data into a single location (otherwise known as deduplicating), eliminating redundant repositories and gaining real-time access to information, both historic and current, across the enterprise.
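To make that consolidation step concrete, here is a minimal Python sketch, assuming each department holds its own list of transaction records keyed by a shared txn_id and that the most recently updated copy wins. The field names and the merge rule are illustrative assumptions, not any vendor’s implementation.

```python
# A minimal sketch of consolidating duplicate departmental copies.
# txn_id, amount, updated_at and "latest copy wins" are assumptions.

from datetime import datetime

def consolidate(copies):
    """Merge departmental copies into one canonical record per txn_id,
    keeping the most recently updated version of each transaction."""
    canonical = {}
    for repo in copies:               # each repo is one department's copy
        for rec in repo:
            key = rec["txn_id"]
            kept = canonical.get(key)
            if kept is None or rec["updated_at"] > kept["updated_at"]:
                canonical[key] = rec  # newer copy supersedes the older one
    return list(canonical.values())

finance = [{"txn_id": 1, "amount": 100.0, "updated_at": datetime(2011, 5, 1)}]
marketing = [{"txn_id": 1, "amount": 100.0, "updated_at": datetime(2011, 5, 3)},
             {"txn_id": 2, "amount": 42.5, "updated_at": datetime(2011, 5, 2)}]

print(consolidate([finance, marketing]))  # two records, one per txn_id
```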

Beyond governance and deduplication, the immense cost of data breaches is driving companies to invest in data consolidation for security reasons. While data encryption continues to play a critical role, it is easier to monitor and secure a single copy of data in a single data warehouse than siloed repositories scattered across the enterprise. Data control and management also become more effective, because IT organizations can grant each department access only to the data it requires from the warehouse.
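As a rough illustration of that least-privilege model, the sketch below filters one central dataset down to the columns a given department is entitled to see. The department names and column lists are hypothetical.

```python
# Illustrative only: a least-privilege view over one central dataset.
# Department names and visible columns are hypothetical assumptions.

VISIBLE_COLUMNS = {
    "marketing": {"customer_id", "segment", "last_purchase"},
    "risk":      {"customer_id", "exposure", "product_line"},
}

def department_view(rows, department):
    """Return only the columns this department is entitled to see."""
    allowed = VISIBLE_COLUMNS[department]
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]

warehouse = [{"customer_id": 7, "segment": "retail",
              "last_purchase": "2011-04-30", "exposure": 1200.0,
              "product_line": "mortgage"}]

print(department_view(warehouse, "marketing"))  # no exposure data leaks out
```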

“Clean” and integrated data can serve as the backbone for some truly innovative analytics work happening at organizations around the world. In his post, Gary Orenstein talks about the explosion of website data and the new ways to “measure and monetize this information.” The next step is integrated web intelligence: taking that web data and analyzing it against other enterprise sources, such as the retail branch, supply chain or contact center.

For example, some customers like to browse online and put items in “virtual shopping carts,” then buy at their local store. Traditional web analytics struggles to identify this buying pattern, and therefore to market to these customers appropriately. By integrating web analytics with other enterprise data sources, that customer can be correctly identified as both a web browser and a brick-and-mortar buyer.
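A minimal sketch of that integration, assuming web sessions and store purchases can be matched on a shared customer_id; the field names and matching rule are illustrative assumptions.

```python
# A sketch of integrated web intelligence: match online browsing
# sessions with in-store purchases for the same customer.
# Field names and the matching rule are illustrative.

web_sessions = [
    {"customer_id": 7, "viewed_item": "camera", "carted": True},
    {"customer_id": 8, "viewed_item": "tripod", "carted": False},
]
store_purchases = [
    {"customer_id": 7, "item": "camera"},
]

def browse_online_buy_in_store(sessions, purchases):
    """Find customers who carted an item online, then bought it in a store."""
    bought = {(p["customer_id"], p["item"]) for p in purchases}
    return [s for s in sessions
            if s["carted"] and (s["customer_id"], s["viewed_item"]) in bought]

print(browse_online_buy_in_store(web_sessions, store_purchases))
# -> customer 7 is a web browser and a brick-and-mortar buyer
```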

Deduplication of data can serve an important role in risk management, as well. One of our financial customers used to have data scattered across multiple silos and could only report its risk position with 75 percent accuracy. Since integrating its data, its information officers have a comprehensive view of risk exposure across all product lines and business units. This bank’s managers can now take action on risk factors knowing that more than 99.5 percent of their data is accurate.

In an ideal world, organizations would manage the data explosion by first putting an information management strategy in place, one that assigns accountability and responsibility at every level of the organization to ensure the accuracy and integrity of information. The next step is developing a blueprint for how the data will be reused, along with the encryption and security needed to protect those data assets. The final step is using in-database data mining to reduce cost and save time by eliminating data movement, and to get better answers. Only then can CIOs wake up from the data management nightmare and begin to use their information to extract business value in some truly creative ways.
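As a rough illustration of the in-database idea, the sketch below pushes an aggregation into the database so that only the answer, not every row, crosses the wire. Here sqlite3 stands in for an enterprise warehouse, and the schema is hypothetical.

```python
# A sketch of in-database analytics: push the computation to the
# database instead of extracting every row into the application.
# sqlite3 is a stand-in for a warehouse; the schema is illustrative.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txns (dept TEXT, amount REAL)")
conn.executemany("INSERT INTO txns VALUES (?, ?)",
                 [("retail", 100.0), ("retail", 50.0), ("online", 75.0)])

# In-database: one aggregate row per department crosses the wire,
# not the whole transaction table.
for dept, total in conn.execute(
        "SELECT dept, SUM(amount) FROM txns GROUP BY dept"):
    print(dept, total)
```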

Darryl McDonald is chief marketing officer for Teradata Corp.
