Table of Contents
- The Data Warehouse in the Organization
- Relationships to Other Research Reports
- The Data Warehouse Database
- Analytic Store Platform Choices
- Choosing the Data Warehouse Platform
- The Cloud Analytic Database
- Data Warehouse Flavors
- Key Takeaways
- About William McKnight
- About GigaOm
If your data warehouse is under-delivering to the enterprise, or if you have not yet deployed one, you have the opportunity to deploy or shore up this valuable company resource. In fact, of all the constructs in information management, the data warehouse should be the first to bring up to standard for maximum ROI. There are innumerable subtleties and varieties in architecture and methods, and many are appropriate depending on the situation and the requirements. We will explore these in this report.
First, the data warehouse is an analytic database that is not meant to be operational in the sense of running the business. It is meant to deliver on reporting and analytics, whether straightforward basic reports or deep and complex predictive analytics. In this context, the data warehouse will be the generalized, multi-use, multi-source analytic database for which there may or may not be direct user access. This data warehouse distributes data to other analytic stores, frequently called data marts. The data warehouse ideally sits in the architecture at the same level as many of the other analytic stores. It is possible for an enterprise to have multiple data warehouses by this definition.
Your first priority should be making sure the data warehouse (or warehouses) is well suited as a cleansing, distribution, and history management system. Beyond that, be prepared to supplement the data warehouse with other analytic stores best designed for the intended use. At a minimum, and where possible, smaller analytic stores should source their data from the data warehouse, which is held to a data quality standard.
Regarding the platform: if the data warehouse is housed on a non-analytic database management system (DBMS), meaning one that is not an appliance, not columnar, and stored entirely on HDD, then if the analytics do not weigh it down, big data volumes will. As companies advance their capabilities to utilize every piece of information, they are striving to get all information under management.
The more one data warehouse can accomplish for an organization, the better. The reality is that many were built years ago to a standard that can no longer meet the needs of the entire organization, and many were built with too many limitations to reasonably overcome on the way to a high-maturity data warehouse.
The answer may not be to just build up more analytic (non-warehouse) stores. The answer may be to build another data warehouse, or to build a new one to replace the one you have. This continues the “build once, use many” value proposition of the data warehouse concept as far as it can reasonably go.
It means multiple business projects can use the data without having to build separate data layers. Allowing concurrent use of data at the data warehouse layer, or creating a mart off the data warehouse, is a lot less work, carries less risk, and costs less overall than building from uncultivated, original source data. The shared data approach is worth pursuing.
- You could add virtualization over the top (in fact, you should) or add a common semantic layer, but you usually still need the performance and usage benefits of a high degree of physical consolidation.
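As a minimal sketch of the "build once, use many" idea, the example below uses Python's built-in `sqlite3` module as a stand-in for a real warehouse platform. The table, mart, and view names and the sample rows are hypothetical and chosen only for illustration. One cleansed warehouse table feeds both a physically materialized mart and a virtual view, so neither consumer goes back to original source data.

```python
import sqlite3

# In-memory database standing in for the data warehouse platform (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical cleansed warehouse table, loaded once from source systems.
    CREATE TABLE warehouse_sales (
        sale_id INTEGER PRIMARY KEY,
        region  TEXT NOT NULL,
        product TEXT NOT NULL,
        amount  REAL NOT NULL
    );
    INSERT INTO warehouse_sales (region, product, amount) VALUES
        ('East', 'Widget', 100.0),
        ('East', 'Gadget', 50.0),
        ('West', 'Widget', 75.0);

    -- Physical mart: a table materialized from the warehouse.
    CREATE TABLE mart_sales_by_region AS
        SELECT region, SUM(amount) AS total_amount
        FROM warehouse_sales
        GROUP BY region;

    -- Virtual layer: a view over the same warehouse table.
    CREATE VIEW v_sales_by_product AS
        SELECT product, SUM(amount) AS total_amount
        FROM warehouse_sales
        GROUP BY product;
""")

# Two different consumers read from the same shared warehouse data.
by_region = dict(conn.execute(
    "SELECT region, total_amount FROM mart_sales_by_region"))
by_product = dict(conn.execute(
    "SELECT product, total_amount FROM v_sales_by_product"))

print(by_region)
print(by_product)
```

The mart trades freshness for query speed because it is a stored copy, while the view always reflects the current warehouse rows but is recomputed on every query. That is the same trade-off between physical consolidation and virtualization described above.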
There will be non-warehouse analytic stores, but there will also be data warehousing for the actionable, foreseeable future. Each organization should look at its relationship to data warehousing and make its own plan. There is no “one size fits all.” No two organizations come to this table with the same backstory and architecture, and none are willing to “scrap everything” and start over.