The pressure to leverage data as a business asset is stronger than ever. Enterprises everywhere are eager to devise sound data strategies that are realistic and achievable, based on available budget and sensitive to in-house technology skill sets.
For a while, it looked like specialized open source big data compute frameworks, including Hadoop and Spark, were the way to go. Enterprise organizations found them compelling for reasons of novelty, economics and the apparent prudence of adopting a forward-looking technology. But those frameworks are at a bit of a crossroads: the hype around them has subsided and, while things are improving, the success rate of enterprise projects involving them has been modest.
Meanwhile, the data warehouse (DW), which has been a key technology platform for enterprise analytics for decades, never went away. Yes, DWs struggled, and incumbent DW platforms still do, but recent advances in storage economics and compute scalability, especially in the cloud, have addressed the most important challenges faced by DW platforms.
As a result, we are in a DW renaissance period. Problems with DWs have largely been solved, and petabyte-scale data volumes no longer defeat them. The familiarity and ease of use that kept them viable all this time are now helping them face their open source big data competition and, in many cases, emerge victorious.
Still, given that DW platforms have changed, what should enterprises do to build an analytics strategy that integrates them? Even the most DW-loyal shops will need to look at how DW platforms have evolved and adjust their strategies accordingly. Organizations that have committed to open source analytics technologies will need to take a second look at DW platforms and consider a strategy that combines the two, essentially bringing the data warehouse together with the data lake.
The trick to adapting to the new world of data warehousing is understanding that it is not only the technology that has changed, but the applications and use cases for DW technology as well. DW platforms do not need to be used exclusively for Enterprise DW implementations. The platforms are now more versatile and can serve use-case-specific workloads and even exploratory analytics. In a sense, the DW isn’t just a DW anymore.
Even the juxtaposition of DW and data lake has shifted: DW platforms today are increasingly able to ingest raw, semi-structured data, or query it in place. This means warehouse and lake technology can be used in combination and, in some cases, lake technology will not be necessary at all.
It is not just the data lake and its workloads that are becoming more integrated into the warehouse. Streaming data, machine learning, and AI are coming on board as well. In addition, data governance and data protection are starting to enter the DW orbit.
While the familiar relational model, dimensional design, and SQL are back, the application of DW technology now covers territory that may be less familiar. Moreover, new vendors that have championed the cloud, or were born in it, are emerging in leadership positions.
The new capabilities, new use cases, and new vendors make the space exciting but also difficult to navigate, for newcomers and veterans alike. Enterprise customers will need to understand how DW platforms have changed; how best to use, deploy and implement them; how to combine them with other applications; and what the key differences are between the vendors and their offerings.
Without this knowledge, enterprise buyers will be frozen in indecision. With it, they will be equipped to use today’s DW platforms to their fullest and to cherry-pick other technologies that can augment and optimize them. The end result is that organizations in the know will be ready to analyze all their data, in technologically familiar environments, for full competitive and operational advantage.
In this report we map out today’s DW landscape in the context of where the technology has been, where it is, and where it is going.
- The data warehouse is alive and well, perhaps enjoying its greatest wave of popularity to date
- Open source technology challengers addressed issues of storage costs and horizontal scalability but did not achieve parity in terms of enterprise skill-set abundance, interactive query performance, or acting as authoritative data repositories
- The advent of the cloud has helped DW platforms transcend their vulnerabilities, through the cloud’s elastic compute and its economical, highly scalable object storage
- The combination of transcending those vulnerabilities and satisfying customers’ need for familiar SQL-relational platform paradigms has turned out to be a one-two punch for DWs
- Customers need to bring themselves up-to-date on the latest DW innovations and product categories, understanding each in the context of the DW’s historical evolution and market factors
- Customers must correlate DW product categories with corporate cloud (and multi-cloud) strategy
- Market & Maturity Factors: Brief History of Data Warehouse Technology, and its Vendor Ecosystem
- Considerations for using Enterprise Data Warehouses
- Key Players
- Near Term Outlook – The (Enterprise) Data Warehouse-Data Lake Interplay
- Key Takeaways
- About Andrew Brust
- About GigaOm