Alongside the explosive data growth that every organization is experiencing, there is an increasing demand for big data analytics, AI, commercial HPC, and other high-demand workloads. At the same time, enterprises want the flexibility of the cloud while keeping data and costs under control.
In theory, a hybrid cloud strategy looks like the right choice, for example to take advantage of cloud bursting for additional processing power. In practice, keeping data far from compute resources is inefficient because of the added latency. What’s more, high-performance cloud storage is expensive, and egress fees and other hidden costs can quickly discourage its use for interactive workloads.
Moving data is also difficult, expensive, and inefficient. Even selecting a data set and moving it to a remote location can be challenging, especially when the resulting data must be brought back after the compute job has ended. This sticky nature of data is usually described as “data gravity.” This is not news, but it is important to understand before looking for a solution.
Every organization hopes to consolidate all its data in a single place, to make it more accessible and controllable, and to avoid the creation of data silos. But as already mentioned, moving large data sets back and forth is expensive and impractical, and accessing them remotely is challenging and inefficient as well. This issue is becoming more severe with edge-cloud and multi-cloud infrastructures, where data repositories can be located far from the applications. Expensive connectivity options can alleviate some of these issues, such as bandwidth limitations, but they are not dynamic and cannot bring data closer to the CPU.
Financially speaking, this is a nightmare:
- Inefficiency introduced by latency results in poor CPU utilization, longer processing time, and higher compute costs.
- Moving entire data sets to the compute nodes for the job, and then back again once they are processed, can dramatically increase cloud storage bills because of egress fees and the number of transactions necessary to complete the job.
Costs increase and become unpredictable, posing several risks to the business and raising doubts about the sustainability of this model for the future.
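The two cost drivers listed above can be made concrete with a back-of-the-envelope model. The sketch below is only illustrative: it assumes that every storage request blocks the CPU for one full round trip (no overlap or prefetching), and all names, prices, and latencies are hypothetical placeholders rather than figures from any provider.

```python
from dataclasses import dataclass


@dataclass
class Job:
    cpu_seconds: float   # pure compute time if all data were local
    io_requests: int     # number of blocking storage requests (assumption)
    result_gb: float     # data moved back out of the cloud after the job


def estimate_cost(job, latency_s, node_price_per_hour,
                  egress_price_per_gb, price_per_1k_requests):
    """Rough job cost under a serialized-I/O assumption (hypothetical model)."""
    stall_seconds = job.io_requests * latency_s        # CPU idle, waiting on data
    wall_seconds = job.cpu_seconds + stall_seconds
    return {
        "cpu_utilization": job.cpu_seconds / wall_seconds,
        "compute_cost": wall_seconds / 3600 * node_price_per_hour,
        "egress_cost": job.result_gb * egress_price_per_gb,
        "request_cost": job.io_requests / 1000 * price_per_1k_requests,
    }


# Hypothetical job: one hour of compute, 100,000 reads, 500 GB of results.
job = Job(cpu_seconds=3600, io_requests=100_000, result_gb=500)

# Same job with data close to the CPU versus far away (made-up latencies and prices).
near = estimate_cost(job, latency_s=0.0005, node_price_per_hour=2.0,
                     egress_price_per_gb=0.0, price_per_1k_requests=0.0)
far = estimate_cost(job, latency_s=0.036, node_price_per_hour=2.0,
                    egress_price_per_gb=0.09, price_per_1k_requests=0.0004)

for label, result in (("near", near), ("far", far)):
    total = result["compute_cost"] + result["egress_cost"] + result["request_cost"]
    print(f"{label}: utilization={result['cpu_utilization']:.0%}, total=${total:.2f}")
```

Under these assumed inputs, the remote case spends as much time waiting on storage as it does computing, so the compute bill roughly doubles before egress and request charges are added. Real workloads overlap I/O with compute to some degree, so the model is best read as an upper bound on the latency penalty.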
Cloud providers have built formidable virtual compute infrastructures that include acceleration options like FPGAs and GPUs at reasonable prices. Unfortunately, cloud storage often offers less flexibility, lacking the mechanisms to create cost-effective solutions for highly demanding workloads. The most interesting and cost-effective approach to bringing data close to the CPU, and to reducing latency with high-capacity volumes, is to decouple the performance and access layer from the capacity layer and its protocol dependencies. By doing so, it is possible to build a caching mechanism that works across long distances and the Internet, pushing hot data to the CPU and the applications that need it.
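The general idea of a performance layer sitting in front of a distant capacity layer can be sketched as a read-through cache. This is a generic, minimal illustration of the technique with least-recently-used eviction, not a description of how any particular product implements it. The class name, the `fetch_remote` callback, and the block-based addressing are all assumptions made for the example.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Keeps hot blocks near the CPU; falls back to a slow remote store on a miss."""

    def __init__(self, fetch_remote, capacity_blocks):
        if capacity_blocks < 1:
            raise ValueError("capacity_blocks must be at least 1")
        self._fetch_remote = fetch_remote   # hypothetical callback: block_id -> bytes
        self._capacity = capacity_blocks    # stands in for local flash capacity
        self._blocks = OrderedDict()        # insertion order doubles as LRU order
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self._blocks:
            self._blocks.move_to_end(block_id)   # mark as most recently used
            self.hits += 1
            return self._blocks[block_id]
        self.misses += 1
        data = self._fetch_remote(block_id)      # pay the long-distance latency once
        self._blocks[block_id] = data
        if len(self._blocks) > self._capacity:
            self._blocks.popitem(last=False)     # evict the least recently used block
        return data


# Usage with a stand-in for the remote capacity layer.
def slow_remote_read(block_id):
    return f"block-{block_id}".encode()


cache = ReadThroughCache(slow_remote_read, capacity_blocks=2)
for block in (1, 2, 1, 3, 2):
    cache.read(block)
print(cache.hits, cache.misses)
```

Only the misses travel to the remote store, so the more of the working set that fits in the local tier, the less the job is exposed to wide-area latency and per-request charges.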
IC Manage Holodeck takes a unique approach to caching: it uses flash memory installed in the compute nodes, shared across high-speed local networks, to intelligently cache data and metadata. The result: IC Manage enables users to accelerate many highly demanding workloads, both on-premises and in the cloud, and to realize an impressive ROI quickly.