Are Outdated Data Architectures Holding Back the Cloud?


Cloud computing represents a fundamental technical and business trend. But there are barriers in cloud computing that limit broad cloud-based deployment of scaled enterprise-class services. Hybrid clouds will overcome some of these barriers in the short term, but, in the longer term, improvements in cloud architectures and virtualization technologies are required. Among the goals of these improvements must be exploiting commodity technology advances by vertically scaling the data tier to achieve unified cloud systems.

Industry Trends

Data center Quality of Service (QoS) and Total Cost of Ownership (TCO) have a major impact on an enterprise’s business growth and profits. IT departments and application developers seek architectures and deployments that provide resilient performance scalability and high service availability to meet rising service demand while controlling capital and operating expenses.

Tremendous technology advances have been made in recent years that offer the potential for major improvements in QoS and TCO. Software technologies, including virtualization, cloud, and data stores, provide new deployment architectures that offer large potential improvements in scalability, availability, and cost structure. Commodity hardware advances in enterprise flash memory and multi-core processors offer significant potential improvements in performance while reducing power and space consumption. But fusing these technology trends to achieve effective cloud-based, large-scale service deployments requires architectural innovation.

Cloud Opportunities and Challenges

The potential benefits of server virtualization and cloud computing are clear and compelling. Server virtualization provides increased computing resource utilization and elastic scaling. Cloud computing gives cloud infrastructure companies economies of scale by sharing computing resources among multiple tenants. Also, instead of devoting capital expenditures up front to buy hardware, which carries the risk of rapid obsolescence, cloud computing lets IT organizations pay as they compute with a controllable operating expense.

However, to realize these benefits for cloud customers, cloud providers must deliver guaranteed service capacity, availability, and response time; multi-tenant management and security; and a net TCO savings. Presently, though, enterprises are hitting barriers in deploying enterprise-class services into the cloud at scale. For many classes of applications and services, the realized performance and availability characteristics of cloud deployments are disappointing at scale, and the large quantity of cloud instances needed to scale a deployment drives the cost of cloud deployment to unacceptable levels. Building vertically scaled services on horizontally scaling clouds carries a large penalty in performance, availability, and cost, losing the advantage of clouds.

Current cloud server-virtualization technologies rely primarily on provisioning application instances in virtual machines onto servers under the management of a hypervisor. This is an easy way to combine existing applications with multi-core systems, and can also provide for elasticity of service capacity through dynamic provisioning of more or fewer application instances based on the current workload demand. This virtualization approach has been successful in public clouds and works well for applications, such as web application servers, that have been designed to scale horizontally and can run under a VM hypervisor within the DRAM capacity.

But mission-critical, vertically scaling data-tier servers (including databases, caching services, and key-value stores), which concurrently handle the requests of hundreds of web application servers, can only be used effectively in virtual machines for development or small-scale production. These services typically need all the physical resources (cores, DRAM, I/O) of a dedicated physical server, and more, so multiple instances cannot effectively share a single physical commodity computer, and running under a hypervisor just increases overhead.

In production cloud environments, it becomes necessary to work around the diminished performance by providing additional data partitioning and caching layers, as well as provisioning many more instances than in a non-virtualized environment. These numerous, small data partitions, caches, and application instances drive up application and management complexity, increase cost and reduce service availability. As a consequence, less than 10 percent of production data-tier server workloads are virtualized today.
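The scaling penalty described above can be illustrated with a rough back-of-the-envelope sketch. Everything in it is an assumption made for illustration: the function names, the workload figures, the headroom, and the simplification that partitions fail independently and that the service needs every partition to be up. None of the numbers are measurements.

```python
def instances_needed(peak_rps: int, per_instance_rps: int, headroom_pct: int = 25) -> int:
    """Instances required to serve a peak load with spare headroom.

    Uses integer ceiling division so the result is exact.
    All inputs are hypothetical planning figures.
    """
    required = peak_rps * (100 + headroom_pct)
    available = per_instance_rps * 100
    return -(-required // available)  # ceiling division


def all_partitions_up(partitions: int, per_instance_availability: float) -> float:
    """Probability that every partition is up at the same time.

    Assumes independent failures and that the service needs all
    partitions, which is a simplification.
    """
    return per_instance_availability ** partitions


# Hypothetical peak of 200,000 requests/s.
# Assume a dedicated server sustains 50,000 requests/s and a
# virtualized instance of the same software only 20,000.
dedicated = instances_needed(200_000, 50_000)
virtualized = instances_needed(200_000, 20_000)

print("dedicated servers:", dedicated)
print("virtualized instances:", virtualized)
print("all-up probability, dedicated:", all_partitions_up(dedicated, 0.999))
print("all-up probability, virtualized:", all_partitions_up(virtualized, 0.999))
```

Under these assumed figures the virtualized deployment needs more than twice as many instances, and the chance that every partition is simultaneously healthy drops as the partition count grows, which is the cost and availability effect the paragraph above describes.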

Intersection Opportunities

The compelling trends in commodity multi-core processors and flash memory offer huge potential for improving the inherent QoS and TCO of the data center’s data-access tier. Emerging databases, data stores, and caching services that are tightly coupled with multi-core and flash memory achieve a tenfold improvement in throughput/watt/cm³ when compared with legacy data-access software on hard-drive-based systems, but these benefits are lost when using today’s virtualization technologies. Capturing these compelling benefits for enterprise-class, scaled deployments requires innovation in data-tier virtualization and in cloud architectures.
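For readers unfamiliar with the compound metric, here is a minimal sketch of how throughput per watt per cubic centimetre can be computed and compared. The function names are hypothetical and the inputs are placeholders chosen only to exercise the formula; they are not measurements of any system.

```python
def density_metric(ops_per_sec: float, watts: float, volume_cm3: float) -> float:
    """Throughput per watt per cubic centimetre of system volume."""
    return ops_per_sec / watts / volume_cm3


def improvement(new: tuple, old: tuple) -> float:
    """Ratio of the metric between two (ops_per_sec, watts, volume_cm3) configurations."""
    return density_metric(*new) / density_metric(*old)


# Placeholder configurations: the second doubles throughput and
# halves power in the same volume, so the factors multiply to 4x.
baseline = (1000, 10, 10)
denser = (2000, 5, 10)
print(improvement(denser, baseline))
```

Because the three factors multiply, moderate gains in throughput, power, and space can combine into a large gain in the overall metric.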

Approach: Flash-Optimized Data Tier + Hybrid Clouds + NextGen Virtualization

Current cloud abstractions and building blocks are not well suited for deployment of vertically scaling data-tier applications, especially those tightly integrated with multi-core and flash memory. New cloud abstractions and building blocks are required. VMs have allowed us to take the first steps toward defining, compartmentalizing, and isolating services, making them more mobile, portable, and interoperable, easier to describe and inventory, and in some cases more secure, but they do not effectively support vertically scaling services in production.

In the short term, hybrid clouds can fuse together the benefits of the compelling industry trends of architectural improvements in data-tier solutions and clouds. Hybrid clouds can use virtualized machine instances for the web and application tiers, while exploiting non-virtualized, vertically scaling data-tier solutions in balanced commodity, flash-based, multi-core system configurations. In these deployments, the optimized data-tier servers integrate into cloud data centers as shared, networked servers with explicit virtualization based on management APIs for provisioning, accounting, monitoring, security, and multi-tenancy controls rather than as virtualized machine instances.
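One way to picture the hybrid arrangement described above is the following minimal sketch. It is an illustration under stated assumptions, not an actual product API: `Tier`, `placement`, `SharedDataTierServer`, and their fields and methods are all hypothetical names, and only two of the management functions named above (provisioning and accounting, with a basic tenant check) are modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    name: str
    scaling: str  # "horizontal" or "vertical"


def placement(tier: Tier) -> str:
    """Hybrid rule: VMs for horizontally scaling tiers,
    shared non-virtualized servers for vertically scaling ones."""
    if tier.scaling == "horizontal":
        return "virtual_machine"
    return "shared_physical_server"


@dataclass
class SharedDataTierServer:
    """Hypothetical non-virtualized data-tier server that is shared
    among tenants through an explicit management API."""
    capacity_gb: int
    quotas: dict = field(default_factory=dict)  # tenant -> provisioned GB
    usage: dict = field(default_factory=dict)   # tenant -> requests served

    def provision(self, tenant: str, gb: int) -> bool:
        """Provisioning: grant storage only while capacity remains."""
        if sum(self.quotas.values()) + gb > self.capacity_gb:
            return False
        self.quotas[tenant] = self.quotas.get(tenant, 0) + gb
        return True

    def record_request(self, tenant: str) -> None:
        """Accounting with a tenancy check: only provisioned tenants are served."""
        if tenant not in self.quotas:
            raise PermissionError(f"unknown tenant: {tenant}")
        self.usage[tenant] = self.usage.get(tenant, 0) + 1


for tier in (Tier("web", "horizontal"), Tier("application", "horizontal"),
             Tier("database", "vertical")):
    print(tier.name, "->", placement(tier))
```

The design point is that tenants share the data-tier server through quotas and metering enforced by its own management interface, rather than through a hypervisor slicing the machine into virtual instances.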

In the longer term, improved virtualization technologies are needed so that data-tier software tightly integrated with flash and multi-core can be effectively virtualized within a unified virtual administration model applicable to all tiers in the data center, including dynamic provisioning, management, monitoring, and accounting.


We are at a point in our industry’s evolution where we can and must apply architectural innovation to effectively address QoS and TCO in the face of exponentially increasing service demands. Key opportunities lie in integrating databases, data stores, and data caching services with advanced commodity multi-core and flash memory, and in creating new cloud virtualization technologies and architectures. These offer the potential for order-of-magnitude improvements in performance scalability and service availability to meet the exponentially expanding service demand while controlling capital and operating expenses and reducing power consumption.

Dr. John Busch is founder, chairman and CTO of Schooner Information Technology.

Image courtesy of Colin Babb.


Rick Cattell

Excellent points, here! People don’t realize how pitiful database performance can be on a virtualized cloud platform. A database system needs to effectively use CPU cores, RAM, disk buffers, high level caches, log disks, data disks, flash, multiple communicating processes/CPUs, etc. A virtualized platform is like trying to cook in a stranger’s kitchen: you don’t know what you have to work with, or where to find it. This is a problem even on NoSQL systems, for example, having enough RAM for your working set. Your hybrid suggestion is the best solution, unless or until we figure out how database systems can more effectively specify and control cloud resources.

John Calabrese

Funny you mention this. I have been in the performance engineering and testing space for the last 15 years and am building a whole business around exactly what you mention in this article.

The cloud vendors will actually tell you how to best architect your application stack to work in the cloud, but folks just think they can spawn more virtual instances (just like they did when they first started using virtualization in-house).

What a refreshing article. I am launching Cloud in the next month exactly for the reasons you mention in the article. It sure would be nice to jointly work with the university on this initiative.

John Bailo

I come into my cubicle. I wait 30 minutes for log in and the point at which Outlook is useable. I check my network drives to see if they’ve been disconnected yet again because Active Directory hasn’t been sync’d.

During that time, I pull out my Android phone, check my (personal) email, read the news, get stock quotes and purchase a product on Amazon.

The cloud is here, but it’s in your pocket, not your desktop.
