The fundamental underpinning of an organization is its transactions. It must do them well, with integrity and performance. Not only has transaction volume soared of late, but the granularity of transaction details has also reached new heights. Fast transactions greatly improve the efficiency of a high-volume business, so performance is critically important: network latency, database delays, API/microservices call overhead, and other slow processes need to be eliminated.
There are a variety of databases available to the transactional application. Ideally, any database would have the required capabilities; however, depending on the scale of the application and the chosen cloud, some database solutions can be prone to delays. Recent trends in information management see organizations shifting their focus to cloud-based solutions. In the past, the only clear choice for most organizations was on-premises data using on-premises hardware. However, the costs of scale are chipping away at the notion that this remains the best approach for all, or some, of a company’s transactional needs. The factors driving operational and analytical data projects to the cloud are many, but the advantages, like data protection, high availability, and scale, are realized with an infrastructure as a service (IaaS) deployment. In many cases, a hybrid approach serves as an interim step for organizations migrating to a modern, capable cloud architecture.
This report outlines the results from a GigaOm Transactional Field Test, derived from the industry-standard TPC Benchmark™ E (TPC-E), to compare two IaaS cloud database offerings: Microsoft SQL Server on Amazon Web Services (AWS) Elastic Compute Cloud (EC2) instances and Microsoft SQL Server on Microsoft Azure Virtual Machines (VM).
Both are installations of Microsoft SQL Server, and we tested each on both Windows Server and Red Hat Enterprise Linux. However, there are distinct differences between the two infrastructure offerings that impact performance.
The results of the GigaOm Transactional Field Test are valuable to all operational functions of an organization, such as human resource management, production planning, material management, financial supply chain management, sales and distribution, financial accounting and controlling, plant maintenance, and quality management. Today the underlying data for many of these departments is in SQL Server, which is also frequently the source for operational interactive business intelligence (BI).
Microsoft SQL Server on Microsoft Azure Virtual Machines (VM) indicated 3.4x better performance on Windows than Microsoft SQL Server on Amazon Web Services (AWS) Elastic Compute Cloud (EC2), and 3x better performance when tested on Linux. SQL Server on Microsoft Azure Virtual Machines also had up to 86.8% better price-performance when comparing Azure Hybrid Benefit to AWS License Mobility for a three-year compute commitment, and up to 32.2% better price-performance when comparing the high-speed AWS io1 and Azure Ultra Disk offerings.
The parameters to replicate this test are provided. You are encouraged to compile your own representative queries, data sets, and data sizes, and test compatible configurations applicable to your requirements.
Relational databases are a cornerstone of an organization’s data ecosystem. While alternative SQL platforms have their place and are growing with the data deluge, today, workload platforming decision-makers strongly consider and usually choose the relational database. This is for good reason. Since 1989, Microsoft SQL Server has proliferated to near-ubiquity as the relational server of choice for the original database use case—On-Line Transaction Processing (OLTP)—and beyond. Now SQL Server is available on fully functional infrastructure offered as a service, taking complete advantage of the cloud. The infrastructure-as-a-service (IaaS) cloud offerings provide predictable costs, cost savings, fast response times, and strong non-functional characteristics.
As our testing confirms, the major difference between SQL Server on Azure and SQL Server on AWS is the storage I/O performance.
For operationally-intensive, business-critical workloads, Azure recommends two solid-state drive (SSD) disk types:
First, Premium Managed Disks are high-performance SSDs designed to support I/O intensive workloads and provide high throughput and low latency. Premium SSD Managed Disks are provisioned as a persistent disk with configurable size and performance characteristics. They can also be detached and reattached to different virtual machines.
The cost of Premium SSD Managed Disks depends on the number and size of the disks selected, as well as the volume of outbound data transfers. Disk sizes differ in input/output operations per second (IOPS), throughput (MB/s), and monthly price per GiB. Multiple persistent disks attached to a VM can support petabytes of storage per VM. Four-disk Premium configurations can achieve up to 80,000 IOPS and 1,600 MB/s of disk throughput per VM, which translates to sub-millisecond latency for read operations in applications that can take advantage of read caching. Premium SSD Managed Disks are supported by DS-series, FS-series, and GS-series VMs. The largest single disk is the P80, with 32 TB of storage, up to 20,000 IOPS, and up to 900 MB/s of throughput.
To save on cost, standard hard disk drive (HDD) Managed Disks are also offered. Azure customers can store Managed Disk snapshots and images from Premium SSD Managed Disks on Standard HDD storage. You can choose between locally redundant storage (LRS) and zone redundant storage (ZRS) options.
The second recommendation for high performance is Ultra Disk. Azure’s next-generation solid-state drive has configurable performance attributes that provide some of the lowest latency and consistently high IOPS/throughput offered in the cloud market to date. This new cloud disk type was made generally available in the Fall of 2019 and offers I/O performance with sub-millisecond latency.
IOPS on Ultra Disk can start out at reasonably high rates and are scaled upwards as workloads become more I/O intensive. Like the Premium drives, Ultra Disk is billed on an hourly rate and is priced according to the size of the disk and its performance configuration — including both IOPS and throughput (MB/s) ranges. It has a configurable maximum of 160,000 IOPS and 2,000 MB/sec throughput.
Amazon Web Services offers Elastic Block Store (EBS), an easy-to-use, high-performance block storage service designed for use with Amazon Elastic Compute Cloud (EC2). EBS supports a broad range of workloads, such as relational and non-relational databases, enterprise applications, containerized applications, big data analytics engines, and file systems. With Amazon EBS, customers can choose from four different volume types to balance price and performance, and can achieve single-digit millisecond latency for high-performance database workloads.
Amazon EBS has two different types of solid-state drives:
First, General Purpose SSD (gp2) volumes balance price and performance for a wide variety of workloads. AWS recommends this drive type for most workloads. However, with an IOPS ceiling of only 16,000, it tends to underperform unless relegated to system boot volumes, virtual desktops, low-latency applications, and development/test environments.
Amazon’s highest-performing EBS volume option is the Provisioned IOPS (io1) SSD volume for mission-critical, low-latency, and high-throughput workloads. Critical business applications and large database workloads usually require sustained IOPS performance, possibly more than the 16,000 IOPS or 250 MiB/s of throughput offered by gp2. The io1 volume has a maximum configurable IOPS of 64,000 and throughput of 1,750 MiB/s.
One of our main objectives in this benchmark is to test an I/O-intensive workload on Amazon and Azure’s best and second-best SSD volume types head-to-head, to understand both the performance and price-per-performance differences of the two leading cloud vendors’ SQL Server offerings.
The GigaOm Transactional Field Test is a workload derived from the well-recognized industry-standard TPC Benchmark™ E (TPC-E). The workload was modified from the standard TPC-E benchmark (i.e., the transaction mix) for ease of benchmarking, and as such, the results generated are not comparable to official TPC results. From tpc.org: “TPC Benchmark™ E (TPC-E) is an OLTP workload. It is a mixture of read-only and update intensive transactions that simulate the activities found in complex OLTP application environments. The database schema, data population, transactions, and implementation rules have been designed to be broadly representative of modern OLTP systems. The benchmark exercises a breadth of system components associated with such environments.” The TPC-E benchmark simulates the transactional workload of a brokerage firm with a central database that executes transactions related to the firm’s customer accounts. The data model consists of 33 tables, 27 of which have foreign key constraints (50 in all). The results of the TPC-E are valuable to all operational functions of an organization, many driven by SQL Server, which is also frequently the source for operational interactive business intelligence (BI).
The data sets used in the benchmark were generated based on the information provided in the TPC Benchmark™ E (TPC-E) specification. For this testing, we used the database scaled for 800,000 customers. This scaling determined the initial data volume of the database. For example, a total of 800,000 customers is multiplied by 17,280 to determine the number of rows in the TRADE table: 13,824,000,000. All of the other tables were scaled according to the TPC-E specification and rules. On our Azure virtual machines and AWS EC2 instances, we allocated 4TB of storage — which was more than enough for this workload. Besides the scale factor of 800,000 customers, the test offers a few other “knobs” we turned in order to determine the database engine’s maximum throughput capability for both AWS and Azure. See Table 1.
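The scaling arithmetic above can be sketched as a quick calculation. Only the 17,280 trades-per-customer multiplier comes from the text; the function name is illustrative and is not part of any TPC-E tooling:

```python
# Sketch of the TPC-E-style scaling arithmetic described above.
# The TRADE multiplier (17,280 rows per customer) is taken from the text.
TRADE_ROWS_PER_CUSTOMER = 17_280

def trade_table_rows(customers: int) -> int:
    """Initial row count of the TRADE table for a given customer scale."""
    return customers * TRADE_ROWS_PER_CUSTOMER

print(trade_table_rows(800_000))  # 13824000000, the row count cited above
```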
We completed five runs per test on each platform, with each test run lasting at least two hours. We then took the average transactions per second for the last 30 minutes of each run. A full backup was restored to reset the database to its original state between runs. The results are shared in the Field Test Results section.
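The steady-state measurement step described above can be sketched as follows. This is a hypothetical helper, not the harness actually used; it assumes one tps sample per second over the run:

```python
# Illustrative sketch: average per-second transaction counts over the final
# 30 minutes of a run, as described in the methodology above.
def steady_state_tps(per_second_tps: list[float], window_seconds: int = 30 * 60) -> float:
    """Average tps over the last `window_seconds` samples of a run."""
    window = per_second_tps[-window_seconds:]
    return sum(window) / len(window)

# Example: a two-hour run (7,200 one-second samples); only the last
# 1,800 samples (30 minutes) contribute to the reported figure.
samples = [0.0] * 5400 + [2.0] * 1800
print(steady_state_tps(samples))  # 2.0
```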
Selecting and sizing the compute and storage for comparison can be challenging, particularly across two different cloud vendors’ offerings. There are various offerings between AWS and Azure for transaction-heavy workloads. As you will see in Table 2, at the time of testing and publication, there was not an exact match across the offerings in processors or memory.
We considered the variety of offerings on AWS and chose the memory-optimized r4 family. It was the most similar to the Azure offering. R4 is described as “optimized for memory-intensive and latency-sensitive database workloads, including data analytics, in-memory databases, and high-performance production workloads.”
On the Azure side, we expect mission-critical-minded customers to gravitate toward the Ev3 family, which is described as “ideal for memory-intensive enterprise applications.” We opted for the ESv3 series, which offers premium storage disks, as opposed to the standard disks offered with the regular Ev3 series. Thus, we decided on R4 for AWS EC2 and ESv3 for Azure SQL VM. Our approach was to find the “nearest neighbor” best fit. The challenge was selecting a balance of both CPU and memory. R4.16xlarge on AWS has 64 vCPUs and 488 GiB memory. Azure offers a 64-core instance in E64s_v3, but it only has 432 GiB of memory, which is 11% less than the r4.16xlarge. This was our best, most diligent effort at selecting hardware compatibility for our testing. Results may vary across different configurations and, again, you are encouraged to compile your own representative queries, data sets, and data sizes to test compatible configurations applicable to your requirements. All told, our testing included two different database environments. For more information on storage type and IOPS, please reference the footnote.
Test 3: Azure VM with Ultra Disk Performance for Maximum IOPS
We also performed some additional testing of SQL Server on Azure with Ultra Disk attached storage. For the Azure Ultra Disk and AWS io1 tests, we used 400 users for Azure but kept AWS at 170 users. During initial testing of these configurations, both at 170 users, we found that AWS io1 was saturated in terms of IOPS and CPU utilization, while the Azure Ultra Disk system was not. Therefore, we raised the user count to 400 in order to push Ultra Disk to its maximum performance. Since the AWS io1 server was already at peak performance with 170 users, raising its user count to 400 would most likely not have improved its transactions-per-second (tps) metrics and may even have caused a performance loss. In our extensive testing experience, once a system is saturated at a certain user level, raising it further is usually detrimental to overall performance.
We attempted to push the limits of its performance and put the new Ultra Disk through its paces. We provisioned another 1.5 TB Ultra Disk; this time, however, we set IOPS to the maximum of 80,000 (up from 60,000 in the previous tests). In test 3, we reduced the maximum memory allowed for SQL Server to half the memory available on the VM (216 GB of the available 432 GB). This memory pressure caused SQL Server to drive more disk I/O.
Test 4: Azure VM with Ultra Disk Performance for Maximum Memory
In the fourth test, we set the maximum memory allowed on SQL Server back to 100%, but increased the number of connected users to 400 across 2 SUT databases (200 users on each SUT), as shown in Table 6.
For these tests, we tested the platforms with standard “out-of-the-box” SQL Server configurations; that is, no additional or special configurations were made beyond what came installed on our SQL Server machines.
This section analyzes the tps from the fastest of the five runs of each of the three GigaOm Transactional Field Tests described in Tables 3-5. A higher tps is better, meaning more transactions are processed every second.
In test 3, the memory pressure caused SQL Server to drive more disk I/O. As you can see in the figure 4 performance monitoring chart, our Azure VM with Ultra Disk was able to sustain 80,000 IOPS once the test got up to full speed.
In the fourth test, the maximum memory allowed on SQL Server was at 100%, and the number of users was set to 400 across 2 SUT Databases (200 users on each SUT). During this test, SQL Server on Ultra Disk was able to achieve 1570 tps — an improvement of 27% over Ultra Disk at 60,000 IOPS and 46.6% over AWS Provisioned io1 at 32,000 IOPS.
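The stated percentage improvements imply approximate baseline throughputs for the two comparison configurations. A quick back-calculation from the figures quoted above (the helper function is illustrative):

```python
# Back-calculate the implied baseline tps from the quoted result (1570 tps)
# and the quoted percentage improvements. Pure arithmetic on figures above.
def implied_baseline(new_tps: float, improvement_pct: float) -> float:
    """If new_tps is improvement_pct% better than a baseline, return that baseline."""
    return new_tps / (1 + improvement_pct / 100)

print(round(implied_baseline(1570, 27.0)))   # ~1236 tps (Ultra Disk at 60,000 IOPS)
print(round(implied_baseline(1570, 46.6)))   # ~1071 tps (AWS io1 at 32,000 IOPS)
```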
The price-performance metric is price/throughput (tps). This is defined as the cost of running each of the cloud platforms continuously for three years divided by the transactions per second throughput uncovered in the previous tests 1-3. The calculation is as follows:
Price Per Performance = $/tps =
[(Compute with on-demand SQL Server license Hourly Rate × 24 hours/day × 365 days/year × 3 years)
+ (Data disk(s) monthly cost per TB × 4 TB × 12 months × 3 years)
+ (Log disk monthly cost per TB × 1 TB × 12 months × 3 years)] ÷ tps
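The formula above can be transcribed directly into code. All rate inputs are placeholders; the real hourly and monthly rates come from Table 7:

```python
# Direct transcription of the three-year price-per-performance formula above.
# Rate arguments are placeholders to be filled from Table 7.
HOURS_3Y = 24 * 365 * 3   # hours in three years
MONTHS_3Y = 12 * 3        # months in three years

def price_per_tps(compute_hourly: float,
                  data_disk_monthly_per_tb: float,
                  log_disk_monthly_per_tb: float,
                  tps: float,
                  data_tb: float = 4.0,
                  log_tb: float = 1.0) -> float:
    """Three-year compute + storage cost divided by measured throughput (tps)."""
    compute = compute_hourly * HOURS_3Y
    data = data_disk_monthly_per_tb * data_tb * MONTHS_3Y
    log = log_disk_monthly_per_tb * log_tb * MONTHS_3Y
    return (compute + data + log) / tps
```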
When evaluating price-per-performance, the lower the number, the better. This means you get more compute power, storage I/O, and capacity for your budget.
We performed this calculation across two different pricing structures:
The prices in Table 7 were current at the time of testing and reflect the US West (N. California) region on AWS and the West US region on Azure for the first set of tests. The AWS io1 test was conducted in the Ireland region and the Azure Ultra Disk test in the North Europe region. These regions were selected for having the most comparable offerings.
The compute prices include both the actual AWS EC2/Azure VM hardware itself and the license costs of the operating system and Microsoft SQL Server Enterprise Edition. We also included Azure Hybrid Benefit for SQL Server and Windows Server versus AWS License Mobility for SQL Server rates for existing SQL Server and Windows license holders. In Table 7, compute costs are hourly, storage costs are monthly, and all prices are in US dollars.
Note: Prices do not include support costs for either Azure or AWS.
Each platform has different pricing options. Buyers should evaluate all of their pricing choices, not just the ones presented in this paper.
This report outlines the results from a GigaOm Transactional Field Test to compare the same SQL Server infrastructure as a service (IaaS) offering on two cloud vendors: Microsoft SQL Server on Amazon Web Services (AWS) Elastic Compute Cloud (EC2) instances and Microsoft SQL Server on Microsoft Azure Virtual Machines (VM).
We have learned that the database, the cloud, and the storage all matter to latency, which is the killer of transactional application performance. Microsoft Azure presents a powerful cloud infrastructure for the modern transactional workload.
Microsoft SQL Server on Microsoft Azure Virtual Machines (VM) showed 3.4x better performance on Windows than Microsoft SQL Server on Amazon Web Services (AWS) Elastic Compute Cloud (EC2), and 3x better performance on Linux. Microsoft SQL Server on Azure Virtual Machines (VM) had up to 86.8% better price-performance when comparing Azure Hybrid Benefit to AWS License Mobility for three-year reservations, and up to 32.2% better price-performance when comparing the high-speed disks of AWS io1 and Azure Ultra Disk.
Microsoft (Nasdaq “MSFT” @microsoft) enables digital transformation for the era of an intelligent cloud and an intelligent edge. Its mission is to empower every person and every organization on the planet to achieve more. Microsoft offers SQL Server on Azure. To learn more about Azure SQL Server Virtual Machines, visit https://azure.microsoft.com/en-us/services/virtual-machines/sql-server/
Performance is important, but it is only one criterion for a business-critical database platform selection. This test is a point-in-time check of specific performance. There are numerous other criteria to consider in selection, including administration, integration, workload management, user interface, scalability, vendor, and reliability. It is also our experience that performance changes over time and is competitively different for different workloads. Also, a performance leader can hit the point of diminishing returns, and viable contenders can quickly close the gap.
The benchmark setup was informed by the TPC Benchmark™ E (TPC-E) specification. The workload was derived from TPC-E and is not an official TPC benchmark nor may the results be compared to official TPC-E publications.
GigaOm runs all of its performance tests to strict ethical standards. The results of the report are the objective results of the application of queries to the simulations described in the report. The report clearly defines the selected criteria and process used to establish the field test. The report also clearly states the data set sizes, the platforms, the queries, etc. used. The reader is left to determine for themselves how to qualify the information for their individual needs. The report does not make any claim regarding third-party certification and presents the objective results received from the application of the process to the criteria as described in the report. The report strictly measures performance and does not purport to evaluate other factors that potential customers may find relevant when making a purchase decision.
This is a sponsored report. Microsoft chose the competitors, the test, and the Microsoft configuration. GigaOm chose the most compatible configurations for the other tested platform and ran the testing workloads. Choosing compatible configurations is subject to judgment. We have attempted to describe our decisions in this paper.
William McKnight is a former Fortune 50 technology executive and database engineer. An Ernst & Young Entrepreneur of the Year finalist and frequent best practices judge, he helps enterprise clients with action plans, architectures, strategies, and technology tools to manage information.
Currently, William is an analyst for GigaOm Research who takes corporate information and turns it into a bottom-line-enhancing asset. He has worked with Dong Energy, France Telecom, Pfizer, Samba Bank, ScotiaBank, Teva Pharmaceuticals, and Verizon, among many others. William focuses on delivering business value and solving business problems utilizing proven approaches in information management.
Jake Dolezal is a contributing analyst at GigaOm. He has two decades of experience in the information management field, with expertise in analytics, data warehousing, master data management, data governance, business intelligence, statistics, data modeling and integration, and visualization. Jake has solved technical problems across a broad range of industries, including healthcare, education, government, manufacturing, engineering, hospitality, and restaurants. He has a doctorate in information management from Syracuse University.
GigaOm provides technical, operational, and business advice for IT’s strategic digital enterprise and business initiatives. Enterprise business leaders, CIOs, and technology organizations partner with GigaOm for practical, actionable, strategic, and visionary advice for modernizing and transforming their business. GigaOm’s advice empowers enterprises to successfully compete in an increasingly complicated business atmosphere that requires a solid understanding of constantly changing customer demands.
GigaOm works directly with enterprises both inside and outside of the IT organization to apply proven research and methodologies designed to avoid pitfalls and roadblocks while balancing risk and innovation. Research methodologies include but are not limited to adoption and benchmarking surveys, use cases, interviews, ROI/TCO, market landscapes, strategic trends, and technical benchmarks. Our analysts possess 20+ years of experience advising a spectrum of clients from early adopters to mainstream enterprises.
GigaOm’s perspective is that of the unbiased enterprise practitioner. Through this perspective, GigaOm connects with engaged and loyal subscribers on a deep and meaningful level.