1. Executive Summary
Data-driven organizations rely on analytic databases to load, store, and analyze volumes of data at high speed to derive timely insights. At the same time, the skyrocketing volume of data in modern organizations’ information ecosystems places significant performance demands on legacy architectures. To fully harness their data to gain competitive advantage, businesses need modern, scalable architectures and high levels of performance and reliability. In addition, many companies are attracted to fully managed cloud services and their as-a-service deployment models that let companies leverage powerful data platforms without the burden of hiring staff to manage the resources and architecture in-house. With these models, users pay as they play and can stand up a fully functional analytic platform in the cloud with just a few clicks.
This report outlines the results from a GigaOm Analytic Field Test derived from the industry standard TPC Benchmark H (TPC-H) to compare the Actian Data Platform, Google BigQuery, and Snowflake. The tests we ran revealed important performance characteristics of the three platforms (see Figure 1). On a 30TB TPC-H data set, Actian’s query response times were better than the competition in 20 of the 22 queries. In a test of five concurrent users, Actian was overall three times faster than Snowflake and nine times faster than BigQuery.
In terms of price performance, the Actian Data Platform produced even greater advantages when running the five concurrent user TPC-H queries. Actian proved roughly four times less expensive to operate than Snowflake, based on cost per query per hour, and 16 times less costly than BigQuery.
Figure 1. Overall Query Response Times (in seconds) Across 22 TPC-H Benchmark-Based Queries (lower is better)
The results of these tests indicate that the Actian Data Platform is a strong choice for organizations that need to query large, complex analytic data sets quickly and economically.
2. Platform Summary
Big data analytics platforms load, store, and analyze volumes of data at high speed, providing timely insights to businesses. Data-driven organizations leverage this data to support activities like advanced analysis to market new promotions, operational analytics to drive efficiency, and predictive analytics to evaluate credit risk and detect fraud. Customers leverage a mix of relational analytic databases and data warehouses to gain analytic insights.
This report focuses on relational analytic databases in the cloud because deployments have reached an all-time high and are poised to expand dramatically. The cloud enables enterprises to differentiate and innovate with these database systems much faster than ever, offering more elastic scalability than on-premises deployments, faster server deployment and application development, and less costly storage. As a result, many companies have leveraged the cloud to maintain or gain momentum.
This report compares Actian Data Platform, Google BigQuery, and Snowflake, three relational analytic databases built on scale-out cloud data warehouse and columnar database architectures. Despite these similarities, the platforms differ in some distinct ways.
Actian Data Platform
The Actian Data Platform is a fully managed data platform that encompasses data integration, data management, and data analytics services. This report focuses on the analytics capability, which is based on Actian’s Vector technology and powered by the patented X100 engine. X100 uses a concept known as “vectorized query execution,” in which data is processed in chunks of cache-fitting vectors. It performs “single instruction, multiple data” (SIMD) processing, applying the same operation to multiple data values simultaneously to exploit the parallelism of modern hardware, and it reduces the overhead found in conventional “tuple-at-a-time” processing. Additionally, the compressed column-oriented format uses a scan-optimized buffer manager for further performance gains.
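To make the contrast between tuple-at-a-time and vectorized execution concrete, the following is a minimal Python sketch. It is not Actian's X100 implementation: the function names, the `VECTOR_SIZE` value, and the revenue expression are all assumptions chosen for illustration, and pure Python gains no SIMD benefit. It only shows the difference in control flow, where each primitive runs over a whole chunk of a column before the next primitive starts.

```python
VECTOR_SIZE = 1024  # hypothetical chunk size, assumed small enough to stay in CPU cache


def sum_tuple_at_a_time(prices, discounts):
    """Evaluate price * (1 - discount) one row at a time."""
    total = 0.0
    for price, discount in zip(prices, discounts):
        # Every row pays the full per-row evaluation overhead.
        total += price * (1.0 - discount)
    return total


def sum_vectorized(prices, discounts, vector_size=VECTOR_SIZE):
    """Evaluate the same expression chunk by chunk ("vector at a time")."""
    total = 0.0
    for start in range(0, len(prices), vector_size):
        price_vec = prices[start:start + vector_size]
        discount_vec = discounts[start:start + vector_size]
        # Primitive 1: (1 - discount) over the whole vector.
        factor_vec = [1.0 - d for d in discount_vec]
        # Primitive 2: price * factor over the whole vector.
        revenue_vec = [p * f for p, f in zip(price_vec, factor_vec)]
        # Primitive 3: aggregate the vector into the running total.
        total += sum(revenue_vec)
    return total
```

Both functions return the same result up to floating-point rounding. In a compiled engine, the tight per-vector loops are what allow the hardware to apply one instruction to many values at once.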
The unit of measure of compute power in the Actian platform is known as Actian Units (AU). At the time of this writing, Actian is priced at $2.50 per AU per hour. This price is available on AWS, Microsoft Azure, and Google Cloud. Costs are only accrued when the service is running. We ran the Actian tests on Google Cloud Platform (GCP).
Google BigQuery
BigQuery is a managed service with some interesting distinctions, as it abstracts the details of the underlying hardware, database, and all configurations. It is a serverless, hands-off database without indexes or column constraints and requires no defragmentation or system tuning. Google Cloud Platform manages the servers without any customer involvement, dynamically allocating storage and compute resources. The customer does not define the nodes or capacity of the BigQuery instance. The provisioning of compute is particularly fast and seamless.
You pay for the amount of data you query and store. Customers can pre-purchase computation “slots” for as short as one minute and be billed by the hour. There is a separate charge for active storage of data.
Snowflake
As a cloud-only, fully managed solution, Snowflake provides a clear separation between compute and storage. For Snowflake on AWS, which we used for our tests, data is stored in AWS S3 and cached when queries are executed to bring the data closer to compute resources. Snowflake essentially offers two configuration “levers”: the size of the warehouse cluster and how many clusters are permitted to spin up to handle concurrency. Snowflake scales by cluster server count in powers of 2 (i.e., 1, 2, 4, 8, 16, and so on). If multi-cluster scaling is enabled, Snowflake will spin up additional clusters to handle multiuser concurrent query workloads and will automatically spin them down once demand has passed. If it is not enabled, Snowflake will place waiting queries in a queue until resources free up.
For Snowflake, you pay a flat hourly fee for compute resources. We paid $3.00 per node per hour for the Enterprise tier. Once the compute warehouse goes inactive, you no longer pay, but there is a separate charge for data storage.
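The power-of-2 scaling and flat hourly fee described above can be sketched as follows. The $3.00 rate and the 128-node 4X-Large figure come from this report. The ordered list of size names is an assumption based on Snowflake's published warehouse sizes, and the function names are hypothetical.

```python
# Warehouse size names in ascending order (assumed naming; each step doubles the node count).
WAREHOUSE_SIZES = [
    "X-Small", "Small", "Medium", "Large",
    "X-Large", "2X-Large", "3X-Large", "4X-Large",
]

RATE_PER_NODE_HOUR = 3.00  # Enterprise tier rate paid in this test


def nodes_for_size(size):
    """Return the node count for a size, doubling at each step from 1."""
    return 2 ** WAREHOUSE_SIZES.index(size)


def hourly_cost(size, rate=RATE_PER_NODE_HOUR):
    """Return the hourly compute cost of a single cluster of the given size."""
    return nodes_for_size(size) * rate
```

Under these assumptions, `nodes_for_size("4X-Large")` gives 128 and `hourly_cost("4X-Large")` gives 384.0, matching the configuration used in this test.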
3. Test Setup
The setup for this field test was informed by the TPC-H spec validation queries, but this is not an official TPC benchmark. The queries were executed using the following setup, environment, standards, and configurations.
Benchmark Data
The data sets used in the benchmark were a workload derived from the well-recognized industry standard TPC-H benchmark.
From tpc.org: “The TPC-H is a decision support benchmark. It consists of a suite of business-oriented ad hoc queries and concurrent data modifications. The queries and the data populating the database have been chosen to have broad industry-wide relevance. This benchmark illustrates decision support systems that examine large volumes of data, execute queries with a high degree of complexity, and give answers to critical business questions.”
To depict the data model, the diagram in Figure 2 was taken from page 13 of the TPC-H Revision 2.17.3 specification document.
Figure 2. TPC-H Data Model
To provide a sense of the data volumes used in our benchmark, Table 1 gives row counts of the database when loaded with 30TB of TPC-H data:
Table 1. TPC-H Database Row Count Given 30TB
TPC-H Table | 30TB Row Count |
---|---|
Customer | 4,500,000,000 |
Line Item | 180,000,000,000 |
Orders | 45,000,000,000 |
Part | 6,000,000,000 |
Supplier | 300,000,000 |
Part Supp | 24,000,000,000 |
Source: GigaOm 2023 |
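The row counts in Table 1 follow from the way TPC-H scales: each table has a base cardinality that is multiplied by a scale factor. The sketch below assumes a scale factor of 30,000 for the 30TB data set (roughly one scale-factor unit per gigabyte of raw data) and uses the base cardinalities from the TPC-H specification; the Line Item figure is the approximate per-unit count. The function and constant names are illustrative only.

```python
# Approximate rows per unit of scale factor, per the TPC-H specification.
BASE_ROWS_PER_SCALE_FACTOR = {
    "Customer": 150_000,
    "Line Item": 6_000_000,  # approximate; the exact count varies slightly
    "Orders": 1_500_000,
    "Part": 200_000,
    "Supplier": 10_000,
    "Part Supp": 800_000,
}

SCALE_FACTOR_30TB = 30_000  # assumption: 30TB corresponds to scale factor 30,000


def row_counts(scale_factor):
    """Return the expected row count of each scaled table."""
    return {
        table: base * scale_factor
        for table, base in BASE_ROWS_PER_SCALE_FACTOR.items()
    }


if __name__ == "__main__":
    for table, rows in row_counts(SCALE_FACTOR_30TB).items():
        print(f"{table:<10} {rows:>17,}")
```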
Cluster Environments
Our benchmark included the following cluster environments (Table 2):
Table 2. Cluster Environments
TPC-H 30 TB | Actian | Snowflake | BigQuery |
---|---|---|---|
Tier | Enterprise | Enterprise | Enterprise |
Size | 128AU | 4X-Large | 4XL |
Units | 128 | 128 nodes | 4,800 Slots |
Source: GigaOm 2023 |
Queries
We sought to replicate the TPC-H Benchmark queries modified only by syntax differences required by the platforms. The benchmark is a fair representation of enterprise query needs. The TPC-H testing suite has 22 queries, which are described in Table 3.
Table 3. TPC-H Query Parameters
Query | Sum | Sub-Query | Join | Min/Max | Avg | Count | Top/Limit |
---|---|---|---|---|---|---|---|
Query 1: Pricing Summary Report | |||||||
Query 2: Minimum Cost Supplier | |||||||
Query 3: Shipping Priority | |||||||
Query 4: Order Priority Checking | |||||||
Query 5: Local Supplier Volume | |||||||
Query 6: Forecasting Revenue Change | |||||||
Query 7: Volume Shipping | |||||||
Query 8: National Market Share | |||||||
Query 9: Product Type Profit Measure | |||||||
Query 10: Returned Item Reporting | |||||||
Query 11: Important Stock Identification | |||||||
Query 12: Shipping Modes and Order Priority | |||||||
Query 13: Customer Distribution | |||||||
Query 14: Promotion Effect | |||||||
Query 15: Top Supplier | |||||||
Query 16: Parts/Supplier Relationship | |||||||
Query 17: Small Quantity Order Revenue | |||||||
Query 18: Large Volume Customer | |||||||
Query 19: Discounted Revenue | |||||||
Query 20: Potential Part Promotion | |||||||
Query 21: Suppliers Who Kept Orders Waiting | |||||||
Query 22: Global Sales Opportunity |
4. Test Results
This section analyzes the query results from the fastest runs of the three sets of the 22 TPC-H queries described in Table 3. We first chart single-user performance results for each of the 22 non-concurrent tests in this section. We follow that with a single chart for the concurrent five-user test, which runs the 22-query set in a stream to produce an overall result. Result caches, which are present in Actian, Snowflake, and BigQuery, were disabled to focus on actual system performance.
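One plausible reading of "fastest runs of the three sets" is that each query was executed three times and the lowest elapsed time was kept. The sketch below illustrates that selection under this assumption. The timings are hypothetical placeholders and are not results from this test.

```python
def fastest_runs(runs):
    """Given several runs (dicts of query name -> seconds), keep the minimum per query."""
    queries = runs[0].keys()
    return {query: min(run[query] for run in runs) for query in queries}


# Hypothetical timings in seconds for two queries across three runs (not from the report).
example_runs = [
    {"Q1": 12.0, "Q2": 4.5},
    {"Q1": 11.2, "Q2": 4.9},
    {"Q1": 11.6, "Q2": 4.4},
]

best = fastest_runs(example_runs)   # fastest time per query
total = sum(best.values())          # single-user total built from the fastest runs
```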
The first query (Figure 3) in the set is the Pricing Summary Report, which is the only query that uses just the Sum, Average, and Count operators. The Actian Data Platform significantly outperformed Snowflake, with BigQuery trailing behind.
Figure 3. TPC-H Query 1: “Pricing Summary Report” (lower is better)
Query 2, which tests Minimum Cost Supplier and is shown in Figure 4, is one of two queries that contain a Min/Max function. The Actian Data Platform was the fastest by 6.5x over the second finisher, Snowflake.
Figure 4. TPC-H Query 2: “Minimum Cost Supplier” (lower is better)
In the Shipping Priority test (Figure 5), the Actian platform was more than twice as fast as Snowflake, with BigQuery the slowest. Almost all queries finished in this order.
Figure 5. Query 3: “Shipping Priority” (lower is better)
The Order Priority Checking test (Figure 6), which consists of a subquery and a count, produced the largest performance differential in the report. The Actian platform completed the test 69 times faster than Snowflake and an astonishing 174 times faster than BigQuery.
Figure 6. Query 4: “Order Priority Checking” (lower is better)
In Query 5: Local Supplier Volume (Figure 7), which only employs a SUM aggregation, the Actian platform produced an 8x advantage over Snowflake, while BigQuery again trailed.
Figure 7. Query 5: “Local Supplier Volume” (lower is better)
Query 6, the Forecasting Revenue Change test (Figure 8), features a simple SUM that produced the briefest test run in terms of elapsed time in seconds across all three products. Here, the Actian platform outperformed Snowflake by 2.8x, while the advantage over BigQuery was nearly 10x.
Figure 8. Query 6: “Forecasting Revenue Change” (lower is better)
In Volume Shipping (Figure 9), Snowflake completed the test just 20% slower than the Actian platform, one of the closest results in the benchmark. BigQuery was well behind the two leaders.
Figure 9. Query 7: “Volume Shipping” (lower is better)
When testing the National Market Share query (Figure 10), the Actian platform outperformed Snowflake by a factor of eight. Again, BigQuery trailed the leaders.
Figure 10. Query 8: “National Market Share” (lower is better)
Test results for Product Type Profit Measure (Figure 11) show that Snowflake outperformed the Actian platform by about 8%. This was one of only two tests where Snowflake outperformed Actian.
Figure 11. Query 9: “Product Type Profit Measure” (lower is better)
In the test Returned Item Reporting (Figure 12), which uniquely has a SUM and a TOP/LIMIT, the Actian platform was by far the top performer, with an 8.6x differential versus Snowflake and 14.1x advantage versus BigQuery.
Figure 12. Query 10: “Returned Item Reporting” (lower is better)
When testing the Important Stock Identification query (Figure 13), which features another sub-select and a SUM operation, the Actian platform performed more than twice as fast as Snowflake.
Figure 13. Query 11: “Important Stock Identification” (lower is better)
The Actian platform beat the pack in the Shipping Modes and Order Priority test (Figure 14), outperforming the analogous Snowflake configuration by an impressive 16 times. BigQuery again fell well behind.
Figure 14. Query 12: “Shipping Modes and Order Priority” (lower is better)
The Customer Distribution (Figure 15) test is the only TPC-H query with an explicit JOIN. The test produced the tightest set of results, with the Actian platform finishing ahead of Snowflake by just 17% and ahead of BigQuery by 53%. This is overall the best finish for BigQuery in the benchmark.
Figure 15. Query 13: “Customer Distribution” (lower is better)
In our test for Promotion Effect (Figure 16), the Actian platform took 3.41 seconds to complete the query, followed by Snowflake at 5.09 seconds (about 50% slower) and BigQuery well behind at 16.56 seconds.
Figure 16. Query 14: “Promotion Effect” (lower is better)
Snowflake’s best performance in the benchmark came in the Top Supplier query test (Figure 17). Here, Snowflake outperformed the Actian platform by nearly 20%, at 7.01 seconds versus Actian’s 8.66 seconds.
Figure 17. Query 15: “Top Supplier” (lower is better)
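The results above are quoted in several forms ("2.8x", "about 50% slower", "nearly 20%"), so it can help to see how each is derived from a pair of elapsed times. The sketch below uses the Query 14 and Query 15 times reported above. The function names are hypothetical, and reading "nearly 20%" as a reduction in elapsed time is an interpretation rather than something the report states.

```python
def times_faster(slower_seconds, faster_seconds):
    """Ratio of elapsed times, e.g. 2.0 means 'twice as fast'."""
    return slower_seconds / faster_seconds


def percent_more_time(slower_seconds, faster_seconds):
    """How much longer the slower platform took, relative to the faster one."""
    return (slower_seconds - faster_seconds) / faster_seconds * 100.0


def percent_less_time(slower_seconds, faster_seconds):
    """How much less time the faster platform took, relative to the slower one."""
    return (slower_seconds - faster_seconds) / slower_seconds * 100.0


# Query 14 (Promotion Effect): Actian 3.41 s, Snowflake 5.09 s.
q14_snowflake_slowdown = percent_more_time(5.09, 3.41)   # roughly 49%, "about 50% slower"

# Query 15 (Top Supplier): Snowflake 7.01 s, Actian 8.66 s.
q15_snowflake_saving = percent_less_time(8.66, 7.01)     # roughly 19%, "nearly 20%"
```

The two percentage measures are not interchangeable: the same Query 15 pair gives roughly 19% when expressed as time saved and roughly 24% when expressed as extra time taken, so it matters which baseline a quoted percentage uses.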
Query 16, the Parts/Supplier Relationship (Figure 18), saw the Actian platform more than double the performance of Snowflake, with BigQuery finishing well behind.
Figure 18. Query 16: “Parts/Supplier Relationship” (lower is better)
The Small Quantity Order Revenue test (Figure 19) yielded a major performance advantage for the Actian platform over Snowflake, a margin of more than 20x.
Figure 19. Query 17: “Small Quantity Order Revenue” (lower is better)
The Actian platform shines again in the test for Large Volume Customer (Figure 20), with an advantage over Snowflake that once more is in the 20x range.
Figure 20. Query 18: “Large Volume Customer” (lower is better)
Things tighten up a bit in the Discounted Revenue query (Figure 21), which features a SUM. Here, the Actian platform finished about 2.7x ahead of Snowflake, with both outpacing BigQuery.
Figure 21. Query 19: “Discounted Revenue” (lower is better)
Our test for Potential Part Promotion (Figure 22) produced one of the tighter groupings of results in the benchmark. The Actian platform finished more than twice as fast as Snowflake and 2.8x faster than BigQuery.
Figure 22. Query 20: “Potential Part Promotion” (lower is better)
In the Suppliers Who Kept Orders Waiting test (Figure 23), we again see a familiar pattern of substantial performance differences. Actian finished almost 9x faster than Snowflake in this query test.
Figure 23. Query 21: “Suppliers Who Kept Orders Waiting” (lower is better)
The test for Global Sales Opportunity (Figure 24) produced one of the closest finishes in the benchmark, with the Actian platform outpacing Snowflake by just 5%.
Figure 24. Query 22: “Global Sales Opportunity” (lower is better)
We also ran a five-user, concurrent TPC-H test (Figure 25), which runs all 22 queries in a stream across five concurrent users.
Actian completed the test in 2,071.16 seconds, which is more than nine times faster than BigQuery’s 18,724.99 seconds and more than three times faster than Snowflake’s 6,777.40 seconds. The results indicate that the Actian platform is a superior choice for applications that require fast query processing performance and need to process a complex data set like TPC-H.
Figure 25. TPC-H All Queries (1 Stream), 5 Users (lower is better)
Finally, we calculated the geometric mean of query times for both the single-user and five-user runs (Figure 26), with the Actian platform producing commanding leads in both sets. In the single-user runs, Actian’s geometric mean was nearly four times faster than Snowflake’s and 14 times faster than BigQuery’s. In the five-user set, the margin for Actian remained wide: 2.5 times faster than Snowflake and 5.5 times faster than BigQuery. Overall, the Actian platform stands as a reliable option whether measured by single-user or multiple-user geometric mean performance.
Figure 26. Geo-Mean Times for 1-User and 5-User Runs (time in seconds)
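For readers unfamiliar with the geometric mean, the sketch below shows one common way to compute it: the exponential of the average of the logarithms. As input it uses only the two queries for which this report quotes exact times for both Actian and Snowflake (Query 14 and Query 15), so the outputs illustrate the calculation and do not reproduce the 22-query values in Figure 26. The function name is an assumption.

```python
import math


def geometric_mean(times_seconds):
    """Geometric mean of positive elapsed times, computed via logarithms."""
    if not times_seconds:
        raise ValueError("need at least one elapsed time")
    log_sum = sum(math.log(t) for t in times_seconds)
    return math.exp(log_sum / len(times_seconds))


# Query 14 and Query 15 elapsed times (seconds) quoted earlier in this section.
actian_times = [3.41, 8.66]
snowflake_times = [5.09, 7.01]

actian_geo = geometric_mean(actian_times)
snowflake_geo = geometric_mean(snowflake_times)
```

Unlike a simple total, the geometric mean weights each query's relative difference equally, so a handful of long-running queries does not dominate the summary.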
5. Price-Performance
System costs can be difficult to compare because vendor platforms vary in pricing and licensing models. However, all three platforms have clear and consistent on-demand hourly cloud pricing that we can use to determine price per performance.
Actian has a clear pricing model for the Actian Data Platform. For software usage and the underlying platform, there is a $2.50 per Actian Unit (AU) per hour cost. Thus, with 128 AUs, we paid $320 per hour.
Google charges either monthly or hourly for BigQuery slot commitments. To ensure a like-for-like comparison, we used the Enterprise hourly rate of $0.06 per slot per hour. With a 4XL reservation (4,800 slots), we paid $288 per hour in the US region.
Snowflake has several pricing tiers with varying levels of features and security capabilities. We used the Enterprise tier in US East on AWS at a cost of $3.00 per cluster node per hour. We used a 4X-Large cluster with 128 nodes, so we paid $384.00 per hour.
To calculate the price per performance, we used the following formula:
Price performance ($) = [Elapsed time of test (seconds) × Cost of platform ($/hour)] ÷ 3,600 (seconds/hour)
The elapsed test time is the duration of the slowest-running thread of the concurrency test. For example, to complete a five-user test, we must wait until all five users finish their queries. Thus, the slowest thread represents the elapsed time of the test from beginning to end.
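The formula and the slowest-thread rule can be expressed as a short Python sketch. The unit counts, hourly rates, and five-user elapsed times are the figures stated in this report; the function names are hypothetical.

```python
SECONDS_PER_HOUR = 3_600


def hourly_cost(units, rate_per_unit_hour):
    """Platform cost in $/hour: number of units times the per-unit hourly rate."""
    return units * rate_per_unit_hour


def elapsed_seconds(thread_seconds):
    """Elapsed time of a concurrency test: the slowest thread's duration."""
    return max(thread_seconds)


def price_performance(elapsed, cost_per_hour):
    """Cost in dollars of running the test for `elapsed` seconds."""
    return elapsed * cost_per_hour / SECONDS_PER_HOUR


# Hourly platform costs from the pricing described above.
actian_per_hour = hourly_cost(128, 2.50)      # 128 AUs at $2.50
snowflake_per_hour = hourly_cost(128, 3.00)   # 128 nodes at $3.00

# Five-concurrent-user elapsed times reported in the Test Results section.
actian_cost = price_performance(2071.16, actian_per_hour)
snowflake_cost = price_performance(6777.40, snowflake_per_hour)

print(f"Actian:    ${actian_cost:,.2f}")
print(f"Snowflake: ${snowflake_cost:,.2f}")
print(f"Snowflake vs Actian: {snowflake_cost / actian_cost:.1f}x")
```

These inputs reproduce the Actian and Snowflake five-user figures in Table 4. Applying the same formula to BigQuery's 18,724.99 seconds at $288.00 per hour gives roughly $1,498, which differs from the $2,996.00 listed in Table 4; the report does not state what accounts for the difference.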
Table 4 details the price performance for the different tests.
Table 4. Price Performance
30TB | Actian | Snowflake | BigQuery |
---|---|---|---|
Tier | Enterprise | Enterprise | Enterprise |
Size | 128AU | 4X-Large | 4XL 4,800 Slots |
Units | 128 | 128 | 4,800 |
Cost $/unit/hour | $2.50 | $3.00 | $0.06 |
$/hour | $320.00 | $384.00 | $288.00 |
Single User Total Time (seconds) | 362 | 1,079 | 4,431 |
Price Performance Single User | $32.18 | $115.04 | $354.48 |
Cost Difference vs Actian | | 3.6x | 11.0x |
5 Concurrent Users Total Slowest Time (seconds) | 2,071 | 6,777 | 18,725 |
Price Performance Five Concurrent Users | $184.10 | $722.92 | $2,996.00 |
Cost Difference vs Actian | | 3.9x | 16.3x |
Source: GigaOm 2023 |
6. Conclusion
Cloud databases allow enterprises to avoid large capital expenditures, provision quickly, and provide performance at scale for advanced analytic queries. Relational databases with analytic capabilities continue to support the advanced analytic workloads of the organization with performance, scale, and concurrency.
This GigaOm Field Test leveraged the established TPC-H standard to apply a representative set of complex corporate queries against three cloud analytic databases: Actian Data Platform, Google BigQuery, and Snowflake.
The benchmark results reveal query execution performance and price efficiencies that sharply favor the Actian platform over Snowflake and BigQuery. On a 30TB TPC-H data set, Actian outperformed the competition in 20 out of 22 queries for query response times. Query response times on the 30TB TPC-H data set for Actian were overall three times faster than Snowflake and nine times faster than BigQuery in a test of five concurrent users. Our price-performance analysis shows that when running the five concurrent user TPC-H query set, the Actian platform was four times less expensive than Snowflake and 16 times less costly than BigQuery, based on price per query per hour.
Price and performance are critical points of interest when selecting an analytics platform, because they ultimately impact total cost of ownership, value, and user satisfaction. Our analysis reveals that the Actian Data Platform provides compelling performance and value advantages over competing solutions in the sector.
7. Disclaimer
Performance is important but is only one criterion for a data warehouse platform selection. This is only one point-in-time check into specific performance. Many other factors should be considered in a selection, including administration, integration, workload management, user interface, scalability, vendor, and reliability. It is also our experience that performance changes over time and is competitively different for different workloads. A performance leader can run into the point of diminishing returns, and viable contenders can quickly close the gap.
GigaOm runs all of its performance tests to strict ethical standards. The results of the report are the objective results of the application of queries to the simulations described in the report. The report clearly defines the selected criteria and process used to establish the field test. The report also clearly states the data set sizes, the platforms, the queries, etc. used. The reader is left to determine for themselves how to qualify the information for their individual needs. The report does not make any claim regarding third-party certification and presents the objective results received from the application of the process to the criteria as described in the report. The report strictly measures performance and does not purport to evaluate other factors that potential customers may find relevant when making a purchase decision.
This is a sponsored report. Actian chose the competitors, the test, and the Actian Data Platform configuration. GigaOm chose the most compatible configurations for the other tested platforms and ran the queries. Choosing compatible configurations is subject to judgment. We have attempted to describe our decisions in this paper.
This writeup includes all the information necessary to replicate this test. You are encouraged to compile your own representative queries, data sets, data sizes, and compatible configurations and test for yourself.
8. About Actian
Actian, the hybrid data management, analytics and integration company, delivers data as a competitive advantage to thousands of customers worldwide. Through the deployment of innovative hybrid data technologies and solutions, Actian ensures that business critical systems can transact and integrate at their very best, whether on premises, in the cloud, or both. Thousands of forward-thinking organizations around the globe trust Actian to help them solve the toughest data challenges to transform how they run their businesses, today and in the future. For more, visit http://www.actian.com.
9. About William McKnight
William McKnight is a former Fortune 50 technology executive and database engineer. An Ernst & Young Entrepreneur of the Year finalist and frequent best practices judge, he helps enterprise clients with action plans, architectures, strategies, and technology tools to manage information.
Currently, William is an analyst for GigaOm Research who takes corporate information and turns it into a bottom-line-enhancing asset. He has worked with Dong Energy, France Telecom, Pfizer, Samba Bank, ScotiaBank, Teva Pharmaceuticals, and Verizon, among many others. William focuses on delivering business value and solving business problems utilizing proven approaches in information management.
10. About Jake Dolezal
Jake Dolezal is a contributing analyst at GigaOm. He has two decades of experience in the information management field, with expertise in analytics, data warehousing, master data management, data governance, business intelligence, statistics, data modeling and integration, and visualization. Jake has solved technical problems across a broad range of industries, including healthcare, education, government, manufacturing, engineering, hospitality, and restaurants. He has a doctorate in information management from Syracuse University.
11. About GigaOm
GigaOm provides technical, operational, and business advice for IT’s strategic digital enterprise and business initiatives. Enterprise business leaders, CIOs, and technology organizations partner with GigaOm for practical, actionable, strategic, and visionary advice for modernizing and transforming their business. GigaOm’s advice empowers enterprises to successfully compete in an increasingly complicated business atmosphere that requires a solid understanding of constantly changing customer demands.
GigaOm works directly with enterprises both inside and outside of the IT organization to apply proven research and methodologies designed to avoid pitfalls and roadblocks while balancing risk and innovation. Research methodologies include but are not limited to adoption and benchmarking surveys, use cases, interviews, ROI/TCO, market landscapes, strategic trends, and technical benchmarks. Our analysts possess 20+ years of experience advising a spectrum of clients from early adopters to mainstream enterprises.
GigaOm’s perspective is that of the unbiased enterprise practitioner. Through this perspective, GigaOm connects with engaged and loyal subscribers on a deep and meaningful level.
12. Copyright
© Knowingly, Inc. 2023 "High-Performance Cloud Data Warehouse Testing" is a trademark of Knowingly, Inc. For permission to reproduce this report, please contact sales@gigaom.com.