High-Volume Data Replication v1.0

Evaluating Fivetran HVR and Qlik Replicate

Table of Contents

  1. Executive Summary
  2. Platform Summary
  3. Test Setup
  4. Performance Test Results
  5. Total Cost of Ownership
  6. Conclusion
  7. Appendix
  8. Disclaimer
  9. About Fivetran
  10. About William McKnight
  11. About Jake Dolezal

1. Executive Summary

This report was commissioned by Fivetran.

Whether they serve operational or analytical purposes, databases are the backbone of how many businesses run, from collecting consumer behavior on your website to processing IoT data across your supply chain. Accessing and replicating massive volumes of database content is key to business success, and the responsibility for managing this crucial element of your infrastructure falls to data leaders and their teams.

Ensuring your solution for database replication can keep up with your business is a pressing need for every data leader across every industry and company size. In this report, we investigate two major vendors in database replication and put them to the test in terms of speed and cost.

Behind the Scenes: How it Works
The process of locating and recording modifications to data in a database and instantly sending those updates to a system or process downstream is known as data replication or change data capture (CDC).

Data is extracted from a source, optionally transformed, and then loaded into a target repository, such as a data lake or data warehouse. Recording every transaction in a source database and transferring it promptly to a target keeps the systems synchronized and supports moving data between on-premises sources and the cloud with minimal to no downtime.

CDC, an effective method for moving data across technologies, is essential to modern cloud architectures. Its real-time data transfer accelerates analytics and data science use cases, and enterprise data architectures use it to power continuous data transport between systems. Log-based CDC is a CDC method that reads a database's transaction log to capture changes and replicate them downstream.
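To make the log-based approach concrete, the following is a minimal sketch of a replicator that reads ordered change records and applies them to a target. It is a toy illustration only and does not represent how Fivetran HVR or Qlik Replicate is implemented. The `ChangeRecord` type, the `apply_changes` function, and the in-memory dictionary standing in for the target are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ChangeRecord:
    """One entry read from a simulated transaction log (hypothetical type)."""
    sequence: int                          # position in the log, used for ordering
    op: str                                # "insert", "update", or "delete"
    key: int                               # primary key of the affected row
    row: Optional[Dict[str, Any]] = None   # new row image; None for deletes


def apply_changes(target: Dict[int, Dict[str, Any]],
                  log: List[ChangeRecord],
                  last_applied: int = 0) -> int:
    """Apply log entries newer than last_applied to target, in log order.

    Returns the sequence number of the last entry applied, which a caller
    would persist as its checkpoint so a restart does not replay old changes.
    """
    for record in sorted(log, key=lambda r: r.sequence):
        if record.sequence <= last_applied:
            continue  # already replicated in an earlier pass
        if record.op in ("insert", "update"):
            target[record.key] = dict(record.row or {})
        elif record.op == "delete":
            target.pop(record.key, None)
        else:
            raise ValueError(f"unknown operation: {record.op}")
        last_applied = record.sequence
    return last_applied


# Example: four changes captured from the source, replayed on an empty target.
log = [
    ChangeRecord(1, "insert", 101, {"status": "new"}),
    ChangeRecord(2, "update", 101, {"status": "shipped"}),
    ChangeRecord(3, "insert", 102, {"status": "new"}),
    ChangeRecord(4, "delete", 102),
]
target: Dict[int, Dict[str, Any]] = {}
checkpoint = apply_changes(target, log)
print(target, checkpoint)
```

Because the changes are replayed in log order and the checkpoint is tracked, running `apply_changes` again with the returned checkpoint leaves the target unchanged. That property is what lets a log-based replicator resume after an interruption.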

Our scenario assessed the total cost of ownership (TCO) of using two competing technologies, Fivetran HVR and Qlik Replicate, to sync 50 GB to 200 GB per hour of change data between a source Oracle database and a target Snowflake data warehouse using log-based CDC on the source. Notably, we assessed TCO based on configurations that reflect the performance requirements of enterprise customers. For this assessment, data replication latency needed to stay below five minutes, regardless of the redo log change rate. These tests simulate scenarios that large enterprises commonly encounter when using log-based CDC technologies.
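The five-minute requirement can be expressed as a simple check. The sketch below assumes latency is measured as the gap between the time a change is committed on the source and the time it is applied on the target. The report does not spell out its exact measurement method, so this definition and the function names are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Threshold stated in the report: latency must stay below five minutes.
LATENCY_TARGET = timedelta(minutes=5)


def replication_latency(committed_at: datetime, applied_at: datetime) -> timedelta:
    """Assumed definition: time from source commit to target apply."""
    return applied_at - committed_at


def meets_target(committed_at: datetime,
                 applied_at: datetime,
                 target: timedelta = LATENCY_TARGET) -> bool:
    """Return True when the change reached the target in under the threshold."""
    return replication_latency(committed_at, applied_at) < target


# Example: a change committed at 12:00:00 and applied at 12:04:59 passes;
# the same change applied at 12:05:00 does not.
commit_time = datetime(2024, 1, 1, 12, 0, 0)
print(meets_target(commit_time, datetime(2024, 1, 1, 12, 4, 59)))
print(meets_target(commit_time, datetime(2024, 1, 1, 12, 5, 0)))
```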

At 200 GB/hour, Fivetran HVR proved 25% less costly than Qlik Replicate.

In this study, we sought to compare the total cost of ownership between Fivetran HVR and Qlik Replicate, based on similar levels of operational latency.

  • In our performance testing, as the volume of redo log change data increased, Fivetran HVR replication latency stayed essentially flat while Qlik Replicate latency steadily increased. At 50 GB/hour, tested latencies for both platforms were safely below five minutes. At 100 and 200 GB/hour of change data, however, the single Qlik instance produced unacceptably high latencies (as much as 27 times greater than those produced by Fivetran HVR).
  • To produce a valid TCO comparison, we factored in the cost to scale the Qlik buildout with additional instances. As redo log change data doubled from 50 to 100 GB/hour, a second Qlik instance was accounted for, and at 200 GB/hour the instance count was doubled again to a total of four.
  • Based on these findings, TCO calculations reveal that Fivetran HVR is 7% less expensive than Qlik Replicate as redo log change data rates increase to 100 GB/hour, and 25% less expensive at 200 GB/hour. At the base 50 GB/hour data volume, Qlik was 5% less expensive to operate than Fivetran.
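The scaling assumption behind the TCO comparison can be sketched as follows. The instance counts reported above (one, two, and four at 50, 100, and 200 GB/hour) are consistent with one Qlik instance per 50 GB/hour of change data. That per-instance capacity is an inference from those three data points, not a figure stated by either vendor, and the function name is hypothetical. The percentage differences are copied from the findings above.

```python
import math

# Assumption inferred from the reported counts (1, 2, 4 instances at
# 50, 100, 200 GB/hour): one instance per 50 GB/hour of change data.
GB_PER_HOUR_PER_INSTANCE = 50

# Reported TCO differences: change rate -> (less expensive platform, percent).
REPORTED_TCO_DIFFERENCE = {
    50: ("Qlik Replicate", 5),
    100: ("Fivetran HVR", 7),
    200: ("Fivetran HVR", 25),
}


def instances_needed(change_rate_gb_per_hour: float) -> int:
    """Estimate the Qlik instance count under the linear-scaling assumption."""
    if change_rate_gb_per_hour <= 0:
        raise ValueError("change rate must be positive")
    return math.ceil(change_rate_gb_per_hour / GB_PER_HOUR_PER_INSTANCE)


for rate, (cheaper, percent) in REPORTED_TCO_DIFFERENCE.items():
    print(f"{rate} GB/hour: {instances_needed(rate)} Qlik instance(s); "
          f"{cheaper} was {percent}% less expensive")
```

Rates between or beyond the three tested points were not measured in the study, so any instance count the function returns for them is an extrapolation.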